Machine Learning
Chapter 11
Cao Hoang Tru, CSE Faculty - HCMUT, 15 November 2011

• What is learning?
• “That is what learning is. You suddenly understand something you've understood all your life, but in a new way.” (Doris Lessing – 2007 Nobel Prize in Literature)

• Arthur Samuel (1959): "Field of study that gives computers the ability to learn without being explicitly programmed"
• Tom Mitchell (1997): "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E"

• How to construct programs that automatically improve with experience
• Learning problem:
  – Task T
  – Performance measure P
  – Training experience E

• Chess game:
  – Task T: playing chess games
  – Performance measure P: percent of games won against opponents
  – Training experience E: playing practice games against itself

• Handwriting recognition:
  – Task T: recognizing and classifying handwritten words
  – Performance measure P: percent of words correctly classified
  – Training experience E: handwritten words with given classifications

Example
[Table: experience and prediction examples for the concept Elephant, described by the attributes GRAY?, MAMMAL?, LARGE?, VEGETARIAN?, WILD?, each marked + or -. Example names include (Mouse), (Giraffe), and (Dinosaur). The three prediction rows read + + + - +, + - + - +, and + + + - -, each with an unknown label (?).]

Example
Experience:
Sky     AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny   Warm     Normal    Strong  Warm   Same      Yes
Sunny   Warm     High      Strong  Warm   Same      Yes
Rainy   Cold     High      Strong  Warm   Change    No
Sunny   Warm     High      Strong  Cool   Change    Yes

Prediction:
Sky     AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Rainy   Cold     High      Strong  Warm   Change    ?
Sunny   Warm     Normal    Strong  Warm   Same      ?
Sunny   Warm     Low       Strong  Cool   Same      ?

Decision Trees
Cao Hoang Tru, CSE Faculty - HCMUT, May 3, 2014

Sky     AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny   Warm     Normal    Strong  Warm   Same      Yes
Sunny   Warm     High      Strong  Warm   Same      Yes
Rainy   Cold     High      Strong  Warm   Change    No
Sunny   Warm     High      Strong  Cool   Change    Yes
Cloudy  Warm     High      Weak    Cool   Same      Yes
Cloudy  Cold     High      Weak    Cool   Same      No

[Figure: a decision tree for EnjoySport built from this table, with test nodes Humidity, Sky, and AirTemp, branches labeled by attribute values, and Yes/No leaves.]

Decision Trees
[Figure: a set of positive (+) and negative (-) examples partitioned by the attribute tests A1 = v1 and A2 = v2.]

Homogeneity of Examples
• $\mathrm{Entropy}(S) = -p_{+}\log_2 p_{+} - p_{-}\log_2 p_{-}$
[Figure: entropy plotted against the proportion of positive examples, with 0.5 marked on the axis.]
• Entropy is an impurity measure
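The two-class entropy formula above can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `binary_entropy` is an assumption chosen for illustration, and the code uses the usual convention that a term with probability 0 contributes 0.

```python
import math


def binary_entropy(p_pos: float) -> float:
    """Entropy of a two-class sample, given the fraction of positive examples."""
    if not 0.0 <= p_pos <= 1.0:
        raise ValueError("p_pos must lie in [0, 1]")
    total = 0.0
    for p in (p_pos, 1.0 - p_pos):
        if p > 0.0:  # convention: 0 * log2(0) is treated as 0
            total -= p * math.log2(p)
    return total


if __name__ == "__main__":
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"p+ = {p:.2f}  entropy = {binary_entropy(p):.3f}")
```

Under these assumptions the value is 0 for a pure sample (all positive or all negative) and reaches its maximum of 1 when the two classes are equally frequent, which is why entropy serves as an impurity measure.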
Homogeneity of Examples
• $\mathrm{Entropy}(S) = \sum_{i=1}^{c} -p_i \log_2 p_i$

Information Gain
• $\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$
[Figure: attribute A splits S into the subsets $S_{v_1}$ and $S_{v_2}$.]

Example
• $\mathrm{Entropy}(S) = -p_{+}\log_2 p_{+} - p_{-}\log_2 p_{-}$
  $= -(4/6)\log_2(4/6) - (2/6)\log_2(2/6)$
  $\approx 0.390 + 0.528 = 0.918$
• $\mathrm{Gain}(S, \mathrm{Sky})$
  $= \mathrm{Entropy}(S) - \sum_{v \in \{\mathrm{Sunny}, \mathrm{Rainy}, \mathrm{Cloudy}\}} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$
  $= \mathrm{Entropy}(S) - [(3/6)\,\mathrm{Entropy}(S_{\mathrm{Sunny}}) + (1/6)\,\mathrm{Entropy}(S_{\mathrm{Rainy}}) + (2/6)\,\mathrm{Entropy}(S_{\mathrm{Cloudy}})]$
  $= \mathrm{Entropy}(S) - (2/6)\,\mathrm{Entropy}(S_{\mathrm{Cloudy}})$ (the Sunny and Rainy subsets are pure, so their entropy is 0)
  $= \mathrm{Entropy}(S) - (2/6)[-(1/2)\log_2(1/2) - (1/2)\log_2(1/2)]$
  $\approx 0.918 - 0.333 = 0.585$

Example
• $\mathrm{Entropy}(S) = -(4/6)\log_2(4/6) - (2/6)\log_2(2/6) \approx 0.390 + 0.528 = 0.918$
• $\mathrm{Gain}(S, \mathrm{Water})$
  $= \mathrm{Entropy}(S) - \sum_{v \in \{\mathrm{Warm}, \mathrm{Cool}\}} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$
  $= \mathrm{Entropy}(S) - [(3/6)\,\mathrm{Entropy}(S_{\mathrm{Warm}}) + (3/6)\,\mathrm{Entropy}(S_{\mathrm{Cool}})]$
  $= \mathrm{Entropy}(S) - (3/6) \cdot 2 \cdot [-(2/3)\log_2(2/3) - (1/3)\log_2(1/3)]$
  $\approx \mathrm{Entropy}(S) - 0.390 - 0.528$
  $= 0$

Example
[Figure: partial tree with root Sky: Sunny → Yes, Rainy → No, Cloudy → ?]
• $\mathrm{Gain}(S_{\mathrm{Cloudy}}, \mathrm{AirTemp}) = \mathrm{Entropy}(S_{\mathrm{Cloudy}}) - \sum_{v \in \{\mathrm{Warm}, \mathrm{Cold}\}} \frac{|S_v|}{|S_{\mathrm{Cloudy}}|}\,\mathrm{Entropy}(S_v) = 1$
• $\mathrm{Gain}(S_{\mathrm{Cloudy}}, \mathrm{Humidity}) = \mathrm{Entropy}(S_{\mathrm{Cloudy}}) - \sum_{v \in \{\mathrm{Normal}, \mathrm{High}\}} \frac{|S_v|}{|S_{\mathrm{Cloudy}}|}\,\mathrm{Entropy}(S_v) = 0$

Inductive Bias
• Hypothesis space: complete!
• Shorter trees are preferred over larger trees
• Prefer the simplest hypothesis that fits the data

Inductive Bias
• Decision Tree algorithm: searches incompletely through a complete hypothesis space
  ⇒ Preference bias
• Candidate-Elimination: searches completely through an incomplete hypothesis space
  ⇒ Restriction bias

Overfitting
• $h \in H$ is said to overfit the training data if there exists $h' \in H$ such that $h$ has smaller error than $h'$ over the training examples, but $h'$ has a smaller error than $h$ over the entire distribution of instances
• It can occur when:
  – There is noise in the data
  – The number of training examples is too small to produce a representative sample of the target concept

Overfitting
[Figure: error versus learning time, with one curve for the training set and one for the test set; hypotheses h and h' are marked.]

Homework
Exercises 3.1 to 3.4 (Chapter 3, ML textbook)
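The information-gain examples on the six-row EnjoySport table from the Decision Trees slide can be recomputed with a short script. This is a sketch under stated assumptions rather than the lecturer's code: the names `ATTRIBUTES`, `ROWS`, `entropy`, and `gain` are hypothetical, and the rows are copied from that table.

```python
import math
from collections import Counter

ATTRIBUTES = ["Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast"]

# Each row: six attribute values followed by the EnjoySport label.
ROWS = [
    ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same", "Yes"),
    ("Sunny", "Warm", "High", "Strong", "Warm", "Same", "Yes"),
    ("Rainy", "Cold", "High", "Strong", "Warm", "Change", "No"),
    ("Sunny", "Warm", "High", "Strong", "Cool", "Change", "Yes"),
    ("Cloudy", "Warm", "High", "Weak", "Cool", "Same", "Yes"),
    ("Cloudy", "Cold", "High", "Weak", "Cool", "Same", "No"),
]


def entropy(labels):
    """Entropy of a list of class labels: sum over classes of -p_i * log2(p_i)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain(rows, attribute):
    """Information gain obtained by splitting rows on the named attribute."""
    idx = ATTRIBUTES.index(attribute)
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in {row[idx] for row in rows}:
        subset = [row[-1] for row in rows if row[idx] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


if __name__ == "__main__":
    print("Entropy(S)      =", round(entropy([r[-1] for r in ROWS]), 3))
    print("Gain(S, Sky)    =", round(gain(ROWS, "Sky"), 3))
    print("Gain(S, Water)  =", round(gain(ROWS, "Water"), 3))
    cloudy = [r for r in ROWS if r[0] == "Cloudy"]
    print("Gain(S_Cloudy, AirTemp)  =", round(gain(cloudy, "AirTemp"), 3))
    print("Gain(S_Cloudy, Humidity) =", round(gain(cloudy, "Humidity"), 3))
```

With these rows, the gain for Sky comes out at roughly 0.585 and the gain for Water at 0 up to floating-point error, while within the Cloudy subset AirTemp has gain 1 and Humidity has gain 0.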
Machine Learning
• What is learning?
  [Figure labels: Learner, Experience]
• Learning is an (endless) generalization