CS 188: Artificial Intelligence
Bayes' Nets
Nathan Lambert, inst.eecs.berkeley.edu
http://bit.ly/cs188bnr

Announcements
- Self Grade Drop: you have 1 for the semester; see Piazza
- Homework due tonight
- Project 3 due on Friday
- Discussing the new attendance policy
Probability Recap
- Conditional probability: P(x|y) = P(x,y) / P(y)
- Product rule: P(x,y) = P(x|y) P(y)
- Bayes' rule: P(y|x) = P(x|y) P(y) / P(x)
- Chain rule: P(x_1, x_2, ..., x_n) = P(x_1) P(x_2|x_1) P(x_3|x_1,x_2) ... = ∏_{i=1..n} P(x_i | x_1, ..., x_{i-1})
- X, Y independent if and only if: ∀x, y: P(x,y) = P(x) P(y)
- X and Y are conditionally independent given Z if and only if: ∀x, y, z: P(x,y|z) = P(x|z) P(y|z)

Ghostbusters Chain Rule
[Demo: Ghostbusters – with probability (L12D2)]
- T: Top square is red; B: Bottom square is red; G: Ghost is in the top
- Each sensor depends only on where the ghost is
- That means the two sensors are conditionally independent, given the ghost position
- Givens: P(+g) = 0.5, P(-g) = 0.5, P(+t|+g) = 0.8, P(+t|-g) = 0.4, P(+b|+g) = 0.4, P(+b|-g) = 0.8
- P(T, B, G) = P(G) P(T|G) P(B|G), which yields the joint table below (reconstructed in code after the Graphical Model Notation slide):

  T    B    G    P(T,B,G)
  +t   +b   +g   0.16
  +t   +b   -g   0.16
  +t   -b   +g   0.24
  +t   -b   -g   0.04
  -t   +b   +g   0.04
  -t   +b   -g   0.24
  -t   -b   +g   0.06
  -t   -b   -g   0.06

Bayes' Nets: Big Picture
- Two problems with using full joint distribution tables as our probabilistic models:
  - Unless there are only a few variables, the joint is WAY too big to represent explicitly
  - Hard to learn (estimate) anything empirically about more than a few variables at a time
- Bayes' nets: a technique for describing complex joint distributions (models) using simple, local distributions (conditional probabilities)
  - More properly called graphical models
  - We describe how variables locally interact
  - Local interactions chain together to give global, indirect interactions
  - For now, we'll be vague about how these interactions are specified

Example Bayes' Net: Insurance
[Figure: insurance Bayes' net]

Example Bayes' Net: Car
[Figure: car diagnosis Bayes' net]

Ghostbusters Bayes Net
[Figure: ghost position G with sensor readings R_1, ..., R_n as its children]

Graphical Model Notation
- Nodes: variables (with domains)
  - Can be assigned (observed) or unassigned (unobserved)
- Arcs: interactions
  - Similar to CSP constraints
  - Indicate "direct influence" between variables
  - Formally: encode conditional independence (more later)
- For now: imagine that arrows mean direct causation (in general, they don't!)
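To make the notation concrete, here is a minimal Python sketch that encodes the Ghostbusters net (G is the lone parent of both sensors) as local CPTs and recomputes the joint table above; the variable names and dictionary layout are my own illustration, not course-provided code.

```python
from itertools import product

# Local distributions for the Ghostbusters net: G is the parent of T and B.
# The numbers are the "Givens" from the slide.
P_G = {'+g': 0.5, '-g': 0.5}
P_T_given_G = {('+t', '+g'): 0.8, ('-t', '+g'): 0.2,
               ('+t', '-g'): 0.4, ('-t', '-g'): 0.6}
P_B_given_G = {('+b', '+g'): 0.4, ('-b', '+g'): 0.6,
               ('+b', '-g'): 0.8, ('-b', '-g'): 0.2}

# Joint via the net's factorization: P(T, B, G) = P(G) P(T|G) P(B|G).
for t, b, g in product(['+t', '-t'], ['+b', '-b'], ['+g', '-g']):
    p = P_G[g] * P_T_given_G[(t, g)] * P_B_given_G[(b, g)]
    print(t, b, g, round(p, 2))  # reproduces the eight table rows above
```

Conditional independence of T and B given G holds here by construction: once g is fixed, the factorization never lets one sensor's value affect the other.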
Bayes' Net Semantics
- A set of nodes, one per variable X
- A directed, acyclic graph
- A conditional distribution for each node given its parents A_1, ..., A_n:
  - A collection of distributions over X, one for each combination of parents' values
  - CPT: conditional probability table
  - Description of a noisy "causal" process
[Figure: parents A_1, ..., A_n with arcs into child X]
- A Bayes' net = Topology (graph) + Local Conditional Probabilities

Probabilities in Bayes' Nets
- Bayes' nets implicitly encode joint distributions
  - As a product of local conditional distributions
  - To see what probability a BN gives to a full assignment, multiply all the relevant conditionals together:
    P(x_1, ..., x_n) = ∏_i P(x_i | parents(X_i))
- Example (Ghostbusters net): P(+t, +b, +g) = P(+g) P(+t|+g) P(+b|+g) = 0.5 × 0.8 × 0.4 = 0.16

Probabilities in BNs
- Why are we guaranteed that setting the local CPTs this way results in a proper joint distribution?
  - Chain rule (valid for all distributions): P(x_1, ..., x_n) = ∏_i P(x_i | x_1, ..., x_{i-1})
  - Assume conditional independences: P(x_1, ..., x_n) = ∏_i P(x_i | parents(X_i))
- Consequence: Not every BN can represent every joint distribution
  - The topology enforces certain conditional independencies!

Example: Coin Flips
- n independent flips X_1, X_2, ..., X_n, each with the same CPT:

  X_i   P(X_i)
  h     0.5
  t     0.5

- Only distributions whose variables are absolutely independent can be represented by a Bayes' net with no arcs

Example: Traffic
- Net: R → T

  R    P(R)
  +r   1/4
  -r   3/4

  R    T    P(T|R)
  +r   +t   3/4
  +r   -t   1/4
  -r   +t   1/2
  -r   -t   1/2

- P(+r, -t) = P(+r) × P(-t|+r) = (1/4) × (1/4) = 1/16 (see the sketch below)
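As a worked check of the factorization, here is a minimal Python sketch that scores a full assignment for the Traffic net by multiplying each node's CPT entry; the `bn` data structure and the `joint_probability` function are my own illustration under those assumptions, not code from the course projects.

```python
# Traffic net: R -> T, with the CPTs from the slide above.
bn = {
    'R': {'parents': [],    'cpt': {(): {'+r': 0.25, '-r': 0.75}}},
    'T': {'parents': ['R'], 'cpt': {('+r',): {'+t': 0.75, '-t': 0.25},
                                    ('-r',): {'+t': 0.50, '-t': 0.50}}},
}

def joint_probability(bn, assignment):
    """P(x_1, ..., x_n) = product over nodes of P(x_i | parents(X_i))."""
    p = 1.0
    for var, node in bn.items():
        # Look up the one CPT row selected by the parents' assigned values.
        parent_values = tuple(assignment[par] for par in node['parents'])
        p *= node['cpt'][parent_values][assignment[var]]
    return p

# P(+r, -t) = P(+r) * P(-t | +r) = 1/4 * 1/4 = 1/16
print(joint_probability(bn, {'R': '+r', 'T': '-t'}))  # 0.0625
```

The same function would work for any net whose CPTs are stored this way, since the factorization only ever looks up one CPT row per node.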