
Artificial Intelligence — nathan lambert, inst.eecs.berkeley.edu


Today

- Continuing Hidden Markov Models!
- Video of Demo: Pacman – Sonar (no beliefs)
- Example: Weather HMM
- Example: Ghostbusters HMM

Hidden Markov Models

- Markov chains are not so useful for most agents: you need observations to update your beliefs.
- Hidden Markov models (HMMs):
  - Underlying Markov chain over states X
  - You observe outputs (effects) at each time step

An HMM is defined by:
- Initial distribution: P(X_1)
- Transitions: P(X_t | X_{t-1})
- Emissions: P(E_t | X_t)

Example: Weather HMM

Transition model P(R_t | R_{t-1}):

  R_{t-1}   R_t   P(R_t | R_{t-1})
  +r        +r    0.7
  +r        -r    0.3
  -r        +r    0.3
  -r        -r    0.7

Emission model P(U_t | R_t):

  R_t   U_t   P(U_t | R_t)
  +r    +u    0.9
  +r    -u    0.1
  -r    +u    0.2
  -r    -u    0.8

Example: Ghostbusters HMM

- P(X_1) = uniform
- P(X' | X) = usually move clockwise, but sometimes move in a random direction or stay in place
- P(R_ij | X) = same sensor model as before: red means close, green/yellow means farther away

[Figure: 3x3 grids showing P(X_1) = 1/9 for each cell, and P(X' | X = <1,2>) with values 1/2, 1/6, and 0]

[Demo: Ghostbusters – Circular Dynamics – HMM (L14D2)]

Video of Demo: Ghostbusters – Circular Dynamics – HMM

Conditional Independence

HMMs have two important independence properties:
- Markov hidden process: the future depends on the past via the present
- The current observation is independent of all else given the current state

Are E_1 and E_3 independent? No! They are correlated through the hidden chain; we would need to condition on X_1, X_2, or X_3.

Real HMM Examples

- Robot tracking: observations are range readings (continuous); states are positions on a map (continuous)
- Speech recognition HMMs: observations are acoustic signals (continuous valued); states are specific positions in specific words (so, tens of thousands)
- Machine translation HMMs: observations are words (tens of thousands); states are translation options

Filtering / Monitoring

Filtering, or monitoring, is the task of tracking the distribution B_t(X) = P(X_t | e_1, ..., e_t) (the belief state) over time. We start with B_1(X) in an initial setting, usually uniform, and update B(X) as time passes or as we get observations. The Kalman filter was invented in the 60's and first implemented as a method of trajectory estimation for the Apollo program.

Example: Robot Localization

- Sensor model: can read in which directions there is a wall, never more than 1 mistake
- Motion model: may not execute the action, with small probability
- Lighter grey: it was possible to get the reading, but less likely because it required a mistake
(Example from Michael Pfeiffer)

Inference: Find State Given Evidence

We are given evidence at each time and want to know B_t(X) = P(X_t | e_{1:t}). Idea: start with P(X_1) and derive B_t in terms of B_{t-1} (equivalently, derive B_{t+1} in terms of B_t).

Inference: Base Cases

- Observation: P(X_1 | e_1) = P(X_1, e_1) / P(e_1) ∝ P(X_1) P(e_1 | X_1)
- Passage of time: P(X_2) = Σ_{x_1} P(x_1, X_2) = Σ_{x_1} P(X_2 | x_1) P(x_1)

Two Steps: Passage of Time + Observation

Passage of Time

Assume we have the current belief P(X | evidence to date): B(X_t) = P(X_t | e_{1:t}). Then, after one time step passes:

  P(X_{t+1} | e_{1:t}) = Σ_{x_t} P(X_{t+1}, x_t | e_{1:t})
                       = Σ_{x_t} P(X_{t+1} | e_{1:t}, x_t) P(x_t | e_{1:t})
                       = Σ_{x_t} P(X_{t+1} | x_t) P(x_t | e_{1:t})

Or, compactly: B'(X_{t+1}) = Σ_{x_t} P(X_{t+1} | x_t) B(x_t)

Basic idea: beliefs get "pushed" through the transitions. As time passes, uncertainty "accumulates".

Observation

Assume we have the current belief P(X | previous evidence): B'(X_{t+1}) = P(X_{t+1} | e_{1:t}). Then, after evidence comes in:

  P(X_{t+1} | e_{1:t+1}) = P(X_{t+1}, e_{t+1} | e_{1:t}) / P(e_{t+1} | e_{1:t})
                         ∝ P(e_{t+1} | e_{1:t}, X_{t+1}) P(X_{t+1} | e_{1:t})
                         = P(e_{t+1} | X_{t+1}) P(X_{t+1} | e_{1:t})

Or, compactly: B(X_{t+1}) ∝ P(e_{t+1} | X_{t+1}) B'(X_{t+1})

Basic idea: beliefs are "reweighted" by the likelihood of the evidence. With the "B" notation: B'_t(·) is the belief without e_t, B_t(·) the belief with e_t. Unlike the passage of time, we have to renormalize.
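The two updates above are easy to run on the Weather HMM tables from earlier. Below is a minimal Python sketch (not from the slides; the dictionary layout and function names are illustrative, and the prior over rain is assumed uniform) that alternates the passage-of-time and observation steps; composed in a loop, this is exactly the forward algorithm with normalization at every step.

```python
# Exact-filtering sketch for the rain/umbrella HMM above.
# Table values come from the slide; everything else is illustrative.

prior = {'+r': 0.5, '-r': 0.5}                    # P(X_1), assumed uniform
transition = {'+r': {'+r': 0.7, '-r': 0.3},       # P(X_{t+1} | x_t)
              '-r': {'+r': 0.3, '-r': 0.7}}
emission = {'+r': {'+u': 0.9, '-u': 0.1},         # P(e_t | x_t)
            '-r': {'+u': 0.2, '-u': 0.8}}

def elapse_time(belief):
    """Passage of time: B'(X_{t+1}) = sum_{x_t} P(X_{t+1} | x_t) B(x_t)."""
    return {x1: sum(transition[x0][x1] * belief[x0] for x0 in belief)
            for x1 in belief}

def observe(belief_prime, evidence):
    """Observation: B(X_{t+1}) ∝ P(e_{t+1} | X_{t+1}) B'(X_{t+1}), renormalized."""
    unnormalized = {x: emission[x][evidence] * belief_prime[x]
                    for x in belief_prime}
    z = sum(unnormalized.values())
    return {x: p / z for x, p in unnormalized.items()}

# Online belief updates: alternate the two steps as umbrella readings arrive.
belief = prior
for e in ['+u', '+u']:
    belief = observe(elapse_time(belief), e)
    print(belief)   # B(+r) ≈ 0.818 after day 1, ≈ 0.883 after day 2
```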
Example: Observation

As we get observations, beliefs get reweighted and uncertainty "decreases": B(X) ∝ P(e | X) B'(X).

[Figure: Pacman sonar beliefs before observation vs. after observation]

Example: Weather HMM — Online Belief Updates

Every time step, we start with the current P(X | evidence):
- We update for time: P(x_t | e_{1:t-1}) = Σ_{x_{t-1}} P(x_{t-1} | e_{1:t-1}) · P(x_t | x_{t-1})
- We update for evidence: P(x_t | e_{1:t}) ∝ P(e_t | x_t) · P(x_t | e_{1:t-1})

The forward algorithm does both at once (and doesn't normalize).

The Forward Algorithm

We are given evidence at each time and want to know B_t(X) = P(X_t | e_{1:t}). We can derive the following updates (we can normalize at any time):

  P(x_t | e_{1:t}) ∝ P(x_t, e_t | e_{1:t-1})
    = Σ_{x_{t-1}} P(x_{t-1}, x_t, e_t | e_{1:t-1})
    = Σ_{x_{t-1}} P(x_{t-1} | e_{1:t-1}) P(x_t | x_{t-1}) P(e_t | x_t)
    = P(e_t | x_t) Σ_{x_{t-1}} P(x_t | x_{t-1}) P(x_{t-1} | e_{1:t-1})

Pacman – Sonar (P4)

[Demo: Pacman – Sonar – No Beliefs (L14D1)]

Video of Demo: Pacman – Sonar (with beliefs)

Next Up: Particle Filtering and Applications of HMMs

Particle Filtering

Filtering: approximate solution. Sometimes |X| is too big to use exact inference:
- |X| may be too big to even store B(X)
- e.g. X is continuous

Solution: approximate inference
- Track samples of X, not all values
- Samples are called particles ("particle" is just a new name for "sample")
- Time per step is linear in the number of samples
- But: the number needed may be large
- In memory: a list of particles, not states

This is how robot localization works in practice.

Representation: Particles

- Our representation of P(X) is now a list of N particles (samples)
- Particles: (3,3) (2,3) (3,3) (3,2) (3,3) (3,2) (1,2) (3,3) (3,3) (2,3)
- Generally, N << |X|

Particle Filtering: Elapse Time
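The source cuts off at the elapse-time slide, but the step itself is simple: move each particle by sampling a successor from the transition model, and read beliefs off by counting. A short Python sketch follows; the "mostly move right" transition model is a made-up stand-in, not the actual ghost dynamics, while the particle list is the one from the slide.

```python
import random
from collections import Counter

def transition_sample(x):
    """Sample x' ~ P(X' | x): mostly step right, sometimes stay (illustrative)."""
    r, c = x
    return (r, c + 1) if random.random() < 0.8 else (r, c)

def elapse_time(particles):
    """Move each particle by sampling its successor from the transition model."""
    return [transition_sample(x) for x in particles]

# Representation: P(X) is approximated by a list of N samples ("particles");
# this is the N = 10 list from the slide. Generally, N << |X|.
particles = [(3, 3), (2, 3), (3, 3), (3, 2), (3, 3),
             (3, 2), (1, 2), (3, 3), (3, 3), (2, 3)]

particles = elapse_time(particles)

# Read off a belief by counting: B(x) ≈ (# particles at x) / N.
belief = {x: n / len(particles) for x, n in Counter(particles).items()}
print(belief)
```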
