Lecture 16: The CKY parsing algorithm
Kai-Wei Chang
CS @ University of Virginia
kw@kwchang.net
Course webpage: http://kwchang.net/teaching/NLP16
CS6501: NLP

How to represent the structure

Phrase structure (constituency) trees
- Can be modeled by context-free grammars
- We will see how constituent parse and dependency parse are related

Parse tree defined by CFG

Naïve top-down parsing
- The number of possible trees is exponential
- Many sub-trees are the same

Two key issues
- Computational complexity
  - Can we reuse the computations? Dynamic programming
- Ambiguity
  - (Lexicalized) PCFG

Chomsky Normal Form
- Chomsky Normal Form allows only two kinds of right-hand sides:
  - Two non-terminals (e.g., VP → ADV VP)
  - One terminal (e.g., VP → eat)
- Any CFG can be rewritten into an equivalent Chomsky Normal Form
- Try this: VP → VBD NP PP PP

More about the conversion
- Eliminate rules with non-solitary terminals: for a rule $A \to X_1 \dots a \dots X_n$, add $Y_a \to a$ and replace the rule with $A \to X_1 \dots Y_a \dots X_n$
- Eliminate right-hand sides with more than 2 nonterminals: replace the rule $A \to X_1 \dots X_i \dots X_n$ by $A \to X_1 A_1$, $A_1 \to X_2 A_2$, …, $A_{n-2} \to X_{n-1} X_n$
- Remove unit rules $A \to B$: for each rule $B \to X_1 X_2$, add $A \to X_1 X_2$, then delete $A \to B$

One more example
- Example from https://en.wikipedia.org/wiki/Chomsky_normal_form

  Expr    → Term | Expr AddOp Term | AddOp Term
  Term    → Factor | Term MulOp Factor
  Factor  → Primary | Factor ^ Primary
  Primary → number | variable | ( Expr )
  AddOp   → + | −
  MulOp   → * | /

...
- VP
- NP
- NP
- PP
- NP
3rd pass
- …

This is correct, but we repeatedly check the same pairs!

...
Every cell stores constituents found between i and j

Avoid duplicate work:
  S  → NP VP
  NP → Det N
  NP → NP PP
  VP → V NP
  VP → VP PP
  PP → P NP
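
The chart idea above (every cell stores the constituents found between i and j) can be sketched as a small CKY recognizer. This is a minimal illustration, not the lecture's own code: the binary rules are the six rules listed on the last slide, while the lexicon, the example sentence, and the function names `cky_chart` and `recognizes` are assumptions made up for this sketch.

```python
from collections import defaultdict

# Binary rules from the slide, stored as (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("PP", "P", "NP"),
]

# Hypothetical lexicon (NOT from the slides), used only for illustration.
LEXICON = {
    "the": {"Det"},
    "a": {"Det"},
    "dog": {"N"},
    "man": {"N"},
    "telescope": {"N"},
    "saw": {"V"},
    "with": {"P"},
}


def cky_chart(words):
    """Fill a chart where chart[i, j] holds the labels spanning words[i:j]."""
    n = len(words)
    chart = defaultdict(set)

    # Width-1 spans: look up each word's part-of-speech tags.
    for i, word in enumerate(words):
        chart[i, i + 1] = set(LEXICON.get(word, ()))

    # Wider spans: combine two adjacent, already-filled sub-spans.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between i and j
                left, right = chart[i, k], chart[k, j]
                for parent, l_child, r_child in BINARY_RULES:
                    if l_child in left and r_child in right:
                        chart[i, j].add(parent)
    return chart


def recognizes(words, start="S"):
    """Return True if the grammar derives the whole sentence from `start`."""
    if not words:
        return False
    return start in cky_chart(words)[0, len(words)]


if __name__ == "__main__":
    sentence = "the dog saw a man with a telescope".split()
    print(recognizes(sentence))
```

Each cell is computed once and reused by every wider span that contains it, which is how the chart avoids re-checking the same pairs. With three nested loops over span width, start position, and split point, the running time grows cubically in the sentence length, times the number of grammar rules. This sketch only decides whether a parse exists; recovering the trees would additionally require storing back-pointers in each cell.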