Scientific report: "An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging" docx

... discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance since it can handle both known and unknown words. ... and the 4th IJCNLP of the AFNLP, pages 513–521, Suntec, Singapore, 2-7 August 2009. © 2009 ACL and AFNLP An Error-Driven Word-Character Hybrid...

Uploaded: 17/03/2014, 01:20

Scientific report: "An experiment on the upper bound of interjudge agreement: the case of tagging" docx

... 55724 words and 102527 morphological analyses (an average of 1.84 analyses per word). One was an article about Japanese culture ('Pop'); one concerned patents ('Pat'); ... benchmark corpus alone does not necessarily suffice for getting an objective view of the tagger's performance. Subconscious 'bad' habits of this type need to be factored out...

Uploaded: 08/03/2014, 21:20

Scientific report: "An Effective Two-Stage Model for Exploiting Non-Local Dependencies in Named Entity Recognition" pdf

... previous and next words, character n-grams of the current word, part-of-speech tag of the current word and surrounding words, the shallow parse chunk of the current word, shape of the current word, ... Department Stanford University Stanford, CA 94305 vijayk@cs.stanford.edu Christopher D. Manning Computer Science Department Stanford University Stanford, CA 94305 manning@cs.stanford...

Uploaded: 31/03/2014, 01:20

Scientific report: "A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" pdf

... F-measure: POS- Segmentation 0.971; POS+ Segmentation 0.973; POS+ Joint S&T 0.925. Table 3: F-measure on segmentation and Joint S&T of perceptrons. POS-: perceptron trained without POS, POS+: ... the POS information and reported the F-measure on segmentation only, while the second performed Joint S&T using POS information and reported the F-measure both...

Uploaded: 08/03/2014, 01:20

Scientific report: "A Stacked Sub-Word Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" potx

... search for the next possible word. This word-by-word method for segmentation was first proposed in (Zhang and Clark, 2007), and was then further used in POS tagging in (Zhang and Clark, ... intermediate sub-word structure for joint segmentation and tagging. Since the sub-words are large enough in practice, the decoding for POS tagging over sub-words i...

Uploaded: 17/03/2014, 00:20

Document: Scientific report: "An Improved Redundancy Elimination Algorithm for Underspecified Representations" pdf

... well-defined model-theoretic semantics and therefore no concept of semantic equivalence. On the other hand, we do not need to solve the full semantic equivalence problem, as we only want to compare formulas ... R-permutable iff they are possible dominators of each other and $((f_1, \mathrm{ch}(F_1, F_2)), (f_2, \mathrm{ch}(F_2, F_1))) \in P(R)$. For example, in Fig. 1, the fragments 1 and 2 are permut...

Uploaded: 20/02/2014, 12:20

Document: Scientific report: "AN EXTENDED LR PARSING ALGORITHM FOR GRAMMARS USING FEATURE-BASED SYNTACTIC CATEGORIES" pot

... search for them, and a reduce action which performs instantiation during parsing. Some details of the LR parsing algorithm are assumed from Aho and Ullman (1987) and Aho and Johnson (1974), and ... (1987) and Aho and Johnson (1974), and more formal definitions and notations of a feature-based grammar formalism from Pollard and Sag (1987) and Shieber (1986)....

Uploaded: 22/02/2014, 10:20

Scientific report: "An Endogeneous Corpus-Based Method for Structural Noun Phrase Disambiguation" pptx

... D'AIR # AIR FROID CIRCUIT D'AIR FROID # CIRCUIT D'AIR # AIR FROID Since none of the sub-groups REJET D'AIR and AIR FROID on the one hand, and CIRCUIT D'AIR and ... ambiguous MLNP. For example, REJET D'AIR FROID and CIRCUIT D'AIR FROID are two ambiguous MLNP extracted from the test corpus and parsed by parsing rule [lo] above: REJET D'...

Uploaded: 09/03/2014, 01:20

Scientific report: "An Optimal-Time Binarization Algorithm for Linear Context-Free Rewriting Systems with Fan-Out Two" ppt

... nonterminals in its right-hand side. Let $X_1$ and $X_2$ be two disjoint position sets (i.e., $X_1 \cap X_2 = \emptyset$), with $f(X_1) = k_1$ and $f(X_2) = k_2$ and with associated endpoint sets $E_1$ and $E_2$, respectively. ... algorithm for transforming LCFRS with fan-out at most 2 into a binary form, whenever this is possible. This results in asymptotical run-time improvement for known parsing alg...

Uploaded: 23/03/2014, 16:21

Scientific report: "An Unsupervised Morpheme-Based HMM for Hebrew Morphological Disambiguation" pdf

... unsupervised models over this data set: Word model [W], and Morpheme model [M]. We also tested two different sets of initial conditions. Uniform distribution [Uniform]: For each word, each analysis ... into a word, such as the four known combinations of the word bclm, the two possible combinations of the word hn‘im, and their possible tags within the original words. Based...

Uploaded: 23/03/2014, 18:20
