Scientific report: "PARSING THE LOB CORPUS" pdf
... "The pastry chef placed the pie in the oven." In the figure, items to the left of the vertical line are the phrases and rules popped off the stack. To the right of each item is a ... to the same 64,000 words of the LOB corpus. The results were compared to the LOB part-of-speech pre-tags, and are listed in Figure 1. 4 If a word was pre-tagged as being...
Uploaded: 31/03/2014, 18:20
... depending on the context. For example, the character 人 ‘people’ in 种树人 ‘the one who plants’ is a suffix, but in the personal name 周树人 ‘Zhou Shuren’ it isn’t. The structures of these two words ... be the rightmost NNf. Therefore, in 前总统 ‘former president’ the head is 总统 ‘president’. In passing, the readers should note the fact that in Figure 9, we have to add...
Uploaded: 17/03/2014, 00:20
... case the postposition is the head of the NP. The system maintains a list of Hindi postpositions to identify Hindi PPs. For example, consider the translation pair the lady in the room had cooked the food∼ ... example, consider the parsing and phrase structure of the English sentence given above. In the first level the Inter-phrase relations (corresponding to the phra...
Uploaded: 23/03/2014, 18:20
Scientific report: "Parsing the WSJ using CCG and Log-Linear Models" pptx
... l, where h_f is the head word of the lexical category, f is the lexical category, s is the argument slot, h_a is the head word of the argument, and l indicates whether the dependency is ... and the empirical expected value, of each feature (to calculate the gradient); and the value of the likelihood function. For the normal-form model, the empirical expected...
Uploaded: 23/03/2014, 19:20
Scientific report: "Parsing the Wall Street Journal using a Lexical-Functional Grammar and Discriminative Estimation Techniques" doc
... both the c-structure and the f-structure of the parse. For example, the WSJ’s ADJP-PRD label must correspond to an AP in the c-structure and an XCOMP in the f-structure. In this version of the corpus, ... occurrences of the same lexical item are indicated explicitly in the LFG representation but not in the DR representation. The main conceptual difference between t...
Uploaded: 23/03/2014, 20:20
Scientific report: "Parsing the Wall Street Journal with the Inside-Outside Algorithm" potx
... of the available training material, and evaluate the effect of the training size on the bracketing performance. Then, we describe a method for reducing the number of parameters in the ... negligible. In the above formula, the probability P(c) of the partially bracketed sentence c is computed as the sum of the probabilities of all derivations compatible wi...
Uploaded: 01/04/2014, 00:20
Scientific report: "Classifying the Hungarian Web" pdf
... enjoys the same kind of superiority, being bigger than the next two competitors put together, that the British Navy had when Britannia ruled the waves. The verb vizslazni (originally from the noun ... have about the same frequency in the topic as in general language can't help us distinguish whether the document came from the Bernoulli source associated with th...
Uploaded: 31/03/2014, 20:20
Document: Scientific report: "Is the End of Supervised Parsing in Sight?" pdf
... derivations, the probability of a tree is the sum of the probabilities of the derivations producing that tree. The probability of a derivation is the product of the subtree probabilities. The original ... subtrees thereof on a held-out corpus, either by taking their relative frequencies, or by iteratively training the subtree parameters using the EM algorithm (referr...
Uploaded: 20/02/2014, 12:20
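The excerpt above states two rules: a derivation's probability is the product of its subtree probabilities, and a tree's probability is the sum over the derivations that produce it. A minimal sketch of that arithmetic follows. The function names and the toy numbers are illustrative assumptions, not taken from the paper, and the sketch does not cover how the subtree probabilities are estimated.

```python
from math import prod  # available in Python 3.8+


def derivation_probability(subtree_probs):
    """Probability of one derivation: the product of its subtree probabilities."""
    return prod(subtree_probs)


def tree_probability(derivations):
    """Probability of a tree: the sum over all derivations producing that tree."""
    return sum(derivation_probability(d) for d in derivations)


# Hypothetical toy data: three derivations of the same tree, each given as
# the list of probabilities of the subtrees it combines.
derivations_of_tree = [
    [0.5, 0.2],       # two subtrees
    [0.1],            # one large subtree covering the whole tree
    [0.5, 0.4, 0.5],  # three smaller subtrees
]

p_tree = tree_probability(derivations_of_tree)
print(f"P(tree) = {p_tree:.3f}")
```

Enumerating derivations explicitly like this only works for toy cases, since the number of derivations grows quickly with tree size.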
Document: Scientific report: "Charting the Depths of Robust Speech Parsing" pdf
... and b) and another edge which has been built from them (edge c), the latter should get a better score than the sequence of the original two edges. If there is another edge from the parser which ... results. The decision when to switch to the next best path of a given WHG depends on the length of the input and on the time already used. After the parsing of on...
Uploaded: 20/02/2014, 19:20
Document: Scientific report: "Parsing, Projecting & Prototypes: Repurposing Linguistic Data on the Web" doc
... 189,244. We then ran the new language ID algorithm on the IGTs, and Table 1 shows the language distribution of the IGTs in ODIN according to the output of the algorithm. For instance, the third ... the crawled documents as ungrammatical (usually with an asterisk “*” at the beginning of the language line). Those IGTs are kept in ODIN too because they could be useful to othe...
Uploaded: 22/02/2014, 02:20