Scientific report: "Ranking Algorithms for Named-Entity Extraction: Boosting and the Voted Perceptron" pdf

... an input candidate: for is the ’th tag in the tagged sequence. for is the ’th word. for is if begins with a lowercase letter, otherwise. for is a transformation of , where the transformation ... quotes. For example, “The Day They Shot John Lennon”, the name of a band, appears in the running example. Define to be the index of any double quotation marks in the candi...

Uploaded on: 17/03/2014, 08:20

Scientific report: "Inducing Gazetteers for Named Entity Recognition by Large-scale Clustering of Dependency Relations" ppt

... cluster gazetteer and the Wikipedia one improves the accuracy of Japanese NER. The next question is whether these gazetteers improve the accuracy further when they are used together. The accuracies ... first sentence of a Wikipedia article. The last word in the noun phrase is then extracted and becomes the hypernym of the entity described by the article. For example ...

Uploaded on: 20/02/2014, 09:20

Scientific report: "Tabular Algorithms for TAG Parsing" potx

... e I } The hypotheses defined for this parsing system are the standard ones and therefore they will be omitted in the next parsing systems described in this paper. The key steps in the parsing ... (-,-) and 5 =~ ai+l aj if and only if (p, q) = (-, -). The items of the new parsing schema, denoted buEx, are obtained by refining the items of CYK. The dotted rules e...

Uploaded on: 22/02/2014, 03:20

Scientific report: "Joint Inference of Named Entity Recognition and Normalization for Tweets" doc

... annotated data set, and show that our method outperforms the baseline that handles these two tasks separately, boosting the F1 from 80.2% to 83.6% for NER, and the Accuracy from 79.4% to 82.6% for NEN, ... F1 for NER and 82.6% Accuracy for NEN, outperforming the baseline with 80.2% F1 for NER and 79.4% Accuracy for NEN. We summarize our contributions as follows....

Uploaded on: 07/03/2014, 18:20

Scientific report: "Generalized Algorithms for Constructing Statistical Language Models" pdf

... procedure: for all states and all , if there exists another path and transition such that , , and , and either (i) and or (ii) there exists such that and and , then we add to the set: . See figure 4 for an ... i.e. q’ r’ π’ q e r e’ π Figure 4: The path is invalid if , , , and either (i) and or (ii) and . , for all . Then, . The history-less state has no incoming...

Uploaded on: 08/03/2014, 04:22

Scientific report: "Clustering Clauses for High-Level Relation Detection: An Information-theoretic Approach" pdf

... conflicts, and another for other countries; a cluster for winning game scores, and another for ties; etc. The fact that the algorithm separated these clusters indicates that the distinction between them ... cluster containing the subject word of the clause, and the same for the verb and object words. For example, the sentence The terrorist threw the grena...

Uploaded on: 08/03/2014, 02:21

Scientific report: "Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web" pot

... to the size of the manual corpus. When we trained with that size of the automatic corpus, the performance was very low compared to the performance of the manual corpus. The reason is that the ... the satisfiable performance. We measured the performance according to the size of the automatic corpus. We carried out the experiment with the decision list learnin...

Uploaded on: 08/03/2014, 04:22

Scientific report: "A Strategy for Dynamic Interpretation: a Fragment and an Implementation" pot

... A. Empty represents the empty tree and the function A gives the information at the current node, the left subtree, and the right subtree. The information content of the nodes is of two kinds: ... parentheses as usual, and use ⊤ for a formula which is always true, ⊥ for a formula which is always false. The semantics of QDL is as for first order logic, wit...

Uploaded on: 09/03/2014, 01:20

Scientific report: "Automatic Discovery of Named Entity Variants – Grammar-driven Approaches to Non-alphabetical Transliterations" pptx

... of these two language variants and mine potential variant pairs from their collocates. These potential variant pairs are then checked for their phonological similarity to determine whether they ... language such as Chinese are opaque and not easy to compare. On the one hand, there is often more than one way to transliterate a foreign name. On the other hand, dialectal difference as...

Uploaded on: 17/03/2014, 04:20
