Scientific report: "Mining the Web for Bilingual Text"

... [END:TITLE]. The number inside the chunk token is the length of the text chunk, not counting whitespace; from this point on only the length of the text chunks is used, and therefore the structural ... considered the most reliable, these were used as the basis for the computation of recall and precision. For this reason, and because the human-judged set included...

Uploaded: 08/03/2014, 06:20

Scientific report: "Mining the Web for Language Learning"

... consists of the crawler and the raw web page storage. The crawler periodically downloads two kinds of web pages, which are put into the storage. The first kind of web pages are parallel web pages ... round of the mining process. The second layer consists of the extractor, the filter, the classifiers and the readability evaluator, which are applied sequentially. The...

Uploaded: 20/02/2014, 05:20

Scientific report: "Automating the Acquisition of Bilingual Terminology"

... with the position-sensitive method. As in figure 6, the score in parentheses is the recall score when attention is restricted to the set of 706 NPs. The 50% threshold is the default for the ... Although the compound problem can also be addressed by morphological decomposition of compounds, there are two other advantages to compare the languages at the phrasa...

Uploaded: 09/03/2014, 01:20

Scientific report: "Mining Wiki Resources for Multilingual Named Entity Recognition"

... in the last link, the phrase preceding the vertical bar is the name of the article, while the following phrase is what is actually displayed to a visitor of the webpage. Near the end of the ... al-Khadamāt</ENAMEX> (MAK), we hypothesize that the text in the parentheses is an alternate name of the organization. We also looked for unmarked strings of the...
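
The piped-link convention described in this excerpt can be illustrated with a short sketch. It assumes standard MediaWiki double-bracket markup of the form `[[Article name|displayed text]]`. The function name `parse_wikilinks` and the sample sentence in the usage note are hypothetical and are not taken from the paper.

```python
import re

# Matches [[target]] and [[target|label]]; nested or malformed links are skipped.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def parse_wikilinks(text):
    """Return (article_name, displayed_text) pairs for each wiki link in text."""
    links = []
    for match in WIKILINK.finditer(text):
        target = match.group(1).strip()
        label = match.group(2)
        # Without a vertical bar, the article name itself is what is displayed.
        displayed = label.strip() if label else target
        links.append((target, displayed))
    return links
```

For example, calling `parse_wikilinks("[[Mohandas Karamchand Gandhi|Gandhi]] was born in [[India]].")` yields one pair whose article name and displayed text differ, and one pair where they are the same.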

Uploaded: 20/02/2014, 09:20

Scientific report: "Validating the web-based evaluation of NLG systems"

... it to the client for display at any time. 3 The experiments The web experiment. For the GIVE-1 challenge, 1143 valid games were collected over the Internet over the course of three months. These ... This brings the number of Internet subjects to 322 for the success rate, and to 227 (only successful games) for the other measures. Task success is the percentage of...

Uploaded: 31/03/2014, 00:20

Scientific report: "A COMPUTATIONAL MECHANISM FOR PRONOMINAL REFERENCE"

... antecedence information for personal pronouns contributes to this goal. In the next section, we show how our algorithm overcomes these limitations. 3. THE ALGORITHM Before giving the details of the ... remains. The WML translation for (23) is: (AWARD (THE COMMITTEE) (THE PRIZES) IT001) where IT001 is a WML constant marked for disjoint reference: IT001 ~ (THE COMMI...

Uploaded: 21/02/2014, 20:20

Scientific report: "SMS based Interface for FAQ Retrieval"

... t from the dictionary and returns Q_t, the set of all questions in the corpus that contain the term t. We call the above process querying the index on the term t. The details of the index ... specify the ~ symbol at the end of each token of the SMS query. For example, the SMS query “romg actvt” on the FAQ corpus is reformulated as “romg~0.3 actvt~0.3”...
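
The two steps mentioned in this excerpt, querying an index on a term and appending a fuzzy-match marker to each SMS token, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the whitespace tokenization, and the sample questions in the usage note are assumptions. The output is written with an ASCII tilde and no space before the threshold, which is also an assumption about the intended syntax (it resembles the fuzzy-term notation of query parsers such as older Lucene versions).

```python
from collections import defaultdict


def build_index(questions):
    """Map each term to the set of question ids that contain it (hypothetical helper)."""
    index = defaultdict(set)
    for qid, text in enumerate(questions):
        # Assumption: terms are lowercase whitespace-separated tokens.
        for term in text.lower().split():
            index[term].add(qid)
    return index


def query_index(index, term):
    """Return Q_t, the ids of all questions that contain the term t."""
    return index.get(term.lower(), set())


def reformulate_sms_query(query, similarity=0.3):
    """Append the fuzzy marker and a similarity threshold to every token."""
    return " ".join(f"{token}~{similarity}" for token in query.split())
```

With the hypothetical questions `["how to activate roaming", "roaming charges abroad"]`, `query_index(build_index(questions), "roaming")` returns the ids of both questions, and `reformulate_sms_query("romg actvt")` produces the reformulated string from the example in the excerpt.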

Uploaded: 08/03/2014, 00:20

Scientific report: "A Common Framework for Syntactic Annotation"

... various levels. The process allows one to specify, on the one hand, the informational properties of the scheme (i.e., its capacity to represent a given piece of information), and, on the other, the way the ... eliminating the need for the intermediary concrete AML format. However, especially for existing formats, it is typically more straightforward to perform the two-...

Uploaded: 08/03/2014, 05:20

Scientific report: "Semantic Role Labeling for Coreference Resolution"

... these features to the decision classes. The importance of SRL is also indicated by the analysis of the contribution of individual features to the overall performance. Table 6 shows the performance ... Experiments 3.1 Performance Metrics We report in the following tables the MUC score (Vilain et al., 1995). Scores in Table 2 are computed for all noun phrases appearing in ei...

Uploaded: 08/03/2014, 21:20

Scientific report: "Cognitively Motivated Features for Readability Assessment"

... identify the common nouns in the document, and we find the union of the common nouns and the named entity noun phrases in the text. The union of these two sets is our definition of “entity” for ... in the form of average error scores. (For each article in the Weekly Reader testing data, we calculate the difference between the output score of the model and the...

Uploaded: 08/03/2014, 21:20
