Scientific report document: "Mining the Web for Language Learning" pdf

... round of the mining process. The second layer consists of the extractor, the filter, the classifiers and the readability evaluator, which are applied sequentially. The extractor scans the raw web page ... consists of the crawler and the raw web page storage. The crawler periodically downloads two kinds of web pages, which are put into the storage. The first kind...

Uploaded: 20/02/2014, 05:20

6 658 0
Scientific report document: "Mining Wiki Resources for Multilingual Named Entity Recognition" pdf

... in the last link, the phrase preceding the vertical bar is the name of the article, while the following phrase is what is actually displayed to a visitor of the webpage. Near the end of the ... another language). The authors noted that their results would need to pass a manual supervision step before being useful for the NER task, and thus did not evaluate the...

Uploaded: 20/02/2014, 09:20

9 429 1
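
The link convention quoted in the snippet above (article name before the vertical bar, displayed text after it) can be illustrated with a minimal sketch. It assumes the double-square-bracket form of wiki markup, `[[Article|displayed text]]`; the pattern name `WIKI_LINK` and the function `parse_wiki_links` are hypothetical and are not taken from the paper.

```python
import re

# Assumed markup: [[Article]] or [[Article|Displayed text]].
WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def parse_wiki_links(text):
    """Return (article_name, displayed_text) pairs for each link in text."""
    links = []
    for match in WIKI_LINK.finditer(text):
        article = match.group(1).strip()
        shown = match.group(2)
        # Without a vertical bar (or with an empty label), this sketch
        # falls back to the article name as the displayed text.
        displayed = shown.strip() if shown else article
        links.append((article, displayed))
    return links


if __name__ == "__main__":
    sample = "See [[Paris|the French capital]] and [[London]]."
    for article, displayed in parse_wiki_links(sample):
        print(article, "->", displayed)
```

Keeping both halves of each pair is what makes such links usable as a name resource: the displayed text is a surface form, and the article name is the entry it points to.
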
Scientific report document: "Analyzing the Errors of Unsupervised Learning" docx

... system. If these match the empirical counts, then the M-step does not change the parameters. But if the supervised system predicts too many JJs, for example, then the M-step will update the parameters ... on the distance from the true θ* for the HMM as we increase the number of examples. In the unsupervised case, we use the following procedure to obtain a surrogat...

Uploaded: 20/02/2014, 09:20

9 490 0
Scientific report document: "A Modular Toolkit for Coreference Resolution" pdf

... as well as additional information such as part-of-speech tags and merging this information into markables that are the starting point for the mentions used by the coreference resolution ... chunker, with the Stanford POS tagger (Toutanova et al., 2003), the YamCha chunker (Kudoh and Matsumoto, 2000) and the Stanford Named Entity Recognizer (Finkel et al., 2005), the...

Uploaded: 20/02/2014, 09:20

4 419 0
Scientific report document: "Conditional Modality Fusion for Coreference Resolution" pdf

... compared to training them jointly, because independent training of the modality-specific classifiers forces them to account for data that they cannot possibly explain. For example, if the speaker is not ... x_i; w) The form of the potential function ψ is where our intuitions about the role of the hidden variable are formalized. Our goal is to include the non-verbal featur...

Uploaded: 20/02/2014, 12:20

8 347 0
Scientific report document: "REPRESENTATION OF TEXTS FOR INFORMATION RETRIEVAL" pdf

... stands within the constraints, and test whether it can be progressively modified in response to observed deficiencies, until either the desired level of performance in solving the problem ... ing users' information needs, their anomalous states of knowledge, when they approach the system. The analysis produced graph-like structures, or association maps, of the abstr...

Uploaded: 21/02/2014, 20:20

2 419 0
Scientific report document: "A LOGICAL SEMANTICS FOR FEATURE STRUCTURES" pdf

... structures. Figure 3 defines the syntax of well-formed formulas. In the following sections symbols from the Greek alphabet are used to stand for arbitrary formulas in FML. The formulas NIL and TOP ... the satisfiability problem for CNF formulas of propositional logic can be reduced to the consistency (or satisfiability) problem for formulas in FML. Thus, the...

Uploaded: 21/02/2014, 20:20

10 421 0
Scientific report: "Mining the Web for Bilingual Text" pot

... [END:TITLE]. The number inside the chunk token is the length of the text chunk, not counting whitespace; from this point on only the length of the text chunks is used, and therefore the structural ... to the translated page in the other language. Exploration of the Web suggests that parent pages and sibling pages cover the major relationships between paral...

Uploaded: 08/03/2014, 06:20

8 229 0
Scientific report document: "Mining metalinguistic activity in corpora to create lexical resources using Information Extraction techniques: the MOP system" doc

... credited to the fact that the writer needs to mark these sentences for special processing by the reader, as they dissect across two different semiotic levels: a metalanguage and its object language, ... similarity of 65% for comparison between a golden standard slot entry and the one provided by the application. Thus, if the autonym or the informational segment is a...

Uploaded: 20/02/2014, 15:20

8 459 0
Scientific report document: Seeking the determinants of the elusive functions of Sco proteins pptx

... mitochondrial matrix, the transmembrane helix and the following ~20 residues) is crucial to determining the aggregation state of these proteins. The data therefore support the hypothesis that this ... close to the N-terminal transmembrane helix anchoring the protein to the inner membrane of mitochondria [27]. Therefore, the N-terminal segment (containing the residue...

Uploaded: 14/02/2014, 18:20

19 744 0