Scientific report: "A Cross-Lingual ILP Solution to Zero Anaphora Resolution" potx

... Association for Computational Linguistics A Cross-Lingual ILP Solution to Zero Anaphora Resolution Ryu Iida Tokyo Institute of Technology 2-12-1, Ôokayama, Meguro, Tokyo 152-8552, Japan ryu-i@cl.cs.titech.ac.jp Massimo ... (7) 3.2 A subject detection model The greatest difficulty in zero anaphora resolution in comparison to, say, pronoun resolution, is zero anaphora dete...

Uploaded: 07/03/2014, 22:20

10 510 0
Scientific report: "A Syntax-Free Approach to Japanese Sentence Compression" potx

... Syntax-Free Approach to Japanese Sentence Compression Tsutomu HIRAO, Jun SUZUKI and Hideki ISOZAKI NTT Communication Science Laboratories, NTT Corp. 2-4 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0237 ... summarization systems often have to process megabytes of documents. Parsers are still slow and users of on-demand summarization systems are not prepared to wait for parsing to...

Uploaded: 08/03/2014, 00:20

8 464 0
Document: Scientific report: "A Ranking-based Approach to Word Reordering for Statistical Machine Translation" doc

... converted to dependency trees using Stanford Parser (Marneffe et al., 2006). We convert the tokens in training data to lower case, and re-tokenize the sentences using the same tokenizer from ... sensitive to parser errors; on the other hand, integrated model is forced to use a longer distortion limit which leads to more search errors during decoding time. It is possible to 9...

Uploaded: 19/02/2014, 19:20

9 616 0
Document: Scientific report: "An Equivalent Pseudoword Solution to Chinese Word Sense Disambiguation" ppt

... methods lies in the knowledge acquisition solutions they adopt. 2.1 Automatic Generation of Training Corpus Automatic corpus tagging is a solution to WSD, which generates large-scale corpus ... EPs. A Chinese thesaurus is adopted and revised to meet this demand. Extended Version of TongYiCiCiLin To extend the TongYiCiCiLin (Cilin) to hold more words, several linguistic res...

Uploaded: 20/02/2014, 12:20

8 414 0
Document: Scientific report: "A Fully Bayesian Approach to Unsupervised Part-of-Speech Tagging∗" docx

... differences hold to a lesser degree when a partial dictionary is provided. With MLHMM, different tokens of the same word type are usually assigned to the same cluster, but types are assigned to clusters ... t. We will use τ and ω to refer to the entire transition and output parameter sets. This model assumes that the prior over state transitions is the same for all histories, an...

Uploaded: 20/02/2014, 12:20

8 524 0
Document: Scientific report: "A Feature Based Approach to Leveraging Context for Classifying Newsgroup Style Discussion Segments" pptx

... state of a simple finite-state automaton that only has two states. The automaton is set to initial state (q0) at the top of a message. It makes a transition to state (q1) when it encounters ... is to enable the quality and nature of discussions that occur within an on-line discussion board to be communicated in a summary to a potential newcomer or group moderators. We p...

Uploaded: 20/02/2014, 12:20

4 519 0
Document: Scientific report: "A Limited-Domain English to Japanese Medical Speech Translator Built Using REGULUS 2" doc

... on the other. We propose to demonstrate a prototype system instantiating this architecture, which has been built on top of the Open Source REGULUS 2 platform. The prototype translates spoken ... Road Mountain View, CA 94040 vvandal3@aol.com Hitoshi Isahara, Kyoko Kanzaki Communications Research Laboratory 3-5 Hikaridai Seika-cho, Soraku-gun Kyoto, Japan 619-0289 {isahara,kanzaki}@crl.go.j...

Uploaded: 20/02/2014, 16:20

4 393 0
Scientific report: "A Nonparametric Bayesian Approach to Acoustic Model Discovery" docx

... the future, we plan to explore phonological context and use more flexible topological structures to model acoustic units within our framework. Acknowledgements The authors would like to thank Hung-an ... R^39 to denote the t-th feature frame of the i-th utterance. Fig. 1 illustrates how the speech signal of a single word utterance banana is converted to a sequence of feature vectors...

Uploaded: 07/03/2014, 18:20

10 478 0
Scientific report: "A Two-step Approach to Sentence Compression of Spoken Utterances" pdf

... first step, 8 annotators were asked to select words to be removed to compress the sentences. In the second step, 6 annotators (different from the first step) were asked to pick the best one ... and reranking is able to yield additional gain, especially when training is performed to take into account multiple references. 1 Introduction Sentence compression aims to preserve th...

Uploaded: 07/03/2014, 18:20

5 426 1
Scientific report: " a Movie Dialogue Corpus for Research and Development" potx

... of the resulting dialogue collection. Total number of scripts collected 911 Total number of scripts processed 753 Total number of dialogues 132,229 Total number of speaker turns 764,146 Average ... inserted within the turn, misspelling of speaker names, etc. In addition to this, a semi-automatic process was still necessary to filter out movie scripts exhibiting extremely differ...

Uploaded: 07/03/2014, 18:20

5 424 0