Scientific report: "Improving Statistical Machine Translation with Monolingual Collocation" pdf

... 11-16 July 2010. c 2010 Association for Computational Linguistics Improving Statistical Machine Translation with Monolingual Collocation Zhanyi Liu 1 , Haifeng Wang 2 , Hua Wu 2 , Sheng Li 1 ... paper proposes to use monolingual collocations to improve Statistical Ma- chine Translation (SMT). We make use of the collocation probabilities, which are estimated from monol...

Uploaded: 20/02/2014, 04:20

Scientific report: "What Should Machine Translation Be?" pot

Uploaded: 21/02/2014, 20:20

Scientific report: "Resolution for Machine Translation of Telegraphic Messages" docx

... leads to a mistranslation in a machine translation system. Therefore, the issue becomes how to parse telegraphic messages accurately and efficiently to produce high quality translation output. ... Misparsing reduced by omissions has a far-reaching consequence in machine translation. Namely, a misparse of the input often leads to a translation into the target language which...

Uploaded: 22/02/2014, 03:20

Scientific report: "SUBLANGUAGES IN MACHINE TRANSLATION" pdf

... system within the computer-aided Saarbrücken Translation System (STS), i.e. in human-aided MT and in machine-aided human translation. Titles of scientific papers from German databases were machine-translated ... sublanguage notion for disambiguation and the selection of target language equivalents in machine translation. In this paper a theoretical concept and its imple-...

Uploaded: 22/02/2014, 10:20

Scientific report: "A Statistical Model for Unsupervised and Semi-supervised Transliteration Mining" pptx

... the transliteration pairs. We propose a second model p2(e, f) to deal with non-transliteration pairs (the “non-transliteration model”). Interpolation with the non-transliteration model allows the transliteration model ... initialized with a uniform distribution and λ is set to 0.5. The expected count of a multigram q (E-step) is computed by multiplying the posterior probability of...

Uploaded: 19/02/2014, 19:20

Scientific report: "Improving Word Representations via Global Context and Multiple Word Prototypes" pdf

... learning algorithms and as extra word features in NLP systems. However, most of these models are built with only local context and one representation per word. This is problematic because words are ... accounts for homonymy and polysemy by learning multiple embeddings per word. We introduce a new dataset with human judgments on pairs of words in sentential context, and evaluate our mo...

Uploaded: 19/02/2014, 19:20

Scientific report: "Improving Chinese Semantic Role Labeling with Rich Syntactic Features" ppt

... information of sub-trees in a given parse. With help of these new features, our system achieves 93.49 F-measure with hand-crafted parses. Comparison with the best reported results, 92.0 (Xue, ... arguments of a predicate are labeled with a contiguous sequence of integers, in the form of AN (N is a natural number); the adjuncts are annotated as such with the label AM followed b...

Uploaded: 20/02/2014, 04:20

Scientific report: "Improving Classification of Medical Assertions in Clinical Notes" pdf

... instances with that label as positive instances and instances with any other label as negative instances. The final class label is assigned by choosing the class that was assigned with the ... its performance with our original system. 4.1 Data The training set includes 349 clinical notes, with 11,967 assertions of medical problems. The test set includes 477 texts with 18...

Uploaded: 20/02/2014, 05:20

Scientific report: "Improving Automatic Speech Recognition for Lectures through Transformation-based Rules Learned from Minimal Data" ppt

... Li, 2007). 1 Even with all of these, however, there remains a significant gap between this WER and the threshold of 25%, at which lecture transcripts have been shown with statistical significance ... you ⇓ Output all rules for replacing the incorrect ASR sequence with the correct text, using the entire sequence (a) or splices (b), with or without surrounding anchors: (a) the okay on...

Uploaded: 20/02/2014, 07:20

Scientific report: "Improving the Scalability of Semi-Markov Conditional Random Fields for Named Entity Recognition" pdf

... On the other hand, the system with preceding information is not significantly better than the system without it 5. Other non-local information may improve performance with our framework and this ... of the classifier on development data is 74.64 (without preceding information) and 75.14 (with preceding information). Table 5: Performance with filtering on the development data. (&l...

Uploaded: 20/02/2014, 12:20
