Scientific report: "Improving On-line Handwritten Recognition using Translation Models in Multimodal Interactive Machine Translation" docx

... Linguistics Improving On-line Handwritten Recognition using Translation Models in Multimodal Interactive Machine Translation Vicent Alabau, Alberto Sanchis, Francisco Casacuberta Institut Tecnològic d’Informàtica Universitat ... Vera, s/n, Valencia, Spain {valabau,asanchis,fcn}@iti.upv.es Abstract In interactive machine translation (IMT), a human expert i...

Uploaded: 07/03/2014, 22:20

Document: Scientific report: "Improving Automatic Speech Recognition for Lectures through Transformation-based Rules Learned from Minimal Data" ppt

... by partitioning each 50-minute lecture into a training and a test set, where the training set is smaller than the test set. As mentioned in the introduction, it is feasible to obtain manual transcripts for ... ρ(r_best, T_ASR)) and the remaining rules are scored on the transformed training text. This ensures that the scoring and ranking of remaining rules takes into account the chan...

Uploaded: 20/02/2014, 07:20

Scientific report: "Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia" potx

... performance using four sets of features: (i) Monolingual Wiki-tagger based, using only the features in Group 1 (MONO); (ii) Bilingual label match and Wiki-tagger based, using features in Groups ... phrases in Wikipedia, using Wikipedia metadata. The following sources of information were used from Wikipedia: category annotations on English documents, article links which l...

Uploaded: 23/03/2014, 14:20

Scientific report: "Arabic Named Entity Recognition: Using Features Extracted from Noisy Data" doc

... class All. The baseline, FreqBaseline, assigns a test token the most frequent tag observed for it in the gold training data; if a test token is not observed in the training data, it is assigned ... obtained model outperformed the baseline. More recently, in (Chen and Ji, 2009), the authors report their comparative study between monolingual and cross-lingual bootstrapping. F...

Uploaded: 23/03/2014, 16:20

Document: Scientific report: "Learning Syntactic Verb Frames Using Graphical Models" doc

... to VALEX using pGRs with a narrow window width. Since POS tagging is more reliable and robust across domains than parsing, retraining on new domains will not suffer the effects of a mismatched parsing model ... lexicon for biomedical information extraction. In Computational Linguistics and Intelligent Text Processing. Springer Berlin / Heidelberg. 429 Proceedings of the 50th Annual Me...

Uploaded: 19/02/2014, 19:20

Document: Scientific report: "An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation" docx

... In this way, our program can introduce some randomness into weight training. Hence users do not need to repeat MERT for obtaining stable and optimized weights using different starting points. ... target-language corpus. Finally, the resulting models are incorporated into the decoder which can automatically tune feature weights on the development set using minimum error rate tra...

Uploaded: 19/02/2014, 20:20

Document: Scientific report: "Towards History-based Grammars: Using Richer Models for Probabilistic Parsing*" docx

... tailoring via the usual linguistic introspection in the hope of generating the correct parse. In head-to-head tests against one of the best existing robust probabilistic parsing models, ... definition of a history in the HBG model) and the corresponding rule used in expanding a node. Using the resulting data set we built a decision tree by classifying histories to local...

Uploaded: 20/02/2014, 21:20

Scientific report: "A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation" pptx

... analysis using binary branching structures under word alignment and parse tree constraints. Bod (2007) also finds that discontinuous phrasal rules make significant improvement in linguistically ... non-contiguous phrase modeling in both syntax-based and phrase-based systems. We also find that in Chinese-English translation task, gaps are more effective in Chinese side than...

Uploaded: 17/03/2014, 01:20

Scientific report: "A Comparative Study of Hypothesis Alignment and its Improvement for Machine Translation System Combination" pot

... decoding shows the best performance in combining outputs from multiple machine translation (MT) systems. However, overcoming different word orders presented in multiple MT systems during ... Networks for Combining Machine Translation Systems. In Proceedings of COLING 2008, pp. 33–40. Manchester, Aug. S. Bangalore, G. Bordel, and G. Riccardi. 2001. Computing consensu...

Uploaded: 17/03/2014, 01:20

Scientific report: "Demonstration of Joshua: An Open Source Toolkit for Parsing-based Machine Translation" potx

... description of Joshua’s main features, described in more detail in Li et al. (2009a): • Training Corpus Sub-sampling: We support inducing a grammar from a subset of the training data, that consists ... proposed by Kishore Papineni (personal communication), outlined in further detail in (Li et al., 2009a). The method achieves a 90% reduction in training corpus size while main...

Uploaded: 17/03/2014, 02:20
