... tokenization/morpheme segmentation for alignment (Chung and Gildea, 2009; Habash and Sadat, 2006), we find that the best segmentation for alignment does not coincide with the gold-standard segmentation and ... sequence of morphemes), and alignment from target to source morphemes, given a source language sequence of words (each consisting of a sequence of morphemes). An example morpheme segmentation and alignment ... Linguistics, pages 895–904, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics Unsupervised Bilingual Morpheme Segmentation and Alignment with Context-rich Hidden...
Uploaded: 17/03/2014, 00:20
... that performs joint word alignment and phrase extraction, and found that joint estimation of word alignments and extraction sets improves both word alignment accuracy and translation results. In ... perform weight tuning and testing on specified development and test sets. We compare the accuracy of our proposed method of joint phrase alignment and extraction using the FLAT, HIER and HLEN models, ... model. For GIZA++, we use the standard training regimen up to Model 4, and combine alignments with grow-diag-final-and. For the proposed models, we train for 100 iterations, and use the final sample...
Uploaded: 20/02/2014, 04:20
English morpheme system and some applications of learning morpheme in establishing words
... and faster). 1.2.2. Affixational morpheme The affixational morpheme is further divided into inflectional and derivational morphemes. 1.2.2.1. The inflectional morpheme Inflectional morphemes ... called the lexical morpheme or simply the root. Example: book, system, school, etc. [Diagram labels: Morpheme; Root morpheme; Affixational morpheme; free morpheme; Bound morpheme; The inflectional morpheme; The ...] morphemes. Note that grammatical morphemes include forms that we can consider to be words like the, a, and, and of and others that make up parts of words like -s and -ed. English morpheme system Luong...
Uploaded: 08/04/2013, 09:31
Document: Scientific report: "Joint Word Segmentation and POS Tagging using a Single Perceptron" docx
... Daumé III and Marcu, 2005; Finkel et al., 2006) and for specific problems such as language modeling and utterance classification (Saraclar and Roark, 2005) and labeling and chunking (Shimizu and Haas, ... sentence, and T is the size of the tag set (T = 1 for pure word segmentation). It worked well for word segmentation alone (Zhang and Clark, 2007), even with an agenda size as small as 8, and a simple ... tagger, and the best output is selected using the overall POS-segmentation probability score. In this system, the decoding for word segmentation and POS tagging are still performed separately, and...
Uploaded: 20/02/2014, 09:20
Document: Scientific report: "A HARDWARE ALGORITHM FOR HIGH SPEED MORPHEME EXTRACTION AND ITS IMPLEMENTATION" pptx
... various candidates. For example, "~ ~", one of the extracted morphemes, is composed of the 2nd candidate at the 1st position, the 1st candidate at the 2nd position and the 3rd candidate ... corresponds to the candidate level. Candidates on the same level form one stream. For example, in Fig. 6(a), the character at the 3rd position has three candidates: the 1st candidate is '~', ... MACHINE DESIGN STRATEGY 2.1 MORPHEME EXTRACTION Morphological analysis methods are generally composed of two processes: (1) a morpheme extraction process and (2) a morpheme determination...
Uploaded: 21/02/2014, 20:20
Scientific report: "A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" pdf
... p (position $i - l$), and select for position $i$ a N-best list of candidate results from all these candidates. When we derive a candidate result from a word-POS pair p and a candidate q at prior ... segmentation only and joint segmentation and part-of-speech tagging. On the Penn Chinese Treebank 5.0, we obtain an error reduction of 18.5% on segmentation and 12% on joint segmentation and part-of-speech ... Segmentation and POS Tagging Given a Chinese character sequence: $C_{1:n} = C_1 C_2 \ldots C_n$ the segmentation result can be depicted as: $C_{1:e_1}\, C_{e_1+1:e_2} \ldots C_{e_{m-1}+1:e_m}$ while the segmentation and...
Uploaded: 08/03/2014, 01:20
Scientific report: "Using Similarity Scoring To Improve the Bilingual Dictionary for Word Alignment" doc
... show a significant improvement in precision and recall for word alignment when the improved dictionary is used. 1 Introduction and Related Work Word alignment is a well-studied problem in Natural ... 0.0026 0.0008 0.0037 Table 1: Percent improvement and p-value for recall and precision, comparing baseline and rebuilt dictionaries at minscore 50 and maxlinks 1. for these parameter settings, ... algorithm is robust - it improves alignment regardless of how many links are allowed per word, what baseline dictionary is used, and boosts both precision and recall, and thus also the f-measure. To...
Uploaded: 08/03/2014, 07:20
Scientific report: "A Stacked Sub-Word Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" potx
... avoiding segmentation error propagation and exploiting POS information to help segmentation. A challenge for joint approaches is the large combined search space, which makes efficient decoding and ... Word-based and character-based word segmentation models: Comparison and combination. In Coling 2010: Posters, pages 1211–1219, Beijing, China, August. Coling 2010 Organizing Committee. André ... Linguistics. ... and was then further used in POS tagging in (Zhang and Clark, 2008). In our previous work (Sun, 2010), we presented a theoretical and empirical comparative analysis of character-based and...
Uploaded: 17/03/2014, 00:20
Scientific report: "Joint Hebrew Segmentation and Parsing using a PCFG-LA Lattice Parser" docx
... Adjectives and Nouns (which should agree on Gender and Number and definiteness), and between Subjects and Verbs (which should agree on Gender and Number). 3 PCFG-LA Grammar Estimation Klein and Manning ... joint segmentation and parsing of Chinese, empty element prediction (see (Cai et al., 2011) for a successful application), and a principled handling of multiword-expressions, idioms and named-entities. ... Hebrew are 84.1%F assuming gold segmentation and tagging (Tsarfaty and Sima’an, 2010) 9 , and 73.7%F starting from unsegmented text (Goldberg et 5 The segmentation+tagging accuracy of the HMM tagger...
Uploaded: 17/03/2014, 00:20
Scientific report: "An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging" docx
... ACL and AFNLP An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging Canasai Kruengkrai †‡ and Kiyotaka Uchimoto ‡ and Jun’ichi Kazama ‡ Yiou Wang ‡ and ... Yamamoto, and Yuji Matsumoto. 2004. Applying conditional random fields to Japanese morphological analysis. In Proceedings of EMNLP, pages 230–237. John Lafferty, Andrew McCallum, and Fernando Pereira. ... morphology and context. In Proceedings of ACL, pages 277–284. Tetsuji Nakagawa and Kiyotaka Uchimoto. 2007. A hybrid approach to word segmentation and POS tagging. In Proceedings of ACL Demo and...
Uploaded: 17/03/2014, 01:20
Scientific report: "Automatic Adaptation of Annotation Standards: Chinese Word Segmentation and POS Tagging – A Case Study" potx
... standard. And correspondingly, the two annotation standards are naturally denoted as source standard and target standard, while the classifiers following the two annotation standards are respectively named ... the ACL and the 4th IJCNLP of the AFNLP, pages 522–530, Suntec, Singapore, 2-7 August 2009. © 2009 ACL and AFNLP Automatic Adaptation of Annotation Standards: Chinese Word Segmentation and POS ... the efficacy of this method in the context of Chinese word segmentation and part-of-speech tagging, where no segmentation and POS tagging standards are widely accepted due to the lack of morphology...
Uploaded: 17/03/2014, 01:20
Scientific report: "Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling" pdf
... Cover and J. Thomas, (1991). Elements of Information Theory, John Wiley & Sons. R. Duda and P. Hart. (1973). Pattern Classification and Scene Analysis, John Wiley & Sons. Y. Grandvalet and ... art supervised CRF of McDonald and Pereira (2005), and also to self-training (Celeux and Govaert 1992; Yarowsky 1995), using the same feature set as (McDonald and Pereira 2005). The CRF training ... the gradient, and thereby allows us to perform efficient iterative ascent for training. We apply our new training technique to the problem of sequence labeling and segmentation, and demonstrate...
Uploaded: 17/03/2014, 04:20
Scientific report: "Using Bilingual Comparable Corpora and Semi-supervised Clustering for Topic Tracking" ppt
... 2002. I. Dagan and K. Church, Termight: Coordinating humans and machines in bilingual terminology acquisition, Journal of MT, Vol. 20, No. 1, pp. 89-107, 1997. M. Franz and J. S. McCarley, Unsupervised and ... Research, and International Communications Foundation. References J. Allan and R. Papka and V. Lavrenko, On-line new event detection and tracking, Proc. of the DARPA Workshop, 1998. J. Allan and V. Lavrenko ... (8) $C_{Miss}$, $C_{Fa}$, and $P_{Target}$ are the costs of a missed detection, false alarm, and a priori probability of finding a target, respectively. $C_{Miss}$, $C_{Fa}$, and $P_{Target}$ are usually set to 10, 1, and 0.02,...
Uploaded: 17/03/2014, 04:20
Scientific report: "Semantic Discourse Segmentation and Labeling for Route Instructions" potx
... in "go straight and make the first left you can, then go into the first door on the right side and stop", LEFT and FIRST occur exactly once for the first action, and FIRST, DOOR and RIGHT are found ... "room", "doorway" and their plural forms map to DOOR, and the ordinal number 1 will be represented by "first" and "1st", and so on. 5 Dataset As noted, we have 427 route instructions, and the average ... attempt shallow understandings and broad coverage, for these domains vocabulary is limited and very strong domain knowledge is available. Despite this, deeper understanding of unrestricted natural language...
Uploaded: 17/03/2014, 04:20
Scientific report: "TBL-Improved Non-Deterministic Segmentation and POS Tagging for a Chinese Parser" pdf
... because segmentation and POS tagging standards vary, and our test data have not been used for a final evaluation before. Nevertheless, there are of course systems that perform word segmentation and ... were published: Florian and Ngai (2001) report an SF of 93.55% and a TA of 88.86% on CTB data. Ng and Low (2004) report an SF of 95.2% and a TA of 91.9% on CTB data. Finally, Zhang and Clark (2008) ... to the PKU gold standard. We will then induce obligatory and optional FST rules from this “grammar-compliant” gold standard and hope that these will be able to replace the hand-crafted transformation rules...
Uploaded: 17/03/2014, 22:20
Scientific report: "Unsupervised Recognition of Literal and Non-Literal Use of Idiomatic Expressions" potx
Uploaded: 17/03/2014, 22:20
Scientific report: "Fast Online Training with Frequency-Adaptive Learning Rates for Chinese Word Segmentation and New Word Detection" docx
Uploaded: 23/03/2014, 14:20
Scientific report: "Joint Evaluation of Morphological Segmentation and Syntactic Parsing" pptx
Uploaded: 23/03/2014, 14:20
Scientific report: "A Bilingual Context Mining and Sentiment Analysis Summarization System" pot
Uploaded: 23/03/2014, 14:20
Scientific report: "Coordinate Structure Analysis with Global Structural Constraints and Alignment-Based Local Features" pot
Uploaded: 23/03/2014, 16:21