Scientific report: "ConsentCanvas: Automatic Texturing for Improved Readability in End-User License Agreements"
... Proceedings of the ACL-HLT 2011 Student Session, pages 41–45, Portland, OR, USA 19-24 June 2011. © 2011 Association for Computational Linguistics ConsentCanvas: Automatic Texturing for Improved Readability ... minimizing the cognitive and legal burden for both the end user and the licensor. Our system does not require a corpus for training. 1 Introduction Less than 2% o...
Uploaded: 23/03/2014, 16:20
... Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 603–610, Sydney, July 2006. © 2006 Association for Computational Linguistics An Automatic Method for Summary Evaluation ... conversations into English. They also generated translations using a system for each conversation. Then, they evaluated both translations using an automatic method, and obtained W H ,...
Uploaded: 17/03/2014, 04:20
... such information. With the baseline lexicon, we performed the EM algorithm as in Table 2 to train the trigram LM. Here we used a 313 MB LM training corpus, which contains text news articles in ... in its own cluster. Adding new words into the lexicon, on the other hand, offers explicit reinforcement in PP of the reference characters. Such reinforcement offers the main positive boost...
Uploaded: 20/02/2014, 07:20
Scientific report: "Probabilistic Document Modeling for Syntax Removal in Text Summarization"
... 2010 datasets using the ROUGE metric. 1 Introduction While the dominant problem in Information Retrieval in the first part of the century was finding relevant information within a datastream ... TAC involve taking a set of documents as input and outputting a short summary (either 100 or 250 words, depending on the year) containing what the system deems to be the most important informati...
Uploaded: 07/03/2014, 22:20
Scientific report: "Bypassed Alignment Graph for Learning Coordination in Japanese Sentences"
... listed above are concerned mainly with scope disambiguation, reflecting the fact that detecting the presence of coordinations in a sentence (Task 1) is straightforward in English. Indeed, nearly 100% precision ... only in the first word. Both contain a particle to, which is one of the most frequent coordination markers in Japanese—but only the first sentence contains a coordinate str...
Uploaded: 08/03/2014, 01:20
Scientific report: "Statistical Machine Translation for Query Expansion in Answer Retrieval"
... $p_{LM}(\mathrm{syn}_1^I)^{\lambda_{LM}}$ For estimation of the feature weights $\lambda$ defined in equation (4) we employed minimum error rate (MER) training under the BLEU measure (Och, 2003). Training data for MER training were ... practice imagination concentration information consciousness different meditation relaxation qa-translation (-): birth industrial induced induces paraphrasing (-): way workers induc...
Uploaded: 08/03/2014, 02:21
Scientific report: "Exploiting Parallel Texts for Word Sense Disambiguation: An Empirical Study"
... number of training examples in each sense in such a new training set is the same as that in the official training data set of w. A WSD classifier was then trained on this new training set, and ... sense, according to some existing sense inventory in a dictionary. This annotated corpus then serves as the training material for a learning algorithm. After training, a model...
Uploaded: 08/03/2014, 04:22
Scientific report: "Machine-learned contexts for linguistic operations in German sentence realization"
... complete, spanning parse: 85.14% of the sentences in the training and parameter tuning set, and 84.59% in the blind test set fall into that category. Most sentences yield more than one training case. ... Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 25-32. ... a machine learning approach. The linguistically...
Uploaded: 08/03/2014, 07:20
Scientific report: "Data point selection for cross-language adaptation of dependency parsers"
... mapped into a common tagset using the technique described in Zeman and Resnik (2008). For our main results, which are presented in Figure 1, we use the remaining three treebanks as training ... training material for each language. The test section of the language in question is used for testing, while the POS sequences in the target training section is used for training...
Uploaded: 17/03/2014, 00:20
Scientific report: "Exploiting Feature Hierarchy for Transfer Learning in Named Entity Recognition"
... call tuning), using the previously trained prior for regularization. If we are unable to find a match between features in the training and tuning datasets (for instance, if a word appears in the ... trained classifier to make predictions. In the paradigm of inductive learning, $(X_{train}, Y_{train})$ are known, while both $X_{test}$ and $Y_{test}$ are completely hidden during training tim...
Uploaded: 23/03/2014, 17:20