... on the machine in linear time for the number of candidates, while conventional sequential algorithms are implemented in combinational time. 1 INTRODUCTION Recent advancement in natural ... characters from the input text into the character registers in the shift register block. 309 Sub-Strings Key String for Multiple Text Streams from Dictionary Block in Shift Regost...
Uploaded: 21/02/2014, 20:20
... Christopher Manning 2005. Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling. In Proc. of ACL. Nizar Habash, 2007. Syntactic Preprocessing for Statistical Machine ... training data that maps the segmented form of the word to its original form. The table is also useful in recombining words that are erroneously segmented. If a certain word does...
Uploaded: 22/02/2014, 02:20
Scientific report: "Probabilistic Document Modeling for Syntax Removal in Text Summarization" ppt
... Chain Monte Carlo In Practice. Chapman and Hall/CRC. Thomas L. Griffiths, Mark Steyvers, David M. Blei, and Joshua B. Tenenbaum. 2005. Integrating topics and syntax. In Advances in Neural Information ... summarizer: exploring the factors that influence summarization. In SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in i...
Uploaded: 07/03/2014, 22:20
Scientific report: "Bypassed Alignment Graph for Learning Coordination in Japanese Sentences" doc
... points improvement over the original SH in terms of F1 measure in coordination scope detection. Adding bypasses to alignment graphs further improved the performance, making a total of +4.7 points ... than that of any paths in the original alignment graph, the input sentence is deemed not to contain coordinations. We assign to the bypass two types of features capturing the characterist...
Uploaded: 08/03/2014, 01:20
Scientific report: "Statistical Machine Translation for Query Expansion in Answer Retrieval" pptx
... $p_{LM}(\mathit{syn}_1^I)^{\lambda_{LM}}$ For estimation of the feature weights λ defined in equation (4) we employed minimum error rate (MER) training under the BLEU measure (Och, 2003). Training data for MER training were ... in the question-answer corpus as two distinct languages. That is, the 10 million question-answer pairs extracted from FAQ pages are fed as parallel training data into an SMT t...
Uploaded: 08/03/2014, 02:21
Scientific report: "Machine-learned contexts for linguistic operations in German sentence realization" doc
... complete, spanning parse: 85.14% of the sentences in the training and parameter tuning set, and 84.59% in the blind test set fall into that category. Most sentences yield more than one training case. ... clauses (e.g., in imperatives), the finite verb is in initial position. Verb-second sentences contain one constituent preceding the finite verb, in the so-called “pre-field”....
Uploaded: 08/03/2014, 07:20
Scientific report: A new paradigm for oxygen binding involving two types of αβ contacts docx
... a basis for the continuity of haemoglobin and myoglobin functions in vivo, since the autoxidation reaction is inevitable in nature for all oxygen-binding haem proteins [21,23,24], as well as for ... contacts in HbA In haemoglobin (Hb) research, the central problem is understanding the mechanism for the cooperative oxygen binding to the α2β2 tetramer. For human HbA, the a...
Uploaded: 17/03/2014, 10:20
Scientific report: "ConsentCanvas: Automatic Texturing for Improved Readability in EndUser License Agreements" pot
... enable customized texturing of EULAs and facilitate experimentation for understanding and evaluating gains in comprehension and readability. Finally, we will conduct a formal user evaluation ... passed to our rendering system, which inserts the corresponding HTML5 tags at the positions in original plaintext EULA. We append a header to the output document to include the linked...
Uploaded: 23/03/2014, 16:20
Scientific report: "Exploiting Feature Hierarchy for Transfer Learning in Named Entity Recognition" ppt
... call tuning), using the previously trained prior for regularization. If we are unable to find a match between features in the training and tuning datasets (for instance, if a word appears in the ... for identifying names and ontological relations in text using heuristics for inducing regularities from data. http://minorthird.sourceforge.net. Hal Daumé III and Daniel Marcu....
Uploaded: 23/03/2014, 17:20
Scientific report: "Feature-based Method for Document Alignment in Comparable News Corpora" ppt
... http://www.straitstimes.com/ an English news agency in Singapore. Source © Singapore Press Holdings Ltd. 3 http://www.zaobao.com/ a Chinese news agency in Singapore. Source © Singapore Press Holdings ... Linguistic Independent Unit Linguistic Independent Unit score (LIU) is defined as the piece of information, which is written in the same way for different languages. T...
Uploaded: 24/03/2014, 03:20