Scientific report: "One Tokenization per Source"

... One Tokenization per Source Jin GUO Kent Ridge Digital Labs 21 Heng Mui Keng Terrace, Singapore 119613 Abstract We report in this paper the observation of one tokenization per source. ... occurrences, have realized different tokenizations, and 0.08% tokenization errors would be introduced if forced to take one tokenization per fragment. 2.4 Tokenization Criteria Some...

Uploaded: 17/03/2014, 07:20

Document: Scientific report: "Arabic Tokenization, Part-of-Speech Tagging and Morphological Disambiguation in One Fell Swoop"

... affixes to perform tokenization, for any reasonable tokenization scheme. Finally, we can determine the POS tag, for any morphologically motivated POS tagset. Thus, we have performed tokenization, ... du(al), pl(ural) V, N, PN, AJ, PRO, REL, D sg Per Person 1, 2, 3 V, N, PN, PRO 3 Voice Voice act(ive), pass(ive) V act Asp Aspect imp(erfective), perf(ective), imperative V perf Figure 2...

Uploaded: 20/02/2014, 15:20

Document: Scientific report: "Poliqarp An open source corpus indexer and search engine with syntactic extensions"

... particular, the paper discusses the motivation for such a new tool, the extended query syntax of Poliqarp and implementation and efficiency issues. 1 Introduction The aim of this paper is to present ... operator has the existential meaning, i.e., [case=acc] finds segments with at least one accusative interpretation marked as correct in the context (“disambiguated”). On the other hand, the o...

Uploaded: 20/02/2014, 12:20

Scientific report: "The OpenGrm open-source finite-state grammar software libraries"

... "es"; 3.3 Standard Library Functions and Operations Built-in functions are provided that operate on FSTs and perform most of the operations that are available in the OpenFst library. ... w_1 ... w_n. Its proper prefixes include all sequences w_1 ... w_k for k < n. • There is a unigram state in every model, representing the empty string. • Every proper prefix of every n-gram ...

Uploaded: 07/03/2014, 18:20

Scientific report: "A Modular Open-Source System for Recognizing Textual Entailment"

... subsection 2.1) is performed by a linear learn- ... Figure 1: System architecture. Table 1: Performance (F1) of ... (RTE-6: median 33.72, best 48.01, BIUTEE 49.09; RTE-7: median 39.89, best 48.00, BIUTEE 42.93) ... alternative learning-algorithms can be integrated easily by implementing an appropriate interface. 2.3 Experimental results BIUTEE’s performance on the last two RTE challenges (Bentivogli et al.,...

Uploaded: 07/03/2014, 18:20

Scientific report: "A Modified Joint Source-Channel Model for Transliteration"

... algorithms presented in this work focus on person names, locations and organizations. A machine transliteration system that is trained on person names is very important in a multilingual ... been evaluated on a training corpus of person names. A hybrid neural network and knowledge-based system to generate multiple English spellings for Arabic personal names is described in (Arbabi...

Uploaded: 17/03/2014, 04:20

Scientific report: "Language Production: the Source of the Dictionary"

... process for completion. Exhaustive details of these operations may be found in [...]. Contextual Effects The mechanisms of the dictionary per se perform two functions: (1) the association of ... tedious job that the human designer must perform. In the past, typically every object and relation has been given its own individual "lex" property with the literal phrase to be...

Uploaded: 17/03/2014, 19:21

Scientific report: "Simultaneous Tokenization and Part-of-Speech Tagging for Arabic without a Morphological Analyzer"

... Proceedings of the ACL 2010 Conference Short Papers, pages 342–347, Uppsala, Sweden, 11-16 July 2010. © 2010 Association for Computational Linguistics Simultaneous Tokenization and Part-of-Speech Tagging ... approach, with separate models to first do tokenization and then part-of-speech tagging (Diab et al., 2007; Diab, 2009). While these approaches have somewhat lower performance than...

Uploaded: 30/03/2014, 21:20

Scientific report: "HunPos – an open source trigram tagger"

... standard, and perhaps the best, HMM-based POS tagger is TnT (Brants, 2000). We argue here that some of the criticism aimed at HMM performance on languages with rich morphology should more properly ... tagger. The paper is structured as follows. In Section 1 we present our own system, HunPos, while in Section 2 we describe some of the implementation details of TnT that we believe influe...

Uploaded: 31/03/2014, 01:20

Document: Scientific report: "Adaptive Natural Language Interaction"

... how to express the properties of the ontology as sentences of the target languages. For example, if the individual templeOfAres has the property excavatedIn, and that property has a microplan ... building. Stereotypes and profiles are combined into a single set of parameters by means of personality models. Personality models are many-valued Description Logic definitions of the overall pre...

Uploaded: 22/02/2014, 02:20
