Scientific report: "Japanese Morphological Analyzer using Word " doc
... being adjusted. 1 Previous Japanese Morphological Analyzers Most Japanese morphological analyzers use linguistic grammar, generate possible sequences of words from an input string, and select ... takagi@nttnly.isl.ntt.co.jp Abstract We developed a Japanese morphological analyzer that uses the co-occurrence of words to select the correct sequence of words in an unsegmented Ja...
Uploaded on: 08/03/2014, 05:21
... Japanese language is basically an SOV language, but word order is relatively free. In English the syntactic function of each word is represented by word order, while in Japanese it is represented by ... Japanese sentence is analyzed by using segments, called bunsetsu, that usually contain one or more content words like a noun, verb, or adjective, and zero or more function words like...
Uploaded on: 20/02/2014, 12:20
... orthography. In addition, we add two general requirements for morphological analyzers. First, we want both a morphological analyzer and a morphological generator. Second, we want to use a representation that ... affixes can appear in a word. For example, the word wasayaktubuwnahA ‘and they will write it’ has two prefixes, one circumfix and one suffix: 2 2 We analyze the imperfe...
Uploaded on: 23/03/2014, 18:20
Scientific report: "Japanese Dependency Parsing Using Sequential Labeling for Semi-spoken Language" ppt
... amount of documents directly published by end users is increasing along with the growth of Web 2.0. Such documents often contain spoken-style expressions, which are difficult to analyze using conventional parsers. ... independent units to modify other units. Documents published by end users (e.g., blogs) are increasing on the Internet along with the growth of Web 2.0. Such documents do not...
Uploaded on: 31/03/2014, 01:20
Scientific report: "Bootstrapping Coreference Resolution Using Word Associations" potx
... features), named entity features and semantic word class features (e.g., from WordNet) that do not distinguish, say, Obama from Hawking. In our approach, word association information is used for ... bootstraps a complete coreference resolution (CoRe) system from word associations mined from a large unlabeled corpus. We show that word associations are useful for CoRe – e....
Uploaded on: 07/03/2014, 22:20
Scientific report: "Evaluating Machine Translations using mNCD" doc
... M-TER, which use the flexible word matching modules from METEOR to find relaxed word-to-word alignments (Agarwal and Lavie, 2008). The modules are able to align words even if they do not share ... translation tasks into English, the relaxed alignment using a stem module and the synonym module affected 7.5 % of all words, whereas only 5.1 % of the words were changed in the tasks from...
Uploaded on: 30/03/2014, 21:20
... SAUMER: SENTENCE ANALYSIS USING METARULES Fred Popowich Natural Language Group Laboratory for Computer and Communications ... 1. INTRODUCTION The SAUMER system allows the user to specify a grammar for a natural language using rules and metarules. This grammar can then be used to obtain a semantic interpretation of ... lexicon is organised into two levels. For the semantic interpr...
Uploaded on: 01/04/2014, 00:20
Document: Scientific report: "Arabic Morphological Tagging, Diacritization, and Lemmatization Using Lexeme Models and Feature Ranking" pdf
... of morphological tagging as choosing an inflectional morphological tag (in this paper, the term “morphological tagging” never refers to derivational morphology). The morphology of an Arabic word ... inflected word form. He redefines the tagging task as a choice among the tags proposed by the dictionary, using a log-linear model trained on specific ambiguity classes for individual...
Uploaded on: 20/02/2014, 09:20
Document: Scientific report: "Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model " pptx
... is the word boundary problem. It is impossible to use isolated word error correction techniques because there are no delimiters between words. The second is the short word problem. Word distance ... likely word sequence from all combinations of exactly and approximately matched words using a Viterbi-like word segmentation algorithm and a statistical language model con...
Uploaded on: 20/02/2014, 18:20
Scientific report: "Semitic Morphological Analysis and Generation Using Finite State Transducers with Feature Structures" pot
... morphological analyzer, the orthographic analyzer was run on 400 wordforms selected randomly from the list compiled by Biniam, and the results were evaluated by a human reader. Of the 400 wordforms, ... conjunction affixes. For the orthographic version of the analyzer, a word is entered in Ge’ez script (UTF-8 encoding). The program romanizes the input using the SERA transcription...
Uploaded on: 17/03/2014, 22:20