Scientific report: "Using Word Support Model to Improve Chinese Input System" ppt
... results show that: (1) the WSM is able to achieve tonal (syllables input with four tones) and toneless (syllables input without four tones) syllable-to-word (STW) accuracies of 99% and 92%, ... its frequency by 1. 2.2 Word Support Model The four steps of our WSM applied to identify words for a given Chinese syllables is as follows: Step 1. Input tonal or tone...
... entity. After establishing the vector space model (VSM) for each entity mention of the type, we adopt a clustering toolkit (CLUTO) to further divide the mentions into different subtypes. Finally, ... (CLUTO toolkit) is used to divide it into different cohesive subtypes, each of which only contains the entities of the same background. For instance, the Air entities will be div...
... a Chinese character is not exactly the same as the function of an English word. Normally, two or more Chinese characters form a Chinese word to carry a meaning, although there are Chinese words ... elements in Chinese characters, and proposed a set of rules to decompose Chinese characters into elements that belong to this set of building blocks (Chu, 2008). Hence, i...
Scientific report: "Using Automatically Transcribed Dialogs to Learn User Models in a Spoken Dialog System" doc
... Linguistics Using Automatically Transcribed Dialogs to Learn User Models in a Spoken Dialog System Umar Syed Department of Computer Science Princeton University Princeton, NJ 08540, USA usyed@cs.princeton.edu Jason ... $A_t$ and $\tilde{A}_t$ are all assumed to belong to finite sets, and so all the conditional distributions in our model are multinomials. Hence θ is a vector that parameterizes...
Scientific report: "Wikipedia as Sense Inventory to Improve Diversity in Web Search Results" doc
... possible to use sense inventories to improve Web search results diversity for one word queries? To answer this question, we focus on two broad-coverage lexical resources of a different nature: WordNet, ... possible to use sense inventories to improve search results for one word queries? To answer this question, we will focus on two broad-coverage lexical resou...
Scientific report: "Event-based Hyperspace Analogue to Language for Query Expansion" ppt
... with Topics 1-50 (title field), AP8889 with Topics 101-150 (title field) and WSJ9092 with Topics 201-250 (description field). All the collections are stemmed, and stop words are removed, prior to ... them automatically from predicate-argument structures and a dependency parse. We will use this space to perform query expansion in IR, a task that aims to find additional words related to...
Scientific report: "Hidden Markov Tree Model in Dependency-based Machine Translation" pptx
... correspond to autosemantic (meaningful) words and whose edges correspond to syntactic-semantic relations (dependencies). The nodes are labeled with the lemmas of the autosemantic words. Functional ... would like to show in the following that the tectogrammatical layer of language description is close enough to this ideal to make the HMTM approach practically applicable. Why Te...
Scientific report: "Enhanced word decomposition by calibrating the decision threshold of probabilistic models and using a model ensemble" pdf
... Linguistics Enhanced word decomposition by calibrating the decision threshold of probabilistic models and using a model ensemble Sebastian Spiegler Intelligent Systems Laboratory, University of Bristol, U.K. spiegler@cs.bris.ac.uk Peter ... that they build a morphological model which is then applied to analyse words. Models are constructed using rule-based methods (Mooney and Cali...
Scientific report: "Joint Word Segmentation and POS Tagging using a Single Perceptron" docx
... sequences up to the current character, where the last word can be a complete word or a partial word. A problem arises in whether to give POS tags to incomplete words. If partial words are given ... which are different to those from Collins (2002) and are specific to Chinese, are shown in Table 2. The word segmentation features are extracted from word bigrams, capturing w...
Scientific report: "Improving Word Representations via Global Context and Multiple Word Prototypes" pdf
... neural language models. 1 Introduction Vector-space models (VSM) represent word meanings with vectors that capture semantic and syntactic information of words. These representations can be used to induce ... sense-labeled words. However, in order to cluster accurately, it is important to capture both the syntax and semantics of words. While many approaches use local contexts to...
