... similar nouns to . 2.2 The WordNet Similarity Package We use the WordNet Similarity Package 0.05 and WordNet version 1.6. The WordNet Similarity package supports a range of WordNet similarity scores. ... these words in the 2 domain-specific corpora. 5.3 Discussion The results for 10 of the words from the qualitative experiment are summarized in Table 3 with the WordNet sense number for eac...
Uploaded: 20/02/2014, 16:20
... Distributional Hypothesis (Harris, 1964), verbs occurring in similar sentences are likely to be semantically related. The Distributional Hypothesis suggests a generic equivalence between words. Related methods ... useful to better explain the problem and to better understand the applicability of our hypothesis. In WordNet, verbs are organized in synonymy sets (synsets) and different kinds...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "Discovering Corpus-Specific Word Senses" pot
... Figure 1: Local graph of the word mouse. Figure 2: Local graph of the word wing. 3 Markov Clustering Ambiguous words link otherwise unrelated areas of meaning, e.g. rat ... an unsupervised algorithm which automatically discovers word senses from text. The algorithm is based on a graph model representing words and relationsh...
Uploaded: 22/02/2014, 02:20
Document: Scientific report: "Discovering Global Patterns in Linguistic Networks through Spectral Analysis: A Case Study of the Consonant Inventories" pdf
... appear) for an extensive survey). Examples include study of the WordNet (Sigman and Cecchi, 2002), syntactic dependency network of words (Ferrer-i-Cancho, 2005) and network of co-occurrence of ... connections between the language and the consonant nodes through a 0-1 matrix A as shown by a hypothetical example in Fig. 1. Further, in (Mukherjee et al., 2007), the authors define the Phoneme-...
Uploaded: 22/02/2014, 02:20
Document: Scientific report: "Fixed Length Word Suffix for Factored Statistical Machine Translation" pdf
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Learning Sub-Word Units for Open Vocabulary Speech Recognition" doc
... coherence. Hybrid word/sub-word recognizers can produce a sequence of sub-word units in place of OOV words. Ideally, the recognizer outputs a complete word for in-vocabulary (IV) utterances, and sub-word ... recognize words beyond their vocabulary, many of which are information rich terms, like named entities or foreign words. Hybrid word/sub-word systems solve this problem by...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Yet Another Word Alignment Tool" docx
... with Yawat. As the mouse is moved over a word, the word and all words linked with it are highlighted. The highlighting is removed when the mouse leaves the word in question. This allows the annotator ... associated words are shown only for one word at a time, as determined by the location of the mouse pointer. When the mouse is moved over a word in the text, the word and all the...
Uploaded: 20/02/2014, 09:20
Document: Scientific report: "Guiding Statistical Word Alignment Models With Prior Knowledge" pdf
... $a_1^m$ specifies the indices of source words that target words are aligned to. In an HMM-based word alignment model, source words are treated as Markov states while target words are observations that are ... as 1. In building word alignment models, a special “NULL” word is usually introduced to address target words that align to no source words. Since this physically non-existing word...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "Rethinking Chinese Word Segmentation: Tokenization, Character Classification, or Wordbreak Identification" pdf
... co-occurrence. Word based model. In this model, statistical data about word boundary frequencies for each character is retrieved word-wise. For example, in the case of a monosyllabic word only two word ... components of words, instead, they are contextual background providing information about the likelihood of whether each CB is also a wordbreak (WB). In other words, we model Chi...
Uploaded: 20/02/2014, 12:20
... Moreover, separating the issue of word boundary identification from sentence understanding often leads to devising word segmentation rules which are arbitrary and word specific, and hence not ... rectly pre-segmented words. It performs word boundary disambiguation concurrently with sentence understanding. In our investigation, we focus on sentences with clearly ambiguous w...
Uploaded: 20/02/2014, 21:20