Document, scientific report: "Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data" pdf
... Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data Sun Maosong, Shen Dayang*, Benjamin K Tsou** State Key Laboratory of Intelligent Technology and Systems, ... Chinese word segmentation developed so far, both statistical and rule-based, exploited two kinds of important resources, i.e., lexicon and hand-crafted linguistic resources...
Uploaded: 20/02/2014, 18:20
... of a word w_i used here to be able to compare segmentations resulting in a different number of words. This best segmentation can be computed easily using dynamic programming. 6 Results and discussion We ... unsupervised word segmentation systems in Section 3. Section 4 and Section 5 present the core of our system. Finally, in Section 6, we detail and discuss our results. 2...
Uploaded: 19/02/2014, 19:20
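The snippet above notes that the best segmentation can be found with dynamic programming. A minimal sketch of that idea follows, assuming each candidate word has an additive cost and the lowest total cost wins. The scoring function `word_cost`, the `max_len` limit, and the toy costs are hypothetical; the paper's actual scoring is elided in the snippet and is not reproduced here.

```python
from typing import Callable, List, Tuple


def best_segmentation(
    text: str,
    word_cost: Callable[[str], float],
    max_len: int = 6,
) -> Tuple[float, List[str]]:
    """Return (total cost, words) of the cheapest segmentation of `text`.

    `word_cost` is a hypothetical scoring function: lower means more word-like.
    """
    n = len(text)
    best = [float("inf")] * (n + 1)  # best[i]: cheapest cost of text[:i]
    back = [0] * (n + 1)             # back[i]: start index of the last word
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            cost = best[start] + word_cost(text[start:end])
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    # Follow the back-pointers from the end of the string to recover the words.
    words: List[str] = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    words.reverse()
    return best[n], words


# Toy costs, invented for illustration only.
TOY_COSTS = {"the": 1.0, "cat": 1.0, "sat": 1.0, "thecat": 3.0}


def toy_cost(word: str) -> float:
    # Unknown strings are penalised in proportion to their length.
    return TOY_COSTS.get(word, 5.0 * len(word))
```

With these toy costs, `best_segmentation("thecatsat", toy_cost)` prefers three short words over the longer `"thecat"` candidate because their summed cost is lower.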
... only local context and one representation per word. This is problematic because words are often polysemous and global context can also provide useful information for learning word meanings. We ... architecture which 1) learns word embeddings that better capture the semantics of words by incorporating both local and global document context, and 2) accounts for homonymy and...
Uploaded: 19/02/2014, 19:20
Document, scientific report: "Joint Word Segmentation and POS Tagging using a Single Perceptron" docx
... pattern “number word” + “number word” can help to prevent segmenting a long number word into two words. In order to avoid error propagation and make use of POS information for word segmentation, ... tagger, and the best output is selected using the overall POS-segmentation probability score. In this system, the decoding for word segmentation and POS tagging are still per...
Uploaded: 20/02/2014, 09:20
Document, scientific report: "Learning Word Senses With Feature Selection and Order Identification Capabilities" pdf
... (Pantel and Lin, 2002; Schütze, 1998), there are other related efforts on word sense discrimination (Dorow and Widdows, 2003; Fukumoto and Suzuki, 1999; Pedersen and Bruce, 1997). In (Pedersen and ... For i = 1 to q do (2.1) Randomly split C T into disjoint halves, denoted as C T A and C T B ; (2.2) Estimate GMM parameter and cluster number on C T A using Cluster, and the...
Uploaded: 20/02/2014, 16:20
Document, scientific report: "Automatic error detection in the Japanese learners’ English spoken data" pdf
... targeted word, the one preceding and one following/ the targeted word and the one preceding/ the targeted word and the one following/ the targeted word and the two preceding/ the targeted word and ... word/ one preceding word and two following words), and the first and last letter of the word immediately following. (In Fig. 2, “t” and “e” in “telephone”.)...
Uploaded: 20/02/2014, 16:20
Document, scientific report: "Guiding an HPSG Parser using Semantic and Pragmatic Expectations" pdf
... by The Ohio State Center for Cognitive Science and The Ohio State Departments of Computer and Information Science and Linguistics grammar (using compiled knowledge) which is then used to realize ... language generation has been successfully demonstrated using highly compiled knowledge about speech acts and their related social actions. A design and prototype implementation...
Uploaded: 20/02/2014, 21:20
Document, scientific report: "Enhanced word decomposition by calibrating the decision threshold of probabilistic models and using a model ensemble" pdf
... cross-validation into training and test subsets with the ratio of 9:1 we randomly split the data into training, validation and test sets with the ratio of 8:1:1. We then run our experiments and measured ... analyse words. Models are constructed using rule-based methods (Mooney and Califf, 1996; Muggleton and Bain, 1999), connectionist methods (Rumelhart and McClelland, 1986...
Uploaded: 20/02/2014, 04:20
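The snippet above describes a random 8:1:1 split into training, validation and test sets. A minimal sketch of such a split is shown here; the function name `split_8_1_1`, the fixed `seed`, and the rounding rule (any remainder goes to the test set) are assumptions made for illustration, not details taken from the paper.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_8_1_1(items: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle `items` and split them into train/validation/test at 8:1:1."""
    shuffled = list(items)
    # A seeded generator keeps the split reproducible (assumed, not from the paper).
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = (8 * n) // 10
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # any rounding remainder lands here
    return train, val, test
```

For 100 items this yields 80 training, 10 validation and 10 test items, with every item appearing in exactly one subset.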
Document, scientific report: "Learning Word-Class Lattices for Definition and Hypernym Extraction" doc
... of salient words aggregated using synonymy, similarity, or subtrees of a thesaurus. However, salient word selection and aggregation is non-obvious and furthermore it falls into word sense disambiguation, ... frequent words F to generalize words to word classes”. We define a word class as either a word itself or its part of speech. Given a sentence s = w_1, w_2, ..., w_{|s|}...
Uploaded: 20/02/2014, 04:20
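The snippet above defines a word class as either a word itself or its part of speech, using a set of frequent words F. One reading of that definition is sketched below: a word in F is kept as is, and any other word is replaced by its part-of-speech tag. That mapping rule, the function name `to_word_classes`, the lowercase comparison, and the example tags are assumptions, since the snippet is truncated before it states the rule.

```python
from typing import List, Sequence, Set, Tuple


def to_word_classes(tagged: Sequence[Tuple[str, str]], frequent: Set[str]) -> List[str]:
    """Map each (word, pos) pair to a word class.

    Assumed rule: keep the word if it is in the frequent set F,
    otherwise fall back to its part-of-speech tag.
    """
    return [word if word.lower() in frequent else pos for word, pos in tagged]


# Hypothetical tagged sentence and frequent-word set, for illustration only.
EXAMPLE = [("A", "DT"), ("dog", "NN"), ("is", "VBZ"), ("an", "DT"), ("animal", "NN")]
FREQUENT = {"a", "is", "an"}
```

Applied to `EXAMPLE` with `FREQUENT`, the content words are generalized to their tags while the frequent function words are kept.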
Document, scientific report: "Learning Word Vectors for Sentiment Analysis" ppt
... assessment of word representations, we visualize the words most similar to a query word using vector similarity of the learned representations. Given a query word w and another word w′ we ... embracing many social and attitudinal aspects of meaning (Wilson et al., 2004; Alm et al., 2005; Andreevskaia and Bergler, 2006; Pang and Lee, 2005; Goldberg and Zhu, 2006; Sny...
Uploaded: 20/02/2014, 04:20
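The snippet above mentions ranking words by vector similarity to a query word. A minimal sketch of such a ranking follows, assuming cosine similarity as the measure; the snippet is cut off before it names the similarity actually used, and the function names and toy vectors here are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(
    query: str,
    vectors: Dict[str, Sequence[float]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Return the k words whose vectors are most similar to the query word's vector."""
    q = vectors[query]
    scored = [(w, cosine(q, v)) for w, v in vectors.items() if w != query]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy two-dimensional vectors, invented for illustration only.
TOY_VECTORS = {"good": [1.0, 0.0], "great": [0.9, 0.1], "bad": [-1.0, 0.0]}
```

With the toy vectors, `most_similar("good", TOY_VECTORS, k=1)` ranks `"great"` first, since its vector points in nearly the same direction as that of `"good"`.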