... for joint Chinese word segmentation and part-of-speech tagging. In Proceedings of ACL. Wenbin Jiang, Haitao Mi, and Qun Liu. 2008b. Word lattice reranking for Chinese word segmentation and part-of-speech ... ACL and AFNLP An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging Canasai Kruengkrai †‡ and Kiyotaka Uchimoto ‡ and Jun’ichi Kazama ‡ Yiou Wang ‡ and ... discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance since it can handle both known and unknown words....
Uploaded: 17/03/2014, 01:20
Document: Scientific report: "Joint Word Segmentation and POS Tagging using a Single Perceptron" docx
... UK {yue.zhang,stephen.clark}@comlab.ox.ac.uk Abstract For Chinese POS tagging, word segmentation is a preliminary step. To avoid error propagation and improve segmentation by utilizing POS information, segmentation and tagging can be ... proposed a hybrid model for word segmentation and POS tagging using an HMM-based approach. Word information is used to process known-words, and character information is used for unknown words ... word segmentation and POS tagging are still performed separately, and exact inference for both is possible. However, the interaction between POS and segmentation is restricted by reranking: POS...
Uploaded: 20/02/2014, 09:20
Scientific report: "A Hybrid Approach to Word Segmentation and POS Tagging" doc
Uploaded: 31/03/2014, 01:20
Scientific report: "A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" pdf
... that segmentation and POS tagging task is to divide a character sequence into several subsequences and label each of them a POS tag. It is a better idea to perform segmentation and POS tagging ... When we derive a candidate result from a word-POS pair p and a candidate q at prior position of p, we calculate the scores of the word LM, the POS LM, the labelling probability and the generating ... each word-POS pair p (of length l) to the tail of each candidate result at the prior position of p (position i − l), and select for position i a N-best list of candidate results from all these candidates....
Uploaded: 08/03/2014, 01:20
Scientific report: "A Stacked Sub-Word Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" potx
... model, joint word segmentation and POS tagging is decomposed into two steps: (1) coarse-grained word segmentation and tagging, and (2) fine-grained sub-word tagging. The workflow is shown in ... intermediate sub-word structure for joint segmentation and tagging. Since the sub-words are large enough in practice, the decoding for POS tagging over sub-words is efficient. Finally, the Chinese language ... in previous work (Zhang and Clark, 2010; Jiang et al., 2008b). In this paper, we present an effective and efficient solution for joint Chinese word segmentation and POS tagging. Our work is motivated...
Uploaded: 17/03/2014, 00:20
Scientific report: "TBL-Improved Non-Deterministic Segmentation and POS Tagging for a Chinese Parser" pdf
Uploaded: 17/03/2014, 22:20
Scientific report: "Fast Online Training with Frequency-Adaptive Learning Rates for Chinese Word Segmentation and New Word Detection" docx
Uploaded: 23/03/2014, 14:20
Document: Scientific report: "Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data" pdf
... Abstract Chinese word segmentation is the first step in any Chinese NLP system. This paper presents a new algorithm for segmenting Chinese texts without making use of any lexicon and hand-crafted ... Chinese word segmentation is therefore the first step for any Chinese information processing system [1]. Almost all methods for Chinese word segmentation developed so far, both statistical and...
Uploaded: 20/02/2014, 18:20
Scientific report: "Exploring Deterministic Constraints: From a Constrained English POS Tagger to an Efficient ILP Solution to Chinese Word Segmentation" ppt
... decoding. 3 Chinese Word Segmentation (CWS) 3.1 Word segmentation as character tagging Considering the ambiguity problem that a Chinese character may appear in any relative position in a word and the ... beginning of a word and I all other positions; and 2) BMES: where B, M and E represent the beginning, middle and end of a multi-character word respectively, and S tags a single-character word. For ... Character- and word-based features of a possible word $w_i$ over the input character sequence $c$. Suppose that $w_i = c_{i_0} c_{i_1} c_{i_2}$, and its preceding and following characters are $c_l$ and $c_r$ respectively. parameter...
Uploaded: 07/03/2014, 18:20
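The BMES scheme quoted in the snippet above can be illustrated with a minimal Python sketch. The function names `words_to_bmes` and `bmes_to_words` are hypothetical and are not taken from the paper; the sketch only shows how a segmented sentence maps to per-character tags and back, not the paper's tagger or its ILP decoding.

```python
def words_to_bmes(words):
    """Convert a list of words into one BMES tag per character.

    B/M/E mark the beginning, middle and end of a multi-character word;
    S marks a single-character word.
    """
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def bmes_to_words(chars, tags):
    """Rebuild the word list from characters and their BMES tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at E or S
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that stops mid-word
        words.append(current)
    return words
```

With this encoding, segmentation becomes a sequence-labelling problem: any tagger that predicts one tag per character yields a segmentation through `bmes_to_words`.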
Scientific report: "Subword-based Tagging for Confidence-dependent Chinese Word Segmentation" pdf
Uploaded: 17/03/2014, 04:20
Document: Scientific report: "Rethinking Chinese Word Segmentation: Tokenization, Character Classification, or Wordbreak Identification" pdf
... Processing, pp. 147-173. Gao, J. and A. Wu and Mu Li and C.-N. Huang and H. Li and X. Xia and H. Qin. 2004. Adaptive Chinese Word Segmentation. In Proceedings of ACL-2004. Meng, H. and C. W. Ip. 1999. An ... N. 2003. Chinese Word Segmentation as Character Tagging. Computational Linguistics and Chinese Language Processing. 8(1): 29-48 Redington, M. and N. Chater and C. Huang and L. Chang and K. Chen. ... that Chinese word segmentation is the classification of a string of character-boundaries (CB’s) into either word-boundaries (WB’s) and non-word-boundaries. In Chinese, CB’s are delimited and...
Uploaded: 20/02/2014, 12:20
Scientific report: "Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation" doc
... to integrate Chinese word segmentation, part-of-speech tagging and parsing (Wu and Zixin, 1998; Zhou and Su, 2003; Luo, 2003; Fung et al., 2004). However, in these research all words were considered ... Computational Linguistics. Wenbin Jiang, Liang Huang, and Qun Liu. 2009. Automatic adaptation of annotation standards: Chinese word segmentation and POS tagging – a case study. In Proceedings of the ... Jun’ichi Kazama, Yiou Wang, Kentaro Torisawa, and Hitoshi Isahara. 2009. An error-driven word-character hybrid model for joint Chinese word segmentation and POS tagging. In Proceedings of the Joint Conference...
Uploaded: 17/03/2014, 00:20
Scientific report: "Discriminative Pruning of Language Models for Chinese Word Segmentation" ppt
Uploaded: 17/03/2014, 04:20
Scientific report: "Improved Source-Channel Models for Chinese Word Segmentation" pdf
Uploaded: 31/03/2014, 03:20
Scientific report: "Incremental Joint Approach to Word Segmentation, POS Tagging, and Dependency Parsing in Chinese" potx
... sequence of POS tags. The joint approach to word segmentation and POS tagging has been reported to improve word segmentation and POS tagging accuracies by more than 1% in Chinese (Zhang and Clark, ... $q_{-1}$ and $q_{-2}$ respectively denote the last-shifted word and the word shifted before $q_{-1}$. q.w and q.t respectively denote the (root) word form and POS tag of a subtree (word) q, and q.b and q.e ... boundary of the top word on the stack if the last action was A or SH(t). ... interaction between segmentation and POS tagging. 3 Model 3.1 Incremental Joint Segmentation, POS Tagging, and Dependency...
Uploaded: 07/03/2014, 18:20
Document: Scientific report: "Unsupervized Word Segmentation: the case for Mandarin Chinese" doc
... systems based on Harris's hypothesis (see (Magistry and Sagot, 2011) and Jin (2007) for a longer discussion). Many errors are related to dates and Chinese numbers. This could and should be dealt ... $\mathrm{len}(w_i)$, where $W$ is the segmentation corresponding to the sequence of words $w_0 w_1 \ldots w_m$, and $\mathrm{len}(w_i)$ is the length of a word $w_i$ used here to be able to compare segmentations resulting ... 0.59–0.79. In a segmented Chinese text, most of the tokens are uni- and bigrams but most of the types are bi- and trigrams (as unigrams are often high frequency grammatical words and trigrams the result...
Uploaded: 19/02/2014, 19:20
Scientific report: "SVD and Clustering for Unsupervised POS Tagging" docx
... counts all the cluster-$j$ ($j = 1 \ldots k_1$) words to the right of word $i$, and $L_{ij}$ counts all the cluster-$j$ words to the left of word $i$. The new matrices $L$ and $R$ have dimension $N_{\text{types}} \times k_1$. ... three evaluation criteria of Gao and Johnson (2008): M-to-1, 1-to-1, and VI. M-to-1 and 1-to-1 are the tagging accuracies under the best many-to-one map and the greedy one-to-one map respectively; ... Tagging accuracy under the best M-to-1 map, the greedy 1-to-1 map, and VI, for the full PTB45 tagset and the reduced PTB17 tagset. HMM-EM, HMM-VB and HMM-GS show the best results from Gao and...
Uploaded: 07/03/2014, 22:20
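The left and right context matrices described in the snippet above can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's implementation: "to the left/right" is assumed to mean the immediately adjacent token, word types and clusters are assumed to be integer ids, and the function name `context_cluster_counts` and its parameters are hypothetical.

```python
def context_cluster_counts(tokens, cluster_of, num_types, k1):
    """Build N_types x k1 count matrices L and R from a token sequence.

    tokens     : list of word-type ids, in corpus order
    cluster_of : cluster_of[t] is the cluster id (0..k1-1) of word type t
    R[i][j]    : number of times a cluster-j word immediately follows type i
    L[i][j]    : number of times a cluster-j word immediately precedes type i
    """
    L = [[0] * k1 for _ in range(num_types)]
    R = [[0] * k1 for _ in range(num_types)]
    for left, right in zip(tokens, tokens[1:]):
        R[left][cluster_of[right]] += 1   # right neighbour of `left`
        L[right][cluster_of[left]] += 1   # left neighbour of `right`
    return L, R
```

Each row then serves as a low-dimensional context descriptor for one word type, with $k_1$ columns instead of one column per neighbouring word type.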
Scientific report: "Fully Unsupervised Word Segmentation with BVE and MDL" pdf
Uploaded: 30/03/2014, 21:20