Deep learning for Chinese word segmentation and POS tagging

Scientific report: "An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging" docx

... model for joint Chinese word segmentation and part-of-speech tagging. In Proceedings of ACL. Wenbin Jiang, Haitao Mi, and Qun Liu. 2008b. Word lattice reranking for Chinese word segmentation and part-of-speech ... word segmentation and POS tagging. In Proceedings of ACL Demo and Poster Sessions. Tetsuji Nakagawa. 2004. Chinese and Japanese word segmentation using word-level and character-level information. ... discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance since it can handle both known and unknown words....

Uploaded: 17/03/2014, 01:20

9 338 0
Scientific report: "Automatic Adaptation of Annotation Standards: Chinese Word Segmentation and POS Tagging – A Case Study" potx

... that when word segmentation and POS tagging are conducted jointly, the performance for segmentation improves since the POS tags provide additional information to word segmentation (Ng and Low, ... in the context of Chinese word segmentation and part-of-speech tagging, where no segmentation and POS tagging standards are widely accepted due to the lack of morphology in Chinese. Experiments ... parsing (and translation). Experiments adapting from PD to CTB are conducted for two tasks: word segmentation alone, and joint segmentation and POS tagging (Joint S&T). The performance...

Uploaded: 17/03/2014, 01:20

9 404 0
Document: Scientific report: "Joint Word Segmentation and POS Tagging using a Single Perceptron" docx

... proposed a hybrid model for word segmentation and POS tagging using an HMM-based approach. Word information is used to process known words, and character information is used for unknown words ... outputs. In this paper, we propose a novel joint model for Chinese word segmentation and POS tagging, which does not limit the interaction between segmentation and POS information in reducing the combined ... rare POS pattern “number word” + “number word” can help to prevent segmenting a long number word into two words. In order to avoid error propagation and make use of POS information for word segmentation, ...

Uploaded: 20/02/2014, 09:20

9 576 0
Scientific report: "A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" pdf

... that the segmentation and POS tagging task is to divide a character sequence into several subsequences and label each of them with a POS tag. It is a better idea to perform segmentation and POS tagging ... each word-POS pair p (of length l) to the tail of each candidate result at the prior position of p (position i − l), and select for position i an N-best list of candidate results from all these candidates. ... single-character word and multi-character word respectively. In order to perform POS tagging at the same time, we expand boundary tags to include POS information by attaching a POS to the tail...

Uploaded: 08/03/2014, 01:20

8 445 0
Scientific report: "A Stacked Sub-Word Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" potx

... intermediate sub-word structure for joint segmentation and tagging. Since the sub-words are large enough in practice, the decoding for POS tagging over sub-words is efficient. Finally, the Chinese language ... $c_{\#c}$), the task of word segmentation and POS tagging is to predict a sequence of word and POS tag pairs $y = (w_1, p_1, \ldots, w_{\#y}, p_{\#y})$, where $w_i$ is a word, $p_i$ is its POS tag, and a “#” symbol ... stacked learning is used to acquire extended training data for sub-word tagging. 3 Method 3.1 Architecture In our stacked sub-word model, joint word segmentation and POS tagging is decomposed...

Uploaded: 17/03/2014, 00:20

10 412 0
Scientific report: "Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation" doc

... model for integrated morphological and syntactic parsing. First and foremost, we currently know of no other same effort in parsing the structures of Chinese words, and we have to annotate word ... many efforts to integrate Chinese word segmentation, part-of-speech tagging and parsing (Wu and Zixin, 1998; Zhou and Su, 2003; Luo, 2003; Fung et al., 2004). However, in these research all words ... June. Association for Computational Linguistics. Wenbin Jiang, Liang Huang, and Qun Liu. 2009. Automatic adaptation of annotation standards: Chinese word segmentation and POS tagging – a case...

Uploaded: 17/03/2014, 00:20

10 476 0
Scientific report: "Discriminative Pruning of Language Models for Chinese Word Segmentation" ppt

... Bin Swen, and Baobao Chang. 2003. Specification for Corpus Processing at Peking University: Word Segmentation, POS Tagging and Phonetic Notation. Journal of Chinese Language and Computing, ... Combined Model and KLD Model 5 Conclusions and Future Work A discriminative pruning criterion of n-gram language model for Chinese word segmentation was proposed in this paper, and a step-by-step ... model for Chinese word segmentation was proposed. Gao et al. (2005) further developed it to a linear mixture model. In these statistical models, language models are essential for word segmentation...

Uploaded: 17/03/2014, 04:20

8 294 0
Document: Scientific report: "Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data" pdf

... Chinese word segmentation is therefore the first step for any Chinese information processing system [1]. Almost all methods for Chinese word segmentation developed so far, both statistical and ... Abstract Chinese word segmentation is the first step in any Chinese NLP system. This paper presents a new algorithm for segmenting Chinese texts without making use of any lexicon and hand-crafted ... Automatic Word Segmentation System for Written Chinese Texts", Journal of Chinese Information Processing, Vol. 1, No. 2, 1987 (in Chinese) [2] Fan C.K., Tsai WH., "Automatic Word Identification...

Uploaded: 20/02/2014, 18:20

7 396 0
Scientific report: "Exploring Deterministic Constraints: From a Constrained English POS Tagger to an Efficient ILP Solution to Chinese Word Segmentation" ppt

... of a word and I all other positions; and 2) BMES: where B, M and E represent the beginning, middle and end of a multi-character word respectively, and S tags a single-character word. For example, ... NNS $w_0$ = last & $w_{-1}$ = the → JJ Table 7: Deterministic constraints for POS tagging. Deterministic constraints for POS tagging For English POS tagging, we evaluate the deterministic constraints generated ... likelihood of each possible tag or the relative rank of their likelihoods. Deterministic constraints for character tagging For the character tagging formulation of Chinese word segmentation, we...

Uploaded: 07/03/2014, 18:20

9 425 0
Scientific report: "Subword-based Tagging for Confidence-dependent Chinese Word Segmentation" pdf

... proposed a subword-based tagging for Chinese word segmentation to improve the existing character-based tagging. The subword-based tagging was implemented using the maximum entropy (MaxEnt) and ... a Chinese word has discriminative roles for word composition. For example, single-character words are more apt to form new words than are multiple-character words. Features using word length ... methods with Chinese word segmentation, with which our results were compared. Section 5 provides the concluding remarks and outlines future goals. 2 Chinese word segmentation framework Our word segmentation...

Uploaded: 17/03/2014, 04:20

8 348 0
Document: Scientific report: "A Joint Statistical Model for Simultaneous Word Spacing and Spelling Error Correction for Korean" pdf

... Christopher C. Yang and K. W. Li. 2005. A Heuristic Method Based on a Statistical Approach for Chinese Text Segmentation. Journal of the American Society for Information Science and Technology, ... Each word in a sentence is compared to word dictionary entries, and if the word is not in the dictionary, then the system assumes that the word has spelling errors. Then corrected candidate ... corrected candidate words are suggested by the system from the word dictionary, according to some metric to measure the similarity between the target word and its candidate word, such as edit-distance...

Uploaded: 20/02/2014, 12:20

4 523 0
Document: Scientific report: "Rethinking Chinese Word Segmentation: Tokenization, Character Classification, or Wordbreak Identification" pdf

... Processing, pp. 147-173. Gao, J. and A. Wu and Mu Li and C.-N. Huang and H. Li and X. Xia and H. Qin. 2004. Adaptive Chinese Word Segmentation. In Proceedings of ACL-2004. Meng, H. and C. W. Ip. 1999. An ... N. 2003. Chinese Word Segmentation as Character Tagging. Computational Linguistics and Chinese Language Processing. 8(1): 29-48 Redington, M. and N. Chater and C. Huang and L. Chang and K. Chen. ... that Chinese word segmentation is the classification of a string of character-boundaries (CB’s) into either word-boundaries (WB’s) and non-word-boundaries. In Chinese, CB’s are delimited and...

Uploaded: 20/02/2014, 12:20

4 301 0
Scientific report: "Accurate Learning for Chinese Function Tags from Minimal Features" pdf

... features for function labeling. Specifically, our proposal is to classify function types directly from lexical features like words and their POS tags and the surface sentence information like the word ... round. FT1: word & POS tags within [-2,+2]; FT2: word & POS tags within [-3,+3]; FT3: word & POS tags within [-4,+4]; FT4: FT3 plus POS bigrams within [-4,+4]; FT5: FT4 plus verbs; FT6: FT5 plus POS ... performance. We adopt the automatic POS tagger of (Qin et al., 2008), which got the first place in the fourth SIGHAN Chinese POS tagging bakeoff on CTB open test, to assign POS tags for our data. Following...

Uploaded: 08/03/2014, 01:20

9 515 0
Using Online Learning for At-Risk Students and Credit Recovery ppt

... scalable and able to expand more easily than programs based entirely on brick-and-mortar classrooms. Success stories and anecdotes regarding the benefits and value of online learning for both ... high demand online courses in career planning and basic math, and optional courses in digital photography and forensic science, to motivate students while they develop the independent learning ... school, not-for-profit, for-profit, or other institution. Thirty states and more than half of the school districts in the United States offer online courses and services, and online learning is...

Uploaded: 15/03/2014, 04:20

18 380 0
Scientific report: "Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling" pdf

... regularizer can be seen as a composition, , where , and , . For scalar , the second derivative of a composition, , is given by (Boyd and Vandenberghe 2004) Although and are concave here, since is ... classification), since hand-labeling individual words and word boundaries is much harder than assigning text-level class labels. Many approaches have been proposed for semi-supervised learning in the ... training set consisting of 5448 words, and considered alternative unlabeled training sets, (5210 words), (10,208 words), and (25,145 words), consisting of the same, 2 times and 5 times as many sentences...

Uploaded: 17/03/2014, 04:20

8 382 0
