... perform poorly on Twitter (Finin et al., 2010). One of the most fundamental parts of the linguistic pipeline is part-of-speech (POS) tagging, a basic form of syntactic analysis which has countless applications ... to test the efficacy of this feature set for part-of-speech tagging given limited training data. We randomly divided the set of 1,827 annotated tweets into a training set of 1,000 (14,542 tokens), ... address the problem of part-of-speech tagging for English data from the popular microblogging service Twitter. We develop a tagset, annotate data, develop features, and report tagging results...
Uploaded: 20/02/2014, 04:20
... achieving accuracy of 97.98%, which is a significant improvement over the state-of-the-art for Bulgarian. 1 Introduction Part-of-speech (POS) tagging is the task of assigning each of the words in ... larger inventory of POS tags, e.g., the Penn Treebank (Marcus et al., 1993) uses 48 tags: 36 for part-of-speech, and 12 for punctuation and currency symbols. This increase in the number of tags is partially ... four major types of ambiguity: 1. Between the wordforms of the same lexeme, i.e., in the paradigm. For example, , an inflected form of (‘sofa’, masculine), can mean (a) ‘the sofa’ (definite, singular,...
Uploaded: 08/03/2014, 21:20
Scientific report: "Weakly Supervised Part-of-Speech Tagging for Morphologically-Rich, Resource-Scarce Languages"
Uploaded: 24/03/2014, 03:20
Scientific report: "Simultaneous Tokenization and Part-of-Speech Tagging for Arabic without a Morphological Analyzer"
Uploaded: 30/03/2014, 21:20
Scientific report: "Efficient Optimization of an MDL-Inspired Objective Function for Unsupervised Part-of-Speech Tagging"
... Association for Computational Linguistics Efficient Optimization of an MDL-Inspired Objective Function for Unsupervised Part-of-Speech Tagging Ashish Vaswani 1 Adam Pauls 2 David Chiang 1 1 Information ... Proceedings of the 7th International Conference on Independent Component Analysis and Signal Separation (ICA2007). S. Ravi and K. Knight. 2009. Minimized models for unsupervised part-of-speech tagging. ... second-order partial derivatives are all zero, as are those of the equality constraints. We perform this optimization for each instance of (15). These optimizations could easily be performed in...
Uploaded: 07/03/2014, 22:20
Scientific report: "Semisupervised condensed nearest neighbor for part-of-speech tagging"
... C′ from the new data set which is a mixture of labeled and unlabeled data points. See Figure 4 for details. 3 Part-of-speech tagging Our part-of-speech tagging data set is the standard data set ... semi-supervised part-of-speech tagging and present the best published result on the Wall Street Journal data set. 1 Introduction Labeled data for natural language processing tasks such as part-of-speech tagging ... Linguistics Semisupervised condensed nearest neighbor for part-of-speech tagging Anders Søgaard Center for Language Technology University of Copenhagen Njalsgade 142, DK-2300 Copenhagen S soegaard@hum.ku.dk Abstract This...
Uploaded: 07/03/2014, 22:20
Scientific report: "A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging"
... and Part-of-Speech Tagging Wenbin Jiang † Liang Huang ‡ Qun Liu † Yajuan Lü † † Key Lab. of Intelligent Information Processing ‡ Department of Computer & Information Science Institute of ... segmentation and part-of-speech tagging. On the Penn Chinese Treebank 5.0, we obtain an error reduction of 18.5% on segmentation and 12% on joint segmentation and part-of-speech tagging over the perceptron-only ... can be transformed to a tagging problem by assigning each character a boundary tag of the following four types: • b: the begin of the word • m: the middle of the word • e: the end of the word •...
Uploaded: 08/03/2014, 01:20
Scientific report: "Machine Aided Error-Correction Environment for Korean Morphological Analysis and Part-of-Speech Tagging"
... output of tagger. The training is leveraged to learn the error-correction rules. 3 Proposed Model 3.1 The Causes of Part-of-Speech Tagging Error We will mention important causes to make POS tagging ... M.S. Thesis, McGill University, School of Computer Science. G. Lee and J. Lee. 1996. "Rule-based error correction for statistical part-of-speech tagging". Korea-China Joint Symposium ... 125-131. H. Lim, J. Kim, and H. Rim. 1996. "A Korean Transformation-based Part-of-Speech Tagger with Lexical information of mistagged Eojeol". Korea-China Joint Symposium on Ori-...
Uploaded: 08/03/2014, 05:21
Scientific report: "Categorial Fluidity in Chinese and its Implications for Part-of-speech Tagging"
... Fluidity in Chinese and its Implications for Part-of-speech Tagging Oi Yee Kwong Benjamin K. Tsou Language Information Sciences Research Centre City University of Hong Kong, Kowloon, Hong Kong {rlolivia, ... Applications. In Proceedings of the ICCLC International Conference on Chinese Language Computing, Chicago, pages 233-238. Xia, F. 2000. The Part-Of-Speech Tagging Guidelines for the Penn Chinese ... each tag consists of a letter code for the general classification (i.e. noun, verb, etc.) of the word, and another for the sub-classification according to the particular context. For example, when...
Uploaded: 08/03/2014, 21:20
Scientific report: "A Stacked Sub-Word Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging"
... 2011. © 2011 Association for Computational Linguistics A Stacked Sub-Word Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging Weiwei Sun Department of Computational Linguistics, ... the lack of morphology that often provides important clues for POS tagging, and the POS tags contain much syntactic information, which need context information within a large window for disambiguation. ... s_k = {c[i : j]} denote the set of all segments of a partition. Given multiple partitions of a character sequence S = {s_k}, there is one and only one merged partition s_S = {c[i : j]} s.t. 1....
Uploaded: 17/03/2014, 00:20
Scientific report: "Minimized Models for Unsupervised Part-of-Speech Tagging"
... new methods for unsupervised part-of-speech tagging. We adopt the problem formulation of Merialdo (1994), in which we are given a raw word sequence and a dictionary of legal tags for each word ... In Proceedings of the ACL. K. Toutanova and M. Johnson. 2008. A Bayesian LDA-based model for semi-supervised part-of-speech tagging. In Proceedings of the Advances in Neural Information Processing ... IJCNLP of the AFNLP, pages 504–512, Suntec, Singapore, 2-7 August 2009. © 2009 ACL and AFNLP Minimized Models for Unsupervised Part-of-Speech Tagging Sujith Ravi and Kevin Knight University of Southern...
Uploaded: 17/03/2014, 01:20
Scientific report: "Part-of-Speech Tagging Considering Surface Form for an Agglutinative Language"
Uploaded: 23/03/2014, 19:20
Document: Scientific report: "Fast and Robust Part-of-Speech Tagging Using Dynamic Model Selection"
... and Robust Part-of-Speech Tagging Using Dynamic Model Selection Jinho D. Choi Department of Computer Science University of Colorado Boulder choijd@colorado.edu Martha Palmer Department of Linguistics University ... Singer. 2003. Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics ... 2011. Semi-supervised condensed nearest neighbor for part-of-speech tagging. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies,...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "A Fully Bayesian Approach to Unsupervised Part-of-Speech Tagging∗"
... Bayesian Approach to Unsupervised Part-of-Speech Tagging∗ Sharon Goldwater Department of Linguistics Stanford University sgwater@stanford.edu Thomas L. Griffiths Department of Psychology UC Berkeley tom_griffiths@berkeley.edu Abstract Unsupervised ... possible parts of speech allowed for each word. (This also fixes W_t, the number of possible words for tag t.) The dictionary was constructed by listing, for each word, all tags found for that ... estimation (MLE) of the model parameters. We show using part-of-speech tagging that a fully Bayesian approach can greatly improve performance. Rather than estimating a single set of parameters,...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "Arabic Tokenization, Part-of-Speech Tagging and Morphological Disambiguation in One Fell Swoop"
... morphologically tagging (including part-of-speech tagging) are the same operation, which consists of three phases. First, we obtain from our morphological analyzer a list of all possible analyses for the ... the values of a large number of (orthogonal) features, such as basic part-of-speech (i.e., noun, verb, and so on), voice, gender, number, information about the clitics, and so on. 2 For Arabic, ... tokenizing and morphologically tagging (including part-of-speech tagging) Arabic words in one process. We learn classifiers for individual morphological features, as well as ways of using these classifiers...
Uploaded: 20/02/2014, 15:20
Scientific report: "A Cost Sensitive Part-of-Speech Tagging: Differentiating Serious Errors from Minor Errors"
... 2011. Semisupervised condensed nearest neighbor for part-of-speech tagging. In Proceedings of the 49th Annual Meeting of the Association of Computational Linguistics. pp. 48–52. Drahomíra ... serious errors help to improve the performance of subsequent NLP tasks. 1 Introduction Part-of-speech (POS) tagging is needed as a preprocessor for various natural language processing (NLP) ... Since POS tagging is normally performed in the early step of NLP tasks, the errors in POS tagging are critical in that they affect subsequent steps and often lower the overall performance of NLP...
Uploaded: 07/03/2014, 18:20
Scientific report: "Examining the Content Load of Part of Speech Blocks for Information Retrieval"
... membership of the parts of speech within such blocks reflects the content load of the blocks, on the basis that open class parts of speech are more content-bearing than closed class parts of speech. ... Association for Computational Linguistics Examining the Content Load of Part of Speech Blocks for Information Retrieval Christina Lioma Department of Computing Science University of Glasgow 17 ... U.K. xristina@dcs.gla.ac.uk Iadh Ounis Department of Computing Science University of Glasgow 17 Lilybank Gardens Scotland, U.K. ounis@dcs.gla.ac.uk Abstract We investigate the connection between part of speech (POS) distribution...
Uploaded: 08/03/2014, 02:21
Scientific report: "Part of Speech Tagger for Assamese Text"
... a manually tagged corpus of about 10000 words for training, we obtain a tagging accuracy of nearly 87% for test inputs. 1 Introduction Part of Speech (POS) tagging is the process of marking up words ... for machine assisted POS tagging of Bangla corpora. Pammi and Prahllad (2007) developed a POS tagger and chunker using Decision Forests. This work explored different methods for POS tagging of ... richness of the language, many words of Assamese occur in secondary forms in texts. This increases the number of POS tags that needed for the language. Also, often there are differences of opinion...
Uploaded: 17/03/2014, 02:20
Scientific report: "Unsupervised Part-of-Speech Tagging Employing Efficient Graph Clustering"
Uploaded: 17/03/2014, 04:20