A Phrase-based Statistical Model for SMS Text Normalization

Document: Scientific report: "A Phrase-based Statistical Model for SMS Text Normalization" ppt

... common transformations. 4 SMS Normalization We view the SMS language as a variant of English language with some derivations in vocabulary and grammar. Therefore, we can treat SMS normalization ... inadequate for providing a complete solution for SMS normalization. 2.3 SMS Normalization versus Text Paraphrasing Problem Others may regard SMS normalization as a paraphrasing problem. Broadly ... 2006. © 2006 Association for Computational Linguistics A Phrase-based Statistical Model for SMS Text Normalization AiTi Aw, Min Zhang, Juan Xiao, Jian Su Institute of Infocomm Research 21 Heng...

Uploaded: 20/02/2014, 12:20

8 400 0
Document: Scientific report: "A Joint Statistical Model for Simultaneous Word Spacing and Spelling Error Correction for Korean" pdf

... needs a word dictionary and takes long time for searching many character combinations. 61 4.2 Experiment Results and Analyses We used two separate Eumjeol n-grams as language models for ... be divided into statistical algorithms and rule-based algorithms. Statistical algorithms generally use character n-gram (Eojeol 1 or Eumjeol 2 n-gram in Korean) (Kang and Woo, 2001; Kwon, ... exist spaces As shown above, the performance is dependent of the language model (n-gram) performance. Jaso transition probabilities can be obtained easily from small corpus because the...

Uploaded: 20/02/2014, 12:20

4 523 0
Scientific report: "A Unified Statistical Model for the Identification of English BaseNP" pptx

... important subtask for many natural language processing applications, such as partial parsing, information retrieval and machine translation. A baseNP is a simple noun phrase that does not contain other ... pp.218-224. COLING-ACL’98 Lance A. Ramshaw and Michael P. Marcus (In Press). Text chunking using transformation-based learning. In Natural Language Processing Using Very large Corpora. Kluwer. Originally appeared in ... Treebank II, and the definition of baseNP is the same as Ramshaw’s, Table 1 summarizes the average performance on both baseNP tagging and POS tagging, each section of the whole Penn Treebank was...

Uploaded: 08/03/2014, 05:20

8 482 0
Document: Scientific report: "A Statistical Model for Unsupervised and Semi-supervised Transliteration Mining" pptx

... system learns this as a non-transliteration but it is wrongly annotated as a transliteration in the gold standard. Arabic nouns have an article “al” attached to them which is translated in English as ... uses Hidden Markov Models (Nabende, 2010; Darwish, 2010; Jiampojamarn et al., 2010), Finite State Automata (Noeman and Madkour, 2010) and Bayesian learning (Kahki et al., 2011) to learn transliteration pairs ... International Language Resources and Evaluation (LREC’10), Valletta, Malta. Sittichai Jiampojamarn, Kenneth Dwyer, Shane Bergsma, Aditya Bhargava, Qing Dou, Mi-Young Kim, and Grzegorz Kondrak....

Uploaded: 19/02/2014, 19:20

9 521 0
Document: Scientific report: "A Localized Prediction Model for Statistical Machine Translation" ppt

... paper, we present a block-based model for statistical machine translation. A block is a pair of phrases which are translations of each other. For example, Fig. 1 shows an Arabic-English translation ... Conference (HLT 04), pages 177–184, Boston, MA, May. Christoph Tillmann and Fei Xia. 2003. A Phrase-based Unigram Model for Statistical Machine Translation. In Companion Vol. of the Joint HLT and NAACL Conference ... set of candidates. This computational advantage is the main reason that we adopt the local model in this paper. 3.3 Global versus Local Models Both the global and the localized log-linear models...

Uploaded: 20/02/2014, 15:20

8 578 0
Document: A COMPREHENSIVE QUANTITATIVE MODEL FOR ANALYZING BOND REFUNDING DECISIONS pptx

... replaced by a floating-rate bond, a floating-rate bond replaced by a fixed-rate bond, and a floating-rate bond replaced by another floating-rate bond with a different index or a different margin. MOTIVATION ... have to be evaluated on a case by case basis to calculate the exact costs or savings produced by the various interacting variables. Consequently, there is a need for an interactive computer model ... fixed-rate or floating-rate bonds can similarly be investigated by suitably amending the appropriate input variables. Several what-if scenarios can be investigated (or simulated) to determine breakeven...

Uploaded: 15/02/2014, 13:20

9 357 1
Document: Towards a conceptual reference model for project management information systems ppt

... outlined above by introducing a very fundamental data structure called Initiative (Fig. 3). An initiative is a generalization of any form of action that has a defined start and end date and is undertaken to reach a goal. Therefore, an initiative may be a program, a project, a sub-project, a project phase, a work package, an activity or a task (indicated by the inheritance relationship between ... Their feasibility, profitability, and strategic impact are analyzed so that a final decision can be made regarding their implementation (Idea Evaluation). This phase ends with a formal go/no-go...

Uploaded: 18/02/2014, 07:20

12 721 0
Document: Scientific report: "A Hybrid Hierarchical Model for Multi-Document Summarization" ppt

... 4: Manual Evaluations Here, we manually evaluate quality of summaries, a common DUC task. Human annotators are given two sets of summary text for each document set, generated from two approaches: ... 12 Overall 24 66 2 Table 4: Frequency results of manual quality evaluations. Results are statistically significant based on t-test. Tie indicates evaluations where two summaries are rated equal. according ... this paper. In this paper, we present a novel approach that formulates MDS as a prediction problem based on a two-step hybrid model: a generative model for hierarchical topic discovery and a regression model...

Uploaded: 20/02/2014, 04:20

10 559 0
Document: Scientific report: "A Unified Graph Model for Sentence-based Opinion Retrieval" pdf

... represented by a bag-of-word. Among the words, there is a topic term Avatar (t1) occurring twice, i.e. Avatar in A and Avatar in C, and two sentiment words comfortable (o1) and favorite (o2) ... 4.1.1 Benchmark Datasets Our experiments are based on the Chinese benchmark dataset, COAE08 (Zhao et al., 2008). COAE dataset is the benchmark data set for the opinion retrieval track in the ... performance. In this paper, we propose a sentence-based approach based on a new information representation, namely topic-sentiment word pair, to capture intra-sentence contextual information...

Uploaded: 20/02/2014, 04:20

9 585 0
Document: Scientific report: "A probabilistic generative model for an intermediate constituency-dependency representation" pptx

... utilize a state of the art parser for PS trees (Charniak, 1999), and transform each candidate to TDS. This strategy can be considered a first step to efficiently test and compare different models before ... next. 3.4 Evaluation Metrics for TDS The re-ranking framework described above, allows us to keep track of the original PS of each TDS candidate. This provides an implicit advantage for evaluating ... (92). Michael J. Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania. Marie-Catherine de Marneffe and Christopher D. Manning....

Uploaded: 20/02/2014, 04:20

6 556 0
Document: Scientific report: "A Phonotactic Language Model for Spoken Language Identification" pptx

... Recognition Evaluation (LRE) data. The database was intended to establish a baseline of performance capability for language recognition of conversational telephone speech. The database contains recorded ... identification using Gaussian Mixture model tokenization, in Proc. of ICASSP. Yonghong Yan, and Etienne Barnard. 1995. An approach to automatic language identification based on language dependent ... 515–522, Ann Arbor, June 2005. © 2005 Association for Computational Linguistics A Phonotactic Language Model for Spoken Language Identification Haizhou Li and Bin Ma Institute for Infocomm Research...

Uploaded: 20/02/2014, 15:20

8 437 0
Document: Scientific report: "A SPEECH-FIRST MODEL FOR REPAIR DETECTION AND CORRECTION" docx

... these cases as repairs, as well as to distinguish them from nonfragment repairs. Thus, pausal duration may serve as a general acoustic cue for repair detection, particularly for the class ... that rely on accurate transcription to identify repair candidates "text-first". Text-first approaches have explored the potential contributions of lexical and grammatical information ... glottalization. 5 Although interruption glottalization is usually associated with fragments, not all fragments are glottalized. In our database, 62% of fragments are not glottalized, and...

Uploaded: 20/02/2014, 21:20

8 502 0
Designing a Virtual Reality Model for Aesthetic Surgery docx

... serve as a three-dimensional atlas of the anatomy germane to aesthetic surgery of the face. Although these models can be viewed from any angle and made selectively transparent to illustrate anatomical ... photographs enhanced in Adobe Photoshop 7.0 and materials designed in Maya. RESULTS A virtual reality model of surgical superficial facial anatomy was created. Included in this model are the ... cleft palate repair. Plast. Reconstr. Surg. 115: 236, 2005. 21. Cutting, C., Oliker, A., Khorammabadi, D., and Haddad, B. A deformer-based surgical simulator program for cleft lip and palate surgery....

Uploaded: 07/03/2014, 17:20

5 305 0
Scientific report: "A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" pdf

... propose a cascaded linear model for joint Chinese word segmentation and part-of-speech tagging. With a character-based perceptron as the core, combined with real-valued features such as language models, ... at the same time, we expand boundary tags to include POS information by attaching a POS to the tail of a boundary tag as a postfix following Ng and Low (2004). As each tag is now composed of a ... approach of discriminative models treats segmentation as a labelling problem by assigning each character a boundary tag (Xue and Shen, 2003), Joint S&T can be conducted in a labelling fashion...

Uploaded: 08/03/2014, 01:20

8 445 0