Document: Scientific report: "Unsupervized Word Segmentation: the case for Mandarin Chinese" doc

... measure without the need for fine-tuning the balance between the two. The evolution of the results w.r.t. word length is consistent with the supervized cross-evaluation results of the various segmentation ... characters). However, they optimize their parameter for each setting. We therefore consider that their system does take into account the level of processing whic...

Uploaded: 19/02/2014, 19:20

Document: Scientific report: "Joint Word Segmentation and POS Tagging using a Single Perceptron" docx

... some partial words are “justified” as complete words by the current POS information. On the other hand, if partial words are not given POS tag features, the correct segmentation for long words can ... 1: The perceptron learning algorithm useful only for the POS “number word” in the baseline tagger, is also an effective indicator of the segmentation of the two words (esp...

Uploaded: 20/02/2014, 09:20

Document: Scientific report: Evolutionary relationships of the prolyl oligopeptidase family enzymes docx

... life forms and that the β-propeller domain has been part of the family for billions of years. There are striking differences in the mutation rates between the enzymes and POP was found to be the ... based on the initial 3D alignment. The neighbor-joining tree was constructed for the peptidase domains of the enzymes (corresponding to the pig POP residues 1–72 and 428–71...

Uploaded: 19/02/2014, 13:20

Document: Scientific report: "Fast Decoding and Optimal Decoding for Machine Translation" doc

... of e), (the fertility of the NULL word), (the k-th French word produced by e in a), (the position of in f), (the position of the first fertile word to the left of e in a), (the ceiling of the average ... f, then an optimal decoder will search for an e that maximizes P(e | f). The symbols in this formula are: (the length of e), (the length of f), e (the i-th English word...

Uploaded: 20/02/2014, 18:20

Document: Scientific report: "Large linguistically-processed Web corpora for multiple languages" doc

... various documents from the annotated corpus, we decided to perform a further round of cleaning. There are two reasons for this: first, we can exploit the annotation to find other anomalous documents, ... (dictionary definitions of the word, top pages of companies with the word in their name), whereas combining more than two words retrieved pages with lists of words, rather than coll...

Uploaded: 22/02/2014, 02:20

Document: Scientific report: "Text Alignment in a Tool for Translating Revised Documents" docx

... materials in the draft. In such cases, in addition to the revised text, the tool copies into the draft both the relevant text from the old version and the relevant translation and marks them appropriately. ... sections in the TL text from the existing translation and update materials from the SL text, thereby reducing the effort required from the translator. Thi...

Uploaded: 22/02/2014, 10:20

Document: Scientific report: "Improving Word Representations via Global Context and Multiple Word Prototypes" pdf

... contexts for clustering word instances, which is used in the multi-prototype version of our model that accounts for words with multiple senses. We evaluate our new model on the standard WordSim-353 ... via a joint training objective. The model learns word representations that better capture the semantics of words, while still keeping syntactic information. These improv...

Uploaded: 19/02/2014, 19:20

Document: Scientific report: "Enhanced word decomposition by calibrating the decision threshold of probabilistic models and using a model ensemble" pdf

... describe the process of word generation from the left to the right by alternately using two dice, the first for deciding whether to place a morpheme boundary in the current word position and the ... Another way of classifying approaches is based on the learning aspect during the construction of the morphological model. If the data for training the model has t...

Uploaded: 20/02/2014, 04:20

Document: Scientific report: "Learning Word-Class Lattices for Definition and Hypernym Extraction" doc

... $\omega^j_b$, 0 otherwise, where $\omega^k_a$ and $\omega^j_b$ are the a-th and b-th word classes of $s'_k$ and $s'_j$, respectively. In other words, the matching score equals 1 if the a-th and the b-th tokens of the two ... Figure 1: The Word-Class Lattice for the sentences in Table 1. The support of each word class is reported beside the...

Uploaded: 20/02/2014, 04:20

Document: Scientific report: "Learning Word Vectors for Sentiment Analysis" ppt

... probability to a document d using a joint distribution over the document and θ. The model assumes each word $w_i \in d$ is conditionally independent of the other words given θ. The probability of a document ... on the average polarity of documents in which the words occur. Given a set of labeled documents D where $s_k$ is the sentiment label for document $d_k$, we wish to max...

Uploaded: 20/02/2014, 04:20
