Scientific report: "Learning Intonation Rules for Concept to Speech Generation" pptx

Scientific report document: "Learning Word Vectors for Sentiment Analysis" ppt

... Huang, Andrew Y. Ng, and Christopher Potts Stanford University Stanford, CA 94305 [amaas, rdaly, ptpham, yuze, ang, cgpotts]@stanford.edu Abstract Unsupervised vector-based approaches to semantics can ... semantic word vectors by applying singular value decomposition (SVD) to factor a term–document co-occurrence matrix. It is typical to weight and normalize the matrix values prior...

Uploaded: 20/02/2014, 04:20

Scientific report document: "Models and Training for Unsupervised Preposition Sense Disambiguation" pptx

... sufficiently coarse to allow for a good generalization. Unknown words are assumed to have all possible senses applicable to their respective word class (i.e. all noun senses for words labeled ... P(ō|p̄) · P(o|ō) We want to incorporate as much information as possible into the model to constrain the choices. In Figure 1c, we condition p̄ on both h̄ and ō, to reflect the fac...

Uploaded: 20/02/2014, 05:20

Scientific report document: "A Multimodal Interface for Access to Content in the Home" pdf

... allowing the user to ‘glue’ together references to multiple actors or directors in order to constrain the search. For example, they can say “movies with THIS actor and THIS director” and point ... and directors are presented as buttons. Pointing at (i.e., clicking on) these buttons results in a search for all of the movies with that particular actor or director, allowing us...

Uploaded: 20/02/2014, 12:20

Scientific report document: "Syntactic Phrase Reordering for English-to-Arabic Statistical Machine Translation" pptx

... System description For the English source, we first tokenize using the Stanford Log-linear Part-of-Speech Tagger (Toutanova et al., 2003). We then proceed to split the data into smaller sentences ... feature to the Factored Model that models noun case agreement and verb person conjugation, and show that it leads to a more grammatically correct output for English-to-Greek...

Uploaded: 22/02/2014, 02:20

Scientific report document: "Contrastive accent in a data-to-speech system" doc

... meanings. A generation system for spoken language should therefore be able to produce appropriate accentuation patterns for its output messages. One of the factors determining accentuation ... mentioned for the second time and therefore regarded as 'given'. However, this lack of accent creates the impression that Kluivert scored for Ajax too, whereas in fact he sco...

Uploaded: 22/02/2014, 03:20

Scientific report document: "Learning Word-Class Lattices for Definition and Hypernym Extraction" doc

... backtracking from M_{|s_k|,|s_j|} to M_{0,0}. We add to the set of vertices V the tokens of the generalized sentence s_j for which there is no alignment to s_k and we add to E the edges (ω j 1 , ... definitional, that is, they provide a formal explanation for the term of interest. While it is not feasible to manually search texts for definitions, this task can be automatiz...

Uploaded: 20/02/2014, 04:20

Scientific report document: "Learning Sub-Word Units for Open Vocabulary Speech Recognition" doc

... likely pronunciation for each word. It is straightforward to extend to multiple pronunciations by first sampling a pronunciation for each word and then sampling a segmentation for that pronunciation. Once ... continuous speech recognition. IEEE Transactions on Speech and Audio Processing, 9(3). Christopher White, Jasha Droppo, Alex Acero, and Julian Odell. 2007. Maximum entro...

Uploaded: 20/02/2014, 04:20

Scientific report document: "Mining the Web for Language Learning" pdf

... [Footnote 1: http://www.engkoo.com.] search functions are often limited, making it hard for users to effectively find information they are interested in. Lastly, existing tools tend to focus exclusively on dictionary, machine translation ... title/non-title classifiers, are applied to each term/sentence pair. The readability evaluator assigns a score to each term/sentence pair according to Form...

Uploaded: 20/02/2014, 05:20

Scientific report document: "Generalized Expectation Criteria for Semi-Supervised Learning of Conditional Random Fields" pdf

... for WORD=BROWN it would be lower, perhaps 0.4. These distributions need not be estimated with great precision; it is far better to have the freedom to express shades of gray than to be force ... unsupervised approaches to the same problems, seeking to improve accuracy with the addition of lower cost unlabeled data. Traditional approaches to semi-supervised learning are ap...

Uploaded: 20/02/2014, 09:20

Scientific report document: "Learning Source-Target Surface Patterns for Web-based Terminology Translation" pdf

... number of tokens in between. (5) Extract the strings of tokens from E to F (or the other way around) within a maximum distance of d (d is set to 3) to produce ranked surface patterns P. For ... conjunctive query (i.e. E AND F) for each pair (E, F) in a bilingual term list to a search engine. (2) Tokenize the retrieved summaries into three types of tokens: I. A punctuatio...

Uploaded: 20/02/2014, 15:20
