... percentages are the average of the judges' individual classifications. Acquiring Lexical Generalizations from Corpora: A Case Study for Diathesis Alternations Maria Lapata School of Cognitive ... threshold values varied from frame to frame but not from verb to verb and were determined by taking into account for each frame its overall frame frequency, which was estimated from the COMLEX subcategorization ... alternating verbs from large balanced corpora by using partial-parsing methods and taxonomic information, and discuss how corpus data can be used to quantify linguistic generalizations. ...
... automatic acquisition of lexical information from large repositories of unannotated text (such as the web, corpora of published text, etc.) is starting to produce large scale lexical resources ... paper describes a novel system for acquiring adjectival subcategorization frames (SCFs) and associated frequency information from English corpus data. The system incorporates a decision-tree classifier ... frames from untagged text. In Meeting of the Association for Computational Linguistics, pages 209–214. E. J. Briscoe and J. Carroll. 1997. Automatic Extraction of Subcategorization from Corpora. ...
... CHAPTER 3: A COMPARATIVE STUDY ON LEXICAL COHESIVE DEVICES IN ENGLISH AND VIETNAMESE CORPORATE ADVERTISEMENTS 1. General picture of lexical cohesive devices in corporate advertisements Understanding ... clearly and in detail. However, in Vietnamese corporate advertisements, the copywriters hardly notice the equivalent lexical items that are rendered from Vietnamese into English or vice versa. ... description of lexical cohesion features in English - figure out how these devices are used in texts - make comparative analysis of lexical cohesion between English and Vietnamese corporate advertisements...
... Singapore, 4 August 2009. ©2009 ACL and AFNLP Extracting Comparative Sentences from Korean Text Documents Using Comparative Lexical Patterns and Machine Learning Techniques Seon Yang Department ... comparative sentences from text documents. This paper first investigates many comparative sentences referring to previous studies and then defines a set of comparative keywords from them. A sentence ... to eliminate non-comparative sentences only from comparative sentence candidates with a CKL2 keyword. 4 Eliminating Non-comparative Sentences from the Candidates 3 As you can see in...
... (Cucerzan and Yarowsky, 1999) and (Collins and Singer, 1999) present algorithms to obtain NEs from untagged corpora. However, they focus on the classification stage of already segmented entities, and ... feature vector from this example in the following manner: First, we split both words into all possible substrings of up to size two: We build a feature vector by coupling substrings from the two ... Computational Linguistics Weakly Supervised Named Entity Transliteration and Discovery from Multilingual Comparable Corpora Alexandre Klementiev Dan Roth Dept. of Computer Science University of Illinois Urbana,...
... information from free-text has been successfully carried out in the past (Hearst, 1999; Manning, 1993), automatically extracting lexical resources (including terminological definitions) from text ... information from a machine-readable dictionary. 3 Locating metalinguistic information in text: two approaches When implementing an IE application to mine metalinguistic information from text, ... tackle is how to obtain a reliable set of candidate sentences from free text for input into the next phases of extraction. From our initial corpus analysis we selected 44 patterns that showed...
... Japanese-English language pair, especially if involving the comparable corpora. Re-scoring through the Comparable Corpora Comparable corpora could be considered for the disambiguation of translation ... comparable corpora-based techniques, respectively compared to the hybrid two-stages comparable corpora and linguistics-based pruning. The proposed approach based on bi-directional comparable corpora ... TR2-007. P. Fung. 2000. A Statistical View of Bilingual Lexicon Extraction: From Parallel Corpora to Non-Parallel Corpora. In Jean Veronis, Ed. Parallel Text Processing. G. Grefenstette. 1999....
... ( (from SFO) (to San Francisco))))).) GR (Tell ((me (((about the) public) transportation)) ( (from SFO) ((to San) (Francisco .))))) GB ((Tell (me (about (((the public) transportation) ( (from ... corpus, the inside probabilities of longer spans of c are computed from INSIDE-OUTSIDE REESTIMATION FROM PARTIALLY BRACKETED CORPORA Fernando Pereira 2D-447, AT&T Bell Laboratories PO Box ... inferred from raw text. In addition, the number of iterations needed to reach a good grammar can be reduced; in extreme cases, a good solution is found from parsed text but not from raw text....
... nouns or proper nouns is converted from their positions in the text into a vector. 3. Match pairs of positional difference vectors, giving scores. All vectors from English and Chinese are matched ... dim(V2) A Pattern Matching Method for Finding Noun and Proper Noun Translations from Noisy Parallel Corpora Pascale Fung Computer Science Department Columbia University New York, NY ... in the texts. For every word pair from this lexicon, we had obtained a DTW score and a DTW path. If we plot the points on the DTW paths of all word pairs from the lexicon, we get a graph...
... Data from Bilingual Texts. In Proceedings of the First International Lexical Acquisition Workshop, Detroit. Church, K., Gale, W., Hanks, P., and Hindle, D. (1991). Using Statistics in Lexical ... linguistic analysis. The originality of our approach comes from the fact that collocations are not extracted from raw texts, but rather from syntactically parsed texts. The linguistic analysis ... textual corpora from the World Trade Organisation (WTO), which consist in parallel documents in three languages: English, French and Spanish. All the examples given in this paper are taken from...
... translation knowledge acquisition from WWW news sites, this paper studies issues on the effect of cross-language retrieval of relevant texts in bilingual lexicon acquisition from comparable corpora. We experimentally ... parallel/comparative corpora. However, the sizes as well as the domain of existing parallel/comparative corpora are limited, while it is very expensive to manually collect parallel/comparative corpora. ... translation knowledge acquisition from parallel/comparative corpora, various kinds of translation knowledge are acquired. Within this framework of translation knowledge acquisition from WWW news sites, this...
... this paper we presented a novel algorithm for rapidly prototyping virtual instructors from human-human corpora without manual annotation. Using our algorithm and the GIVE corpus we have generated ... sum, this paper presents a novel way of automatically prototyping task-oriented virtual agents from corpora who are able to effectively and naturally help a user complete a task in a virtual ... world. References Sudeep Gandhe and David Traum. 2007. Creating spoken dialogue characters from corpora without annotations. In Proceedings of Interspeech, Belgium. Andrew Gargett, Konstantina...