adjectival subcategorization frames from corpora

Scientific report document: "AUTOMATIC ACQUISITION OF SUBCATEGORIZATION FRAMES FROM UNTAGGED TEXT" doc

Scientific report

... AUTOMATIC ACQUISITION OF SUBCATEGORIZATION FRAMES FROM UNTAGGED TEXT Michael R. Brent MIT AI Lab 545 Technology Square Cambridge, ... open-class dictionary) and generates a partial list of verbs occurring in the text and the subcategorization frames (SFs) in which they occur. Verbs are detected by a novel technique based on ... corpora. 1 INTRODUCTION This paper describes an implemented program that takes an untagged text corpus and generates a partial list of verbs occurring in it and the subcategorization frames...

Scientific report: "Automatic Acquisition of Adjectival Subcategorization from Corpora" docx

Scientific report

... describes a novel system for acquiring adjectival subcategorization frames (SCFs) and associated frequency information from English corpus data. The system incorporates a decision-tree classifier ... subcategorization frames from untagged text. In Meeting of the Association for Computational Linguistics, pages 209–214. E. J. Briscoe and J. Carroll. 1997. Automatic Extraction of Subcategorization from Corpora. ... first systems capable of automatically learning a small number of verbal subcategorization frames (SCFs) from English corpora emerged over a decade ago (Brent, 1991; Manning, 1993). Subsequent...

Scientific report document: "Acquiring Lexical Generalizations from Corpora: A Case Study for Diathesis Alternations" pdf

Scientific report

... values varied from frame to frame but not from verb to verb and were determined by taking into account for each frame its overall frame frequency which was estimated from the COMLEX subcategorization ... corpus idiosyncrasies can affect subcategorization frequencies (cf. Roland and Jurafsky (1998) for an extensive discussion). This suggests that different corpora may give different results ... shallow syntactic processing. Alternating verbs were acquired from the BNC by using Gsearch as a chunk parser. Erroneous frames were discarded by applying linguistic heuristics, statistical...

Scientific report document: "Weakly Supervised Named Entity Transliteration and Discovery from Multilingual Comparable Corpora" ppt

Scientific report

... (Cucerzan and Yarowsky, 1999) and (Collins and Singer, 1999) present algorithms to obtain NEs from untagged corpora. However, they focus on the classification stage of already segmented entities, and ... feature vector from this example in the following manner: First, we split both words into all possible substrings of up to size two: We build a feature vector by coupling substrings from the two ... Computational Linguistics Weakly Supervised Named Entity Transliteration and Discovery from Multilingual Comparable Corpora Alexandre Klementiev Dan Roth Dept. of Computer Science University of Illinois Urbana,...

Scientific report document: "Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval" pptx

Scientific report

... Japanese-English language pair, especially if involving the comparable corpora. Re-scoring through the Comparable Corpora Comparable corpora could be considered for the disambiguation of translation ... comparable corpora-based techniques, respectively compared to the hybrid two-stages comparable corpora and linguistics-based pruning. The proposed approach based on bi-directional comparable corpora ... TR2-007. P. Fung. 2000. A Statistical View of Bilingual Lexicon Extraction: From Parallel Corpora to Non-Parallel Corpora. In Jean Veronis, Ed. Parallel Text Processing. G. Grefenstette. 1999....

Scientific report document: "INSIDE-OUTSIDE REESTIMATION FROM PARTIALLY BRACKETED CORPORA" ppt

Scientific report

... ( (from SF0) (to San Francisco))))).) GR (Tell ((me (((about the) public) transportation)) ( (from SF0) ((to San) (Francisco .))))) GB ((Tell (me (about (((the public) transportation) ( (from ... corpus, the inside probabilities of longer spans of c are computed from INSIDE-OUTSIDE REESTIMATION FROM PARTIALLY BRACKETED CORPORA Fernando Pereira 2D-447, AT&T Bell Laboratories PO Box ... inferred from raw text. In addition, the number of iterations needed to reach a good grammar can be reduced; in extreme cases, a good solution is found from parsed text but not from raw text....

Scientific report document: "A Pattern Matching Method for Finding Noun and Proper Noun Translations from Noisy Parallel Corpora" doc

Scientific report

... nouns or proper nouns is converted from their positions in the text into a vector. 3. Match pairs of positional difference vectors, giving scores. All vectors from English and Chinese are matched ... dim(V2) 240 A Pattern Matching Method for Finding Noun and Proper Noun Translations from Noisy Parallel Corpora Pascale Fung Computer Science Department Columbia University New York, NY ... in the texts. For every word pair from this lexicon, we had obtained a DTW score and a DTW path. If we plot the points on the DTW paths of all word pairs from the lexicon, we get a graph...

Scientific report document: "Creating a Multilingual Collocation Dictionary from Large Text Corpora" docx

Scientific report

... linguistic analysis. The originality of our approach comes from the fact that collocations are not extracted from raw texts, but rather from syntactically parsed texts. The linguistic analysis ... textual corpora from the World Trade Organisation (WTO), which consist in parallel documents in three languages: English, French and Spanish. All the examples given in this paper are taken from ... returns chunks of partial analyses. If 132 Creating a Multilingual Collocation Dictionary from Large Text Corpora Luka Nerima, Violeta Seretan, Eric Wehrli Language Technology Laboratory (LATL),...

Scientific report document: "Effect of Cross-Language IR in Bilingual Lexicon Acquisition from Comparable Corpora" pot

Scientific report

... translation knowledge acquisition from WWW news sites, this paper studies issues on the effect of cross-language retrieval of relevant texts in bilingual lexicon acquisition from comparable corpora. We experimentally ... parallel/comparative corpora. However, the sizes as well as the domain of existing parallel/comparative corpora are limited, while it is very expensive to manually collect parallel/comparative corpora. ... translation knowledge acquisition from parallel/comparative corpora, various kinds of translation knowledge are acquired. Within this framework of translation knowledge acquisition from WWW news sites, this...

Scientific report: "Prototyping virtual instructors from human-human corpora" pdf

Scientific report

... this paper we presented a novel algorithm for rapidly prototyping virtual instructors from human-human corpora without manual annotation. Using our algorithm and the GIVE corpus we have generated ... sum, this paper presents a novel way of automatically prototyping task-oriented virtual agents from corpora who are able to effectively and naturally help a user complete a task in a virtual ... world. References Sudeep Gandhe and David Traum. 2007. Creating spoken dialogue characters from corpora without annotations. In Proceedings of Interspeech, Belgium. Andrew Gargett, Konstantina...

Scientific report: "Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpora" pot

Scientific report

... engineering is desired. Paraphrases can be extracted from non-parallel corpora using contextual similarity (Lin, 1998). They can also be obtained from parallel corpora if such data is available (Barzilay ... Ibrahim et al., 2003). Recently, there are also a number of studies that extract paraphrases from multilingual corpora (Bannard and Callison-Burch, 2005; Zhao et al., 2008). The approach in (Barzilay ... Singapore, 4 August 2009. © 2009 ACL and AFNLP Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpora Xiaoyin Wang1,2, David Lo1, Jing Jiang1, Lu Zhang2, Hong Mei2 1School...

Scientific report: "Extracting Parallel Sub-Sentential Fragments from Non-Parallel Corpora" pdf

Scientific report

... field. Comparable corpora exhibit various degrees of parallelism. Fung and Cheung (2004a) describe corpora ranging from noisy parallel, to comparable, and finally to very non-parallel. Corpora from the ... comparable corpora from the Romanian translations of the European Union’s acquis communautaire which we mined from the Web, and has about 10M English words. We downloaded comparable data from three ... lexicon extraction from comparable corpora. In ACL 2004, pages 527–534. Philipp Koehn and Kevin Knight. 2000. Estimating word translation probabilities from unrelated monolingual corpora using...

Scientific report: "Automatic Identification of Word Translations from Unrelated English and German Corpora" pot

Scientific report

... corpora, but - as empirically shown by Rapp - it also holds for non-parallel corpora. It can be expected that this clue will work best with parallel corpora, second-best with comparable corpora, ... translations from non-parallel corpora. Proceedings of the 5th Annual Workshop on Very Large Corpora, Hong Kong, 192-202. Fung, P.; Yee, L. Y. (1998). An IR approach for translating new words from ... word associations based on the co-occurrences of words in large corpora. In: Proceedings of the 1st Workshop on Very Large Corpora: Columbus, Ohio, 84-93. 526 German test word Baby...

Corporate Executive Salaries – The Argument from Economic Efficiency ppt

College - University

... demonstrated that for Australian corporations, the correlation between corporate performance and executive salary was negative, that is, the highest paid executives controlled corporations with the ... distinguish the contribution of the executive from the fortunes of the corporation as a whole. Attempts to compare performance against similar corporations might allow comparative evaluation ... ciency may have been due to corporate leadership, such as through restructuring of corporations. This is plausible but difficult to prove. It cannot be isolated from other potential causes...

Scientific report: "Creating a Multilingual Collocation Dictionary from Large Text Corpora" ppt

Scientific report

... (paragraph-level) structure of documents is examined, possibly using mark-up from text encoding. 133 Creating a Multilingual Collocation Dictionary from Large Text Corpora Luka Nerima, Violeta Seretan, Eric Wehrli Language ... linguistic analysis. The originality of our approach comes from the fact that collocations are not extracted from raw texts, but rather from syntactically parsed texts. The linguistic analysis ... textual corpora from the World Trade Organisation (WTO), which consist in parallel documents in three languages: English, French and Spanish. All the examples given in this paper are taken from...