Constructing semantic space models from parsed corpora

Scientific report: "Semantic Taxonomy Induction from Heterogenous Evidence"

... COLING-02. M. Hearst. 1992. Automatic Acquisition of Hyponyms from Large Text Corpora. Proc. COLING-92. D. Hindle. 1990. Noun classification from predicate-argument structures. Proc. ACL-90. D. Lenat. ... or automatically discovered textual patterns. By contrast, our algorithm flexibly incorporates evidence from multiple classifiers over heterogenous relationships to optimize the entire structure ... (H_ij | E^H_ij) is then trained using logistic regression, predicting the noun-pair hypernymy label from WordNet from the feature vector of lexico-syntactic patterns. The hypernym classifier described above...

Uploaded: 20/02/2014, 12:20

Scientific report: "Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval"

... Japanese-English language pair, especially if involving the comparable corpora. Re-scoring through the Comparable Corpora. Comparable corpora could be considered for the disambiguation of translation ... comparable corpora-based techniques, respectively compared to the hybrid two-stages comparable corpora and linguistics-based pruning. The proposed approach based on bi-directional comparable corpora ... TR2-007. P. Fung. 2000. A Statistical View of Bilingual Lexicon Extraction: From Parallel Corpora to Non-Parallel Corpora. In Jean Veronis, Ed. Parallel Text Processing. G. Grefenstette. 1999....

Uploaded: 20/02/2014, 16:20

Scientific report: "Inducing German Semantic Verb Classes from Purely Syntactic Subcategorisation Information"

... frequencies range from 8 to 31,710. Our target classification is based on semantic intuitions, not on our knowledge of the syntactic behaviour. As an extreme example, the semantic class Support ... reliable enough to serve as proxy for human judgement (Schulte im Walde, 2002a). 3 German Semantic Verb Classes: Semantic verb classes have been defined for several languages, with dominant examples ... containing an expletive es (frame types including x)? Space limitations allow us only a few insights. Clusters (a) and (b) are pure sub-classes of the semantic verb class Propositional Attitude. The...

Uploaded: 20/02/2014, 21:20

Scientific report: "Effect of Cross-Language IR in Bilingual Lexicon Acquisition from Comparable Corpora"

... Term Correspondences from Comparable Corpora: Previously studied techniques of estimating bilingual term correspondences from comparable corpora are mostly based on the idea that semantically similar ... parallel/comparative corpora. However, the sizes as well as the domain of existing parallel/comparative corpora are limited, while it is very expensive to manually collect parallel/comparative corpora. ... translation knowledge acquisition from parallel/comparative corpora, various kinds of translation knowledge are acquired. Within this framework of translation knowledge acquisition from WWW news sites, this...

Uploaded: 22/02/2014, 02:20

Scientific report: "Generating statistical language models from interpretation grammars in dialogue systems"

... language modeling: Where do we go from here? In Proceedings of IEEE:88(8). Rosenfeld R. 2000. Incorporating Linguistic Structure into Statistical Language Models. In Philosophical Transactions ... trigram language models we used the SRI language modelling toolkit (Stolcke, 2002) with Good-Turing discounting. The first model was generated directly from the MP3 corpus we got from the GF grammar. ... significant improvement. It seems from the tests that the quality of the data is more important than the quantity. This makes extraction of domain data from larger corpora an important issue and increases...

Uploaded: 22/02/2014, 02:20

Document: Early Atomic Models – From Mechanical to Quantum (1904-1913)

... α-particles from a metal surface. The lead plate P is situated between the α-source AB (an active glass tube) and the detecting screen S (observed with microscope M). The only path from source ... apparatus used by Geiger to study α-scattering. In a 2 m evacuated glass tube, the α-particles from the source R passed through a narrow slit S, producing an image on the phosphorescent screen ... one, and then two metal foils. The microscope M could be adjusted to move across the screen. [From Geiger 1908, p. 174] 35 There had been conflicting interpretations of the experimental...

Uploaded: 22/02/2014, 08:20

Scientific report: "Compositional Matrix-Space Models of Language"

... approaches ranging from statistical word space models to symbolic grammar formalisms. 1 Introduction: In computational linguistics and information retrieval, Vector Space Models (Salton et al., ... linguistic models are subsumed by this general idea ranging from purely symbolic approaches (like type systems and categorial grammars) to rather statistical models (like vector space and word space ... sequences of arbitrary length. This way, abstracting from specific initial mental state vectors, our semantic space S can be seen as a function space of mental transformations represented by matrices,...

Uploaded: 07/03/2014, 22:20

Scientific report: "HAL-based Cascaded Model for Variable-Length Semantic Pattern Induction from Psychiatry Web Resources"

... variable-length semantic patterns from all possible combinations of words in the corpora. However, statistical methods may suffer from data sparseness problem, thus they require large corpora with ... between two semantic patterns in the HAL space. As mentioned earlier, after concept combination, a semantic pattern becomes a new concept in the HAL space, which means the semantic pattern ... to annotate the whole web corpora. Besides, it is also impractical to enumerate all possible combinations of words from the web corpora, and then search for the semantic patterns. To address...

Uploaded: 08/03/2014, 02:21

Scientific report: "Generalized Algorithms for Constructing Statistical Language Models"

... Class-based models. In many applications, it is natural and convenient to construct class-based language models, that is models based on classes of words (Brown et al., 1992). Such models are ... by assigning them some probabilities. There are classical techniques for constructing language models such as n-gram models with various smoothing techniques (see Chen and Goodman (1998) and ... construction of language models. We present new and efficient algorithms to address these more general problems. Counting. Classical language models are constructed by deriving statistics from large input...

Uploaded: 08/03/2014, 04:22

Scientific report: "Detecting Highly Confident Word Translations from Comparable Corpora without Any Prior Knowledge"

... of bilingual lexicon extraction from parallel corpora. This assumption should also be reasonable for many types of comparable corpora such as Wikipedia or news corpora, which are topically aligned ... from unrelated English and German corpora. In Proceedings of the 37th Annual ... relaxes them by lowering the threshold and expanding the search space by incrementing the maximum search space ... efficiently bridge the gap between languages. That seed lexicon is usually crawled from the Web or obtained from parallel corpora. Recently, Li et al. (2011) have proposed an approach that improves...

Uploaded: 08/03/2014, 21:20

Scientific report: "Acquisition of Conceptual Data Models from Natural Language Descriptions"

... ad-hoc semantic grammars specialized to the application domain. Norton (1982) describes a program that acquires knowledge of the BASIC programming language's syntax and semantics from a ... explicit conceptual data models from natural language text or dialogue is being investigated. The knowledge brought to bear on this task is classified into syntactic, semantic and systems analysis ... both syntactic and semantic components. As such they are potential models for the development of interfaces to new types of software systems. However, their approach to semantics cannot be...

Uploaded: 09/03/2014, 01:20

Scientific report: "Toolkit for Multi-Level Alignment and Information Extraction from Comparable Corpora"

... defined in the toolkit: “parallel data mining from comparable corpora” and “named entity/terminology extraction and mapping from comparable corpora”. The next section provides a general overview ... mining from comparable corpora aligns comparable corpora in the document level (section 2.1). This step is crucial as the further steps are computationally intensive. To minimise search space, ... sentence pairs are extracted from the aligned comparable corpora (section 2.2). The workflow for named entity (NE) and terminology extraction and mapping from comparable corpora extracts data in...

Uploaded: 16/03/2014, 20:20

Scientific report: "The S-Space Package: An Open Source Package for Word Space Models"

... Benchmarks: Word space benchmarks assess the semantic content of the space through analyzing the geometric properties of the space itself. Currently used benchmarks assess the semantics by inspecting ... information on the algorithms, code documentation and mailing list archives. 2 Word Space Models: Word space models are based on the contextual distribution in which a word occurs. This approach ... approximation, and Word Sense Induction (WSI) models. Document-based models divide a corpus into discrete documents and construct the vector space from word frequencies in the documents. The...

Uploaded: 17/03/2014, 00:20
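
The document-based construction mentioned in the S-Space excerpt (one vector dimension per document, filled with word frequencies) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the S-Space Package's code or API: the names `build_document_space` and `cosine`, the whitespace tokenizer, and the toy corpus are all hypothetical.

```python
from collections import Counter
import math


def build_document_space(documents):
    """Map each word to a vector of its frequencies, one entry per document.

    Hypothetical helper: tokenization is plain lowercased whitespace
    splitting, which a real system would replace with proper preprocessing.
    """
    counts = [Counter(doc.lower().split()) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    # Counter returns 0 for words that do not occur in a document.
    vectors = {word: [c[word] for c in counts] for word in vocabulary}
    return vocabulary, vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy corpus, invented for this example only.
toy_corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "stocks fell as markets closed",
]
vocabulary, vectors = build_document_space(toy_corpus)
```

Words that occur in the same documents end up with similar vectors: here `cosine(vectors["cat"], vectors["dog"])` is positive because both words appear in the second document, while `cosine(vectors["cat"], vectors["stocks"])` is zero because the two words never share a document.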

Scientific report: "Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs"

... snippet. Incorporating a noun phrase chunker would eliminate some of these cases, but far from all of them. We concluded that even such a restrictive pattern is not sufficient for semantic class ... by their semantic lexicon learner were not present in WordNet. Automatic semantic lexicon acquisition could be used to enhance existing resources such as WordNet, or to produce semantic lexicons ... methods have been developed for automatic semantic class identification, under the rubrics of lexical acquisition, hyponym acquisition, semantic lexicon induction, semantic class learning, and web-based...

Uploaded: 17/03/2014, 02:20

Scientific report: "Coreference Resolution Using Semantic Relatedness Information from Automatically Discovered Patterns"

... senses and semantic relations are not available from the database (Vieira and Poesio, 2000). In recent years, increasing interest has been seen in mining semantic relations from large text corpora. ... to obtain the semantic relatedness information. The evaluation on ACE data set shows that the pattern based semantic information is helpful for coreference resolution. 1 Introduction: Semantic relatedness ... the same entity should have a certain semantic relation. To obtain this semantic information, previous work on reference resolution usually leverages a semantic lexicon like WordNet (Vieira...

Uploaded: 17/03/2014, 04:20