semantic class learning from the web with hyponym pattern linkage graphs

Scientific report: "Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs" pdf

... applies hyponym patterns to the web and acquires contexts around them. The KnowItAll system (Etzioni et al., 2005) also uses hyponym patterns to extract class instances from the web and then evaluates ... doubly-anchored hyponym pattern to query the web and extract semantic class instances: CLASS NAME such as CLASS MEMBER and *. We hypothesized that a doubly-anchored pattern, which includes both the class ... weakly supervised semantic class learning from the web, using a single powerful hyponym pattern combined with graph structures, which capture two properties associated with pattern-based extractions:...

Uploaded: 17/03/2014, 02:20
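
The doubly-anchored pattern quoted in the excerpt above ("CLASS NAME such as CLASS MEMBER and *") can be illustrated with a minimal Python sketch. The function names, the sample snippets, and the rule that a candidate is a run of capitalized words are assumptions made for illustration only; the excerpt does not say how the paper queries the web or delimits the wildcard match.

```python
import re


def doubly_anchored_query(class_name, seed_member):
    """Build the quoted query 'CLASS NAME such as CLASS MEMBER and *'."""
    return f'"{class_name} such as {seed_member} and *"'


def extract_candidates(class_name, seed_member, snippets):
    """Collect the text that fills the '*' slot in each matching snippet.

    Assumption: a candidate is one or more consecutive capitalized words.
    """
    pattern = re.compile(
        rf"{re.escape(class_name)} such as {re.escape(seed_member)} and "
        r"([A-Z][\w-]*(?: [A-Z][\w-]*)*)"
    )
    found = []
    for snippet in snippets:
        for match in pattern.finditer(snippet):
            found.append(match.group(1))
    return found


# Hypothetical snippets standing in for search-engine results.
snippets = [
    "He visited countries such as France and Germany last year.",
    "Flights to countries such as France and New Zealand are expensive.",
    "This sentence does not contain the pattern.",
]
candidates = extract_candidates("countries", "France", snippets)
```

Anchoring the pattern on both the class name and a known member narrows matches to contexts where the wildcard is likely to be a sibling of the seed, which is the hypothesis the excerpt states.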

Scientific report: "Learning to Extract Relations from the Web using Minimal Supervision" ppt

... the acquisition relationship coincide with the two arguments. They do not contribute any bias, since they are replaced with the generic tags e1 and e2 in all sentences from the bag. There are ... computed as the product of the weights of all the tokens in the sequence. The aim of this new weighting scheme, as detailed in the next section, is to eliminate the bias caused by the special structure ... (in FrameNet, these are the lexical units associated with the target frame). 5.1 A Solution for Type I Bias In order to account for how strongly the words in a sequence are correlated with either of the...

Uploaded: 23/03/2014, 18:20
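
The excerpt above says a sequence's weight is "computed as the product of the weights of all the tokens in the sequence". A minimal Python sketch of that product follows. The function name, the example weights, and the default weight of 1.0 for unlisted tokens are assumptions; the excerpt does not show how the individual token weights are derived.

```python
from math import prod


def sequence_weight(tokens, token_weights, default=1.0):
    """Multiply the weights of all tokens in the sequence.

    Assumption: tokens without an explicit weight count as `default`.
    """
    return prod(token_weights.get(token, default) for token in tokens)


# Hypothetical weights: the generic argument tags e1 and e2 are neutral.
weights = {"e1": 1.0, "e2": 1.0, "acquired": 0.5, "by": 0.5}
weight = sequence_weight(["e1", "acquired", "by", "e2"], weights)
```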

Learning from the project

... more comfortable dealing with people than with figures. I’m planning to discuss this with my mentor. pairing up and also with the whole team working with the learning whole ... working closely with each other so that the one who is learning can try out the new way of working with the help and support of the more experienced person. If one team is teaching another these roles ... shall plan the consultation with others in the team and shall take a lead in the meetings or workshops we decide to hold. All of these objectives will be completed during the period of the project....

Uploaded: 24/10/2013, 08:20

Document: Scientific report: "Extraction and Approximation of Numerical Attributes from the Web" pdf

... 1.695m]’). We then extract new patterns from the retrieved search engine snippets and re-query the Web with the new patterns to obtain more attribute values. We provided the framework with unit ... each kind. These patterns are the only attribute-specific resource in our framework. Value extraction. The first pattern group, P_values, allows extraction of the attribute values from the Web. All ... values for each pattern. We extend this group iteratively from the given seed as commonly done in pattern-based acquisition methods. To do this we re-query the Web with the obtained (object,...

Uploaded: 20/02/2014, 04:20

Document: Scientific report: "Automatic Collection of Related Terms from the Web" pptx

... query is a term, its hit is the number of pages that contain the term on the Web. We use the following notation. “H(x) = the number of pages that contain the term x” The number H(x) can be used ... in the compiled corpus. R: the target term did not exist on the collected web pages. Only 43 terms (20%) out of 210 terms were collected by the system. This low recall primarily comes from the ... explanation of the term. 4. There are several technical terms that are related to the term. We have implemented the checking program of the first two conditions in the system: the third condition can...

Uploaded: 20/02/2014, 16:20

Scientific report: "A DOM Tree Alignment Model for Mining Parallel Data from the Web" doc

... that, using the new web mining scheme, the web mining throughput is increased by 32%; (ii) The quality of the mined data is improved. By leveraging the web pages’ HTML structures, the sentence ... English-Chinese parallel data from the web. The mining procedure is initiated by acquiring Chinese website list. We have downloaded about 300,000 URLs of Chinese websites from the web directories at ... (1) Given a web site, the root page and web pages directly linked from the root page are downloaded. Then for each of the downloaded web page, all of its anchor texts (i.e. the hyperlinked...

Uploaded: 08/03/2014, 02:21

Scientific report: "Automatic Acquisition of Ranked Qualia Structures from the Web" potx

... appropriate queries to the web search engine and choosing the article leading to the highest number of results. The corresponding patterns are then matched in the 50 snippets returned by the search engine ... to automatically learning qualia structures from the Web. Such an approach is especially interesting either for lexicog- ... matched. On the basis of these, we then calculate the probability ... relies on the counts of each qualia element as produced by the lexico-syntactic patterns (P-measure). We describe these measures in the following. 4.1 Web-based Jaccard Measure (Web-Jac) Our web-based...

Uploaded: 08/03/2014, 02:21
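
The excerpt above names a "Web-based Jaccard Measure (Web-Jac)" but is cut off before defining it. As a rough illustration only, the sketch below applies the standard Jaccard coefficient to search-engine hit counts. The function name, the parameters, and the choice of hit counts as inputs are assumptions and may differ from the paper's actual definition.

```python
def web_jaccard(hits_a, hits_b, hits_both):
    """Standard Jaccard coefficient over page counts.

    hits_a, hits_b: pages containing each term on its own (hypothetical inputs).
    hits_both: pages containing both terms together.
    """
    union = hits_a + hits_b - hits_both
    if union <= 0:
        return 0.0  # No pages at all: treat the terms as unrelated.
    return hits_both / union


# Hypothetical hit counts for a term and a candidate qualia element.
score = web_jaccard(hits_a=100, hits_b=50, hits_both=25)
```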

Scientific report: "Mining Parenthetical Translations from the Web by Word Alignment" potx

... suffixes with top φ² In our modified version of the competitive linking algorithm, the link score of a pair of words is the sum of the φ² scores of the words themselves, their prefixes and their ... pairs, where the translation of the in-parenthesis terms is a suffix of the pre-parenthesis text. The lengths and frequency counts of the suffixes have been used to determine what is the translation ... alignment harder than necessary. We therefore trimmed the pre-parenthesis text with a length-based constraint. The cut-off point is the first (counting from right to left) potential boundary...

Uploaded: 17/03/2014, 02:20

Scientific report: "Extracting Hypernym Pairs from the Web" potx

... the two web experiments and a combination of the best web approach with the morphological approach. The conjunctive web pattern N en N rates best, because of its high frequency. The recall ... interested in employing the web for the extraction of hypernym relations. We are especially curious about whether the size of the web allows to achieve meaningful results with basic extraction ... pattern-based methods for collecting hypernym relations from the web. We compare our approach with hypernym extraction from morphological clues and from large text corpora. We show that the...

Uploaded: 17/03/2014, 04:20

Scientific report: "Compiling French-Japanese Terminologies from the Web" pptx

... to the output set. Then, we augment it with the alignments from FJJ whose terms are not already in FJ. The resulting set is denoted FJJ'. We then augment FJJ' with the pairs from ... translation. They use a compositional method to generate a set of translation candidates from which they select the most likely translation by using empirical evidence from the web. The method ... around the seed. 2.2 Automatic Term Recognition The next step is to extract candidate related terms from the corpus. Because the sentences composing the corpus are related to the seed, the...

Uploaded: 17/03/2014, 22:20
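
The augmentation step quoted above (adding the alignments from FJJ "whose terms are not already in FJ") can be sketched with Python sets. The excerpt does not say whether one or both terms of a pair must be new, so this sketch assumes an alignment is added only when neither its French nor its Japanese term occurs in FJ. The function name and the sample term pairs are hypothetical.

```python
def augment_alignments(fj, fjj):
    """Return FJ plus the FJJ pairs whose terms do not already occur in FJ.

    Assumption: a pair is skipped if either of its terms is already in FJ.
    """
    known_french = {french for french, _ in fj}
    known_japanese = {japanese for _, japanese in fj}
    result = set(fj)
    for french, japanese in fjj:
        if french not in known_french and japanese not in known_japanese:
            result.add((french, japanese))
    return result


# Hypothetical alignment sets.
fj = {("diabète", "糖尿病")}
fjj = {("diabète", "糖尿"), ("insuline", "インスリン")}
fjj_prime = augment_alignments(fj, fjj)
```

Giving priority to FJ in this way means a lower-confidence source can only fill gaps and never overrides an alignment already accepted.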

Scientific report: "Extracting Sequences from the Web" pptx

... query. Pattern / Example: the ORD / the fifth; the RB ORD / the very first; the JJS / the best; the RB JJS / the very best; the ORD JJS / the third biggest; the RBS JJ / the most popular; the ORD RBS JJ / the second ... for classifying s and one for classifying (x, k). We then rank extractions by taking the product of the two classifiers’ confidence scores. We now describe the features used in the two classifiers ... tags and the pattern-based features “x {is,are,was,were} the kth s” and “the kth s {is,are,was,were} x”. The features are then combined using a Naive Bayes classifier. In addition to the local,...

Uploaded: 23/03/2014, 16:20
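
The ranking rule in the excerpt above (ordering extractions by the product of the two classifiers' confidence scores) can be sketched as follows. The tuple layout, the function name, and the sample scores are assumptions for illustration; the classifiers themselves are not modeled here.

```python
def rank_extractions(candidates):
    """Sort candidates by the product of their two confidence scores.

    Each candidate is (extraction, conf_s, conf_xk), where conf_s is the
    confidence for the sequence s and conf_xk the confidence for (x, k).
    """
    return sorted(candidates, key=lambda c: c[1] * c[2], reverse=True)


# Hypothetical extractions with made-up confidence scores.
candidates = [
    ("a", 0.9, 0.5),  # product 0.45
    ("b", 0.8, 0.8),  # product 0.64
    ("c", 0.2, 0.9),  # product 0.18
]
ranked = [name for name, _, _ in rank_extractions(candidates)]
```

Using the product means an extraction ranks highly only when both classifiers are confident, so a low score from either one pulls it down.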

Scientific report: "Unsupervised Relation Extraction by Mining Wikipedia Texts Using Information from the Web" pdf

... leveraging the vast size of the Web. Our hypothesis is that there exist some key terms and patterns that provide clues to the relations between pairs. From the snippets retrieved by the search ... for these challenges is to combine frequency information from the Web and the “high quality” characteristic of Wikipedia text. 4 Pattern Combination Method for Relation Extraction With the ... dependency patterns and surface patterns. The algorithm is based on k-means clustering for relation clustering. The dependency pattern has the properties of being more accurate, but the Web context...

Uploaded: 23/03/2014, 16:21

Scientific report: "Using Corpus Statistics on Entities to Improve Semi-supervised Relation Extraction from the Web" pot

... the Classifier. The Pattern Learner uses the seeds to learn likely patterns of relation occurrences. Then, the Instance Extractor uses the patterns to extract the candidate instances from the ... sentence. These phrases are matched to the slots of the patterns. In other respects, the pattern matching and extraction process is straightforward. 3.3 Classifier The goal of the final classification ... the patterns generated by the Pattern Learner to the text corpus. In order to be able to match the slots of the patterns, the Instance Extractor utilizes an external shallow parser from the...

Uploaded: 23/03/2014, 18:20