Scientific report: "Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums" docx

... of ACL-08: HLT, pages 710–718, Columbus, Ohio, USA, June 2008. © 2008 Association for Computational Linguistics Using Conditional Random Fields to Extract Contexts and Answers of Questions from ... Conditional Random Fields (CRFs) to detect the contexts and answers of questions from forum threads. We improve the basic framework by Skip-chain CRFs...

Uploaded: 23/03/2014, 17:20

9 605 0

Scientific report: "Using Conditional Random Fields to Predict Pitch Accents in Conversational Speech" pptx

... (Section 7). 2 Conditional Random Fields CRFs can be considered as a generalization of logistic regression to label sequences. They define a conditional probability distribution of a label sequence ... sequences ȳ of the observation sequence x. We extract two types of features from a sequence pair: 1. Current label and information about the observation sequenc...

Uploaded: 08/03/2014, 04:22

7 541 0

Scientific report: "Using Conditional Random Fields For Sentence Boundary Detection In Speech" potx

... use of various automatic taggers that map the word sequence to other representations. Tagged versions of the word stream are provided to support various generalizations of the words and to ... In addition to this NIST SU error metric, we use the total number of interword boundaries as the denominator, and thus obtain results for the per-boundary-based metric. 3.2 Fe...

Uploaded: 31/03/2014, 03:20

8 393 0

Scientific report: "Scaling Conditional Random Fields Using Error-Correcting Codes" docx

... good performance. 15 The random code of 200 bits required 1,300Mb of RAM, taking a total of 293 hours to train and 3 hours to decode (54,397 tokens) on similar machines to those used before. We ... required 220Mb of RAM and took a total of 30 minutes to train each of the 200 binary CRFs, this time on Pentium 4 machines with 1Gb RAM. Decoding of the 47,377 test toke...

Uploaded: 31/03/2014, 03:20

8 260 0

Scientific report: "Training Conditional Random Fields with Multivariate Evaluation Measures" potx

... sentences and 211,727 tokens), and section 20 as test data (2,012 sentences and 47,377 tokens), with 11 different chunk-tags, such as NP and VP plus the ‘O’ tag, which represents the outside of ... English NER data was taken from the Reuters Corpus2 1 . The data consists of 203,621, 51,362 and 46,435 tokens from 14,987, 3,466 and 3,684 sentences in training, developm...

Uploaded: 17/03/2014, 04:20

8 304 0

Document: Scientific report: "Using Cross-Entity Inference to Improve Event Extraction" docx

... scope from a single document to a cluster of topic-related documents and employed a rule-based approach to propagate consistent trigger classification and event arguments across sentences and ... entity mention will be regarded as a candidate event mention, and a randomly selected entity mention from the candidate will be the starting of the whole extraction proces...

Uploaded: 20/02/2014, 04:20

10 531 0

Document: Scientific report: "Using Automatically Transcribed Dialogs to Learn User Models in a Spoken Dialog System" doc

... many thousands of employees. Users call the directory system and provide the name of a callee they wish to be connected to. The system then requests additional information from the user, ... system action S_{t+1} according to its dialog management policy. Concretely, the values of S_t, U_t, A_t and Ã_t are all assumed to belong to finite sets, and so all the c...

Uploaded: 20/02/2014, 09:20

4 471 0

Document: Scientific report: "Using Word Support Model to Improve Chinese Input System" ppt

... From Table 3a, the tonal and toneless STW improvements of the MSIME by using the WP identifier and the WSM are (18.9%, 10.1%) and (25.6%, 16.6%), respectively. From Table 3b, the tonal and ... results. The WP database and system dictionary of the WP identifier is same with that of the WSM. From Table 2, it shows the average tonal and toneless STW accuracies and...

Uploaded: 20/02/2014, 12:20

8 359 0

Scientific report: "Using Non-lexical Features to Identify Effective Indexing Terms for Biomedical Illustrations" docx

... classification compared to that of reviewer B. Since the reviewers of the training data each classified terms from different sets of randomly selected images, it is impossible to calculate their inter-annotator agreement. 4.2 ... replaced by the Art and Architecture Thesaurus 6 and an equivalent mapping tool to annotate images related to art and art history (Klavans et a...

Uploaded: 17/03/2014, 22:20

8 364 0

Scientific report: "Using Machine Learning Techniques to Build a Comma Checker for Basque" pdf

... were automatically calculated: • number of verb chunks to the beginning and to the end of the sentence • number of nominal chunks to the beginning and to the end of the sentence • number of ... sentence • number of subordinate-clause marks to the beginning and to the end of the sentence • distance (in tokens) to the beginning and to the end of th...

Uploaded: 31/03/2014, 01:20

8 385 0