Scientific report: "truecasing" doc

... broadcast news documents and the translated Xinhua News Agency (XINHUA) documents in the ACE corpus do not contain any case information, while human transcribed broadcast news documents contain ... extraction of mentions of entities and relations between them from textual data. The textual documents are from newswire, broadcast news with text derived from automatic speech recognition (A...

Uploaded: 17/03/2014, 06:20

8 287 0

Document: Presenting a scientific report - Presentation doc

... Luck! Presenting a scientific report - Presentation - Part 3/3 - Graduation defense (12/05/2007) Presenting a scientific report - Presentation ... that we use Presenting a scientific report - Presentation - Part 1/3 (02/05/2007) Presenting a scientific report...

Uploaded: 19/01/2014, 21:20

31 600 1

Document: Scientific report: "Obfuscating Document Stylometry to Preserve Author Anonymity" pptx

... obfuscated document. This is important since there is no clear consensus as to which features should be used for authorship attribution. 2 Document Obfuscation Our approach to document obfuscation ... deletions to be equally weighted. However, while deletion sites in the document are easy to identify, ...

Document  Changes  Doc Size  Changes/1000
49        42       3849      10.9
50        46       2364      19.5
...

Uploaded: 20/02/2014, 12:20

8 377 0

Document: Scientific report: "Ensemble Document Clustering Using Weighted Hypergraph Generated by NMF" docx

... feedback in IR, where retrieved documents are clustered, is actively researched (Hearst and Pedersen, 1996)(Kummamuru et al., 2004). In document clustering, the document is represented as a ... hypergraph in the integration phase. Document clustering is the task of dividing a document’s data set into groups based on document similarity. This is the basic intelligent procedure, and is...

Uploaded: 20/02/2014, 12:20

4 393 0

Document: Scientific report: "Bootstrapping" doc

Uploaded: 20/02/2014, 21:20

8 389 0

Scientific report: "News" docx

Uploaded: 07/03/2014, 18:20

2 247 0

Scientific report: "Labeling Documents with Timestamps: Learning from their Time Expressions" pot

... task in the NLP community: automatic document dating. Given a document with unknown origins, what characteristics of its text indicate the year in which the document was written? This paper proposes ... dating documents has come from the IR and knowledge management communities interested in dating documents with unknown origins. de Jong et al. (2005) was among the first to automatically...

Uploaded: 07/03/2014, 18:20

9 367 0

Scientific report: "Probabilistic Document Modeling for Syntax Removal in Text Summarization" ppt

... method. There are D document sets, M documents in each set, N_M words in document M, and C syntax classes. ... approach would be to model the syntax and semantic words used in a document collection ... assuming that each document is generated solely from one topic distribution that is shared throughout each document set. This results in a smoothed language model for each document set’s content ...

Uploaded: 07/03/2014, 22:20

6 449 0

Scientific report: "TUTORIAL" doc

Uploaded: 08/03/2014, 18:20

1 342 0

Scientific report: "Learning Document-Level Semantic Properties from Free-text Annotations" pot

... that both the document text and the selection of keyphrases are governed by the underlying hidden properties of the document. Each property indexes a language model, thus allowing documents that ...

... keyphrase similarity values
h – document keyphrases
η – document keyphrase topics
λ – probability of selecting η instead of φ
c – selects between η and φ for word topics
φ – document topic model
z...

Uploaded: 23/03/2014, 17:20

9 190 0