bulk processing of text

Scientific report: "REQUIREMENTS OF TEXT PROCESSING LEXICONS" ppt

... resource of uncertain value. Indeed, many who have analyzed the contents of a dictionary have concluded that it is of little value to linguistics or artificial intelligence. Because of the ... representation of an entry in the lexicon to be used in natural language processing systems. I describe herein what I have learned from this type of effort. I began with the objective of identifying ... suggest the possibility of using a parser to analyze the definitions of a word and thereby to create a net which will be capable of discriminating among all definitions of a word. The following...

Uploaded: 21/02/2014, 20:20

Tapping into the Power of Text Mining

... and semantics of text, most text mining approaches are based on the idea that a text document can be represented by a set of words, i.e. a text document is described based on the set of words contained ... task of the vector space representation of documents is to find an appropriate encoding of the feature vector. Each element of the vector usually represents a word (or a group of words) of the document ... that for many text mining tasks linguistic preprocessing is of limited value compared to the simple bag-of-words approach with basic preprocessing. The reason is that the co-occurrence of terms in...
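The bag-of-words idea in the excerpt above can be illustrated with a minimal sketch. The helper names `tokenize` and `bag_of_words`, the regex tokenizer, and the raw-count encoding are assumptions made for illustration; the excerpt does not say which encoding or preprocessing the authors use.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens (an assumed, very basic preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Return (vocabulary, vectors) with one raw term-count vector per document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is the sorted union of all terms seen in any document.
    vocabulary = sorted(set().union(*counts))
    # Each vector element is the count of one vocabulary term in one document.
    vectors = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, vectors


docs = ["Text mining finds patterns in text.", "Mining data is useful."]
vocabulary, vectors = bag_of_words(docs)
```

Word order is discarded here: each document is reduced to term counts over a shared vocabulary, which is the property the excerpt attributes to the bag-of-words approach.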

Uploaded: 31/08/2012, 16:46

Scientific report: "Automatic learning of textual entailments with cross-pair similarities" ppt

... head of constituents. The example of Fig. 1 shows that the placeholder 0 climbs up to the node governing all the NPs. 5.3 Pruning irrelevant information in large text trees Often only a portion of ... direct children of the nodes in N  . We apply such procedure only to the syntactic trees of texts before the computation of the kernel function. 6 Experimental investigation The aim of the experiments ... examples of the previous section. From the point of view of bag-of-word methods, the pairs (T1, H1) and (T1, H2) have both the same intra-pair similarity since the sentences of T1 and...

Uploaded: 20/02/2014, 12:20

Scientific report: "Robust Temporal Processing of News" pptx

... third (7 of 21, 5 of which were of type TIME) of all missed time expressions came from numeric expressions being spelled out, e.g. “nineteen seventy-nine”. More than two thirds (11 of 16) of the time ... presence of a day of the week expression (“Monday” thru “Sunday”) in the same sentence FW: “today” is the first word of the sentence POS1: part-of-speech of the word before “today” POS2: part-of-speech ... Processing of VERBMOBIL. Proceedings of the Fifth Conference on Applied Natural Language Processing, 1997, 33-40. J. F. Allen. Maintaining Knowledge About Temporal Intervals. Communications of...

Uploaded: 20/02/2014, 18:20

Scientific report: "REPRESENTATION OF TEXTS FOR INFORMATION RETRIEVAL" pdf

... volume constraints typical of DR systems. The modifications are designed to recognize such aspects of discourse structure as establishment of topic; setting of context; summarizing; concept ... first and last sentences of the text. These sentences may include the more important concepts, and thus should be more heavily weighted. 2. Repeat first sentence of paragraph after the last ... and last sentence of the text, or overweight the score for each co-occurrence containing a title word. Concepts in the title are likely to be the most important in the text, yet are unlikely...

Uploaded: 21/02/2014, 20:20

Document: Thermal Processing of Foods: Control and Automation ppt

... common implementation of this type of control and it can be applied to many types of heat exchangers. The cascade is not required, but does offer the advantages of faster response and provides a view of the ... 3.2. Critical factors in retort processing 3.2.1. Temperature measurement The effect of temperature on the destruction of microorganisms can be described ... maintain. Installation of the instruments on a digital network can require just one set of wires to connect the devices in series instead of one set of wires per instrument. Maintenance is enhanced because of the inherent...

Uploaded: 21/02/2014, 22:20

Scientific report: "Predicting the fluency of text with shallow structural features: case studies of machine translation and human-written text" doc

... fluency of text with shallow structural features: case studies of machine translation and human-written text Jieun Chae University of Pennsylvania chaeji@seas.upenn.edu Ani Nenkova University of Pennsylvania nenkova@seas.upenn.edu Abstract Sentence ... Correlations between text quality assessment of the articles and the percentage of fluent sentences according to different models. text, and levels of fluency in the automatically produced text. The distinctions ... stretches of text. But even in human written text, the presence of more verbs can make a difference in fluency (Bailin and Grafstein, 2001). Consider the following two sentences: • In his state of...

Uploaded: 22/02/2014, 02:20

Scientific report: Autocatalytic processing of procathepsin B is triggered by proenzyme activity doc

... the formation of a noncleavable procathepsin B mutant. Next, we evaluated the activity of the mature forms resulting from the processing of procathepsin B mutants. All these forms of cathepsin ... mixture (100 µL) contained 500 ng of a plasmid template, 50 pmol of each of the three oligonucleotides (the two outer and a mutagenic one), 20 nmol of each of the four deoxynucleoside triphosphates, ... respectively. When specified, processing was accelerated by the addition of dextran sulfate (25 µg·mL⁻¹) or decelerated by the addition of E-64 in the processing buffer. The final concentration of procathepsin...

Uploaded: 07/03/2014, 03:20

Scientific report: A role for serglycin proteoglycan in granular retention and processing of mast cell secretory granule components ppt

... the amount of actual sulfated PGs is not directly correlated with the level of SG mRNA, and it was therefore of interest to also follow the levels of sulfated PGs during the course of MC differentiation. ... processing. These findings indicate that the processing of pro-CPA occurs in (at least) two steps, and that the processing of the intermediate form of CPA to mature protease is dependent on a ... dramatic effects of the SG knockout on granular staining properties and storage of proteases, it was first important to determine whether the lack of SG affected the actual assembly of granules and...

Uploaded: 07/03/2014, 11:20

Scientific report: "A Computational Model of Text Reuse in Ancient Literary Texts" potx

... text, and the Gospel of Mark as the source text. We use a Greek New Testament corpus prepared by the Center for Computer Analysis of Texts at the University of Pennsylvania 3 , based on the text ... to a text-reuse hypothesis y ∈ Y ∪ {}. X is the set of verses in the target text. In our case, x_train = (x_1, ..., x_458) is the sequence of verses in L_train, and x_test is that of L_test. ... derived. If w_i, w_j are the vectors of words of a 4 Note that the training set consists of only one x_train — the Gospel of Luke. Luke’s only other book, the Acts of the Apostles, contains few identifiable...

Uploaded: 08/03/2014, 02:21

Scientific report: "Searching for Topics in a Large Collection of Texts" doc

... survey of some open questions. Finally, a short summary is given in section 5. 2 Concept-formative clusters 2.1 Graph of a text collection Let be a collection of text documents; is the size of ... parameter. This feature of the GRA has been designed for the sake of generalization, in order to not overfit the input sample. The input of the GRA consists of (i) a sample set of document vectors ... limited number of non-zero elements in the resulting vector. Formally, gets on input a set of vectors ; a corresponding set of values to be approximated; and a set of indexes of the elements...

Uploaded: 08/03/2014, 04:22

Scientific report: "Diagnostic Processing of Japanese for Computer-Assisted Second Language Learning" doc

... diagnostic processing of Japanese being able to detect errors and inappropriateness of sentences composed by the students in the given situation and the context of the exercise texts. Using ... constraints, 3. to communicate semantic contexts and situations to the students through assisting reading the texts by way of bidirectionally linking the text words with an electronic dictionary, 4. ... anaphora. Acknowledgment The authors are grateful to Prof. Jun-ichi Tsujii, University of Tokyo, for discussing and providing information on LTAG as well as the status quo of natural language processing. The work reported...

Uploaded: 08/03/2014, 05:20

Scientific report: "Thematic segmentation of texts: two methods for two kinds of texts" pdf

... different kinds of texts. We will discuss these results and give criteria to choose the more suitable method according to text characteristics. 3. Pre-processing of the texts As we are interested ... thematic dimension of the texts, they have to be represented by their significant features from that point of view. So, we only hold for each text the lemmatized form of its nouns, verbs ... = (g_i1, g_i2, ..., g_it) where gi is the number of occurrences of a given descriptor in Gi. The descriptors are the words extracted by the pre-processing of the current text. Term vectors are weighted....

Uploaded: 08/03/2014, 05:21

Scientific report: "Discourse Processing of Dialogues with Multiple Threads" pot

... Implications of this model of Attentional State are explored more fully in (Rosé 1995). 3 Discourse Processing We evaluated the effectiveness of our theory of discourse structure in the context of ... set of dialogues, performance in terms of attaching the current chain of inference to the correct place in the plan tree for the purpose of augmenting temporal expressions from context ... theory of discourse structure in the spirit of (Grosz and Sidner 1986) which has played an influential role in the analysis of discourse entity saliency and in the development of dialogue processing...

Uploaded: 08/03/2014, 07:20

Scientific report: "PREDICTIVE COMBINATORS A METHOD PROCESSING OF COMBINATORY CATEGORIAL GRAMMARS" doc

... be of a more complex sort, namely, one of the so-called functor categories. Functor categories are of the form X|Y, which is viewed as a function from categories of type Y to categories of ... even though they may or may not be weakly context-free, nevertheless can make use of many of the parsing techniques that have been developed for context-free grammars since the applicability ... EFFICIENT PROCESSING OF COMBINATORY CATEGORIAL GRAMMARS Kent Wittenburg MCC, Human Interface Program 3500 West Balcones Center Drive Austin, TX 78759 Department of Linguistics University of Texas...

Uploaded: 08/03/2014, 18:20

