change columns of text

Tapping into the Power of Text Mining

... area of text mining tackles problems of text representation, classification, clustering, information extraction or the search for and modelling of hidden patterns. In this context the selection of ... and semantics of text, most text mining approaches are based on the idea that a text document can be represented by a set of words, i.e., a text document is described based on the set of words contained ... generalization even in the presence of a large number of features and makes SVM especially suitable for the classification of texts [Joa98]. In the case of textual data the choice of the kernel function...

Uploaded: 31/08/2012, 16:46

37 1,3K 3
Treatment of Textile Wastewater by a Coupling of Activated Sludge Process with Membrane Separation

... minutes. Fig. Effect of backflush pressure on the flux, MLSS of 8700 mg/L, TMP of 0.4 bar, CFV of 0.88 m/s, backflushing pressure of 1.5 bar and backflushing interval of s. Fig. Effect of backflush interval ... MINUTES. Fig. Effect of TMP on the membrane flux, MLSS of 9300 mg/L, CFV (v) of 0.79 m/s, without backflush. Fig. Effect of CFV (v) on the membrane flux, MLSS of 9300 mg/L, TMP of 0.4 bar, without ... on the flux, MLSS of 8700 mg/L, TMP of 0.4 bar, CFV of 0.88 m/s, backflushing pressure of 1.5 bar and backflushing interval of s. Main experiments: In the main experiments, the textile wastewater...

Uploaded: 05/09/2013, 09:08

8 434 0
Comparative decolorizing efficiency of textile dye by mesophilic and thermophilic anaerobic treatments

... production, TOC reduction, together with the reduction of dye were investigated. The obtained results of decolorization of 100 mg L-1 of RB4 and 200 mg L-1 of MO used in the experiment were compared with ... and presence of dye was associated with inhibition of organic matter conversion pathways caused by the accumulation of volatile fatty acids in the treatment system. In the case of temperature ... were inhibited by the presence of MO, which resulted in slow reduction of TOC, while the presence of RB4 inhibited methane productivity. TOC reduction of treatment of RB4 was similar to the control...

Uploaded: 05/09/2013, 09:38

10 405 0
Document: MIT Joint Program on the Science and Policy of Global Change: Effects of Air Pollution Control on Climate pdf

... qualitatively, some of the important potential impacts of controls of air pollutants on temperature, we have carried out runs of the IGSM in which individual pollutant emissions, or combinations of these ... climate change and net ecosystem production of the terrestrial biosphere. Global Biogeochemical Cycles, 12: 345-360. REPORT SERIES of the MIT Joint Program on the Science and Policy of Global Change ... concentrations of CH4. However, increasing NOx emissions should increase tropospheric O3 (and hence the primary source of OH), as well as increase the recycling rate of HO2 to OH (the second source of OH)...

Uploaded: 17/02/2014, 22:20

19 612 1
Document: Scientific report: "Automatic learning of textual entailments with cross-pair similarities" ppt

... the examples of the previous section. From the point of view of bag-of-word methods, the pairs (T1, H1) and (T1, H2) have both the same intra-pair similarity since the sentences of T1 and H1 ... marking of tree nodes with placeholders; and, (3) the pruning of irrelevant information in large syntactic trees. 5.3 Pruning irrelevant information in large text trees. Often only a portion of ... perceptron. In Proceedings of ACL02. Courtney Corley and Rada Mihalcea. 2005. Measuring the semantic similarity of texts. In Proc. of the ACL Workshop on Empirical Modeling of Semantic Equivalence and...

Uploaded: 20/02/2014, 12:20

8 413 0
Document: Scientific report: "REPRESENTATION OF TEXTS FOR INFORMATION RETRIEVAL" pdf

... volume constraints typical of DR systems. The modifications are designed to recognize such aspects of discourse structure as establishment of topic; setting of context; summarizing; concept foregrounding; ... the relative effectiveness of the various modifications in improving the original representations - Weak Associations Figure Repeat first and last sentences of the text. These sentences may ... Repeat first sentence of paragraph after the last sentence. To integrate these sentences more fully into the overall structure. Make the title the first and last sentence of the text, or overweight...

Uploaded: 21/02/2014, 20:20

2 419 0
Document: Scientific report: "REQUIREMENTS OF TEXT PROCESSING LEXICONS" ppt

... 213-220. Mel'čuk, I.A., "A new kind of dictionary and its role as a core component of automatic text processing systems," T.A. Informations, 1978, ... TR-511, Department of Computer Science, University of Maryland, College Park, Maryland, January 1977. Rieger, C. and S. Small, Word Expert Parsing, TR-734, Department of Computer ... rm described must ultimately constitute the elements out of which semantic representations of multisentence texts must be created,...

Uploaded: 21/02/2014, 20:20

2 335 0
Document: Scientific report: "Predicting the fluency of text with shallow structural features: case studies of machine translation and human-written text" doc

... of predicting which of the two is better is 90%. Results are not as high but still promising for comparisons in fluency of translations of the same text. The prediction becomes better when the texts ... stretches of text. But even in human written text, the presence of more verbs can make a difference in fluency (Bailin and Grafstein, 2001). Consider the following two sentences: Table 1: Distribution of ... indicative of overall text quality (Pitler and Nenkova, 2008). We leave direct comparison for future work. Table 7: Correlations between text quality assessment of the articles and the percentage of fluent...

Uploaded: 22/02/2014, 02:20

9 438 0
Scientific report: "A Computational Model of Text Reuse in Ancient Literary Texts" potx

... target text, and the Gospel of Mark as the source text. We use a Greek New Testament corpus prepared by the Center for Computer Analysis of Texts at the University of Pennsylvania, based on the text ... scores for some of the derived sentences Text Ltrain Ltest Researcher (Bovon, 2002) (Jeremias, 1966) (Bovon, 2003) (Jeremias, 1966) Model B J Table 2: Two models of text reuse of Mark in Ltrain ... Gospel of Luke and the Gospel of Matthew have as their common sources two documents: the Gospel of Mark, and a lost text customarily denoted Q. In particular, we will consider the Gospel of Luke...

Uploaded: 08/03/2014, 02:21

8 536 0
Scientific report: "Searching for Topics in a Large Collection of Texts" doc

... graph of text collection; an initial cut locally optimal cut Output: Input: a set of vectors; a corresponding set of values to be approximated; and a set of indexes of the ... sake of generalization, in order to not overfit the input sample. The input of the GRA consists of (i) a sample set of document vectors with the corresponding values of, (ii) a maximum number of ... The extensity of cut is defined as a positive function where is a threshold size of cut is called weight of is called weight of the connection between cuts and ; edge in graph # of edges between...

Uploaded: 08/03/2014, 04:22

6 447 0
Scientific report: "Thematic segmentation of texts: two methods for two kinds of texts" pdf

... tested on different kinds of texts. We will discuss these results and give criteria to choose the more suitable method according to text characteristics. Pre-processing of the texts. As we are interested ... the thematic dimension of the texts, they have to be represented by their significant features from that point of view. So, we only hold for each text the lemmatized form of its nouns, verbs and ... to the number of A occurrences and w_k to the number of B occurrences. In case of descriptor addition, the descriptor weight is set to the number of occurrences of the linked descriptor...

Uploaded: 08/03/2014, 05:21

5 364 0
Scientific report: "Combining a Statistical Language Model with Logistic Regression to Predict the Lexical and Syntactic Difficulty of Texts for FFL" potx

... number of classes is small. Exploratory analysis of the corpus highlighted the importance of having a similar number of texts per class. This requirement made it impossible to use all the texts ... assumption of ordinality. Using this cumulative model, when ≤ j ≤ J, the estimated probability of a text Y belonging to 1+ Implementation of the models. Having covered the theoretical aspects of our ... source. Likewise, not all the material from FFL textbooks is appropriate. We established the following criteria for selecting textbooks and texts: Of course, if there is little work on French readability,...

Uploaded: 08/03/2014, 21:20

9 514 0
Scientific report: "A Corpus of Textual Revisions in Second Language Writing" pdf

... for text visualization and search, and release it to the research community. It is expected to support studies on textual revision of language learners, and the effects of different types of feedback ... beginning and end of the drafts severely affected the precision, since they are often not quoted in brackets and are therefore indistinguishable from the text proper. In comment-to-text alignment, ... the final versions are Microsoft Word documents. Our first task, therefore, is to convert them into a machine-actionable, XML format conforming to the standards of the Text Encoding Initiative (TEI)...

Uploaded: 16/03/2014, 20:20

5 421 0
Scientific report: "QARLA: A Framework for the Evaluation of Text Summarization Systems" pdf

... framework. QUEEN: Estimation of the quality of an automatic summary. We are now looking for a function QM,x(a) that estimates the quality of an automatic summary a ∈ A, given a set of models M and a similarity ... rightmost part of the figure, peers are distributed around the set of models, closely surrounding them, receiving a high JACK value. A Case of Study. In order to test the behaviour of our evaluation ... summary, the number of fragments of the reference summary which are also in the contrastive summary, in relation to the size of the contrastive summary. DocSim: The number of documents used to...

Uploaded: 17/03/2014, 05:20

10 518 0
Scientific report: "BULK PROCESSING OF TEXT ON A MASSIVELY PARALLEL COMPUTER" docx

... the size of the text approaches many tens of thousands of words, the number of unique words increased into the thousands. Therefore, it can be concluded that the second implementation of the broadcasting ... original location of its text word. Figure illustrates the execution of the entire sort-scan algorithm on a sample sentence. Figure: Illustration of Sort-Scan Algorithm ... creation of this field, and the scanning of the maximum function across it. Note that the size of the field being scanned is the size of the definition (8 bits for the timings below) plus the size of...

Uploaded: 24/03/2014, 02:20

8 306 0
Scientific report: "Automatic Detection of Text Genre" doc

... computational cost of training on a large set of cues. Variation measures capture the amount of variation of a certain count cue in a text (e.g. the standard deviation in sentence length). This type of useful ... using 499 of the 802 texts in the Brown Corpus. (While the Corpus contains 500 samples, many of the samples contain several texts.) For our experiments, we analyzed the texts in terms of three ... of facets, we can categorize this genre as INSTITUTIONAL (because of the use of we as in editorials and annual reports) and as NON-SUASIVE or non-argumentative (because of the low incidence of...

Uploaded: 31/03/2014, 21:20

7 277 0
Geometric and Mechanical Modelling of Textiles pdf

... CYarnSectionConstant CTextileWeave2D CYarnSectionInterp CTextileWeave3D CTextileWeave * CTextile CTexGen CHAPTER 2: GEOMETRIC MODELLING OF TEXTILES CYarn ... Section 2.7. CTextile is an assembly of yarns which is further specialised by CTextileWeave, CTextileWeave2D and CTextileWeave3D. The specialised classes allow for automated creation of textile geometries ... closer fit to experimental results. 2.2.2 Textile geometrical modelling software. In this section a brief review of the software packages used to model the geometry of textile fabrics is described. TexGen...

Uploaded: 31/03/2014, 23:20

271 3,5K 1
Scientific report: "VARIOUS REPRESENTATIONS OF TEXT PROPOSED FOR EUROTRA" docx

... introduction of the conventions they have adopted (names of properties and values, of types of occurrences, of strings). The information of a linguistic nature is exclusively meant for the unification of ... the decomposition of the text) in a system with integrated coding, it suffices to introduce special codes (or to use existing codes, like end-of-text, formats) to mark the text and to generate ... a means of specifying all the characteristics necessary for the recognition of formats on a wide range of formattors and text processing systems. But we may assume that, independently of the formattor...

Uploaded: 01/04/2014, 00:20

6 181 0
DICTIONARY OF TEXTILE

... repeat of shuttles ráp po hình hoa, repeat of design ráp po kiểu dệt, weave repeat, gait-over, repeat of interlacing, repeat of pattern, repeat of weave, pattern repeat, weaving repeat, unit of design, ... làm lược, reed-making machine máy làm mềm, softening machine, softener máy làm mềm đay, jute softener máy làm mềm vải, cloth-mellowing machine, cloth softener máy làm tơi, opening machine máy liên ... broken threads nối đầu vải (sự), joining of fabrics end-to-end nối đứt sợi-dọc, mend of warp break nối đứt sợi-ngang, mend of weft break nối sợi-dọc, joining of warp threads nồi (máy sợi-con), yarn...

Uploaded: 08/04/2014, 01:06

63 278 2