Document: Scientific report: "Enhanced word decomposition by calibrating the decision threshold of probabilistic models and using a model ensemble" pdf

... 2006). They used a natural language tagger which was trained on the output of ParaMor and Morfessor. The goal was to mimic each algorithm since ParaMor is rule-based and there is no access ... approaches is based on the learning aspect during the construction of the morphological model. If the data for training the model has the same structure as the desir...

Uploaded: 20/02/2014, 04:20

9 558 0

Document: Scientific report: Enhanced thermostability of methyl parathion hydrolase from Ochrobactrum sp. M231 by rational engineering of a glycine to proline mutation pdf

... MPH a Forward: 5′-TAGAATTCGCTGCTCCACAA GTTAGAACT-3′ Reverse: 5′-TA GCGGCCGCTTACTTTGGGTTA ACGACGGA-3′ Mutant MPH b G194P 5′-CCTGACGATTCTAAACCGTTCTTCAAGGGTGCC-3′ G198P 5′-AAAGGTTTCTTCAAG CCGGCCATGGCTTCCCTT-3′ G194P ... least three replicates. The K_m and k_cat values were calculated by nonlinear regression using GraphPad Prism 5.0 (GraphPad Software Inc., La Jolla, CA, USA). Thermos...

Uploaded: 15/02/2014, 01:20

8 740 0

Document: Scientific report: "Improving Word Representations via Global Context and Multiple Word Prototypes" pdf

Uploaded: 19/02/2014, 19:20

10 494 0

Document: Scientific report: "Unsupervized Word Segmentation: the case for Mandarin Chinese" doc

... segmentation of Mandarin Chinese data. The main drawbacks of ESA are the need to iterate the process on the corpus around 10 times to reach good performance levels and the need to set a parameter ... most of the types are bi- and trigrams (as unigrams are often high frequency grammatical words and trigrams the result of more or less productive affixations). Th...

Uploaded: 19/02/2014, 19:20

5 467 1

Document: Scientific report: "Learning Word-Class Lattices for Definition and Hypernym Extraction" doc

... lexico-syntactic “hard” patterns in that they allow a partial matching by calculating a generative degree of match probability between the test instance and the set of training instances. Thanks ... ω^j_b, 0 otherwise, where ω^k_a and ω^j_b are the a-th and b-th word classes of s′_k and s′_j, respectively. In other words, the matching score equals 1 if t...

Uploaded: 20/02/2014, 04:20

10 567 0

Document: Scientific report: "Learning Word Vectors for Sentiment Analysis" ppt

... where the amount of labeled data is small relative to the amount of unlabeled data available. For all word vector models, we use 50-dimensional vectors. As a qualitative assessment of word representations, ... documents are associated with a label on a star rating scale. We linearly map such star values to the interval s ∈ [0, 1] and treat them as a probabi...

Uploaded: 20/02/2014, 04:20

9 591 0

Document: Scientific report: "Joint Word Segmentation and POS Tagging using a Single Perceptron" docx

... a list of characters Variables: candidate sentence item – a list of (word, tag) pairs; maximum word-length record maxlen for each tag; the agenda list agendas; the tag dictionary tagdict; start index ... character, and combines each word with the partial candidates ending with its previous character. All input characters are processed in the same way, and the final...

Uploaded: 20/02/2014, 09:20

9 576 0

Document: Scientific report: "Discriminative Word Alignment with Conditional Random Fields" ppt

... both the Dice value and the Model 1 translation probability as real-valued features for each candidate pair, as well as a normalised score over all possible candidate alignments for each target ... is a strong alignment candidate. The sum of these scores is also used as a feature. Each source word and POS tag pair are used as indicator features which allow the mo...

Uploaded: 20/02/2014, 11:21

8 461 0

Document: Scientific report: "Direct Word Sense Matching for Lexical Substitution" ppt

... setting the results of the direct and indirect approaches are comparable. However, addressing directly the binary classification task has practical advantages and can yield high precision values, as ... target word in the Senseval dataset. A classification instance is thus defined by a pair of source and target words and a given occurrence of the target wor...

Uploaded: 20/02/2014, 12:20

8 362 0

Document: Scientific report: "SenseLearner: Word Sense Disambiguation for All Words in Unrestricted Text" doc

... SENSELEARNER framework. Note that the semantic models are applicable only to: (1) words that are covered by the word category defined in the models; and (2) words that appeared at least once in the ... developed by (Yuret, 2004), which combines two Naive Bayes statistical models, one based on surrounding collocations and another one based on a bag of words around t...

Uploaded: 20/02/2014, 15:20

4 400 0