word- and sentence-aligned parallel corpora

Document: Scientific report: "ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation" pdf

... Valletta, Malta. D. Tufiş, R. Ion, and N. Ide. 2004. Fine-Grained Word Sense Disambiguation Based on Parallel Corpora, Word Alignment, Word Clustering and Aligned Wordnets. In Proceedings of the ... to the focus word itself being the word form of the focus word, the lemma, Part-of-Speech and chunk information • local context features related to a window of three words preceding and following ... incorporates the automatically generated word alignments as labels. We applied an automatic post-processing step on these word alignments in order to remove leading and trailing determiners and prepositions....

Uploaded: 20/02/2014, 05:20

6 538 0
Document: Scientific report: "A Pattern Matching Method for Finding Noun and Proper Noun Translations from Noisy Parallel Corpora" doc

... reflects a certain cultural context and cannot be simply replaced by a word to word translation. • collocations: Some word pairs such as projects and ~(houses) are not direct translations. ... unique words in the Chinese text, N is the occurrence count of one English word and M the occurrence count of one Chinese word. We previously used some frequency difference constraints and ... influenced the results. Apart from single word to single word translation such as Governor/~ and prosperity/~i~fl¢~, we also found many single word translations which show potential towards...

Uploaded: 20/02/2014, 22:20

8 427 0
Document: Scientific report: "ALIGNING SENTENCES IN PARALLEL CORPORA" doc

... that our technique would work on other French and English corpora and even on other pairs of languages. The work of Gale and Church [Gale and Church, 1991], who use a very similar method ... from one or the other of the corpora. If a person is given two parallel texts and asked to match up the sentences in them, it is natural for him to look at the words in the sentences. Elaborating ... referred to as Hansards. And love and kisses to you, too. mugwumps who sit on the fence with their mugs on one side and their wumps on the other side and do not know which side to...

Uploaded: 20/02/2014, 21:20

8 387 0
Document: Canada Science and Technology Museum Corporation docx

... scientific and technical objects, with special but not exclusive reference to Canada, and by demonstrating the products and processes of science and technology and their economic, social and cultural ... owned by the Corporation are recorded at cost and amortized over their estimated useful life. Land and buildings owned by the Government of Canada and under the control of the Corporation ... medicine, meteorology, surveying and mapping, and information technology; and Transportation: motorized and non-motorized wheel, track and trackless vehicles; motorized and non-motorized marine transportation,...

Uploaded: 21/02/2014, 11:20

27 387 0
Document: Scientific report: "WORD AND OBJECT IN DISEASE DESCRIPTIONS" doc

... between common English words and medical terms. We measured word frequency by "disease occurrence", (the number of disease definitions in which a given word occurs one or more ... common English word like 'of', would be used in the descriptions of all kinds of disease, and would accordingly have a high 'entropy'. Tables 2 and 3 show the top and bottom ... could co-occur in any location in the definition and in either order), and the co-occurrences expected from chance alone. Tables 4 and 5 show the top and bottom of a list of all pairs formed from...

Uploaded: 21/02/2014, 20:20

4 527 0
Document: Scientific report: "Analysing Wikipedia and Gold-Standard Corpora for NER Training" ppt

... training, and suggest that their results with a 340k-word Spanish corpus are comparable to 20k-40k words of gold-standard training data when using MUC-style evaluation metrics. 2.1 Gold-standard corpora We ... (CVN-68) has wordtype AA Aaa (AA-00). Wordtype with functions: We also map content words to wordtypes only; function words are retained, e.g. Bank of New England Corp. maps to Aaa of Aaa Aaa Aaa ... Association for Computational Linguistics Analysing Wikipedia and Gold-Standard Corpora for NER Training Joel Nothman and Tara Murphy and James R. Curran School of Information Technologies University...

Uploaded: 22/02/2014, 02:20

9 478 0
Perspectives of Chief Ethics and Compliance Officers on the Detection and Prevention of Corporate Misdeeds ppt


... mechanisms of corporate governance, compliance, and ethics, and their collective role in preventing and mitigating excesses and scandals in the corporate sector. Earlier rounds of corporate scandal ... compliance and ethics activities. Notwithstanding these and other changes in law and regulation affecting corporate directors, in the years since SOX there has been a series of further scandals and ... integrity and corporate ethics starts with a senior-level chief ethics and compliance officer (CECO) who understands the compliance and ethics field, is empowered and experienced, and who has...

Uploaded: 06/03/2014, 22:20

61 422 0
Scientific report: "Extracting Parallel Sub-Sentential Fragments from Non-Parallel Corpora" pdf

... comparable corpora has focused on extracting word translations (Fung and Yee, 1998; Rapp, 1999; Diab and Finch, 2000; Koehn and Knight, 2000; Gaussier et al., 2004; Shao and Ng, 2004; Shinyama and Sekine, ... 19(1):61–74. Pascale Fung and Percy Cheung. 2004a. Mining very non-parallel corpora: Parallel sentence and lexicon extraction via bootstrapping and EM. In EMNLP 2004, pages 57–63. Pascale Fung and Percy Cheung. ... automatic creation of parallel corpora. Several researchers (Zhao and Vogel, 2002; Vogel, 2003; Resnik and Smith, 2003; Fung and Cheung, 2004a; Wu and Fung, 2005; Munteanu and Marcu, 2005) have...

Uploaded: 08/03/2014, 02:21

8 263 0
Scientific report: "Paraphrasing with Bilingual Parallel Corpora" pot

... employed in word sense disambiguation work that uses parallel corpora (Diab and Resnik, 2002). The assumption made in the word sense disambiguation work is that if a source language word aligns ... Callison-Burch, David Talbot, and Miles Osborne. 2004. Statistical machine translation with word- and sentence-aligned parallel corpora. In Proceedings of ACL. Mona Diab and Philip Resnik. 2002. An ... unsupervised method for word sense tagging using parallel corpora. In Proceedings of ACL. Ali Ibrahim, Boris Katz, and Jimmy Lin. 2003. Extracting structural paraphrases from aligned monolingual corpora. ...

Uploaded: 08/03/2014, 04:22

8 308 0
Scientific report: "AUTOMATIC ALIGNMENT IN PARALLEL CORPORA" potx

... unknown words, we can use the fact that most unknown words are nouns or proper nouns and merge this category with nouns. We can also merge acs that are represented with only a few distinct words ... with content words) reduces the number of parameters ... AUTOMATIC ALIGNMENT IN PARALLEL CORPORA Harris Papageorgiou, Lambros Cranias, Stelios Piperidis, Institute for Language and Speech ... semantic load of a sentence as the patterns of tags of its content words. Content words are taken to be verbs, nouns, adjectives and adverbs. The complexity of transfer in translation imposes...

Uploaded: 08/03/2014, 07:20

3 193 0
Scientific report: "Detecting Highly Confident Word Translations from Comparable Corpora without Any Prior Knowledge" doc

... list. In other words, if the first translation candidate for the source word isola is the target word island, and, vice versa, the first translation candidate for the target word island is isola, ... pair (isola, island). 2. Remove the words isola and island from their respective vocabularies. 3. Since island is not in the vocabulary, the indirect association between arcipelago and island is not ... In other words, if the most probable translation candidate for a source word w_1^S is a target word w_2^T and, vice versa, the most probable translation candidate of the target word w_2^T ... Proceedings...

Uploaded: 08/03/2014, 21:20

11 290 0
Analyse the importance and impacts of corporate social responsibility (CSR) in business towards the social and environmental issues in Vietnam

... such as Japan, America and Europe, companies need to strengthen research capacity to apply international standards on the environment, such as ISO 14000 and environmental standards of the market ... have managers’ awareness and commitment to carry out and improve the initiative of CSR. If the managers are at least partly the proponents of the corporate culture, and integrating CSR requires ... and 20% of respondents have no opinion on this issue and no managers strongly disagree with the statement. Furthermore, tables 79% of respondents understand all about what is CSR and...

Uploaded: 13/03/2014, 14:20

64 783 4
Research on awareness and implementation of corporate social responsibility in a multinational company in Vietnam: case study Nestle Vietnam

... multinational corporations but their acceptance and performance of CSR receive little attention or concern, and this is the cause of much damage and many disasters for the environment and society. In ... dilute the tide, and when the tide goes down, the water and waste mix in color and flow into the Dong Nai River. People living nearby said that the sewage canals have a black color and a very unpleasant ... the environmental, the cultural and the financial) and sustainability of behavior which contributes to a future for the people and the planet” (Pearce 2011); and as “voluntary disclosures of...

Uploaded: 13/03/2014, 14:20

73 706 2
The importances and impacts of corporate social responsibility activities of Hanoi Construction & Investment Company (HACINCO)

... figures target whilst managing and redirecting their behaviors on how to fit the ethical standards and improving the quality of life of employees and their families, and the local community. 2.2. ... rearranged, and summarized in the “Review reports” and “Annual Performance” sections. Demacarty, P., 2009. Financial returns of corporate social responsibility, and the moral freedom and responsibility ... the directors and staff, customers and CSR benefit receivers in Hanoi city. 3.6.3. Sample size 100 management and staff questionnaire forms, 100 customer questionnaire forms and 300 CSR...

Uploaded: 13/03/2014, 14:20

47 588 1
Scientific report: "Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora" potx

... (positive or negative);   and   are respectively the numbers of labeled instances in   and   ;     and     are parallel instances in   and   , respectively (i.e. ... on the right-hand side is the likelihood of labeled data for both   and   ; and the second term is the likelihood of the unlabeled parallel data . If we assume that parallel sentences ... for the Labeled Data Unlabeled Parallel Text and its Preprocessing. For the unlabeled parallel text, we use the ISI Chinese-English parallel corpus (Munteanu and Marcu, 2005), which was extracted...

Uploaded: 17/03/2014, 00:20

11 302 0
