... corpora that are phonetically similar to Japanese Katakana words as loanwords. We also corresponded the extracted loanwords to Japanese words, and produced a Japanese–Mongolian bilingual dictionary. ... 2006. ©2006 Association for Computational Linguistics Extracting loanwords from Mongolian corpora and producing a Japanese-Mongolian bilingual dictionary Badam-Osor Khaltar Graduate School of Library, ... It is feasible that a large number of loanwords in Korean can also be loanwords in Japanese. Additionally, Katakana words can be extracted from Japanese corpora with a high accuracy. Thus,...
... description and evaluation, including comparative analysis, are available in Varga and Yokoyama (2009). 2 Methodological background 2.1 Pivot based dictionary generation Pivot language based bilingual ... costly, alternative methods are imperative. Pivot language based bilingual dictionary generation is one plausible such alternative (Tanaka and Umemura, 1994; Sjöbergh, 2005; Shirai and Yamamoto, ... dyn36150@dip.yz.yamagata-u.ac.jp Yokoyama Shoichi Yamagata University, Graduate School of Science and Engineering yokoyama@yz.yamagata-u.ac.jp Abstract In this paper we introduce a bilingual dictionary...
... translations for creating a tri-lingual collocation dictionary, with samples of actual use in language. Using past translations as reference for the translator's further work was an ... unable to create a complete analysis of a sentence, the Fips parser returns chunks of partial analyses. If 132 Creating a Multilingual Collocation Dictionary from Large Text Corpora Luka Nerima, ... V-Prep-N. Another argument in favour of a full syntactical analysis is that it solves the problem of all cases of extraposed elements, such as passives, topicalisation, and dislocation. To illustrate...
... word-aligned data is crucial for example-based machine translation, statistical machine translation, but also other applications such as cross-lingual information retrieval. Since it is a hard and ... time-consuming task to hand-align bilingual data, the automation of this task receives a fair amount of attention. In this paper, we present an approach to improve the bilingual dictionary that is ... then are then counts for all word pairs and , where is in the source language vocabulary and is in the target language vocabulary. Often, the scores also take into account the marginal probabilities...
... is length-based and integrates a shallow content analysis. It begins by individuating a paragraph in the target text which is a first candidate as target paragraph, and which we call "pivot". ... syntactical relation). When parallel corpora are available, also the translation equivalents of the collocation context are displayed, thus allowing the user to see how a given collocation was translated ... translations for creating a tri-lingual collocation dictionary, with samples of actual use in language. Using past translations as reference for the translator's further work was an...
... like aufhören ('stop'), illustrated in (1). (1) a. Anna glaubt, dass Bernard aufhört. ('Anna believes that Bernard stops') b. Claudia hört jetzt auf. ('Claudia stops ... the last line mark the absence of a line break in the actual code. Feature specifications separated by tabs refer to sets of formatives in paradigmatic variation. Each line thus generates one ... for instance, Rosetta, fails to account for the lexeme character of separable verbs. As a consequence, spurious ambiguities and redundancies are created. Ambiguities arise between a simple...
... manuals. 3. Hand-coded lists are expensive to make, and invariably incomplete. 4. A subcategorization dictionary obtained automatically from corpora can be updated quickly and easily as ... format. Next to each verb, listing just a subcategorization frame means that it appears in both the OALD and my subcategorization dictionary, a subcategorization frame preceded by a minus ... learning a foreign language. A subcategorization frame is a statement of what types of syntactic arguments a verb (or adjective) takes, such as objects, infinitives, that-clauses, participial...
... find words with a similar distribution to the target word. Unlike Buitelaar et al. approach (Buitelaar and Sacaleanu, 2001), they evaluated their method using publicly available resources, namely SemCor (Miller ... domains. Domain adaptation is also an approach for focussing on domain-specific senses and used in the WSD task (Chan and Ng, 2007; Zhong et al., 2008; Agirre and Lacalle, 2009). Chan et al. ... noun senses automatically using a thesaurus acquired from raw textual corpora and the WordNet similarity package (McCarthy et al., 2004; McCarthy et al., 2007). They used parsed data to find words...
... than 1,000 languages. The algorithm yields PANDICTIONARY, a novel multilingual dictionary. PANDICTIONARY contains more than four times as many translations than in the largest Wiktionary at ... PANDICTIONARY more than quadruples the size of the English Wiktionary, the largest available multilingual resource today. We plan to make PANDICTIONARY available to the research community, and ... lexical data from machine-readable dictionary resources for machine translation. In 3rd Intl Conference on Theoretical and Methodological Issues in Machine Translation of Natural Language. P....
... records with information extracted from literature and curator-evaluated computational analysis. TrEMBL consists of computationally analyzed records that await full manual annotation. The UniProt ... There are three databases in PIR: the Protein Sequence Database (PSD), iProClass, and PIR-NREF. PSD database includes functionally annotated protein sequences. The iProClass database is a central ... “Disease or Syndrome”, “Virus”) to which all META concepts have been assigned. Other molecular biology databases - We also included several model organism databases or nomenclature databases...
... herring, and oysters), lean meats, poultry, cheese, and eggs are also good sources. The only known plant sources are yeast, alfalfa, and two Japanese seaweeds—wakame and kombu. —American Medical ... digestive, and urinary tracts and maintains healthy skin and hair. Beta carotene (also known as pro vitamin A) is converted to vitamin A by the body. Unlike retinol, beta carotene is an antioxidant a ... of magnification, an axis on a graph, and a female chromosome. It is a multiplication operator, a letter of the alphabet, and an arbitrary point in time. X is a kiss at the end of a love letter....
... thousands of feet. adaptable adj. flexible, pliable, pliant, compliant, accommodative, tractable, malleable, ductile, versatile; alterable, changeable: Men, in general, are not as adaptable as ... 1.9 alarm 1.10 amalgam 1.11 anachronism 1.12 apart 1.13 arbitrary 1.14 ashamed 1.15 atmosphere 1.16 audacious 1.17 available 1.18 awake 1.19 B 2.0 babble 2.1 beach 2.2 bias 2.3 blab ... abominable. aboriginal n. native, indigene, autochthon; Colloq Australian Abo, Offensive Australian aborigine, Slang Australian contemptuous boong: Many aboriginals are not assimilated to...