mining bilingual data from the web with adaptively learnt patterns

Scientific report: "A DOM Tree Alignment Model for Mining Parallel Data from the Web" doc

Uploaded: 08/03/2014, 02:21
... pattern-based mining scheme support this new mining scheme. Our mining experiment shows that, using the new web mining scheme, the web mining throughput is increased by 32%; (ii) The quality of the ... English-Chinese parallel data from the web. The mining procedure is initiated by acquiring Chinese website list. We have downloaded about 300,000 URLs of Chinese websites from the web directories ... verification. Based on these mining results, the quality of the mined data, the mining coverage and mining efficiency are measured. First, we benchmarked the precision of the mined parallel...
  • 8
  • 435
  • 0
Scientific report: "Mining Parenthetical Translations from the Web by Word Alignment" potx

Uploaded: 17/03/2014, 02:20
... suffixes with top φ 2 In our modified version of the competitive linking algorithm, the link score of a pair of words is the sum of the φ² scores of the words themselves, their prefixes and their ... BLEU score based on the test data in the 2006 NIST MT Evaluation Workshop. 6 Related Work Nagata et al. (2001) made the first proposal to mine translations from the web. Their work was concentrated ... pairs, where the translation of the in-parenthesis terms is a suffix of the pre-parenthesis text. The lengths and frequency counts of the suffixes have been used to determine what is the translation...
  • 9
  • 612
  • 0
Scientific report: "Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs" pdf

Uploaded: 17/03/2014, 02:20
... 2005) also uses hyponym patterns to extract class instances from the web and then evaluates them further by computing mutual information scores based on web queries. The work by (Widdows and ... progresses. Initially, the seed is the only trusted class member and the only vertex in the graph. The bootstrapping process begins by instantiating the doubly-anchored pattern with the seed class ... to instantiate the pattern. On the first iteration, the pattern is given to Google as a web query, and new class members are extracted from the retrieved text snippets. We wanted the system to...
  • 9
  • 340
  • 0
Document: Module 11: Accessing Data from the Outlook 2000 Client ppt

Uploaded: 21/12/2013, 06:15
... through the use of the other Office Web components. Function of the Data Source Control The Data Source control is the reporting engine behind data access pages, PivotTable List controls, and data-bound ... list from a relational data source, the PivotTable Service is used to create a multidimensional data cube from the relational data bound to the Data Source control. This data cube is then used ... manipulate data from the data source, and disconnect from the data source when you finish using the data. One of the major benefits of ADO is that it requires fewer calls to achieve the same...
  • 62
  • 398
  • 0
Document: Fertility, Family Planning, and Women’s Health: New Data From the 1995 National Survey of Family Growth pptx

Uploaded: 12/02/2014, 23:20
... 19 nonvoluntary intercourse. One set of questions was in the interviewer-administered portion of the survey and the second was in the self-administered portion (Audio CASI). In the interviewer-administered series, they were asked whether their first intercourse was “voluntary or not voluntary.” For about 8 percent of women 15–44 years of age who have had intercourse, their first intercourse was not voluntary (table 21). For those whose first intercourse occurred at age 15 or younger, that first intercourse was nonvoluntary for 16 percent compared with 7 percent or less for those whose first intercourse occurred at age 16 or older. The percent whose first intercourse was nonvoluntary is nearly 10 percent among women whose first intercourse was before 1975 compared with about 6 percent among women who first had intercourse in the 1990’s (table 21). In the self-administered (Audio CASI) portion of the interview, women were asked a related but different question: whether they had ever been forced by a man to have sexual intercourse against their will. About 20 percent of women reported that they had been forced by a man to have intercourse against their will at some time in their lives (table 22). Thus, table 21 shows that for 8 percent of women, their first intercourse was nonvoluntary; table 22 shows that 20 percent had had nonvoluntary intercourse at some time, not necessarily at first intercourse. Table 22 also shows that 6 percent of women reported that they were forced to have intercourse before they were 15 and another 6 percent before they were 18. A fairly high percent of formerly married (divorced or separated) women, about 35 percent, reported that they had been forced to have intercourse. This finding deserves further study.

First Sexual Partner
There has been much public discussion about the partners of sexually active teenagers. Table 23 profiles the age of male partners at women’s first voluntary intercourse. About two-thirds (66 percent) of women who had their first voluntary intercourse before they were 16 had first partners who were under 18 years of age; 21 percent had first partners 18–19 years of age; 7 percent had first partners 20–22 years of age, 2 percent had first partners 23–24 years of age, and 4 percent had first partners 25 years of age or older (table 23). Only 3 percent of women had their first intercourse with a man they just met. About 3 out of 5 women (61 percent) were “going steady” or “going together” with the man they had intercourse with the first time, and about 1 in 5 were engaged or married to him. About 12 percent of all women were married when they had their first intercourse. Among women 40–44 years of age (born in 1951–55), 23 percent were married to their partner at first intercourse while about 2 percent of women 15–19 years of age (born 1971–75) were married to their first partner. Women who lived with both of their parents throughout their childhood were more likely than other women to have been married to their partner at first intercourse (table 24).

First Intercourse Relative to First Marriage
Among ever-married women 15–44 years of age, 82 percent had first intercourse before they were married. About 69 percent of those first married in 1965–74 had their first intercourse before marriage compared with 89 percent of those first married in the 1990’s. Only 2 percent of those first married in 1965–74 had their first intercourse 5 years or more before marriage compared with 56 percent of those first married in the 1990’s (table 25).

Number of Sexual Partners
As mentioned previously, some questions on abortion, sexual partners, and forced sexual intercourse were asked in both the interviewer-administered and the self-administered (Audio CASI) portions of the interview. Responses to sensitive questions appear to have been affected by the computer self-administered mode of interviewing. Tables 26–31 show data on the number of sexual partners in the last 1 year, 5 years, and lifetime, using both the interviewer-administered and self-administered methods. Presenting data based on both modes of interviewing allows the examination of differences in reporting due to the mode of interviewing (table 26 versus 27, table 28 versus 29, and table 30 versus 31); and the selection of findings most appropriate for comparison to other surveys. About 3 percent of unmarried women told the interviewer that they had had four or more male sexual partners in the last 12 months (table 26), compared with 9 percent reporting four or more partners in Audio CASI (table 27). A similar disparity was found when comparing the interviewer results with Audio CASI results for the number of partners since January 1991 (a little less than 5 years, on average). Among unmarried women, 14 percent told the interviewer they had four or more male sexual partners since January 1991 (table 28) while 18 percent reported in Audio CASI that they had had four or more partners in that time (table 29). This topic deserves more detailed study, but it appears that using the more private interview technique gave a higher and presumably more complete estimate of the number of partners among unmarried women (8,11).

Marriage and Cohabitation
Tables 32–37 show 1995 data on formal marriage and unmarried cohabitation. About 38 percent of women 15–44 years of age had never been married when interviewed in 1995 (table 32). The percent never married was higher in every age group in 1995 than it was in 1982 (24). About half of women 25–39 years of age have had an unmarried cohabitation with a man at some time in their lives; 10 to 11 percent of women in their twenties are currently cohabiting with a man (table 33). About 30 percent of women 25–39 years of age lived with a man (cohabited) before their first marriage (table 34). Over one-half (57 percent) of Series 23, No. 19 [ Page 5 Table ...
Human Services. These organizations, along with leading researchers from outside the government, helped to design the survey. Further details on the planning and operation of the survey are given...
  • 125
  • 760
  • 0
Document: Fertility, Family Planning, and Reproductive Health of U.S. Women: Data From the 2002 National Survey of Family Growth doc

Uploaded: 13/02/2014, 10:20
... of the Data The data in this report come primarily from the most recent cycle of the NSFG conducted in 2002, and, as a result, they have several strengths: + Comparability over time The data ... particularly the female survey, has been to collect data on factors affecting pregnancy and reproductive health in the United States. The NSFG supplements and complements the data from the National ... disagreement about the intendedness (at time of conception) of recent births, with the father’s attitudes based on the mother’s reports of his attitude. A forthcoming report will describe fathers’ attitudes...
  • 174
  • 933
  • 0
Document: Scientific report: "Extraction and Approximation of Numerical Attributes from the Web" pdf

Uploaded: 20/02/2014, 04:20
... 1.695m]’). We then extract new patterns from the retrieved search engine snippets and re-query the Web with the new patterns to obtain more attribute values. We provided the framework with unit ... stage. If there are several values with the same frequency we select the median of these values. Approximating the attribute value. In the case when we do not have any values remaining after the bounds ... indeed most (≥ 50%) of the retrieved values fit the retrieved bounds. If the lower and/or upper bound contradicts more than half of the data, we reject the bound. Otherwise we remove all...
  • 10
  • 465
  • 0
Document: Scientific report: "Automatic Collection of Related Terms from the Web" pptx

Uploaded: 20/02/2014, 16:20
... query is a term, its hit is the number of pages that contain the term on the Web. We use the following notation. H(x) = “the number of pages that contain the term x” The number H(x) can be used ... in the compiled corpus. R: the target term did not exist on the collected web pages. Only 43 terms (20%) out of 210 terms were collected by the system. This low recall primarily comes from the ... explanation of the term. 4. There are several technical terms that are related to the term. We have implemented the checking program of the first two conditions in the system: the third condition can...
  • 4
  • 437
  • 0
Document: Scientific report: "Parsing, Projecting & Prototypes: Repurposing Linguistic Data on the Web" doc

Uploaded: 22/02/2014, 02:20
... increases from 41,581 to 189,244. We then ran the new language ID algorithm on the IGTs, and Table 1 shows the language distribution of the IGTs in ODIN according to the output of the algorithm. ... return results in the form of language profiles. Although language profiles are by no means complete (they are subject to the availability of data to fill in the answers within the profiles) they provide ... embraced the Web as a means for disseminating linguistic knowledge, the consequence is that a large quantity of analyzed language data can be found on the Web. In many cases, the data is richly...
  • 4
  • 432
  • 0
Scientific report: "Automatic Acquisition of Ranked Qualia Structures from the Web" potx

Uploaded: 08/03/2014, 02:21
... appropriate queries to the web search engine and choosing the article leading to the highest number of results. The corresponding patterns are then matched in the 50 snippets returned by the search engine ... (not calculated over the Web) as well as the conditional probability calculated over the Web (Web-P) delivered the best results, while the PMI-based ranking measure yielded the worst results. ... relies on the counts of each qualia element as produced by the lexico-syntactic patterns (P-measure). We describe these measures in the following. 4.1 Web-based Jaccard Measure (Web-Jac) Our web-based...
  • 8
  • 378
  • 0
Scientific report: "Extracting Hypernym Pairs from the Web" potx

Uploaded: 17/03/2014, 04:20
... relations from the web. We compare our approach with hypernym extraction from morphological clues and from large text corpora. We show that the abundance of available data on the web enables obtaining ... interested in employing the web for the extraction of hypernym relations. We are especially curious about whether the size of the web allows to achieve meaningful results with basic extraction ... the two web experiments and a combination of the best web approach with the morphological approach. The conjunctive web pattern N en N rates best, because of its high frequency. The recall...
  • 4
  • 395
  • 0
