Efficient Inference of CRFs for Large-Scale Natural Language Data

Scientific report: "Efficient Inference of CRFs for Large-Scale Natural Language Data" docx

Uploaded: 23/03/2014, 17:20
... Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pages 281–284, Suntec, Singapore, 4 August 2009. © 2009 ACL and AFNLP Efficient Inference of CRFs for Large-Scale Natural Language Data Minwoo ... setting of parameters. We also present a simple but robust variant algorithm in which CRFs efficiently learn and predict large-scale natural language data. 2 Linear-chain CRFs Many versions of CRFs ... s for ours) and 7∼12 times faster for decoding (2.881 ms for MALLET, 5.028 ms for CRF++, and 0.418 ms for ours). This result demonstrates that learning and decoding CRFs for large-scale natural...
Scientific report: "A System for Large-Scale Acquisition of Verbal, Nominal and Adjectival Subcategorization Frames from Corpora" pot

Uploaded: 17/03/2014, 04:20
... produce large scale lexical resources which include frequency and usage information tuned to genres and sublanguages. Such resources are critical for natural language processing (NLP), both for ... enhancing the performance of state-of-art statistical systems and for improving the portability of these systems between domains. One type of lexical information with particular importance for NLP is ... Meeting of the Association of Computational Linguistics, pages 912–919, Prague, Czech Republic, June 2007. © 2007 Association for Computational Linguistics A System for Large-Scale Acquisition of...
Document: A Comparison of Approaches to Large-Scale Data Analysis pdf

Uploaded: 19/02/2014, 12:20
... number of records processed for each cluster size is therefore 5.6 million times the number of nodes. The performance of each system not only illustrates how each system scales as the amount of data ... “representative of a large subset of the real programs written by users of MapReduce” [8]. For this task, each system must scan through a data set of 100-byte records looking for a three-character ... Vertica) allows for optional compression of stored data. It is not uncommon for compression to result in a factor of 6–10 space savings. Vertica’s internal data representation is highly optimized for data compression...
Document: Scientific report: "Joint Feature Selection in Distributed Stochastic Learning for Large-Scale Discriminative Training in SMT" pdf

Uploaded: 19/02/2014, 19:20
... results for algorithms 1 and 4 on the Europarl data (ep) for different devtest and test sets. Europarl data were used in all runs for training and for setting the meta-parameter of number of epochs. ... learning for SMT not only to large feature sets but also to large sets of parallel training data. Since inference for SMT (unlike many other learning problems) is very expensive, especially on large training ... the results of the experimental comparison of the 4 algorithms of Section 4. The 7 Absolute improvements would be possible, e.g., by using larger language models or by adding news data to the...
Document: Experiences of Plantation and Large-Scale Farming in 20th Century Africa pdf

Uploaded: 21/02/2014, 04:20
... levels of capitalization of maize production – although of course these existed too. On average, maize production attracted large-scale farmers who were capital-poor, and whose use both of capital ... first half of the 1960s, in the form of adoption of (publicly bred) hybrid maize varieties and increased application of synthetic fertilizers. The key event here was the release of locally ... of ‘under cultivation’) and – to an even greater extent – of employment. 2 In terms of coverage, data or estimates based on secondary sources are available for PF and LSF crop areas for...
Document: DEVELOPMENT OF STANDARDS OF AOX FOR SMALL SCALE PULP AND PAPER MILLS pdf

Uploaded: 22/02/2014, 09:20
... At present none of the small scale paper mill in India is using chlorine dioxide because of the involvement of high cost of installation of chlorine dioxide plant and high cost of chlorine dioxide ... conditions for bleaching of pulp collected from respective paper mills. The level of AOX varied from 5.0 to 9.0 kg/t of pulp while in case of Mill C, the level of AOX was about 18.0 kg/t of pulp. ... studies conducted on status of technology and level of AOX, the following recommendations are made: (i) Majority of the mills are using low dosages of caustic for cooking of mixed fibrous raw materials...
Scientific report: "Leveraging Reusability: Cost-effective Lexical Acquisition for Large-scale Ontology Translation" potx

Uploaded: 08/03/2014, 02:21
... automatic techniques to large-scale parallel corpora where data sparsity poses a problem for low-frequency terms. Data sparsity is also an issue for more general state-of-the-art bilingual alignment ... access to large archives of spoken language (Gustman, et al., 2002). Our process leverages a small set of manually-acquired English-Czech translations to translate a large ontology of keyword ... hours of video testimonies in 32 languages. Starting from an initial out-of-vocabulary (OOV) rate of 85%, we show that a small set of prioritized translations can be elicited from human informants,...
Mining Console Logs for Large-Scale System Problem Detection docx

Uploaded: 30/03/2014, 16:20
... dimension for each vector, its runtime is linear in the number of vectors, so detection can scale to large logs. PCA. PCA is a coordinate transformation method that maps a given set of data points ... insufficient for effective problem determination [14]. We propose a general approach for mining console logs for detecting runtime problems in large-scale systems. Instead of asking for user input ... count. Dimensions of the vector consist of (the union of) all useful message types across all groups, and the value of a dimension in the vector is the number of appearances of the corresponding...
Scientific report: "Distributed Word Clustering for Large Scale Class-Based Language Modeling in Machine Translation" docx

Uploaded: 31/03/2014, 00:20
... algorithm for obtaining word classifications for predictive class-based language models with which we were able to use billions of tokens of training data to obtain classifications for millions of words ... Jeffrey Dean. 2007. Large language models in machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and on Computational Natural Language Learning ... fraction of the words for exchange will increase the number of iterations required to converge. In experiments we empirically determined that choosing a subset of roughly a third of the size of the...
Scalable, Decentralized Object Location and Routing for Large-Scale Peer-to-Peer Systems

Uploaded: 28/04/2014, 13:40
... performance in terms of the expected number of routing hops and the number of messages exchanged as part of a node join operation. This section focuses on another aspect of Pastry’s routing performance, ... more of the issues and requirements of such applications and systems [1, 2, 5, 8, 10, 15]. One of the key problems in large-scale peer-to-peer applications is to provide efficient algorithms for ... design of a large-scale event notification infrastructure. Submitted for publication. June 2001. http://www.research.microsoft.com/ antr/SCRIBE/. 23. M. A. Sheldon, A. Duda, R. Weiss, and D. K. Gifford....
Chemistry report: "Adaptive antenna selection and Tx/Rx beamforming for large-scale MIMO systems in 60 GHz channels" pptx

Uploaded: 21/06/2014, 01:20
... beamforming for high-speed data transmission. We assume that the number of RF chains is smaller than the number of antennas, which motivates the use of antenna selection to exploit the beamforming ... single-run performance of the antenna selection algorithm in [10] is also shown. In Figure 5, the average performance of 100 runs for the above schemes is plotted in a larger span of iterations. ... performance requirement in order to guarantee that the actual performance of the selected subset meets the requirement with minimum number of selected antenna. Performance of adaptive beamforming Figure...