... 2007. Large language models in machine translation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning ... Kneser-Ney smoothed n-gram models. IEEE Transactions on Audio, Speech and Language Processing, 15(5):1617–1624.
A. Stolcke. 1998. Entropy-based pruning of backoff language models. In Proc. DARPA ... selection and its applications in LM augmentation and adaptation using web data. The language models are part of a continuous speech recognition system that enables users to use speech as an input...
... to incorporate large-scale n-gram language models in conjunction with incremental syntactic language models.
The added decoding time cost of our syntactic language model is very high. By increasing ... translation has effectively used n-gram word sequence models as language models.
Modern phrase-based translation using large-scale n-gram language models generally performs well in terms of lexical ... use supertag n-gram LMs. Syntactic language models have also been explored with tree-based translation models. Charniak et al. (2003) use syntactic language models to rescore the output of a...
... language models trained from text or speech corpora of various genres and sizes. The largest available language models are based on written text: we investigate the effect of written-text language models ... differences among the different language models when extended features are present are relatively small. We assume that much of the information expressed in the language models overlaps with the lexical ... derived from the Switchboard language model, since the fluent sentence itself is part of the language model training data. We solve this by dividing the Switchboard training data into 20 folds....
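The fold-based scoring described above can be sketched as follows. This is my own illustration, not the paper's code: `train_lm` is a hypothetical stand-in for an LM toolkit's training call, and the returned model is assumed to expose a `logprob` method.

```python
def kfold_scores(sentences, k=20, train_lm=None):
    """Score each sentence with a language model trained on the other
    k-1 folds, so no sentence is scored by a model that saw it in training."""
    folds = [sentences[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        # Pool the remaining k-1 folds into the training set.
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        lm = train_lm(train)
        scores.extend((s, lm.logprob(s)) for s in held_out)
    return scores
```

The round-robin split (`sentences[i::k]`) is just one way to form folds; any disjoint partition works, as long as each sentence is scored only by a model trained without it.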
... of English Bigrams. Computer Speech & Language, 5(1):19–54.
Joshua Goodman. 2001. A Bit of Progress in Language Modeling. Computer Speech & Language, 15(4):403–434.
Bo-June (Paul) Hsu ... 2008. N-gram Weighting: Reducing Training Data Mismatch in Cross-Domain Language Model Estimation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 829–838.
Dietrich ... Language Modeling. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing.
Robert C. Moore and William Lewis. 2010. Intelligent selection of language model training...
... estimating N-gram language models. Kneser-Ney smoothing, however, requires nonstandard N-gram counts for the lower-order models used to smooth the highest-order model. For some applications, ... Kneser-Ney and those methods.
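The nonstandard lower-order counts mentioned above are continuation counts: the raw frequency of a word is replaced by the number of distinct left contexts in which it appears. A minimal sketch (my own illustration of the standard definition, not any one paper's code):

```python
from collections import defaultdict

def continuation_counts(bigrams):
    """Kneser-Ney lower-order counts: N1+(. w) is the number of distinct
    words observed immediately before w, rather than w's raw frequency.
    This keeps frequent-but-context-bound words (e.g. "Francisco") from
    receiving inflated unigram probability."""
    left_contexts = defaultdict(set)
    for u, w in bigrams:
        left_contexts[w].add(u)
    return {w: len(ctx) for w, ctx in left_contexts.items()}
```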
1 Introduction
Statistical language models are potentially useful for any language technology task that produces natural-language text as a final (or intermediate) output. ... using a sequence of lower-order to higher-order language models has been shown to be an efficient way of constraining high-dimensional search spaces for speech recognition (Murveit et al., 1993)...
... statistical language models.
In this paper, we also use support vector machines to combine features from traditional reading level measures, statistical language models, and other language processing ... use scores from language models as features in another classifier (e.g., an SVM). For example, perplexity (PP) is an information-theoretic measure often used to assess language models:
PP = 2^{H(t|c)},
... of syntax. Our approach uses n-gram language models as a low-cost automatic approximation of both syntactic and semantic analysis. Statistical language models (LMs) are used successfully...
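Concretely, with H the average negative log2 probability the model assigns per token, perplexity is 2 raised to that entropy. The computation is standard and can be sketched as:

```python
def perplexity(token_log2_probs):
    """PP = 2^H, where H is the mean negative log2 probability per token.
    Input: a list of per-token log2 probabilities under the model."""
    H = -sum(token_log2_probs) / len(token_log2_probs)
    return 2 ** H
```

For instance, a model assigning each token probability 1/4 (log2 prob -2) has perplexity 4: it is, on average, as uncertain as a uniform choice among 4 tokens.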
...
Some Pragmatic Issues in the Planning of Definite and Indefinite Noun Phrases
Douglas E. Appelt
Artificial Intelligence Center, SRI International
and
Center for the Study of Language ... an agent's goals, but allows some of the actions to consist of the utterance of sentences. This approach to language generation emphasizes the view of language as action, and hence assigns ... plan-based analysis of noun phrases is worked out, the taxonomy of actions presented here will still be of practical importance.
Until an analysis like Cohen and Levesque's is worked out, the concept...
... comparison of in-grammar recognition performance.
3 Language modelling
To generate the different trigram language models we used the SRI language modelling toolkit (Stolcke, 2002) with Good-Turing ... decades of statistical language modeling: Where do we go from here? In Proceedings of the IEEE, 88(8).
R. Rosenfeld. 2000. Incorporating Linguistic Structure into Statistical Language Models. In Philosophical Transactions ... statistical language models (DM-SLMs) by using GF to generate all utterances that are specific to certain dialogue moves from our interpretation grammar. In this way we can produce models that...
... grammars for modeling agglutination in this language, but first we will present the former class of languages and its acceptor automata.
3.1 Linear context-free languages and two-taped nondeterministic ... example the Guarani language presents nasal harmony, which expands from the root to both suffixes and prefixes (Krivoshein, 1994). This kind of characterization can have some value in language classification ... 2010.
© 2010 Association for Computational Linguistics
The use of formal language models in the typology of the morphology of Amerindian languages
Andrés Osvaldo Porta
Universidad de Buenos Aires
hugporta@yahoo.com.ar
Abstract
The...
... novel language model caching technique that improves the query speed of our language models (and SRILM) by up to 300%.
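The paper's specific caching technique is not reproduced here; as a generic illustration of why caching n-gram queries pays off during decoding (the same n-grams are looked up many times across hypotheses), a memo table in front of a hypothetical scoring function can be sketched as:

```python
from functools import lru_cache

def cached_lm(score_fn, maxsize=1_000_000):
    """Wrap an n-gram scoring function in an LRU memo table: repeated
    queries for the same n-gram hit the cache instead of the slower
    underlying data structure. `score_fn` is a hypothetical stand-in
    for a real LM's query call; n-grams must be hashable (e.g. tuples)."""
    @lru_cache(maxsize=maxsize)
    def cached(ngram):
        return score_fn(ngram)
    return cached
```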
1 Introduction
For modern statistical machine translation systems, language models ... with two different language models. Our first language model, WMT2010, was a 5-gram Kneser-Ney language model which stores probability/back-off pairs as values. We trained this language model on ... 2010. Storing the web in memory: space efficient language models with constant time retrieval. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Boulos Harb,...
... explore a dependency language model to improve translation quality. To some extent, these syntactically-informed language models are consistent with syntax-based translation models in capturing ... or even trillions of English words, huge language models are built in a distributed manner (Zhang et al., 2006; Brants et al., 2007). Such language models yield better translation results but at ... integrate backward n-grams and mutual information (MI) triggers into language models in SMT.
In conventional n-gram language models, we look at the preceding n − 1 words when calculating the probability...
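The preceding-context lookup in a conventional n-gram model backs off to shorter contexts when the full n-gram is unseen. A minimal sketch follows; it uses a fixed back-off penalty in the style of stupid backoff, purely for illustration (Kneser-Ney models instead store per-context back-off weights alongside probabilities):

```python
def backoff_prob(ngram, probs, alpha=0.4):
    """Look up P(w | preceding n-1 words) in a table keyed by n-gram
    tuples, recursively backing off to shorter contexts with a fixed
    penalty alpha. The unknown-word floor is illustrative."""
    if ngram in probs:
        return probs[ngram]
    if len(ngram) == 1:
        return 1e-7  # floor for words never seen in training
    return alpha * backoff_prob(ngram[1:], probs, alpha)
```

Each recursion drops the most distant context word, so a query for ("x", "b", "c") falls through to ("b", "c") and, if needed, to ("c",).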
... (lossless) language models and our randomized language model. Note that the standard practice of measuring perplexity is not meaningful here since (1) for efficient computation, the language model ... 2007. Compressing trigram language models with Golomb coding. In Proceedings of EMNLP-CoNLL 2007, Prague, Czech Republic, June.
P. Clarkson and R. Rosenfeld. 1997. Statistical language modeling using ... pruning of back-off language models. In Proc. DARPA Broadcast News Transcription and Understanding Workshop, pages 270–274.
D. Talbot and M. Osborne. 2007a. Randomised language modelling for...
... construction of language models found in new language processing applications and reported experimental results showing their practicality for constructing very large models. These algorithms ... experimental results demonstrating its efficiency.
Representation of language models by WFAs. Classical n-gram language models admit a natural representation by WFAs in which each state encodes ... is natural and convenient to construct class-based language models, that is, models based on classes of words (Brown et al., 1992). Such models are also often more robust since they may include...
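In a class-based model of the Brown et al. (1992) kind, the word bigram probability factors through word classes: P(w | w_prev) = P(w | class(w)) · P(class(w) | class(w_prev)). A minimal sketch of that factorization (the table names are my own, not from any cited implementation):

```python
def class_bigram_prob(w_prev, w, word2class, p_word_given_class, p_class_bigram):
    """Brown-style class bigram probability:
    P(w | w_prev) = P(w | class(w)) * P(class(w) | class(w_prev)).
    Sharing statistics at the class level is what makes these models
    more robust for word pairs unseen in training."""
    c_prev, c = word2class[w_prev], word2class[w]
    return p_word_given_class[(w, c)] * p_class_bigram[(c_prev, c)]
```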
... to 300 000 words per language, 2) SCIENCE, five scientific articles of about 50 000 words per language, 3) TECH, technical documentation of about 40 000 words per language and 4) VERNE, ...
perspective of some applications.
These problems can be avoided by taking advantage of the fact that a unit of a given granularity (e.g. sentence) can always be seen as a (possibly discontinuous) ... million words (ca. 1.1 million words per language). The part used for JOC was composed of one fifth of the French and English sections (ca. 200 000 words per language).
3.3 BAF
The BAF corpus...
... speech transcripts.
Compared to standard language models, hybrid LMs generalize better to the test data and partially compensate for the disproportion between in-domain and out-of-domain training data. At the ... Hoang. 2007. Factored translation models. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), ... Monz. 2011. Statistical Machine Translation with Local Language Models. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 869–879, Edinburgh, Scotland,...