... outperforms minimal rules, and performs at the same level as composed and vertically composed rules, but is smaller and faster. The number of parameters is shown for both the full model and the ... discount for a context of length $n$, and $(1 - \lambda_n)$ is set to the value that makes the smoothed probability distribution sum to one. We experiment with bigram and trigram rule Markov models. For each, ... For each, we try different values of $D_1$ and $D_2$, the discounts for bigrams and trigrams, respectively. Ney et al. (1994) suggest using the following value for the discount $D_n$: $D_n = \frac{n_1}{n_1 + \ldots}$
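
The normalization described above, where the weight on the lower-order model is chosen so that the smoothed distribution sums to one, can be illustrated with a minimal sketch. It makes several assumptions that are not in the text: it works on a toy bigram model over plain tokens rather than rule Markov models, it interpolates with an unsmoothed unigram distribution, and the discount is passed in as a free parameter between 0 and 1 rather than estimated from counts. The function name `absolute_discount_bigram` and the toy data are hypothetical.

```python
from collections import Counter


def absolute_discount_bigram(pairs, discount):
    """Build an absolute-discounting bigram model from (context, word) pairs.

    Assumes 0 < discount <= 1, so every observed count is at least the discount.
    Returns a probability function and the vocabulary of predicted words.
    """
    bigram = Counter(pairs)                      # c(context, word)
    context = Counter(c for c, _ in pairs)       # c(context)
    unigram = Counter(w for _, w in pairs)       # lower-order counts
    total = sum(unigram.values())
    # Number of distinct words observed after each context.
    types = Counter(c for (c, _) in bigram)

    def prob(word, ctx):
        p_lower = unigram[word] / total
        if context[ctx] == 0:
            # Unseen context: fall back entirely on the lower-order model.
            return p_lower
        discounted = max(bigram[(ctx, word)] - discount, 0.0) / context[ctx]
        # Mass removed by discounting; this plays the role of the weight
        # that makes the smoothed distribution sum to one.
        backoff_mass = discount * types[ctx] / context[ctx]
        return discounted + backoff_mass * p_lower

    return prob, set(unigram)


# Toy data (hypothetical): (context, word) pairs.
pairs = [("a", "b"), ("a", "b"), ("a", "c"), ("b", "a")]
prob, vocab = absolute_discount_bigram(pairs, discount=0.5)
total_mass = sum(prob(w, "a") for w in vocab)
```

In this sketch the weight on the lower-order model is not tuned separately: it equals exactly the probability mass removed by discounting, which is why `total_mass` comes out as one (up to floating-point error) for any seen context.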