... number of words (small,medium, large subsets), average phones per word, average words per phrase, and percent of word types that occur onlyonce (hapax). Phones /word is replaced by characters /word ... partial wordshelps the segmenter handle long, infrequent words.Long words are typically created by productive mor-phology and, thus, often start and end just like otherwords. Only 32% of words ... adjacent words are possible, thealgorithm alternates which to prefer. Each word isthen sudivided into a sequence of reliable words,when possible. Because words are typically shortand reliable words...