... $\sum_{b_{1,2} \subset w} tf(w) \cdot \min\{idf(b_1), idf(b_2)\}$, in which $b_1$ and $b_2$ stand for the two bigrams and $w$ stands for any word containing both of them. The overall information quantity is obtained by ...
... Processing and Management, 35:443-462.
Jianyun Nie, Jianfeng Gao, Jian Zhang, Ming Zhou. 2000. On the Use of Words and N-grams for Chinese Information Retrieval, Proceedings of 5th International ...
Association for Computational Linguistics
A Comparison and Semi-Quantitative Analysis of Words and Character-Bigrams as Features in Chinese Text Categorization
Jingyang Li, Maosong Sun, Xian Zhang ...