... two journals and reduced the number of documents by 2,432. We made the following comparisons between the document, sentence, and term event spaces. (1) Raw term comparison: A set of well-correlated ... unstemmed terms and 2.6 million for stemmed terms. Thus, the journals shared 48% and 52% of their vocabulary for unstemmed and stemmed terms, respectively. Figure 7 shows the result of this comparison ... Meeting of the ACL, pages 601–608, Sydney, July 2006. © 2006 Association for Computational Linguistics. A Comparison of Document, Sentence, and Term Event Spaces. Catherine Blake, School of Information...