Proceedings of ACL-08: HLT, pages 139–147,
Columbus, Ohio, USA, June 2008.
© 2008 Association for Computational Linguistics
A Re-examination of Query Expansion Using Lexical Resources
Hui Fang
Department of Computer Science and Engineering
The Ohio State University
Columbus, OH, 43210
hfang@cse.ohio-state.edu
Abstract
Query expansion is an effective technique to
improve the performance of information re-
trieval systems. Although hand-crafted lexi-
cal resources, such as WordNet, could provide
more reliable related terms, previous studies
showed that query expansion using only
WordNet leads to very limited performance
improvement. One of the main challenges is
how to assign appropriate weights to expanded
terms. In this paper, we re-examine this problem
using recently proposed axiomatic approaches
and find that, with an appropriate term
weighting strategy, we are able to exploit
the information from lexical resources to
significantly improve the retrieval performance.
Our empirical results on six TREC collections
show that query expansion using only
hand-crafted lexical resources leads to significant
performance improvement. The performance
can be further improved if the proposed
method is combined with query expansion
using co-occurrence-based resources.
1 Introduction
Most information retrieval models (Salton et al.,
1975; Fuhr, 1992; Ponte and Croft, 1998; Fang
and Zhai, 2005) compute relevance scores based on
matching of terms in queries and documents. Since
different terms can be used to describe the same concept,
a user's query terms often differ from the terms
used in relevant documents. Such vocabulary gaps
hurt retrieval performance. Query expansion
(Voorhees, 1994; Mandala et al., 1999a; Fang and
Zhai, 2006; Qiu and Frei, 1993; Bai et al., 2005;
Cao et al., 2005) is a commonly used strategy to
bridge the vocabulary gaps by expanding original
queries with related terms. Expanded terms are of-
ten selected from either co-occurrence-based the-
sauri (Qiu and Frei, 1993; Bai et al., 2005; Jing and
Croft, 1994; Peat and Willett, 1991; Smeaton and
van Rijsbergen, 1983; Fang and Zhai, 2006) or hand-
crafted thesauri (Voorhees, 1994; Liu et al., 2004) or
both (Cao et al., 2005; Mandala et al., 1999b).
Intuitively, compared with co-occurrence-based
thesauri, hand-crafted thesauri, such as WordNet,
could provide more reliable terms for query ex-
pansion. However, previous studies failed to show
any significant gain in retrieval performance when
queries are expanded with terms selected from
WordNet (Voorhees, 1994; Stairmand, 1997). Although
some researchers have shown that combining
terms from both types of resources is effective,
the benefit of query expansion using only manually
created lexical resources remains unclear. The main
challenge is how to assign appropriate weights to the
expanded terms.
In this paper, we re-examine the problem of
query expansion using lexical resources with the
recently proposed axiomatic approaches (Fang and
Zhai, 2006). The major advantage of axiomatic approaches
in query expansion is that they provide guidance
on how to weight related terms based on a given
term similarity function. In our previous study, a
co-occurrence-based term similarity function was proposed
and studied. In this paper, we study several
term similarity functions that exploit various information
from two lexical resources, i.e., WordNet
and the dependency-based thesaurus constructed by Lin (Lin,
1998), and then incorporate these similarity functions
into the axiomatic retrieval framework. We
conduct empirical experiments over several standard TREC
collections to systematically evaluate the
effectiveness of query expansion based on these similarity
functions. Experiment results show that all
the similarity functions improve the retrieval performance,
although the performance improvement
varies for different functions. We find that the most
effective way to utilize the information from WordNet
is to compute the term similarity based on the
overlap of synset definitions. Using this similarity
function in query expansion can significantly improve
the retrieval performance. In terms of
retrieval performance, the proposed similarity function
is significantly better than a simple mutual-information-based
similarity function, and it is comparable
to the function proposed in (Fang and Zhai,
2006). Furthermore, we show that the retrieval performance
can be further improved if the proposed
similarity function is combined with the similarity
function derived from co-occurrence-based
resources.
The main contribution of this paper is to re-examine
the problem of query expansion using lexical
resources with a new approach. Unlike previous
studies, we are able to show that query expansion using
only manually created lexical resources can
significantly improve the retrieval performance.
The rest of the paper is organized as follows. We
discuss the related work in Section 2, and briefly review
the studies of query expansion using axiomatic
approaches in Section 3. We then present our study
of using lexical resources, such as WordNet, for
query expansion in Section 4, and discuss experiment
results in Section 5. Finally, we conclude in
Section 6.
2 Related Work
Although the use of WordNet in query expansion
has been studied by various researchers, the improvement
of retrieval performance is often limited.
Voorhees (Voorhees, 1994) expanded queries
using a combination of synonyms, hypernyms and
hyponyms manually selected from WordNet, and
achieved limited improvement (i.e., around −2% to
+2%) on short verbose queries. Stairmand (Stairmand,
1997) used WordNet for query expansion, but
concluded that the improvement was restricted
by the coverage of WordNet; no empirical
results were reported.
More recent studies focused on combining the information
from both co-occurrence-based and hand-crafted
thesauri. Mandala et al. (Mandala et al.,
1999a; Mandala et al., 1999b) studied the problem
in the vector space model, and Cao et al. (Cao et al.,
2005) focused on extending language models. Although
they were able to improve the performance,
it remains unclear whether using only information
from hand-crafted thesauri would help to improve
the retrieval performance.
Another way to improve retrieval performance
using WordNet is to disambiguate word senses.
Voorhees (Voorhees, 1993) showed that using WordNet
for word sense disambiguation degrades the retrieval
performance. Liu et al. (Liu et al., 2004)
used WordNet for both sense disambiguation and
query expansion and achieved reasonable performance
improvement. However, the computational
cost is high and the benefit of query expansion using
only WordNet is unclear. Ruch et al. (Ruch et al.,
2006) studied the problem in the domain of biology
literature and proposed an argumentative feedback
approach, where expanded terms are selected
only from sentences classified into one of four disjoint
argumentative categories.
The goal of this paper is to study whether query
expansion using only manually created lexical resources
can lead to performance improvement.
The main contribution of our work is to
show that query expansion using only hand-crafted lexical
resources is effective in the recently proposed
axiomatic framework, which has not been shown in
the previous studies.
3 Query Expansion in Axiomatic Retrieval Model
Axiomatic approaches have recently been proposed
and studied to develop retrieval functions (Fang and
Zhai, 2005; Fang and Zhai, 2006). The main idea is
to search for a retrieval function that satisfies all the
desirable retrieval constraints, i.e., axioms. The underlying
assumption is that a retrieval function satisfying
all the constraints would perform well empirically.
Unlike other retrieval models, axiomatic
retrieval models directly model relevance with
term-level retrieval constraints.
In (Fang and Zhai, 2005), several axiomatic re-
trieval functions have been derived based on a set of
basic formalized retrieval constraints and an induc-
tive definition of the retrieval function space. The
derived retrieval functions are shown to perform as
well as the existing retrieval functions with less parameter
sensitivity. One of the components in the
inductive definition is the primitive weighting function,
which assigns the retrieval score to a single-term
document {d} for a single-term query {q} based on
\[
S(\{q\}, \{d\}) =
\begin{cases}
\omega(q) & q = d \\
0 & q \neq d
\end{cases}
\tag{1}
\]
where ω(q) is a term weighting function of q. A limitation
of the primitive weighting function described
in Equation 1 is that it cannot bridge vocabulary
gaps between documents and queries.
To overcome this limitation, in (Fang and Zhai,
2006), we proposed a set of semantic term match-
ing constraints and modified the previously derived
axiomatic functions to make them satisfy these ad-
ditional constraints. In particular, the primitive
weighting function is generalized as
S({q}, {d}) = ω(q) × f(s(q, d)),
where s(q, d) is a semantic similarity function be-
tween two terms q and d, and f is a monotonically
increasing function defined as
\[
f(s(q, d)) =
\begin{cases}
1 & q = d \\
\dfrac{s(q, d)}{s(q, q)} \times \beta & q \neq d
\end{cases}
\tag{2}
\]
where β is a parameter that regulates the weighting
of the original query terms and the semantically sim-
ilar terms. We have shown that the proposed gen-
eralization can be implemented as a query expan-
sion method. Specifically, the expanded terms are
selected based on a term similarity function s and
the weight of an expanded term t is determined by
its term similarity with a query term q, i.e., s(q, t),
as well as the weight of the query term, i.e., ω(q).
Note that in traditional query expansion methods, the weight
of an expanded term t is ω(t).
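The weighting scheme above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `f_weight`, `primitive_score` and `expand_term`, the top-k selection rule, the handling of a zero self-similarity, and all toy similarity values are assumptions made for the example and are not taken from the paper.

```python
from typing import Callable, Iterable, List, Tuple

Similarity = Callable[[str, str], float]
Weight = Callable[[str], float]


def f_weight(q: str, d: str, s: Similarity, beta: float) -> float:
    """Equation 2: 1 for an exact match, beta * s(q, d) / s(q, q) otherwise."""
    if q == d:
        return 1.0
    self_sim = s(q, q)
    if self_sim == 0.0:
        # Assumption: the ratio is undefined here, so treat it as no match.
        return 0.0
    return beta * s(q, d) / self_sim


def primitive_score(q: str, d: str, omega: Weight, s: Similarity, beta: float) -> float:
    """Generalized primitive weighting: S({q}, {d}) = omega(q) * f(s(q, d))."""
    return omega(q) * f_weight(q, d, s, beta)


def expand_term(q: str, candidates: Iterable[str], omega: Weight,
                s: Similarity, beta: float, k: int = 2) -> List[Tuple[str, float]]:
    """Keep the k candidates most similar to q (hypothetical selection rule)
    and weight each of them with the generalized primitive weighting."""
    scored = [(t, s(q, t)) for t in candidates if t != q and s(q, t) > 0.0]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(t, primitive_score(q, t, omega, s, beta)) for t, _ in scored[:k]]


# Toy, made-up similarity values and term weights, for illustration only.
TOY_SIM = {("car", "car"): 1.0, ("car", "automobile"): 0.5, ("car", "truck"): 0.25}


def toy_s(t1: str, t2: str) -> float:
    return TOY_SIM.get((t1, t2), 0.0)


def toy_omega(term: str) -> float:
    return 2.0


expanded = expand_term("car", ["automobile", "truck", "banana"],
                       toy_omega, toy_s, beta=0.4)
for term, weight in expanded:
    print(term, round(weight, 3))
```

Note that the weight of each expanded term depends on ω of the original query term and on the similarity ratio, not on the expanded term's own ω.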
In our previous study (Fang and Zhai, 2006), term
similarity function s is derived based on the mutual
information of terms over collections that are con-
structed under the guidance of a set of term semantic
similarity constraints. The focus of this paper is to
study and compare several term similarity functions
exploiting the information from lexical resources,
and evaluate their effectiveness in the axiomatic re-
trieval models.
4 Term Similarity based on Lexical
Resources
In this section, we discuss a set of term similar-
ity functions that exploit the information stored in
two lexical resources: WordNet (Miller, 1990) and
dependency-based thesaurus (Lin, 1998).
The most commonly used lexical resource is
WordNet (Miller, 1990), which is a hand-crafted
lexical system developed at Princeton University.
Words are organized into four taxonomies based on
different parts of speech. Every node in the WordNet
is a synset, i.e., a set of synonyms. The definition of
a synset, which is referred to as gloss, is also pro-
vided. For a query term, all the synsets in which the
term appears can be returned, along with the defi-
nition of the synsets. We now discuss six possible
term similarity functions based on the information
provided by WordNet.
Since the definition provides valuable information
about the semantic meaning of a term, we can use
the definitions of the terms to measure their semantic
similarity. The more common words the definitions
of two terms have, the more similar these terms are
(Banerjee and Pedersen, 2005). Thus, we can com-
pute the term semantic similarity based on synset
definitions in the following way:
\[
s_{def}(t_1, t_2) = \frac{|D(t_1) \cap D(t_2)|}{|D(t_1) \cup D(t_2)|},
\]
where D(t) is the concatenation of the definitions
for all the synsets containing term t and |D| is the
number of words in the set D.
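A minimal sketch of this overlap measure is given below. It assumes that D(t) is treated as the set of distinct, lowercased, whitespace-separated words in the glosses, which is one reading of the definition above; the `glosses` dictionary is a made-up stand-in for a WordNet lookup, and the function names are hypothetical.

```python
from typing import Dict, List, Set


def definition_words(term: str, glosses: Dict[str, List[str]]) -> Set[str]:
    """D(t): the distinct words in the glosses of all synsets containing term."""
    words: Set[str] = set()
    for gloss in glosses.get(term, []):
        words.update(gloss.lower().split())
    return words


def s_def(t1: str, t2: str, glosses: Dict[str, List[str]]) -> float:
    """Overlap of definition words: |D(t1) & D(t2)| / |D(t1) | D(t2)|."""
    d1 = definition_words(t1, glosses)
    d2 = definition_words(t2, glosses)
    union = d1 | d2
    if not union:
        # Assumption: terms without any definition get similarity 0.
        return 0.0
    return len(d1 & d2) / len(union)


# Made-up glosses for illustration; real glosses would come from WordNet.
TOY_GLOSSES = {
    "car": ["a motor vehicle with four wheels"],
    "truck": ["a motor vehicle for hauling loads"],
}

print(s_def("car", "truck", TOY_GLOSSES))
```

In this toy example the two glosses share three of nine distinct words. A practical implementation would probably also remove stop words and apply the same stemming as the retrieval system, but the paper does not specify those details.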
Within a taxonomy, synsets are organized by their
lexical relations. Thus, given a term, related terms
can be found in the synsets related to the synsets
containing the term. In this paper, we consider the
following five word relations.
• Synonym(Syn): X and Y are synonyms if they
are interchangeable in some context.
• Hypernym(Hyper): Y is a hypernym of X if X
is a (kind of) Y.
• Hyponym(Hypo): X is a hyponym of Y if X is
a (kind of) Y.
• Holonym(Holo): Y is a holonym of X if X is a
part of Y.
• Meronym(Mero): X is a meronym of Y if X is
a part of Y.
Since these relations are binary, we define the term
similarity functions based on these relations in the
following way:
\[
s_R(t_1, t_2) =
\begin{cases}
\alpha_R & t_1 \in T_R(t_2) \\
0 & t_1 \notin T_R(t_2)
\end{cases}
\]
where R ∈ {syn, hyper, hypo, holo, mero}, T_R(t)
is the set of words that are related to term t based on
the relation R, and the α_R are non-zero parameters that
control the similarity between terms based on different
relations. However, since the similarity values
for all term pairs are the same, the values of these parameters
can be ignored when we use Equation 2 in
query expansion.
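The relation-based functions amount to a set-membership test, as the following sketch shows. The `related` mapping is a hypothetical, hand-filled stand-in for WordNet lookups, and reading T_R(t) as "the words reachable from t by relation R" (for example, T_hyper(car) holds the hypernyms of car) is an assumption about the direction of the relation.

```python
from typing import Dict, Set, Tuple

RELATIONS = ("syn", "hyper", "hypo", "holo", "mero")


def s_rel(t1: str, t2: str, relation: str,
          related: Dict[Tuple[str, str], Set[str]], alpha: float = 1.0) -> float:
    """Binary similarity: alpha if t1 is in T_R(t2), and 0 otherwise."""
    if relation not in RELATIONS:
        raise ValueError(f"unknown relation: {relation}")
    return alpha if t1 in related.get((relation, t2), set()) else 0.0


# Made-up relation sets keyed by (relation, term), for illustration only.
TOY_RELATED = {
    ("hyper", "car"): {"vehicle"},
    ("hypo", "vehicle"): {"car"},
    ("mero", "car"): {"wheel"},
    ("holo", "wheel"): {"car"},
}

print(s_rel("vehicle", "car", "hyper", TOY_RELATED))  # related pair
print(s_rel("car", "vehicle", "hyper", TOY_RELATED))  # unrelated in this direction
```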
Another lexical resource we study in the paper is
the dependency-based thesaurus provided by Lin¹
(Lin, 1998). The thesaurus provides term similarities
that are automatically computed based on dependency
relationships extracted from a parsed corpus.
We define a similarity function that can utilize
this thesaurus as follows:
\[
s_{Lin}(t_1, t_2) =
\begin{cases}
L(t_1, t_2) & (t_1, t_2) \in TP_{Lin} \\
0 & (t_1, t_2) \notin TP_{Lin}
\end{cases}
\]
where L(t_1, t_2) is the similarity of the terms stored in
the dependency-based thesaurus and TP_Lin is the set
of all the term pairs stored in the thesaurus. The
similarity of two terms is set to zero if
the term pair cannot be found in the thesaurus.
Since all the similarity functions discussed above
capture different perspectives of term relations, we
propose a simple strategy to combine them:
the similarity of a term pair is
the highest similarity value assigned to the pair by
any of the above similarity functions, i.e.,
\[
s_{combined}(t_1, t_2) = \max_{R \in Rset} s_R(t_1, t_2),
\]
where
Rset = {def, syn, hyper, hypo, holo, mero, Lin}.
¹ Available at http://www.cs.ualberta.ca/~lindek/downloads.htm
In summary, we have discussed eight possible
similarity functions that exploit the information
from the lexical resources. We then incorporate
these similarity functions into the axiomatic retrieval
models based on Equation 2, and perform query ex-
pansion based on the procedure described in Section
3. The empirical results are reported in Section 5.
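The thesaurus lookup and the max-based combination described in this section can be sketched as follows. All names and values are hypothetical; in particular, the thesaurus is modeled as a plain dictionary keyed by ordered term pairs, which is an assumption about its format rather than a description of Lin's data files.

```python
from typing import Callable, Dict, Iterable, Tuple

Similarity = Callable[[str, str], float]


def make_s_lin(pairs: Dict[Tuple[str, str], float]) -> Similarity:
    """Build s_Lin from stored pair similarities; missing pairs score 0."""
    def s_lin(t1: str, t2: str) -> float:
        return pairs.get((t1, t2), 0.0)
    return s_lin


def s_combined(t1: str, t2: str, functions: Iterable[Similarity]) -> float:
    """Highest similarity assigned to the pair by any component function."""
    return max((s(t1, t2) for s in functions), default=0.0)


# Made-up component functions and values, for illustration only.
s_lin = make_s_lin({("car", "truck"): 0.3})


def toy_s_def(t1: str, t2: str) -> float:
    return 0.2 if {t1, t2} == {"car", "truck"} else 0.0


def toy_s_syn(t1: str, t2: str) -> float:
    return 1.0 if {t1, t2} == {"car", "automobile"} else 0.0


components = [toy_s_def, toy_s_syn, s_lin]
print(s_combined("car", "truck", components))
print(s_combined("car", "automobile", components))
```

Because the maximum is taken over raw values, the relative scales of the component functions matter: a binary function that returns a large constant will dominate graded functions for every pair it covers.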
5 Experiments
In this section, we experimentally evaluate the effectiveness
of query expansion with the term similarity
functions discussed in Section 4 in the axiomatic
framework. Experiment results show that the similarity
function based on synset definitions is the most
effective. By incorporating this similarity function
into the axiomatic retrieval models, we show that
query expansion using the information from only
WordNet can lead to significant improvement of retrieval
performance, which has not been shown in
the previous studies (Voorhees, 1994; Stairmand,
1997).
5.1 Experiment Design
We conduct three sets of experiments. First, we
compare the effectiveness of term similarity func-
tions discussed in Section 4 in the context of
query expansion. Second, we compare the best
one with the term similarity functions derived from
co-occurrence-based resources. Finally, we study
whether the combination of term similarity func-
tions from different resources can further improve
the performance.
All experiments are conducted over six TREC
collections: ap88-89, doe, fr88-89, wt2g, trec7 and
trec8. Table 1 shows some statistics of the collections,
including the description, the collection size,
the vocabulary size, the number of documents and
the number of queries. The preprocessing only involves
stemming with Porter's stemmer.

Table 1: Statistics of Test Collections
Collection   Description            Size    #Voc.   #Doc.   #Query
ap88-89      news articles          491MB   361K    165K    150
doe          technical reports      184MB   163K    226K    35
fr88-89      government documents   469MB   204K    204K    42
trec7        ad hoc data            2GB     908K    528K    50
trec8        ad hoc data            2GB     908K    528K    50
wt2g         web collections        2GB     1968K   247K    50
We use WordNet 3.0², the Lemur Toolkit³ and the
TrecWN library⁴ in the experiments. The results are
evaluated with both MAP (mean average precision)
and gMAP (geometric mean average precision)
(Voorhees, 2005); the latter emphasizes the performance
on difficult queries.
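To illustrate why gMAP emphasizes difficult queries, the sketch below computes both measures from a list of per-query average precision values. The function names are hypothetical, and the small floor applied before taking logarithms (1e-5 here) is an assumption introduced so that a query with zero average precision does not force the geometric mean to zero; the paper does not state which floor, if any, was used.

```python
import math
from typing import Sequence


def mean_average_precision(ap_values: Sequence[float]) -> float:
    """Arithmetic mean of the per-query average precision values."""
    if not ap_values:
        raise ValueError("at least one query is required")
    return sum(ap_values) / len(ap_values)


def geometric_map(ap_values: Sequence[float], floor: float = 1e-5) -> float:
    """Geometric mean of per-query average precision.

    Values below `floor` are raised to `floor` (an assumed convention).
    """
    if not ap_values:
        raise ValueError("at least one query is required")
    log_sum = sum(math.log(max(ap, floor)) for ap in ap_values)
    return math.exp(log_sum / len(ap_values))


# Made-up per-query scores: two easy queries and one difficult query.
toy_ap = [0.5, 0.5, 0.02]
print(round(mean_average_precision(toy_ap), 3))
print(round(geometric_map(toy_ap), 3))
```

The single poorly performing query pulls the geometric mean down much more than the arithmetic mean, so improvements on hard queries show up more clearly in gMAP.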
There is one parameter β in the query expansion
method presented in Section 3. We tune the value of
β and report the best performance. The parameter
sensitivity is similar to the observations described in
(Fang and Zhai, 2006) and will not be discussed in
this paper. In all the result tables, ‡ and † indicate
that the performance difference is statistically sig-
nificant according to Wilcoxon signed rank test at
the level of 0.05 and 0.1 respectively.
We now explain the notations of different methods.
BL is the baseline method without query expansion.
In this paper, we use the best performing
function derived in axiomatic retrieval models, i.e.,
F2-EXP in (Fang and Zhai, 2005) with a fixed parameter
value (b = 0.5). QE_X is the query expansion
method with term similarity function s_X, where
X can be def, syn, hyper, hypo, mero, holo,
Lin or Combined.
² http://wordnet.princeton.edu/
³ http://www.lemurproject.org/
⁴ http://l2r.cs.uiuc.edu/~cogcomp/software.php
Furthermore, we examine the query expansion
method using co-occurrence-based resources. In
particular, we evaluate the retrieval performance using
the following two similarity functions: s_MIBL
and s_MIImp. Both functions are based on the mutual
information of terms in a set of documents. s_MIBL
uses the collection itself to compute the mutual information,
while s_MIImp uses the working sets constructed
based on several constraints (Fang and Zhai,
2006). The mutual information of two terms t_1 and
t_2 in collection C is computed as follows (van Rijsbergen,
1979):
\[
I(X_{t_1}, X_{t_2}) = \sum_{X_{t_1}, X_{t_2} \in \{0, 1\}}
p(X_{t_1}, X_{t_2}) \log \frac{p(X_{t_1}, X_{t_2})}{p(X_{t_1})\, p(X_{t_2})},
\]
where X_{t_i} is a binary random variable corresponding to the
presence/absence of term t_i in each document of collection C.
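As a rough illustration, the mutual information of two terms can be estimated from document-level presence and absence counts, summing over the four presence/absence combinations as in the standard definition. The sketch below assumes that each document is represented as a set of terms, that probabilities are maximum-likelihood estimates without smoothing, and that the logarithm is natural; none of these choices is specified in the paper, and the function name is hypothetical.

```python
import math
from typing import Sequence, Set


def mutual_information(t1: str, t2: str, documents: Sequence[Set[str]]) -> float:
    """Mutual information of the presence/absence variables of t1 and t2."""
    n = len(documents)
    if n == 0:
        return 0.0
    # Joint counts over (t1 present?, t2 present?).
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for doc in documents:
        counts[(int(t1 in doc), int(t2 in doc))] += 1
    mi = 0.0
    for (a, b), count in counts.items():
        if count == 0:
            continue  # a zero-probability cell contributes nothing
        p_joint = count / n
        p_a = (counts[(a, 0)] + counts[(a, 1)]) / n
        p_b = (counts[(0, b)] + counts[(1, b)]) / n
        mi += p_joint * math.log(p_joint / (p_a * p_b))
    return mi


# Made-up collections: terms that always co-occur, and independent terms.
correlated_docs = [{"a", "b"}, {"a", "b"}, {"c"}, {"c"}]
independent_docs = [{"a", "b"}, {"a"}, {"b"}, set()]
print(mutual_information("a", "b", correlated_docs))
print(mutual_information("a", "b", independent_docs))
```

Terms that always appear together in half of the documents reach the maximum for that split, while statistically independent terms score zero.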
5.2 Effectiveness of Lexical Resources
We first compare the retrieval performance of query
expansion with different similarity functions us-
ing short keyword (i.e., title-only) queries, because
query expansion techniques are often more effective
for shorter queries (Voorhees, 1994; Fang and Zhai,
2006). The results are presented in Table 2.
Query expansion with these functions can
improve the retrieval performance, although the
performance gains achieved by different functions vary
considerably. In particular, we make the following
observations.
First, the similarity function based on synset definitions
is the most effective one. QE_def significantly
improves the retrieval performance on all the
data sets. For example, on trec7, it improves MAP
from 0.186 to 0.216. As far as we know,
none of the previous studies showed such significant
performance improvement by using only WordNet
as the query expansion resource.
Second, the similarity functions based on term relations
are less effective than the definition-based
similarity function. We think that the worse
performance is related to the following two reasons:
(1) The similarity functions based on relations are
binary, which is not a good way to model term similarities.
(2) The relations are limited by the part
of speech of the terms, because two terms in WordNet
are related only when they have the same
part-of-speech tag. The definition-based similarity
function does not have such a limitation.
Third, the similarity function based on Lin's thesaurus
is more effective than those based on term
relations from WordNet, but it is less effective
than the definition-based similarity function,
which might be caused by its smaller coverage.
Finally, combining different WordNet-based similarity
functions does not help, which may indicate
that the expanded terms selected by different functions
overlap.

Table 2: Performance of query expansion using lexical resources (short keyword queries)
               trec7                 trec8                 wt2g
               MAP        gMAP       MAP        gMAP       MAP        gMAP
BL             0.186      0.083      0.250      0.147      0.282      0.188
QE_def         0.216‡     0.105‡     0.266‡     0.164‡     0.301‡     0.210‡
               (+16%)     (+27%)     (+6.4%)    (+12%)     (+6.7%)    (+12%)
QE_syn         0.194      0.085‡     0.252†     0.150†     0.287‡     0.194‡
               (+4.3%)    (+2.4%)    (+0.8%)    (+2.0%)    (+1.8%)    (+3.2%)
QE_hyper       0.186      0.086      0.250      0.152      0.286†     0.192†
               (0%)       (+3.6%)    (0%)       (+3.4%)    (+1.4%)    (+2.1%)
QE_hypo        0.186†     0.085‡     0.250      0.147      0.282†     0.190
               (0%)       (+2.4%)    (0%)       (0%)       (0%)       (+1.1%)
QE_mero        0.187‡     0.084‡     0.250      0.147      0.282      0.189
               (+0.5%)    (+1.2%)    (0%)       (0%)       (0%)       (+0.5%)
QE_holo        0.191‡     0.085‡     0.250      0.147      0.282      0.188
               (+2.7%)    (+2.4%)    (0%)       (0%)       (0%)       (0%)
QE_Lin         0.193‡     0.092‡     0.256‡     0.156‡     0.290‡     0.200‡
               (+3.7%)    (+11%)     (+2.4%)    (+6.1%)    (+2.8%)    (+6.4%)
QE_Combined    0.214‡     0.104‡     0.267‡     0.165‡     0.300‡     0.208‡
               (+15%)     (+25%)     (+6.8%)    (+12%)     (+6.4%)    (+10.5%)

               ap88-89               doe                   fr88-89
               MAP        gMAP       MAP        gMAP       MAP        gMAP
BL             0.220      0.074      0.174      0.069      0.222      0.062
QE_def         0.254‡     0.088‡     0.181‡     0.075‡     0.225‡     0.067‡
               (+15%)     (+19%)     (+4%)      (+10%)     (+1.4%)    (+8.1%)
QE_syn         0.222‡     0.077‡     0.174      0.074      0.222      0.065
               (+0.9%)    (+4.1%)    (0%)       (+7.3%)    (0%)       (+4.8%)
QE_hyper       0.222‡     0.074      0.175      0.070      0.222      0.062
               (+0.9%)    (0%)       (+0.5%)    (+1.5%)    (0%)       (0%)
QE_hypo        0.222‡     0.076‡     0.176†     0.073†     0.222      0.062
               (+0.9%)    (+2.7%)    (+1.1%)    (+5.8%)    (0%)       (0%)
QE_mero        0.221      0.074†     0.174†     0.070†     0.222      0.062
               (+0.45%)   (0%)       (0%)       (+1.5%)    (0%)       (0%)
QE_holo        0.221      0.076      0.177†     0.073      0.222      0.062
               (+0.45%)   (+2.7%)    (+1.7%)    (+5.8%)    (0%)       (0%)
QE_Lin         0.245‡     0.082‡     0.178      0.073      0.222      0.067†
               (+11%)     (+11%)     (+2.3%)    (+5.8%)    (0%)       (+8.1%)
QE_Combined    0.254‡     0.085‡     0.179†     0.074†     0.223†     0.065
               (+15%)     (+12%)     (+2.9%)    (+7.3%)    (+0.5%)    (+4.3%)

Table 3: Performance comparison of hand-crafted and co-occurrence-based thesauri (short keyword queries)
Data       MAP                              gMAP
           QE_def    QE_MIBL   QE_MIImp     QE_def    QE_MIBL   QE_MIImp
ap88-89    0.254     0.233‡    0.265‡       0.088     0.081‡    0.089‡
doe        0.181     0.175†    0.183        0.075     0.071†    0.078
fr88-89    0.225     0.222‡    0.227†       0.067     0.063     0.071‡
trec7      0.216     0.195‡    0.236‡       0.105     0.089‡    0.097
trec8      0.266     0.250‡    0.278        0.164     0.148‡    0.172
wt2g       0.301     0.311     0.320‡       0.210     0.218     0.219‡
5.3 Comparison with Co-occurrence-based
Resources
As shown in Table 2, the similarity function based
on synset definitions, i.e., s_def, is most effective. We
now compare the retrieval performance of using this
similarity function with that of using the mutual
information based functions, i.e., s_MIBL and s_MIImp.
The experiments are conducted over two types of
queries, i.e., short keyword (keyword title) and short
verbose (one sentence description) queries.
The results for short keyword queries are shown
in Table 3. The retrieval performance of query expansion
based on s_def is significantly better than
that based on s_MIBL on almost all the data sets,
while it is slightly worse than that based on s_MIImp
on some data sets. We can make a similar observation
from the results for short verbose queries
shown in Table 4. One advantage of s_def over
s_MIImp is the computational cost, because s_def can
be computed offline in advance while s_MIImp has to
be computed online from query-dependent working
sets, which takes much more time. The low computational
cost and high retrieval performance make s_def
more attractive in real-world applications.
5.4 Additive Effect
Since both types of similarity functions are able
to improve retrieval performance, we now study
whether combining them could lead to better per-
formance. Table 5 shows the retrieval performance
of combining both types of similarity functions for
short keyword queries. The results for short verbose
queries are similar. Clearly, combining the similar-
ity functions from different resources could further
improve the performance.
Table 4: Performance Comparison (MAP, short verbose queries)
Data       BL       QE_def            QE_MIBL           QE_MIImp
ap88-89    0.181    0.220‡ (21.5%)    0.205‡ (13.3%)    0.230‡ (27.1%)
doe        0.109    0.121‡ (11%)      0.119 (9.17%)     0.117 (7.34%)
fr88-89    0.146    0.164‡ (12.3%)    0.162‡ (11%)      0.164‡ (12.3%)
trec7      0.184    0.209‡ (13.6%)    0.196 (6.52%)     0.224‡ (21.7%)
trec8      0.234    0.238‡ (1.71%)    0.235 (0.4%)      0.243† (3.85%)
wt2g       0.266    0.276 (3.76%)     0.276† (3.76%)    0.282‡ (6.02%)

Table 5: Additive Effect (MAP, short keyword queries)
                ap88-89   doe       fr88-89   trec7     trec8     wt2g
QE_MIBL         0.233     0.175     0.222     0.195     0.250     0.311
QE_def+MIBL     0.257‡    0.183‡    0.225‡    0.217‡    0.267‡    0.320‡
QE_MIImp        0.265     0.183     0.227     0.236     0.278     0.320
QE_def+MIImp    0.269‡    0.187     0.232‡    0.237‡    0.280†    0.322†

6 Conclusions
Query expansion is an effective technique in information
retrieval to improve the retrieval performance,
because it often can bridge the vocabulary
gaps between queries and documents. Intuitively,
hand-crafted thesauri could provide reliable related
terms, which would help improve the performance.
However, none of the previous studies was able to
show significant performance improvement through
query expansion using information only from manually
created lexical resources.
In this paper, we re-examine the problem of query
expansion using lexical resources in the recently proposed
axiomatic framework and find that we are
able to significantly improve retrieval performance
through query expansion using only hand-crafted
lexical resources. In particular, we first study a
few term similarity functions exploiting the information
from two lexical resources: WordNet and the
dependency-based thesaurus created by Lin. We
then incorporate the similarity functions into the
query expansion method in the axiomatic retrieval
models. Systematic experiments conducted
over six standard TREC collections show
promising results. All the proposed similarity functions
improve the retrieval performance, although
the degree of improvement varies for different similarity
functions. Among all the functions, the one
based on synset definitions is most effective and is
able to significantly and consistently improve retrieval
performance for all the data sets. This similarity
function is also compared with some similarity
functions using mutual information. Furthermore,
experiment results show that combining similarity
functions from different resources could further improve
the performance.
Unlike previous studies, we are able to show that
query expansion using only manually created thesauri
can lead to significant performance improvement.
The main reason is that the axiomatic approach
provides guidance on how to appropriately
assign weights to expanded terms.
There are many interesting future research direc-
tions based on this work. First, we will study the
same problem in some specialized domain, such as
biology literature, to see whether the proposed ap-
proach could be generalized to the new domain.
Second, the fact that using axiomatic approaches to
incorporate linguistic information can improve re-
trieval performance is encouraging. We plan to ex-
tend the axiomatic approach to incorporate more
linguistic information, such as phrases and word
senses, into retrieval models to further improve the
performance.
Acknowledgments
We thank ChengXiang Zhai, Dan Roth, Rodrigo de
Salvo Braz for valuable discussions. We also thank
three anonymous reviewers for their useful com-
ments.
References
J. Bai, D. Song, P. Bruza, J. Nie, and G. Cao. 2005. Query expansion using term relationships in language models for information retrieval. In Fourteenth International Conference on Information and Knowledge Management (CIKM 2005).
S. Banerjee and T. Pedersen. 2005. Extended gloss overlaps as a measure of semantic relatedness. In Proceedings of the 18th International Joint Conference on Artificial Intelligence.
G. Cao, J. Nie, and J. Bai. 2005. Integrating word relationships into language models. In Proceedings of the 2005 ACM SIGIR Conference on Research and Development in Information Retrieval.
H. Fang and C. Zhai. 2005. An exploration of axiomatic approaches to information retrieval. In Proceedings of the 2005 ACM SIGIR Conference on Research and Development in Information Retrieval.
H. Fang and C. Zhai. 2006. Semantic term matching in axiomatic approaches to information retrieval. In Proceedings of the 2006 ACM SIGIR Conference on Research and Development in Information Retrieval.
N. Fuhr. 1992. Probabilistic models in information retrieval. The Computer Journal, 35(3):243–255.
Y. Jing and W. Bruce Croft. 1994. An association thesaurus for information retrieval. In Proceedings of RIAO.
D. Lin. 1998. An information-theoretic definition of similarity. In Proceedings of International Conference on Machine Learning (ICML).
S. Liu, F. Liu, C. Yu, and W. Meng. 2004. An effective approach to document retrieval via utilizing WordNet and recognizing phrases. In Proceedings of the 2004 ACM SIGIR Conference on Research and Development in Information Retrieval.
R. Mandala, T. Tokunaga, and H. Tanaka. 1999a. Ad hoc retrieval experiments using WordNet and automatically constructed thesauri. In Proceedings of the Seventh Text REtrieval Conference (TREC7).
R. Mandala, T. Tokunaga, and H. Tanaka. 1999b. Combining multiple evidence from different types of thesaurus for query expansion. In Proceedings of the 1999 ACM SIGIR Conference on Research and Development in Information Retrieval.
G. Miller. 1990. WordNet: An on-line lexical database. International Journal of Lexicography, 3(4).
H. J. Peat and P. Willett. 1991. The limitations of term co-occurrence data for query expansion in document retrieval systems. Journal of the American Society for Information Science, 42(5):378–383.
J. Ponte and W. B. Croft. 1998. A language modeling approach to information retrieval. In Proceedings of the ACM SIGIR'98, pages 275–281.
Y. Qiu and H. P. Frei. 1993. Concept based query expansion. In Proceedings of the 1993 ACM SIGIR Conference on Research and Development in Information Retrieval.
P. Ruch, I. Tbahriti, J. Gobeill, and A. R. Aronson. 2006. Argumentative feedback: A linguistically-motivated term expansion for information retrieval. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 675–682.
G. Salton, C. S. Yang, and C. T. Yu. 1975. A theory of term importance in automatic text analysis. Journal of the American Society for Information Science, 26(1):33–44, Jan-Feb.
A. F. Smeaton and C. J. van Rijsbergen. 1983. The retrieval effects of query expansion on a feedback document retrieval system. The Computer Journal, 26(3):239–246.
M. A. Stairmand. 1997. Textual context analysis for information retrieval. In Proceedings of the 1997 ACM SIGIR Conference on Research and Development in Information Retrieval.
C. J. van Rijsbergen. 1979. Information Retrieval. Butterworths.
E. M. Voorhees. 1993. Using WordNet to disambiguate word sense for text retrieval. In Proceedings of the 1993 ACM SIGIR Conference on Research and Development in Information Retrieval.
E. M. Voorhees. 1994. Query expansion using lexical-semantic relations. In Proceedings of the 1994 ACM SIGIR Conference on Research and Development in Information Retrieval.
E. M. Voorhees. 2005. Overview of the TREC 2005 robust retrieval track. In Notebook of the Thirteenth Text REtrieval Conference (TREC2005).