One of the earliest problems in empirical asset pricing has been to determine whether a proposed model is correctly specified. This can be accomplished by using various specification tests, which are typically aggregate measures of sample pricing errors. However, some of these specification tests aggregate the pricing errors using weighting matrices that are model dependent, and the resulting test statistics cannot be used to perform model comparison. Therefore, researchers are often interested in a normalized goodness-of-fit measure that uses the same weighting matrix across models. One such measure is the cross-sectional $R^2$. Following Kandel and Stambaugh (1995), this is defined as
\[
\rho^2 = 1 - \frac{Q}{Q_0}, \tag{9.34}
\]
where
\[
Q_0 = \min_{\gamma_0}\,(\mu_2 - 1_N\gamma_0)'W(\mu_2 - 1_N\gamma_0)
    = \mu_2'W\mu_2 - \mu_2'W1_N(1_N'W1_N)^{-1}1_N'W\mu_2, \tag{9.35}
\]
\[
Q = e'We = \mu_2'W\mu_2 - \mu_2'WX(X'WX)^{-1}X'W\mu_2. \tag{9.36}
\]
In order for $\rho^2$ to be well defined, we need to assume that $\mu_2$ is not proportional to $1_N$ (the expected returns are not all equal), so that $Q_0 > 0$. Note that $0 \le \rho^2 \le 1$ and that $\rho^2$ is a decreasing function of the aggregate pricing-error measure $Q = e'We$. Thus, $\rho^2$ is a natural measure of goodness of fit. However, it should be emphasized that unless the model is correctly specified, $\rho^2$ depends on the choice of $W$. Therefore, it is possible that a model with a good fit under the OLS CSR provides a very poor fit under the GLS CSR.
The sample measure of $\rho^2$ is similarly defined as
\[
\hat{\rho}^2 = 1 - \frac{\hat{Q}}{\hat{Q}_0}, \tag{9.37}
\]
where $\hat{Q}_0$ and $\hat{Q}$ are consistent estimators of $Q_0$ and $Q$ in (9.35) and (9.36), respectively. When $W$ is known, we estimate $Q_0$ and $Q$ using
\[
\hat{Q}_0 = \hat{\mu}_2'W\hat{\mu}_2 - \hat{\mu}_2'W1_N(1_N'W1_N)^{-1}1_N'W\hat{\mu}_2, \tag{9.38}
\]
\[
\hat{Q} = \hat{\mu}_2'W\hat{\mu}_2 - \hat{\mu}_2'W\hat{X}(\hat{X}'W\hat{X})^{-1}\hat{X}'W\hat{\mu}_2. \tag{9.39}
\]
When $W$ is not known, we replace $W$ with $\hat{W}$ in the formulas above.
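To make the construction of $\hat{\rho}^2$ concrete, the following is a minimal sketch of (9.37)-(9.39), assuming hypothetical inputs `R` (a $T \times N$ matrix of test-asset returns), `f` (a $T \times K$ matrix of factors), and a known weighting matrix `W` (for example, $W = I_N$ for the OLS CSR).

```python
import numpy as np

def sample_cross_sectional_r2(R, f, W):
    """Sample cross-sectional R^2 of (9.37) for a known weighting matrix W."""
    T, N = R.shape
    f = np.asarray(f).reshape(T, -1)
    mu2 = R.mean(axis=0)                          # sample mean returns \hat{mu}_2
    F = f - f.mean(axis=0)                        # de-meaned factors
    V11 = F.T @ F / T                             # \hat{V}_{11}
    V21 = (R - mu2).T @ F / T                     # \hat{V}_{21}
    beta = V21 @ np.linalg.inv(V11)               # first-pass betas
    X = np.column_stack([np.ones(N), beta])       # \hat{X} = [1_N, \hat{beta}]
    ones = np.ones((N, 1))

    def fitted_part(A):                           # \hat{mu}_2' W A (A' W A)^{-1} A' W \hat{mu}_2
        return mu2 @ W @ A @ np.linalg.solve(A.T @ W @ A, A.T @ W @ mu2)

    Q0_hat = mu2 @ W @ mu2 - fitted_part(ones)    # (9.38)
    Q_hat = mu2 @ W @ mu2 - fitted_part(X)        # (9.39)
    return 1.0 - Q_hat / Q0_hat                   # (9.37)
```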
To test the null hypothesis of correct model specification, i.e., $e = 0_N$ (or, equivalently, $Q = 0$ and $\rho^2 = 1$), we typically rely on the sample pricing errors $\hat{e}$. Therefore, it is important to obtain the asymptotic distribution of $\hat{e}$ under the null hypothesis. For a given weighting matrix $W$ (or $\hat{W}$ with a limit of $W$), let $P$ be an $N \times (N-K-1)$ orthonormal matrix with its columns orthogonal to $W^{\frac{1}{2}}X$. Kan et al. (2010) derive the asymptotic distribution of $\hat{e}$ under the null hypothesis:
\[
\sqrt{T}\,\hat{e} \;\stackrel{A}{\sim}\; N(0_N, V(\hat{e})), \tag{9.40}
\]
where
\[
V(\hat{e}) = \sum_{j=-\infty}^{\infty} E[q_t q_{t+j}'], \tag{9.41}
\]
with
\[
q_t = W^{-\frac{1}{2}}PP'W^{\frac{1}{2}}\epsilon_t y_t, \tag{9.42}
\]
and $y_t = 1 - \gamma_1'V_{11}^{-1}(f_t - \mu_1)$.
Remark 1. Under the correctly specified model, the asymptotic distribution of $\hat{e}$ does not depend on whether we use $W$ or $\hat{W}$ as the weighting matrix.
Remark 2. Under the correctly specified model, $q_t$ in (9.42) can also be written as
\[
q_t = W^{-\frac{1}{2}}PP'W^{\frac{1}{2}}R_t y_t. \tag{9.43}
\]
Remark 3. $V(\hat{e})$ is a singular matrix and some linear combinations of $\sqrt{T}\,\hat{e}$ are not asymptotically normally distributed. As a result, one has to be careful when relying on individual sample pricing errors to test the validity of a model because some of them may not be asymptotically normally distributed. Gospodinov et al. (2010b) provide a detailed analysis of this problem. For our subsequent analysis, it is easier to work with
\[
\tilde{e} = P'W^{\frac{1}{2}}\hat{e}. \tag{9.44}
\]
The reason is that the asymptotic variance of $\tilde{e}$ is given by
\[
V(\tilde{e}) = \sum_{j=-\infty}^{\infty} E[\tilde{q}_t \tilde{q}_{t+j}'], \tag{9.45}
\]
where
\[
\tilde{q}_t = P'W^{\frac{1}{2}}\epsilon_t y_t, \tag{9.46}
\]
and $V(\tilde{e})$ is nonsingular.
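The objects in (9.42)-(9.46) can be sketched in code as follows. This is only an illustrative implementation that assumes no serial correlation in $\tilde{q}_t$; the inputs `R`, `f`, `W`, `beta`, `gamma`, and `e_hat` are assumed to come from the two-pass estimation described earlier, and `scipy` is used only for the matrix square-root and null-space routines.

```python
import numpy as np
from scipy.linalg import null_space, sqrtm

def rotated_errors_and_variance(R, f, W, beta, gamma, e_hat):
    """Build P, e_tilde (9.44), q_tilde_t (9.46), and a simple estimate of V(e_tilde)."""
    T, N = R.shape
    X = np.column_stack([np.ones(N), beta])        # [1_N, beta]
    W_half = np.real(sqrtm(W))                     # symmetric square root W^{1/2}
    P = null_space((W_half @ X).T)                 # N x (N-K-1), columns orthogonal to W^{1/2} X
    e_tilde = P.T @ W_half @ e_hat                 # (9.44)

    mu1 = f.mean(axis=0)
    F = f - mu1                                    # f_t - mu_1
    V11 = F.T @ F / T
    gamma1 = np.asarray(gamma)[1:]                 # risk premia on the factors
    y = 1.0 - F @ np.linalg.solve(V11, gamma1)     # y_t = 1 - gamma_1' V11^{-1} (f_t - mu_1)
    eps = (R - R.mean(axis=0)) - F @ beta.T        # time-series residuals eps_t
    q_tilde = (eps * y[:, None]) @ W_half @ P      # rows are q_tilde_t' from (9.46)
    V_tilde_hat = q_tilde.T @ q_tilde / T          # (9.45) assuming no serial correlation
    return P, e_tilde, q_tilde, V_tilde_hat
```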
Given (9.40), we can obtain the asymptotic distribution of any quadratic form of sample pricing errors. For example, let $\Omega$ be an $N \times N$ positive definite matrix, and let $\hat{\Omega}$ be a consistent estimator of $\Omega$. When the model is correctly specified, we have
\[
T\hat{e}'\hat{\Omega}\hat{e} \;\stackrel{A}{\sim}\; \sum_{i=1}^{N-K-1} \lambda_i x_i, \tag{9.47}
\]
where the $x_i$'s are independent $\chi^2_1$ random variables, and the $\lambda_i$'s are the $N-K-1$ eigenvalues of
\[
(P'W^{-\frac{1}{2}}\Omega W^{-\frac{1}{2}}P)\,V(\tilde{e}). \tag{9.48}
\]
Using an algorithm due to Imhof (1961) and later improved by Davies (1980) and Lu and King (2002), one can easily compute the cumulative distribution function of a linear combination of independent $\chi^2_1$ random variables. As a result, one can use (9.47) as a specification test of the model.
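In the absence of an Imhof or Davies routine, the tail probability of a weighted sum of independent $\chi^2_1$ random variables can also be approximated by simulation. The sketch below is a simple Monte Carlo substitute for those algorithms, meant only for illustration; `weights` holds the eigenvalues $\lambda_i$ from (9.48) and `stat` the realized value of $T\hat{e}'\hat{\Omega}\hat{e}$.

```python
import numpy as np

def weighted_chi2_pvalue(stat, weights, n_sim=200_000, seed=0):
    """Monte Carlo upper-tail probability of sum_i weights[i] * chi2_1 draws."""
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    draws = rng.chisquare(df=1, size=(n_sim, weights.size)) @ weights
    return float(np.mean(draws >= stat))
```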
There are several interesting choices of $\hat{\Omega}$. The first one is $\hat{\Omega} = \hat{W}$, and the test statistic is simply given by $T\hat{e}'\hat{W}\hat{e} = T\hat{Q}$. In this case, the $\lambda_i$'s are the eigenvalues of $V(\tilde{e})$. The second one is $\hat{\Omega} = \hat{V}(\hat{e})^{+}$, where $\hat{V}(\hat{e})$ is a consistent estimator of $V(\hat{e})$ and $\hat{V}(\hat{e})^{+}$ stands for its pseudo-inverse. This choice of $\hat{\Omega}$ yields the following Wald test statistic:
\[
J_W = T\hat{e}'\hat{V}(\hat{e})^{+}\hat{e} = T\tilde{e}'\hat{V}(\tilde{e})^{-1}\tilde{e} \;\stackrel{A}{\sim}\; \chi^2_{N-K-1}, \tag{9.49}
\]
where $\hat{V}(\tilde{e})$ is a consistent estimator of $V(\tilde{e})$. The advantage of using $J_W$ is that its asymptotic distribution is simply $\chi^2_{N-K-1}$ and does not involve the computation of the distribution of a linear combination of independent $\chi^2_1$ random variables. The disadvantage of using $J_W$ is that the weighting matrix is model dependent, making it problematic to compare the $J_W$'s of different models.
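A minimal sketch of the Wald statistic in (9.49), written in terms of the rotated errors $\tilde{e}$ so that only an ordinary inverse is needed; `e_tilde` and `V_tilde_hat` are assumed to come from the sketch above.

```python
import numpy as np
from scipy.stats import chi2

def wald_specification_test(e_tilde, V_tilde_hat, T):
    """Wald statistic J_W of (9.49) and its asymptotic chi-square p-value."""
    J_W = T * e_tilde @ np.linalg.solve(V_tilde_hat, e_tilde)
    dof = e_tilde.shape[0]                        # N - K - 1 degrees of freedom
    return J_W, 1.0 - chi2.cdf(J_W, dof)
```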
When $q_t$ is serially uncorrelated and $\mathrm{Var}[R_t \mid f_t] = \Sigma$ (conditional homoskedasticity case), we can show that
\[
V(\tilde{e}) = (1 + \gamma_1'V_{11}^{-1}\gamma_1)\,P'W^{\frac{1}{2}}\Sigma W^{\frac{1}{2}}P. \tag{9.50}
\]
For the special case of $W = V_{22}^{-1}$ or $W = \Sigma^{-1}$, we have
\[
V(\tilde{e}) = (1 + \gamma_1'V_{11}^{-1}\gamma_1)\,I_{N-K-1}. \tag{9.51}
\]
If we estimate $V(\tilde{e})$ using $\hat{V}(\tilde{e}) = (1 + \hat{\gamma}_1'\hat{V}_{11}^{-1}\hat{\gamma}_1)I_{N-K-1}$, the Wald test in (9.49) becomes
\[
J_W = T\tilde{e}'\hat{V}(\tilde{e})^{-1}\tilde{e}
    = \frac{T\hat{e}'\hat{V}_{22}^{-1}\hat{e}}{1 + \hat{\gamma}_1'\hat{V}_{11}^{-1}\hat{\gamma}_1}
    = \frac{T\hat{e}'\hat{\Sigma}^{-1}\hat{e}}{1 + \hat{\gamma}_1'\hat{V}_{11}^{-1}\hat{\gamma}_1}
    \;\stackrel{A}{\sim}\; \chi^2_{N-K-1}, \tag{9.52}
\]
and $J_W$ coincides with the cross-sectional regression test (CSRT) proposed by Shanken (1985). Better finite-sample properties of the Wald test can be obtained, as suggested by Shanken (1985), by using the following approximate $F$-test:
\[
J_W \;\stackrel{\mathrm{app.}}{\sim}\; \frac{T(N-K-1)}{T-N+1}\,F_{N-K-1,\,T-N+1}. \tag{9.53}
\]
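The CSRT and its approximate $F$ version in (9.52)-(9.53) can be computed directly from the unrotated pricing errors. The sketch below assumes the conditional homoskedasticity case with $\hat{W} = \hat{\Sigma}^{-1}$; `Sigma_hat`, `V11_hat`, `gamma1_hat`, and `e_hat` are placeholders for the estimates obtained from the two-pass procedure.

```python
import numpy as np
from scipy.stats import f as f_dist

def csrt(e_hat, Sigma_hat, V11_hat, gamma1_hat, T, N, K):
    """Shanken's CSRT (9.52) and the approximate F-test (9.53)."""
    c = 1.0 + gamma1_hat @ np.linalg.solve(V11_hat, gamma1_hat)
    J_W = T * e_hat @ np.linalg.solve(Sigma_hat, e_hat) / c       # (9.52)
    F_stat = J_W * (T - N + 1) / (T * (N - K - 1))                # rescaling implied by (9.53)
    p_val = 1.0 - f_dist.cdf(F_stat, N - K - 1, T - N + 1)
    return J_W, F_stat, p_val
```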
Using the general result in (9.47), one can show that when the model is correctly specified,
\[
T(\hat{\rho}^2 - 1) \;\stackrel{A}{\sim}\; -\sum_{i=1}^{N-K-1} \frac{\lambda_i}{Q_0}\,x_i, \tag{9.54}
\]
and the sample cross-sectional $R^2$ can be used as a specification test.
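Equivalently, under the null $\rho^2 = 1$, $T(1 - \hat{\rho}^2) = T\hat{Q}/\hat{Q}_0$ behaves like a weighted sum of independent $\chi^2_1$ variables with weights $\lambda_i/Q_0$. The following sketch reuses `weighted_chi2_pvalue` from above and treats `V_tilde_hat` and `Q0_hat` as consistent estimates of $V(\tilde{e})$ and $Q_0$.

```python
import numpy as np

def r2_specification_pvalue(rho2_hat, V_tilde_hat, Q0_hat, T):
    """p-value of the rho^2-based specification test implied by (9.54)."""
    lam = np.linalg.eigvalsh(V_tilde_hat) / Q0_hat   # weights lambda_i / Q_0
    return weighted_chi2_pvalue(T * (1.0 - rho2_hat), lam)
```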
When the model is misspecified, i.e., $\rho^2 < 1$, there are two possible asymptotic distributions for $\hat{\rho}^2$. When $\rho^2 = 0$, we have
\[
T\hat{\rho}^2 \;\stackrel{A}{\sim}\; \sum_{i=1}^{K} \tilde{\lambda}_i x_i, \tag{9.55}
\]
where the $x_i$'s are independent $\chi^2_1$ random variables and the $\tilde{\lambda}_i$'s are the eigenvalues of
\[
\bigl[\beta'W\beta - \beta'W1_N(1_N'W1_N)^{-1}1_N'W\beta\bigr]\frac{V(\hat{\gamma}_1)}{Q_0}, \tag{9.56}
\]
where $V(\hat{\gamma}_1)$ is the asymptotic covariance matrix of $\hat{\gamma}_1$ under potentially misspecified models (i.e., based on the expressions of $h_t$ in (9.30)-(9.32)). This asymptotic distribution permits a test of whether the model has any explanatory power for expected returns. It can be shown that $\rho^2 = 0$ if and only if $\gamma_1 = 0_K$. Therefore, one can also test $H_0\colon \rho^2 = 0$ using a Wald test of $H_0\colon \gamma_1 = 0_K$.
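The following is a minimal sketch of such a Wald test; `V_gamma1_hat` is a placeholder for the misspecification-robust asymptotic covariance matrix of $\sqrt{T}(\hat{\gamma}_1 - \gamma_1)$ built from the $h_t$ expressions in (9.30)-(9.32).

```python
import numpy as np
from scipy.stats import chi2

def zero_r2_wald_test(gamma1_hat, V_gamma1_hat, T):
    """Wald test of H0: gamma_1 = 0_K, which is equivalent to H0: rho^2 = 0."""
    stat = T * gamma1_hat @ np.linalg.solve(V_gamma1_hat, gamma1_hat)
    p_val = 1.0 - chi2.cdf(stat, len(gamma1_hat))
    return stat, p_val
```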
When $0 < \rho^2 < 1$, the asymptotic distribution of $\hat{\rho}^2$ is given by
\[
\sqrt{T}(\hat{\rho}^2 - \rho^2) \;\stackrel{A}{\sim}\; N\Bigl(0,\; \sum_{j=-\infty}^{\infty} E[n_t n_{t+j}]\Bigr), \tag{9.57}
\]
where
\[
n_t = 2\bigl[-u_t y_t + (1 - \rho^2)v_t\bigr]/Q_0 \quad \text{for known } W, \tag{9.58}
\]
\[
n_t = \bigl[u_t^2 - 2u_t y_t + (1 - \rho^2)(2v_t - v_t^2)\bigr]/Q_0 \quad \text{for } \hat{W} = \hat{V}_{22}^{-1}, \tag{9.59}
\]
\[
n_t = \bigl[e'\Gamma_t e - 2u_t y_t + (1 - \rho^2)(2v_t - e_0'\Gamma_t e_0)\bigr]/Q_0 \quad \text{for } \hat{W} = \hat{\Sigma}_d^{-1}, \tag{9.60}
\]
with $v_t = e_0'W(R_t - \mu_2)$, where $e_0$ denotes the pricing-error vector of the constant-only model underlying $Q_0$, and $\Gamma_t = \Sigma_d^{-1}\mathrm{Diag}(\epsilon_t\epsilon_t')\Sigma_d^{-1}$.
In the $0 < \rho^2 < 1$ case, $\hat{\rho}^2$ is asymptotically normally distributed around its true value. It is readily verified that the expressions for $n_t$ approach zero when $\rho^2 \to 0$ or $\rho^2 \to 1$. Consequently, the standard error of $\hat{\rho}^2$ tends to be lowest when $\rho^2$ is close to zero or one, and thus it is not monotonic in $\rho^2$. Note that the asymptotic normal distribution of $\hat{\rho}^2$ breaks down for the two extreme cases ($\rho^2 = 0$ or $1$). Intuitively, the normal approximation fails because, by construction, $\hat{\rho}^2$ will always be above zero (even when $\rho^2 = 0$) and below one (even when $\rho^2 = 1$).
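As a practical matter, the standard error implied by (9.57) can be estimated by plugging fitted quantities into one of (9.58)-(9.60) and applying a HAC estimator to the resulting series $\hat{n}_t$. The sketch below uses a simple Newey-West (Bartlett-kernel) variance and assumes the series `n_t` has already been constructed.

```python
import numpy as np

def r2_standard_error(n_t, lags=4):
    """Newey-West standard error for rho2_hat based on the series n_t of (9.57)."""
    n_t = np.asarray(n_t, dtype=float)
    T = n_t.size
    n_t = n_t - n_t.mean()
    var = n_t @ n_t / T                            # j = 0 term
    for j in range(1, lags + 1):
        w = 1.0 - j / (lags + 1.0)                 # Bartlett kernel weight
        var += 2.0 * w * (n_t[j:] @ n_t[:-j]) / T  # autocovariance terms
    return np.sqrt(var / T)                        # se(rho2_hat)
```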