Multiple Model Comparison Tests


Thus far, we have considered the comparison of two competing models. However, given a set of models of interest, one may want to test whether one model, the "benchmark," has the highest $\rho^2$ of all models in the set. This gives rise to a common problem in applied work: if we focus on the statistic that provides the strongest evidence of rejection, without taking into account the process of searching across alternative specifications, there will be a tendency to reject the benchmark more often than the nominal size of the tests suggests. In other words, the true $p$-value will be larger than the one associated with the most extreme statistic.

Therefore, in this section we discuss how to perform model comparison when multiple models are involved. Suppose we have $p$ models. Let $\rho_i^2$ denote the cross-sectional $R^2$ of model $i$. We are interested in testing whether model 1 performs as well as models 2 to $p$. Let $\delta = (\delta_2, \ldots, \delta_p)$, where $\delta_i = \rho_1^2 - \rho_i^2$. We are interested in testing $H_0: \delta \geq 0_r$, where $r = p - 1$.

We consider two different tests of this null hypothesis. The first one is the multivariate inequality test developed by Kan et al. (2010). Numerous studies in statistics focus on tests of inequality constraints on parameters. The relevant work dates back to Bartholomew (1961), Kudo (1963), Perlman (1969), Gourieroux et al. (1982), and Wolak (1987, 1989). Following Wolak (1989), we state the null and alternative hypotheses as

$$H_0: \delta \geq 0_r, \qquad H_1: \delta \in \mathbb{R}^r. \qquad (9.107)$$

We also consider another test based on the reality check of White (2000) that has been used by Chen and Ludvigson (2009). Let $\delta_{\min} = \min_{2 \le i \le p} \delta_i$. Define the null and alternative hypotheses as

$$H_0: \delta_{\min} \geq 0, \qquad H_1: \delta_{\min} < 0. \qquad (9.108)$$

The null hypotheses presented above suggest that no other model outperforms model 1, whereas the alternative hypotheses suggest that at least one of the other models outperforms model 1.

Let $\hat{\delta} = (\hat{\delta}_2, \ldots, \hat{\delta}_p)$, where $\hat{\delta}_i = \hat{\rho}_1^2 - \hat{\rho}_i^2$. For both tests, we assume

$$\sqrt{T}\,(\hat{\delta} - \delta) \stackrel{A}{\sim} N(0_r, \Sigma_{\hat{\delta}}). \qquad (9.109)$$

Starting with the multivariate inequality test, the test statistic is constructed by first solving the following quadratic programming problem:

$$\min_{\delta} \; (\hat{\delta} - \delta)' \hat{\Sigma}_{\hat{\delta}}^{-1} (\hat{\delta} - \delta) \quad \text{s.t.} \quad \delta \geq 0_r, \qquad (9.110)$$

where $\hat{\Sigma}_{\hat{\delta}}$ is a consistent estimator of $\Sigma_{\hat{\delta}}$. Let $\tilde{\delta}$ be the optimal solution of the problem in (9.110). The likelihood ratio test of the null hypothesis is given by

$$LR = T\,(\hat{\delta} - \tilde{\delta})' \hat{\Sigma}_{\hat{\delta}}^{-1} (\hat{\delta} - \tilde{\delta}). \qquad (9.111)$$

For computational purposes, it is convenient to consider the dual problem

$$\min_{\lambda} \; \lambda'\hat{\delta} + \frac{1}{2}\lambda'\hat{\Sigma}_{\hat{\delta}}\lambda \quad \text{s.t.} \quad \lambda \geq 0_r. \qquad (9.112)$$

Let $\tilde{\lambda}$ be the optimal solution of the problem in (9.112). The Kuhn-Tucker test of the null hypothesis is given by

$$KT = T\,\tilde{\lambda}' \hat{\Sigma}_{\hat{\delta}} \tilde{\lambda}. \qquad (9.113)$$

It can be readily shown that $LR = KT$.
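To illustrate the computation, the following sketch forms the $LR$ statistic by solving the primal problem (9.110), assuming that `delta_hat` (the vector $\hat{\delta}$), `Sigma_hat` (an estimate of $\Sigma_{\hat{\delta}}$), and the sample size `T` are supplied by the user; the variable names and the reformulation of (9.110) as a non-negative least squares problem are illustrative choices, not part of the original text.

```python
# A minimal sketch (not the authors' code): compute the LR statistic in
# (9.110)-(9.111) by rewriting the constrained quadratic program as a
# non-negative least squares problem.
import numpy as np
from scipy.optimize import nnls

def lr_statistic(delta_hat, Sigma_hat, T):
    Sigma_inv = np.linalg.inv(Sigma_hat)
    # Factor Sigma_hat^{-1} = A'A so that the objective in (9.110) becomes
    # ||A (delta_hat - delta)||^2, a non-negative least squares problem in delta.
    A = np.linalg.cholesky(Sigma_inv).T
    delta_tilde, rnorm = nnls(A, A @ delta_hat)  # rnorm = ||A (delta_hat - delta_tilde)||
    return T * rnorm ** 2, delta_tilde
```

Equivalently, one could solve the dual (9.112) with a generic quadratic programming solver and compute $KT$ as in (9.113).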

To conduct statistical inference, it is necessary to derive the asymptotic distribution of $LR$. Wolak (1989) shows that under $H_0: \delta = 0_r$ (i.e., the least favorable value of $\delta$ under the null hypothesis), $LR$ has a weighted chi-squared distribution:

$$LR \stackrel{A}{\sim} \sum_{i=0}^{r} w_i\!\left(\Sigma_{\hat{\delta}}^{-1}\right) X_i = \sum_{i=0}^{r} w_{r-i}\!\left(\Sigma_{\hat{\delta}}\right) X_i, \qquad (9.114)$$

where the $X_i$'s are independent $\chi^2$ random variables with $i$ degrees of freedom, $X_0 \equiv 0$, and the weights $w_i$ sum up to one. To compute the $p$-value of $LR$, $\Sigma_{\hat{\delta}}^{-1}$ needs to be replaced with $\hat{\Sigma}_{\hat{\delta}}^{-1}$ in the weight functions.
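As an illustration, once the weights have been computed (ordered as $w_i(\hat{\Sigma}_{\hat{\delta}}^{-1})$, $i = 0, \ldots, r$), the $p$-value follows directly from the mixture in (9.114); the sketch below assumes the weights are passed in from a separate computation.

```python
# A minimal sketch: p-value of LR under the weighted chi-squared distribution
# in (9.114). `weights` is assumed to hold w_0, ..., w_r, computed elsewhere.
from scipy.stats import chi2

def lr_pvalue(lr_stat, weights):
    r = len(weights) - 1
    # The i = 0 component is the point mass X_0 = 0, so it contributes only when lr_stat <= 0.
    pval = weights[0] if lr_stat <= 0 else 0.0
    for i in range(1, r + 1):
        pval += weights[i] * chi2.sf(lr_stat, df=i)
    return pval
```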

The biggest hurdle in determining the $p$-value of this multivariate inequality test is the computation of the weights. For a given $r \times r$ covariance matrix $\Sigma = (\sigma_{ij})$, the expressions for the weights $w_i(\Sigma)$, $i = 0, \ldots, r$, are given in Kudo (1963). The weights depend on $\Sigma$ through the correlation coefficients $\rho_{ij} = \sigma_{ij}/(\sigma_i \sigma_j)$. When $r = 1$, $w_0 = w_1 = 1/2$. When $r = 2$,

$$w_0 = \frac{1}{2} - w_2, \qquad (9.115)$$

$$w_1 = \frac{1}{2}, \qquad (9.116)$$

$$w_2 = \frac{1}{4} + \frac{\arcsin(\rho_{12})}{2\pi}. \qquad (9.117)$$

When $r = 3$,

$$w_0 = \frac{1}{2} - w_2, \qquad (9.118)$$

$$w_1 = \frac{1}{2} - w_3, \qquad (9.119)$$

$$w_2 = \frac{3}{8} + \frac{\arcsin(\rho_{12 \cdot 3}) + \arcsin(\rho_{13 \cdot 2}) + \arcsin(\rho_{23 \cdot 1})}{4\pi}, \qquad (9.120)$$

$$w_3 = \frac{1}{8} + \frac{\arcsin(\rho_{12}) + \arcsin(\rho_{13}) + \arcsin(\rho_{23})}{4\pi}, \qquad (9.121)$$

where

$$\rho_{ij \cdot k} = \frac{\rho_{ij} - \rho_{ik}\rho_{jk}}{\left[(1 - \rho_{ik}^2)(1 - \rho_{jk}^2)\right]^{\frac{1}{2}}}. \qquad (9.122)$$

For $r > 3$, the computation of the weights is more complicated. Following Kudo (1963), let $P = \{1, \ldots, r\}$. There are $2^r$ subsets of $P$, which are indexed by $M$. Let $n(M)$ be the number of elements in $M$ and $M'$ be the complement of $M$ relative to $P$. Define $\Sigma_M$ as the submatrix that consists of the rows and columns in the set $M$, $\Sigma_{M'}$ as the submatrix that consists of the rows and columns in the set $M'$, $\Sigma_{M,M'}$ as the submatrix with rows corresponding to the elements in $M$ and columns corresponding to the elements in $M'$ ($\Sigma_{M',M}$ is similarly defined), and $\Sigma_{M \cdot M'} = \Sigma_M - \Sigma_{M,M'}\Sigma_{M'}^{-1}\Sigma_{M',M}$. Kudo (1963) shows that

$$w_i(\Sigma) = \sum_{M:\, n(M) = i} P\!\left(\Sigma_{M'}^{-1}\right) P\!\left(\Sigma_{M \cdot M'}\right), \qquad (9.123)$$

where $P(A)$ is the probability for a multivariate normal distribution with zero mean and covariance matrix $A$ to have all positive elements. In the above equation, we use the convention that $P[\Sigma_{\emptyset \cdot P}] = 1$ and $P[\Sigma_{\emptyset}^{-1}] = 1$. Using (9.123), we have $w_0(\Sigma) = P(\Sigma^{-1})$ and $w_r(\Sigma) = P(\Sigma)$.
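A direct, if brute-force, implementation of (9.123) is sketched below. It evaluates the positive orthant probability $P(A)$ as the $N(0, A)$ CDF at the origin (which equals $P(A)$ by symmetry) using scipy's numerical multivariate normal CDF; this is a stand-in for the more accurate Childs/Sun approach used by Kan et al. (2010), and the function names are ours.

```python
# A sketch of the Kudo (1963) weights in (9.123), assuming Sigma is an r x r
# positive definite covariance matrix. The orthant probability P(A) is computed
# numerically via the multivariate normal CDF (by symmetry, P(z > 0) = P(z < 0)).
import itertools
import numpy as np
from scipy.stats import multivariate_normal

def orthant_prob(A):
    A = np.atleast_2d(np.asarray(A))
    if A.size == 0:
        return 1.0  # convention for empty index sets
    d = A.shape[0]
    return multivariate_normal(mean=np.zeros(d), cov=A).cdf(np.zeros(d))

def kudo_weights(Sigma):
    r = Sigma.shape[0]
    w = np.zeros(r + 1)
    for i in range(r + 1):
        for M in itertools.combinations(range(r), i):
            M = list(M)
            Mc = [j for j in range(r) if j not in M]  # complement M'
            # Sigma_{M.M'} = Sigma_M - Sigma_{M,M'} Sigma_{M'}^{-1} Sigma_{M',M}
            if not M:
                cond = np.empty((0, 0))
            elif not Mc:
                cond = Sigma[np.ix_(M, M)]
            else:
                S_Mc = Sigma[np.ix_(Mc, Mc)]
                cond = (Sigma[np.ix_(M, M)]
                        - Sigma[np.ix_(M, Mc)] @ np.linalg.solve(S_Mc, Sigma[np.ix_(Mc, M)]))
            p1 = orthant_prob(np.linalg.inv(Sigma[np.ix_(Mc, Mc)])) if Mc else 1.0
            w[i] += p1 * orthant_prob(cond)
    return w  # w[i] = w_i(Sigma); the weights sum to one
```

For $r = 1$ this returns $w_0 = w_1 = 1/2$, and for $r = 2$ it reproduces (9.115)-(9.117); for larger $r$, the accuracy is limited by the numerical CDF, which is why Kan et al. (2010) rely on the closed-form expansion discussed below.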

Researchers have typically used a Monte Carlo approach to compute the positive orthant probability $P(A)$. However, the Monte Carlo approach is not efficient because it requires a large number of simulations to achieve an accuracy of a few digits, even when $r$ is relatively small.
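For concreteness, here is what the basic Monte Carlo estimator looks like (a sketch with our own function names); its standard error shrinks only at rate $1/\sqrt{n}$ in the number of draws, which is the inefficiency referred to above.

```python
# A minimal sketch of the Monte Carlo estimator of the positive orthant
# probability P(A): draw from N(0, A) and count the draws falling in the
# positive orthant.
import numpy as np

def orthant_prob_mc(A, n_draws=1_000_000, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(A.shape[0]), A, size=n_draws)
    return np.all(z > 0, axis=1).mean()

# For r = 2 with correlation 0.5, the exact value is 1/4 + arcsin(0.5)/(2*pi) = 1/3.
A = np.array([[1.0, 0.5], [0.5, 1.0]])
print(orthant_prob_mc(A), 0.25 + np.arcsin(0.5) / (2 * np.pi))
```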

To overcome this problem, Kan et al. (2010) rely on a formula for the positive orthant probability due to Childs (1967) and Sun (1988a). Let $R = (r_{ij})$ be the correlation matrix corresponding to $A$. Childs (1967) and Sun (1988a) show that

$$P_{2k}(A) = \frac{1}{2^{2k}} + \frac{1}{2^{2k-1}\pi} \sum_{1 \le i < j \le 2k} \arcsin(r_{ij}) + \sum_{j=2}^{k} \frac{1}{2^{2k-j}\pi^j} \sum_{1 \le i_1 < \cdots < i_{2j} \le 2k} I_{2j}\!\left(R_{(i_1,\ldots,i_{2j})}\right), \qquad (9.124)$$

$$P_{2k+1}(A) = \frac{1}{2^{2k+1}} + \frac{1}{2^{2k}\pi} \sum_{1 \le i < j \le 2k+1} \arcsin(r_{ij}) + \sum_{j=2}^{k} \frac{1}{2^{2k+1-j}\pi^j} \sum_{1 \le i_1 < \cdots < i_{2j} \le 2k+1} I_{2j}\!\left(R_{(i_1,\ldots,i_{2j})}\right), \qquad (9.125)$$

where $R_{(i_1,\ldots,i_{2j})}$ denotes the submatrix consisting of the $(i_1, \ldots, i_{2j})$th rows and columns of $R$, and

$$I_{2j}(\Omega) = \frac{(-1)^j}{(2\pi)^j} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \left(\prod_{i=1}^{2j} \frac{1}{\omega_i}\right) \exp\!\left(-\frac{\omega'\Omega\,\omega}{2}\right) d\omega_1 \cdots d\omega_{2j}, \qquad (9.126)$$

where $\Omega$ is a $2j \times 2j$ covariance matrix and $\omega = (\omega_1, \ldots, \omega_{2j})'$. Sun (1988a) provides a recursive relation for $I_{2j}(\Omega)$ that allows us to obtain $I_{2j}$ starting from $I_2$. Sun's formula enables us to compute the $2j$th order multivariate integral $I_{2j}$ using a $(j-1)$th order multivariate integral, which can be obtained numerically using the Gauss-Legendre quadrature method. Sun (1988b) provides a Fortran subroutine to compute $P(A)$ for $r \le 9$. Kan et al. (2010) improve on Sun's program and are able to accurately compute $P(A)$, and hence $w_i(\Sigma)$, for $r \le 11$.

Turning to the $\delta_{\min}$ test based on White (2000), one can use the sample counterpart of $\delta_{\min}$:

$$\hat{\delta}_{\min} = \min_{2 \le i \le p} \hat{\delta}_i \qquad (9.127)$$

to test (9.108). To determine the $p$-value of $\hat{\delta}_{\min}$, we need to identify the least favorable value of $\delta$ under the null hypothesis. It can be easily shown that the least favorable value of $\delta$ under the null hypothesis occurs at $\delta = 0_r$. It follows that, asymptotically,

$$P\!\left[\sqrt{T}\,\hat{\delta}_{\min} < c\right] \to P\!\left[\min_{1 \le i \le r} z_i < c\right] = 1 - P\!\left[\min_{1 \le i \le r} z_i > c\right] = 1 - P\!\left[z_1 > c, \ldots, z_r > c\right] = 1 - P\!\left[z_1 < -c, \ldots, z_r < -c\right], \qquad (9.128)$$

where $z = (z_1, \ldots, z_r)' \sim N(0_r, \Sigma_{\hat{\delta}})$, and the last equality follows from symmetry since $E[z] = 0_r$. Therefore, to compute the asymptotic $p$-value one needs to evaluate the cumulative distribution function of a multivariate normal distribution.
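A sketch of this computation, again assuming `delta_hat`, an estimate `Sigma_hat` of $\Sigma_{\hat{\delta}}$, and the sample size `T` are available (the function name is ours):

```python
# A minimal sketch of the asymptotic p-value of the delta_min test via (9.128),
# evaluated at the least favorable value delta = 0_r.
import numpy as np
from scipy.stats import multivariate_normal

def delta_min_pvalue(delta_hat, Sigma_hat, T):
    c = np.sqrt(T) * np.min(delta_hat)  # observed value of sqrt(T) * delta_hat_min
    r = len(delta_hat)
    # P[min_i z_i < c] = 1 - P[z_1 < -c, ..., z_r < -c], with z ~ N(0_r, Sigma_hat)
    joint = multivariate_normal(mean=np.zeros(r), cov=Sigma_hat).cdf(np.full(r, -c))
    return 1.0 - joint
```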

Note that both tests crucially depend on the asymptotic normality assumption in (9.109). Sufficient conditions for this assumption to hold are (1) $0 < \rho_i^2 < 1$, and (2) the implied stochastic discount factors of the different models are distinct. Even though the multivariate normality assumption may not always hold at the boundary point of the null hypothesis (i.e., $\delta = 0_r$), it is still possible to compute the $p$-value as long as we assume that the true $\delta$ is not at the boundary point of the null hypothesis. There are, however, cases in which this assumption does not hold. For example, if model 2 nests model 1, then we cannot have $\delta_2 > 0$. As a result, the null hypothesis $H_0: \delta_2 \geq 0$ becomes $H_0: \delta_2 = 0$. Under this null hypothesis, $\sqrt{T}\hat{\delta}_2$ no longer has an asymptotic normal distribution, and both the multivariate inequality test and the $\delta_{\min}$ test will break down.

Therefore, when nested models are involved, the two tests need to be modified.

If model 1 nests some of the competing models, then those models that are nested by model 1 will not be included in the model comparison tests. The reason is that these models are clearly dominated by model 1, so there is no need to perform tests in the presence of these models. If some of the competing models are nested by other competing models, then the smaller models will not be included in the model comparison tests. This is reasonable since, if model 1 outperforms a larger model, it will also outperform the smaller models that are nested by the larger model. With these two types of models eliminated from the model comparison tests, the remaining models will not nest each other, and the multivariate asymptotic normality assumption on $\sqrt{T}(\hat{\delta} - \delta)$ can be justified.
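The pruning logic just described can be sketched as follows, representing each model by its set of factors and treating "model A nests model B" as B's factor set being contained in A's; both this representation and the function name are simplifying assumptions made for illustration.

```python
# A sketch (not from the original text) of the pruning step for non-nested
# multiple model comparison: drop competitors nested by the benchmark, then
# drop competitors nested by another remaining competitor.
def prune_competitors(benchmark_factors, competitor_factors):
    benchmark = set(benchmark_factors)
    # Competitors nested by model 1 are dominated by it and can be dropped.
    kept = {name: set(f) for name, f in competitor_factors.items()
            if not set(f) <= benchmark}
    # Competitors strictly nested by another remaining competitor can be dropped:
    # beating the larger model implies beating the models it nests.
    return {a: fa for a, fa in kept.items()
            if not any(fa < fb for b, fb in kept.items() if b != a)}
```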

Finally, if model 1 is nested by some competing models, one should separate the set of competing models into two subsets. The first subset includes the competing models that nest model 1. To test whether model 1 performs as well as the models in this subset, one can construct a model $M$ that contains all the distinct factors in this subset. It can be easily verified that model 1 performs as well as the models in this subset if and only if $\rho_1^2 = \rho_M^2$. In this case, a test of $H_0: \rho_1^2 = \rho_M^2$ can simply be performed using the model comparison tests for nested models described earlier. The second subset includes the competing models that do not nest model 1. For this second subset, we can use the non-nested multiple model comparison tests as before. If we perform each test at a significance level of $\alpha/2$ and accept the null hypothesis if we fail to reject in both tests, then by the Bonferroni inequality, the size of the joint test is less than or equal to $\alpha$.
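The Bonferroni combination at the end of this procedure can be summarized in a few lines; the two $p$-values are assumed to come from the nested test of $H_0: \rho_1^2 = \rho_M^2$ and from the non-nested multiple comparison test, respectively, each computed elsewhere.

```python
# A minimal sketch of the Bonferroni combination: run each of the two tests at
# level alpha/2 and reject the benchmark if either rejects; by the Bonferroni
# inequality the size of the joint test is at most alpha.
def joint_rejection(p_nested, p_non_nested, alpha=0.05):
    return (p_nested < alpha / 2) or (p_non_nested < alpha / 2)
```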

