According to the random walk hypothesis, asset prices follow a geometric random walk. Although market efficiency suggests that a geometric random walk may be appropriate for modeling price data, for the analyst, the important issue is whether or not observed prices behave like observations from a geometric random walk. Hence, a large number of statistical tests of the random walk model have been proposed.
It is worth noting that, although the term “random walk” refers to the asset log-prices, tests of the random walk model are typically based on the properties of the log-returns, which are the increments corresponding to the log-prices. Thus, these tests are designed to detect statistical relationships in the log-returns of an asset; such relationships would contradict the random walk hypothesis.
In this section, we consider four simple tests that are useful in this context; each test is designed to detect departures from some form of a random walk for log-prices. The first two tests considered (the Box–Ljung test and the variance-ratio test) test RW3 and the remaining two (the runs test and the rescaled range test) test RW1. Note that, because of the relationships among the three forms of the random walk model, rejection of the hypothesis that RW3 holds for log-prices implies rejection of RW2 and RW1 for log-prices.
A Test Based on the Sample Autocorrelation Function
Let $p_0, p_1, p_2, \ldots$ denote log-prices of a given asset and let $r_1, r_2, \ldots$ denote the corresponding log-returns. Under RW3 for $\{p_t : t = 0, 1, 2, \ldots\}$, $r_1, r_2, \ldots$ are uncorrelated random variables, each with mean $\mu$ and standard deviation $\sigma$.
Let $\rho(\cdot)$ denote the autocorrelation function of $\{r_t : t = 1, 2, \ldots\}$. Then, under RW3, $\rho(h) = 0$ for all $h = 1, 2, \ldots$. Therefore, a test of the RW3 version of the random walk hypothesis may be based on the sample autocorrelation function, as described in Section 2.5.
Suppose we observe $T$ periods of return data, $r_1, r_2, \ldots, r_T$, and let $\hat{\rho}(h)$, $h = 1, 2, \ldots$, denote the sample autocorrelation function. Although a test of the random walk hypothesis can be based on any one sample autocorrelation, a better approach is to construct a test statistic based on several sample autocorrelations, such as the first $m$ sample autocorrelations, for some given value of $m$. A test statistic of this type is given by
$$B = T(T+2) \sum_{h=1}^{m} \frac{\hat{\rho}(h)^2}{T-h}.$$
Note that $B$ tends to be large when the sample autocorrelations are far from 0. Under the null hypothesis that RW3 holds for log-prices, so that $\rho(1) = \rho(2) = \cdots = \rho(m) = 0$, $B$ has approximately a chi-squared distribution with $m$ degrees of freedom, and the null hypothesis is rejected for large values of $B$. This is known as the Box–Ljung test.
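The formula for $B$ can be checked directly. The following is a minimal sketch, not the text's own code: the helper name box_ljung and the simulated series r_sim are assumptions introduced for illustration. It computes $B$ from the sample autocorrelations returned by acf and compares the result with the built-in Box.test function.

```r
# Box-Ljung statistic computed directly from the sample autocorrelations.
# box_ljung() is a hypothetical helper name, not part of base R.
box_ljung <- function(r, m) {
  T_len <- length(r)
  # acf() returns autocorrelations at lags 0, 1, ..., m; drop lag 0
  rho_hat <- as.numeric(acf(r, lag.max = m, plot = FALSE)$acf)[-1]
  B <- T_len * (T_len + 2) * sum(rho_hat^2 / (T_len - seq_len(m)))
  # Upper-tail probability of a chi-squared distribution with m df
  p_value <- pchisq(B, df = m, lower.tail = FALSE)
  c(B = B, p.value = p_value)
}

set.seed(1)                                   # simulated data, illustration only
r_sim <- rnorm(60, mean = 0.01, sd = 0.05)    # 60 artificial "monthly" returns
by_hand  <- box_ljung(r_sim, m = 12)
built_in <- Box.test(r_sim, lag = 12, type = "Ljung-Box")
by_hand
built_in$statistic
```

On any input series the two statistics should agree up to rounding, since both are based on the same sample autocorrelations.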
In order to carry out the test, $m$, the number of lags used to compute $B$, must be selected. When the data consist of five years of monthly returns, a relatively small value of $m$ should be used; for example, $m = 12$ is a reasonable choice. For longer series of daily returns, a larger value of $m$ could be used.
Example 3.3 Consider the monthly log-returns for Wal-Mart stock, stored in the variable wmt.m.logret. To compute the test statistic $B$ and the corresponding $p$-value, we may use the Box.test function.
> Box.test(wmt.m.logret, lag=12, type="L")
Box-Ljung test
data: wmt.m.logret
X-squared = 9.9011, df = 12, p-value = 0.6246
The argument lag in the Box.test function specifies $m$, the number of lags to be used, and the type="L" argument (an abbreviation of "Ljung-Box") specifies that the Box–Ljung test be used.
These results indicate that, for $m = 12$, $B = 9.9011$ and the $p$-value is 0.6246. Therefore, based on this test, there is no evidence against the hypothesis that the log-returns are uncorrelated (at least up to lag 12), which is consistent with our informal conclusion in Section 2.5.
Variance-Ratio Test
The variance-ratio test is based on the following observation. Suppose that RW3 holds for the log-prices, so that $r_1, r_2, \ldots, r_T$ each has mean $\mu$ and standard deviation $\sigma$ and $r_t, r_s$ are uncorrelated for all $t \neq s$. Then
$$E(r_t + r_{t-1}) = E(r_t) + E(r_{t-1}) = \mu + \mu = 2\mu$$
and
$$\mathrm{Var}(r_t + r_{t-1}) = \mathrm{Var}(r_t) + \mathrm{Var}(r_{t-1}) = \sigma^2 + \sigma^2 = 2\sigma^2.$$
More generally, $r_t + r_{t-1} + \cdots + r_{t-q+1}$ has mean $q\mu$ and variance $q\sigma^2$. Note that $r_t + r_{t-1} + \cdots + r_{t-q+1}$ is simply the $q$-period log-return at time $t$.
Therefore, if RW3 holds for log-prices, then there is a simple relationship between the variance of multiperiod log-returns and the variance of single-period log-returns.
This fact can be used to test RW3 by comparing an estimate of the variance of
$$r_t + r_{t-1} + \cdots + r_{t-q+1}, \quad t = q, \ldots, T$$
to an estimate of the variance of $r_1, r_2, \ldots, r_T$; if RW3 holds, the ratio of these estimates should be roughly $q$.
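The scaling of the variance with $q$ can be illustrated by simulation. The sketch below uses artificial i.i.d. normal returns (an assumption made only for illustration; the values of the mean, standard deviation, and series length are arbitrary) and compares the sample variance of overlapping $q$-period sums with the sample variance of the single-period values.

```r
set.seed(42)                        # simulated i.i.d. returns, illustration only
q <- 3
r_sim <- rnorm(100000, mean = 0.01, sd = 0.05)
T_len <- length(r_sim)

# Overlapping q-period sums r_t + r_{t-1} + ... + r_{t-q+1}, t = q, ..., T,
# obtained as differences of cumulative sums
cs  <- cumsum(r_sim)
r_q <- cs[q:T_len] - c(0, cs[1:(T_len - q)])

ratio <- var(r_q) / var(r_sim)      # expected to be close to q for such data
ratio
```

For uncorrelated returns the ratio should be near $q$; positive serial correlation would tend to push it above $q$ and negative serial correlation below $q$.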
For a given value of $q$, let
$$S_q^2 = \frac{\sum_{t=q}^{T} \left(r_t + r_{t-1} + \cdots + r_{t-q+1} - q\bar{r}\right)^2}{T-q},$$
which is essentially the sample variance of $r_t + r_{t-1} + \cdots + r_{t-q+1}$, $t = q, q+1, \ldots, T$, with the divisor equal to the sample size minus one, except that instead of subtracting the sample mean of these values we subtract $q\bar{r}$, where
$$\bar{r} = \frac{1}{T} \sum_{t=1}^{T} r_t.$$
Let $S^2$ denote the usual sample variance of $r_1, r_2, \ldots, r_T$. The variance-ratio statistic is given by
$$V_q = \frac{T}{T-q+1}\,\frac{1}{q}\,\frac{S_q^2}{S^2}.$$
If RW3 holds, we expect that
$$\frac{1}{q}\,\frac{S_q^2}{S^2} \doteq 1;$$
the factor $T/(T-q+1)$ is an adjustment term designed to improve the accuracy of the normal approximation to the distribution of $V_q$ in small samples.
Note that, like the Box–Ljung test, the variance-ratio test is a test of the correlation structure of the log-returns.
Under the null hypothesis that RW3 holds for the log-prices, $\sqrt{T}(V_q - 1)$ is approximately normally distributed with mean 0 and variance given by $2(2q-1)(q-1)/(3q)$. Therefore, the standardized test statistic is
$$\bar{V}_q = \sqrt{T}\,\sqrt{\frac{3q}{2(2q-1)(q-1)}}\,(V_q - 1)$$
and the null hypothesis is rejected for large values of $|\bar{V}_q|$. The $p$-value of the test is given by
$$P(|Z| > |\bar{V}_{q,0}|)$$
where $Z$ has a standard normal distribution and $\bar{V}_{q,0}$ is the observed value of $\bar{V}_q$; hence,
$$P(|Z| > |\bar{V}_{q,0}|) = 2\left[1 - \Phi(|\bar{V}_{q,0}|)\right]$$
where $\Phi$ denotes the standard normal distribution function.
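The steps above can be collected into a single function. This is a minimal sketch under stated assumptions: the name variance_ratio_test and the simulated input are hypothetical, and the function assumes $2 \le q < T$ with no missing values.

```r
# Variance-ratio statistic V_q, its standardized version, and a two-sided
# p-value based on the normal approximation.
# variance_ratio_test() is a hypothetical helper name; it assumes 2 <= q < T.
variance_ratio_test <- function(r, q) {
  T_len <- length(r)
  x  <- r - mean(r)                       # mean-corrected returns
  cs <- cumsum(x)
  # q-period sums minus q * rbar, for t = q, ..., T
  x_q  <- cs[q:T_len] - c(0, cs[seq_len(T_len - q)])
  S2_q <- sum(x_q^2) / (T_len - q)
  V_q  <- (T_len / (T_len - q + 1)) * (1 / q) * S2_q / var(r)
  V_bar <- sqrt(T_len) * sqrt(3 * q / (2 * (2 * q - 1) * (q - 1))) * (V_q - 1)
  p_value <- 2 * (1 - pnorm(abs(V_bar)))
  c(V = V_q, V.bar = V_bar, p.value = p_value)
}

set.seed(2)                               # simulated data, illustration only
r_sim <- rnorm(60, mean = 0.01, sd = 0.05)
variance_ratio_test(r_sim, q = 3)
```

Using cumulative sums avoids writing out the $q$ shifted vectors by hand, so the same function can be used for any admissible $q$.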
Example 3.4 Consider the statistic $V_3$ applied to the log-returns on Wal-Mart stock. This statistic may be calculated using the following commands:
> x<-wmt.m.logret - mean(wmt.m.logret)
> x3<-x[3:60] + x[2:59] + x[1:58]
> (60/58)*(1/3)*(sum(x3^2)/57)/var(x)
[1] 0.90135
Here, x is the vector of mean-corrected log-returns for Wal-Mart stock and x3 is the vector of three-month mean-corrected log-returns,
$$r_t - \bar{r} + r_{t-1} - \bar{r} + r_{t-2} - \bar{r}, \quad t = 3, 4, \ldots, T.$$
The observed value of the statistic $V_3$ is 0.90135.
To compute a $p$-value, we use the fact that, under the null hypothesis, $V_3$ has approximately mean 1 and variance
$$\frac{2(2q-1)(q-1)}{3q}\,\frac{1}{T} = \frac{2(5)(2)}{9}\,\frac{1}{60} = \frac{1}{27}.$$
Therefore, the observed value of the standardized test statistic is given by
$$\bar{V}_{3,0} = \sqrt{27}\,(0.90135 - 1) = -0.51260;$$
this corresponds to a two-tailed $p$-value of 0.6082, calculated by
> 2*(1-pnorm(0.51260))
[1] 0.6082
Therefore, there is no evidence to reject the null hypothesis that Wal-Mart log-prices follow RW3. Note that here the function pnorm is the standard normal distribution function.
A similar conclusion is obtained using $V_6$. The observed value of $V_6$ is 0.610, corresponding to a $p$-value of 0.222.
Runs Test
Not all types of dependence are reflected in correlation. Another approach to detecting relationships in a series of returns is to look at patterns of above-average and below-average returns.
More formally, let $r_1, r_2, \ldots, r_T$ denote a sequence of log-returns and let $\mathrm{med}(r_1, r_2, \ldots, r_T)$ denote the sample median of $r_1, r_2, \ldots, r_T$. For $t = 1, 2, \ldots, T$, define
$$G_t = \begin{cases} 1 & \text{if } r_t > \mathrm{med}(r_1, \ldots, r_T) \\ 0 & \text{if } r_t \le \mathrm{med}(r_1, \ldots, r_T). \end{cases}$$
Then $G_1, G_2, \ldots, G_T$ is a sequence of indicator variables showing if the return in a given period exceeds the median ($G_t = 1$) or not ($G_t = 0$).
Suppose that $r_1, r_2, \ldots, r_T$ are i.i.d. random variables; that is, suppose that RW1 holds. Then $G_1, G_2, \ldots, G_T$ should exhibit a random pattern of zeros and ones. On the other hand, if $r_1, r_2, \ldots, r_T$ have some type of dependence structure, or if certain features of the distribution of $r_t$ depend on $t$, then there may be patterns of zeros and ones; for instance, if $r_t, r_{t+1}$ are dependent, then $G_{t+1} = 1$ may be more likely if $G_t = 1$ than if $G_t = 0$.
Therefore, we can test the hypothesis that $p_0, p_1, p_2, \ldots, p_T$ follow RW1 by counting the number of "runs" in the sequence $G_1, G_2, \ldots, G_T$, where a run is defined as a maximal sequence of consecutive identical symbols. For example, if the sequence of indicator variables is 0 0 0 1 1 1 0 0 1 1, there are four runs (000, 111, 00, 11), while if the sequence is 0 1 0 0 1 1 0 1 1 0, there are seven runs.
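Counting runs is straightforward with the base R function rle, which computes run lengths. The sketch below uses two hypothetical helper names, median_indicator and count_runs, introduced here for illustration, and applies them to the two sequences in the text.

```r
# Indicator of an above-median return (G_t in the text).
# median_indicator() and count_runs() are hypothetical helper names.
median_indicator <- function(r) as.integer(r > median(r))

# Number of runs: rle() returns the lengths of maximal blocks of
# consecutive identical values, so the number of blocks is the run count.
count_runs <- function(g) length(rle(g)$lengths)

g1 <- c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1)
g2 <- c(0, 1, 0, 0, 1, 1, 0, 1, 1, 0)
count_runs(g1)   # four runs
count_runs(g2)   # seven runs
```

For a vector of returns r, count_runs(median_indicator(r)) would give the observed number of runs about the median.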
For convenience, assume that there are $T/2$ zeros in the sequence $G_1, G_2, \ldots, G_T$ and $T/2$ ones. This holds if $T$ is even and $r_1, r_2, \ldots, r_T$ are distinct. In general, the number of zeros and the number of ones are both approximately equal to $T/2$ with high probability and the results described as follows continue to hold.
Let $M_0$ denote the observed number of runs and let $M$ denote the number of runs in a random sequence of length $T$ with $T/2$ ones and $T/2$ zeros; to compute a $p$-value for the test, we can compare $M_0$ to the distribution of $M$. Although the exact distribution of $M$ is complicated, it may be shown that $M$ is approximately distributed as a binomial random variable with parameters $T$ and $1/2$. To see why this might hold, consider building a sequence of length $T$ by adding randomly selected ones and zeros, one step at a time; at each stage after the first, there is roughly a 50% chance that the new symbol differs from the previous one, which increases the number of runs by one. This fact may be used to calculate a $p$-value for the test.
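The quality of this approximation can be examined by simulation. The following sketch is illustrative only: the helper name count_runs_sim, the choice $T = 60$, and the number of replications are assumptions. It shuffles a sequence with $T/2$ zeros and $T/2$ ones many times and records the number of runs in each shuffle.

```r
# Simulated distribution of the number of runs M in a random arrangement
# of T/2 zeros and T/2 ones (illustrative settings, not from the text).
set.seed(123)
T_len <- 60
n_rep <- 20000
count_runs_sim <- function(g) length(rle(g)$lengths)   # hypothetical helper

base_seq <- rep(c(0, 1), each = T_len / 2)
M_sim <- replicate(n_rep, count_runs_sim(sample(base_seq)))

c(mean = mean(M_sim), var = var(M_sim))    # simulated mean and variance of M
c(mean = T_len / 2, var = T_len / 4)       # binomial(T, 1/2) mean and variance
```

Comparing the two printed vectors gives a rough sense of how close the binomial description is for a series of this length.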
Example 3.5 Consider the calculation of $M_0$ and the corresponding $p$-value for log-returns on Wal-Mart stock, stored in the variable wmt.m.logret. These calculations can be performed using the function runs.test, which is available in the randtests package (Caeiro and Mateus 2014).
> library(randtests)
Warning message:
package randtests was built under R version 3.2.3
> runs.test(wmt.m.logret)
Runs Test
data: wmt.m.logret
statistic = 1.0417, runs = 35, n1 = 30, n2 = 30, n = 60, p-value = 0.2976
alternative hypothesis: nonrandomness
Therefore, the observed number of runs is $M_0 = 35$ and the $p$-value of the test is 0.2976, so that there is no evidence to reject the null hypothesis that RW1 holds for the log-prices.
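The reported statistic can be reproduced from the counts in the output. The sketch below assumes that the statistic is the usual standardized number of runs under a normal approximation (the Wald–Wolfowitz form); this is an assumption about how the output was computed, although the resulting numbers agree with those shown above.

```r
# Reproduce the runs-test statistic from the counts reported in Example 3.5,
# assuming the standard normal approximation for the number of runs.
runs <- 35; n1 <- 30; n2 <- 30
n <- n1 + n2

mu     <- 1 + 2 * n1 * n2 / n                               # mean number of runs
sigma2 <- 2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1)) # variance of runs
z      <- (runs - mu) / sqrt(sigma2)                        # standardized statistic
p_value <- 2 * (1 - pnorm(abs(z)))                          # two-sided p-value

round(c(statistic = z, p.value = p_value), 4)
```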
Rescaled Range Test
The Box–Ljung, variance-ratio, and runs tests are useful for detecting association among log-returns from nearby time periods; however, another way in which the random walk hypothesis may fail is if the log-returns are related over a long period of time. For instance, there may be multiyear periods during which the monthly log-returns are generally (but not always) large. The rescaled range test is designed to detect this type of long-range dependence.
The test statistic is given by
$$H = \frac{\max_{1 \le k \le T} \sum_{t=1}^{k} (r_t - \bar{r}) - \min_{1 \le l \le T} \sum_{t=1}^{l} (r_t - \bar{r})}{S\sqrt{T}}$$
where $S$ is the sample standard deviation of $r_1, r_2, \ldots, r_T$; that is, $H$ is the range of the partial sums
$$\sum_{t=1}^{k} (r_t - \bar{r}), \quad k = 1, 2, \ldots, T,$$
rescaled by $S\sqrt{T}$.
Large values of $H$ are evidence against the null hypothesis that RW1 holds for the log-prices.
A large value of $H$ indicates that there are times $t_0, t_1$ such that
$$\sum_{t=1}^{t_1} (r_t - \bar{r})$$
is a large positive value and
$$\sum_{t=1}^{t_0} (r_t - \bar{r})$$
is a large negative value; note that the values $r_t - \bar{r}$, $t = 1, 2, \ldots, T$, must sum to 0. That is, there is a time period over which the log-returns differ greatly from their sample mean.
To determine if the observed value of $H$ is statistically significant, we compare it to the critical values in Table 3.1.
TABLE 3.1
Critical Values for the Rescaled Range Test
Significance Level    Critical Value
0.10                  1.620
0.05                  1.747
0.025                 1.862
0.005                 2.098
Example 3.6 Consider calculation of $H$ for the Wal-Mart log-returns in wmt.m.logret. To calculate $H$ in R, we can use the cumsum function, which returns the cumulative sums of the values in a vector:
> x<-c(1, 3, -2, -4, 5)
> cumsum(x)
[1]  1  4  2 -2  3
Let
$$H_1 = \max_{1 \le k \le T} \sum_{j=1}^{k} (r_j - \bar{r})$$
and
$$H_2 = \min_{1 \le l \le T} \sum_{j=1}^{l} (r_j - \bar{r})$$
so that $H = (H_1 - H_2)/(S\sqrt{T})$.
For the Wal-Mart monthly log-returns, $H_1$ and $H_2$ may be calculated by
> H1<-max(cumsum(wmt.m.logret-mean(wmt.m.logret)))
> H1
[1] 0.085685
> H2<-min(cumsum(wmt.m.logret-mean(wmt.m.logret)))
> H2
[1] -0.19496
and $H$ is given by
> H<-(H1 - H2)/(sd(wmt.m.logret)*(60^.5))
> H
[1] 0.83819
To compute the $p$-value for a test of RW1, we compare the observed value of $H$ to the critical values in Table 3.1. Because 0.83819 is smaller than 1.620, the critical value for significance level 0.10, it follows that the $p$-value is greater than 0.10.
Therefore, according to the rescaled range test, there is no evidence to reject the hypothesis that the RW1 model holds for Wal-Mart stock.
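The calculation in Example 3.6 can be wrapped in a small function and compared with the critical values of Table 3.1 in one step. This is a sketch only: the function name rescaled_range, the vector name crit, and the simulated series are assumptions introduced for illustration.

```r
# Rescaled range statistic H for a vector of log-returns.
# rescaled_range() is a hypothetical helper name.
rescaled_range <- function(r) {
  partial <- cumsum(r - mean(r))              # partial sums of deviations
  (max(partial) - min(partial)) / (sd(r) * sqrt(length(r)))
}

# Critical values from Table 3.1, indexed by significance level
crit <- c("0.10" = 1.620, "0.05" = 1.747, "0.025" = 1.862, "0.005" = 2.098)

set.seed(7)                                   # simulated data, illustration only
r_sim <- rnorm(60, mean = 0.01, sd = 0.05)
H_sim <- rescaled_range(r_sim)
H_sim
H_sim > crit    # TRUE marks the levels at which the hypothesis would be rejected
```

Applied to wmt.m.logret, the same function should return the value of $H$ obtained step by step in the example.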