In practice, we are often interested in estimating the means, standard deviations, and correlations for the returns, or excess returns, on several assets.
Hence, it is useful to consider estimation of the mean vector and covariance matrix of a vector of asset returns.
Suppose that there are $N$ assets under consideration. Let $R_t$ be the $N \times 1$ vector of asset returns at time period $t$, of the form
\[
R_t = \begin{pmatrix} R_{1,t} \\ R_{2,t} \\ \vdots \\ R_{N,t} \end{pmatrix}, \qquad t = 1, 2, \ldots, T.
\]
Let $\mu$ and $\Sigma$ denote the mean vector and covariance matrix, respectively, of $R_t$.
An estimator of $\mu$ is given by the sample mean vector of the returns,
\[
\bar{R} = \frac{1}{T} \sum_{t=1}^{T} R_t.
\]
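The vector of mean excess returns, $\mu - \mu_f \mathbf{1}$, may be estimated by the sample mean vector of the excess returns
\[
\bar{R}_E = \bar{R} - \bar{R}_f \mathbf{1} =
\begin{pmatrix} \bar{R}_1 - \bar{R}_f \\ \bar{R}_2 - \bar{R}_f \\ \vdots \\ \bar{R}_N - \bar{R}_f \end{pmatrix}.
\]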
To estimate $\Sigma$, we can use the sample covariance matrix $S$, calculated from either the standard returns or the excess returns; here we use the excess returns. The sample covariance matrix is an $N \times N$ matrix with the $(j, k)$th element given by $S_j^2$ if $k = j$ and $\hat{\rho}_{jk} S_j S_k$ if $k \neq j$. The same information, in a form that is easier to interpret, is provided by the asset excess-return sample standard deviations, $S_1, S_2, \ldots, S_N$, together with the corresponding sample correlation matrix, $\hat{C}$, the $N \times N$ matrix with ones on the diagonal and the $(j, k)$th element given by $\hat{\rho}_{jk}$ for $j \neq k$.
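The link between $S$, the sample standard deviations, and $\hat{C}$ can be written compactly in matrix form. The diagonal matrix $D$ below is notation introduced here only for illustration; it is not used elsewhere in the text.

```latex
\[
S = D\,\hat{C}\,D, \qquad D = \operatorname{diag}(S_1, S_2, \ldots, S_N),
\]
so that
\[
S_{jj} = S_j^2, \qquad S_{jk} = S_j\,\hat{\rho}_{jk}\,S_k \quad (j \neq k).
\]
```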
Data Matrix
When describing the sample mean vector and the sample covariance matrix, it is often convenient to express them in terms of a data matrix. The data matrix for the excess returns, which we denote here by $X$, is the $T \times N$ matrix with row $t$ given by the vector of excess returns at time $t$, $(R_t - R_{f,t}\mathbf{1})^\top$:
\[
X = \begin{pmatrix}
R_{1,1} - R_{f,1} & R_{2,1} - R_{f,1} & \cdots & R_{N,1} - R_{f,1} \\
R_{1,2} - R_{f,2} & R_{2,2} - R_{f,2} & \cdots & R_{N,2} - R_{f,2} \\
\vdots & \vdots & \ddots & \vdots \\
R_{1,T} - R_{f,T} & R_{2,T} - R_{f,T} & \cdots & R_{N,T} - R_{f,T}
\end{pmatrix}.
\]
Thus, the $j$th column of $X$ is the time series of excess returns on asset $j$, and row $t$ of $X$ is the vector of $N$ asset excess returns at time $t$.
The sample mean vector and the sample covariance matrix have simple expressions in terms of $X$. The (column) vector of sample mean excess returns may be written
\[
\bar{R}_E = \bar{R} - \bar{R}_f \mathbf{1} = \frac{1}{T} X^\top \mathbf{1}_T. \tag{6.1}
\]
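As a quick numerical check of (6.1), the following sketch builds a small simulated data matrix and compares the matrix expression with R's built-in colMeans. The data and the names n_periods, n_assets, ones, and Rbar_E are placeholders chosen for illustration, not the data used in the text.

```r
set.seed(1)
n_periods <- 60   # T, number of time periods (assumed for illustration)
n_assets <- 3     # N, number of assets (assumed for illustration)

# Simulated excess returns standing in for a real T x N data matrix
X <- matrix(rnorm(n_periods * n_assets, mean = 0.01, sd = 0.05),
            nrow = n_periods, ncol = n_assets)

ones <- rep(1, n_periods)                    # the vector 1_T
Rbar_E <- drop(t(X) %*% ones) / n_periods    # (1/T) X^T 1_T, as in (6.1)

# The matrix expression should agree with the column means
all.equal(Rbar_E, colMeans(X))
```

The name T is avoided for the number of periods because T is shorthand for TRUE in R.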
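The sample covariance matrix has a particularly simple expression in terms of $X$:
\[
S = \frac{1}{T-1} \left(X - \mathbf{1}_T \bar{R}_E^\top\right)^\top \left(X - \mathbf{1}_T \bar{R}_E^\top\right). \tag{6.2}
\]
Note that
\[
\mathbf{1}_T \bar{R}_E^\top =
\begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}
\begin{pmatrix} \bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f \end{pmatrix}
=
\begin{pmatrix}
\bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f \\
\bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f \\
\vdots & \vdots & \ddots & \vdots \\
\bar{R}_1 - \bar{R}_f & \bar{R}_2 - \bar{R}_f & \cdots & \bar{R}_N - \bar{R}_f
\end{pmatrix}
\]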
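so that $X - \mathbf{1}_T \bar{R}_E^\top$ is given by
\[
\begin{pmatrix}
R_{1,1} - R_{f,1} - (\bar{R}_1 - \bar{R}_f) & R_{2,1} - R_{f,1} - (\bar{R}_2 - \bar{R}_f) & \cdots & R_{N,1} - R_{f,1} - (\bar{R}_N - \bar{R}_f) \\
R_{1,2} - R_{f,2} - (\bar{R}_1 - \bar{R}_f) & R_{2,2} - R_{f,2} - (\bar{R}_2 - \bar{R}_f) & \cdots & R_{N,2} - R_{f,2} - (\bar{R}_N - \bar{R}_f) \\
\vdots & \vdots & \ddots & \vdots \\
R_{1,T} - R_{f,T} - (\bar{R}_1 - \bar{R}_f) & R_{2,T} - R_{f,T} - (\bar{R}_2 - \bar{R}_f) & \cdots & R_{N,T} - R_{f,T} - (\bar{R}_N - \bar{R}_f)
\end{pmatrix}.
\]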
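The expression (6.2) can be checked directly against R's cov function. This is a minimal sketch on simulated data; the matrix X and the names ones, Rbar_E, and X_centered are hypothetical stand-ins, not objects from the text.

```r
set.seed(2)
n_periods <- 60   # T (assumed for illustration)
n_assets <- 4     # N (assumed for illustration)

# Simulated T x N matrix of excess returns
X <- matrix(rnorm(n_periods * n_assets, sd = 0.05), nrow = n_periods)

ones <- matrix(1, nrow = n_periods, ncol = 1)   # 1_T as a column vector
Rbar_E <- matrix(colMeans(X), ncol = 1)         # N x 1 vector of sample means

# Subtract each column's mean from that column: X - 1_T Rbar_E^T
X_centered <- X - ones %*% t(Rbar_E)

# Sample covariance matrix via (6.2)
S <- t(X_centered) %*% X_centered / (n_periods - 1)

all.equal(S, cov(X))
```

In practice one would simply call cov(X); the explicit version is shown only to make the structure of (6.2) concrete.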
Example 6.6 Consider the returns on the stocks of eight large companies, Apple (symbol AAPL), Baxter International (BAX), Coca-Cola (KO), CVS Health Corporation (CVS), Exxon Mobil (XOM), IBM (IBM), Johnson &
Johnson (JNJ), and Walt Disney (DIS). These companies were chosen to represent large companies from a variety of industries.
For each stock, five years of monthly excess returns were calculated for the period ending December 31, 2014. In R, each vector of 60 excess returns was stored as a variable with the name of the stock symbol (e.g., aapl for Apple).
To calculate the parameters of the distribution of the return vector $R_t$, it is convenient to have all of the data stored in a single matrix, with each column corresponding to a particular stock; this can be done using the cbind command:
> big8<-cbind(aapl, bax, ko, cvs, xom, ibm, jnj, dis)
Then big8 is a 60×8 matrix of excess returns; it corresponds to the data matrix $X$ described previously.
When reading the output from various functions, it is helpful to have the columns of big8 labeled; this may be achieved using the following command:
> colnames(big8)<-c("AAPL", "BAX", "KO", "CVS", "XOM", "IBM",
+ "JNJ", "DIS")
> head(big8)
        AAPL     BAX      KO      CVS     XOM      IBM     JNJ
[1,] -0.0886 -0.0186 -0.0483  0.00753 -0.0552 -0.06506 -0.0241
[2,]  0.0653 -0.0116 -0.0283  0.04254  0.0153  0.04353  0.0098
[3,]  0.1483  0.0272  0.0517  0.08313  0.0303  0.00845  0.0348
[4,]  0.1109 -0.1888 -0.0283  0.01211  0.0117  0.00571 -0.0139
[5,] -0.0163 -0.1058 -0.0385 -0.06216 -0.1019 -0.02415 -0.0852
[6,] -0.0209 -0.0309 -0.0168 -0.15344 -0.0562 -0.01431  0.0129
         DIS
[1,] -0.0838
[2,]  0.0571
[3,]  0.1174
[4,]  0.0552
[5,] -0.0930
[6,] -0.0576
Descriptive statistics for the returns may now be calculated. Although we could use matrix expressions like the one in (6.1) to obtain such results, a simpler approach is to use the apply command, which applies a function to the margins of a matrix or, more generally, an array. For instance, the command apply(big8, MARGIN=2, FUN=mean) applies the function mean to the “margin” of the matrix big8 designated by the MARGIN argument, here given by “2,” to denote columns, the second dimension of the matrix big8. As with many other R functions, the argument names can be omitted provided that the order of the arguments is respected.
> apply(big8, MARGIN=2, FUN=mean)
AAPL BAX KO CVS XOM IBM JNJ DIS
0.02540 0.00740 0.00978 0.02119 0.00825 0.00598 0.01153 0.02075
> apply(big8, 2, sd)
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0739 0.0556 0.0412 0.0578 0.0459 0.0458 0.0386 0.0579
Therefore, the excess returns on Apple stock, for example, have sample mean 0.0254 and sample standard deviation 0.0739.
Of course, one or both of the results of the aforementioned apply function may be assigned to a variable. For example,
> Rbar<-apply(big8, 2, mean)
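The equivalence of the named and positional forms of apply, and the built-in shortcut colMeans, can be illustrated on a small simulated matrix. The matrix X and its column labels below are placeholders, since the big8 data are not reproduced here.

```r
set.seed(3)
# Placeholder 60 x 3 matrix of simulated excess returns with labeled columns
X <- matrix(rnorm(180, mean = 0.01, sd = 0.05), nrow = 60,
            dimnames = list(NULL, c("A", "B", "C")))

by_name <- apply(X, MARGIN = 2, FUN = mean)   # arguments given by name
by_position <- apply(X, 2, mean)              # same call, names omitted
shortcut <- colMeans(X)                       # built-in column means

identical(by_name, by_position)   # the two apply calls are the same computation
all.equal(by_name, shortcut)      # colMeans agrees up to rounding error
```

For column means specifically, colMeans is usually the more direct choice; apply is the general tool when the function is something other than a mean or a sum.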
To calculate the sample correlation matrix of the data in the matrix big8, we use the cor function with the data matrix as the argument.
> cor(big8)
      AAPL   BAX    KO   CVS   XOM   IBM   JNJ   DIS
AAPL 1.000 0.193 0.260 0.329 0.303 0.319 0.145 0.346
BAX  0.193 1.000 0.310 0.330 0.327 0.381 0.473 0.196
KO   0.260 0.310 1.000 0.310 0.338 0.197 0.493 0.348
CVS  0.329 0.330 0.310 1.000 0.442 0.244 0.421 0.537
XOM  0.303 0.327 0.338 0.442 1.000 0.520 0.408 0.650
IBM  0.319 0.381 0.197 0.244 0.520 1.000 0.206 0.348
JNJ  0.145 0.473 0.493 0.421 0.408 0.206 1.000 0.323
DIS  0.346 0.196 0.348 0.537 0.650 0.348 0.323 1.000
Thus, the excess returns on Apple and Disney stocks have correlation 0.346, for example. Note that the sample correlation matrix, like all correlation matrices, is symmetric.
The sample covariance matrix may be calculated using the cov function:
> Smat<-cov(big8)
> Smat
         AAPL      BAX       KO      CVS      XOM      IBM
AAPL 0.005460 0.000794 0.000790 0.001405 0.001028 0.001080
BAX  0.000794 0.003088 0.000709 0.001061 0.000835 0.000969
KO   0.000790 0.000709 0.001694 0.000737 0.000638 0.000371
CVS  0.001405 0.001061 0.000737 0.003339 0.001174 0.000646
XOM  0.001028 0.000835 0.000638 0.001174 0.002109 0.001094
IBM  0.001080 0.000969 0.000371 0.000646 0.001094 0.002099
JNJ  0.000413 0.001016 0.000783 0.000939 0.000723 0.000365
DIS  0.001479 0.000631 0.000830 0.001797 0.001729 0.000923
          JNJ      DIS
AAPL 0.000413 0.001479
BAX  0.001016 0.000631
KO   0.000783 0.000830
CVS  0.000939 0.001797
XOM  0.000723 0.001729
IBM  0.000365 0.000923
JNJ  0.001491 0.000722
DIS  0.000722 0.003352
Thus, a second way to compute the standard deviations of these eight stocks is to use the square root of the diagonal elements of Smat. The diagonal of a square matrix may be extracted using the diag command. Hence, the estimated standard deviations are given by
> diag(Smat)^.5
AAPL BAX KO CVS XOM IBM JNJ DIS
0.0739 0.0556 0.0412 0.0578 0.0459 0.0458 0.0386 0.0579
matching the results obtained previously.
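In the same spirit, the correlation matrix can be recovered from a covariance matrix, and the covariance matrix rebuilt from the standard deviations and correlations. The sketch below uses the cov2cor function from R's stats package on simulated data; the matrix X and the names sds, C_hat, and D are illustrative placeholders rather than objects from the example.

```r
set.seed(4)
# Placeholder 60 x 4 matrix of simulated excess returns
X <- matrix(rnorm(60 * 4, sd = 0.05), nrow = 60,
            dimnames = list(NULL, c("A", "B", "C", "D")))

S <- cov(X)
sds <- sqrt(diag(S))      # sample standard deviations from the diagonal of S
C_hat <- cov2cor(S)       # rescale S to a correlation matrix

# cov2cor(S) should agree with cor(X)
all.equal(C_hat, cor(X))

# Rebuild S from the standard deviations and the correlations
D <- diag(unname(sds))
all.equal(unname(S), D %*% unname(C_hat) %*% D)
```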
Some Properties of the Sample Covariance Matrix
Some basic properties of the sample covariance matrix $S$ are easily obtained from its expression in terms of the data matrix. For instance, it follows from (6.2) that $S$ is nonnegative definite. To see this, let $u$ be an $N \times 1$ vector.
Then $u^\top S u = d^\top d$, where
\[
d = \frac{1}{\sqrt{T-1}} \left(X - \mathbf{1}_T \bar{R}_E^\top\right) u
\]
is a $T \times 1$ vector. Writing $d = (d_1, d_2, \ldots, d_T)^\top$, we have $d^\top d = \sum_{t=1}^{T} d_t^2$; it follows that $d^\top d \geq 0$ and, hence, $u^\top S u \geq 0$ for any $u \in \mathbb{R}^N$.
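The argument above can be illustrated numerically. The following sketch, on simulated data with an arbitrary vector u (all names are placeholders), checks that the quadratic form equals the sum of squares of d and that the eigenvalues of S are nonnegative up to rounding error.

```r
set.seed(5)
n_periods <- 60   # T (assumed for illustration)
n_assets <- 5     # N (assumed for illustration)

X <- matrix(rnorm(n_periods * n_assets, sd = 0.05), nrow = n_periods)
S <- cov(X)
u <- rnorm(n_assets)   # an arbitrary N x 1 vector

# Centered data matrix: subtract each column mean from its column
X_centered <- sweep(X, 2, colMeans(X))

# d = (X - 1_T Rbar_E^T) u / sqrt(T - 1), a vector of length T
d <- drop(X_centered %*% u) / sqrt(n_periods - 1)

quad_form <- drop(t(u) %*% S %*% u)   # u^T S u
all.equal(quad_form, sum(d^2))        # equals d^T d, a sum of squares

# All eigenvalues of S should be nonnegative (allowing for rounding error)
min(eigen(S, symmetric = TRUE, only.values = TRUE)$values) >= -1e-12
```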
There is another important implication of (6.2) for the properties of $S$. Like $X$, $X - \mathbf{1}_T \bar{R}_E^\top$ is a $T \times N$ matrix. Hence, although $S$ is an $N \times N$ matrix, the rank of $S$ is, at most, $\min(N, T)$. Therefore, if $T < N$, that is, if the number of time periods is less than the number of assets, then $S$ cannot be invertible and, hence, it cannot be positive definite. Suppose that we are analyzing five years of monthly returns, $T = 60$. Then for $S$ to be positive definite, we can consider, at most, 60 assets. (In fact, because each column of $X - \mathbf{1}_T \bar{R}_E^\top$ sums to zero, its rank is at most $T - 1$, so the limit is 59 assets.)
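A small simulation makes the rank deficiency concrete. In this sketch there are more assets than time periods; the dimensions and names are arbitrary choices for illustration. Because the columns of the centered data matrix each sum to zero, the numerical rank reported by qr should not exceed the number of periods minus one.

```r
set.seed(6)
n_periods <- 10   # T, deliberately smaller than N
n_assets <- 15    # N

X <- matrix(rnorm(n_periods * n_assets, sd = 0.05), nrow = n_periods)
S <- cov(X)       # a 15 x 15 matrix built from only 10 observations

rank_S <- qr(S)$rank   # numerical rank of S
rank_S

# S is singular, so inverting it with solve(S) would fail or be unreliable
rank_S < n_assets
```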
In addition to the sample covariance matrix being singular when $N$ is larger than $T$, if $N$ and $T$ are roughly the same magnitude, as is often the case with financial data, then there are features of the sample covariance matrix that are not, in general, accurate estimators of the corresponding features of $\Sigma$.
The basic issue is somewhat obvious. The covariance matrix $\Sigma$ contains $N(N+1)/2$ parameters. We have $NT$ data points and, if $N \doteq T$, then $NT \doteq N^2$. Therefore, we are trying to estimate a large number of parameters with, relatively speaking, a small amount of data. This occurs even though, when $T$ is relatively large, any one element of $S$, $S_{ij}$, provides an accurate estimator of $\Sigma_{ij}$, the corresponding element of $\Sigma$.
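For a sense of scale, take $N = T = 60$, that is, five years of monthly returns on 60 assets. These numbers are chosen here purely for illustration.

```latex
\[
\frac{N(N+1)}{2} = \frac{60 \cdot 61}{2} = 1830 \ \text{parameters},
\qquad
NT = 60 \cdot 60 = 3600 \ \text{data points},
\]
```

so there are fewer than two observations available per parameter.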
Because the issue arises with the large number of covariances in $\Sigma$, not with the variances, consider the properties of the sample correlation matrix $\hat{C}$ as an estimator of the $N \times N$ correlation matrix $C$. If $T$ is very large, so that we have many observations, and $N/T$ is near zero, so that the number of assets is small relative to the number of data points, then $\hat{C} \doteq C$ with high probability; more formally, $\hat{C}$ converges in probability to $C$ as $T \to \infty$ and $N$ remains fixed. Here, convergence in probability of a sequence of matrices is defined elementwise.
However, if $T$ is large but $q = N/T$ is not near zero, so that $N$ is also large, there are ways in which $\hat{C}$ is a poor estimator of $C$. For instance, it may be shown that
\[
\operatorname{tr}(\hat{C}^{-1}) \doteq \frac{1}{1-q} \operatorname{tr}(C^{-1}),
\]
where tr(A) denotes the trace of a matrix A. Recall that the trace of a matrix is the sum of its diagonal elements; it is also equal to the sum of its eigenvalues.
Although we are not interested directly in the trace of $C^{-1}$, many of the weight vectors we derived in the previous chapter are based on the inverse of the asset return covariance matrix. Hence, this result suggests that, when $N/T$ is not small, such weight vectors are not well-estimated by replacing the return covariance matrix with the corresponding sample covariance matrix.
For example, if we have five years of monthly returns on 40 assets, so that $N = 40$ and $T = 60$, then $q = 2/3$ and the trace of $\hat{C}^{-1}$ will be approximately three times the trace of $C^{-1}$, suggesting that $\hat{C}^{-1}$ is, in some respects, a poor estimator of $C^{-1}$.
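The inflation of the trace can be seen in a rough simulation. The sketch below makes several assumptions not taken from the text: the returns are independent standard normal draws, so the true correlation matrix is the identity and the trace of its inverse equals the number of assets, and the dimensions are scaled up to 200 assets and 300 periods (still with ratio 2/3) so that the large-sample approximation is more visible.

```r
set.seed(7)
n_assets <- 200    # N (assumed for illustration)
n_periods <- 300   # T (assumed for illustration)
q <- n_assets / n_periods   # q = N/T = 2/3

# Independent simulated returns: the true correlation matrix is the identity
X <- matrix(rnorm(n_periods * n_assets), nrow = n_periods)

C_hat <- cor(X)                        # sample correlation matrix
trace_hat <- sum(diag(solve(C_hat)))   # trace of the inverse of C_hat
trace_true <- n_assets                 # trace of the inverse of the identity

ratio <- trace_hat / trace_true
c(observed = ratio, approximation = 1 / (1 - q))
```

The observed ratio should land in the neighborhood of 3, in line with the approximation, although the exact value varies with the seed and the dimensions.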