(1)Introduction to Time Series Analysis Lecture 4.
Peter Bartlett
Last lecture:
1. Sample autocorrelation function
2. ACF and prediction
(2)Introduction to Time Series Analysis Lecture 4.
Peter Bartlett
1. Review: ACF, sample ACF
2. Properties of estimates of µ and ρ
3. Convergence in mean square
(3)Mean, Autocovariance, Stationarity
A time series $\{X_t\}$ has mean function $\mu_t = E[X_t]$ and autocovariance function
$$\gamma_X(t+h, t) = \mathrm{Cov}(X_{t+h}, X_t) = E\bigl[(X_{t+h} - \mu_{t+h})(X_t - \mu_t)\bigr].$$
It is stationary if both are independent of $t$. Then we write $\gamma_X(h) = \gamma_X(h, 0)$.
The autocorrelation function (ACF) is
$$\rho_X(h) = \frac{\gamma_X(h)}{\gamma_X(0)}.$$
(4)Estimating the ACF: Sample ACF
For observations $x_1, \dots, x_n$ of a time series, the sample mean is
$$\bar{x} = \frac{1}{n}\sum_{t=1}^{n} x_t.$$
The sample autocovariance function is
$$\hat\gamma(h) = \frac{1}{n}\sum_{t=1}^{n-|h|}(x_{t+|h|} - \bar{x})(x_t - \bar{x}), \quad \text{for } -n < h < n.$$
The sample autocorrelation function is
$$\hat\rho(h) = \frac{\hat\gamma(h)}{\hat\gamma(0)}.$$
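As an aside (not part of the original slides), here is a minimal NumPy sketch of the sample autocovariance and sample ACF just defined. The function name `sample_acf` and the simulated data are our own illustration.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocovariance and autocorrelation as defined above:
    gamma_hat(h) = (1/n) * sum_{t=1}^{n-|h|} (x_{t+|h|} - xbar)(x_t - xbar)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    gamma_hat = np.array([np.sum(xc[h:] * xc[:n - h]) / n
                          for h in range(max_lag + 1)])
    rho_hat = gamma_hat / gamma_hat[0]
    return gamma_hat, rho_hat

# Example: sample ACF of simulated white noise
rng = np.random.default_rng(0)
gamma_hat, rho_hat = sample_acf(rng.normal(size=200), max_lag=20)
```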
(5)Properties of the autocovariance function
For the autocovariance function $\gamma$ of a stationary time series $\{X_t\}$:
1. $\gamma(0) \ge 0$,
2. $|\gamma(h)| \le \gamma(0)$,
3. $\gamma(h) = \gamma(-h)$,
4. $\gamma$ is positive semidefinite.
(6)Introduction to Time Series Analysis Lecture 4. Review: ACF, sample ACF
(7)Properties of the sample autocovariance function
The sample autocovariance function:
$$\hat\gamma(h) = \frac{1}{n}\sum_{t=1}^{n-|h|}(x_{t+|h|} - \bar{x})(x_t - \bar{x}), \quad \text{for } -n < h < n.$$
For any sequence $x_1, \dots, x_n$, the sample autocovariance function $\hat\gamma$ satisfies $\hat\gamma(h) = \hat\gamma(-h)$ and is positive semidefinite.
(8)Properties of the sample autocovariance function: psd
$$\hat\Gamma_n = \begin{pmatrix}
\hat\gamma(0) & \hat\gamma(1) & \cdots & \hat\gamma(n-1)\\
\hat\gamma(1) & \hat\gamma(0) & \cdots & \hat\gamma(n-2)\\
\vdots & \vdots & & \vdots\\
\hat\gamma(n-1) & \hat\gamma(n-2) & \cdots & \hat\gamma(0)
\end{pmatrix} = \frac{1}{n} M M' \quad \text{(see next slide)},$$
so
$$a'\hat\Gamma_n a = \frac{1}{n}(a'M)(M'a) = \frac{1}{n}\|M'a\|^2 \ge 0.$$
(9)Properties of the sample autocovariance function: psd
Here $M$ is the $n \times (2n-1)$ matrix built from the mean-corrected observations $\tilde X_t = x_t - \bar{x}$, each row a shifted copy of $\tilde X_1, \dots, \tilde X_n$:
$$M = \begin{pmatrix}
0 & \cdots & 0 & \tilde X_1 & \tilde X_2 & \cdots & \tilde X_n\\
0 & \cdots & \tilde X_1 & \tilde X_2 & \cdots & \tilde X_n & 0\\
 & & & \vdots & & & \\
\tilde X_1 & \tilde X_2 & \cdots & \tilde X_n & 0 & \cdots & 0
\end{pmatrix}.$$
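A quick numerical sanity check of this positive semidefiniteness claim (our own illustration, reusing the hypothetical `sample_acf` helper from the earlier sketch): build $\hat\Gamma_n$ and confirm its eigenvalues are nonnegative up to rounding.

```python
import numpy as np

# Check that the sample autocovariance matrix Gamma_hat_n is positive semidefinite.
rng = np.random.default_rng(1)
x = rng.normal(size=50)
n = len(x)
gamma_hat, _ = sample_acf(x, max_lag=n - 1)   # sample_acf defined in the sketch above

# Gamma_hat_n[i, j] = gamma_hat(|i - j|)
Gamma_n = gamma_hat[np.abs(np.subtract.outer(np.arange(n), np.arange(n)))]
print(np.linalg.eigvalsh(Gamma_n).min() >= -1e-10)   # True: all eigenvalues >= 0
```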
(10)Estimating µ
How good is $\bar X_n$ as an estimate of $\mu$?
For a stationary process $\{X_t\}$, the sample average
$$\bar X_n = \frac{1}{n}(X_1 + \cdots + X_n)$$
satisfies
$$E(\bar X_n) = \mu \quad \text{(unbiased)}, \qquad \mathrm{var}(\bar X_n) = \frac{1}{n}\sum_{h=-n}^{n}\left(1 - \frac{|h|}{n}\right)\gamma(h).$$
(11)Estimating µ
To see why:
$$\mathrm{var}(\bar X_n) = E\left[\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)\cdot\frac{1}{n}\sum_{j=1}^{n}(X_j - \mu)\right]
= \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} E(X_i - \mu)(X_j - \mu)
= \frac{1}{n^2}\sum_{i,j}\gamma(i-j)
= \frac{1}{n}\sum_{h=-(n-1)}^{n-1}\left(1 - \frac{|h|}{n}\right)\gamma(h),$$
since each lag $h = i - j$ occurs $n - |h|$ times among the $n^2$ pairs $(i, j)$.
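A small Monte Carlo sketch of this identity (our own illustration, not from the slides), using an MA(1) process whose autocovariances are recalled later in this lecture; the parameter values are arbitrary.

```python
import numpy as np

# Check var(Xbar_n) = (1/n) * sum_{|h|<n} (1 - |h|/n) * gamma(h) for an MA(1):
# X_t = W_t + theta*W_{t-1}, gamma(0) = sigma^2*(1+theta^2), gamma(+-1) = sigma^2*theta.
rng = np.random.default_rng(2)
n, theta, sigma, reps = 100, 0.6, 1.0, 20000

w = rng.normal(scale=sigma, size=(reps, n + 1))
x = w[:, 1:] + theta * w[:, :-1]           # reps independent MA(1) paths of length n
mc_var = x.mean(axis=1).var()              # Monte Carlo variance of the sample mean

gamma = {0: sigma**2 * (1 + theta**2), 1: sigma**2 * theta, -1: sigma**2 * theta}
formula = sum((1 - abs(h) / n) * gamma.get(h, 0.0) for h in range(-(n - 1), n)) / n
print(mc_var, formula)                     # the two values should be close
```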
(12)Estimating µ
Since
$$\mathrm{var}(\bar X_n) = \frac{1}{n}\sum_{h=-n}^{n}\left(1 - \frac{|h|}{n}\right)\gamma(h),$$
if $\lim_{h\to\infty}\gamma(h) = 0$, then $\mathrm{var}(\bar X_n) \to 0$, so $\bar X_n$ converges in mean square to $\mu$.
(13)Estimating µ
Also, since
$$\mathrm{var}(\bar X_n) = \frac{1}{n}\sum_{h=-n}^{n}\left(1 - \frac{|h|}{n}\right)\gamma(h),$$
if $\sum_h |\gamma(h)| < \infty$, then
$$n\,\mathrm{var}(\bar X_n) \to \sum_{h=-\infty}^{\infty}\gamma(h) = \sigma^2\sum_{h=-\infty}^{\infty}\rho(h),$$
where $\sigma^2 = \gamma(0)$.
(14)Estimating µ
$$n\,\mathrm{var}(\bar X_n) \to \sigma^2\sum_{h=-\infty}^{\infty}\rho(h),$$
i.e., instead of $\mathrm{var}(\bar X_n) \approx \dfrac{\sigma^2}{n}$, we have
$$\mathrm{var}(\bar X_n) \approx \frac{\sigma^2}{n/\tau}, \qquad \text{with } \tau = \sum_h \rho(h).$$
The effect of the correlation is a reduction of the effective sample size from $n$ to $n/\tau$.
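Continuing the MA(1) sketch above (illustrative values only), the effective-sample-size approximation $\sigma^2/(n/\tau)$ can be compared with the exact finite-$n$ variance computed there.

```python
# Effective sample size for the MA(1) used above: tau = sum_h rho(h) = 1 + 2*theta/(1+theta^2).
tau = 1 + 2 * theta / (1 + theta**2)
sigma_x2 = sigma**2 * (1 + theta**2)       # sigma^2 here means var(X_t) = gamma(0)
print(sigma_x2 / (n / tau), formula)       # approximation vs. exact finite-n value
```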
(15)Estimating µ: Asymptotic distribution
Why are we interested in asymptotic distributions?
• If we know the asymptotic distribution of $\bar X_n$, we can use it to construct hypothesis tests, e.g., is $\mu = 0$?
• Similarly for the asymptotic distribution of $\hat\rho(h)$, e.g., is $\rho(1) = 0$?
Notation: $X_n \sim AN(\mu_n, \sigma_n^2)$ means 'asymptotically normal':
$$\frac{X_n - \mu_n}{\sigma_n} \xrightarrow{d} N(0, 1).$$
(16)Estimating µ for a linear process: Asymptotically normal
Theorem (A.5) For a linear process $X_t = \mu + \sum_j \psi_j W_{t-j}$, if $\sum_j \psi_j \neq 0$, then
$$\bar X_n \sim AN\!\left(\mu_X, \frac{V}{n}\right),$$
where
$$V = \sum_{h=-\infty}^{\infty}\gamma(h) = \sigma_w^2\left(\sum_{j=-\infty}^{\infty}\psi_j\right)^2.$$
($X_n \sim AN(\mu_n, \sigma_n^2)$ means $\sigma_n^{-1}(X_n - \mu_n) \xrightarrow{d} N(0, 1)$.)
(17)Estimating µ for a linear process
Recall: for a linear process $X_t = \mu + \sum_j \psi_j W_{t-j}$,
$$\gamma_X(h) = \sigma_w^2\sum_{j=-\infty}^{\infty}\psi_j\psi_{j+h},$$
so
$$\lim_{n\to\infty} n\,\mathrm{var}(\bar X_n) = \lim_{n\to\infty}\sum_{h=-(n-1)}^{n-1}\left(1 - \frac{|h|}{n}\right)\gamma(h)
= \lim_{n\to\infty}\sigma_w^2\sum_{j=-\infty}^{\infty}\psi_j\sum_{h=-(n-1)}^{n-1}\left(\psi_{j+h} - \frac{|h|}{n}\psi_{j+h}\right)
= \sigma_w^2\left(\sum_{j=-\infty}^{\infty}\psi_j\right)^2 = V.$$
(18)Estimating the ACF: Sample ACF for White Noise
Theorem. For a white noise process $W_t$, if $E(W_t^4) < \infty$, then
$$\begin{pmatrix}\hat\rho(1)\\ \vdots\\ \hat\rho(K)\end{pmatrix} \sim AN\!\left(0, \frac{1}{n} I\right).$$
(19)Sample ACF and testing for white noise
If $\{X_t\}$ is white noise, we expect no more than $\approx 5\%$ of the peaks of the sample ACF to satisfy
$$|\hat\rho(h)| > \frac{1.96}{\sqrt{n}}.$$
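An illustrative check of this rule (not part of the original slides), reusing the hypothetical `sample_acf` helper defined earlier: simulate white noise and count how many sample autocorrelations fall outside the $\pm 1.96/\sqrt{n}$ bounds.

```python
import numpy as np

# Under white noise, roughly 95% of sample autocorrelations at nonzero lags
# should fall inside +-1.96/sqrt(n).
rng = np.random.default_rng(3)
x = rng.normal(size=500)
_, rho_hat = sample_acf(x, max_lag=40)        # sample_acf from the earlier sketch
bound = 1.96 / np.sqrt(len(x))
outside = np.abs(rho_hat[1:]) > bound
print(f"{outside.mean():.1%} of lags 1..40 outside the bound (expect about 5%)")
```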
(20)Sample ACF for white Gaussian (hence i.i.d.) noise
[Figure: sample ACF of a white Gaussian noise realization, lags −20 to 20.]
(21)Estimating the ACF: Sample ACF
Theorem (A.7) For a linear process $X_t = \mu + \sum_j \psi_j W_{t-j}$, if $E(W_t^4) < \infty$, then
$$\begin{pmatrix}\hat\rho(1)\\ \vdots\\ \hat\rho(K)\end{pmatrix} \sim AN\!\left(\begin{pmatrix}\rho(1)\\ \vdots\\ \rho(K)\end{pmatrix}, \frac{1}{n} V\right),$$
where
$$V_{i,j} = \sum_{h=1}^{\infty}\bigl(\rho(h+i) + \rho(h-i) - 2\rho(i)\rho(h)\bigr)\bigl(\rho(h+j) + \rho(h-j) - 2\rho(j)\rho(h)\bigr).$$
(22)Sample ACF for MA(1)
Recall: $\rho(0) = 1$, $\rho(\pm 1) = \dfrac{\theta}{1+\theta^2}$, and $\rho(h) = 0$ for $|h| > 1$. Thus,
$$V_{1,1} = \sum_{h=1}^{\infty}\bigl(\rho(h+1) + \rho(h-1) - 2\rho(1)\rho(h)\bigr)^2 = \bigl(\rho(0) - 2\rho(1)^2\bigr)^2 + \rho(1)^2,$$
$$V_{2,2} = \sum_{h=1}^{\infty}\bigl(\rho(h+2) + \rho(h-2) - 2\rho(2)\rho(h)\bigr)^2 = \sum_{h=-1}^{1}\rho(h)^2.$$
And if $\hat\rho$ is the sample ACF from a realization of this MA(1) process, then with probability 0.95,
$$|\hat\rho(h) - \rho(h)| \le 1.96\sqrt{\frac{V_{hh}}{n}}.$$
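A short sketch computing these bands numerically (our own illustration; the values of θ and n are arbitrary, not from the slides).

```python
import numpy as np

# Bartlett-type 95% bands for the sample ACF of an MA(1), using the formulas above.
theta, n = 0.6, 200
rho1 = theta / (1 + theta**2)
V11 = (1 - 2 * rho1**2)**2 + rho1**2        # = (rho(0) - 2*rho(1)^2)^2 + rho(1)^2
V22 = 1 + 2 * rho1**2                       # = sum_{h=-1}^{1} rho(h)^2
print("lag 1 band: rho(1) +-", 1.96 * np.sqrt(V11 / n))
print("lag 2 band: 0 +-", 1.96 * np.sqrt(V22 / n))
```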
(23)Sample ACF for MA(1)
[Figure: sample ACF of an MA(1) realization.]
(24)Introduction to Time Series Analysis Lecture 4. Review: ACF, sample ACF
(25)Convergence in Mean Square
• Recall the definition of a linear process:
$$X_t = \sum_{j=-\infty}^{\infty}\psi_j W_{t-j}.$$
• What do we mean by these infinite sums of random variables? i.e., what is the 'limit' of a sequence of random variables?
(26)Convergence in Mean Square
Definition: A sequence of random variables $S_1, S_2, \dots$ converges in mean square if there is a random variable $Y$ for which
$$\lim_{n\to\infty} E(S_n - Y)^2 = 0.$$
(27)Example: Linear Processes
$$X_t = \sum_{j=-\infty}^{\infty}\psi_j W_{t-j}.$$
Then if $\sum_{j=-\infty}^{\infty}|\psi_j| < \infty$,
(1) $|X_t| < \infty$ a.s.,
(2) the partial sums $\sum_{j=-n}^{n}\psi_j W_{t-j}$ converge in mean square as $n \to \infty$.
(28)Example: Linear Processes (Details)
(1)
$$P(|X_t| \ge \alpha) \le \frac{1}{\alpha}E|X_t| \quad \text{(Markov's inequality)}
\le \frac{1}{\alpha}\sum_{j=-\infty}^{\infty}|\psi_j|\,E|W_{t-j}|
\le \frac{\sigma}{\alpha}\sum_{j=-\infty}^{\infty}|\psi_j| < \infty.$$
Since this bound tends to 0 as $\alpha \to \infty$, $|X_t| < \infty$ a.s.
(29)Example: Linear Processes (Details)
For (2):
The Riesz-Fisher Theorem (Cauchy criterion):
$S_n$ converges in mean square iff
$$\lim_{m,n\to\infty} E(S_m - S_n)^2 = 0.$$
(30)Example: Linear Processes (Details)
(2) $S_n = \sum_{j=-n}^{n}\psi_j W_{t-j}$ converges in mean square, since (for $m < n$)
$$E(S_m - S_n)^2 = E\left(\sum_{m < |j| \le n}\psi_j W_{t-j}\right)^2
= \sum_{m < |j| \le n}\psi_j^2\,\sigma^2
\le \sigma^2\left(\sum_{m < |j| \le n}|\psi_j|\right)^2 \to 0$$
as $m, n \to \infty$, because $\sum_j |\psi_j| < \infty$.
(31)Example: AR(1)
Let $X_t$ be the stationary solution to $X_t - \phi X_{t-1} = W_t$, where $W_t \sim WN(0, \sigma^2)$. If $|\phi| < 1$,
$$X_t = \sum_{j=0}^{\infty}\phi^j W_{t-j}.$$
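A purely arithmetic illustration (our own, not from the slides) of the mean-square convergence of the truncated sum: its variance approaches the stationary AR(1) variance $\sigma^2/(1-\phi^2)$ geometrically fast when $|\phi| < 1$.

```python
# Variance of the truncated sum sum_{j=0}^{J} phi^j W_{t-j} is
# sigma^2 * (1 - phi^(2(J+1))) / (1 - phi^2), which converges to
# sigma^2 / (1 - phi^2) as J grows.
phi, sigma = 0.8, 1.0
for J in (5, 20, 100):
    var_truncated = sigma**2 * (1 - phi**(2 * (J + 1))) / (1 - phi**2)
    print(J, var_truncated, sigma**2 / (1 - phi**2))
```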
(32)Example: AR(1)
Furthermore, $X_t$ is the unique stationary solution: we can check that any other stationary solution $Y_t$ is the mean square limit:
$$\lim_{n\to\infty} E\left(Y_t - \sum_{i=0}^{n-1}\phi^i W_{t-i}\right)^2 = \lim_{n\to\infty} E\bigl(\phi^n Y_{t-n}\bigr)^2 = 0.$$
(33)Example: AR(1)
Let $X_t$ be the stationary solution to $X_t - \phi X_{t-1} = W_t$, where $W_t \sim WN(0, \sigma^2)$. If $|\phi| < 1$,
$$X_t = \sum_{j=0}^{\infty}\phi^j W_{t-j}.$$
What if $\phi = 1$?
(34)Introduction to Time Series Analysis Lecture 4. Properties of estimates of µ and ρ