(1) Final Exam
• Open-book
• Covers all of the course
(2) Introduction to Time Series Analysis: Review
1 Time series modelling
2 Time domain
   (a) Concepts of stationarity, ACF
   (b) Linear processes, causality, invertibility
   (c) ARMA models, forecasting, estimation
   (d) ARIMA, seasonal ARIMA models
3 Frequency domain
   (a) Spectral density
   (b) Linear filters, frequency response
(3) Objectives of Time Series Analysis
1 Compact description of data. Example:
   Xt = Tt + St + f(Yt) + Wt
(4) Time Series Modelling
1 Plot the time series.
   Look for trends, seasonal components, step changes, outliers.
2 Transform the data so that the residuals are stationary (see the sketch below).
   (a) Estimate and subtract Tt, St.
   (b) Differencing.
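As a rough illustration of step 2, here is a minimal Python sketch; the synthetic monthly series and the period 12 are assumptions made for the example, not part of the slides.

```python
import numpy as np

# Synthetic monthly series with a linear trend and a period-12 seasonal
# component (illustrative data only).
rng = np.random.default_rng(0)
t = np.arange(120)
x = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 12) + rng.normal(size=t.size)

# (a) Estimate a linear trend T_t by least squares and subtract it.
coeffs = np.polynomial.polynomial.polyfit(t, x, deg=1)
detrended = x - np.polynomial.polynomial.polyval(t, coeffs)

# (b) Or difference: (1 - B) removes the linear trend, (1 - B^12) the
# seasonal component; the residuals should then look stationary.
d1 = np.diff(x)             # (1 - B) X_t
d1_12 = d1[12:] - d1[:-12]  # (1 - B^12)(1 - B) X_t
```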
(5) Introduction to Time Series Analysis: Review
1 Time series modelling
2 Time domain
   (a) Concepts of stationarity, ACF
   (b) Linear processes, causality, invertibility
   (c) ARMA models, forecasting, estimation
   (d) ARIMA, seasonal ARIMA models
3 Frequency domain
   (a) Spectral density
   (b) Linear filters, frequency response
(6) Stationarity
{Xt} is strictly stationary if, for all k, t1, ..., tk, x1, ..., xk, and h,
   P(X_{t1} ≤ x1, ..., X_{tk} ≤ xk) = P(X_{t1+h} ≤ x1, ..., X_{tk+h} ≤ xk),
i.e., shifting the time axis does not affect the distribution.
We consider second-order properties only:
{Xt} is stationary if its mean function and autocovariance function satisfy
   µx(t) = E[Xt] = µ,
   γx(s, t) = Cov(Xs, Xt) = γx(s − t).
(7) ACF and Sample ACF
The autocorrelation function (ACF) is
   ρX(h) = γX(h) / γX(0) = Corr(X_{t+h}, Xt).
For observations x1, ..., xn of a time series, the sample mean is
   x̄ = (1/n) Σ_{t=1}^{n} xt.
The sample autocovariance function is
   γ̂(h) = (1/n) Σ_{t=1}^{n−|h|} (x_{t+|h|} − x̄)(xt − x̄),   for −n < h < n.
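A direct transcription of these definitions into Python (a sketch; `sample_acf` is a made-up helper name):

```python
import numpy as np

def sample_acf(x, max_lag):
    # Sample autocovariance γ̂(h) and sample ACF ρ̂(h) for h = 0, ..., max_lag,
    # exactly as defined above (note the divisor n, not n - h).
    x = np.asarray(x, dtype=float)
    n = x.size
    xbar = x.mean()
    gamma_hat = np.array([((x[h:] - xbar) * (x[:n - h] - xbar)).sum() / n
                          for h in range(max_lag + 1)])
    return gamma_hat, gamma_hat / gamma_hat[0]
```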
(8) Linear Processes
An important class of stationary time series:
   Xt = µ + Σ_{j=−∞}^{∞} ψj W_{t−j},
where {Wt} ∼ WN(0, σw²) and µ, ψj are parameters satisfying
   Σ_{j=−∞}^{∞} |ψj| < ∞.
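A quick way to see such a process is to simulate a truncated version of the sum; the weights ψj = 0.6^{|j|} below are an arbitrary absolutely summable choice, not from the slides.

```python
import numpy as np

# Approximate X_t = Σ_j ψ_j W_{t-j} by truncating the sum at |j| <= J.
rng = np.random.default_rng(1)
J, n = 30, 500
psi = 0.6 ** np.abs(np.arange(-J, J + 1))   # ψ_j = 0.6^{|j|}, so Σ|ψ_j| < ∞
w = rng.normal(size=n + 2 * J)              # {W_t} ~ WN(0, 1), padded at both ends
x = np.convolve(w, psi, mode="valid")       # length-n realization of {X_t}
```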
(9) Causality
A linear process {Xt} is causal (strictly, a causal function of {Wt}) if there is a
   ψ(B) = ψ0 + ψ1B + ψ2B² + · · ·
with
   Σ_{j=0}^{∞} |ψj| < ∞   and   Xt = ψ(B)Wt.
(10) Invertibility
A linear process {Xt} is invertible (strictly, an invertible function of {Wt}) if there is a
   π(B) = π0 + π1B + π2B² + · · ·
with
   Σ_{j=0}^{∞} |πj| < ∞   and   Wt = π(B)Xt.
(11) Polynomials of a complex variable
Every degree p polynomial a(z) can be factorized as
   a(z) = a0 + a1z + · · · + ap z^p = ap(z − z1)(z − z2) · · · (z − zp),
where z1, ..., zp ∈ C are called the roots of a(z). If the coefficients a0, a1, ..., ap are real, then the roots are all either real or occur in complex conjugate pairs.
(12) Autoregressive moving average models
An ARMA(p,q) process {Xt} is a stationary process that satisfies
   Xt − φ1X_{t−1} − · · · − φp X_{t−p} = Wt + θ1W_{t−1} + · · · + θq W_{t−q},
where {Wt} ∼ WN(0, σ²).
(13) Properties of ARMA(p,q) models
Theorem: If φ and θ have no common factors, a (unique) stationary solution to φ(B)Xt = θ(B)Wt exists iff
   φ(z) = 1 − φ1z − · · · − φp z^p = 0  ⇒  |z| ≠ 1.
This ARMA(p,q) process is causal iff
   φ(z) = 1 − φ1z − · · · − φp z^p = 0  ⇒  |z| > 1.
It is invertible iff
   θ(z) = 1 + θ1z + · · · + θq z^q = 0  ⇒  |z| > 1.
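These conditions reduce to checking polynomial roots, which is easy numerically. A sketch for an illustrative ARMA(2,1) with φ1 = 0.5, φ2 = −0.25, θ1 = 0.4 (numbers chosen only for the example):

```python
import numpy as np
from numpy.polynomial import polynomial as P

phi = [1.0, -0.5, 0.25]   # φ(z) = 1 - 0.5 z + 0.25 z² (ascending coefficients)
theta = [1.0, 0.4]        # θ(z) = 1 + 0.4 z

# Causal iff all roots of φ(z) lie outside the unit circle;
# invertible iff the same holds for θ(z).
causal = np.all(np.abs(P.polyroots(phi)) > 1)       # True: roots 1 ± i√3, |z| = 2
invertible = np.all(np.abs(P.polyroots(theta)) > 1) # True: root z = -2.5
```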
(14) Properties of ARMA(p,q) models
   φ(B)Xt = θ(B)Wt  ⇔  Xt = ψ(B)Wt,
so θ(B) = ψ(B)φ(B)
   ⇔  1 + θ1B + · · · + θq B^q = (ψ0 + ψ1B + · · · )(1 − φ1B − · · · − φp B^p)
   ⇔  1 = ψ0,
       θ1 = ψ1 − φ1ψ0,
       θ2 = ψ2 − φ1ψ1 − φ2ψ0,
       ...
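statsmodels can run this coefficient-matching recursion for you; `arma2ma` expects both polynomials with their leading 1 and the AR signs as in φ(z) = 1 − φ1z − · · · (same illustrative model as in the previous sketch):

```python
from statsmodels.tsa.arima_process import arma2ma

# ψ_0, ψ_1, ... for φ(z) = 1 - 0.5 z + 0.25 z² and θ(z) = 1 + 0.4 z.
psi = arma2ma([1, -0.5, 0.25], [1, 0.4], lags=10)
# psi[0] = 1 (= ψ0); psi[1] = φ1 + θ1 = 0.9, consistent with θ1 = ψ1 - φ1 ψ0.
```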
(15) Linear prediction
Given X1, X2, ..., Xn, the best linear predictor
   X^n_{n+m} = α0 + Σ_{i=1}^{n} αi Xi
of X_{n+m} satisfies the prediction equations
   E[X_{n+m} − X^n_{n+m}] = 0,
   E[(X_{n+m} − X^n_{n+m}) Xi] = 0   for i = 1, ..., n.
(16) Projection Theorem
If H is a Hilbert space, M is a closed linear subspace of H, and y ∈ H, then there is a point Py ∈ M (the projection of y on M) satisfying
1 ‖Py − y‖ ≤ ‖w − y‖ for w ∈ M,
2 ‖Py − y‖ < ‖w − y‖ for w ∈ M, w ≠ Py,
3 ⟨y − Py, w⟩ = 0 for w ∈ M.
[Figure: right-angled triangle showing y, its projection Py, and the orthogonal error y − Py.]
(17) One-step-ahead linear prediction
   X^n_{n+1} = φ_{n1}Xn + φ_{n2}X_{n−1} + · · · + φ_{nn}X1,   where Γn φn = γn,
   P^n_{n+1} = E[(X_{n+1} − X^n_{n+1})²] = γ(0) − γn′ Γn^{−1} γn,
with
   Γn = [ γ(0)     γ(1)     · · ·   γ(n−1)
          γ(1)     γ(0)     · · ·   γ(n−2)
          ⋮                          ⋮
          γ(n−1)   γ(n−2)   · · ·   γ(0)   ],
   γn = (γ(1), γ(2), ..., γ(n))′ and φn = (φ_{n1}, ..., φ_{nn})′.
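Since Γn is Toeplitz, the prediction equations can be solved efficiently; a sketch with an assumed ACF γ(h) = 0.7^h (not from the slides):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

n = 5
gamma = 0.7 ** np.arange(n + 1)               # assumed γ(0), ..., γ(n)
phi_n = solve_toeplitz(gamma[:n], gamma[1:])  # solves Γn φn = γn
pred_var = gamma[0] - gamma[1:] @ phi_n       # P^n_{n+1} = γ(0) - γn' Γn^{-1} γn
# Given observations x[0..n-1] = X_1, ..., X_n, the forecast would be
# x_next = phi_n @ x[::-1]   (φ_{n1} multiplies the most recent value X_n).
```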
(18) The innovations representation
Write the best linear predictor as
   X^n_{n+1} = θ_{n1}(Xn − X^{n−1}_n) + θ_{n2}(X_{n−1} − X^{n−2}_{n−1}) + · · · + θ_{nn}(X1 − X^0_1),
where each term Xk − X^{k−1}_k is an innovation. The innovations are uncorrelated.
(19) Yule-Walker estimation
Method of moments: we choose parameters for which the moments are equal to the empirical moments.
In this case, we choose φ so that γ = γ̂.
Yule-Walker equations for φ̂:
   Γ̂p φ̂ = γ̂p,
   σ̂² = γ̂(0) − φ̂′ γ̂p.
These are the forecasting equations.
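statsmodels implements exactly this; a sketch on a simulated AR(2) (the simulation and order=2 are assumptions of the example):

```python
import numpy as np
from statsmodels.regression.linear_model import yule_walker

# Simulate an AR(2): X_t = 0.6 X_{t-1} - 0.2 X_{t-2} + W_t.
rng = np.random.default_rng(2)
x = rng.normal(size=500)
for t in range(2, x.size):
    x[t] += 0.6 * x[t - 1] - 0.2 * x[t - 2]

phi_hat, sigma_hat = yule_walker(x, order=2, method="adjusted")
# phi_hat ≈ (0.6, -0.2); sigma_hat estimates the noise standard deviation σ_w.
```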
(20) Maximum likelihood estimation
Suppose that X1, X2, ..., Xn is drawn from a zero-mean Gaussian ARMA(p,q) process. The likelihood of parameters φ ∈ R^p, θ ∈ R^q, σw² ∈ R+ is defined as the density of X = (X1, X2, ..., Xn)′ under the Gaussian model with those parameters:
   L(φ, θ, σw²) = (2π)^{−n/2} |Γn|^{−1/2} exp( −(1/2) X′ Γn^{−1} X ),
where |A| denotes the determinant of a matrix A, and Γn is the variance/covariance matrix of X with the given parameter values.
(21) Maximum likelihood estimation
The MLE (φ̂, θ̂, σ̂w²) satisfies
   σ̂w² = S(φ̂, θ̂) / n,
and φ̂, θ̂ minimize
   log( S(φ̂, θ̂) / n ) + (1/n) Σ_{i=1}^{n} log r_i^{i−1},
where r_i^{i−1} = P_i^{i−1} / σw² and
   S(φ, θ) = Σ_{i=1}^{n} (Xi − X_i^{i−1})² / r_i^{i−1}.
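In practice one rarely codes this by hand; statsmodels maximizes the Gaussian likelihood via the innovations/state-space form. A sketch reusing the series x simulated in the Yule-Walker example (the AR(2) order is an assumption):

```python
from statsmodels.tsa.arima.model import ARIMA

# Gaussian MLE for an AR(2) fit to the series x simulated above;
# trend="n" matches the zero-mean assumption on the slide.
res = ARIMA(x, order=(2, 0, 0), trend="n").fit()
print(res.params)   # φ̂1, φ̂2 and σ̂²_w
print(res.llf)      # maximized log-likelihood
```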
(22) Integrated ARMA Models: ARIMA(p,d,q)
For p, d, q ≥ 0, we say that a time series {Xt} is an ARIMA(p,d,q) process if Yt = ∇^d Xt = (1 − B)^d Xt is ARMA(p,q). We can write
   φ(B)(1 − B)^d Xt = θ(B)Wt.
(23) Multiplicative seasonal ARIMA Models
For p, q, P, Q ≥ 0, s > 0, d, D > 0, we say that a time series {Xt} is a multiplicative seasonal ARIMA model (ARIMA(p,d,q)×(P,D,Q)s) if
   Φ(B^s) φ(B) ∇_s^D ∇^d Xt = Θ(B^s) θ(B) Wt,
where the seasonal difference operator of order D is defined by
   ∇_s^D Xt = (1 − B^s)^D Xt.
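Fitting is the same one-liner with a seasonal order added; a sketch (the orders and s = 12 are illustrative, and `y` stands for any observed series):

```python
from statsmodels.tsa.arima.model import ARIMA

# ARIMA(1,1,1)×(0,1,1)_12: statsmodels applies ∇ and ∇_12 internally.
# y: an observed (e.g. monthly) series, assumed to exist.
res = ARIMA(y, order=(1, 1, 1), seasonal_order=(0, 1, 1, 12)).fit()
```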
(24) Introduction to Time Series Analysis: Review
1 Time series modelling
2 Time domain
   (a) Concepts of stationarity, ACF
   (b) Linear processes, causality, invertibility
   (c) ARMA models, forecasting, estimation
   (d) ARIMA, seasonal ARIMA models
3 Frequency domain
   (a) Spectral density
   (b) Linear filters, frequency response
(25) Spectral density and spectral distribution function
If {Xt} has Σ_{h=−∞}^{∞} |γx(h)| < ∞, then we define its spectral density as
   f(ν) = Σ_{h=−∞}^{∞} γ(h) e^{−2πiνh}
for −∞ < ν < ∞. We have
   γ(h) = ∫_{−1/2}^{1/2} e^{2πiνh} f(ν) dν = ∫_{−1/2}^{1/2} e^{2πiνh} dF(ν),
where F is the spectral distribution function.
(26) Frequency response of a linear filter
If {Xt} has spectral density fx(ν) and the coefficients of the time-invariant linear filter ψ are absolutely summable, then Yt = ψ(B)Xt has spectral density
   fy(ν) = |ψ(e^{−2πiν})|² fx(ν).
(27) Sample autocovariance
The sample autocovariance γ̂(·) can be used to give an estimate of the spectral density,
   f̂(ν) = Σ_{h=−n+1}^{n−1} γ̂(h) e^{−2πiνh}
for −1/2 ≤ ν ≤ 1/2.
(28) Periodogram
The periodogram is defined as
   I(ν) = |X(ν)|² = (1/n) | Σ_{t=1}^{n} e^{−2πitν} xt |² = Xc²(ν) + Xs²(ν),
where
   Xc(ν) = (1/√n) Σ_{t=1}^{n} cos(2πtν) xt,
   Xs(ν) = (1/√n) Σ_{t=1}^{n} sin(2πtν) xt.
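The periodogram at the Fourier frequencies νj = j/n is one FFT; a sketch matching the definition above (mean-centering is a common practical choice, not part of the definition):

```python
import numpy as np

def periodogram(x):
    # I(ν_j) = (1/n) |Σ_t e^{-2πitν_j} x_t|² at ν_j = j/n, j = 0, ..., n-1.
    x = np.asarray(x, dtype=float)
    n = x.size
    dft = np.fft.fft(x - x.mean()) / np.sqrt(n)   # scaled DFT
    I = np.abs(dft) ** 2
    freqs = np.arange(n) / n
    return freqs[: n // 2], I[: n // 2]           # keep 0 <= ν < 1/2
```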
(29) Asymptotic properties of the periodogram
Under general conditions (e.g., Gaussian, or linear process with rapidly decaying ACF), the Xc(νj), Xs(νj) are all asymptotically independent and N(0, f(νj)/2), and f(ν̂(n)) → f(ν), where ν̂(n) is the closest Fourier frequency (of the form k/n) to the frequency ν.
In that case, we have
   (2 / f(ν)) I(ν̂(n)) = (2 / f(ν)) ( Xc²(ν̂(n)) + Xs²(ν̂(n)) ) →d χ²_2.
(30) Smoothed periodogram
If f(ν) is approximately constant in the band [νk − L/(2n), νk + L/(2n)], the average of the periodogram over the band will be unbiased:
   f̂(νk) = (1/L) Σ_{l=−(L−1)/2}^{(L−1)/2} I(νk − l/n)
          = (1/L) Σ_{l=−(L−1)/2}^{(L−1)/2} ( Xc²(νk − l/n) + Xs²(νk − l/n) ).
(31) Smoothed spectral estimators
   f̂(ν) = Σ_{|j|≤Ln} Wn(j) I(ν̂(n) − j/n),
where the spectral window function satisfies Ln → ∞, Ln/n → 0, Wn(j) ≥ 0, Wn(j) = Wn(−j), Σ_j Wn(j) = 1, and Σ_j Wn²(j) → 0.
Then f̂(ν) → f(ν) (in the mean square sense), and asymptotically
   f̂(νk) ∼ (f(νk) / d) χ²_d,   where d = 2 / Σ_j Wn²(j) is the equivalent degrees of freedom.
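For the flat (Daniell) window Wn(j) = 1/L this is just a moving average of the periodogram; a sketch building on the `periodogram` helper above (L = 9 is a tuning choice, not from the slides):

```python
import numpy as np

def smoothed_periodogram(x, L=9):
    # Daniell window: average I over L neighbouring Fourier frequencies.
    freqs, I = periodogram(x)            # helper from the earlier sketch
    W = np.ones(L) / L                   # Wn(j) = 1/L for |j| <= (L-1)/2
    return freqs, np.convolve(I, W, mode="same")
```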
(32) Parametric spectral density estimation
Given data x1, x2, ..., xn,
1 Estimate the AR parameters φ1, ..., φp, σw².
2 Use the estimates φ̂1, ..., φ̂p, σ̂w² to compute the estimated spectral density (see the sketch below):
   f̂y(ν) = σ̂w² / |φ̂(e^{−2πiν})|².
(33) Parametric spectral density estimation
For large n,
   Var(f̂(ν)) ≈ (2p/n) f²(ν).
Notice the bias-variance trade-off.
Advantage over nonparametric methods: better frequency resolution of a small number of peaks. This is especially important if there is more than one peak at nearby frequencies.
(34) Lagged regression models
Consider a lagged regression model of the form
   Yt = Σ_{h=−∞}^{∞} βh X_{t−h} + Vt,
where Xt is an observed input time series, Yt is the observed output time series, and Vt is a stationary noise process. This is useful for
• Identifying the (best linear) relationship between two time series.
• Forecasting one time series from the other.
(35) Lagged regression in the time domain
   Yt = α(B)Xt + ηt = Σ_{j=0}^{∞} αj X_{t−j} + ηt.
1 Fit an ARMA model (with θx(B), φx(B)) to the input series {Xt}.
2 Prewhiten the input series by applying the inverse operator φx(B)/θx(B).
3 Calculate the cross-correlation γ_{ỹ,w}(h) of Ỹt with Wt, to give an indication of the behavior of α(B) (for instance, the delay) — see the sketch below.
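A minimal end-to-end sketch of this recipe, assuming a pure AR(1) input so the inverse operator is just φx(B); the data, orders, and the delay of 3 are all illustrative:

```python
import numpy as np
from scipy.signal import lfilter
from statsmodels.tsa.arima.model import ARIMA

# Illustrative data: AR(1) input x, and y responding to x with a delay of 3.
rng = np.random.default_rng(3)
n = 400
x = lfilter([1.0], [1.0, -0.7], rng.normal(size=n))
y = 1.5 * np.r_[np.zeros(3), x[:-3]] + rng.normal(size=n)

# Step 1: fit an AR model to the input series.
res = ARIMA(x, order=(1, 0, 0), trend="n").fit()
phi_poly = np.r_[1.0, -res.arparams]           # φ_x(B), with its leading 1

# Step 2: prewhiten, applying the same operator to both series.
w = lfilter(phi_poly, [1.0], x)                # W_t = φ_x(B) X_t
y_t = lfilter(phi_poly, [1.0], y)              # Ỹ_t = φ_x(B) Y_t

# Step 3: cross-correlation of Ỹ_{t+h} with W_t; it should peak near h = 3.
K = 10
ccf = [np.corrcoef(y_t[max(0, h): n - max(0, -h)],
                   w[max(0, -h): n - max(0, h)])[0, 1]
       for h in range(-K, K + 1)]
delay = int(np.argmax(np.abs(ccf))) - K        # estimated delay
```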
(36) Coherence
Define the cross-spectrum and the squared coherence function:
   fxy(ν) = Σ_{h=−∞}^{∞} γxy(h) e^{−2πiνh},
   γxy(h) = ∫_{−1/2}^{1/2} fxy(ν) e^{2πiνh} dν,
   ρ²_{y,x}(ν) = |fyx(ν)|² / ( fx(ν) fy(ν) ).
(37) Lagged regression models in the frequency domain
   Yt = Σ_{j=−∞}^{∞} βj X_{t−j} + Vt.
We compute the Fourier transform of the series {βj} in terms of the cross-spectral density and the spectral density:
   B(ν) fx(ν) = fyx(ν),
and the mean squared error of the best such regression is
   MSE = ∫_{−1/2}^{1/2} fy(ν) ( 1 − ρ²_{yx}(ν) ) dν.
(38) Introduction to Time Series Analysis: Review
1 Time series modelling
2 Time domain
   (a) Concepts of stationarity, ACF
   (b) Linear processes, causality, invertibility
   (c) ARMA models, forecasting, estimation
   (d) ARIMA, seasonal ARIMA models
3 Frequency domain
   (a) Spectral density
   (b) Linear filters, frequency response