5. Measuring Model Risk in the European Energy

5.4 Model Setting and Estimation

Although VaR analysis is usually based on returns, we are forced to work with simple price differences, because electricity price series can exhibit zero or even negative values, for which returns are undefined or meaningless. Therefore, we consider the series of daily electricity spot prices $(S_t)$ and denote the price change between day $t-1$ and day $t$ as

$$X_t = S_t - S_{t-1}.$$
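To make the point concrete, the following minimal Python sketch contrasts price differences with log returns. The price values and the function names are illustrative assumptions, not data or code from the study.

```python
import math


def price_changes(prices):
    """First differences: change between consecutive daily prices."""
    return [curr - prev for prev, curr in zip(prices, prices[1:])]


def log_returns(prices):
    """Log returns; only defined when every price is strictly positive."""
    if any(p <= 0 for p in prices):
        raise ValueError("log returns are undefined for zero or negative prices")
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# Hypothetical daily prices (EUR/MWh) including a zero and a negative value.
prices = [35.0, 12.5, 0.0, -4.0, 21.0]
changes = price_changes(prices)  # always well defined

try:
    log_returns(prices)
    returns_defined = True
except ValueError:
    returns_defined = False  # this branch is taken for the prices above
```

Price differences remain well defined for every day in the sample, whereas the return series breaks down as soon as a single price is zero or negative.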

At date $t$, we need to predict a (conditional) VaR for $X_{t+1}$. More precisely, given $\alpha \in (0, 1)$, we are interested in computing

$$\mathrm{VaR}_{\alpha}(X_{t+1} \mid \mathcal{F}_t),$$

where $\mathcal{F}_t$ is the information available at time $t$. We do so by means of GARCH-type models, which are considered a standard approach to VaR estimation and in which only the conditional mean and standard deviation of returns are estimated. The original GARCH models assumed normal conditional distributions; more recently, these models have been extended to account for skewness and excess kurtosis, hence supporting more complex distributions. The additional estimated parameters, however, are not time-varying, in contrast to the dynamics of the conditional mean and variance.

We have also decided not to include relations with fundamental drivers or additional factors affecting electricity prices, in order to provide an overall quantification of model risk. This figure will therefore naturally include all other forms of risk that influence electricity prices, such as those related to regulatory, structural, and fundamental factors.

Taking advantage of the long history of prices available from 2001 to 2013, we investigate the “time evolution” of the measure of model risk across years by adopting a rolling window procedure, as explained in the following sections.

5.4.1 The GARCH Methodology

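Recall that we consider a particular set of parametric models within the GARCH framework, namely the AR(5)-GARCH(1,1) specification. For any $t$ we have

$$X_t = \mu_t + \sigma_t Z_t, \tag{5.3}$$

where

$$\mu_t = \bar{\mu} + \sum_{i=1}^{5} \phi_i \,(X_{t-i} - \bar{\mu}) \tag{5.4}$$

is the conditional mean expressed as an AR(5) process, and

$$\sigma_t^2 = \omega + \alpha\,(X_{t-1} - \mu_{t-1})^2 + \beta\,\sigma_{t-1}^2 \tag{5.5}$$

is the conditional variance in the form of a GARCH(1,1) model. The parameters $\bar{\mu}$, $\phi_i$, $\omega$, $\alpha$ and $\beta$ have to satisfy the usual constraints (for instance, $\omega > 0$ and $\alpha, \beta \geq 0$, which keep the conditional variance positive). Finally, the innovations $Z_t$ form an IID sequence with a common standard (i.e. mean 0 and variance 1) distribution. As a consequence, $\mu_t$ and $\sigma_t^2$ are the conditional (on $\mathcal{F}_{t-1}$) mean and variance of $X_t$, respectively.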
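The recursions behind an AR(5)-GARCH(1,1) model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the conditional mean is written around a constant `mu_bar`, the variance recursion is started from a user-supplied `sigma2_init`, and all names and numerical values are hypothetical rather than taken from the study.

```python
def ar5_garch11_filter(x, mu_bar, phi, omega, alpha, beta, sigma2_init):
    """Run the AR(p)-GARCH(1,1) recursions over the observations x.

    Returns two lists (mu, sigma2) with one entry per 0-based index
    t = p, ..., len(x), where p = len(phi). The last entry is the
    one-step-ahead forecast for the next, not yet observed, value.
    sigma2_init is an assumed starting value for the variance recursion.
    """
    p = len(phi)
    mu, sigma2 = [], []
    for t in range(p, len(x) + 1):
        # Conditional mean: autoregression on the p previous observations.
        m = mu_bar + sum(phi[i] * (x[t - 1 - i] - mu_bar) for i in range(p))
        if not sigma2:
            s2 = sigma2_init
        else:
            # Conditional variance: previous squared residual and variance.
            s2 = omega + alpha * (x[t - 1] - mu[-1]) ** 2 + beta * sigma2[-1]
        mu.append(m)
        sigma2.append(s2)
    return mu, sigma2


# Hypothetical inputs: six price changes and arbitrary parameter values.
x = [0.0, 0.0, 0.0, 0.0, 0.0, 2.0]
mu, sigma2 = ar5_garch11_filter(
    x, mu_bar=0.0, phi=[0.0] * 5, omega=1.0, alpha=0.1, beta=0.8, sigma2_init=5.0
)
```

In this toy run the last observation is a shock of size 2, so the forecast variance moves from the starting value 5 to 1 + 0.1 · 4 + 0.8 · 5 = 5.4, while the forecast mean stays at zero because all autoregressive coefficients are set to zero.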

Parameters are estimated by ML, i.e. by maximizing the joint likelihood of the variables $(X_t)$ on a given data window. We stress that the form of the likelihood depends on the distribution $F_Z$ of the innovations. As a consequence, different sets of parameters are retrieved using different hypotheses on $F_Z$, even if the same data window is used for the estimation process.
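The dependence of the likelihood on the innovation distribution can be seen in a short Python sketch. Given conditional means and variances, the conditional density of each observation is the innovation density evaluated at the standardized residual, divided by the conditional standard deviation. The function names and the numerical inputs are assumptions made for illustration; only the standard normal case is implemented, and any other standardized log-density could be passed in its place.

```python
import math


def normal_logpdf(z):
    """Log-density of the standard normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi) + z * z)


def log_likelihood(x, mu, sigma2, innovation_logpdf):
    """Sum over t of log f_Z((x_t - mu_t) / sigma_t) - log(sigma_t)."""
    total = 0.0
    for x_t, mu_t, s2_t in zip(x, mu, sigma2):
        sigma_t = math.sqrt(s2_t)
        total += innovation_logpdf((x_t - mu_t) / sigma_t) - math.log(sigma_t)
    return total


# Hypothetical values: two observations with conditional mean 0 and variance 4.
ll = log_likelihood([2.0, -2.0], [0.0, 0.0], [4.0, 4.0], normal_logpdf)
```

Swapping `normal_logpdf` for another standardized log-density changes the objective function, and therefore the maximizing parameters, even when the data window is the same.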

A technical remark is in order. The joint distribution of a process $(X_t)$ following an AR(5)-GARCH(1,1) model as in (5.3), (5.4), and (5.5) is completely specified only when the common standard distribution of the innovations, $F_Z$, and the parameters ($\bar{\mu}$, $\phi_i$, etc.) are fixed. We can associate a probability measure $Q$ to each complete specification, such that the process $(X_t)$ has the given joint distribution under $Q$. Moreover, since the estimation procedure (Maximum Likelihood on a given data window), and hence the parameters, are completely determined once the distribution is fixed, we can safely associate a completely specified AR(5)-GARCH(1,1) model, and hence a probability measure $Q$ on the underlying measurable space, to any possible innovations distribution $F_Z$. As a consequence, in what follows a “model” $Q$ will be given directly in terms of the distribution of $Z$.

The following alternative distributions for $Z$ are considered in the present study⁴:

the normal (NO);

the Student-t, skew Student-t and skew normal (ST, SST and SN);

the generalized and skew generalized error distributions (GED and SGED);

Johnson's SU (JSU);

the normal inverse Gaussian (NIG);

the generalized hyperbolic and generalized hyperbolic Student-t (GHYP and GHST).

For all the listed distributions, the standard versions (i.e. the ones having mean 0 and variance 1) are considered. Hence, we consider the following finite set of models

$$\mathcal{Q} = \{Q_0, Q_1, \ldots, Q_9\},$$

containing the reference one, $Q_0$, to be specified below, and the other $Q$s associated with the remaining distributions for $Z$ listed above.

We believe that the alternative models considered in this analysis constitute a sufficiently representative range of distributions that are often used in practice. In particular, we consider distributions that differ from the normal because they account for asymmetry and/or thick tails, two important features for proper risk management. Moreover, we expect that including additional realistic distributions would not materially affect our results, although this remains a conjecture that we have not verified empirically.

Note that the listed distributions may depend on additional parameters: for instance, the Student-t distribution depends on the number of degrees of freedom. These parameters⁵ are also estimated through Maximum Likelihood on a common data window. As a consequence, once the parametric class of distributions (ST, SST, etc.) is chosen, the additional parameters are implicitly determined. In other words, a “model” $Q$ corresponds to the unique standard distribution in a parametric class (ST, SST, etc.), with additional parameters determined by ML estimation. This “identification” will be tacitly understood in what follows, particularly when presenting the estimation procedure in Sect. 5.4.2.
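As an illustration of how an additional shape parameter is pinned down by ML, the sketch below uses the Student-t case. It assumes the usual unit-variance rescaling of the Student-t density (valid for more than 2 degrees of freedom) and replaces a proper numerical optimizer with a simple grid search; the residuals, the grid and the function names are hypothetical.

```python
import math


def std_t_logpdf(z, nu):
    """Log-density of a Student-t with nu > 2 degrees of freedom,
    rescaled so that it has mean 0 and variance 1."""
    if nu <= 2:
        raise ValueError("a unit-variance Student-t requires nu > 2")
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(math.pi * (nu - 2)))
    return log_norm - 0.5 * (nu + 1) * math.log1p(z * z / (nu - 2))


def t_log_likelihood(z_values, nu):
    """Log-likelihood of standardized residuals for a given nu."""
    return sum(std_t_logpdf(z, nu) for z in z_values)


def fit_degrees_of_freedom(z_values, grid):
    """Grid-search ML: the nu in the grid with the highest log-likelihood."""
    return max(grid, key=lambda nu: t_log_likelihood(z_values, nu))


# Hypothetical standardized residuals with two large outliers.
z_values = [0.1, -0.3, 0.2, -0.1, 0.4, -0.2, 3.5, -4.0]
grid = [2.5, 3, 4, 5, 8, 12, 20, 50]
nu_hat = fit_degrees_of_freedom(z_values, grid)
```

Once the family is fixed, the data window determines the degrees of freedom, so naming the family is enough to identify the model. In the actual estimation the shape parameters are estimated jointly with the GARCH parameters rather than on fixed residuals as done here.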

Making use of the GARCH models and of the basic properties of VaR, for each model $Q$ we have

$$\mathrm{VaR}_{Q,\alpha}(X_{t+1} \mid \mathcal{F}_t) = -\mu_{t+1} - \sigma_{t+1}\, q_{\alpha}(Z).$$

Notice that the parameters $\bar{\mu}$, $\phi_i$, $\omega$, $\alpha$ and $\beta$ enter the preceding equation through the quantities $\mu_{t+1}$ and $\sigma_{t+1}$, while the distribution of $Z$, completely specified by a parametric class as explained above, affects the quantile $q_{\alpha}(Z)$.

Summary results of the estimates for the AR(5)-GARCH(1,1) models, using the entire time series of 3391 observations and under the different innovations distributions, are reported in Table 5.2. We first observe that the unconditional mean, $\bar{\mu}$, is never significant, as expected from the descriptive statistics of price changes. Second, the autoregressive structure is an important feature to include, as is the heteroskedasticity. By contrast, the estimates for skewness (ν) and kurtosis (τ) switch between significant and non-significant depending on the distribution considered, and they are clearly not significant when the generalized hyperbolic (GHYP) distribution is employed. Turning to the ability of each model to fit the data, we observe, unsurprisingly, that the worst-fitting models are those with the normal (NO) and the skew normal (SN) distributions, for which all information criteria take their highest values. By contrast, the best-fitting model is the one with the Generalized Error Distribution (GED), for which all information criteria take their lowest values.

As a robustness check, we estimated two variants of the AR(5)-GARCH(1,1) model, in order to take into account two possible facts: (i) electricity price levels may be affected by past price variability, that is, today's prices are affected by yesterday's price movements, reflecting possible problems of “scarcity” in the system (problems related to unavailable capacity, outages, or excess demand, among others); and (ii) volatility in electricity prices is generally stronger when prices are high. We have therefore verified (i) by including the standard deviation, as obtained from the conditional variance equation, in the conditional mean equation, hence formulating a GARCH-in-mean specification and estimating the AR(5)-GARCH-M(1,1) models defined by (5.3), with (5.4) and (5.5) replaced by

$$\mu_t = \bar{\mu} + \sum_{i=1}^{5} \phi_i \,(X_{t-i} - \bar{\mu}) + \delta\,\sigma_t$$

Table 5.2 Estimates of the AR(5)-GARCH(1,1) models. Note that ν and τ are the parameters related to skewness and kurtosis, respectively, whereas λ is the third parameter of the GHYP distribution. ∗∗∗, ∗∗ and ∗ denote significance at the 1%, 5% and 10% levels, respectively. AIC, SBC and HQ represent the Akaike, Bayes and Hannan-Quinn information criteria

| | NO | SN | ST | SST | GED | SGED | JSU | NIG | GHYP | GHST |
|---|---|---|---|---|---|---|---|---|---|---|
| $\bar{\mu}$ | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| ϕ1 | −0.445∗∗∗ | −0.571∗∗∗ | −0.392∗∗∗ | −0.391∗∗∗ | −0.407∗∗∗ | −0.434∗∗∗ | −0.413∗∗∗ | −0.406∗∗∗ | −0.416∗∗∗ | −0.431∗∗∗ |
| ϕ2 | −0.343∗∗∗ | −0.352∗∗ | −0.274∗∗∗ | −0.271∗∗∗ | −0.260∗∗∗ | −0.301∗∗∗ | −0.324∗∗∗ | −0.293∗∗∗ | −0.269∗∗∗ | −0.312∗∗∗ |
| ϕ3 | −0.227∗∗ | −0.261∗∗ | −0.114∗∗∗ | −0.116∗∗∗ | −0.184∗∗ | −0.220∗∗∗ | −0.220∗∗∗ | −0.196∗∗∗ | −0.153∗∗∗ | −0.138∗∗∗ |
| ϕ4 | −0.228∗∗ | −0.316∗∗ | −0.111∗∗∗ | −0.096∗∗∗ | −0.111 | −0.126∗∗ | −0.112∗∗∗ | −0.121∗∗∗ | −0.078∗∗ | −0.160∗∗∗ |
| ϕ5 | −0.102∗ | −0.233∗∗ | −0.075∗∗∗ | −0.083∗∗∗ | −0.079 | −0.049 | −0.076∗∗ | −0.094∗∗∗ | −0.075∗∗ | −0.074∗∗∗ |
| ω | 0.662∗ | 0.162 | 0.152 | 0.151 | 0.186 | 0.151 | 0.147 | 0.150 | 0.152 | 0.161 |
| α | 0.128∗∗∗ | 0.084∗∗∗ | 0.063∗∗ | 0.062∗ | 0.100∗∗∗ | 0.063 | 0.057 | 0.060∗ | 0.063∗∗ | 0.075∗∗∗ |
| β | 0.869∗∗∗ | 0.892∗∗∗ | 0.901∗∗∗ | 0.901∗∗∗ | 0.886∗∗∗ | 0.899∗∗∗ | 0.901∗∗ | 0.901∗∗∗ | 0.900∗∗∗ | 0.897∗∗∗ |
| ν | | 1.028∗∗∗ | | 0.960∗∗∗ | | 0.994∗∗∗ | −0.045 | 0.005 | −0.009 | −0.895∗ |
| τ | | | 4.207∗∗∗ | 4.214∗∗∗ | 1.045∗∗∗ | 0.972∗∗∗ | 1.385∗∗ | 0.452∗∗∗ | 1.841 | 8.197∗∗∗ |
| λ | | | | | | | | | −1.273∗∗∗ | |
| AIC | 7.0237 | 7.2328 | 6.6683 | 6.6723 | 6.6353 | 6.7163 | 6.6897 | 6.7191 | 6.7099 | 6.7090 |
| SBC | 7.0400 | 7.2509 | 6.6864 | 6.6923 | 6.6534 | 6.7362 | 6.7096 | 6.7390 | 6.7316 | 6.7289 |
| HQ | 7.0295 | 7.2392 | 6.6748 | 6.6795 | 6.6417 | 6.7234 | 6.6968 | 6.7262 | 6.7176 | 6.7162 |

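where δ is an additional parameter, and

$$\sigma_t^2 = \omega + \alpha\,(X_{t-1} - \mu_{t-1})^2 + \beta\,\sigma_{t-1}^2,$$

in which the residual $X_{t-1} - \mu_{t-1}$ is computed from the GARCH-in-mean conditional mean.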

We have also tested (ii) by including the range of the 24 hourly prices observed on the previous day as an explanatory variable in the conditional variance equation of the GARCH models. The price range is given by

$$PR_{t-1} = \max_h S_{t-1,h} - \min_h S_{t-1,h},$$

where $S_{t-1,h}$ denotes the electricity price observed in hour $h = 1, \ldots, 24$ of day $t-1$. We therefore considered an AR(5)-GARCHX(1,1) model, defined by (5.3) and (5.4), with (5.5) replaced by

$$\sigma_t^2 = \omega + \alpha\,(X_{t-1} - \mu_{t-1})^2 + \beta\,\sigma_{t-1}^2 + \gamma\, PR_{t-1},$$

where $\gamma$ is an additional parameter.
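The two robustness variants amount to one extra term each, as the following Python sketch shows. It is illustrative only: the names `delta` and `gamma` for the extra coefficients, the helper names and all numerical values are assumptions, and the price range is computed from a made-up day of 24 hourly prices.

```python
def price_range(hourly_prices):
    """Daily price range: highest minus lowest of the 24 hourly prices."""
    if len(hourly_prices) != 24:
        raise ValueError("expected 24 hourly prices")
    return max(hourly_prices) - min(hourly_prices)


def garch_m_mean(ar_mean, delta, sigma_t):
    """GARCH-in-mean: AR part plus delta times the conditional std deviation."""
    return ar_mean + delta * sigma_t


def garchx_variance(omega, alpha, beta, gamma, residual_prev, sigma2_prev, pr_prev):
    """GARCHX: GARCH(1,1) update plus gamma times yesterday's price range."""
    return omega + alpha * residual_prev ** 2 + beta * sigma2_prev + gamma * pr_prev


# Hypothetical hourly prices for one day (EUR/MWh): 20, 21, ..., 43.
hourly = [20 + h for h in range(24)]
pr = price_range(hourly)

# With gamma = 0 the GARCHX update collapses to the plain GARCH(1,1) update.
plain = garchx_variance(1.0, 0.1, 0.8, 0.0, residual_prev=2.0, sigma2_prev=5.0, pr_prev=pr)
with_range = garchx_variance(1.0, 0.1, 0.8, 0.01, residual_prev=2.0, sigma2_prev=5.0, pr_prev=pr)
```

Setting the extra coefficient to zero recovers the baseline model in both cases, which is why a coefficient that is statistically indistinguishable from zero means the corresponding variant adds nothing to the baseline specification.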

The estimated coefficients for the two facts investigated are reported in Table 5.3; complete results are available on request. These results clearly show that neither effect is significant, so both can be neglected in the following implementation of our models.

Table 5.3 ML estimates of the coefficients δ and γ (with p-values in brackets), for the case in which price levels are affected by past price volatility (AR(5)-GARCH-M(1,1) model, coefficient δ) and for the case in which price volatility is affected by high past prices (AR(5)-GARCHX(1,1) model, price-range coefficient γ). Note that ∗ indicates that convergence was obtained with less strict tolerance criteria than those used for the other models

| Innovations | δ (p-value) | γ (p-value) |
|---|---|---|
| NO | −0.014 (0.286) | 0.000 (1.000) |
| SN | 0.003 (0.814) | 0.000 (0.999) |
| ST | 0.002 (0.914) | 0.000 (1.000) |
| SST | −0.005 (0.805) | 0.000 (1.000) |
| GED | −0.003 (0.368) | 0.000 (1.000) |
| SGED | −0.001∗ (0.281) | 0.000 (1.000) |
| JSU | −0.004 (0.853) | 0.000 (0.999) |
| NIG | 0.001 (0.970) | 0.000 (0.999) |
| GHYP | 0.003 (0.894) | 0.000 (1.000) |
| GHST | 0.003 (0.874) | 0.000 (1.000) |
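A brief aside on the information criteria of Table 5.2: their magnitude suggests that they are expressed per observation. The Python sketch below assumes that normalisation (each criterion divided by the sample size); the function name is hypothetical and the log-likelihood value is a placeholder, not a figure from the study. The differences between criteria do not depend on the log-likelihood, so they can be checked against the table directly.

```python
import math


def information_criteria(log_lik, n_params, n_obs):
    """Per-observation AIC, SBC and HQ (assumed normalisation by n_obs)."""
    base = -2.0 * log_lik / n_obs
    aic = base + 2.0 * n_params / n_obs
    sbc = base + n_params * math.log(n_obs) / n_obs
    hq = base + 2.0 * n_params * math.log(math.log(n_obs)) / n_obs
    return aic, sbc, hq


# Normal innovations: 9 parameters (constant, five AR terms, omega, alpha, beta).
# The log-likelihood is a placeholder; only the differences are of interest.
aic, sbc, hq = information_criteria(log_lik=-10000.0, n_params=9, n_obs=3391)
sbc_minus_aic = round(sbc - aic, 4)
hq_minus_aic = round(hq - aic, 4)
```

Under these assumptions the gaps come out at about 0.0163 and 0.0058, in line with 7.0400 − 7.0237 and 7.0295 − 7.0237 in the NO column of Table 5.2. Because the penalty grows with the number of parameters, the richer distributions have to improve the likelihood enough to justify their extra shape parameters.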

5.4.2 Dynamic Model Risk Quantification

Once $Q_0$ and $\mathcal{Q}$ have been specified (up to additional parameters, as explained before), a rolling window procedure is adopted recursively to investigate the time evolution of the relative measure of model risk. In particular, for any model for the innovations, at day $t$ we estimate the parameters of the GARCH model by ML, using a rolling window of the past 260 days. We are then able to forecast the (conditional) distribution of $X_{t+1}$ and to retrieve $\mathrm{VaR}_{Q,\alpha}$ for both values of α, namely 1% and 5%. Finally, we compute the RMMR measure at any date and for both levels of α.
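The step from a forecast distribution to a VaR figure can be sketched as follows for normal innovations. The sketch assumes the sign convention in which VaR is minus the α-quantile of the forecast distribution of the price change, so that a larger number means a larger potential loss; the forecast values and the function name are hypothetical.

```python
from statistics import NormalDist


def conditional_var(mu_next, sigma_next, q_alpha):
    """VaR as minus the alpha-quantile of mu_next + sigma_next * Z
    (assumed sign convention)."""
    return -(mu_next + sigma_next * q_alpha)


std_normal = NormalDist()  # standard normal innovations

# Hypothetical one-step-ahead forecasts of the conditional mean and std deviation.
mu_next, sigma_next = 0.5, 8.0
var_1pct = conditional_var(mu_next, sigma_next, std_normal.inv_cdf(0.01))
var_5pct = conditional_var(mu_next, sigma_next, std_normal.inv_cdf(0.05))
```

For another innovation family only the quantile changes, while the forecast mean and standard deviation change through the re-estimated parameters. This is the channel through which the choice of distribution feeds into the VaR and hence into the measure of model risk.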

Explicitly, the step-by-step procedure is as follows:

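1. start at day $t = 265$;

2. consider the estimation window $W_t = \{t-259, \ldots, t\}$; note that when $t = 265$, $W_{265} = \{6, \ldots, 265\}$, as the first 5 observations have to be excluded from the estimation step, due to the AR(5) part of the model;

3. for any model $Q \in \mathcal{Q}$ for the innovations (including the reference one, $Q_0$), estimate the parameters of the AR(5)-GARCH(1,1) model for $X_{t+1}$, specified as in Eqs. (5.3), (5.4), and (5.5), including the additional parameters of the innovation distribution (e.g. the degrees of freedom in the Student-t model), using ML on the estimation window $W_t$. We perform this step using the software R with the “rugarch” package⁶;

4. having the (conditional) distribution $F_{t+1,Q}$ of $X_{t+1}$ under $Q$, compute $\mathrm{VaR}_{Q,\alpha}(X_{t+1} \mid \mathcal{F}_t)$ for α = 1% and α = 5%, and then

$$\mathrm{RMMR}_{\alpha,t+1} = \frac{\overline{\mathrm{VaR}}_{\alpha,t+1} - \mathrm{VaR}_{Q_0,\alpha}(X_{t+1} \mid \mathcal{F}_t)}{\overline{\mathrm{VaR}}_{\alpha,t+1} - \underline{\mathrm{VaR}}_{\alpha,t+1}},$$

where

$$\overline{\mathrm{VaR}}_{\alpha,t+1} = \max_{Q \in \mathcal{Q}} \mathrm{VaR}_{Q,\alpha}(X_{t+1} \mid \mathcal{F}_t), \qquad \underline{\mathrm{VaR}}_{\alpha,t+1} = \min_{Q \in \mathcal{Q}} \mathrm{VaR}_{Q,\alpha}(X_{t+1} \mid \mathcal{F}_t);$$

5. increment $t$ by 1 day and go back to step 2: in particular, we have $t = 266$ and $W_{266} = \{7, \ldots, 266\}$ at the second iteration; and so on.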
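The bookkeeping of the steps above can be checked with a short Python sketch that only enumerates the estimation windows, leaving out the estimation itself (which the study carries out in R). Day indices are 1-based as in the text, and the function name and defaults are assumptions made for illustration.

```python
def rolling_windows(n_obs, window=260, n_lags=5):
    """Yield (t, first_day, last_day) for each estimation date t.

    Days are numbered from 1. The first date is t = window + n_lags, so that
    the n_lags observations needed by the AR part precede the first window,
    and the last date is n_obs - 1, so that a forecast target t + 1 exists.
    """
    for t in range(window + n_lags, n_obs):
        yield t, t - window + 1, t


windows = list(rolling_windows(n_obs=3391))
first, second, last = windows[0], windows[1], windows[-1]
n_forecasts = len(windows)
```

With 3391 price changes the loop starts at day 265 with the window running from day 6 to day 265 and stops at day 3390, whose forecast target is day 3391. Each window contains 260 observations.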

Recalling that the entire time series at our disposal has 3391 (= 3392 − 1) observations of price changes, the procedure outlined above yields two series of 3126 measures of model risk (with respect to $Q_0$ and $\mathcal{Q}$):

$$(\mathrm{RMMR}_{\alpha,t})_{t = 266, \ldots, 3391}, \qquad \alpha = 1\%,\ 5\%.$$

Notice that the actual distribution associated with $Q_0$ (or with any other $Q \in \mathcal{Q}$) will change from one day to another, due to different estimates of the GARCH parameters and of the additional parameters of the innovations. As a consequence, any series $(\mathrm{RMMR}_{\alpha,t})_{t = 266, \ldots, 3391}$ will in fact describe the relative model risk of sticking, day by day, to a given parametric family for the innovations, instead of choosing one of the 9 alternative families.
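For a single date, the computation of the relative measure can be sketched in Python. The sketch assumes that the measure defined earlier in the chapter takes the form (largest VaR minus reference VaR) divided by (largest VaR minus smallest VaR), with the extremes taken over all models including the reference one. The VaR figures are invented for illustration, and returning 0 when all models agree is a convention chosen here, not one stated in the text.

```python
def rmmr(var_by_model, reference):
    """Relative measure of model risk for one date and one level alpha.

    var_by_model maps a model label to its VaR forecast; reference is the
    label of the reference model. Assumed form: (max - ref) / (max - min).
    """
    highest = max(var_by_model.values())
    lowest = min(var_by_model.values())
    if highest == lowest:
        return 0.0  # all models agree (convention adopted in this sketch)
    return (highest - var_by_model[reference]) / (highest - lowest)


# Hypothetical one-day VaR forecasts under three innovation families.
var_forecasts = {"NO": 10.0, "GED": 12.0, "ST": 14.0}
risk_if_normal = rmmr(var_forecasts, "NO")
risk_if_ged = rmmr(var_forecasts, "GED")
risk_if_student = rmmr(var_forecasts, "ST")
```

Under this form the measure lies between 0 and 1: it equals 0 when the reference model already delivers the most conservative VaR in the set and 1 when it delivers the least conservative one. Repeating the computation for every date in the rolling procedure produces the time series discussed above.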
