The Generalised Extreme Value Distribution

Excerpt from Financial Enterprise Risk Management, second edition (pages 299-303)

So far, the analysis has concentrated on distributions that relate to the full range of data available, or to the tail of a sample of data. However, another approach is to consider the distribution of the highest value for each of a number of tranches of data. This is the area of generalised extreme value theory. The starting point here is to consider the maximum observation, X_M, from a sample of independent, identically distributed random variables. As the size of the sample increases, the distribution of the maximum observation, H(x), converges to the generalised extreme value (GEV) distribution. The cumulative distribution function is shown in Equation 12.1:

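\[
H(x) = \Pr(X_M \le x) =
\begin{cases}
e^{-\left(1 + \gamma \frac{x - \alpha}{\beta}\right)^{-1/\gamma}} & \text{if } \gamma \neq 0; \\
e^{-e^{-\frac{x - \alpha}{\beta}}} & \text{if } \gamma = 0.
\end{cases}
\tag{12.1}
\]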

In this formulation, α and β are the location and scale parameters, analogous to the mean and standard deviation for the whole distribution. As with the mean and standard deviation, α can take any value whilst β must be positive. The value at which the expression is evaluated, x, must satisfy 1 + γ(x − α)/β > 0 when γ ≠ 0; when γ = 0, x can take any value.
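As an illustration, Equation 12.1 might be evaluated along the following lines. This is a minimal sketch rather than a definitive implementation: the function name gev_cdf and its argument names are hypothetical, and the treatment of points where 1 + γ(x − α)/β is not positive (returning 0 or 1) follows the usual convention for the GEV support, which is an assumption beyond the text.

```python
import math


def gev_cdf(x, alpha=0.0, beta=1.0, gamma=0.0):
    """Distribution function H(x) of Equation 12.1 (hypothetical helper).

    alpha is the location, beta the scale (must be positive) and
    gamma the shape parameter.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    z = (x - alpha) / beta
    if gamma == 0:
        # Gumbel-type case
        return math.exp(-math.exp(-z))
    t = 1.0 + gamma * z
    if t <= 0:
        # Assumed convention outside the support: x lies below the lower
        # bound when gamma > 0 and above the upper bound when gamma < 0.
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))


# Evaluate the standard distribution at x = 1 for three shape parameters.
for g in (-0.5, 0.0, 0.5):
    print(g, gev_cdf(1.0, gamma=g))
```

At x = α the function returns e^(−1), roughly 0.368, whatever the values of β and γ, which gives a quick sanity check on any implementation.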

The parameter γ determines the shape of the distribution. With the GEV distribution, this parameter determines the range of distributions to which the extreme values belong. It does this by giving a particular distribution that has the same shape as the tail of a number of other distributions:

If γ > 0, then the distribution is a Fréchet-type GEV distribution. The Fréchet-type GEV distribution has a tail that follows a power law. This means that the extreme values could in fact have come from Student’s t-distribution, the Pareto distribution or the Lévy distribution. Which of these distributions the full dataset might follow is irrelevant: the behaviour of observations in the tail – which is the important thing – will be the same.

If γ = 0, then the distribution is a Gumbel-type GEV distribution. Here, the tail will be exponential, as with the normal and gamma distributions and their close relatives.

If γ < 0, then the distribution is a Weibull-type GEV distribution. This has a tail that falls off so quickly that there is actually a finite right endpoint to the distribution, as with the beta, uniform and triangular distributions. Given that EVT is used when there is concern about extreme observations, this suggests that the Weibull-type GEV distribution is of little interest in this respect.
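The difference between the three types can be made concrete by comparing the probability of exceeding a large value of x under each. The following is a sketch under stated assumptions: the helper names right_endpoint and standard_gev_survival are hypothetical, the standard form (α = 0, β = 1) is used for simplicity, and the finite endpoint α − β/γ for γ < 0 is obtained by solving 1 + γ(x − α)/β = 0.

```python
import math


def right_endpoint(gamma, alpha=0.0, beta=1.0):
    """Upper limit of the GEV support; finite only when gamma < 0."""
    return alpha - beta / gamma if gamma < 0 else math.inf


def standard_gev_survival(x, gamma):
    """1 - H(x) for the standard GEV distribution (alpha = 0, beta = 1)."""
    if gamma == 0:
        # -expm1(-y) equals 1 - exp(-y) but keeps precision for tiny y
        return -math.expm1(-math.exp(-x))
    t = 1.0 + gamma * x
    if t <= 0:
        # Below the lower bound (gamma > 0) or above the upper bound (gamma < 0)
        return 1.0 if gamma > 0 else 0.0
    return -math.expm1(-t ** (-1.0 / gamma))


# Weibull-type, Gumbel-type and Frechet-type tails at increasing x
for gamma in (-0.5, 0.0, 0.5):
    tail = [standard_gev_survival(x, gamma) for x in (1.0, 5.0, 20.0)]
    print(gamma, right_endpoint(gamma), tail)
```

With γ = −0.5 the exceedance probability is exactly zero beyond x = 2, whereas with γ = 0.5 it decays only polynomially and so remains several orders of magnitude larger at x = 20 than in the Gumbel-type case.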

A ‘standard’ GEV distribution can be created by setting α = 0 and β = 1, as shown in Equation 12.2. The cumulative distributions for Fréchet-, Gumbel- and Weibull-types of this standard distribution are shown in Figure 12.1.

\[
H(x) =
\begin{cases}
e^{-(1 + \gamma x)^{-1/\gamma}} & \text{if } \gamma \neq 0; \\
e^{-e^{-x}} & \text{if } \gamma = 0.
\end{cases}
\tag{12.2}
\]

It is straightforward to differentiate the GEV distribution function to give the density function, as shown for the standard distribution in Equation 12.3. This is helpful as it allows us to see more clearly the shape of the tails for different values of γ. Density functions are shown in Figure 12.2.

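\[
h(x) =
\begin{cases}
(1 + \gamma x)^{-\left(1 + \frac{1}{\gamma}\right)} e^{-(1 + \gamma x)^{-1/\gamma}} & \text{if } \gamma \neq 0; \\
e^{-(x + e^{-x})} & \text{if } \gamma = 0.
\end{cases}
\tag{12.3}
\]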
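The relationship between the standard distribution function and its density can be checked numerically. The sketch below compares the density in Equation 12.3 with a central finite-difference approximation to the derivative of the standard distribution function; the function names, the step size and the choice of test points are all illustrative assumptions.

```python
import math


def std_gev_cdf(x, gamma):
    """Standard GEV distribution function H(x), with alpha = 0 and beta = 1."""
    if gamma == 0:
        return math.exp(-math.exp(-x))
    t = 1.0 + gamma * x
    if t <= 0:
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))


def std_gev_pdf(x, gamma):
    """Standard GEV density h(x) as in Equation 12.3."""
    if gamma == 0:
        return math.exp(-(x + math.exp(-x)))
    t = 1.0 + gamma * x
    if t <= 0:
        return 0.0  # no density outside the support
    return t ** (-(1.0 + 1.0 / gamma)) * math.exp(-t ** (-1.0 / gamma))


def central_difference(f, x, h=1e-5):
    """Numerical derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Largest gap between the closed-form density and the numerical derivative
max_gap = 0.0
for gamma in (-0.5, 0.0, 0.5):
    for x in (-0.5, 0.5, 1.0):
        numeric = central_difference(lambda u: std_gev_cdf(u, gamma), x)
        max_gap = max(max_gap, abs(numeric - std_gev_pdf(x, gamma)))
print(max_gap)
```

The test points are chosen to lie inside the support for all three values of γ; near a boundary of the support a finite-difference check of this kind would be less reliable.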

A confusing point to note is that the Weibull distribution does not necessarily have a tail that corresponds to a Weibull-type GEV distribution. This is because there are a number of different versions of the Weibull distribution, only some of which have a finite end point; others – including the one described in this book – have exponential tails.

[Figure 12.1 Various GEV Distribution Functions: H(x) plotted against x for γ = −0.5, γ = 0 and γ = 0.5]

[Figure 12.2 Various GEV Density Functions: h(x) plotted against x for γ = −0.5, γ = 0 and γ = 0.5]

To fit the GEV distribution, the raw data must be divided into equally sized blocks. Then, extreme values are taken from each of the blocks. There are two types of information that might be taken, and thus modelled. The first is simply the highest observation in each block of data. This is known as the return level approach, and the result is a distribution of the highest observation per block size. So if each block contained a thousand observations, the result of the analysis would be the distribution of the highest observation per thousand. The second approach is to set a level above which an observation could be regarded as extreme. Then, the number of observations above this level in each block would be counted and modelled using a GEV distribution. In this case, if each block contained a thousand observations, the result would be the distribution of the rate of extreme observations per thousand. This is known as the return period approach.

Figure 12.3 Comparison of GEV Approaches and Block Sizes

                          Block Size = 5                  Block Size = 10
Observations              Return Level   Return Period    Return Level   Return Period
50, 100, 70, 85, 300      300            1
450, 10, 95, 400, 60      450            2                450            3
65, 30, 25, 135, 300      300            1
260, 30, 80, 15, 105      260            1                300            2

Each row lists five consecutive observations. The Block Size = 10 entries are shown against the last row of each block of ten.

The size of the blocks is crucial, and there is a compromise to be made. If a large number of blocks is used, then this means that there are fewer observations in each block. If the return level approach is used, this translates to less information about extreme values – a rate per hundred observations does not give as much information about what is ‘extreme’ as a rate per thousand. However, the large number of blocks means a large number of ‘extreme’ observations, so the variance of the parameter estimates is lower. If, on the other hand, fewer and larger blocks of data are used, then the information in each group about what is extreme is greater under the return level approach. However, with fewer blocks the variance of the parameter estimates is higher.

This can be seen in Figure 12.3. The first column of numbers shows the return level approach calculated using a block size of five. The result is the distribution of one-in-five events. The third column divides the data into only two blocks. The result is information on the distribution of more extreme one-in-ten events, but the distribution is based on only two observations rather than four. The choice of block size appears to be less important for the return period approach, since the total number of extreme events is five in both column two and column four. However, since the result is divided into the number of observations per block, a similar issue arises when the parameters for the GEV distribution are being calculated.
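The figures quoted from Figure 12.3 can be reproduced with a few lines of code. This is a sketch only: the helper names are hypothetical, and the threshold of 250 is an assumption, since the text does not state the level used. Any level between 135 and 260 gives the same counts for these data.

```python
def split_into_blocks(data, block_size):
    """Split data into consecutive equally sized blocks, dropping any remainder."""
    n_blocks = len(data) // block_size
    return [data[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]


def block_maxima(data, block_size):
    """Return level approach: the highest observation in each block."""
    return [max(block) for block in split_into_blocks(data, block_size)]


def exceedance_counts(data, block_size, threshold):
    """Return period approach: number of observations above the threshold per block."""
    return [sum(1 for value in block if value > threshold)
            for block in split_into_blocks(data, block_size)]


# The twenty observations shown in Figure 12.3
observations = [50, 100, 70, 85, 300, 450, 10, 95, 400, 60,
                65, 30, 25, 135, 300, 260, 30, 80, 15, 105]
THRESHOLD = 250  # assumed level; not given in the text

for size in (5, 10):
    print(size,
          block_maxima(observations, size),
          exceedance_counts(observations, size, THRESHOLD))
```

Doubling the block size halves the number of maxima available for fitting, from four to two, while the total count of extreme observations stays at five.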

A major drawback of the GEV approach is that by using only the largest value or values in each block of data, it ignores a lot of potentially useful information. For example, if the return level approach is used and there are a thousand observations per block, then 99.9% of the information is discarded. For this reason, the generalised Pareto distribution is more commonly used.
