Operational Risk Modeling Analytics, part 3


THE ROLE OF PARAMETERS

Example 4.4 Demonstrate that the exponential distribution is a scale distribution.

The distribution function of the exponential distribution is

$$F_X(x) = 1 - e^{-x/\theta}, \quad x > 0.$$

Let $Y = cX$, where $c > 0$. Then,

$$F_Y(y) = \Pr(Y \le y) = \Pr(cX \le y) = \Pr\left(X \le \frac{y}{c}\right) = 1 - e^{-y/(c\theta)}, \quad y > 0.$$

This is an exponential distribution with parameter $c\theta$. So the form of the distribution has not changed, only the parameter value.

Definition 4.5 For random variables with nonnegative support, a scale parameter is a parameter for a scale distribution that meets two conditions. First, when the random variable of a member of the scale distribution is multiplied by a positive constant, the parameter is multiplied by the same constant. Second, when the random variable of a member of the scale distribution is multiplied by a positive constant, all other parameters are unchanged.

Example 4.6 Demonstrate that the gamma distribution has a scale parameter.

Let $X$ have the gamma distribution and $Y = cX$. Then, using the incomplete gamma notation given in Appendix A,

$$F_Y(y) = \Pr\left(X \le \frac{y}{c}\right) = \Gamma\left(\alpha; \frac{y}{c\theta}\right),$$

indicating that $Y$ has a gamma distribution with parameters $\alpha$ and $c\theta$. Therefore, the parameter $\theta$ is a scale parameter.

It is often possible to recognize a scale parameter from looking at the distribution or density function. In particular, the distribution function would have $x$ always appear together with the scale parameter $\theta$ as $x/\theta$.

4.5.2 Finite mixture distributions

Finite mixtures have distribution functions that are weighted averages of other distribution functions.

Definition 4.7 A random variable $Y$ is a k-point mixture² of the random variables $X_1, X_2, \ldots, X_k$ if its cdf is given by

$$F_Y(y) = a_1 F_{X_1}(y) + a_2 F_{X_2}(y) + \cdots + a_k F_{X_k}(y), \qquad (4.3)$$

where all $a_j > 0$ and $a_1 + a_2 + \cdots + a_k = 1$.

This essentially assigns weight $a_j$ to the $j$th distribution. The weights are usually considered as parameters.
Thus the total number of parameters is the sum of the parameters on the $k$ distributions plus $k - 1$. Note that, if we have 20 different distributions, a two-point mixture allows us to create over 200 new distributions.³ This may be sufficient for most modeling situations. Nevertheless, these are still parametric distributions, though perhaps with many parameters.

Example 4.8 Models used in insurance can provide some insight into models that could be used for operational risk losses, particularly those that are insurable risks. For models involving general liability insurance, the Insurance Services Office has had some success with a mixture of two Pareto distributions. They also found that five parameters were not necessary. The distribution they selected has cdf

$$F(x) = 1 - a\left(\frac{\theta_1}{\theta_1 + x}\right)^{\alpha} - (1 - a)\left(\frac{\theta_2}{\theta_2 + x}\right)^{\alpha + 2}.$$

Note that the shape parameters in the two Pareto distributions differ by 2. The second distribution places more probability on smaller values. This might be a model for frequent, small losses while the first distribution covers large, but infrequent losses. This distribution has only four parameters, bringing some parsimony to the modeling process.

Suppose we do not know how many distributions should be in the mixture. Then the value of $k$ itself also becomes a parameter, as indicated in the following definition.

Definition 4.9 A variable-component mixture distribution has a distribution function that can be written as

$$F(x) = \sum_{j=1}^{K} a_j F_j(x), \quad \sum_{j=1}^{K} a_j = 1, \quad a_j > 0, \; j = 1, \ldots, K, \quad K = 1, 2, \ldots$$

² The words "mixed" and "mixture" have been used interchangeably to refer to the type of distribution described here as well as distributions that are partly discrete and partly continuous. This text will not attempt to resolve that confusion. The context will make clear which type of distribution is being considered.

³ There are actually $\binom{20}{2} + 20 = 210$ choices. The extra 20 represent the cases where both distributions are of the same type but with different parameters.
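The finite mixture of Definition 4.7 and the count in footnote 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the Pareto cdf is taken in its two-parameter form $1 - (\theta/(x+\theta))^{\alpha}$, the two-Pareto mixture uses shape parameters $\alpha$ and $\alpha + 2$ as described in Example 4.8, and the parameter values in the usage note are invented, not fitted to any data.

```python
from math import comb


def pareto_cdf(x, alpha, theta):
    """Two-parameter Pareto cdf: 1 - (theta / (x + theta)) ** alpha for x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - (theta / (x + theta)) ** alpha


def mixture_cdf(x, weights, cdfs):
    """Cdf of a k-point mixture: the weighted average of the component cdfs."""
    if any(a <= 0 for a in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be positive and sum to 1")
    return sum(a * cdf(x) for a, cdf in zip(weights, cdfs))


def two_pareto_cdf(x, a, alpha, theta1, theta2):
    """Four-parameter mixture of two Paretos whose shapes differ by 2 (assumed form)."""
    components = [
        lambda t: pareto_cdf(t, alpha, theta1),        # heavier tail: large, infrequent losses
        lambda t: pareto_cdf(t, alpha + 2.0, theta2),  # lighter tail: frequent, small losses
    ]
    return mixture_cdf(x, [a, 1.0 - a], components)


def count_two_point_mixtures(n_families):
    """Pairs of different families plus pairs drawn from the same family."""
    return comb(n_families, 2) + n_families
```

For instance, `two_pareto_cdf(1000.0, a=0.3, alpha=1.5, theta1=1000.0, theta2=100.0)` evaluates the mixture cdf at a loss of 1,000 for one made-up parameter set, and `count_two_point_mixtures(20)` reproduces the count of 210 two-point mixtures.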
Variable-component mixture models have been called semiparametric because in complexity they are between parametric models and nonparametric models (see Section 4.5.3). This distinction becomes more important when model selection is discussed in Chapter 12. When the number of parameters is to be estimated from data, hypothesis tests to determine the appropriate number of parameters become more difficult. When all of the components have the same parametric distribution (but different parameters), the resulting distribution is called a "variable mixture of gs" distribution, where g stands for the name of the component distribution.

Example 4.10 Determine the distribution, density, and hazard rate functions for the variable mixture of exponential distributions.

A combination of exponential distribution functions can be written

$$F(x) = 1 - a_1 e^{-x/\theta_1} - a_2 e^{-x/\theta_2} - \cdots - a_K e^{-x/\theta_K}, \quad \sum_{j=1}^{K} a_j = 1, \quad a_j, \theta_j > 0, \; j = 1, \ldots, K, \quad K = 1, 2, \ldots,$$

and then the other functions are

$$f(x) = \sum_{j=1}^{K} a_j \theta_j^{-1} e^{-x/\theta_j}, \qquad h(x) = \frac{\sum_{j=1}^{K} a_j \theta_j^{-1} e^{-x/\theta_j}}{\sum_{j=1}^{K} a_j e^{-x/\theta_j}}.$$

The number of parameters is not fixed, nor is it even limited. For example, when $K = 2$ there are three parameters $(a_1, \theta_1, \theta_2)$, noting that $a_2$ is not a parameter because once $a_1$ is set the value of $a_2$ is determined. However, when $K = 4$ there are seven parameters.

Example 4.11 Illustrate how a two-point mixture of gamma variables can create a bimodal distribution.

Consider a mixture of two gamma distributions with equal weights. One has parameters $\alpha = 4$ and $\theta = 7$ (for a mode of 21) and the other has parameters $\alpha = 15$ and $\theta = 7$ (for a mode of 98). The density function is

$$f(x) = 0.5\,\frac{x^{3} e^{-x/7}}{3!\,7^{4}} + 0.5\,\frac{x^{14} e^{-x/7}}{14!\,7^{15}},$$

and a graph appears in Figure 4.8.

[Figure 4.8: Two-point mixture of gammas distribution. Horizontal axis: x, from 0 to 200.]

4.5.3 Data-dependent distributions

For Models 1-5 and many of the examples, we postulate a shape for a distribution by assuming that the distribution is of a particular form (e.g., uniform, lognormal, gamma). The distribution is completely specified when its parameters are specified. It is also possible to construct models for which we do not specify the form a priori.
We can require data in the determination of shape. Such models also have parameters but are often called nonparametric. It is convenient to think of parameters in a broader sense: as an independent piece of information required in specifying a distribution. Then the number of independent pieces of information required to fully specify a distribution is the number of parameters.

Definition 4.12 A data-dependent distribution is at least as complex as the data or knowledge that produced it, and the number of "parameters" increases as the number of data points or amount of knowledge increases.

Essentially, these models have as many "parameters" as (or more than) observations in the data set. The empirical distribution as illustrated by Model 6 on page 31 is a data-dependent distribution. Each data point contributes probability $1/n$ to the probability function, so the $n$ parameters are the $n$ observations in the data set that produced the empirical distribution.

Another example of a data-dependent model is the kernel smoothing density model. Rather than placing a mass of probability $1/n$ at each data point, a continuous density function with weight $1/n$ replaces the data point. This continuous density function is usually centered at the data point. Such a continuous density function surrounds each data point. The kernel-smoothed distribution is the weighted average of all the continuous density functions. As a result, the kernel-smoothed distribution follows the shape of the data in a general sense, but not exactly as in the case of the empirical distribution.

[Figure 4.9: Kernel density distribution. Horizontal axis: x, from 0.5 to 11.5; vertical axis: density, from 0 to 0.2.]

The idea of kernel density smoothing is illustrated in Example 4.13. Included, without explanation, is the concept of bandwidth. The role of bandwidth is self-evident.
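As a rough sketch of the kernel smoothing idea, the following Python code replaces each point mass of a discrete model by a uniform density of half-width equal to the bandwidth and then takes the weighted average. The function names are hypothetical. The first support point (3, with probability 0.125) is inferred from the first term quoted in Example 4.13; the remaining points and probabilities are assumptions made for illustration, because Model 6 itself is not reproduced in this excerpt.

```python
def uniform_kernel(x, center, bandwidth):
    """Uniform kernel: height 1 / (2 * bandwidth) within bandwidth of center, else 0."""
    return 1.0 / (2.0 * bandwidth) if abs(x - center) <= bandwidth else 0.0


def kernel_density(x, points, probs, bandwidth):
    """Kernel-smoothed density: weighted average of the kernels centered at the data points."""
    return sum(p * uniform_kernel(x, xj, bandwidth) for xj, p in zip(points, probs))


# Assumed five-point discrete model (illustrative values, not taken from the excerpt).
POINTS = [3.0, 5.0, 7.0, 9.0, 12.0]
PROBS = [0.125, 0.125, 0.375, 0.25, 0.125]
BANDWIDTH = 2.0


def total_probability(step=0.001, upper=15.0):
    """Midpoint-rule check that the smoothed density integrates to about 1."""
    n = int(round(upper / step))
    return step * sum(
        kernel_density((i + 0.5) * step, POINTS, PROBS, BANDWIDTH) for i in range(n)
    )
```

With a bandwidth of 2, only the kernel centered at 3 covers $x = 2$, so `kernel_density(2.0, POINTS, PROBS, BANDWIDTH)` equals $0.125 \times 0.25 = 0.03125$, matching the first term given in Example 4.13. A larger bandwidth spreads each point's probability over a wider interval and gives a flatter density.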
Example 4.13 Construct a kernel smoothing model from Model 6 using the uniform kernel and a bandwidth of 2.

The probability density function is

$$f(x) = \sum_{j=1}^{5} p(x_j) K_j(x), \qquad K_j(x) = \begin{cases} 0, & |x - x_j| > 2, \\ 0.25, & |x - x_j| \le 2, \end{cases}$$

where the sum is taken over the five points where the original model has positive probability. For example, the first term of the sum is the function

$$p(x_1) K_1(x) = \begin{cases} 0, & x < 1, \\ 0.03125, & 1 \le x \le 5, \\ 0, & x > 5. \end{cases}$$

The complete density function is the sum of five such functions, which are illustrated in Figure 4.9.

Note that both the kernel smoothing model and the empirical distribution can also be written as mixture distributions. The reason that these models are classified separately is that the number of components is directly related to the sample size. This is not the case with finite mixture models, where the number of components in the model is not a function of the amount of data.

4.6 TAILS OF DISTRIBUTIONS

The tail of a distribution (more properly, the right tail) is the portion of the distribution corresponding to large values of the random variable. Understanding large possible operational risk loss values is important because these have the greatest impact on the total of operational risk losses. Random variables that tend to assign higher probabilities to larger values are said to be heavier-tailed. Tail weight can be a relative concept (model A has a heavier tail than model B) or an absolute concept (distributions with a certain property are classified as heavy-tailed). When choosing models, tail weight can help narrow the choices or can confirm a choice for a model. Heavy-tailed distributions are particularly important for operational risk in connection with extreme value theory (see Chapter 7).

4.6.1 Classification based on moments

Recall that in the continuous case the $k$th raw moment for a random variable that takes on only positive values (like most insurance payment variables) is given by $\int_0^\infty x^k f(x)\,dx$.
Depending on the density function and the value of $k$, this integral may not exist (that is, it may be infinite). One way of classifying distributions is on the basis of whether all moments exist. It is generally agreed that the existence of all positive moments indicates a light right tail, while the existence of only positive moments up to a certain value (or existence of no positive moments at all) indicates a heavy right tail.

Example 4.14 Demonstrate that for the gamma distribution all positive moments exist but for the Pareto distribution they do not.

For the gamma distribution, the raw moments are

$$\mu_k' = \int_0^\infty x^k \frac{x^{\alpha-1} e^{-x/\theta}}{\Gamma(\alpha)\theta^{\alpha}}\,dx = \int_0^\infty (y\theta)^k \frac{(y\theta)^{\alpha-1} e^{-y}}{\Gamma(\alpha)\theta^{\alpha}}\,\theta\,dy, \quad \text{making the substitution } y = x/\theta,$$

$$= \frac{\theta^k}{\Gamma(\alpha)}\Gamma(\alpha + k) < \infty \quad \text{for all } k > 0.$$

For the Pareto distribution, they are

$$\mu_k' = \int_0^\infty x^k \frac{\alpha\theta^{\alpha}}{(x+\theta)^{\alpha+1}}\,dx = \int_\theta^\infty (y - \theta)^k \frac{\alpha\theta^{\alpha}}{y^{\alpha+1}}\,dy, \quad \text{making the substitution } y = x + \theta,$$

$$= \alpha\theta^{\alpha} \int_\theta^\infty \sum_{j=0}^{k} \binom{k}{j} y^{j-\alpha-1} (-\theta)^{k-j}\,dy \quad \text{for integer values of } k.$$

The integral exists only if all of the exponents on $y$ in the sum are less than $-1$. That is, if $j - \alpha - 1 < -1$ for all $j$, or, equivalently, if $k < \alpha$. Therefore, only some moments exist.

By this classification, the Pareto distribution is said to have a heavy tail and the gamma distribution is said to have a light tail. A look at the moment formulas in this chapter reveals which distributions have heavy tails and which do not, as indicated by the existence of moments.

4.6.2 Classification based on tail behavior

One commonly used indication that one distribution has a heavier tail than another distribution with the same mean is that the ratio of the two survival functions should diverge to infinity (with the heavier-tailed distribution in the numerator) as the argument becomes large. This classification is based on asymptotic properties of the distributions. The divergence implies that the numerator distribution puts significantly more probability on large values. Note that it is equivalent to examine the ratio of density functions.
The limit of the ratio will be the same, as can be seen by an application of L'Hôpital's rule:

$$\lim_{x\to\infty} \frac{\bar F_1(x)}{\bar F_2(x)} = \lim_{x\to\infty} \frac{\bar F_1'(x)}{\bar F_2'(x)} = \lim_{x\to\infty} \frac{-f_1(x)}{-f_2(x)} = \lim_{x\to\infty} \frac{f_1(x)}{f_2(x)},$$

where $\bar F(x) = 1 - F(x)$ denotes the survival function.

Example 4.15 Demonstrate that the Pareto distribution has a heavier tail than the gamma distribution using the limit of the ratio of their density functions.

To avoid confusion, the letters $\tau$ and $\lambda$ will be used for the parameters of the gamma distribution instead of the customary $\alpha$ and $\theta$. Then the required limit is

$$\lim_{x\to\infty} \frac{f_{\text{Pareto}}(x)}{f_{\text{gamma}}(x)} = \lim_{x\to\infty} \frac{\alpha\theta^{\alpha}(x+\theta)^{-\alpha-1}}{x^{\tau-1} e^{-x/\lambda} \lambda^{-\tau} / \Gamma(\tau)} = c \lim_{x\to\infty} \frac{e^{x/\lambda}}{(x+\theta)^{\alpha+1} x^{\tau-1}} > c \lim_{x\to\infty} \frac{e^{x/\lambda}}{(x+\theta)^{\alpha+\tau}},$$

and, either by application of L'Hôpital's rule or by remembering that exponentials go to infinity faster than polynomials, the limit is infinity. Figure 4.10 shows a portion of the density functions for a Pareto distribution with parameters $\alpha = 3$ and $\theta = 10$ and a gamma distribution with parameters $\alpha = 1/3$ and $\theta = 15$. Both distributions have a mean of 5 and a variance of 75. The graph is consistent with the algebraic derivation.

[Figure 4.10: Tails of gamma and Pareto distributions.]

4.6.3 Classification based on hazard rate function

The hazard rate function also reveals information about the tail of the distribution. Distributions with decreasing hazard rate functions have heavy tails. Distributions with increasing hazard rate functions have light tails. The distribution with constant hazard rate, the exponential distribution, has neither an increasing nor a decreasing failure rate. For distributions with (asymptotically) monotone hazard rates, distributions with exponential tails divide the distributions into heavy-tailed and light-tailed distributions.

Comparisons between distributions can be made on the basis of the rate of increase or decrease of the hazard rate function. For example, a distribution has a lighter tail than another if, for large values of the argument, its hazard rate function is increasing at a faster rate.

Example 4.16 Compare the tails of the Pareto and gamma distributions by looking at their hazard rate functions.
The hazard rate function for the Pareto distribution is

$$h(x) = \frac{f(x)}{\bar F(x)} = \frac{\alpha\theta^{\alpha}(x+\theta)^{-\alpha-1}}{\theta^{\alpha}(x+\theta)^{-\alpha}} = \frac{\alpha}{x+\theta},$$

which is decreasing. For the gamma distribution we need to be a bit more clever because there is no closed form expression for $\bar F(x)$. Observe that

$$\frac{1}{h(x)} = \frac{\int_x^\infty f(t)\,dt}{f(x)} = \frac{\int_0^\infty f(x+y)\,dy}{f(x)},$$

and so, if $f(x+y)/f(x)$ is an increasing function of $x$ for any fixed $y$, then $1/h(x)$ will be increasing in $x$ and so the random variable will have a decreasing hazard rate. Now, for the gamma distribution

$$\frac{f(x+y)}{f(x)} = \frac{(x+y)^{\alpha-1} e^{-(x+y)/\theta}}{x^{\alpha-1} e^{-x/\theta}} = \left(1 + \frac{y}{x}\right)^{\alpha-1} e^{-y/\theta},$$

which is strictly increasing in $x$ provided $\alpha < 1$ and strictly decreasing in $x$ if $\alpha > 1$. By this measure, some gamma distributions have a heavy tail (those with $\alpha < 1$) and some have a light tail. Note that when $\alpha = 1$ we have the exponential distribution and a constant hazard rate. Also, even though $h(x)$ is complicated in the gamma case, we know what happens for large $x$. Because $f(x)$ and $\bar F(x)$ both go to 0 as $x \to \infty$, L'Hôpital's rule yields

$$\lim_{x\to\infty} h(x) = \lim_{x\to\infty} \frac{f(x)}{\bar F(x)} = -\lim_{x\to\infty} \frac{f'(x)}{f(x)} = -\lim_{x\to\infty} \frac{d}{dx} \ln f(x) = -\lim_{x\to\infty} \frac{d}{dx}\left[(\alpha - 1)\ln x - \frac{x}{\theta}\right] = \lim_{x\to\infty} \left(\frac{1}{\theta} - \frac{\alpha - 1}{x}\right) = \frac{1}{\theta}.$$

That is, $h(x) \to 1/\theta$ as $x \to \infty$.

The mean excess function also gives information about tail weight. If the mean excess function is increasing in $d$, the distribution is considered to have a heavy tail. If the mean excess function is decreasing in $d$, the distribution is considered to have a light tail. Comparisons between distributions can be made on the basis of the rate of increase or decrease of the mean excess function. For example, a distribution has a heavier tail than another if, for large values of the argument, its mean excess function is increasing at a lower rate.

In fact, the mean excess loss function and the hazard rate are closely related in several ways. First, note that

$$\frac{\bar F(y+d)}{\bar F(d)} = \frac{\exp\left[-\int_0^{y+d} h(x)\,dx\right]}{\exp\left[-\int_0^{d} h(x)\,dx\right]} = \exp\left[-\int_d^{y+d} h(x)\,dx\right] = \exp\left[-\int_0^{y} h(d+t)\,dt\right].$$

Therefore, if the hazard rate is decreasing, then for fixed $y$ it follows that $\int_0^y h(d+t)\,dt$ is a decreasing function of $d$, and from the above $\bar F(y+d)/\bar F(d)$ is an increasing function of $d$.
But from (2.5), the mean excess loss function may be expressed as

$$e(d) = \frac{\int_d^\infty \bar F(x)\,dx}{\bar F(d)} = \int_0^\infty \frac{\bar F(y+d)}{\bar F(d)}\,dy.$$

Thus, if the hazard rate is a decreasing function, then the mean excess loss function $e(d)$ is an increasing function of $d$ because the same is true of $\bar F(y+d)/\bar F(d)$ for fixed $y$. Similarly, if the hazard rate is an increasing function, then the mean excess loss function is a decreasing function. It is worth noting (and is perhaps counterintuitive), however, that the converse implication is not true. Exercise 4.16 gives an example of a distribution that has a decreasing mean excess loss function, but the hazard rate is not increasing for all values. Nevertheless, the implications described above are generally consistent with the above discussions of heaviness of the tail.

There is a second relationship between the mean excess loss function and the hazard rate. As $d \to \infty$, $\bar F(d)$ and $\int_d^\infty \bar F(x)\,dx$ go to 0. Thus, the limiting behavior of the mean excess loss function as $d \to \infty$ may be ascertained using L'Hôpital's rule because formula (2.5) holds. We have

$$\lim_{d\to\infty} e(d) = \lim_{d\to\infty} \frac{\int_d^\infty \bar F(x)\,dx}{\bar F(d)} = \lim_{d\to\infty} \frac{-\bar F(d)}{-f(d)} = \lim_{d\to\infty} \frac{1}{h(d)},$$

as long as the indicated limits exist. These limiting relationships may be useful if the form of $F(x)$ is complicated.

Example 4.17 Examine the behavior of the mean excess loss function of the gamma distribution.

Because $e(d) = \int_d^\infty \bar F(x)\,dx / \bar F(d)$ and $\bar F(x)$ is complicated, $e(d)$ is complicated. But $e(0) = E(X) = \alpha\theta$, and, using Example 4.16, we have

$$\lim_{x\to\infty} e(x) = \lim_{x\to\infty} \frac{1}{h(x)} = \frac{1}{\lim_{x\to\infty} h(x)} = \theta.$$

Also, from Example 4.16, $h(x)$ is strictly decreasing in $x$ for $\alpha < 1$ and strictly increasing in $x$ for $\alpha > 1$, implying that $e(d)$ is strictly increasing from $e(0) = \alpha\theta$ to $e(\infty) = \theta$ for $\alpha < 1$ and strictly decreasing from $e(0) = \alpha\theta$ to $e(\infty) = \theta$ for $\alpha > 1$. For $\alpha = 1$, we have the exponential distribution for which $e(d) = \theta$.
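The behavior described in Example 4.17 can be checked numerically. The sketch below is one possible approach, not the book's method: it uses the ratio $f(d+t)/f(d) = (1 + t/d)^{\alpha-1} e^{-t/\theta}$ from Example 4.16 to write $e(d) = \int_0^\infty t\,g(t)\,dt \,/ \int_0^\infty g(t)\,dt$ with $g(t) = f(d+t)/f(d)$, and approximates both integrals by Simpson's rule on a truncated range. The function name, step count, and truncation point are assumptions chosen for illustration, and the routine requires $d > 0$.

```python
import math


def mean_excess_gamma(d, alpha, theta, n_steps=20000, upper_multiple=50.0):
    """Approximate e(d) for a gamma(alpha, theta) loss at a threshold d > 0.

    Uses e(d) = int t*g(t) dt / int g(t) dt with
    g(t) = f(d + t) / f(d) = (1 + t/d)**(alpha - 1) * exp(-t/theta),
    truncating the integrals at upper_multiple * theta (assumed to be far in the tail).
    """
    if d <= 0:
        raise ValueError("d must be positive")
    if n_steps % 2:
        raise ValueError("n_steps must be even for Simpson's rule")
    h = upper_multiple * theta / n_steps
    numerator = 0.0
    denominator = 0.0
    for i in range(n_steps + 1):
        t = i * h
        # Simpson weights 1, 4, 2, 4, ..., 4, 1; the common factor h/3 cancels in the ratio.
        weight = 1.0 if i in (0, n_steps) else (4.0 if i % 2 else 2.0)
        g = (1.0 + t / d) ** (alpha - 1.0) * math.exp(-t / theta)
        numerator += weight * t * g
        denominator += weight * g
    return numerator / denominator
```

Evaluating `mean_excess_gamma(d, 0.5, 1.0)` at increasing values of `d` should give values that rise toward $\theta = 1$, while `mean_excess_gamma(d, 3.0, 1.0)` should fall toward 1, in line with the heavy-tailed and light-tailed cases of the example. For $\alpha = 3$, $\theta = 1$, $d = 1$ the ratio can also be computed by hand as $11/5 = 2.2$, which gives a simple accuracy check.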
4.7 CREATING NEW DISTRIBUTIONS

4.7.1 Introduction

This section indicates how new parametric distributions can be created from existing ones. Many of the distributions in this chapter were created this way. In each case, a new random variable is created by transforming the original random variable in some way or using some other method.

4.7.2 Multiplication by a constant

This transformation is equivalent to applying loss size inflation uniformly across all loss levels and is known as a change of scale. For example, if this [...]

... resulting from a certain type of error, the amounts of such losses that occurred in the year 2005 were grouped by size (in hundreds of thousands of dollars): 42 were below $300, 3 were between $300 and $350, 5 were between $350 and $400, 5 were between $400 and $450, 0 were between $450 and $500, 5 were between $500 and $600, and the remaining 40 were above $600. For the next three years, all losses...

... in x

4.30 Write the density function for a two-component spliced model in which the density function is proportional to a uniform density over the interval from 0 to 1,000 and is proportional to an exponential density function from 1,000 to $\infty$. Ensure that the resulting density function is continuous.

4.31 Let X have pdf $f(x) = \exp(-|x/\theta|)/(2\theta)$ for $-\infty < x < \infty$. Determine the pdf and cdf of Y. Let Y...

... only at the points 0, 1, 2, 3, 4, ... In an operational risk context, counting distributions describe the number of losses or the number of events causing losses, such as power outages that cause business interruption. With an understanding of both the number of losses and the size of losses, we can have a deeper understanding of a variety of issues surrounding operational risk than if we have only information...
... distribution is also Poisson but with a new Poisson parameter. This is also useful when considering the impact of removing or adding a type of risk to the definition of operational risks. Suppose that the number of losses for a particular set of types of operational risks follows a Poisson distribution. If one of the types of losses is eliminated, the distribution of the number of losses of the remaining...

... $Y = e^X$

4.32 Losses in 2006 follow the density function $f(x) = 3x^{-4}$, $x \ge 1$, where $x$ is the loss size expressed in millions of dollars. It is expected that individual loss sizes in 2007 will be 10% greater. Determine the cdf of losses for 2007 and use it to determine the probability that a 2007 loss exceeds $2.2 million.

4.33 Consider...

... arises in insurance. It is easy to imagine how the same type of model of uncertainty can be used in the operational risk framework to describe the lack of precision in quantifying a scale parameter. A scale parameter can be used as a basis for measuring a company's exposure to risk.

Example 4.29 In considering risks associated with automobile driving, it is important to recognize that the distance driven varies...

... distributions are not normally used for modeling losses because they have positive and negative support. However, they can be used for modeling random variables, such as rates of return, that can take on positive or negative values. The normal and other distributions have been used in the fields of finance and risk management. Landsman and Valdez [73] provide an analysis of TVaR...

... and Pareto distributions with regard to tail weight. To reinforce this conclusion, consider a gamma distribution with parameters $\alpha = 0.2$, $\theta = 500$; a lognormal distribution with parameters $\mu = 3.709290$, $\sigma = 1.338566$; and a Pareto distribution with parameters $\alpha = 2.5$, $\theta = 150$. First, demonstrate that all three distributions have the same mean and variance. Then numerically demonstrate that there is a...

... Definition 4.31, one can replace $f_j(x)$ with $g_j(x)/[G(c_j) - G(c_{j-1})]$. This formulation makes it easier to have the break points become parameters that can be estimated. Neither approach to splicing ensures that the resulting density function will be continuous (that is, the components will meet at the break points). Such a restriction could be added to the specification.

Example 4.33 Create...

... Definition 4.31 makes this precise.

Definition 4.31 A k-component spliced distribution has a density function that can be expressed as follows:

$$f(x) = \begin{cases} a_1 f_1(x), & c_0 < x < c_1, \\ a_2 f_2(x), & c_1 < x < c_2, \\ \quad\vdots \\ a_k f_k(x), & c_{k-1} < x < c_k. \end{cases}$$

For $j = 1, \ldots, k$, each $a_j > 0$ and each $f_j(x)$ must be a legitimate density function with all probability on the interval $(c_{j-1}, c_j)$. Also, $a_1 + \cdots + a_k = 1$.

Example 4.32 Demonstrate...

Date posted: 09/08/2014, 19:22


Table of contents

  • Operational Risk

    • Part II Probabilistic tools for operational risk modeling

      • 4 Models for the size of losses: Continuous distributions

        • 4.5 The role of parameters

          • 4.5.2 Finite mixture distributions

          • 4.5.3 Data-dependent distributions

        • 4.6 Tails of distributions

          • 4.6.1 Classification based on moments

          • 4.6.2 Classification based on tail behavior

          • 4.6.3 Classification based on hazard rate function

        • 4.7 Creating new distributions

          • 4.7.1 Introduction

          • 4.7.2 Multiplication by a constant

          • 4.7.3 Transformation by raising to a power

          • 4.7.4 Transformation by exponentiation

          • 4.7.5 Continuous mixture of distributions

          • 4.7.6 Frailty models

          • 4.7.7 Splicing pieces of distributions

        • 4.8 TVaR for continuous distributions

          • 4.8.1 Continuous elliptical distributions

          • 4.8.2 Continuous exponential dispersion distributions

        • 4.9 Exercises

      • 5 Models for the number of losses: Counting distributions

        • 5.1 Introduction

        • 5.2 The Poisson distribution

        • 5.3 The negative binomial distribution

        • 5.4 The binomial distribution

        • 5.5 The (a, b, 0) class
