$$
\operatorname{Var}\bigl(\hat{\theta}\bigr)=\frac{1}{N}\sum_{i=1}^{M}p_i\bigl[\sigma_i^2+(\mu_i-\theta)^2\bigr]=\frac{1}{N}\sum_{i=1}^{M}p_i\sigma_i^2+\frac{1}{N}\sum_{i=1}^{M}p_i(\mu_i-\theta)^2 \tag{5.21}
$$

Comparing Equations (5.18) and (5.21) gives

$$
\operatorname{Var}\bigl(\hat{\theta}\bigr)-\operatorname{Var}\bigl(\hat{\theta}_{PS}\bigr)=\frac{1}{N}\sum_{i=1}^{M}p_i(\mu_i-\theta)^2,
$$

which is the amount of variance that has been removed through proportional stratified sampling. In theory we can do better than this. If Equation (5.17) is minimized subject to $\sum_{i=1}^{M}n_i=N$, it is found that the optimum number to select from the $i$th stratum is

$$
n_i^{*}=\frac{Np_i\sigma_i}{\sum_{i=1}^{M}p_i\sigma_i},
$$

in which case the variance becomes

$$
\operatorname{Var}\bigl(\hat{\theta}_{OPT}\bigr)=\frac{1}{N}\Bigl(\sum_{i=1}^{M}p_i\sigma_i\Bigr)^{2}=\frac{\bar{\sigma}^2}{N}, \tag{5.22}
$$

say. However,

$$
\sum_{i=1}^{M}p_i(\sigma_i-\bar{\sigma})^2=\sum_{i=1}^{M}p_i\sigma_i^2-\bar{\sigma}^2. \tag{5.23}
$$

Therefore, from Equations (5.18) and (5.22),

$$
\operatorname{Var}\bigl(\hat{\theta}_{PS}\bigr)-\operatorname{Var}\bigl(\hat{\theta}_{OPT}\bigr)=\frac{1}{N}\sum_{i=1}^{M}p_i(\sigma_i-\bar{\sigma})^2.
$$

Now the various components of the variance of the naive estimator can be shown:

$$
\operatorname{Var}\bigl(\hat{\theta}\bigr)=\frac{1}{N}\Bigl[\sum_{i=1}^{M}p_i(\mu_i-\theta)^2+\sum_{i=1}^{M}p_i(\sigma_i-\bar{\sigma})^2+\bar{\sigma}^2\Bigr]. \tag{5.24}
$$

The right-hand side of Equation (5.24) contains the variance removed due to use of the proportional $n_i$ rather than the naive estimator, the variance removed due to use of the optimal $n_i$ rather than the proportional $n_i$, and the residual variance, respectively. Now imagine that very fine stratification is employed (i.e. $M\to\infty$). Then the outcome $X\in S_i$ is replaced by the actual value of $X$, and so from Equation (5.21)

$$
\operatorname{Var}\bigl(\hat{\theta}\bigr)=\frac{1}{N}\bigl\{\operatorname{Var}_X\bigl[E(Y\mid X)\bigr]+E_X\bigl[\operatorname{Var}(Y\mid X)\bigr]\bigr\}. \tag{5.25}
$$

Figure 5.2 An example where X is a good stratification but poor control variable

The first term on the right-hand side of Equation (5.25) is the amount of variance removed from the naive estimator using proportional sampling. The second term is the residual variance after doing so. If proportional sampling is used (it is often more convenient than optimum sampling, which requires estimation of the stratum variances $\sigma_i^2$ through some pilot runs), then we choose a stratification variable that tends to minimize the residual variance, or equivalently one that tends to maximize $\operatorname{Var}_X[E(Y\mid X)]$. Equation (5.25) shows that, with a fine enough proportional stratification, all the variation in $Y$ that is due to the variation in $E(Y\mid X)$ can be removed, leaving only the residual variation $E_X[\operatorname{Var}(Y\mid X)]$. This is shown in Figure 5.2, where a scatter plot of 500 realizations of $(X,Y)$ demonstrates that most of the variability in $Y$ will be removed through fine stratification. It is important to note that it is not just variation in the linear part of $E(Y\mid X)$ that is removed, but all of it.

5.3.1 A stratification example

Suppose we wish to estimate $\theta=E\bigl[(W_1+W_2)^{5/4}\bigr]$, where $W_1$ and $W_2$ are independently distributed Weibull variates with density

$$
f(x)=\tfrac{3}{2}x^{1/2}\exp\bigl(-x^{3/2}\bigr)
$$

on support $(0,\infty)$. Given two uniform random numbers $R_1$ and $R_2$,

$$
W_1=(-\ln R_1)^{2/3},\qquad W_2=(-\ln R_2)^{2/3},
$$

and so a naive Monte Carlo estimate of $\theta$ is

$$
Y=\bigl[(-\ln R_1)^{2/3}+(-\ln R_2)^{2/3}\bigr]^{5/4}.
$$

A naive simulation procedure, ‘weibullnostrat’ in Appendix 5.3.1, was called to generate 20 000 independent realizations of $Y$ (seed = 639 156) with the result

$$
\hat{\theta}=2.15843\quad\text{and}\quad \text{e.s.e.}\bigl(\hat{\theta}\bigr)=0.00913. \tag{5.26}
$$

For a stratified Monte Carlo, note that $Y$ is monotonic in both $R_1$ and $R_2$, so a reasonable choice for a stratification variable is $X=R_1R_2$. This is confirmed by the scatter plot (Figure 5.2) of 500 random pairs of $(X,Y)$. The joint density of $X$ and $R_2$ is

$$
f_{X,R_2}(x,r_2)=f_{R_1,R_2}\Bigl(\frac{x}{r_2},r_2\Bigr)\Bigl|\frac{\partial r_1}{\partial x}\Bigr|=\frac{1}{r_2}
$$

on support $0<x<r_2<1$. Therefore

$$
f_X(x)=\int_x^1\frac{\mathrm{d}r_2}{r_2}=-\ln x
$$

and the cumulative distribution is

$$
F_X(x)=x-x\ln x \tag{5.27}
$$

on $(0,1)$.
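Before deriving the conditional distribution needed for the stratified scheme, here is a minimal Python sketch of the naive estimator of this example together with an empirical check of the distribution function (5.27) just derived. The book's procedure (‘weibullnostrat’, Appendix 5.3.1) is written in Maple; the generator, seed and variable names below are my own choices, so the printed values will only approximate those in Equation (5.26).

```python
import numpy as np

rng = np.random.default_rng(12345)                 # arbitrary seed, not the book's 639 156
n = 20_000
r1, r2 = rng.random(n), rng.random(n)              # R1, R2 ~ U(0, 1)
y = ((-np.log(r1)) ** (2 / 3) + (-np.log(r2)) ** (2 / 3)) ** 1.25
print(y.mean(), y.std(ddof=1) / np.sqrt(n))        # near 2.16 and 0.009, cf. Equation (5.26)

# Empirical check that X = R1*R2 has c.d.f. F_X(x) = x - x*ln(x), Equation (5.27)
x = r1 * r2
for t in (0.1, 0.5, 0.9):
    print(round(float(np.mean(x <= t)), 3), round(t - t * np.log(t), 3))
```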
The conditional density of $R_2$ given $X$ is

$$
f_{R_2\mid X}(r_2\mid x)=\frac{f_{X,R_2}(x,r_2)}{f_X(x)}=-\frac{1}{r_2\ln x}
$$

on support $0<x<r_2<1$, and the cumulative conditional distribution function is

$$
F_{R_2\mid X}(r_2\mid x)=\int_x^{r_2}\frac{\mathrm{d}u}{-u\ln x}=1-\frac{\ln r_2}{\ln x}. \tag{5.28}
$$

$N$ realizations of $(X,Y)$ will be generated with $N$ strata, where $p_i=1/N$ for $i=1,\dots,N$. With this design and under proportional stratified sampling there is exactly one pair $(X,Y)$ for which $X\in S_i$ for each $i$. Let $U_i,V_i$ be independently distributed as $U(0,1)$. Using Equation (5.27) we generate $X_i$ from the $i$th stratum through

$$
X_i-X_i\ln X_i=\frac{i-1+U_i}{N}. \tag{5.29}
$$

Using Equation (5.28),

$$
\frac{\ln R_{i2}}{\ln X_i}=V_i,\qquad\text{that is}\qquad R_{i2}=X_i^{V_i}.
$$

Therefore $R_{i1}=X_i/R_{i2}$. Note that Equation (5.29) will need to be solved numerically, but this can be made more efficient by observing that $X_i\in(X_{i-1},1)$. The $i$th response is

$$
Y_i=\bigl[(-\ln R_{i1})^{2/3}+(-\ln R_{i2})^{2/3}\bigr]^{5/4}
$$

and the estimate is

$$
\hat{\theta}_{PS}=\sum_{i=1}^{N}p_iY_i=\frac{1}{N}\sum_{i=1}^{N}Y_i.
$$

To estimate $\operatorname{Var}(\hat{\theta}_{PS})$ we cannot simply use $\frac{1}{N(N-1)}\sum_{i=1}^{N}(Y_i-\hat{\theta}_{PS})^2$, as the $Y_i$ are from different strata and are therefore not identically distributed. One approach is to simulate $K$ independent realizations of $\hat{\theta}_{PS}$, as in the algorithm below:

    For j = 1, ..., K do
        For i = 1, ..., N do
            generate u, v ~ U(0, 1)
            solve  x - x ln x = (i - 1 + u)/N  for x
            r2 = x^v
            r1 = x/r2
            y_i = [(-ln r1)^(2/3) + (-ln r2)^(2/3)]^(5/4)
        end do
        ybar_j = (1/N) sum_{i=1}^{N} y_i
    end do
    theta_PS = (1/K) sum_{j=1}^{K} ybar_j
    Var(theta_PS) = [1/(K(K-1))] sum_{j=1}^{K} (ybar_j - theta_PS)^2

Using procedure ‘weibullstrat’ in Appendix 5.3.2 with $N=100$ and $K=200$ (and with the same seed as in the naive simulation), the results were

$$
\hat{\theta}_{PS}=2.16644\quad\text{and}\quad \text{e.s.e.}\bigl(\hat{\theta}_{PS}\bigr)=0.00132.
$$

Comparing this with Equation (5.26), stratification produces an estimated variance reduction ratio

$$
\text{vrr}=\Bigl(\frac{0.00913}{0.00132}\Bigr)^{2}\approx 48. \tag{5.30}
$$

The efficiency must take account of both the variance reduction ratio and the relative computer processing times. In this case stratified sampling took 110 seconds and naive sampling 21 seconds, so

$$
\text{Efficiency}=\frac{21\times 47.71}{110}\approx 9.
$$

Three points from this example are worthy of comment:

(i) The efficiency would be higher were it not for the time-consuming numerical solution of Equation (5.29). Problem 5 addresses this.

(ii) A more obvious design is to employ two stratification variables, $R_1$ and $R_2$. Accordingly, the procedure ‘grid’ in Appendix 5.3.3 uses 100 equiprobable strata on a $10\times 10$ grid on $(0,1)^2$, with exactly one observation in each stratum. Using $N=200$ replications (total sample size = 20 000 as before) and the same random number stream as before, this gave $\hat{\theta}=2.16710$, $\text{e.s.e.}(\hat{\theta})=0.00251$, $\text{vrr}=13$ and $\text{Efficiency}\approx 13$. Compared with the improved stratification method suggested in point (i), this would not be competitive. Moreover, this approach is very limited, as the number of strata increases exponentially with the dimension of an integral.

(iii) In the example it was fortuitous that it was easy to sample from both the distribution of the stratification variable $X$ and from the conditional distribution of $Y$ given $X$. In fact, this is rarely the case. However, the method of post stratification described in the next subsection avoids these problems.
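Before moving on to post stratification, here is a compact Python sketch of the stratified procedure above. The book's ‘weibullstrat’ is a Maple procedure; this version is an illustration only, and it uses SciPy's `brentq` root finder for Equation (5.29). The seed is arbitrary, so the output will only be close to the figures quoted above.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(12345)                   # arbitrary seed, not the book's

def stratified_weibull(n_strata=100, k_reps=200):
    """Proportional stratification on X = R1*R2 with one (X, Y) pair per stratum,
    following Equations (5.27)-(5.29); K independent replications give the e.s.e."""
    reps = np.empty(k_reps)
    for j in range(k_reps):
        y = np.empty(n_strata)
        for i in range(n_strata):                    # i = 0, ..., N-1 here, so i-1+u becomes i+u
            u, v = rng.random(2)
            target = (i + u) / n_strata
            # solve F_X(x) = x - x*ln(x) = target on (0, 1), i.e. Equation (5.29)
            x = brentq(lambda t: t - t * np.log(t) - target, 1e-12, 1.0 - 1e-12)
            r2 = x ** v                              # from the conditional c.d.f. (5.28)
            r1 = x / r2
            y[i] = ((-np.log(r1)) ** (2 / 3) + (-np.log(r2)) ** (2 / 3)) ** 1.25
        reps[j] = y.mean()
    return reps.mean(), reps.std(ddof=1) / np.sqrt(k_reps)

print(stratified_weibull())   # values near 2.166 and 0.0013 are expected
```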
5.3.2 Post stratification

This refers to a design in which the number of observations in each stratum is counted after naive sampling has been performed. In this case $n_i$ will be replaced by $N_i$ to emphasize that the $N_i$ are now random variables (with expectation $Np_i$). A naive estimator is

$$
\hat{\theta}=\sum_{i=1}^{M}\frac{N_i}{N}\bar{Y}_i,
$$

where $\bar{Y}_i$ denotes the mean of the $N_i$ responses falling in the $i$th stratum, but this takes no account of the useful information available in the $p_i$. Post (after) stratification uses

$$
\hat{\theta}_{AS}=\sum_{i=1}^{M}p_i\bar{Y}_i,
$$

conditional on no empty strata. The latter is easy to arrange with sufficiently large $Np_i$. The naive estimator assigns equal weight $1/N$ to each realization of the response $Y$, whereas $\hat{\theta}_{AS}$ assigns more weight ($p_i/N_i$) to those observations in strata that have been undersampled ($N_i<Np_i$) and less to those that have been oversampled ($N_i>Np_i$). Cochran (1977, p. 134) suggests that if $E(N_i)>20$ or so for all $i$, then the variance of $\hat{\theta}_{AS}$ differs little from that of $\hat{\theta}_{PS}$ obtained through proportional stratification with fixed $n_i=Np_i$. Of course, the advantage of post stratification is that there is no need to sample from the conditional distribution of $Y$ given $X$, nor indeed from the marginal distribution of $X$. Implementing post stratification requires only that cumulative probabilities for $X$ can be calculated. Given that there are $M$ equiprobable strata, this is needed to calculate $j=\lfloor MF_X(x)\rfloor+1$, which is the number of the stratum in which a pair $(x,y)$ falls.

This will now be illustrated by estimating

$$
\theta=E\bigl[(W_1+W_2+W_3+W_4)^{3/2}\bigr]
$$

where $W_1,\dots,W_4$ are independent Weibull random variables with cumulative distribution functions $1-\exp(-x^2)$, $1-\exp(-x^3)$, $1-\exp(-x^4)$ and $1-\exp(-x^5)$ respectively, on support $(0,\infty)$. Bearing in mind that a stratification variable is a function of other random variables, that it should have a high degree of dependence upon the response $Y=(W_1+W_2+W_3+W_4)^{3/2}$, and that it should have easily computed cumulative probabilities, it will be made a linear combination of standard normal random variables. Accordingly, define $z_i$ by $F_{W_i}(w_i)=\Phi(z_i)$ for $i=1,\dots,4$, where $\Phi$ is the cumulative normal distribution function. Then

$$
\theta=\int_{(0,\infty)^4}\Bigl(\sum_{i=1}^{4}w_i\Bigr)^{3/2}\prod_{i=1}^{4}f_{W_i}(w_i)\,\mathrm{d}w_i
=\int_{(-\infty,\infty)^4}\Bigl(\sum_{i=1}^{4}F_{W_i}^{-1}\bigl[\Phi(z_i)\bigr]\Bigr)^{3/2}\prod_{i=1}^{4}\phi(z_i)\,\mathrm{d}z_i
=E_{\mathbf{Z}\sim N(\mathbf{0},I)}\Bigl[\Bigl(\sum_{i=1}^{4}F_{W_i}^{-1}\bigl[\Phi(Z_i)\bigr]\Bigr)^{3/2}\Bigr],
$$

where $\phi$ is the standard normal density. Note that an unbiased estimator is $\bigl(\sum_{i=1}^{4}F_{W_i}^{-1}[\Phi(Z_i)]\bigr)^{3/2}$, where the $Z_i$ are independently $N(0,1)$, that is the vector $\mathbf{Z}\sim N(\mathbf{0},I)$, where the covariance matrix is the identity matrix $I$. Now

$$
\sum_{i=1}^{4}F_{W_i}^{-1}\bigl[\Phi(Z_i)\bigr]=\sum_{i=1}^{4}\bigl[-\ln\bigl(1-\Phi(Z_i)\bigr)\bigr]^{1/\beta_i}, \tag{5.31}
$$

where $\beta_1=2$, $\beta_2=3$, $\beta_3=4$ and $\beta_4=5$. Using Maple, a linear approximation to Equation (5.31) is found by expanding as a Taylor series about $\mathbf{z}=\mathbf{0}$. It is

$$
\tilde{X}=a_0+\sum_{i=1}^{4}a_iz_i,
$$

where $a_0=3.5593$, $a_1=0.4792$, $a_2=0.3396$, $a_3=0.2626$ and $a_4=0.2140$. Let

$$
X=\frac{\tilde{X}-a_0}{\sqrt{\sum_{i=1}^{4}a_i^2}}\sim N(0,1).
$$

Since $X$ is monotonic in $\tilde{X}$, the same variance reduction will be achieved with $X$ as with $\tilde{X}$. An algorithm simulating $K$ independent realizations, each comprising $N$ samples of $\bigl(\sum_{i=1}^{4}F_{W_i}^{-1}[\Phi(Z_i)]\bigr)^{3/2}$ on $M$ equiprobable strata, is shown below:

    For k = 1, ..., K do
        For j = 1, ..., M do
            s_j = 0 and n_j = 0
        end do
        For n = 1, ..., N do
            generate z_1, z_2, z_3, z_4 ~ N(0, 1)
            x = sum_{i=1}^{4} a_i z_i / sqrt(sum_{i=1}^{4} a_i^2)
            y = [sum_{i=1}^{4} F_{W_i}^{-1}(Phi(z_i))]^(3/2)
            j = floor(M * Phi(x)) + 1
            n_j = n_j + 1
            s_j = s_j + y
        end do
        ybar_k = (1/M) sum_{j=1}^{M} s_j / n_j
    end do
    theta_AS = (1/K) sum_{k=1}^{K} ybar_k
    Var(theta_AS) = [1/(K(K-1))] sum_{k=1}^{K} (ybar_k - theta_AS)^2

Figure 5.3 An example where X is both a good stratification and control variable

Using $K=50$, $N=400$ and $M=20$ (seed = 566 309) it is found that

$$
\hat{\theta}_{AS}=6.93055 \tag{5.32}
$$

and $\text{e.s.e.}(\hat{\theta}_{AS})=0.00223$. Using naive Monte Carlo with the same random number stream, $\text{e.s.e.}(\hat{\theta})=0.01093$, and so the estimated variance reduction ratio is

$$
\text{vrr}=\Bigl(\frac{0.01093}{0.00223}\Bigr)^{2}\approx 24. \tag{5.33}
$$

A scatter plot of 500 random pairs of $(X,Y)$, shown in Figure 5.3, illustrates the small variation about the regression curve $E(Y\mid X)$. This explains the effectiveness of the method.
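The post-stratification algorithm above translates directly into code. The book's implementation is a Maple procedure; the following Python sketch is an illustration only, using SciPy's normal distribution for $\Phi$ and an arbitrary seed, so the output will only be close to the figures in Equation (5.32). It assumes no stratum is empty, in line with Cochran's $E(N_i)>20$ guideline.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(12345)                       # arbitrary seed, not the book's
A = np.array([0.4792, 0.3396, 0.2626, 0.2140])           # a_1, ..., a_4 from the Taylor expansion
BETA = np.array([2.0, 3.0, 4.0, 5.0])                    # Weibull exponents beta_1, ..., beta_4

def post_stratified(n=400, m=20, k_reps=50):
    """Post-stratified estimate of theta = E[(W1+W2+W3+W4)^(3/2)] with m equiprobable
    strata on the standardized linear stratification variable X."""
    reps = np.empty(k_reps)
    for k in range(k_reps):
        z = rng.standard_normal((n, 4))
        y = np.sum((-np.log(norm.sf(z))) ** (1.0 / BETA), axis=1) ** 1.5   # response
        x = z @ A / np.sqrt(np.sum(A ** 2))                                # X ~ N(0, 1)
        j = np.minimum((m * norm.cdf(x)).astype(int), m - 1)               # stratum index 0..m-1
        reps[k] = np.mean([y[j == s].mean() for s in range(m)])            # equal weight 1/m per stratum
    return reps.mean(), reps.std(ddof=1) / np.sqrt(k_reps)

print(post_stratified())   # values near 6.93 and 0.002 are expected, cf. Equation (5.32)
```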
5.4 Control variates

Whereas stratified sampling exploits the dependence between a response $Y$ and a stratification variable $X$, the method of control variates exploits the correlation between a response and one or more control variables. As before, there is a response $Y$ from a simulation and we wish to estimate $\theta=E(Y)$, where $\sigma^2=\operatorname{Var}(Y)$. Now suppose that in the same simulation we collect additional statistics $\mathbf{X}=(X_1,\dots,X_d)'$ having known mean $\boldsymbol{\mu}_X=(\mu_1,\dots,\mu_d)'$, and that the covariance matrix for $(\mathbf{X},Y)$ is

$$
\begin{pmatrix}\Sigma_{XX}&\Sigma_{XY}\\ \Sigma_{XY}'&\sigma^2\end{pmatrix}.
$$

The variables $X_1,\dots,X_d$ are control variables. A control variate estimator

$$
\hat{\theta}_{\mathbf{b}}=Y-\mathbf{b}'(\mathbf{X}-\boldsymbol{\mu}_X)
$$

is considered for any known vector $\mathbf{b}=(b_1,\dots,b_d)'$. Now, $\hat{\theta}_{\mathbf{b}}$ is unbiased and

$$
\operatorname{Var}\bigl(\hat{\theta}_{\mathbf{b}}\bigr)=\sigma^2+\mathbf{b}'\Sigma_{XX}\mathbf{b}-2\mathbf{b}'\Sigma_{XY}.
$$

This is minimized when

$$
\mathbf{b}=\mathbf{b}^{*}=\Sigma_{XX}^{-1}\Sigma_{XY}, \tag{5.34}
$$

leading to a variance of

$$
\operatorname{Var}\bigl(\hat{\theta}_{\mathbf{b}^{*}}\bigr)=\sigma^2-\Sigma_{XY}'\Sigma_{XX}^{-1}\Sigma_{XY}=(1-R^2)\sigma^2,
$$

where $R^2$ is the proportion of variance removed from the naive estimator $\hat{\theta}=Y$. In practice the information will not be available to calculate Equation (5.34), so it may be estimated as follows. Typically, there will be a sample of independent realizations $\{(\mathbf{X}_k,Y_k),\,k=1,\dots,N\}$. Then

$$
\hat{\theta}=\bar{Y}=\frac{1}{N}\sum_{k=1}^{N}Y_k,\qquad \bar{\mathbf{X}}=\frac{1}{N}\sum_{k=1}^{N}\mathbf{X}_k.
$$

Let $X_{ik}$ denote the $i$th element of the column vector $\mathbf{X}_k$. Then an unbiased estimator of $\mathbf{b}^{*}$ is $\hat{\mathbf{b}}^{*}=S_{XX}^{-1}S_{XY}$, where the $(i,j)$th element of $S_{XX}$ is

$$
\frac{\sum_{k=1}^{N}(X_{ik}-\bar{X}_i)(X_{jk}-\bar{X}_j)}{N-1}
$$

and the $i$th element of $S_{XY}$ is

$$
\frac{\sum_{k=1}^{N}(X_{ik}-\bar{X}_i)(Y_k-\bar{Y})}{N-1}.
$$

Now the estimator

$$
\hat{\theta}_{\hat{\mathbf{b}}^{*}}=\bar{Y}-\bigl(\hat{\mathbf{b}}^{*}\bigr)'\bigl(\bar{\mathbf{X}}-\boldsymbol{\mu}_X\bigr) \tag{5.35}
$$

[...]

    > A := Matrix([[1, 0.7, 0.5, 0.3], [0.7, 1, 0.6, 0.2], [0.5, 0.6, 1, 0.4], [0.3, 0.2, 0.4, 1]]):
    > b := LUDecomposition(A, method = 'Cholesky');

        [ 1.                    0.                      0.                    0.                  ]
        [ 0.699999999999999954  0.714142842854284976    0.                    0.                  ]
        [ 0.500000000000000000  0.350070021007002463    0.792118034381339431  0.                  ]
        [ 0.299999999999999988  -0.0140028008402800531  0.321797951467419185  0.897914249803398512]

and so

$$
\begin{pmatrix}Z_1\\Z_2\\Z_3\\Z_4\end{pmatrix}=
\begin{pmatrix}
1&0&0&0\\
0.7&0.714143&0&0\\
0.5&0.350070&0.792118&0\\
0.3&-0.014003&0.321798&0.897914
\end{pmatrix}
\begin{pmatrix}Y_1\\Y_2\\Y_3\\Y_4\end{pmatrix},
$$

which gives $\mathbf{Z}$ the required correlation matrix $A$ when $Y_1,\dots,Y_4$ are independent $N(0,1)$ variables.

[...] concerned, which induces a large estimated standard error on the variance reduction ratio. This does not detract from the main point emerging from this example. It is that if there is strong linear dependence between $Y$ and $X$, little efficiency is likely to be lost in using a control variate in preference to stratification.

5.5 Conditional Monte Carlo

Conditional Monte Carlo works by performing as much as [...]
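Returning to the control variate estimator of Section 5.4, a minimal Python sketch with a single control variable may help fix ideas. It is my own illustration rather than the book's example: it reuses the four-Weibull response of Section 5.3.2 together with its standardized linear stratification variable $X$ (known mean 0), which Figure 5.3 suggests is also a good control variable.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(12345)                    # arbitrary seed
A = np.array([0.4792, 0.3396, 0.2626, 0.2140])        # a_1, ..., a_4 from Section 5.3.2
BETA = np.array([2.0, 3.0, 4.0, 5.0])

def control_variate(n=20_000):
    """Single control variate X with known mean 0 for Y = (W1+...+W4)^(3/2),
    using the estimated coefficient b-hat* = S_XY / S_XX (Equations (5.34)-(5.35))."""
    z = rng.standard_normal((n, 4))
    y = np.sum((-np.log(norm.sf(z))) ** (1.0 / BETA), axis=1) ** 1.5
    x = z @ A / np.sqrt(np.sum(A ** 2))               # control variable, E[X] = 0
    b = np.cov(x, y)[0, 1] / np.var(x, ddof=1)        # sample estimate of b*
    theta_cv = y.mean() - b * (x.mean() - 0.0)        # control variate estimator (5.35)
    ese = (y - b * x).std(ddof=1) / np.sqrt(n)        # rough e.s.e., treating b as fixed
    return theta_cv, ese

print(control_variate())   # point estimate near 6.93; e.s.e. well below the naive 0.011
```

Because $E(Y\mid X)$ is close to linear here (Figure 5.3), most of the reduction achieved by stratification is recovered by this single control variate, in line with the remark above.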
6 Simulation and finance

A derivative is a tradeable asset whose price depends upon other underlying variables. The variables include the prices of other assets. Monte Carlo methods are now used routinely in the pricing of financial derivatives. The reason for this is that, apart [...] options, most calculations involve the evaluation of high-dimensional definite integrals. To see why Monte Carlo may be better than standard numerical methods, suppose we wish to evaluate

$$
I=\int_0^1 f(x)\,\mathrm{d}x,
$$

where $f(x)$ is integrable. Using the composite trapezium rule, a subinterval length of $h$ is chosen such that $(m-1)h=1$, and then $f$ is evaluated at $m$ equally spaced points in $[0,1]$. The error in this method is $O(h^2)$. [...] lattice covering the cube. If $h$ is again the subinterval length along any of the $d$ axes, then $mh^d\approx 1$. The resulting error is $O(h^2)=O(1/m^{2/d})$. However, using Monte Carlo, the error is still $O(1/\sqrt{m})$. Therefore, for $d>4$ and for sufficiently large $m$, Monte Carlo will be better than the trapezium rule. This advantage increases exponentially with increasing dimension. As will be seen, in financial applications [...] value of $d=100$ is not unusual, so Monte Carlo is the obvious choice.

This chapter provides an introduction to the use of Monte Carlo in financial applications. For more details on the financial aspects there are many books that can be consulted, including those by Hull (2006) and Wilmott (1998). For a state-of-the-art description of Monte Carlo applications Glasserman (2004) is recommended. The basic mathematical [...] that have been developed in finance assume an underlying geometric Brownian motion. First the main features of a Brownian motion, also known as a Wiener process, will be described.

6.1 Brownian motion

Consider a continuous state, continuous time stochastic process [...]
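As a rough numerical check on the dimension argument in the chapter opening (entirely my own illustration; the integrand, dimension, seed and point budgets are arbitrary choices, not from the book), the sketch below compares a tensor-product composite trapezium rule with plain Monte Carlo on $(0,1)^6$ at the same budget of points. The exact integral of the chosen test function is 1, so the printed values are absolute errors; at this budget the lattice rule's error is noticeably larger than the Monte Carlo error.

```python
import numpy as np

rng = np.random.default_rng(12345)   # arbitrary seed

def f(x):
    """Smooth test integrand on (0,1)^d with exact integral 1: prod (pi/2) sin(pi x_i)."""
    return np.prod(0.5 * np.pi * np.sin(np.pi * x), axis=-1)

def trapezium(d, pts_per_axis):
    """Tensor-product composite trapezium rule on a d-dimensional lattice."""
    g = np.linspace(0.0, 1.0, pts_per_axis)
    w = np.ones(pts_per_axis)
    w[0] = w[-1] = 0.5                               # 1-D trapezium weights
    x = np.stack(np.meshgrid(*([g] * d), indexing="ij"), axis=-1).reshape(-1, d)
    wt = np.prod(np.stack(np.meshgrid(*([w] * d), indexing="ij"), axis=-1).reshape(-1, d), axis=-1)
    h = 1.0 / (pts_per_axis - 1)
    return np.sum(wt * f(x)) * h ** d

def monte_carlo(d, m):
    """Plain Monte Carlo with m uniform points in (0,1)^d."""
    return f(rng.random((m, d))).mean()

d, pts = 6, 5
m = pts ** d                                         # 15 625 points for both methods
print(abs(trapezium(d, pts) - 1.0), abs(monte_carlo(d, m) - 1.0))
```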