MULTIVARIATE, COMBINATORIAL AND DISCRETIZED NORMAL APPROXIMATIONS BY STEIN’S METHOD

FANG XIAO
(B.Sc. Peking University)

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF STATISTICS AND APPLIED PROBABILITY
NATIONAL UNIVERSITY OF SINGAPORE
2012

ACKNOWLEDGEMENTS

I am grateful to my advisor, Professor Louis H.Y. Chen, for teaching me Stein’s method, giving me problems to work on and guiding me through the writing of this thesis. His encouragement and insistence on perfection have motivated me to overcome many difficulties during my research. I also want to thank my co-supervisor, Dr. Zhengxiao Wu, for helpful discussions.

Professor Zhidong Bai has played an important role in my academic life. He introduced me to this wonderful department when I graduated from Peking University and had nowhere to go. Moreover, I became a Ph.D. student of Professor Louis Chen because of his recommendation.

Two people have been particularly helpful in my learning of and research on Stein’s method. During the writing of a paper with Professor Qiman Shao, we had several discussions and I learnt a lot from him. When I showed an earlier version of my results on multivariate normal approximation to Adrian Röllin, he suggested that I unify them in the framework of Stein coupling and pointed out a mistake. The presentation of this thesis has been greatly improved by following his suggestions.

I would like to thank some members of our weekly working seminar for the inspiring discussions. The list includes Wang Zhou, Rongfeng Sun, Sanjay Chauhuri, Le Van Thanh and Daniel Paulin.

I am thankful to my thesis examiners, Professors Andrew Barbour, Kwok Pui Choi and Gesine Reinert, for their valuable comments.

The Department of Statistics and Applied Probability at the National University of Singapore is a great place to study.
I thank the faculty members for teaching me courses and all my friends for the happy times we had together.

I thank my parents for their support during all these years. Although not in Singapore, they are very concerned about my life here. Whatever achievement or difficulty I had, they were the first people I wanted to share it with. This thesis is dedicated to my parents.

The writing of this thesis was partially supported by Grant C-389-000-010-101 and Grant C-389-000-012-101 at the National University of Singapore.

CONTENTS

Acknowledgements
Summary
List of Symbols
Chapter 1 Introduction
    1.1 Stein’s method for normal approximation
    1.2 Multivariate normal approximation
    1.3 Combinatorial central limit theorem
    1.4 Discretized normal approximation
Chapter 2 Multivariate Normal Approximation under Stein Coupling: The Bounded Case
    2.1 Multivariate Stein coupling
    2.2 Main results
    2.3 Bounded local dependence
    2.4 Base-(k + 1) expansion of a random integer
Chapter 3 Multivariate Normal Approximation under Stein Coupling: The Unbounded Case
    3.1 Main result
    3.2 Local dependence
    3.3 Number of vertices with a given degree sequence on an Erdős-Rényi graph
Chapter 4 Multivariate Normal Approximation by the Concentration Inequality Approach
    4.1 Concentration inequalities
        4.1.1 Multivariate normal distribution
        4.1.2 Sum of independent random vectors
    4.2 Multivariate normal approximation for independent random vectors
    4.3 Proofs of the lemmas
Chapter 5 Combinatorial CLT by the Concentration Inequality Approach
    5.1 Statement of the main result
    5.2 Concentration inequalities via exchangeable pairs
    5.3 Proof of the main result
Chapter 6 Discretized Normal Approximation for Dependent Random Integers
    6.1 Total variation approximation
    6.2 Discretized normal approximation for sums of independent integer valued random variables
    6.3 Discretized normal approximation under Stein coupling
    6.4 Applications of the main theorem
        6.4.1 Local dependence
        6.4.2 Exchangeable pairs
        6.4.3 Size-biasing
Bibliography

SUMMARY

Stein’s method is a method for proving distributional approximations along with error bounds. Its power in handling dependence among random variables has attracted many theoretical and applied researchers to work on it. Our goal in this thesis is to prove bounds on non-smooth function distances, for example the Kolmogorov distance, between distributions of sums of dependent random variables and Gaussian distributions. The following three topics in normal approximation by Stein’s method are studied.

Multivariate normal approximation. Since Stein introduced his method, much has been developed for normal approximation in one dimension for dependent random variables for both smooth and non-smooth functions.
On the other hand, Stein’s method for multivariate normal approximation first appeared only in Barbour (1990) and Götze (1991), and relatively few results have been obtained for non-smooth functions, typically for indicators of convex sets in finite dimensional Euclidean spaces. In general, it is much harder to obtain optimal bounds for non-smooth functions than for smooth functions. Under the setting of Stein coupling introduced by Chen and Röllin (2010), we obtain bounds on non-smooth function distances between distributions of sums of dependent random vectors and multivariate normal distributions using the recursive approach in Chapter 2 and Chapter 3. By extending the concentration inequality approach to the multivariate setting, a multivariate normal approximation theorem on convex sets is proved for sums of independent random vectors in Chapter 4. The resulting bound is better than the one obtained by Götze (1991). Moreover, our concentration inequality approach provides a new way of dealing with dependent random vectors, for example, those under local dependence, for which the induction approach or the method of Bentkus (2003) is not likely to be applicable.

Combinatorial central limit theorem. The combinatorial central limit theorem has a long history and is one of the most successful applications of Stein’s method. A third-moment bound for a combinatorial central limit theorem was obtained in Bolthausen (1984), who used Stein’s method and induction. The bound in Bolthausen (1984) does not have an explicit constant and is only applicable in the fixed-matrix case. In Chapter 5, we give a different proof of the combinatorial central limit theorem using Stein’s method of exchangeable pairs and a concentration inequality. We assume the matrix to be random and our bound has an explicit constant.

[...]

6.4 Applications of the main theorem

[...] $1 - P(\zeta_1 = 0) = p$.
Consider the integer valued random variable $S = \sum_{i=1}^{n} X_i$ where $X_i = \zeta_i \zeta_{i+1}$ and $\zeta_{n+1} = \zeta_1$. Then $S$ can be regarded as a sum of locally dependent integer valued random variables with $\theta = 7$. The mean and variance of $S$ can be calculated as
\[
\mu = \mathbb{E}S = np^2, \qquad \sigma^2 = \mathrm{Var}(S) = n(p^2 + 2p^3 - 3p^4). \tag{6.43}
\]
From (6.41), with $c_p$, $c_p'$ constants depending on $p$,
\[
d_{TV}(\mathcal{L}(S), N^d(\mu, \sigma^2)) \le \frac{c_p}{\sqrt{n}} + c_p'\, d_{TV}(\mathcal{L}(V), \mathcal{L}(V+1)) \tag{6.44}
\]
where, with $m = n - 4$ and $a, b \in \{0, 1\}$ given,
\[
V = a\zeta_1 + \sum_{j=2}^{m} \zeta_{j-1}\zeta_j + b\zeta_m. \tag{6.45}
\]
Regarding $V = f(\zeta_1, \dots, \zeta_m)$, we define $V' = f(\zeta_1, \dots, \zeta_I', \dots, \zeta_m)$ where $I$ is uniform in $\{1, 2, \dots, m\}$, independent of $\{\zeta_1, \dots, \zeta_m\}$, and $\zeta_I'$ is an independent copy of $\zeta_I$. Then $(V, V')$ is an exchangeable pair. It is easy to verify that
\[
P(V' - V = 1) \ge \frac{2(n-6)}{n-4}\, p^2 (1-p)^2, \tag{6.46}
\]
\[
\mathrm{Var}(\mathbb{E}(I(V' - V = 1) \mid V)) \le \left(\frac{1-p}{n-4}\right)^{2} \bigl([\ldots] + 10(n-6)p^2(1-p)\bigr) \tag{6.47}
\]
and
\[
\mathrm{Var}(\mathbb{E}(I(V' - V = -1) \mid V)) \le \left(\frac{1-p}{n-4}\right)^{2} \bigl([\ldots] + 10(n-6)p^2(1-p)\bigr). \tag{6.48}
\]
From Lemma 6.1,
\[
d_{TV}(\mathcal{L}(V), \mathcal{L}(V+1)) \le \frac{\sqrt{[\ldots] + 10(n-6)p^2(1-p)}}{(n-6)p^2}. \tag{6.49}
\]
Therefore,
\[
d_{TV}(\mathcal{L}(S), N^d(\mu, \sigma^2)) \le c_p/\sqrt{n} \tag{6.50}
\]
where $c_p$ is a constant depending on $p$. This problem was studied in Barbour and Xia (1999) and Röllin (2005) by using translated Poisson approximation. Barbour and Xia (1999) assumed some extra conditions on $p$ to obtain a bound on the total variation distance between $S$ and a translated Poisson distribution. Although the result in Röllin (2005) applies for all $p$, the approach used was different from ours.

Remark 6.1. Corollary 6.1 may be used to prove discretized normal approximation results for word counts in DNA sequences, assuming the base pairs are independent.

6.4.2 Exchangeable pairs

Here we consider an exchangeable pair of integer valued random variables $(S, S')$ with $\mathbb{E}S = \mu$, $\mathrm{Var}(S) = \sigma^2$. Suppose we have the following approximate linearity condition:
\[
\mathbb{E}(S - S' \mid S) = \lambda(S - \mu) + \sigma\,\mathbb{E}(R \mid S). \tag{6.51}
\]
Then a simple modification of Theorem 6.2 yields the following corollary.

Corollary 6.2. We have
\[
\begin{aligned}
d_{TV}(\mathcal{L}(S), N^d(\mu, \sigma^2))
&\le \left(\sqrt{\frac{\pi}{2}} + 2\right)\frac{\sqrt{\mathbb{E}R^2}}{\lambda} + \frac{\sqrt{\mathrm{Var}(\mathbb{E}((S'-S)^2 \mid S))}}{\lambda\sigma^2} \\
&\quad + \sqrt{\frac{\pi}{8}}\,\frac{\mathbb{E}|S'-S|^3}{2\lambda\sigma^3} + \frac{\sqrt{\mathbb{E}|S'-S|^6}}{2\lambda\sigma^3} \\
&\quad + \frac{\mathbb{E}|S'-S|^3 + \mathbb{E}(S'-S)^2}{4\lambda\sigma^2}\, \sup_{\theta} d_{TV}(\mathcal{L}(S \mid \Theta = \theta), \mathcal{L}(S+1 \mid \Theta = \theta))
\end{aligned}
\tag{6.52}
\]
where $\Theta$ is any random vector such that $\mathcal{B}(S' - S) \subset \mathcal{B}(\Theta)$.

Proof. Let $G = (S' - S)/(2\lambda)$ and $D = S' - S$. Note that, because of the remainder term in the approximate linearity condition (6.51),
\[
\mathbb{E}(S - \mu)f(S) = \mathbb{E}\{Gf(S') - Gf(S)\} - \frac{\sigma}{\lambda}\,\mathbb{E}f(S)R.
\]
Therefore, (6.19) has an extra term $\sigma\,\mathbb{E}f_h(S)R/\lambda$, which is bounded by $\sqrt{\pi/2}\,\mathbb{E}|R|/\lambda$. Moreover, from (6.51),
\[
\mathbb{E}GD = \sigma^2 + \sigma\,\mathbb{E}((S - \mu)R)/\lambda.
\]
Hence, instead of (6.22),
\[
|R_1| \le \frac{2}{\sigma^2}\left(\sqrt{\mathrm{Var}(\mathbb{E}(GD \mid S))} + \frac{\sigma}{\lambda}\,\mathbb{E}|(S-\mu)R|\right) \le \frac{\sqrt{\mathrm{Var}(\mathbb{E}((S'-S)^2 \mid S))}}{\lambda\sigma^2} + \frac{2}{\lambda}\sqrt{\mathbb{E}R^2},
\]
where the second inequality uses $GD = (S'-S)^2/(2\lambda)$ and the Cauchy-Schwarz inequality. Therefore, Corollary 6.2 follows from Theorem 6.2. □

If the exchangeable pair $(S, S')$ satisfies $|S - S'| \le 1$, we have the following corollary.

Corollary 6.3. Let $(S, S')$ be an exchangeable pair of integer valued random variables satisfying the linearity condition (6.51). In addition, suppose $|S - S'| \le 1$. Then we have
\[
d_{TV}(\mathcal{L}(S), N^d(\mu, \sigma^2)) \le \left(\sqrt{\frac{\pi}{2}} + 2\right)\frac{\sqrt{\mathbb{E}R^2}}{\lambda} + \frac{\sqrt{\mathrm{Var}(\mathbb{E}((S'-S)^2 \mid S))}}{\lambda\sigma^2} + \frac{\sqrt{\pi/8} + 1}{2\lambda\sigma^3}. \tag{6.53}
\]

Proof. Let $G = (S' - S)/(2\lambda)$, $D = S' - S$. Then for $h \in \mathcal{H}$,
\[
\begin{aligned}
\frac{1}{\sigma^2}\,\mathbb{E}\,G\int_0^{D} (h(S+t) - h(S))\,dt
&= \frac{1}{2\lambda\sigma^2}\,\mathbb{E}\,(S'-S)\int_0^{S'-S} (h(S+t) - h(S))\,dt \\
&= \frac{1}{2\lambda\sigma^2}\,\mathbb{E}\Bigl[\int_0^{1} (h(S+t) - h(S))\,dt\; I(S'-S = 1) \\
&\qquad\qquad - \int_0^{-1} (h(S+t) - h(S))\,dt\; I(S'-S = -1)\Bigr] \\
&= \frac{1}{4\lambda\sigma^2}\,\mathbb{E}\bigl[(h(S+1) - h(S))\,I(S'-S = 1) + (h(S-1) - h(S))\,I(S'-S = -1)\bigr] \\
&= \frac{1}{4\lambda\sigma^2}\,\mathbb{E}\bigl[(h(S') - h(S))\,I(S'-S = 1) - (h(S) - h(S'))\,I(S-S' = 1)\bigr] \\
&= 0.
\end{aligned}
\tag{6.54}
\]
We used the exchangeability of $(S, S')$ in the last equality. From (6.54), the last term in the first line of the inequality (6.24) equals 0. Therefore, the bound on $d_{TV}(\mathcal{L}(S), N^d(\mu, \sigma^2))$ can be deduced in the same way as (6.52), except that the last term of (6.52) does not appear in this situation. □

Remark 6.2. Exchangeable pairs of integer valued random variables $(S, S')$ such that $|S' - S| \le 1$ are commonly seen in the literature.
Examples include the binary expansion of a random integer (Diaconis (1977)) and the anti-voter model (Rinott and Rotar (1997)). Corollary 6.3 shows that under this special assumption, bounding the total variation distance requires no more effort than bounding the Kolmogorov distance.

6.4.3 Size-biasing

Theorem 6.2 has the following corollary for size bias coupling.

Corollary 6.4. Let $S$ be a non-negative integer valued random variable with mean $\mu$ and finite variance $\sigma^2$. Let $S^s$ have the size biased distribution of $S$ and be defined on the same probability space as $S$. Then
\[
\begin{aligned}
d_{TV}(\mathcal{L}(S), N^d(\mu, \sigma^2))
&\le \frac{2\mu}{\sigma^2}\sqrt{\mathrm{Var}(\mathbb{E}(S^s - S \mid S))} + \sqrt{\frac{\pi}{8}}\,\frac{\mu}{\sigma^3}\,\mathbb{E}|S^s - S|^2 + \frac{\mu}{\sigma^3}\sqrt{\mathbb{E}|S^s - S|^4} \\
&\quad + \frac{\mu\,(\mathbb{E}|S^s - S|^2 + \mathbb{E}|S^s - S|)}{2\sigma^2}\, \sup_{\theta} d_{TV}(\mathcal{L}(S \mid \Theta = \theta), \mathcal{L}(S+1 \mid \Theta = \theta))
\end{aligned}
\tag{6.55}
\]
where $\Theta$ is any random vector such that $\mathcal{B}(S^s - S) \subset \mathcal{B}(\Theta)$.

Proof. The bound (6.55) follows from Theorem 6.2 and (6.7). □

Example: lightbulb process. We consider the lightbulb process studied in Goldstein and Zhang (2011), to which we refer for the history of this problem. There are $n$ lightbulbs. Initially these lightbulbs are all off. On days $r = 1, 2, \dots, n$, we change the status of $r$ bulbs from off to on, or from on to off. These $r$ bulbs are chosen uniformly from the $n$ bulbs, independently of the choices on the other days. Let $X$ denote the number of bulbs on after $n$ days. A Berry-Esseen bound was proved in Goldstein and Zhang (2011) on the Kolmogorov distance between $\mathcal{L}(X)$ and $N(\tilde\mu, \tilde\sigma^2)$ where $\mathbb{E}X = \tilde\mu$, $\mathrm{Var}(X) = \tilde\sigma^2$. Here we derive a bound on the total variation distance between $\mathcal{L}(X)$ and $N^d(\tilde\mu, \tilde\sigma^2)$. For simplicity, assume $n = 4l$ for some positive integer $l$. Then $X$ must be an even number and $0 \le X \le n$. Define $S = X/2$. Then $\mu = \mathbb{E}S = \tilde\mu/2$ and $\sigma^2 = \mathrm{Var}(S) = \tilde\sigma^2/4$. We have the following proposition.

Proposition 6.2. With $S$ defined above,
\[
d_{TV}(\mathcal{L}(S), N^d(\mu, \sigma^2)) \le c/\sqrt{n} \tag{6.56}
\]
where $c$ is an absolute constant.

Remark 6.3.
In Goldstein and Xia (2011), a clubbed binomial approximation for $X$ was proved. Their bound, together with a bound on the total variation distance between clubbed binomial distributions and discretized normal distributions, yields (6.56). Here, we give a direct proof of (6.56) by applying Corollary 6.4.

Proof. Define $S^s = X^s/2$ where $X^s$ has the $X$-size biased distribution and is coupled with $X$. Then $(S, S^s, \mu)$ is a Stein coupling. Following Goldstein and Zhang (2011), $X^s$ can be constructed in the following way. Let $\mathbf{X} = \{X_{rk} : r, k = 1, 2, \dots, n\}$ be a collection of switch variables with distribution
\[
P(X_{r1} = e_1, \dots, X_{rn} = e_n) =
\begin{cases}
1\big/\binom{n}{r} & \text{if } e_1, \dots, e_n \in \{0, 1\} \text{ and } e_1 + \dots + e_n = r, \\
0 & \text{otherwise,}
\end{cases}
\tag{6.57}
\]
and the collections $\{X_{r1}, \dots, X_{rn}\}$ are independent for $r = 1, \dots, n$. Let $X_i = (\sum_{r=1}^{n} X_{ri}) \bmod 2$ for each $i \in \{1, \dots, n\}$; then the number of bulbs on after $n$ days is $X = \sum_{i=1}^{n} X_i$. Let $\mathbf{X}^i$ be given from $\mathbf{X}$ as follows. If $X_i = 1$, then $\mathbf{X}^i = \mathbf{X}$. Otherwise, with $J^i$ uniformly chosen from $\{j : X_{n/2,j} = 1 - X_{n/2,i}\}$, independent of $\{X_{rk} : r \ne n/2,\ k = 1, \dots, n\}$, let $\mathbf{X}^i = \{X^i_{rk} : r, k = 1, \dots, n\}$ where
\[
X^i_{rk} =
\begin{cases}
X_{rk} & r \ne n/2, \\
X_{n/2,k} & r = n/2,\ k \notin \{i, J^i\}, \\
X_{n/2,J^i} & r = n/2,\ k = i, \\
X_{n/2,i} & r = n/2,\ k = J^i,
\end{cases}
\tag{6.58}
\]
and let $X^i = \sum_{k=1}^{n} X^i_k$ where
\[
X^i_k = \Bigl(\sum_{r=1}^{n} X^i_{rk}\Bigr) \bmod 2. \tag{6.59}
\]
Then, it was proved in Goldstein and Zhang (2011) that with $I$ uniformly chosen from $\{1, 2, \dots, n\}$ and independent of all other variables, the mixture $X^I = X^s$ has the $X$-size biased distribution. It was pointed out in Goldstein and Zhang (2011) that
\[
X^s - X = 2\,I(X_I = 0, X_{J^I} = 0). \tag{6.60}
\]
Therefore,
\[
S^s - S = I(X_I = 0, X_{J^I} = 0). \tag{6.61}
\]
From Lemma 3.3 in Goldstein and Zhang (2011) and the facts that $\mu = O(n)$, $\sigma^2 = O(n)$, the first three terms in the bound (6.55) are of order $O(1/\sqrt{n})$. Therefore, we only need to prove that $\sup_{\theta} d_{TV}(\mathcal{L}(S \mid \Theta = \theta), \mathcal{L}(S+1 \mid \Theta = \theta)) = O(1/\sqrt{n})$ where $\Theta$ is any random vector such that $\mathcal{B}(X^s - X) \subset \mathcal{B}(\Theta)$. If $X_I = 1$, we define $J^I = I$. Then $X^s - X$ is determined by
\[
\Theta := \{I, J^I, X_{rk} : r \in \{1, \dots, n\},\ k \in \{I, J^I\}\}.
\]
Assume without loss of generality that $J^I \ne I$ and denote $J^I$ by $J$. Given any realization of $\Theta$, we define a new $n$ by $n - 2$ random matrix $\mathbf{Y} = \{Y_{rk} : r = 1, \dots, n,\ k = 1, \dots, n-2\}$ where
\[
P(Y_{r1} = e_1, \dots, Y_{r,n-2} = e_{n-2}) =
\begin{cases}
1\big/\binom{n-2}{r - X_{rI} - X_{rJ}} & \text{if } e_1, \dots, e_{n-2} \in \{0, 1\} \text{ and } e_1 + \dots + e_{n-2} = r - X_{rI} - X_{rJ}, \\
0 & \text{otherwise,}
\end{cases}
\tag{6.62}
\]
and the collections $\{Y_{r1}, \dots, Y_{r,n-2}\}$ are independent for $r = 1, \dots, n$. Let $Y_i = (\sum_{r=1}^{n} Y_{ri}) \bmod 2$ for each $i \in \{1, \dots, n-2\}$ and $Y = \sum_{i=1}^{n-2} Y_i$. Then
\[
d_{TV}(\mathcal{L}(S \mid \Theta), \mathcal{L}(S+1 \mid \Theta)) = d_{TV}(\mathcal{L}(V), \mathcal{L}(V+1)) \tag{6.63}
\]
where $V = Y/2$. We bound $d_{TV}(\mathcal{L}(V), \mathcal{L}(V+1))$ by applying Lemma 6.1. Note that because $X_{rI} \ne X_{rJ}$ for $r = n/2$, there are $n/2 - 1$ ones (and hence $n/2 - 1$ zeros) in the $(n/2)$th row of $\mathbf{Y}$. We uniformly and independently choose one of these ones (in column $I^\dagger$) and exchange it with a uniformly and independently chosen zero (in column $J^\dagger$) in the $(n/2)$th row. By doing this, we change the values of $Y_{I^\dagger}$ and $Y_{J^\dagger}$. Define $Y'$ to be the sum of $Y_i$, $i \in \{1, \dots, n-2\}$, after the above exchange. Then $(Y, Y')$ is an exchangeable pair. Define $V' = Y'/2$; then $(V, V')$ is also an exchangeable pair and
\[
\begin{aligned}
I(V - V' = 1) &= I(Y_{I^\dagger} = 1, Y_{J^\dagger} = 1) \\
&= \sum_{i \ne j}^{n-2} I(Y_i = 1, Y_j = 1)\, I(I^\dagger = i, J^\dagger = j) \\
&= \sum_{i \ne j}^{n-2} I(Y_i = 1, Y_j = 1, Y_{n/2,i} = 1, Y_{n/2,j} = 0)\, I(I^\dagger = i, J^\dagger = j).
\end{aligned}
\tag{6.64}
\]
Therefore,
\[
\mathbb{E}(I(V - V' = 1) \mid \mathbf{Y}) = \frac{4}{(n-2)^2} \sum_{i \ne j}^{n-2} I(Y_i = 1, Y_j = 1, Y_{n/2,i} = 1, Y_{n/2,j} = 0). \tag{6.65}
\]
Following essentially the same calculation as on pages 11-12 of Goldstein and Zhang (2011), we can prove that
\[
\mathrm{Var}(\mathbb{E}(I(V - V' = 1) \mid V)) \le \mathrm{Var}(\mathbb{E}(I(V - V' = 1) \mid \mathbf{Y})) = O(1/n) \tag{6.66}
\]
and
\[
P(V - V' = 1) = \mathbb{E}\,I(V - V' = 1) = O(1). \tag{6.67}
\]
Similarly,
\[
\mathrm{Var}(\mathbb{E}(I(V - V' = -1) \mid V)) \le \mathrm{Var}(\mathbb{E}(I(V - V' = -1) \mid \mathbf{Y})) = O(1/n). \tag{6.68}
\]
Therefore, by Lemma 6.1,
\[
d_{TV}(\mathcal{L}(V), \mathcal{L}(V+1)) = O(1/\sqrt{n}). \tag{6.69}
\]
This completes the proof. □

A final remark. It would be interesting to examine multivariate discretized normal approximation.

Bibliography

Arratia, R., Goldstein, L. and Gordon, L. (1990) Poisson approximation and the Chen-Stein method. Statist. Sci. 403-424.
Ball, K. (1993) The reverse isoperimetric problem of Gaussian measure. Discrete Comput. Geom. 10 411-420.
Barbour, A.D. (1990) Stein’s method for diffusion approximations. Probab. Theory Related Fields 84 297-322.
Barbour, A.D., Chen, L.H.Y. and Loh, W.L. (1992) Compound Poisson approximation for nonnegative random variables via Stein’s method. Ann. Probab. 20 1843-1866.
Barbour, A.D., Holst, L. and Janson, S. (1992) Poisson approximation. Oxford University Press.
Barbour, A.D., Karonski, M. and Rucinski, A. (1989) A central limit theorem for decomposable random variables, with applications to random graphs. J. Combin. Theory Ser. B 47 125-145.
Barbour, A.D. and Xia, A. (1999) Poisson perturbations. ESAIM Probab. Stat. 131-150.
Bentkus, V. (2003) On the dependence of the Berry-Esseen bound on dimension. J. Statist. Plann. Inference 113 385-402.
Bhattacharya, R.N. and Holmes, S. (2010) An exposition of Götze’s Estimation of the Rate of Convergence in the Multivariate Central Limit Theorem. Technical Report, Stanford University.
Bhattacharya, R.N. and Rao, R.R. (1986) Normal approximation and asymptotic expansions. Wiley.
Bolthausen, E. (1984) An estimate of the remainder in a combinatorial central limit theorem. Z. Wahrscheinlichkeitstheorie verw. Gebiete 66 379-386.
Bolthausen, E. and Götze, F. (1993) The rate of convergence for multivariate sampling statistics. Ann. Statist. 1692-1710.
Chatterjee, S. (2008) A new method of normal approximation. Ann. Probab. 36 1584-1610.
Chatterjee, S. and Meckes, E. (2008) Multivariate normal approximation using exchangeable pairs. ALEA Lat. Am. J. Probab. Math. Stat. 257-283.
Chen, L.H.Y. (1975a) Poisson approximation for dependent trials. Ann. Probab. 534-545.
Chen, L.H.Y. (1975b) An approximation theorem for sums of certain randomly selected indicators. Z. Wahrscheinlichkeitstheorie verw. Gebiete 33 69-74.
Chen, L.H.Y. (1986) The rate of convergence in a central limit theorem for dependent random variables with arbitrary index set. IMA Preprint Series 243, Univ. Minnesota.
Chen, L.H.Y. (1998) Stein’s method: some perspectives with applications. Probability Towards 2000. L. Accardi and C.C. Heyde, eds., Lecture Notes in Statistics 128, Springer Verlag, 515-528.
Chen, L.H.Y., Fang, X. and Shao, Q.M. (2011) From Stein identities to moderate deviations. Preprint. Available at http://arxiv.org/abs/0911.5373.
Chen, L.H.Y., Goldstein, L. and Shao, Q.M. (2010) Normal approximation by Stein’s method. Springer.
Chen, L.H.Y. and Leong, Y.K. (2010) From zero-bias to discretized normal approximation. Unpublished.
Chen, L.H.Y. and Röllin, A. (2010) Stein couplings for normal approximation. Preprint. Available at http://arxiv.org/abs/1003.6039.
Chen, L.H.Y. and Shao, Q.M. (2001) A non-uniform Berry-Esseen bound via Stein’s method. Probab. Theory Related Fields 120 236-254.
Chen, L.H.Y. and Shao, Q.M. (2004) Normal approximation under local dependence. Ann. Probab. 32 1985-2028.
Chen, L.H.Y. and Shao, Q.M. (2005) Stein’s Method for Normal Approximation. An Introduction to Stein’s Method. A.D. Barbour and L.H.Y. Chen, eds., Lecture Notes Series 4, Institute for Mathematical Sciences, National University of Singapore, Singapore University Press and World Scientific, 1-59.
Chen, L.H.Y. and Shao, Q.M. (2007) Normal approximation for nonlinear statistics using a concentration inequality approach. Bernoulli 13(2) 581-599.
Diaconis, P. (1977) The distribution of leading digits and uniform distribution mod 1. Ann. Probab. 72-81.
Ehm, W. (1991) Binomial approximation to the Poisson binomial distribution. Statist. Probab. Lett. 11 7-16.
Ghosh, S. (2010) Lp bounds for a combinatorial central limit theorem with involutions. Preprint. Available at http://arxiv.org/abs/0905.1150.
Goldstein, L. (2005) Berry Esseen bounds for combinatorial central limit theorems and pattern occurrences, using zero and size biasing. J. Appl. Probab. 42 661-683.
Goldstein, L. (2011) A Berry-Esseen bound with applications to counts in the Erdős-Rényi random graph. Preprint. Available at http://arxiv.org/abs/1005.4390.
Goldstein, L. and Reinert, G. (1997) Stein’s method and the zero bias transformation with application to simple random sampling. Ann. Appl. Probab. 935-952.
Goldstein, L. and Rinott, Y. (1996) Multivariate normal approximation by Stein’s method and size bias couplings. J. Appl. Probab. 33 1-17.
Goldstein, L. and Xia, A. (2006) Zero biasing and a discrete central limit theorem. Ann. Probab. 34 1782-1806.
Goldstein, L. and Xia, A. (2011) Clubbed Binomial Approximation for the Lightbulb Process. Available at http://arxiv.org/abs/1111.3984.
Goldstein, L. and Zhang, H. (2011) A Berry Esseen theorem for the lightbulb process. Adv. Appl. Probab. 43 875-898.
Götze, F. (1991) On the rate of convergence in the multivariate CLT. Ann. Probab. 19 724-739.
Ho, S.T. and Chen, L.H.Y. (1978) An Lp bound for the remainder in a combinatorial central limit theorem. Ann. Probab. 231-249.
Hoeffding, W. (1951) A combinatorial central limit theorem. Annals of Mathematical Statistics 22 558-566.
Lindvall, T. (1992) Lectures on the coupling method. Wiley, New York.
Loh, W.L. (1992) Stein’s method and multinomial approximation. Ann. Appl. Probab. 536-554.
Nagaev, S.V. (1976) An estimate of the remainder term in the multidimensional central limit theorem. Proc. Third Japan-USSR Symp. Probab. Theory. Lecture Notes in Math. 550 419-438. Springer, Berlin.
Neammannee, K. and Suntornchost, J. (2006) A uniform bound on a combinatorial central limit theorem. Stoch. Anal. Appl. 559-578.
Peköz, E. (1996) Stein’s method for geometric approximation. J. Appl. Probab. 33 707-713.
Raič, M. (2003) Normal approximation with Stein’s method. Proceedings of the Seventh Young Statisticians Meeting.
Reinert, G. (1998) Couplings for normal approximations with Stein’s method. Microsurveys in Discrete Probability. D. Aldous, J. Propp, eds., Dimacs series. AMS, 193-207.
Reinert, G. and Röllin, A. (2009) Multivariate normal approximation with Stein’s method of exchangeable pairs under a general linearity condition. Ann. Probab. 37 2150-2173.
Rinott, Y. and Rotar, V. (1996) A multivariate CLT for local dependence with $n^{-1/2} \log n$ rate and applications to multivariate graph related statistics. J. Multivariate Anal. 56 333-350.
Rinott, Y. and Rotar, V. (1997) On coupling constructions and rates in the CLT for dependent summands with applications to antivoter model and weighted U-statistics. Ann. Appl. Probab. 1080-1105.
Röllin, A. (2005) Approximation of sums of conditionally independent random variables by the translated Poisson distribution. Bernoulli 11 1115-1128.
Röllin, A. (2007) Translated Poisson approximation using exchangeable pair couplings. Ann. Appl. Probab. 17 1596-1614.
Röllin, A. (2008) Symmetric Binomial approximation for sums of locally dependent random variables. Electron. J. Probab. 13 756-776.
Röllin, A. and Ross, N. (2010) A probabilistic approach to local limit theorems with applications to random graphs. Preprint. Available at http://arxiv.org/abs/1011.3100.
Sazonov, V.V. (1981) Normal approximation - some recent advances. Springer.
Senatov, V.V. (1980) Uniform estimates of the rate of convergence in the multidimensional central limit theorem. Teor. Veroyatn. Primen. 25 757-770.
Stein, C. (1972) A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. Proc. Sixth Berkeley Symp. Math. Stat. Prob. Univ. California Press, Berkeley, Calif., 583-602.
Stein, C. (1986) Approximate Computation of Expectations. Lecture Notes 7, Inst. Math. Statist., Hayward, Calif.
Wald, A. and Wolfowitz, J. (1944) Statistical tests based on permutations of the observations. Annals of Mathematical Statistics 15 358-372.
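
A minimal Python sketch, not taken from the thesis, that checks numerically two elementary facts used in the examples of Section 6.4. First, for the cyclic 2-runs statistic $S = \sum_{i=1}^{n} \zeta_i\zeta_{i+1}$ with i.i.d. Bernoulli($p$) variables, the mean $np^2$ and variance $n(p^2 + 2p^3 - 3p^4)$ of (6.43) are verified by exact enumeration. Second, for the lightbulb process with $n = 4l$, the number $X$ of bulbs that are on is always even. The function names `two_runs_moments` and `lightbulb_count`, and the parameter values, are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
import random


def two_runs_moments(n, p):
    """Exact mean and variance of S = sum_i zeta_i * zeta_{i+1} (indices cyclic).

    Enumerates all 2**n outcomes of n i.i.d. Bernoulli(p) variables, so it is
    only meant for small n. Pass p as a Fraction to keep the arithmetic exact.
    """
    mean = Fraction(0)
    second_moment = Fraction(0)
    for zeta in product((0, 1), repeat=n):
        ones = sum(zeta)
        prob = p ** ones * (1 - p) ** (n - ones)
        s = sum(zeta[i] * zeta[(i + 1) % n] for i in range(n))
        mean += prob * s
        second_moment += prob * s * s
    return mean, second_moment - mean ** 2


def lightbulb_count(n, rng):
    """Simulate one run of the lightbulb process and return X.

    On day r = 1, ..., n, exactly r distinct bulbs chosen uniformly at random
    are toggled; all bulbs start in the off state.
    """
    state = [0] * n
    for r in range(1, n + 1):
        for k in rng.sample(range(n), r):
            state[k] ^= 1  # toggle bulb k
    return sum(state)


if __name__ == "__main__":
    n, p = 8, Fraction(1, 3)
    mean, var = two_runs_moments(n, p)
    # Compare with the closed forms in (6.43).
    assert mean == n * p ** 2
    assert var == n * (p ** 2 + 2 * p ** 3 - 3 * p ** 4)

    rng = random.Random(0)
    n_bulbs = 12  # a multiple of 4, as assumed in the text
    counts = [lightbulb_count(n_bulbs, rng) for _ in range(200)]
    # The total number of toggles n(n+1)/2 is even when n = 4l, so X is even.
    assert all(x % 2 == 0 and 0 <= x <= n_bulbs for x in counts)
```

The enumeration is exponential in $n$ and is only a sanity check on the closed forms. The parity check reflects the observation that $X$ has the same parity as the total number of toggles $n(n+1)/2$, which is why $S = X/2$ is integer valued when $n = 4l$.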