COVARIANCE MATRIX ESTIMATION WITH HIGH FREQUENCY FINANCIAL DATA

LIU CHENG
(B.Sc. Wuhan University)

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF STATISTICS AND APPLIED PROBABILITY
NATIONAL UNIVERSITY OF SINGAPORE
2013

Acknowledgements

I would first like to express my deepest thanks to my supervisor, Dr. Tang Cheng Yong. He is truly a great advisor, not only in my research but also in my daily life. I would like to thank him for his guidance, encouragement, support of all kinds, time, and endless patience.

Next, I would like to thank all my seniors and classmates, especially Dr. Jiang Binyan, for discussions about my research problems. I also thank all my friends who let me know I'm not alone in the world. Special thanks to Lv Zhixin, Zhang Huaxing, Guo Xihui, Xu Qiao, Liu Yini, He Yawei and Cai Qingyun.

My deepest gratitude goes to my parents, my brother, my sister, and also my uncles. Their love and constant concern make my life.

I am deeply grateful to the university and the department for supporting me through the NUS Graduate Research Scholarship and other kinds of support. Thanks to the examiners for their precious work.

Contents

Acknowledgements
Summary
List of Tables
Chapter 1 Introduction
1.1 Diffusion Process
1.2 Estimation of the IV
1.2.1 Microstructure Noise
1.2.2 Transactions or Quotes?
1.2.3 Calendar, Transaction or Tick Time Sampling?
1.2.4 Random Sampling
1.2.5 Existing Estimators of the IV
1.3 Estimation of the ICM
1.3.1 Asynchronous Data
1.3.2 Dimensionality
1.3.3 Positive Semi-definite
1.3.4 Existing Estimators of the ICM
Chapter 2 Synchronous Data Multivariate QMLE
2.1 Introduction
2.2 Methodology
2.3 Main Results
2.3.1 Consistency and Asymptotic Normality
2.3.2 Clearer Insight of the Main Result in Dimension
2.4 Numerical Examples
2.4.1 Simulations
2.4.2 Financial Data Analysis
2.5 Discussions
Chapter 3 Asynchronous Data Scheme
3.1 Introduction
3.2 The QML Approach for Asynchronous Data
3.3 Methodology
3.3.1 The QKF Approach for Asynchronous Data
3.3.2 Estimation of the ICM for Two Special Cases
3.4 Main Results
3.4.1 The QKF and QML Approaches are the Same When Observations are Synchronous
3.4.2 Consistency of the QKF Approach
3.4.3 Comparisons between Our Approach and Existing Similar Approaches
3.5 Numerical Examples
3.5.1 Simulations
3.5.2 Financial Data Analysis
3.6 Discussions
Chapter 4 Conclusion and Future Work
4.1 Conclusion
4.2 Future Work
Bibliography

Summary

Estimating the integrated covariance matrix (ICM) from high frequency financial trading data is crucial for capturing the volatilities and covariations of trading instruments. This objective is difficult because the data are contaminated with microstructure noise, trading records are asynchronous, and the dimension of the data keeps increasing. In this dissertation, we study the estimation of the ICM of a finite dimensional diffusion process step by step.

We first develop a quasi-maximum likelihood (QML) approach for estimating the ICM with synchronous data. We explore a novel and convenient multivariate time series device for evaluating the estimator, both theoretically for its asymptotic properties and numerically for its practical implementation. We demonstrate that the QML estimator is consistent for the ICM and asymptotically normally distributed. The efficiency gain of the QML approach is quantified theoretically and demonstrated numerically via extensive simulation studies. An application of the QML approach is illustrated by analyzing a data set of high frequency financial trading.

We then extend the coverage of the QML approach to asynchronous data. We express the original stochastic model as a state space model and then apply the Kalman filter to solve the QML problem for estimating the ICM; we denote this the QKF approach. Rather than synchronizing the original data, we apply the expectation-maximization (EM) algorithm to evaluate the QKF estimator for asynchronous data. We show that the estimator of the new approach is consistent, efficient, and positive semi-definite.
Properties of the QKF approach are theoretically derived and numerically demonstrated via extensive simulation studies. We also implement the QKF approach on some high frequency financial trading data.

List of Tables

Table 2.1 Parameter Values for Simulations
Table 2.2 Bias and root mean square errors (RMSE, values in brackets) ($\times 10^{2}$) of estimators for elements of the ICM $[\hat\Sigma_{11}, \hat\Sigma_{12}, \hat\Sigma_{22}]$ with constant and stochastic spot volatilities when data are synchronous and equally spaced, with the time interval between two consecutive observations equal to $\Delta$ and the correlation between the two log-price processes equal to $\rho$.

[...]

3.6 Discussions

[...]

\[
\cdots = E\{(X_{j-1} + V_j - X_{j-1}^{j-1})(X_{j-1} - X_{j-1}^{j-1})'\} = P_{j-1}^{j-1}. \tag{3.36}
\]

Then (3.18) is proven by

\[
X_{j-1}^{n} = E(X_{j-1} \mid y^{n}) = E\{E(X_{j-1} \mid y^{j-1}, X_j - X_j^{j-1}, \gamma_j^{n}) \mid y^{n}\} = X_{j-1}^{j-1} + J_{j-1}(X_j^{n} - X_j^{j-1}).
\]

Therefore

\[
X_{j-1} - X_{j-1}^{n} = X_{j-1} - X_{j-1}^{j-1} - J_{j-1}(X_j^{n} - X_j^{j-1}),
\]

or

\[
X_{j-1} - X_{j-1}^{n} + J_{j-1}X_j^{n} = X_{j-1} - X_{j-1}^{j-1} + J_{j-1}X_{j-1}^{j-1}. \tag{3.37}
\]

Hence, by taking the expectation of each side of (3.37) multiplied by its own transpose, we have

\[
P_{j-1}^{n} + J_{j-1}E(X_j^{n}X_j^{n\prime})J_{j-1}' = P_{j-1}^{j-1} + J_{j-1}E(X_{j-1}^{j-1}X_{j-1}^{j-1\prime})J_{j-1}', \tag{3.38}
\]

where the equality holds because $E\{X_j^{n}(X_{j-1} - X_{j-1}^{n})'\} = 0$ and $E\{X_{j-1}^{j-1}(X_{j-1} - X_{j-1}^{j-1})'\} = 0$. These can be obtained by first denoting $\tilde X_j^{h} = X_j - X_j^{h}$; then, for $h \le l$, $h \le i$ and $l \le j$,

\[
E(V_j\tilde X_j^{l\prime}) = 0,
\]

\[
\begin{aligned}
E(X_i^{h}\tilde X_j^{l\prime}) &= E\{X_i^{h}(X_j - X_j^{l})'\} \\
&= E\{E(X_i^{h}X_j' \mid y^{l})\} - E(X_i^{h}X_j^{l\prime}) \\
&= E(X_i^{h}X_j^{l\prime}) - E(X_i^{h}X_j^{l\prime}) = 0,
\end{aligned} \tag{3.39}
\]

and for $h \ge l$, $h \le i$ and $l \le j$,

\[
\begin{aligned}
E(X_i^{h}\tilde X_j^{l\prime}) &= E\{X_i^{h}(X_j - X_j^{l})'\} \\
&= E\{E(X_i^{h}X_j' \mid y^{h})\} - E(X_i^{h}X_j^{l\prime}) \\
&= E(X_i^{h}X_j^{h\prime}) - E(X_i^{h}X_j^{l\prime}) = 0.
\end{aligned} \tag{3.40}
\]

On the other hand, since

\[
\begin{aligned}
E(X_j^{n}X_j^{n\prime}) &= E(X_jX_j') - P_j^{n} \\
&= E(X_{j-1}X_{j-1}') + \Sigma\Delta_j - P_j^{n} \\
&= E(X_{j-1}^{j-1}X_{j-1}^{j-1\prime}) + P_{j-1}^{j-1} + \Sigma\Delta_j - P_j^{n} \\
&= E(X_{j-1}^{j-1}X_{j-1}^{j-1\prime}) + P_j^{j-1} - P_j^{n},
\end{aligned}
\]

which uses the independence of $V_j$ and $X_{j-1}$ together with (3.14), we obtain equation (3.19) by combining (3.38) with the equation above.

The lag-one covariance smoother can also be proven by direct calculation. By (3.13) and (3.34) we have

\[
\begin{aligned}
P_{j,j-1}^{j} &= E(\tilde X_j^{j}\tilde X_{j-1}^{j\prime}) \\
&= E\big[\{\tilde X_j^{j-1} - K_j(\breve B_j\tilde X_j^{j-1} + \breve U_j)\}\{\tilde X_{j-1}^{j-1} - J_{j-1}K_j(\breve B_j\tilde X_j^{j-1} + \breve U_j)\}'\big] \\
&= P_{j,j-1}^{j-1} - K_j\breve B_jP_{j,j-1}^{j-1} - P_j^{j-1}\breve B_j'K_j'J_{j-1}' + K_j(\breve B_jP_j^{j-1}\breve B_j' + \breve A_j)K_j'J_{j-1}' \\
&= P_{j-1}^{j-1} - K_j\breve B_jP_{j-1}^{j-1} - P_j^{j-1}\breve B_j'K_j'J_{j-1}' + P_j^{j-1}\breve B_j'K_j'J_{j-1}' \\
&= (I - K_j\breve B_j)P_{j-1}^{j-1}.
\end{aligned} \tag{3.41}
\]

The fourth equality follows from (3.36) and (3.17). Therefore (3.21) is proven by setting $j = n$ in (3.41).

To prove (3.22), we reuse (3.37) to obtain

\[
(\tilde X_{j-1}^{n} + J_{j-1}X_j^{n})(\tilde X_{j-2}^{n} + J_{j-2}X_{j-1}^{n})' = (\tilde X_{j-1}^{j-1} + J_{j-1}X_{j-1}^{j-1})(\tilde X_{j-2}^{j-2} + J_{j-2}X_{j-2}^{j-2})'. \tag{3.42}
\]

On the other hand, we have

\[
E(X_j^{n}\tilde X_{j-2}^{n\prime}) = 0, \qquad E(\tilde X_{j-1}^{n}X_{j-1}^{n\prime}) = 0, \qquad E(\tilde X_{j-1}^{j-1}X_{j-2}^{j-2\prime}) = 0 \tag{3.43}
\]

by (3.39). Combining (3.18), (3.42), (3.43) and $\tilde X_{j-1}^{j-1} = (I - K_{j-1}\breve B_{j-1})\tilde X_{j-1}^{j-2} - K_{j-1}\breve U_{j-1}$, obtained from (3.34), we have

\[
\begin{aligned}
P_{j-1,j-2}^{n} &= E\{(\tilde X_{j-1}^{j-1} + J_{j-1}X_{j-1}^{j-1})(\tilde X_{j-2}^{j-2} + J_{j-2}X_{j-2}^{j-2})'\} - J_{j-1}E(X_j^{n}X_{j-1}^{n\prime})J_{j-2}' \\
&= (I - K_{j-1}\breve B_{j-1})P_{j-1,j-2}^{j-2} + J_{j-1}K_{j-1}\breve B_{j-1}P_{j-1,j-2}^{j-2} + J_{j-1}\{E(X_{j-1}^{j-1}X_{j-2}^{j-2\prime}) - E(X_j^{n}X_{j-1}^{n\prime})\}J_{j-2}' \\
&= P_{j-1}^{j-1}J_{j-2}' + J_{j-1}K_{j-1}\breve B_{j-1}P_{j-1,j-2}^{j-2} + J_{j-1}\{E(X_{j-1}^{j-1}X_{j-2}^{j-2\prime}) - E(X_j^{n}X_{j-1}^{n\prime})\}J_{j-2}',
\end{aligned}
\]

because $P_{j-1}^{j-1} = (I - K_{j-1}\breve B_{j-1})P_{j-1}^{j-2}$, $J_{j-2}' = (P_{j-1}^{j-2})^{-1}P_{j-2}^{j-2}$ and (3.36). Moreover, since

\[
\begin{aligned}
E(X_{j-1}^{j-1}X_{j-2}^{j-2\prime}) - E(X_j^{n}X_{j-1}^{n\prime})
&= E(X_{j-1}^{j-2}X_{j-2}^{j-2\prime}) - \{E(X_jX_{j-1}') - P_{j,j-1}^{n}\} \\
&= E(X_{j-1}X_{j-2}') - P_{j-1,j-2}^{j-2} - \{E(X_{j-1}X_{j-2}') + \Sigma\Delta_{j-1} - P_{j,j-1}^{n}\} \\
&= P_{j,j-1}^{n} - (P_{j-2}^{j-2} + \Sigma\Delta_{j-1})
\end{aligned}
\]

by (3.15) and (3.36), it follows from (3.14) and (3.16) that

\[
\begin{aligned}
P_{j-1,j-2}^{n} &= P_{j-1}^{j-1}J_{j-2}' + J_{j-1}\big\{P_{j,j-1}^{n} - \big(P_{j-1}^{j-2} - K_{j-1}\breve B_{j-1}P_{j-1,j-2}^{j-2}(J_{j-2}')^{-1}\big)\big\}J_{j-2}' \\
&= P_{j-1}^{j-1}J_{j-2}' + J_{j-1}(P_{j,j-1}^{n} - P_{j-1}^{j-1})J_{j-2}',
\end{aligned}
\]

which is exactly (3.22). This completes the proof of Lemma 3.1.

CHAPTER 4
Conclusion and Future Work

4.1 Conclusion

In financial studies, one of the most attractive topics is the estimation of the integrated covariance matrix (ICM) of an asset price process. This matrix plays a crucial role in risk management and in many financial applications, including constructing hedging and investment strategies and pricing stock options and other derivatives, where the asset prices are usually modeled by a stochastic process.

The difficulties of estimating the ICM are caused by many factors: for example, the trading records of an asset price process are in practice usually asynchronous and contaminated with market microstructure noise, the estimator of the ICM should be positive semi-definite, and the dimension may be large. The approaches to estimating the ICM developed in the previous literature have been discussed in Chapter 1. However, none of these approaches has all of the following desirable properties: consistency, efficiency, positive semi-definiteness, and computational efficiency for a high dimensional ICM.

In this dissertation, we study the estimation of the ICM with high frequency financial data. The high frequency data are trading records of a d-dimensional asset price process and are assumed to be asynchronous and contaminated with microstructure noise. The logarithms of the asset prices are modeled by a general continuous multivariate stochastic volatility process.
In this study, high frequency means that, in theory, the number of observations of each asset goes to infinity within a fixed time interval [0, T], where T can be one day, one month or one year. The main idea of the two approaches developed in this dissertation is to apply quasi-maximum likelihood (QML), which was first introduced for the estimation of integrated volatility in Aït-Sahalia, Mykland and Zhang (2005) and further studied in Xiu (2010), to the estimation of the ICM.

In Chapter 2, we extend the univariate QML approach to a multivariate QML approach theoretically and develop a convenient procedure to derive the QML estimator for a finite d-dimensional ICM. This procedure transforms the stochastic model of the log-returns of a d-dimensional asset price process into a d-dimensional multivariate moving average time series model, namely an MA(1) model. The QML estimator of the ICM is then obtained simply by evaluating the likelihood function for a d-dimensional multivariate normally distributed sample. The theoretical proofs and simulation results show that the QML estimator of the ICM is consistent, attains the optimal convergence rate, and is more efficient than other estimators developed in the previous literature. Moreover, the QML estimator of the ICM is positive semi-definite, because we estimate the Cholesky decomposition of the ICM instead of estimating the ICM directly.

Although the QML approach has many good theoretical properties, evaluating the likelihood function of a d-dimensional multivariate normally distributed sample is computationally problematic when d is large, since the QML estimator has no closed form. In addition, the QML approach requires the original asynchronous data to be synchronized first. We may therefore lose a large part of the information contained in the original data if the dimension of the asset price process is large. However, these two problems are solved in Chapter 3 through a new approach.
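To make the MA(1) device concrete, the following is a minimal univariate sketch, not the multivariate procedure of Chapter 2. It assumes constant volatility, i.i.d. Gaussian noise and equally spaced observations, so that the observed log-returns have variance σ²Δ + 2a² and lag-one covariance −a², and it maximizes the Gaussian quasi-likelihood over (σ², a²) with a simple zooming grid search. The function names, grid ranges and simulation settings are illustrative assumptions and are not taken from the dissertation.

```python
import numpy as np


def simulate_noisy_log_prices(n, sigma, noise_sd, T=1.0, seed=0):
    """Simulate n + 1 noisy observations of a Brownian log-price on [0, T].

    Illustrative assumptions: constant volatility, i.i.d. Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    dt = T / n
    latent = np.cumsum(sigma * np.sqrt(dt) * rng.standard_normal(n + 1))
    return latent + noise_sd * rng.standard_normal(n + 1)


def qmle_integrated_variance(y_obs, T=1.0, rounds=4, grid=41):
    """Gaussian quasi-likelihood estimate of (sigma^2 * T, a^2), univariate case."""
    r = np.diff(y_obs)                      # observed log-returns: an MA(1) series
    n = r.size
    dt = T / n
    k = np.arange(1, n + 1)
    # The MA(1) covariance matrix is tridiagonal Toeplitz, so the sine basis
    # diagonalises it, with eigenvalues sigma^2 * dt + a^2 * c_k.
    basis = np.sqrt(2.0 / (n + 1)) * np.sin(np.outer(k, k) * np.pi / (n + 1))
    z2 = (basis @ r) ** 2
    c = 2.0 * (1.0 - np.cos(k * np.pi / (n + 1)))

    def neg_loglik(log_s2, log_a2):
        # Negative Gaussian log-likelihood (up to a constant) on a 2-D grid.
        out = np.empty((log_s2.size, log_a2.size))
        for i, ls in enumerate(log_s2):
            lam = np.exp(ls) * dt + np.exp(log_a2)[:, None] * c[None, :]
            out[i] = 0.5 * np.sum(np.log(lam) + z2 / lam, axis=1)
        return out

    # Hypothetical search ranges, scaled by the realized volatility of the data.
    rv = np.sum(r ** 2)
    lo_s, hi_s = np.log(1e-3 * rv / T), np.log(10.0 * rv / T)
    lo_a, hi_a = np.log(1e-4 * rv / (2 * n)), np.log(10.0 * rv / (2 * n))
    for _ in range(rounds):
        gs = np.linspace(lo_s, hi_s, grid)
        ga = np.linspace(lo_a, hi_a, grid)
        surface = neg_loglik(gs, ga)
        i, j = np.unravel_index(np.argmin(surface), surface.shape)
        step_s, step_a = gs[1] - gs[0], ga[1] - ga[0]
        # Zoom in around the current best grid point.
        lo_s, hi_s = gs[i] - 2 * step_s, gs[i] + 2 * step_s
        lo_a, hi_a = ga[j] - 2 * step_a, ga[j] + 2 * step_a
    return float(np.exp(gs[i]) * T), float(np.exp(ga[j]))


if __name__ == "__main__":
    y = simulate_noisy_log_prices(n=2000, sigma=0.2, noise_sd=0.005, seed=1)
    iv_hat, a2_hat = qmle_integrated_variance(y)
    print("realized volatility:", float(np.sum(np.diff(y) ** 2)))
    print("QML estimate of the IV:", iv_hat, "noise variance:", a2_hat)
```

In this setting the realized volatility is dominated by the noise term 2na², whereas the quasi-likelihood estimate targets σ²T. In the multivariate case the same likelihood has no closed-form maximizer, which is the computational difficulty noted above for large d.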
In Chapter 3, instead of rewriting the original stochastic model of an asset price process as an MA(1) model, we rewrite it as a multivariate Gaussian state space model. Based on this representation, we combine the QML approach with the Kalman filter to derive a new approach (denoted the QKF approach), which uses the EM algorithm to evaluate the estimator of the ICM. We treat the original asynchronous data as synchronous data with some missing components. The techniques for handling missing data in the Kalman filter can therefore be applied to the missing components, so that all information contained in the original data is used in the estimation of the ICM. In addition, the estimator has an explicit closed form in each M-step of the EM algorithm, and hence we are able to handle the estimation of the ICM even when d is large. Our theoretical proofs and simulation results in Chapter 3 show that the QKF approach achieves the same desirable properties as the QML approach. The QKF approach is equivalent to the QML approach if the data are synchronous, and it is more efficient than the QML approach if the data are asynchronous.

4.2 Future Work

In this dissertation, we have developed two approaches to estimate the ICM of a finite dimensional asset price process with randomly recorded high frequency data in the presence of market microstructure noise. Although these approaches have several good properties, some problems in the estimation of the ICM remain open. Future work in this area includes:

1. Are there jumps in real market data? If so, how can these jumps be detected, modeled and handled? Can we derive theoretically the impact of these jumps on the estimation of the ICM? In our theoretical analysis of the QML and QKF approaches, we do not consider the case in which the volatility process and the asset log-return process have jumps. Are the two approaches robust in this case?

2. In this dissertation, we assume that the microstructure noises are serially independent across time and independent of the latent price process. Are these approaches robust to a more general assumption on the microstructure noise?

3. In this dissertation, we assume that the observation times of an asset price process are randomly spaced and independent of the values of the price process. However, this independence assumption may not hold in reality. Can we relax it?

4. In reality, there are usually hundreds or thousands of assets on an exchange. Can we estimate the ICM of all assets on an exchange? In particular, can we estimate an ICM accurately when its dimension goes to infinity?

Bibliography

[1] Andersen, T. G., Bollerslev, T., Diebold, F. X., and Labys, P. (2001), "The distribution of realised exchange rate volatility," Journal of the American Statistical Association, 96, 42-55.
[2] Aït-Sahalia, Y., Fan, J., and Xiu, D. (2010), "High Frequency Covariance Estimates with Noisy and Nonsynchronous Financial Data," Journal of the American Statistical Association, 105, 1505-1517.
[3] --- (2009), "Estimating Volatility in the Presence of Market Microstructure Noise: A Review of the Theory and Practical Considerations," Handbook of Financial Time Series, edited by Thomas Mikosch et al., Springer-Verlag.
[4] Aït-Sahalia, Y., Mykland, P. A., and Zhang, L. (2005), "How Often to Sample a Continuous-Time Process in the Presence of Market Microstructure Noise," Review of Financial Studies, 18, 351-416.
[5] --- (2011), "Ultra High Frequency Volatility Estimation with Dependent Microstructure Noise," Journal of Econometrics, 160, 160-175.
[6] Bai, J. and Shi, S. (2011), "Estimating high dimensional covariance matrices and its applications," Annals of Economics and Finance, 12, 199-215.
[7] --- (2004), "Power and Bipower Variation with Stochastic Volatility and Jumps" (with discussion), Journal of Financial Econometrics, 2, 1-48.
[8] Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., and Shephard, N. (2008), "Designing Realized Kernels to Measure the Ex-Post Variation of Equity Prices in the Presence of Noise," Econometrica, 76, 1481-1536.
[9] --- (2011), "Multivariate Realized Kernels: Consistent Positive Semi-Definite Estimators of the Covariation of Equity Prices with Noise and Non-Synchronous Trading," Journal of Econometrics, 162, 149-169.
[10] Bibinger, M. and Reiß, M. (2011), "Spectral estimation of covolatility from noisy observations using local weights," preprint, Humboldt-University Berlin.
[11] Black, F. and Scholes, M. (1973), "The pricing of options and corporate liabilities," Journal of Political Economy, 81, 637-654.
[12] Brockwell, P. J., and Davis, R. A. (1991), Time Series: Theory and Methods (2nd ed.), New York: Springer.
[13] Brown, R. (1828), "A brief account of microscopical observations made in the months of June, July and August, 1827, on the particles contained in the pollen of plants; and on the general existence of active molecules in organic and inorganic bodies," Philosophical Magazine, 4, 161-173.
[14] Butcher, J. C. (2003), Numerical Methods for Ordinary Differential Equations, New York: John Wiley & Sons.
[15] Christensen, K., Kinnebrock, S., and Podolskij, M. (2010), "Pre-Averaging Estimator of the Ex-Post Covariance Matrix in Noisy Diffusion Models with Non-Synchronous Data," Journal of Econometrics, 159, 116-133.
[16] Corsi, F., Peluso, S., and Audrino, F. (2012), "Missing asynchronicity: a Kalman-EM approach to multivariate realized covariance estimation," manuscript.
[17] Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977), "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society, Series B, 39, 1-38.
[18] Durbin, J. and Koopman, S. J. (2001), Time Series Analysis by State Space Methods, Oxford: Oxford University Press.
[19] Einstein, A. (1956), Investigations on the Theory of Brownian Movement, Dover.
[20] Engle, R. F. and Russell, J. R. (1998), "Forecasting transaction rates: the autoregressive conditional duration model," Econometrica, 66, 1127-1162.
[21] Epps, T. W. (1979), "Comovement in Stock Prices in the Very Short Run," Journal of the American Statistical Association, 74, 291-298.
[22] Fan, J., Fan, Y., and Lv, J. (2008), "High Dimensional Covariance Matrix Estimation Using a Factor Model," Journal of Econometrics, 147, 186-197.
[23] Fan, J., Lv, J., and Qi, L. (2011), "Sparse high-dimensional models in economics," Annual Review of Economics, 3, 291-317.
[24] Fan, J., Wang, M., and Yao, Q. (2008), "Modelling Multivariate Volatilities via Conditionally Uncorrelated Components," Journal of the Royal Statistical Society, Series B, 70, 679-702.
[25] Fan, J., and Wang, Y. (2007), "Multi-Scale Jump and Volatility Analysis for High-Frequency Financial Data," Journal of the American Statistical Association, 102, 1349-1362.
[26] Fan, J., and Yao, Q. (2003), Nonlinear Time Series: Nonparametric and Parametric Methods, Springer-Verlag, New York.
[27] Gloter, A., and Jacod, J. (2001), "Diffusions with measurement errors: I. Local asymptotic normality," European Series in Applied and Industrial Mathematics, 5, 225-242.
[28] Griffin, J. E., and Oomen, R. C. (2008), "Sampling Returns for Realized Variance Calculations: Tick Time or Transaction Time?" Econometric Reviews, 27, 230-253.
[29] Guillaume, D. M., Dacorogna, M. M., Dave, R. R., Müller, U. A., Olsen, R. B., and Pictet, O. V. (1997), "From the bird's eye view to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets," Finance and Stochastics, 2, 95-130.
[30] Hansen, P. R., and Lunde, A. (2006), "Realized Variance and Market Microstructure Noise" (with discussion), Journal of Business & Economic Statistics, 24, 127-218.
[31] Harville, D. A. (1997), Matrix Algebra from a Statistician's Perspective, New York: Springer.
[32] Harris, F., McInish, T., Shoesmith, G., and Wood, R. (1995), "Cointegration, error correction and price discovery on informationally-linked security markets," Journal of Financial and Quantitative Analysis, 30, 563-581.
[33] Hayashi, T., and Yoshida, N. (2005), "On Covariance Estimation of Non-Synchronously Observed Diffusion Processes," Bernoulli, 11, 359-379.
[34] Hoshikawa, T., Kanatani, T., Nagai, K., and Nishiyama, Y. (2008), "Nonparametric estimation methods of integrated multivariate volatilities," Econometric Reviews, 27, 112-138.
[35] Hull, J., and White, A. (1987), "The Pricing of Options on Assets With Stochastic Volatilities," Journal of Finance, 42, 281-300.
[36] Jacod, J., Li, Y., Mykland, P., Podolskij, M., and Vetter, M. (2009), "Microstructure Noise in the Continuous Case: The Pre-Averaging Approach," Stochastic Processes and Their Applications, 119, 2249-2276.
[37] Jacod, J., and Protter, P. (2012), Discretization of Processes, Springer-Verlag, Heidelberg.
[38] Jacod, J. and Shiryaev, A. N. (2003), Limit Theorems for Stochastic Processes (2nd ed.), New York: Springer-Verlag.
[39] Johnstone, I. M. (2001), "On the distribution of the largest eigenvalue in principal components analysis," Annals of Statistics, 29, 295-327.
[40] Koopmans, T. C. (1950), "Models involving a continuous-time variable," in T. C. Koopmans (ed.), Statistical Inference in Dynamic Economic Models, Chapter 16 and pp. 384-392, New York: Wiley.
[41] Li, Y., Mykland, P., Renault, E., Zhang, L., and Zheng, X. (2009), "Realized volatility when endogeneity of time matters," manuscript.
[42] Liu, C., and Tang, C. Y. (2012), "A Quasi-Maximum Likelihood Approach for Integrated Covariance Matrix Estimation with High Frequency Data," manuscript.
[43] Malliavin, P., and Mancino, M. (2002), "Fourier Series Method for Measurement of Multivariate Volatilities," Finance and Stochastics, 84, 668-681.
[44] --- (2009), "A Fourier Transform for Nonparametric Estimation of Multivariate Volatilities," Annals of Statistics, 37, 1983-2010.
[45] Matteson, D. S. and Tsay, R. S. (2009), "Modelling Multivariate Volatilities via Independent Components," working paper.
[46] McAleer, M., and Medeiros, M. (2008), "Realized volatility: A review," Econometric Reviews, 27, 10-45.
[47] McCullagh, P. (1987), Tensor Methods in Statistics, Chapman and Hall, London, UK.
[48] Merton, R. C. (1969), "Lifetime portfolio selection under uncertainty: The continuous time case," Review of Economics and Statistics, 51, 247-257.
[49] --- (1973), "Theory of rational option pricing," Bell Journal of Economics, 4, 141-183.
[50] --- (1990), Continuous-Time Finance, Oxford University Press, New York.
[51] Mykland, P. A. and Zhang, L. (2006), "ANOVA for diffusions and Ito processes," Annals of Statistics, 34 (4), 1931-1963.
[52] --- (2010), "The Econometrics of High Frequency Data," Statistical Methods for Stochastic Differential Equations, M. Kessler, A. Lindner, and M. Sorensen, eds., Chapman & Hall/CRC Press, forthcoming.
[53] Moran, P. A. P. (1953), "The Statistical Analysis of the Canadian Lynx Cycle," Australian Journal of Zoology, 1(3), 291-298.
[54] Müller, U. A., Dacorogna, M. M., Olsen, R. B., Pictet, O. V., Schwarz, M., and Morgenegg, C. (1990), "Statistical Study of Foreign Exchange Rates, Empirical Evidence of a Price Change Scaling Law, and Intraday Analysis," Journal of Banking and Finance, 14, 1189-1208.
[55] Nelson, D. B. (1990), "ARCH Models as Diffusion Approximations," Journal of Econometrics, 45, 7-39.
[56] Øksendal, B. (2003), Stochastic Differential Equations: An Introduction with Applications (6th ed.), Springer.
[57] Peluso, S., Corsi, F., and Mira, A. (2012), "A Bayesian High-Frequency Estimator of the Multivariate Covariance of Noisy and Asynchronous Returns," available at SSRN: http://ssrn.com/abstract=2003492 or http://dx.doi.org/10.2139/ssrn.2003492.
[58] Pinheiro, J. C. and Bates, D. M. (1996), "Unconstrained parameterizations for variance-covariance matrices," Statistics and Computing, 6, 289-296.
[59] Protter, P. E. (2004), Stochastic Integration and Differential Equations (2nd ed.), New York: Springer.
[60] Shephard, N., and Xiu, D. (2012), "Econometric Analysis of Multivariate Realised QML: Efficient Positive Semi-definite Estimators of the Covariation of Equity Prices," manuscript.
[61] Shumway, R. and Stoffer, D. (2006), Time Series Analysis and Its Applications with R Examples (3rd edition), New York: Springer.
[62] Tao, M., Wang, Y., and Chen, X. (2012), "Fast Convergence Rates in Estimating Large Volatility Matrices Using High-Frequency Financial Data," manuscript.
[63] Tao, M., Wang, Y., Yao, Q. and Zou, J. (2011), "Large Volatility Matrix Inference via Combining Low-Frequency and High-Frequency Approaches," Journal of the American Statistical Association, 106, 1025-1040.
[64] Tao, M., Wang, Y. and Zhou, H. (2013), "Optimal sparse volatility matrix estimation for high dimensional Itô processes with measurement errors," manuscript.
[65] Tong, H. and Lim, K. S. (1980), "Threshold autoregression, limit cycles and cyclical data," Journal of the Royal Statistical Society, Series B, 42, 245-292.
[66] Tong, H. (1990), Non-linear Time Series: A Dynamical System Approach, Oxford: Oxford University Press.
[67] Tsay, R. S. (2010), Analysis of Financial Time Series (3rd edition), New York: Wiley.
[68] Wang, Y., and Zou, J. (2010), "Vast Volatility Matrix Estimation for High Frequency Data," The Annals of Statistics, 38, 943-978.
[69] Wiener, N. (1921), "The average of an analytical functional and the Brownian movement," Proc. Nat. Acad. Sci. U.S.A., 7, 294-298.
[70] Xiu, D. (2010), "Quasi-Maximum Likelihood Estimation of Volatility with High Frequency Data," Journal of Econometrics, 159, 235-250.
[71] Yule, G. U. (1927), "On a Method of Investigating Periodicities in Disturbed Series With Special Reference to Wolfer's Sunspot Numbers," Philosophical Transactions of the Royal Society London, Series A, 226, 267-298.
[72] Zhang, L. (2006), "Efficient Estimation of Stochastic Volatility Using Noisy Observations: A Multi-Scale Approach," Bernoulli, 12, 1019-1043.
[73] --- (2011), "Estimating Covariation: Epps Effect and Microstructure Noise," Journal of Econometrics, 160, 33-47.
[74] Zhang, L., Mykland, P. A., and Aït-Sahalia, Y. (2005), "A Tale of Two Time Scales: Determining Integrated Volatility with Noisy High-Frequency Data," Journal of the American Statistical Association, 100, 1394-1411.
[75] Zheng, X., and Li, Y. (2011), "On the Estimation of Integrated Covariance Matrices of High Dimensional Diffusion Processes," The Annals of Statistics, 39, 3121-3151.
[76] Zhou, B. (1996), "High-Frequency Data and Volatility in Foreign-Exchange Rates," Journal of Business and Economic Statistics, 14, 45-52.
[77] Zhou, B. (1998), "Parametric and Nonparametric Volatility Measurement," in Nonlinear Modeling of High-Frequency Financial Time Series, eds. C. L. Dunis and B. Zhou, New York: Wiley, pp. 109-123.

[...]

... stochastic process, $W_t$ is a one-dimensional Brownian motion. Our target is to estimate the IV $\int_0^T \sigma_t^2\,dt$ based on the high frequency discrete observations $X_1, \ldots, X_n$, where $X_i$ is the observation at time $t_i \in [0, T]$ with $T$ fixed. High frequency here means that the sampling frequency $n$ of the data is quite large and the sampling interval $\Delta = \max_{1 \le i \le n}\{t_i - t_{i-1}\}$ is quite small, on the scale of 1 second ...
... Epps effect. Therefore, the asynchronous nature of empirical data is an obstacle to estimating the ICM.

Methods in the previous literature for handling asynchronous data can be divided into three groups: methods using part of the original data, methods using the entire original data, and methods inserting new data into the original data. The first group of methods is commonly used in existing ...

... dimensionality is a big problem for the estimation of the ICM. In addition, the eigenvalues and eigenvectors of the sample covariance matrix are far from the true values (Johnstone, 2001; Wang and Zou, 2010). Therefore, a simple realized covariance matrix is not a good estimator of the ICM. Bai and Shi (2011) give a survey of new approaches for the estimation of high dimensional covariance matrices and ...

... discussion of possible future work.

CHAPTER 2
Synchronous Data Multivariate QMLE

2.1 Introduction

In this chapter, we study the QML approach for estimating the ICM with high frequency financial trading data. Extending the QML approach to multivariate ICM estimation is difficult in both practical implementation and theoretical analysis, since a huge covariance matrix is encountered, which is very hard to ...

... distributed with mean 0 and covariance matrix $\frac{1}{n}\int_0^1 \Sigma_t\,dt$, then we also have (2.1). Motivated by this, the QML approach for the ICM based on data contaminated with microstructure noise imposes a not necessarily correct model by assuming that $Y_j$ ($j = 1, \ldots, n$) independently follow a multivariate normal distribution $N(0, \Sigma\Delta)$, where $\Sigma$ is a time-invariant covariance matrix. We make the ...

... that

\[
n \to \infty \quad \text{and} \quad \Delta = \max_{1 \le i \le n}\{t_i - t_{i-1}\} \to 0. \tag{1.3}
\]

A well-known estimator based on high frequency data, the sum of squared returns $\sum_{i=1}^{n}(X_i - X_{i-1})^2$, called the realized volatility (RV), is a consistent estimator of the IV as $\Delta \to 0$, $n \to \infty$ when the data are observed without measurement errors. Existing literature on the estimation of the IV based on the RV includes Hull and White (1987), Andersen et al. (2001), ...

... These practical and/or statistical demands motivate researchers to extend univariate stochastic process modeling to multivariate stochastic process modeling. However, extending the estimation of the IV $\int_0^T \sigma_t^2\,dt$ to the estimation of the ICM $\int_0^T \Sigma_t\,dt = \int_0^T \sigma_t\sigma_t'\,dt$ is more challenging, as the high frequency trading data of assets are usually sampled randomly and asynchronously ...

... estimators for elements of the ICM $[\Sigma_{ij}]$ ($i, j = 1, 2, 3$) for original synchronous data with equally spaced time interval $\Delta$ and asynchronous data randomly selected from the original synchronous data through Bernoulli trials with success probabilities $p_1 = 0.6$, $p_2 = 0.8$, $p_3 = 0.5$.
Table 3.6 Correlation matrix of the log-return processes of 10 assets.
Table 3.7 Ratios of root mean ... approach and the QKF approach for elements of the ICM with synchronous and equally spaced data, where the time interval $\Delta$ between two consecutive observations equals 12 s.
Table 3.8 Ratios of root mean square errors of the CQM approach and the QKF approach for elements of the ICM with irregularly spaced asynchronous data. The original data are generated by choosing time interval $\Delta =$ ...

... motivate us to continue the study of the estimation of the ICM. On the other hand, because of the good performance of the QML approach in the estimation of the IV, we first extend the QML approach to the multivariate case for synchronous data in Chapter 2. We then apply a novel method to handle asynchronous data and consider the estimation of the ICM for asynchronous data in Chapter 3. Chapter 4 includes ...
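The treatment of asynchronous data summarized in Section 4.1, where each record is viewed as a synchronous observation with missing components, can be illustrated with the filtering step alone. The sketch below assumes a random-walk latent log-price $X_j = X_{j-1} + V_j$ with $\mathrm{Var}(V_j) = \Sigma\Delta_j$, additive noise with known variances, and NaN entries marking missing components. The parameters are taken as known here, whereas the QKF approach estimates them through the EM algorithm; the function name and interface are hypothetical and not taken from the dissertation.

```python
import numpy as np


def kalman_filter_missing(y, sigma, noise_var, dt, x0, p0):
    """Kalman filter for a random-walk state observed with missing components.

    y         : (n, d) observations, np.nan where an asset has no record
    sigma     : (d, d) covariance rate of the latent increments (assumed known)
    noise_var : (d,) microstructure noise variances (assumed known)
    dt        : (n,) time elapsed before each observation time
    x0, p0    : initial state mean (d,) and covariance (d, d)
    Returns the filtered means (n, d) and covariances (n, d, d).
    """
    n, d = y.shape
    x = np.asarray(x0, dtype=float).copy()
    p = np.asarray(p0, dtype=float).copy()
    xs = np.empty((n, d))
    ps = np.empty((n, d, d))
    identity = np.eye(d)
    for j in range(n):
        # Prediction: the state mean is unchanged, uncertainty grows by sigma * dt.
        p = p + sigma * dt[j]
        observed = ~np.isnan(y[j])
        if observed.any():
            # Selection matrix keeping only the components recorded at time j.
            b = identity[observed]
            s = b @ p @ b.T + np.diag(noise_var[observed])
            k = p @ b.T @ np.linalg.inv(s)          # Kalman gain
            x = x + k @ (y[j, observed] - b @ x)
            p = (identity - k @ b) @ p
        # If nothing is observed, the prediction is carried forward unchanged.
        xs[j] = x
        ps[j] = p
    return xs, ps


if __name__ == "__main__":
    sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
    y = np.array([[1.0, np.nan], [np.nan, np.nan], [0.5, 2.0]])
    means, covs = kalman_filter_missing(
        y, sigma, np.array([0.1, 0.1]), np.ones(3), np.zeros(2), np.zeros((2, 2))
    )
    print(means)
```

Because the second asset is correlated with the first, its filtered mean is updated even at times when only the first asset is recorded. This is the sense in which no record has to be discarded to synchronize the data.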