ESTIMATION OF CENTRAL RETINAL VASCULAR EQUIVALENT: CANONICAL CORRELATION ANALYSIS

WANG LING
(Master of Public Health, University of Texas)

A THESIS SUBMITTED FOR THE DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF STATISTICS AND APPLIED PROBABILITY
NATIONAL UNIVERSITY OF SINGAPORE
2010

Acknowledgement

I would like to take this opportunity to express my deep and sincere gratitude to my supervisor, Assistant Professor Li Jialiang. I appreciate his valuable advice, guidance, endless patience, kindness and encouragement during my graduate period. I have learned many things from him, especially regarding academic research and character building.

I would sincerely like to thank Professor Wong Tien Yin from the Singapore Eye Research Institute for providing the data set and sharing his knowledge and experience of eye diseases.

I would also like to thank all my dear fellow students, Ms Jiang Qian, Ms Li Hua, Ms Luo Shan, Mr Jiang Binyan and Mr Liang Xuehua, who helped me in studying statistical theory. My special thanks go to Ms Zhao Wanting and Ms Zhang Rongli for teaching me LaTeX during my thesis writing. Sincere thanks to all my friends who helped me in one way or another in my study.

Furthermore, I would especially like to thank my husband Chen Yahua for his love, patience and support during my graduate period. I also feel deep gratitude to my dearest family for their support in my study.

Finally, my gratitude goes to the National University of Singapore for awarding me a research scholarship, and to the Department of Statistics and Applied Probability for the excellent research environment. I would like to thank all the staff of the General Office in the Department for all kinds of help.

Contents

Summary
List of Tables
List of Figures
1 Introduction
  1.1 Biological background
  1.2 Statistical background
  1.3 Aims and Organization of the Thesis
2 Methods
  2.1 The Singapore Malay Eye Study Data Review
  2.2 Multiple Linear Regression Analysis
    2.2.1 Estimation of the parameters
    2.2.2 Estimation of sample correlation
    2.2.3 Box-Cox Power Transformation
    2.2.4 Model Adequacy Checking
  2.3 Canonical Correlation Analysis (CCA)
    2.3.1 Canonical Correlations and Variates in the Population
    2.3.2 Estimation of Canonical Correlation and Variates
  2.4 Statistical Analysis
3 Results
  3.1 Data Review
    3.1.1 Baseline Characteristics
    3.1.2 Frequency of the Predictor Variables
  3.2 Sample Correlation Coefficients
  3.3 Multiple Linear Regression Analysis
    3.3.1 bmi as the response variable
    3.3.2 glucose as the response variable
    3.3.3 dbpdia as the response variable
    3.3.4 dbpsys as the response variable
    3.3.5 Conclusion of multiple linear regression analysis
  3.4 Canonical Correlation Analysis (CCA)
    3.4.1 Case I: two response variables dbpdia and dbpsys
    3.4.2 Case II: two response variables bmi and glucose
    3.4.3 Case III: three response variables dbpdia, dbpsys, and bmi
    3.4.4 Case IV: four response variables dbpdia, dbpsys, bmi and glucose
    3.4.5 Comparison of the four cases
4 Discussion
  4.1 Conclusion
  4.2 Similar Application of CCA for CRVE
  4.3 Further Improvement
References

Summary

Hypertension, obesity, and diabetes are three common health problems worldwide. Retinopathy usually refers to an ocular manifestation of systemic disease and is common in older people. There are direct and indirect associations among these four health problems. A quantitative assessment of retinal microvascular caliber may provide information on the risks of these systemic health problems.

The Singapore Malay Eye Study (SiMES) was a population-based cross-sectional study in Singapore in which 3280 participants were sampled. Diastolic blood pressure (dbpdia), systolic blood pressure (dbpsys), body mass index (BMI) and glucose were measured, as were the diameters of all retinal arterioles and all retinal venules. The purpose of this study is to use
statistical methods to quantify the central retinal arteriole equivalent (CRAE) from the diameters of all retinal arterioles, and the central retinal venule equivalent (CRVE) from the diameters of all retinal venules. Multiple linear regression analysis and canonical correlation analysis (CCA) were applied to quantify CRAE such that the Pearson correlations between CRAE and dbpdia, dbpsys, BMI and glucose, respectively, were maximized. The results showed that CCA is the more appropriate method for quantifying CRAE in this study.

List of Tables

3.1 Participants Characteristics in Singapore Malay Eye Study (N = 3280)
3.2 The Counts of the Response and Predictor variables (N = 3280)
3.3 The Pearson Correlation Coefficients between the Response and Predictor variables
3.4 The Coefficients in Multiple Linear Regression Model using log(bmi) as a response variable
3.5 The Coefficients in Multiple Linear Regression Model using 1/glucose as a response variable
3.6 The Coefficients in Multiple Linear Regression Model using 1/dbpdia as a response variable
3.7 The Coefficients in Multiple Linear Regression Model using √dbpdia as a response variable
3.8 The First Standardized Canonical Coefficients in Case I
3.9 The First Standardized Canonical Coefficients in Case II
3.10 The First Standardized Canonical Coefficients in Case III
3.11 The First Standardized Canonical Coefficients (corrected) in Case IV
3.12 The Maximum Canonical Correlations
3.13 The Pearson Correlation Coefficients between CRAE Estimated in Each Case and the Response Variables
4.1 The Pearson Correlation Coefficients between the Response and Predictor variables
4.2 The Maximum Canonical Correlations
4.3 The Pearson Correlation Coefficients between CRVE Estimated in Each Case and the Response Variables

Chapter 3: Results

Table 3.11: The First Standardized Canonical Coefficients (corrected) in Case IV

                                                   Coefficients
  Response Variable Set ($\hat{\alpha}^{(1)}$)
    dbpdia                                         -0.0019
    dbpsys                                          0.00015
    bmi                                            -0.00031
    glucose                                         0.0013
  Predictor Variable Set ($\hat{\gamma}^{(1)}$)
    a1                                              0.00048
    a2                                              0.00058
    a3                                              0.00041
    a4                                              0.00015
    a5                                              0.00033
    a6                                              0.00030

3.4.5 Comparison of the four cases

The maximum canonical correlations in the four cases are listed in Table 3.12. The maximum correlation increased slightly (from 0.2100 to 0.2169) as terms were added to the response set while the predictor set was held fixed.

The CRAE was estimated by the first canonical variate, that is, the sum of the first canonical coefficients times the corresponding predictor variables (CRAE $= \hat{\gamma}^{(1)} X^{*}_{i}$). The Pearson correlation coefficients between the CRAE estimated in each of the four cases and each variable of the response set are listed in Table 3.13. In Cases I, II, III and the corrected Case IV, the correlation coefficients between the estimated CRAE and dbpdia, dbpsys and bmi were negative, while those between CRAE and glucose were positive. In Cases I, III and IV, the correlation coefficients between CRAE and dbpdia or dbpsys were similar, with those in Case III slightly larger in magnitude. The CRAE estimated in Case II had the largest absolute correlation with bmi (-0.0341) and with glucose (0.0669) among the four models.

Table 3.12: The Maximum Canonical Correlations

  Case                                   Maximum correlation
  I   (dbpdia, dbpsys)                   0.2100
  II  (bmi, glucose)                     0.0787
  III (dbpdia, dbpsys, bmi)              0.2103
  IV  (dbpdia, dbpsys, bmi, glucose)     0.2169

Note: In the first column, each row gives the variables used in one set in Canonical Correlation Analysis (CCA). The other set of variables in CCA is the same for each row, namely a1, a2, ..., a6.

Table 3.13: The Pearson Correlation Coefficients between CRAE Estimated in Each Case and the Response Variables

  Case                                   dbpdia    dbpsys    bmi       glucose
  I   (dbpdia, dbpsys)                   -0.2066   -0.1044   -0.0284   0.0497
  II  (bmi, glucose)                     -0.1545   -0.0639   -0.0341   0.0669
  III (dbpdia, dbpsys, bmi)              -0.2066   -0.1047   -0.0291   0.0497
  IV  (dbpdia, dbpsys, bmi, glucose)     -0.2060   -0.1029   -0.0291   0.0523

Note: CRAE $= \hat{\gamma}^{(1)} X^{*}_{i}$. The $\hat{\gamma}^{(1)}$ values can be found in Tables 3.8, 3.9, 3.10 and 3.11. $X^{*}_{i} = (a1, a2, a3, a4, a5, a6)'$.

Chapter 4: Discussion

4.1 Conclusion

In this study, three types of correlations were estimated. The individual Pearson correlation coefficients described the correlation between two variables. The correlations between the response variables and the predictor variables were very weak (absolute values below 0.15). The results showed that blood pressure was negatively correlated with all six retinal arteriole diameters, BMI was negatively correlated with most of the retinal arteriole diameters, and glucose was positively correlated with all six retinal arteriole diameters.

Multiple linear regression analysis was used to describe the relationship between one response variable and a set of predictor variables. The Pearson correlation coefficients were then estimated between the response variable and the fitted value (considered as CRAE). The initial results showed that this method had some weaknesses for the purpose of this study. First, the method used only one response variable when building the model. Secondly, the correlation between the response variable and its estimates was always positive. Finally, model building and model adequacy checking were relatively trivial. Multiple linear regression analysis was therefore considered not appropriate for the purpose of this study.

Canonical correlation analysis was therefore used to study the relationship between two sets of variables. This method proved more appropriate for the purpose of this study. First of all, it used two or more response variables simultaneously. Secondly, the directions of the relationships between the CRAE and the response variables agreed with those of the individual Pearson correlation coefficients. Thirdly,
the maximum canonical correlation increased as terms were added to the response variable set. Finally, for the four known response variables, Case III (including the variables dbpdia, dbpsys and BMI) presented relatively large individual correlations between the CRAE and all four response variables, and is therefore considered an optimal model for the estimation of CRAE. In this case CRAE was negatively related to blood pressure (r = -0.2066 with dbpdia and r = -0.1047 with dbpsys) and to BMI (r = -0.0291), but positively related to glucose (r = 0.0497). These conclusions were checked in the following section by a similar application of CCA to CRVE.

4.2 Similar Application of CCA for CRVE

The response variable set included the variables dbpdia, dbpsys, BMI and glucose. The predictor variable set included the variables v1, v2, v3, v4, v5 and v6, each with a response rate of 88.14% or more (data not shown).

The individual Pearson correlation coefficients are presented in Table 4.1. The correlations between the response variables and the predictor variables were very weak (at most 0.0589 in absolute value). The results also showed that blood pressure (dbpdia and dbpsys) was negatively related to most of the retinal venule diameters, while BMI and glucose were positively related to most of the retinal venule diameters.

Table 4.1: The Pearson Correlation Coefficients between the Response and Predictor variables

            v1        v2        v3        v4        v5        v6
  dbpdia    -0.0061   -0.0071   -0.0234   -0.0589   0.0202    -0.0097
  dbpsys    -0.0104   -0.0100   -0.0252   -0.0341   0.0129    -0.0237
  bmi       0.0492    0.0173    -0.0009   -0.0086   0.0293    0.0146
  glucose   0.0324    0.0275    0.0233    0.0321    0.0302    -0.0091

The first (maximum) canonical correlation coefficients are shown in Table 4.2. As in the CRAE estimation, the value increased when a term was added to the response variable set while the predictor variable set was held constant.

Table 4.2: The Maximum Canonical Correlations

  Case                                   Maximum correlation
  I   (dbpdia, dbpsys)                   0.067
  II  (bmi, glucose)                     0.084
  III (dbpdia, dbpsys, bmi)              0.0704
  IV  (dbpdia, dbpsys, bmi, glucose)     0.0981

Note: In the first column, each row gives the variables used in one set in Canonical Correlation Analysis (CCA). The other set of variables in CCA is the same for each row, namely v1, v2, ..., v6.

The CRVE was estimated by the first canonical variate, that is, the sum of the first canonical coefficients times the corresponding predictor variables (CRVE $= \hat{\gamma}^{(1)} X^{*}_{i}$). The Pearson correlation coefficients between CRVE and each variable of the response set are listed in Table 4.3. The results showed that using dbpdia and dbpsys as the response set (Case I) gave the largest absolute correlations between CRVE and dbpdia or dbpsys compared with the other cases. Similarly, using BMI and glucose as the response set (Case II) gave the largest correlation between CRVE and BMI (0.0510) and a correlation with glucose (0.0673) close to the largest one (0.0684, in Case IV). Thus, the results for the estimation of CRVE supported the conclusion that CCA is the more appropriate statistical method for estimating the CRAE or CRVE.

Table 4.3: The Pearson Correlation Coefficients between CRVE Estimated in Each Case and the Response Variables

  Case                                   dbpdia    dbpsys    bmi       glucose
  I   (dbpdia, dbpsys)                   -0.0669   -0.0486   0.0003    0.0447
  II  (bmi, glucose)                     -0.0281   -0.0277   0.0510    0.0673
  III (dbpdia, dbpsys, bmi)              -0.0513   -0.0460   -0.0381   0.0597
  IV  (dbpdia, dbpsys, bmi, glucose)     -0.0494   -0.0422   0.0375    0.0684

Note: CRVE $= \hat{\gamma}^{(1)} X^{*}_{i}$, where $\hat{\gamma}^{(1)}$ are the canonical coefficients for the predictor variable set produced by CCA. Of the four cases, Cases I, II and III used corrected coefficients, which means that the sign of every coefficient was changed. $X^{*}_{i} = (v1, v2, v3, v4, v5, v6)'$.

4.3 Further Improvement

Canonical correlation analysis assumes that all variables studied follow normal distributions. The histograms of all variables in Figure 4.1 showed that some of them (such as glucose and a3) clearly do not follow normal distributions. One possible remedy is to apply a Box-Cox transformation to make the data approximately normal and then use
the transformed data in the analysis. The estimate of CRAE is a linear combination of the predictor variables (a1 - a6) and can be viewed as a gradient, but a linear combination of the transformed variables is hard to interpret as a representative of retinal arteriole diameters.

Figure 4.1: Histograms of the Response and Predictor Variables (panels: dbpdia, dbpsys, BMI, glucose, a1, a2, a3, a4, a5, a6; vertical axis: Frequency)

In the analysis, the predictor variables a1 - a6 were selected from all 14 measurements of retinal arteriole calibers based on the response rate (see Table 3.2). The response rates of a7 - a10 are lower than 62%, but the counts for these variables are large (from 2022 to 103) because of the large sample size. Other statistical methods should therefore be explored to estimate CRAE (or CRVE) that can make use of all available measurements of retinal vessel calibers, so that the estimates are more representative.

In the canonical correlation analysis, only the main effects of the predictor variables were analyzed in quantifying the central retinal vascular equivalent. In further study, higher-order terms of the predictor variables could be added to the set to improve the correlation between the two sets of variables. In addition, other variables such as age and gender might affect the results in quantifying the CRAE, so these variables should be evaluated in future study.
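
The estimation steps described in Sections 3.4.5 and 4.2 (take the first canonical variate of the predictor set as the vessel equivalent, then correlate it with each response variable) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the thesis does not state which software or algorithm was used, the function name `first_canonical_pair` is hypothetical, the data are synthetic stand-ins rather than SiMES measurements, and the coefficients are scaled so that each canonical variate has unit sample variance. The computation uses Cholesky whitening followed by a singular value decomposition, which is one standard way to obtain canonical correlations.

```python
import numpy as np


def first_canonical_pair(X, Y):
    """Return (rho, gamma, alpha) for predictor set X and response set Y.

    rho is the first (maximum) canonical correlation; gamma and alpha are
    the coefficient vectors of the first pair of canonical variates,
    scaled so that each variate has unit sample variance.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)

    # Whitening: Sxx = Lx Lx', Syy = Ly Ly'
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    K = np.linalg.solve(Lx, Sxy)        # Lx^{-1} Sxy
    K = np.linalg.solve(Ly, K.T).T      # Lx^{-1} Sxy Ly^{-T}

    # Singular values of K are the canonical correlations
    U, s, Vt = np.linalg.svd(K)
    gamma = np.linalg.solve(Lx.T, U[:, 0])
    alpha = np.linalg.solve(Ly.T, Vt[0, :])
    return s[0], gamma, alpha


# Synthetic stand-in data (NOT the SiMES data): six "diameters" and
# four "responses" that share a weak common factor z.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
X = 100 + 15 * rng.normal(size=(n, 6)) + 5 * z[:, None]
Y = rng.normal(size=(n, 4))
Y[:, 0] -= 0.3 * z
Y[:, 1] -= 0.2 * z
Y[:, 3] += 0.1 * z

rho, gamma, alpha = first_canonical_pair(X, Y)

# First canonical variate of the predictor set, playing the role of CRAE
score = (X - X.mean(axis=0)) @ gamma
# Matching canonical variate of the response set
resp_variate = (Y - Y.mean(axis=0)) @ alpha
# Pearson correlation between the score and each response variable
r_each = np.array([np.corrcoef(score, Y[:, j])[0, 1] for j in range(Y.shape[1])])

print("maximum canonical correlation:", rho)
print("correlation of score with each response:", r_each)
```

The sign of a canonical coefficient vector is arbitrary: multiplying both `gamma` and `alpha` by -1 leaves `rho` unchanged and flips the sign of every entry of `r_each`. This appears to be what the table notes call "corrected" coefficients. Because each single response variable is itself a linear combination of the response set, no entry of `r_each` can exceed `rho` in absolute value, which matches the pattern in Tables 3.12 and 3.13.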