Machine learning in medicine

Machine Learning in Medicine

by TON J. CLEOPHAS, MD, PhD, Professor; Past-President, American College of Angiology; Co-Chair, Module Statistics Applied to Clinical Trials, European Interuniversity College of Pharmaceutical Medicine, Lyon, France; Department of Medicine, Albert Schweitzer Hospital, Dordrecht, Netherlands

and AEILKO H. ZWINDERMAN, MathD, PhD, Professor; President, International Society of Biostatistics; Co-Chair, Module Statistics Applied to Clinical Trials, European Interuniversity College of Pharmaceutical Medicine, Lyon, France; Department of Biostatistics and Epidemiology, Academic Medical Center, Amsterdam, Netherlands

With the help of EUGENE P. CLEOPHAS, MSc, BEng, and HENNY I. CLEOPHAS-ALLERS.

Please note that additional material for this book can be downloaded from http://extras.springer.com

ISBN 978-94-007-5823-0
ISBN 978-94-007-5824-7 (eBook)
DOI 10.1007/978-94-007-5824-7
Springer Dordrecht Heidelberg New York London
Library of Congress Control Number: 2012954054

© Springer Science+Business Media Dordrecht 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher's location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper.

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

Machine learning is a novel discipline concerned with the analysis of large and multiple variables data. It involves computationally intensive methods, like factor analysis, cluster analysis, and discriminant analysis. It is currently mainly the domain of computer scientists, and is already commonly used in social sciences, marketing research, operational research and applied sciences. It is virtually unused in clinical research. This is probably due to the traditional belief of clinicians in clinical trials, where multiple variables are equally balanced by the randomization process and are not further taken into account. In contrast, modern computer data files often involve hundreds of variables, like genes and other laboratory values, and computationally intensive methods are required. This book was written as a hand-hold presentation accessible to clinicians, and as a
must-read publication for those new to the methods. Some 20 years ago, serious statistical analyses were conducted by specialist statisticians. Nowadays, professionals without a mathematical background have ready access to statistical computing using personal computers or laptops. At this time we witness a second revolution in data-analysis. Computationally intensive methods have been made available in user-friendly statistical software like SPSS (cluster and discriminant analysis since 2000, partial correlations analysis since 2005, neural networks algorithms since 2006). Large and multiple variables data, although so far mainly the domain of computer scientists, are increasingly accessible to professionals without a mathematical background. It is the authors' experience, as master class professors, that students are eager to master adequate command of statistical software. For their benefit, most of the chapters include all of the steps of the novel methods, from logging in to the final results, using SPSS statistical software. Also for their benefit, SPSS data files of the examples used in the various chapters are available at extras.springer.com. The authors have made special efforts for all chapters to have their own introduction, discussion, and references sections. They can, therefore, be studied separately and without the need to read the previous chapters first. In addition to the analysis steps of the novel methods, explained from data examples, background information and information on the clinical relevance of the novel methods is given, and this is done in an explanatory rather than mathematical manner.

We should add that the authors are well qualified in their field. Professor Zwinderman is president of the International Society of Biostatistics (2012–2015), and Professor Cleophas is past-president of the American College of Angiology (2000–2012). From their expertise, they should be able to make adequate selections of modern methods for
clinical data analysis for the benefit of physicians, students, and investigators. The authors have been working and publishing together for over 10 years, and their research of statistical methodology can be characterized as a continued effort to demonstrate that statistics is not mathematics but rather a discipline at the interface of biology and mathematics. The authors are not aware of any other work published so far that is comparable with the current work, and, therefore, believe that it does fill a need.

Contents

Introduction to Machine Learning
  Summary
    1.1 Background
    1.2 Objective and Methods
    1.3 Results and Conclusions
  Introduction
  Machine Learning Terminology
    3.1 Artificial Intelligence
    3.2 Bootstraps
    3.3 Canonical Regression
    3.4 Components
    3.5 Cronbach's alpha
    3.6 Cross-Validation
    3.7 Data Dimension Reduction
    3.8 Data Mining
    3.9 Discretization
    3.10 Discriminant Analysis
    3.11 Eigenvectors
    3.12 Elastic Net Regression
    3.13 Factor Analysis
    3.14 Factor Analysis Theory
    3.15 Factor Loadings
    3.16 Fuzzy Memberships
    3.17 Fuzzy Modeling
    3.18 Fuzzy Plots
    3.19 Generalization
    3.20 Hierarchical Cluster Analysis
    3.21 Internal Consistency Between the Original Variables Contributing to a Factor in Factor Analysis
    3.22 Iterations
    3.23 Lasso Regression
    3.24 Latent Factors
    3.25 Learning
    3.26 Learning Sample
    3.27 Linguistic Membership Names
    3.28 Linguistic Rules
    3.29 Logistic Regression
    3.30 Machine Learning
    3.31 Monte Carlo Methods
    3.32 Multicollinearity or Collinearity
    3.33 Multidimensional Modeling
    3.34 Multilayer Perceptron Model
    3.35 Multivariate Machine Learning Methods
    3.36 Multivariate Method
    3.37 Network
    3.38 Neural Network
    3.39 Optimal Scaling
    3.40 Overdispersion, Otherwise Called Overfitting
    3.41 Partial Correlation Analysis
    3.42 Partial Least Squares
    3.43 Pearson's Correlation Coefficient (R)
    3.44 Principal Components Analysis
    3.45 Radial Basis Functions
    3.46 Radial Basis Function Network
    3.47 Regularization
    3.48 Ridge Regression
    3.49 Splines
    3.50 Supervised Learning
    3.51 Training Data
    3.52 Triangular Fuzzy Sets
    3.53 Universal Space
    3.54 Unsupervised Learning
    3.55 Varimax Rotation
    3.56 Weights
  Discussion
  Conclusions
  Reference

Logistic Regression for Health Profiling
  Summary
    1.1 Background
    1.2 Methods and Results
    1.3 Conclusions
  Introduction
  Real Data Example
  Discussion
  Conclusions
  References

Optimal Scaling: Discretization
  Summary
    1.1 Background
    1.2 Objective and Methods
    1.3 Results
    1.4 Conclusions
  Introduction
  Some Terminology
    3.1 Cross-Validation
    3.2 Discretization
    3.3 Elastic Net Regression
    3.4 Lasso Regression
    3.5 Overdispersion, Otherwise Called Overfitting
    3.6 Monte Carlo Methods
    3.7 Regularization
    3.8 Ridge Regression
    3.9 Splines
  Some Theory
  Example
  Discussion
  Conclusion
  Appendix: Datafile of 250 Subjects Used as Example
  References

Optimal Scaling: Regularization Including Ridge, Lasso, and Elastic Net Regression
  Summary
    1.1 Background
    1.2 Objective
    1.3 Methods
    1.4 Results
    1.5 Conclusions
  Introduction
  Some Terminology
    3.1 Discretization
    3.2 Splines
    3.3 Overdispersion, Otherwise Called Overfitting
    3.4 Regularization
    3.5 Ridge Regression
    3.6 Monte Carlo Methods
    3.7 Cross-Validation
    3.8 Lasso Regression
    3.9 Elastic Net Regression
  Some Theory
  Example
  Discussion
  Conclusions
  Appendix: Datafile of 250 Subjects Used as Example
  References

Partial Correlations
  Summary
    1.1 Background
    1.2 Objective
    1.3 Methods
    1.4 Results
    1.5 Conclusions
  Introduction
  Some Theory
  Case-Study Analysis
  Discussion
  Conclusions
  References

Mixed Linear Models
  Summary
    1.1 Background
    1.2 Objective
    1.3 Methods and Results
    1.4 Conclusions
  Introduction
  A Placebo-Controlled Parallel Group Study of
Cholesterol Treatment
  A Three Treatment Crossover Study of the Effect of Sleeping Pills on Hours of Sleep
  Discussion
  Conclusion
  References

Binary Partitioning
  Summary
    1.1 Background
    1.2 Objective
    1.3 Methods and Results
    1.4 Conclusions
  Introduction
  Example
  ROC (Receiver Operating Characteristic) Method for Finding the Best Cut-off Level

19 Fuzzy Modeling

Second Example, Time-Response Effect of Propranolol on Peripheral Arterial Flow

Fig. 19.3 Pharmacodynamic relationship between the time after oral administration of 120 mg of propranolol (x-axis, hours) and absolute change in forearm flow (y-axis, ml/100 ml tissue/min). The un-modeled curve (upper curve) fits the data slightly less well than does the modeled curve (lower curve), with r-square values of 0.977 (F-value = 168) and 0.990 (F-value = 416), respectively.

Fig. 19.4 Fuzzy plots summarizing the fuzzy memberships of the input values (upper graph: membership grade against the input variable, with memberships null, zero, small, medium, big, and superbig) and of the output values (lower graph: membership grade against the output variable, with memberships zero, small, medium, big, and superbig) from the propranolol data (Table 19.2 and Fig. 19.3).

For the other input memberships, similar linguistic rules are determined:

Input-small → output-zero
Input-medium → output-zero
Input-big → output-small
Input-superbig → output-superbig

We are, particularly, interested in the modeling capacity of fuzzy logic in order to improve the precision of pharmacodynamic modeling. The modeled output values are found as follows. A value that is a 100% member of input-null is, according to the above linguistic rules, also associated with a 100% membership of output-superbig, corresponding with a value of 20. A value that is a 50% member of input-null and a 50% member of input-zero is 50% associated with output-superbig and output-small, corresponding with a value of 50% × (8 + 20)
= 14. For all of the input values, modeled output values can be found in this way; Table 19.2, right column, shows the results. When performing a quadratic regression on the fuzzy-modeled outcome data, similar to that shown above, the fuzzy-modeled output values provided a better fit than did the un-modeled output values (Fig. 19.3, upper and lower curves), with r-square values of 0.990 (F-value = 416) and 0.977 (F-value = 168).

Discussion

Biological processes are full of variations. Statistical analyses do not provide certainties, only chances, particularly the chances that prior hypotheses are true or untrue. Fuzzy statistics is different from conventional statistical methods, because it does not assess the chance of entire truths, but, rather, the presence of partial truths. The advantage of fuzzy statistics compared to conventional statistics is that it can answer questions to which the answers are "yes" and "no" at different times, or partly "yes" and "no" at the same time. Additional advantages are that it can be used to match any set of input and output data, including incomplete and imprecise data, and nonlinear functions of unknown complexity, as sometimes observed with pharmacodynamic data. The current paper suggests, indeed, that fuzzy logic may better fit and, thus, better predict pharmacodynamic data than conventional methods. We have only shown the simplest method of fuzzy modeling, with a single input and a single output variable. Just like with multiple regression, multiple input variables are possible, and are capable of adequately modeling complex chemical and engineering processes [12]. To date, complex fuzzy models are rarely applied in medicine, but one recent clinical study successfully used ten input variables, including sex, age, smoking, and clinical grade, to predict tumor relapse time [4]. The problem is that such calculations soon get very complex and cannot be carried out on a pocket calculator, like in our examples. Statistical software is required. Fuzzy logic is
not yet widely available in statistical software programs, and it is not in SPSS [11] or SAS [13], but several user-friendly programs exist [14–16].

We conclude that fuzzy logic is different from conventional statistical methods, because it does not assess the chance of entire truths but rather the presence of partial truths. The advantage of fuzzy statistics compared to conventional statistics is that it can answer questions to which the answers are "yes" and "no" at different times, or partly "yes" and "no" at the same time. Additional advantages are that it can be used to match any set of input and output data, including incomplete and imprecise data, and nonlinear functions of unknown complexity, as sometimes observed with pharmacodynamic data. Fuzzy modeling may fit and predict quantal dose-response and time-response data better than conventional statistical methods do. We hope that the examples given will stimulate researchers analyzing pharmacodynamic data to apply fuzzy methodologies more often.

Conclusions

Fuzzy logic can handle questions to which the answers may be "yes" at one time and "no" at the other, or may be partially true and untrue. Pharmacodynamic data deal with questions like "does a patient respond to a particular drug dose or not", or "does a drug cause the same effects at the same time in the same subject or not". Such questions are typically of a fuzzy nature, and might, therefore, benefit from an analysis based on fuzzy logic. This chapter assesses whether fuzzy logic can improve the precision of predictive models for pharmacodynamic data. The quantal pharmacodynamic effects of different induction dosages of thiopental on numbers of responding subjects were used as the first example. Regression analysis of the fuzzy-modeled outcome data on the input data provided a much better fit than did the un-modeled output values, with r-square values of 0.852 (F-value = 40.34) and 0.555 (F-value = 8.74), respectively. The time-response effect of propranolol on
peripheral arterial flow was used as a second example. Regression analysis of the fuzzy-modeled outcome data on the input data provided a better fit than did the un-modeled output values, with r-square values of 0.990 (F-value = 416) and 0.977 (F-value = 168), respectively. We conclude that fuzzy modeling may fit and predict pharmacodynamic data, like, for example, quantal dose-response and time-response data, better than conventional statistical methods do. This may be relevant to future pharmacodynamic research.

References

1. Zadeh LA (1965) Fuzzy sets. Inf Control 8:338–353
2. Hirota K (1993) Subway control. In: Hirota K (ed) Industrial applications of fuzzy technology. Springer, Tokyo, pp 263–269
3. Fournier C, Castelain B, Coche-Dequeant B, Rousseau J (2003) MRI definition of target volumes using fuzzy logic method for three-dimensional conformal radiation therapy. Int J Rad Oncol 55:225–233
4. Catto JW, Linkens DA, Abbod MF, Chen M, Burton JL, Feeley KM, Hamdy FC (2003) Artificial intelligence in predicting bladder cancer outcome: a comparison of neuro-fuzzy modeling and artificial neural networks. Clin Cancer Res 9:4172–4177
5. Bates JH, Young MP (2003) Applying fuzzy logic to medical decision making in the intensive care unit. Am J Respir Crit Care Med 167:948–952
6. Caudrelier JM, Vial S, Gibon D, Kulik C, Swedko P, Boxwala A (2004) An authoring tool for fuzzy logic based decision support systems. Medinfo 9:1874–1879
7. Naranjo CA, Bremner KE, Bazoon M, Turksen IB (1997) Using fuzzy logic to predict response to citalopram in alcohol dependence. Clin Pharmacol Ther 62:209–224
8. Helgason CM, Jobe TH (2005) Fuzzy logic and continuous cellular automata in warfarin dosing of stroke patients. Curr Treat Options Cardiovasc Med 7:211–218
9. Helgason CM (2004) The application of fuzzy logic to the prescription of antithrombotic agents in the elderly. Drugs Aging 21:731–736
10. Russo M, Santagati NA (1998) Medicinal chemistry and fuzzy logic. Inf Sci 105:299–314
11. SPSS Statistical Software (2010) www.SPSS.com. Accessed 22
Oct 2010
12. Ross J (2004) Fuzzy logic with engineering applications, 2nd edn. Wiley, Chichester
13. SAS Statistical Software (2010) www.SAS.com. Accessed 22 Oct 2010
14. FuzzyLogic (2010) fuzzylogic.sourceforge.net. Accessed 22 Oct 2010
15. Fuzzy logic: flexible environment for exploring fuzzy systems (2010) www.wolfram.com. Accessed 22 Oct 2010
16. Defuzzification methods. www.mathworks.com. Accessed 22 Oct 2010

Chapter 20 Conclusions

Introduction

The current book is an introduction to the wonderful methods that statistical software offers in order to analyze large and complex data. A nice thing about the novel methodologies is that, unlike traditional methods like analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA), they can not only handle large data files with numerous exposure and outcome variables, but can also do so in a relatively unbiased way.

Limitations of Machine Learning

It is time that we addressed some limitations of the novel methods. First, if you statistically test large data, you will almost certainly find some statistically significant effects. They may be statistically significant, but they are often very small and, therefore, clinically irrelevant. Second, multiple variables require multiple testing, and raise the risk of significant effects due to chance, rather than to a clinical mechanism. Third, testing without a prior hypothesis raises the chance of type I errors, that is, of finding an effect which does not exist. Fourth, large samples are at risk of being overpowered: less power would be more adequate to demonstrate a clinical effect of a desired magnitude.

Serendipities and Machine Learning

We should add that, in clinical research to date, important novel discoveries are generally not serendipities, otherwise called sensational and unexpected novelties [1]. The princes of
Serendip went on random journeys, and always came home with sensational novelties. In clinical research, serendipities have been rare so far. Maybe the medical use of penicillin and nitroglycerine were serendipities, but inventions in clinical research have mostly been the result of hard work and prospective testing of prior hypotheses. Also, in the past, medical conclusions based on uncontrolled observations frequently appeared subsequently to be wrong. Indeed, the findings from randomized controlled trials may be more accurate and reliable than those from uncontrolled data files. But are uncontrolled data files completely meaningless? First, using data mining and other machine learning methods to establish a priori hypothesized health risks may be hard, but it is not impossible. Second, today computer files from clinical data are very large and may include hundreds of variables. Testing such data without a prior hypothesis is not essentially different from data dredging/fishing, and your finding may be true in less than 10% of the cases, but 10% is better than 0%.

Machine Learning in Other Disciplines and in Medicine

Machine learning has been highly efficacious in geoscience, marketing research, anthropology and other disciplines [2]. In medicine it is little used so far. Some examples are:

Bio-informatics and genetics research (DNA sequencing for disease susceptibilities) [3]
Data mining of clinical trials as an alternative to laborious meta-analyses [4]
Adverse drug surveillance [5]

The lack of use in medicine is probably a matter of time, now that many machine learning methods are available in SPSS and other statistical software.

Conclusions

We hope that the book is helpful to clinicians, medical students, and clinical investigators to improve their expertise in the field of machine learning, and that it facilitates the analysis of the large and complex data files widely available in the electronic health records of modern health facilities.

References

1. Clancy C, Simpson L (2002) Looking forward
to impact: moving beyond serendipities. Health Serv Res 37:14–23
2. Anonymous (2012) Data mining. Wikipedia. http://en.wikipedia.org/wiki/Data_mining. Accessed 30 Aug 2012
3. Zhu X, Davidson I (2007) Knowledge discovery and data mining. Hershey, New York, pp 163–189
4. Zhu X, Davidson I (2007) Knowledge discovery and data mining. Hershey, New York, pp 31–48
5. Bate A, Lindquist M, Edwards I, Olsson S, Orre R, Lansner A, De Freitas R (1998) A Bayesian neural network method for adverse drug reaction signal generation. Eur J Clin Pharmacol 54:315–321

Index

A
Abrahamowicz, M., 100
Akaike information criterion (AIC), 141
Analysis of covariance (ANCOVA), 2–4. See also Canonical regression
Analysis of variance (ANOVA) model, 2–4, 255
  canonical regression (see Canonical regression)
  mixed linear models, 66
Artificial intelligence
  definition, 146
  history, 146–147
  multilayer perceptron modeling (see Back propagation (BP) artificial neural networks)
  radial basis function network (see Radial basis function network)

B
Back propagation (BP) artificial neural networks
  activity/inactivity phases, 147
  Gaussian distributions, 145, 148
  Haycock equation, 145, 152
  input, hidden and output layer, 147
  iteration/bootstrapping, 148
  linear regression analysis, 145
  ninety persons' physical measurements and body surfaces, 148–151
  non-Gaussian method, 153
  variables nonlinear relationship, 148, 151
  weights matrices, 148
Bastien, T., 206
Binary logistic regression, 19, 173–175
Binary partitioning
  best cut-off level
    entropy method, 82–84
    ROC method, 81–83
  CART, 80
  data-based decision cut-off levels, 80
  decision trees, 85
  peripheral vascular disease, 80
  representative historical data classifications, 80
  splitting procedure, 81
Binomial distribution, 83
Bootstraps
BP artificial neural networks. See Back propagation (BP) artificial neural networks
Breiman, L., 80

C
Canonical regression, 2
  ANOVA/ANCOVA
    add-up sums, 226
    composite variables, 226
    multiple linear regression, 226
  disadvantages, 233
  drug efficacy scores, 227
  elastic net and lasso method, 232
  latent variables, 233
  linear regression, 233
  manifest variables, 233
  MANOVA/MANCOVA
    add-up sums, 226
    canonical weights, 231, 232
    collinearity, 228
    composite variables, 226
    correlation coefficient, 228, 231
    correlation matrix, covariates, 228–230
    multiple linear regression, 226
    multivariate tests, 228, 231
    Pillai's statistic and normal distributions, 227
  microarray gene expression levels, 227
  patients data-file, example, 234–239
Centroid clusters, 192
Chi-square goodness of fit, 88
Classification and regression trees (CART), 80
Collinearity
  canonical regression, 228
  discriminant analysis, 222
  factor analysis, 169
  partial correlations, 57
Components, 4, 172–174
Cox regression
  with segmented time-dependent predictor
    blood pressure study and survival, 108
    cardiovascular event occurrence, 106–108
    logical expression, 106
  with time-dependent predictor
    disproportional hazard, 103
    elevated LDL-cholesterol, 105, 106
    LDL cholesterol level, 103
    multiple Cox regression, 108–109
    non-proportional hazards, 103
    VAR0006, 105, 106
    variables, 103–105
  without time-dependent predictor
    covariates, 103
    exponential models, 101
    hazard ratio, 102
    Kaplan Meier curves, 101, 102
    risks of dying ratio, 103
C-reactive protein (CRP), 114, 115
Cronbach's alphas, 4–5
  factor analysis, 169, 172, 173
  principal components analysis, 201
Cross-validation
  discretization, 28
  regularization, 41, 42
Cytostatic drug efficacy, 185

D
Data dimension reduction
Data mining
Defays, D., 185
Density based clusters, 192
Discretization, 5, 40
Discriminant analysis
  ANOVAs, 218
  collinearity, 222
  health recovery, 216
  latent variables, 217
  linear cause-effect, 221
  MANOVA, 216, 218, 221
  mean function score, 219
  multiple linear regression coefficients, 217
  multiple outcome variables, 216
  orthogonal discriminant functions, 218, 219
  orthogonal linear modeling, 216
  outcome variables, 218
  sepsis treatment with multi-organ, 218
  SPSS statistical software, 218
  subgroup analysis, 220–221
  test statistic of functions, 219
  treatment-group plots, 219, 220
Durbin Watson test, 206

E
Eftekbar, B., 152, 164
Eigenvectors, 6, 171
Elastic net regression
  discretization, 28
  regularization, 42, 44, 46
Exponential modeling, 135

F
Factor analysis, 5, 6
  add-up scores, 167, 168
  advantages and disadvantages, 176
  binary logistic regression, 173–175
  collinearity, 169
  components, 172–174
  Cronbach's alpha, 169, 172, 173
  eigenvectors, 171
  factor analysis theory, 169–171
  vs hierarchical cluster analysis, 193–194
  individual patients, health risk profiling, 175
  iterations, 171
  latent factors, 172
  loadings, 170
  magnitude, 170
  multicollinearity, 169
  multidimensional modeling, 172
  multiple logistic regression, 168
  original variables, 168–169
    coefficients, 173, 174
    test-retest reliability, 172, 173
  Pearson's correlation coefficient (R), 169
  sepsis patients data file, example, 177–181
  SPSS's data dimension reduction module, 167
  three factor factor-analysis, 167
  varimax rotation, 171
Factor analysis theory, 6–7, 169–171
Factor loadings, 7, 170
Fisher, R.A., 216
Franke, R., 158
Fuzzy logic modeling
  advantages, 251–252
  fuzzy memberships, 243, 246, 250
  fuzzy plots, 243, 246, 250
  fuzzy statistics, 251
  linguistic membership names, 243
  linguistic rules, 243
  propranolol, time-response effect
    input and output relationships, 247–248, 250–251
    pharmacodynamic effect, single oral dose, 247, 248
    pharmacodynamic relationship, 247, 249
    quadratic regression model, 247
  regression analysis, 241
  thiopental, dose-response effects
    input values, 244, 246–247
    induction dose and number of responders, 244, 245
    quantal pharmacodynamic effects, 243, 247
    statistical distribution, 243–244
  triangular fuzzy sets, 243
  universal space, 243
Fuzzy memberships
Fuzzy modeling, 2, 3
Fuzzy plots
Hierarchical cluster analysis, 2, 3, 7–8
  centroid clusters and density based clusters, 192
  collinearity, 185
  crystallization optimization, 185
  data analysis
    add-up sum, 186, 189–190
    dendrogram, 189, 191
    icicle plot, 189, 191
    linear regression with progression free interval, 189, 192
  drug efficacy, 184, 185
  explorative data mining, 192
  vs factor analysis, 193–194
  flexibility, 192–193
  gastric cancer patients, cytostatic treatment
    genes expression levels, 185, 186
    variables correlation matrix, 186–188
  linear regression, 185
  oral thiopurines, 185
  platinum and fluorouracil chemo-resistance, 185
  SPSS statistical software, 183
Hojsgaard, S., 42
Hotelling, H., 227
Huynh–Feldt test, 68, 72

G
Gaussian curves, 12
Gaussian distributions, 83
  BP artificial neural networks, 145, 148
  item response modeling, 88
Gifi, A., 26
Goodness of fit (GoF) value, 204

I
Item response modeling
  ceiling effects, 88
  vs classical linear testing, 88
  clinical and laboratory diagnostic-testing
    analysis results, 93, 94
    item response scores and classical scores, 93, 95
    vascular-laboratory tests, 93
  disadvantages, 96–97
  logistic models, 94
  principles, 89–90
  psychological and intelligence tests, 88
  QOL assessment (see Quality of life (QOL) assessment)
Iterations
  BP artificial neural networks, 148
  factor analysis, 171

H
Halekoh, U., 42
Hancock's equation, 162, 163
Haycock equation, 145, 152
Hazard ratio (HR), 102
Henderson, C.R., 66

K
Kernel frequency distribution modeling, 141
Kessler, R.C., 88
Klecka, W.R., 216
Kolmogorov–Smirnov (KS) goodness of fit test, 88

L
Lasso regression, optimal scaling
  discretization, 28
  regularization, 41, 42, 44, 45
Latent factors, 8, 172
Latent variables (LVs)
  canonical regression, 233
  discriminant analysis, 217
  principal components analysis, 198, 200
Learning
Linear cause-effect
Linear regression analysis
  BP artificial neural networks, 145
  radial basis function network, 159
  seasonality assessments, 114, 115, 124
Linguistic membership names
Linguistic rules
Loess modeling, 139–140
Logistic regression, 1, 3, binary, 19 b-values, 20, 22, 23 calculated odds, 21 characteristics, 21 disadvantages, 22 endometrial cancer example, 22–23 linear equation transformation, 18–19 log linear models, 17 odds of infarction and age, 18, 19 predictive models, 22 probability of events, 22 probability prediction, 17, 18 p-values, 20 regression equation, 19 LVs See Latent variables (LVs) M Machine learning, definition, 1, Manifest variables (MVs) canonical regression, 233 principal components analysis, 198, 200 McCulloch, W., 146 McLean, R.A., 66 Minsky, M.A., 146 Mixed linear models, 2, advantages and disadvantages, 75 ANOVA model, 66 placebo-controlled parallel group study, cholesterol treatment data adaptation, 68, 70–71 data file, 67 general linear model, 68, 69 Huynh–Feldt test, 68 Index hypercholesterolemia treatment, 67, 68 mixed model analysis, 68, 70 sphericity test, 68 three treatment crossover study, sleeping pills effect data adaptation, 73, 74 data file, 69, 71 mixed model analysis, 73, 74 p-value, 75 sphericity test, 72, 73 treatment effects, single group, 72 Monte Carlo methods, 4, discretization, 28 regularization, 41 Multicollinearity, 9, 169 Multidimensional modeling, 10, 172 Multilayer perceptron modeling, 10 BP artificial neural networks (see Back propagation (BP) artificial neural networks) Gaussian distributions, 145 Haycock equation, 145 linear regression analysis, 145 Multi-layer perceptron neural network, 158, 159, 163 Multiple linear regression, 217 Multiple logistic regression, 168 Multivariate analysis of covariance (MANCOVA), 2–4, 198 See also Canonical regression Multivariate analysis of variance (MANOVA), 198, 203, 255 canonical regression (see Canonical regression) discriminant analysis, 216, 218, 221 Multivariate machine learning methods, 10 Multivariate method, 5, 10 MVs See Manifest variables (MVs) N Network, 10 Neural networks, 2, 10 Non-linear modeling Ace/Avas packages, 133–134 background, 127 box Cox transformation, 133, 134 
  disadvantages, 141
  exponential modeling, 135
  Gaussian curves, 141
  kernel frequency distribution modeling, 141
  linearity testing
    curvilinear regression, 128
    non-linear data sets, 128, 129
    quadratic and cubic models, 128, 130
    squared correlation coefficient, 128
    standard models, regression analysis, 128, 130
  Loess modeling, 139–140
  logit and probit transformations, 131–133
  mechanical spline methods, 128
  objective, 127
  sinusoidal data, 134–135
  spline modeling
    computer graphics, 139
    low-order polynomial regression lines, 137, 138
    multidimensional smoothing, 139
    multiple linear regression lines, 137
    non-linear dataset, 136
    third order polynomial functions, 137, 138
    two-dimensional, 139
  "trial and error" method, 133
Non-linear relationships, 2

O
Optimal scaling, 1–3, 10–11
  discretization
    bouncing betas, 31
    continuous predictor variable, 26, 27
    cross-validation, 28
    disadvantages, 31
    drug efficacy composite score, 29, 30
    elastic net regression, 28
    F-tests, 30
    Lasso regression, 28
    microarray gene expression levels, 29, 30
    Monte Carlo methods, 28
    multiple linear regression analysis, 26, 27
    outcome variable composite scores, 29
    overdispersion, 28
    overfitting, 28
    quadratic approximation, 29
    regularization, 28
    ridge regression, 29
    splines, 29
    SPSS module, 25
    250 subjects datafile, example, 32–37
    without regularization, 30, 31
  regularization
    bouncing betas, 46
    continuous variable conversion, 40
    cross-validation, 41, 42
    discretization, 40
    discretized variable correction, 40
    elastic net regression, 42, 44, 46
    instable regression coefficients, 46
    k-fold vs. k-1 fold scale, 42
    Lasso regression, 41, 42, 44, 45
    microarray gene expression levels, 42
    Monte Carlo methods, 41
    overdispersion/overfitting, 41
    patients' datafile, example, 47–52
    ridge path model, 43, 44
    ridge regression, 41–44
    ridge scale model, 43
    splines, 41
Orthogonal discriminant functions, 218, 219
Overdispersion/overfitting, 11
  discretization, 28
  regularization, 41

P
Partial correlation analysis, 2, 10
  cardiovascular factors, multiple regression, 56
  exercise and calorie intake effects, weight loss
    with age held constant, 62
    with calorie intake held constant, 61, 62
    clinical outcome, 61
    collinearity, 57
    covariate, 57–59
    data interaction, 63–64
    with exercise held constant, 61, 62
    higher order partial correlation analysis, 62
    interaction variable, 56–59
    linear correlation, 60–61
    linear regression, 59
    multiple linear regression, 57, 60
    r-square values, 60
    subgroups, 63
    variables correlation matrix, 57, 59
  partial regression analysis, 56
Partial least squares analysis, 2, 10
  add-up scores, 198
  advantages and disadvantages, 206
  clusters of variables, 203
  correlation coefficients, 201, 204
  data dimension reduction, 199, 205
  example datafile, 207–212
  GoF value, 204
  MANCOVA, 198
  MANOVA, 198
  multivariate linear regression, 201
  vs. principal components analysis, 204–205
  response variables, 200, 206
  r-values, 204
  square Boolean matrix, 203
Partial regression analysis, 56
Pearson's correlation coefficient (R), 11, 169
PLS-Cox model, 206
Principal components analysis, 2, 12
  add-up scores, 203
  advantages and disadvantages, 206
  best fit coefficients, 201, 202
  Cronbach's alphas, 201
  data dimension reduction, 199, 205
  data validation, 201, 202
  example datafile, 207–212
  latent variables, 198, 200
  manifest variables, 198, 200
  MANOVA, 203
  multiple linear regression, 199, 202
  original variables, 201
  outcome variables, 203
  vs. partial least squares analysis, 204–205
  test-retest reliability, 201
Propranolol, time-response effect
  input and output relationships, 247–248, 250–251
  pharmacodynamic effect, single oral dose, 247, 248
  pharmacodynamic relationship, 247, 249
  quadratic regression model, 247

Q
Quality of life (QOL) assessment
  EAP scores, 92, 93
  Gaussian error model, 91
  5-item mobility-domain, 90, 91
  LTA-2 software program, 91
  normal Gaussian frequency distribution curve, 90, 92

R
Radial basis function network, 12
  black box modeling, 165
  correlation coefficient, 162
  Hancock's equation, 162, 163
  kurtosis of age, 163, 164
  linear regression analysis, 159
  multi-layer perceptron neural network, 158, 159, 163, 164
  90 persons body surfaces example, 159–161
  radial distant functions, 158
  sigmoidal function, 158
  skewness of height, 163, 164
  SPSS module neural networks, 159
  symmetric functions, 158
  three layer network, 159, 162
Radial basis functions, 12
Rasch, G., 88
Receiver operating characteristic (ROC) method, 81–83
Regularization, 12, 28
Ridge regression, 12
  discretization, 29
  regularization, 41–44
Ridge scale model, 42
Robinson, G.K., 66
Rosenblatt, F., 146

S
Sample learning
Seasonality assessments
  autocorrelation
    causal factors, 119
    coefficients and standard errors, 116, 118
    C-reactive protein levels, 114, 115
    cross-correlation coefficients, 119
    definition, 114
    first vs. second summer pattern, 119–121
    inconsistent patterns, 119, 122
    lagcurves, 115–116
    linear regression analysis, 114, 115, 124
    original datacurve, 114, 115, 125
    partial autocorrelation, 116
    p-values, 118
  curvilinear regression methods, 124
  monthly CRP values, 116, 117, 123
  objective and methods, 113
  vs. para-seasonality, 124
  seasonal patterns, 113, 114
Segmented time-dependent predictor
  blood pressure study and survival, 108
  cardiovascular event occurrence, 106–108
  logical expression, 106
Serendipities, 256
Sibson, R., 185
Spearman, C., 168
Spline modeling
  computer graphics, 139
  low-order polynomial regression lines, 137, 138
  multidimensional smoothing, 139
  multiple linear regression lines, 137
  non-linear dataset, 136
  third order polynomial functions, 137, 138
  two-dimensional, 139
Splines, 12
  discretization, 29
  regularization, 41
Stampfer, M.J., 57
Stevens, J., 233
Subgroup discriminant analysis, 220–221
Supervised learning, 12
Supervised machine learning. See Discriminant analysis

T
Thiopental, dose-response effects
  input values, 244, 246–247
  induction dose and number of responders, 244, 245
  quantal pharmacodynamic effects, 243, 247
  statistical distribution, 243–244
Thiopurines, 185
Tibshirani, R., 40
Time-dependent predictor
  Cox regression (see Cox regression)
  methods and results, 99–100
  morbidity/mortality, 100
  SPSS program, 109
  SPSS statistical software, 99
  T_*covariate, 109
  time-dependent factor analysis, 99
Training data, 13
"Trial and error" method, 133
Triangular fuzzy sets, 13, 243

U
Uebersax, J., 87, 89, 97
Universal space, 13
Unsupervised learning, 13

V
Varimax rotation, 13

W
Waaijenberg, S., 232
Weights, 13
Willett, W., 57
Wold, H., 199

Y
Yule, G.U., 56, 114

Z
Zadeh, L.A.
Zwinderman, A.H., 232

Date posted: 30/08/2021, 15:57

Table of contents

  • Machine Learning in Medicine

    • Preface

    • Contents

    • Chapter 1: Introduction to Machine Learning

      • 1 Summary

        • 1.1 Background

        • 1.2 Objective and Methods

        • 1.3 Results and Conclusions

        • 2 Introduction

        • 3 Machine Learning Terminology

          • 3.1 Artificial Intelligence

          • 3.2 Bootstraps

          • 3.3 Canonical Regression

          • 3.4 Components

          • 3.5 Cronbach’s alpha

          • 3.6 Cross-Validation

          • 3.7 Data Dimension Reduction

          • 3.8 Data Mining

          • 3.9 Discretization

          • 3.10 Discriminant Analysis

          • 3.11 Eigenvectors

          • 3.12 Elastic Net Regression

          • 3.13 Factor Analysis

          • 3.14 Factor Analysis Theory
