Part V Big Data and Future Directions for Business
6.1 Opening Vignette: Predictive Modeling Helps Better Understand and Manage Complex Medical Procedures
Healthcare has become one of the most important issues to have a direct impact on quality of life in the United States and around the world. While the demand for healthcare services is increasing because of the aging population, the supply side is having problems keeping up with the level and quality of service. In order to close the gap, healthcare systems ought to significantly improve their operational effectiveness and efficiency. Effectiveness (doing the right thing, such as diagnosing and treating accurately) and efficiency (doing it the right way, such as using the least amount of resources and time) are the two fundamental pillars upon which the healthcare system can be revived. A promising way to improve healthcare is to take advantage of predictive modeling techniques along with large and feature-rich data sources (true reflections of medical and healthcare experiences) to support accurate and timely decision making.
According to the American Heart Association, cardiovascular disease (CVD) is the underlying cause for over 20 percent of deaths in the United States. Since 1900, CVD has been the number-one killer every year except 1918, which was the year of the great flu pandemic. CVD kills more people than the next four leading causes of death combined: cancer, chronic lower respiratory disease, accidents, and diabetes mellitus. Out of all CVD deaths, more than half are attributed to coronary diseases. Not only does CVD take a huge toll on the personal health and well-being of the population, but it is also a great drain on healthcare resources in the United States and elsewhere in the world. The direct and indirect costs associated with CVD for a year are estimated to be in excess of $500 billion. A common surgical procedure used to treat many forms of CVD is coronary artery bypass grafting (CABG). Even though the cost of a CABG surgery depends on patient- and service provider-related factors, the average cost is between $50,000 and $100,000 in the United States. As an illustrative example, Delen et al. (2012) carried out an analytics study in which they used various predictive modeling methods to predict the outcome of a CABG and applied an information fusion-based sensitivity analysis to the trained models to better understand the importance of the prognostic factors. The main goal was to illustrate that predictive and explanatory analysis of large and feature-rich data sets provides invaluable information for making more efficient and effective decisions in healthcare.
Research Method
Figure 6.1 shows the model development and testing process used by Delen et al. They employed four different types of prediction models (artificial neural networks, support vector machines, and two types of decision trees, C5 and CART), and went through a large number of experimental runs to calibrate the modeling parameters for each model type. Once the models were developed, they were run on the test data set. Finally, the trained models were exposed to a sensitivity analysis procedure in which the contribution of each variable was measured. Table 6.1 shows the test results for the four different types of prediction models.
[Figure 6.1 layout. Input: preprocessed data (in Excel format), partitioned into training, testing, and validation sets. Processing: for each of the four model types (ANN, SVM, DT/C5, DT/CART), training and calibrating the model, testing the model, and conducting sensitivity analysis. Output: tabulated model testing results (accuracy, sensitivity, and specificity) and integrated (fused) sensitivity analysis results.]
Figure 6.1 A Process Map for Training and Testing of the Four Predictive Models.
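The partitioning step in the process map can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by Delen et al.: the 60/20/20 split, the function name `stratified_partition`, and the toy records are all assumptions made for the example. The sketch splits within each outcome class so that every partition keeps the same class balance.

```python
import random
from collections import defaultdict


def stratified_partition(records, label_of, percents=(60, 20, 20), seed=42):
    """Split records into (training, testing, validation) lists.

    The split is done separately within each class so that all three
    partitions keep the class balance of the full data set.
    The 60/20/20 default is an assumption, not a figure from the study.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec in records:
        by_class[label_of(rec)].append(rec)

    train, test, valid = [], [], []
    for members in by_class.values():
        rng.shuffle(members)
        n_train = len(members) * percents[0] // 100
        n_test = len(members) * percents[1] // 100
        train += members[:n_train]
        test += members[n_train:n_train + n_test]
        valid += members[n_train + n_test:]  # remainder goes to validation
    return train, test, valid


# Hypothetical toy data: 50 positive and 50 negative surgical outcomes.
records = [{"id": i, "outcome": i % 2} for i in range(100)]
train, test, valid = stratified_partition(records, lambda r: r["outcome"])
print(len(train), len(test), len(valid))
```

Each model type would then be calibrated on the training partition and compared on the held-out partitions, so that no model is scored on cases it has already seen.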
Results
In this study, Delen et al. showed the power of data mining in predicting the outcome and in analyzing the prognostic factors of complex medical procedures such as CABG surgery. They showed that using a number of prediction methods (as opposed to only one) in a competitive experimental setting has the potential to produce better predictive as well as explanatory results. Among the four methods that they used, SVMs produced the best results, with a prediction accuracy of 88 percent on the test data sample. The information fusion-based sensitivity analysis revealed the ranked importance of the independent variables. Several of the top variables identified in this analysis overlap with the most important variables identified in previously conducted clinical and biological studies, which supports the validity and effectiveness of the proposed data mining methodology.
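The vignette does not spell out how the sensitivity results of the individual models were fused. One plausible reading, shown here purely as a sketch, is to normalize each model's variable-importance scores so they sum to 1 and then average them across models, weighting each model by its test accuracy. The weighting scheme, the function names, the variable names (`var_a`, `var_b`, `var_c`), and the raw scores are all hypothetical; only the accuracies are taken from Table 6.1.

```python
from typing import Dict


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one model's importance scores so they sum to 1."""
    total = sum(scores.values())
    return {var: s / total for var, s in scores.items()}


def fuse(per_model_scores: Dict[str, Dict[str, float]],
         model_weights: Dict[str, float]) -> Dict[str, float]:
    """Accuracy-weighted average of normalized importance scores (assumed scheme)."""
    weight_total = sum(model_weights.values())
    fused: Dict[str, float] = {}
    for model, scores in per_model_scores.items():
        w = model_weights[model] / weight_total
        for var, s in normalize(scores).items():
            fused[var] = fused.get(var, 0.0) + w * s
    return fused


# Hypothetical raw sensitivity scores per model (made up for illustration).
per_model_scores = {
    "ANN": {"var_a": 0.30, "var_b": 0.50, "var_c": 0.20},
    "SVM": {"var_a": 0.10, "var_b": 0.60, "var_c": 0.30},
    "CART": {"var_a": 0.40, "var_b": 0.40, "var_c": 0.20},
}
# Test accuracies (percent) as reported in Table 6.1.
model_weights = {"ANN": 74.72, "SVM": 87.74, "CART": 71.15}

fused = fuse(per_model_scores, model_weights)
ranking = sorted(fused, key=fused.get, reverse=True)
print(ranking)
```

Weighting by accuracy lets the better-performing models have more say in the final ranking, while still allowing a variable that every model considers important to rise to the top.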
From the managerial standpoint, clinical decision support systems that use the outcome of data mining studies (such as the ones presented in this case study) are not meant to replace healthcare managers and/or medical professionals. Rather, they are intended to support them in making accurate and timely decisions to optimally allocate resources in order to increase the quantity and quality of medical services. There still is a long way to go before we can see these decision aids being used extensively in healthcare practices. Among others, there are behavioral, ethical, and political reasons for this resistance to adoption. Perhaps the need for better healthcare systems, together with government incentives, will expedite the adoption.
Questions for the Opening Vignette
1. Why is it important to study medical procedures? What is the value in predicting outcomes?
2. What factors do you think are the most important in better understanding and managing healthcare? Consider both managerial and clinical aspects of healthcare.
3. What would be the impact of predictive modeling on healthcare and medicine? Can predictive modeling replace medical or managerial personnel?
4. What were the outcomes of the study? Who can use these results? How can the results be implemented?
5. Search the Internet to locate two additional cases where predictive modeling is used to understand and manage complex medical procedures.

Table 6.1 Prediction Accuracy Results for All Four Model Types Based on the Test Data Set

| Model Type¹ | Actual Class² | Predicted Pos (1) | Predicted Neg (0) | Accuracy³ | Sensitivity³ | Specificity³ |
|---|---|---|---|---|---|---|
| ANN | Pos (1) | 749 | 230 | 74.72% | 76.51% | 72.93% |
| | Neg (0) | 265 | 714 | | | |
| SVM | Pos (1) | 876 | 103 | 87.74% | 89.48% | 86.01% |
| | Neg (0) | 137 | 842 | | | |
| C5 | Pos (1) | 786 | 193 | 79.62% | 80.29% | 78.96% |
| | Neg (0) | 206 | 773 | | | |
| CART | Pos (1) | 660 | 319 | 71.15% | 67.42% | 74.87% |
| | Neg (0) | 246 | 733 | | | |

¹ Acronyms for model types: ANN: Artificial Neural Networks; SVM: Support Vector Machines; C5: A popular decision tree algorithm; CART: Classification and Regression Trees.
² Prediction results for the test data samples are shown in a confusion matrix, where the rows represent the actuals and columns represent the predicted cases.
³ Accuracy, Sensitivity, and Specificity are the three performance measures that were used in comparing the four prediction models.
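The three performance measures in Table 6.1 follow directly from the confusion matrix counts. As a minimal sketch (the class and method names are illustrative, not from the study), accuracy is the share of all cases predicted correctly, sensitivity is the share of actual positives predicted as positive, and specificity is the share of actual negatives predicted as negative:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """2x2 confusion matrix; rows are actual classes, columns are predictions."""
    tp: int  # actual positive, predicted positive
    fn: int  # actual positive, predicted negative
    fp: int  # actual negative, predicted positive
    tn: int  # actual negative, predicted negative

    def accuracy(self) -> float:
        # Correct predictions over all cases.
        return (self.tp + self.tn) / (self.tp + self.fn + self.fp + self.tn)

    def sensitivity(self) -> float:
        # Correctly predicted positives over all actual positives.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Correctly predicted negatives over all actual negatives.
        return self.tn / (self.tn + self.fp)


# Counts copied from the ANN and SVM rows of Table 6.1.
ann = ConfusionMatrix(tp=749, fn=230, fp=265, tn=714)
svm = ConfusionMatrix(tp=876, fn=103, fp=137, tn=842)

for name, cm in (("ANN", ann), ("SVM", svm)):
    print(f"{name}: accuracy={cm.accuracy():.2%} "
          f"sensitivity={cm.sensitivity():.2%} "
          f"specificity={cm.specificity():.2%}")
```

Running the sketch reproduces the percentages reported for these two models in the table, which is a quick way to check any row of a confusion matrix against its summary measures.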
What We Can Learn from This Vignette
As you will see in this chapter, predictive modeling techniques can be applied to a wide range of problem areas, from standard business problems of assessing customer needs to understanding and enhancing the efficiency of production processes to improving healthcare and medicine. This vignette illustrates an innovative application of predictive modeling to better predict, understand, and manage coronary artery bypass grafting procedures. As the results indicate, these sophisticated predictive modeling techniques are capable of predicting and explaining such complex phenomena. Evidence-based medicine is a relatively new term coined in the healthcare arena, where the main idea is to dig deep into past experiences to discover new and useful knowledge to improve medical and managerial procedures in healthcare. As we all know, healthcare needs all the help that it can get. Compared to traditional research, which is clinical and biological in nature, data-driven studies provide an out-of-the-box view of medicine and the management of medical systems.
Sources: D. Delen, A. Oztekin, and L. Tomak, “An Analytic Approach to Better Understanding and Management of Coronary Surgeries,” Decision Support Systems, Vol. 52, No. 3, 2012, pp. 698–705; and American Heart Association, “Heart Disease and Stroke Statistics—2012 Update,” heart.org (accessed February 2013).