Exploring the Performances of Stacking Classifier in Predicting Patients Having Stroke

2021 8th NAFOSTED Conference on Information and Computer Science (NICS), Hanoi City, Vietnam, December 21-22, 2021

Tasnimul Hasan1, Mirza Muntasir Nishat1, Fahim Faisal1, Abrar Islam1, Abdullah Al Mehadi1, Sarker Md Nasrullah2 and Mohammad Rakibul Islam1
1 Department of Electrical and Electronic Engineering, Islamic University of Technology, Dhaka, Bangladesh
2 Department of Public Health, North South University, Dhaka, Bangladesh
Email: {tasnimulhasan56, mirzamuntasir, faisaleee, abrarislam, abdullahmehadi}@iut-dhaka.edu, sarker.nasrullah@northsouth.edu, rakibultowhid@yahoo.com

Abstract: Stroke refers to a spectrum of clinical manifestations with underlying neurological dysfunctions of the brain. It is a medical condition that is often misdiagnosed and commonly misclassified, leading to a delay in the initiation of disease-specific treatment. Rapid and precise detection of stroke is the key to effective management of patients and to alleviating possible disabilities. Machine learning techniques are being adopted for their ability to identify hidden patterns in patient data. In this study, a stacking classifier is constructed from Random Forest (RF), Extra Tree (ET) and Gradient Boosting Classifier (GBC), and its performance is evaluated with several performance metrics. A detailed comparative analysis shows that the accuracies of RF, ET and GBC are 94.63%, 94.62% and 94.72% respectively, whereas the proposed stacking classifier outperforms the individual classifiers with an accuracy of 95%. Hyperparameter tuning is carried out for all the classifiers, which enhances their performance. Hence, the investigative analysis can significantly contribute to predicting patients having a stroke and aid in developing an
automated diagnosis for e-healthcare systems.

Keywords: Stacking Classifier, Stroke, Machine Learning, Accuracy

978-1-6654-1001-4/21/$31.00 ©2021 IEEE

I. INTRODUCTION

According to the World Health Organization (WHO), stroke is a spectrum of clinical signs and symptoms of vascular origin, lasting for at least a day, that lead to focal neurological dysfunction and sometimes to death [1]. Stroke can be divided into three categories depending on the pathogenesis of the disease: ischemic stroke, hemorrhagic stroke and subarachnoid hemorrhage. Ischemic stroke is characterized by the occlusion of arteries, both small and large, that supply nutrition and oxygen to the brain, leading to ischemia and infarction of a specific area. Hemorrhagic stroke, on the other hand, features spontaneous bleeding within the cerebrum due to factors such as hypertension, diet, vascular malformation and coagulation disorders. Hypertension has been identified as the most common cause of hemorrhagic stroke [2]. Subarachnoid hemorrhage is caused by the sudden rupture of an otherwise asymptomatic aneurysm on the undersurface of the brain. Among the risk factors of acute stroke, a sedentary lifestyle with decreased physical activity, age-inappropriate diet, smoking, alcohol consumption, drug abuse, and excessive intake of salt, sugar and processed food contribute most to the incidence of stroke [3]. Acute stroke is the second most common cause of death and the third most common cause of disability among elderly patients worldwide [4-5].

The major difficulty in diagnosing stroke arises from several other disorders, known as “stroke mimics”, which also present with focal neurological deficits [8]. Examples are nonvascular diseases such as brain tumors, seizures, hysterical conversion disorders, cerebral infections and toxic metabolic conditions, and vascular disorders such as transient ischemic attacks, posterior reversible encephalopathy syndrome, reversible
vasoconstriction syndrome etc. These conditions cause a significant number of false positive diagnoses of stroke in the emergency rooms of health care centers [9]. The frequency varies with the type of facility and can reach up to 30% of all suspected cases in hospitals that have no neurologist [10]. Therefore, precise and early detection of stroke could be the key step leading to effective treatment with a desirable prognosis and limited disability.

Machine learning (ML) relies on algorithms that learn from data rather than on rules-based programming [11]. ML models do not require active human interaction, as they can learn, recognize patterns, and make choices. Machines, in theory, improve accuracy and efficiency while eliminating (or significantly reducing) the chance of human error [12]. Current computational methods for diagnostic planning and simulation are imprecise and time-consuming, which limits their application. An ML-based framework that incorporates supervised learning for diagnosis, vulnerability estimation, and therapeutic simulation can aid doctor-patient communication and clinical decision making [13-19].

II. RELATED WORKS

A lot of research uses machine learning tools to aid in various clinical diagnoses [20-26]. Ali et al. conducted research on stroke prediction in which they applied distributed machine learning algorithms to a healthcare stroke dataset. The work was carried out on Apache Spark, a big data platform, and showed that random forest attained the best accuracy of 90% [27]. D. Shanthi et al. employed Artificial Neural Networks (ANN) to predict thromboembolic stroke. Their study demonstrated ANN-based stroke prediction that increased accuracy to 89% with greater consistency [28]. However, Cheng et al. used ANN
models to predict ischemic stroke in a dataset from Sugam Multispecialty Hospital of Tamil Nadu, India. The researchers reported an accuracy of 79.2% [29]. Moreover, Kansadub et al. applied three classification algorithms (decision tree, naïve Bayes and neural network) to predict stroke, where the decision tree outperformed the other two algorithms with an accuracy of 75% [30]. Linder et al., on the other hand, assessed logistic regression (LR) and artificial neural networks (ANN) on the German stroke database to detect acute ischemic stroke [31]. The findings of that study show that LR is more appropriate than ANN for categorizing acute ischemic stroke. Sung et al. studied the performance of KNN, MLR and regression tree models in predicting the severity of stroke; the findings showed that KNN outperformed the other models [32].

In this study, a Stacking Classifier (SC) incorporating random forest, extra tree and gradient boosting is proposed, which can predict patients having stroke in an automated manner. The performances of the classifiers have been obtained by simulation in Python, both without and with hyperparameter tuning. The aim is to examine the applicability of this kind of stacking classifier for building a computer aided diagnosis system for e-healthcare services. The methodology of the study is presented in Section III, where the data processing, feature selection and the concept of the stacking classifier are discussed. In Section IV, the experimental analysis and results are presented graphically, with a comparative analysis of the performance parameters obtained from the Python simulation. Lastly, the conclusion is presented in Section V.

III. PROPOSED METHOD

To perform the predictive analysis, the dataset is first collected from Kaggle [33], a popular source of open datasets. The dataset is then loaded into Jupyter Notebook, and the categorical features (gender, ever_married, work_type, Residence_type, smoking_status) are converted into numerical values using a label encoder. Missing values are filled using “KNN Imputer”, where the parameter n_neighbors has been kept at a fixed value. Following that, a filtering method is employed to select the top six features using the “SelectKBest” function. A stratified train-test split is then performed, where 20% of the data is taken for testing and the rest for training. As the data was imbalanced, oversampling was carried out on the training data using RandomOverSampler with the sampling strategy set to 0.7. After that, the data is scaled to the range 0 to 1 using MinMaxScaler. Finally, the training data is fed into the ML models. The overall workflow of data preprocessing is depicted in the workflow diagram below. The graph of the top features selected by the K-best algorithm and the correlation heatmap are shown in the two figures that follow it.

[Fig.: Workflow diagram of Data Preprocessing. Steps: Kaggle Stroke Dataset; Finding Categorical Features; Converting into Numerical Values; Filling Missing Values; Applying Filtering Method; Data Splitting (Train and Test Set); Perform Oversampling on Imbalanced Data; Apply Min Max Scaler; Obtain Computational Friendly Data; Feed Data to ML Models]

[Fig.: K-best Function Graph]

[Fig.: Correlation heatmap]

Ensemble learning is a technique that combines numerous machine learning models [34]. Such models are termed weak learners. The idea is that a group of weak learners, taken together, can become a strong learner. Each weak learner is fitted to the training set and produces its own predictions [35]. The ultimate prediction is computed by combining the results of all the weak learners. With a final classifier on top, a stack of estimators is formed. Stacked generalization involves stacking the outputs of the individual estimators and computing the final prediction using a classifier [36]. In this analysis, RF, ET and GBC have been combined to build the stacking classifier. The final classifier is trained using the predicted class labels and the ensemble probabilities [37].

[Fig.: Workflow diagram of stacking classifier. Elements: Initial Preprocessing; Training Dataset; Testing Dataset; base models up to Model N (each with Training and Prediction); Final Model (Training); Final Model (Predicted)]

IV. EXPERIMENTS

In this study, three classifier models (Random Forest, Extra Tree and Gradient Boosting Classifier) have been constructed, and their performances have been evaluated and compared through simulation in Python. A Stacking Classifier (SC) model has then been developed from these three individual algorithms, and its performance has also been observed. The confusion matrices before and after tuning are tabulated in Tables I to VIII for RF, ET, GBC and SC. These confusion matrices aid in deriving the necessary performance parameters such as accuracy, precision, recall, F1 score, cross-validation score and Area Under the Curve (AUC).

TABLES I-VIII. CONFUSION MATRICES FOR RF, ET, GBC AND SC (BEFORE AND AFTER TUNING)
Each matrix is listed row by row, with rows labelled False and True.

Classifier             Row False     Row True
RF (Before Tuning)     955, 46       17, (not legible)
RF (After Tuning)      959, 42       13, (not legible)
ET (Before Tuning)     960, 49       12, (not legible)
ET (After Tuning)      963, 46       (not legible)
GBC (Before Tuning)    822, 15       150, 35
GBC (After Tuning)     911, 30       61, 20
SC (Before Tuning)     967, 49       (not legible)
SC (After Tuning)      969, 49       (not legible)

By tuning different hyperparameters, the performances of the four classifiers have been improved. Hence, the “with tuning” perspective has been introduced, and a comparative analysis is provided to give a clear picture of the practicality and suitability of the three distinct algorithms and the stacking classifier in predicting stroke with high accuracy. The graphical comparison of all the ML models is shown in Fig. 5 to Fig. 11, where accuracy, precision, recall, F1_score, AUC, specificity and cross-validation score are compared respectively.

Firstly, it is observed that the stacking model achieves higher accuracy than the three individual algorithms both before and after tuning (0.9471 and 0.9491 respectively). Secondly, the stacking model also surpasses the other classifiers in F1 score, specificity and cross-validation score, with values of 0.9491, 0.9969 and 0.9969 respectively. In terms of precision and recall, this model does not have the highest values but still provides scores above 0.9 for both (0.9175 for precision and 0.928 for recall), whereas its AUC is 0.5084, which is quite low compared with its other performance metrics. For precision, GBC occupies the top spot with a value of 0.9433, whereas for recall both Random Forest and Extra Tree lead the other classifiers with the same score of 0.9462.

Another important observation from the simulation is the effect of hyperparameter tuning on the various algorithms. All the classifiers show a significant improvement in their performance metrics due to tuning. GBC is the prime example, as it has been affected most by tuning. Both the accuracy and the recall of GBC were 0.8386, and after tuning this number rose to 0.911. The same pattern can be seen in AUC and F1 score, which were 0.6686 and 0.8789 before tuning and climbed to 0.7728 and 0.9208 respectively after tuning. The other classifiers also exhibit this kind of increase in all the parameters, which underlines the importance of hyperparameter tuning.

[Fig. 5: Comparison of accuracy among the ML models]
[Fig. 6: Comparison of precision among all the ML models]
[Fig. 7: Comparison of recall among all the ML models]
[Fig. 8: Comparison of F1_score among all the ML models]
[Fig. 9: Comparison of AUC among all the ML models]
[Fig. 10: Comparison of specificity among all the ML models]
[Fig. 11: Comparison of error rate among all the ML models]

Several researchers have worked on stroke prediction with various datasets and strategies. Table IX compares this predictive analysis with their work in terms of the accuracies obtained. Hence, it can be concluded that for this kind of dataset, the stacking classifier can be a viable ML model if implemented in a computer aided diagnosis system for detecting patients having stroke. All the performance metrics of all the classifiers are given in Table X.

TABLE IX. COMPARATIVE ANALYSIS WITH OTHER WORKS

Reference    Authors              Best Accuracy
[27]         Ali et al.           90.0%
[28]         D. Shanthi et al.    89.0%
[29]         Cheng et al.         79.2%
[30]         Kansadub et al.      75.0%
             This work            95.0%

TABLE X. PERFORMANCE METRICS OF ALL THE CLASSIFIERS

Classifier     Accuracy   Precision   Recall   F1_score   AUC      Specificity   Cross Validation Score
RF             0.9384     0.9167      0.9384   0.9262     0.5313   0.9825        0.9472
RF (tuned)     0.9462     0.9298      0.9462   0.9356     0.5733   0.9866        0.9487
ET             0.9403     0.9087      0.9403   0.9233     0.5038   0.9877        0.9425
ET (tuned)     0.9462     0.9228      0.9462   0.9309     0.5354   0.9907        0.9440
GBC            0.8386     0.9328      0.8386   0.8789     0.6686   0.8457        0.9337
GBC (tuned)    0.9110     0.9433      0.9110   0.9208     0.7728   0.9372        0.9479
SC             0.9471     0.9133      0.9269   0.9471     0.5074   0.9948        0.9499
SC (tuned)     0.9500     0.9175      0.9280   0.9491     0.5084   0.9969        0.9503

V. CONCLUSION

As the brain is the body's primary mover, any abnormality in it puts all the body's systems in danger. Hence, the prediction of stroke is important, as during a stroke brain cells are severely harmed or die. Dead cells cannot be revived, and most severely harmed cells may not recover, which leads to disability and death. Precise prediction of stroke is therefore of great importance, as success here can save many lives. Since conventional medical methods are not sufficient to predict stroke and may worsen the situation through misleading data and results, machine learning models can be very fruitful: they can reduce the danger to a minimum by enabling precautions and keeping the necessary medical equipment nearby. The outcomes of Random Forest (RF), Extra Tree (ET), Gradient Boosting Classifier (GBC) and Stacking Classifier (SC) are presented, where the stacking classifier outperformed the other models with an accuracy of 95%. Though all the other algorithms also showed promising outcomes, it can be concluded that stacking enhanced the overall performance. In the future, these models can be evaluated with a larger quantity of data, which will provide more insight and encourage researchers to develop a computer aided system to predict patients having stroke, so that early treatment can be provided and mortality rates can be reduced significantly.

REFERENCES

[1] WHO
MONICA Project Principal Investigators, “The World Health Organization MONICA project (monitoring trends and determinants in cardiovascular disease): A major international collaboration,” Journal of Clinical Epidemiology, 41(2), pp. 105-114, 1988.
[2] Hakimi, R. and Garg, A., “Imaging of hemorrhagic stroke,” CONTINUUM: Lifelong Learning in Neurology, 22(5), pp. 1424-1450, 2016.
[3] Truelsen, T., Begg, S. and Mathers, C., “The global burden of cerebrovascular,” In Who Int., 2006.
[4] Simonetti, B.G., Mono, M.L., Huynh-Do, U., Michel, P., Odier, C., Sztajzel, R., Lyrer, P., Engelter, S.T., Bonati, L., Gensicke, H. and Traenka, C., “Risk factors, aetiology and outcome of ischaemic stroke in young adults: the Swiss Young Stroke Study (SYSS),” Journal of Neurology, 262(9), pp. 2025-2032, 2015.
[5] Johnson, Walter, Oyere Onuma, Mayowa Owolabi, and Sonal Sachdev, “Stroke: a global response is needed,” Bulletin of the World Health Organization, 94 (2016): 634.
[6] Krishnamurthi, R.V., Moran, A.E., Feigin, V.L., Barker-Collo, S., Norrving, B., Mensah, G.A., Taylor, S., Naghavi, M., Forouzanfar, M.H., Nguyen, G. and Johnson, C.O., “Stroke prevalence, mortality and disability-adjusted life years in adults aged 20-64 years in 1990-2013: data from the global burden of disease 2013 study,” Neuroepidemiology, 45(3), pp. 190-202, 2015.
[7] “WHO | The world health report 2002 - Reducing Risks, Promoting Healthy Life,” WHO, 2013.
[8] Liberman, A.L. and Prabhakaran, S., “Stroke chameleons and stroke mimics in the emergency department,” Current Neurology and Neuroscience Reports, 17(2), p. 15, 2017.
[9] Vilela, P., “Acute stroke differential diagnosis: stroke mimics,” European Journal of Radiology, 96, pp. 133-144, 2017.
[10] Zinkstok, S.M., Engelter, S.T., Gensicke, H., Lyrer, P.A., Ringleb, P.A., Artto, V., Putaala, J., Haapaniemi, E., Tatlisumak, T., Chen, Y. and Leys, D., “Safety of thrombolysis in stroke mimics: results from a multicenter cohort study,” Stroke, 44(4), pp. 1080-1084, 2013.
[11] El Naqa, I. and Murphy, M.J., “What is machine learning?” In Machine Learning in Radiation Oncology, pp. 3-11, Springer, Cham, 2015.
[12] Ahmed, Z., Mohamed, K., Zeeshan, S. and Dong, X., “Artificial intelligence with multi-functional machine learning platform development for better healthcare and precision medicine,” Database, 2020.
[13] Faisal, F. and Nishat, M. M., “An Investigation for Enhancing Registration Performance with Brain Atlas by Novel Image Inpainting Technique using Dice and Jaccard Score on Multiple Sclerosis (MS) Tissue,” Biomedical and Pharmacology Journal, 12(3), 2019.
[14] Nishat, M. M., Faisal, F., Dip, R. R., Shikder, M. F., Ahsan, R., Asif, M. A. A. R., and Udoy, M. R., “Performance Investigation of Different Boosting Algorithms in Predicting Chronic Kidney Disease,” In 2020 2nd International Conference on Sustainable Technologies for Industry 4.0 (STI), pp. 1-5, IEEE, 2020.
[15] Asif, M. A. A. R., Nishat, M. M., Faisal, F., Dip, R. R., Udoy, M. H., Shikder, M. F., and Ahsan, R., “Performance Evaluation and Comparative Analysis of Different Machine Learning Algorithms in Predicting Cardiovascular Disease,” Engineering Letters, 29(2), pp. 731-741, 2021.
[16] Nishat, M. M., Faisal, F., Dip, R. R., Nasrullah, S. M., Ahsan, R., Shikder, M. F., Asif, M. A. A. R., and Hoque, M. A., “A Comprehensive Analysis on Detecting Chronic Kidney Disease by Employing Machine Learning Algorithms,” EAI Endorsed Transactions on Pervasive Health and Technology, phat 18: e6, 2021, doi: 10.4108/eai.13-8-2021.170671.
[17] Farazi, M. R., Faisal, F., Zaman, Z., and Farhan, S., “Inpainting multiple sclerosis lesions for improving registration performance with brain atlas,” In Medical Engineering, Health Informatics and Technology (MediTec), pp. 1-6, IEEE, 2016, doi: 10.1109/MEDITEC.2016.7835363.
[18] Asif, M. A. A. R., Nishat, M. M., Faisal, F., Shikder, M. F., Udoy, M. H., Dip, R. R., and Ahsan, R., “Computer Aided Diagnosis of Thyroid Disease Using Machine Learning Algorithms,” In 2020 11th International Conference on Electrical and Computer Engineering (ICECE), pp. 222-225, IEEE, 2020.
[19] Rahman, A. A., Faisal, F., Nishat, M. M., Siraji, M. I., Khalid, L. I., Khan, M. R. H., and Reza, M. T., “Detection of Epileptic Seizure from EEG Signal Data by Employing Machine Learning Algorithms with Hyperparameter Optimization,” IEEE 4th International Conference on Bio-Engineering for Smart Technologies (BioSMART 2021), 8-10 December, Paris, IEEE, 2021, in press.
[20] Nishat, M. M., Faisal, F., Mahbub, M. A., Mahbub, M. H., Islam, S., and Hoque, M. A., “Performance Assessment of Different Machine Learning Algorithms in Predicting Diabetes Mellitus,” Biosc.Biotech.Res.Comm., 14(1), 2021, doi: http://dx.doi.org/10.21786/bbrc/14.1/10.
[21] Rahman, A. A., Khalid, L. I., Siraji, M. I., Nishat, M. M., Faisal, F., and Ahmed, A., “Enhancing the Performance of Machine Learning Classifiers by Hyperparameter Optimization in Detecting Anxiety Levels of Online Gamers,” 24th International Conference on Computer and Information Technology (ICCIT), IEEE, 2021, in press.
[22] Nishat, M. M., Faisal, F., Hasan, T., Karim, M. F. B., Islam, Z., and Shagor, M. R. K., “An Investigative Approach to Employ Support Vector Classifier as a Potential Detector of Brain Cancer from MRI Dataset,” 2021 International Conference on Electronics, Communication & Information Technology (ICECIT), IEEE, 2021, in press.
[23] M. M. Nishat, T. Hasan, S. M. Nasrullah, F. Faisal, M. A.-A.-R. Asif and M. A. Hoque, “Detection of Parkinson's Disease by Employing Boosting Algorithms,” 2021 Joint 10th International Conference on Informatics, Electronics & Vision (ICIEV) and 2021 5th International Conference on Imaging, Vision & Pattern Recognition (icIVPR), 2021, pp. 1-7, doi: 10.1109/ICIEVicIVPR52578.2021.9564108.
[24] M. M. Nishat and F. Faisal, “An Investigation of Spectroscopic Characterization on Biological Tissue,” 2018 4th International Conference on Electrical Engineering and Information & Communication Technology (iCEEiCT), 2018, pp. 290-295, doi: 10.1109/CEEICT.2018.8628081.
[25] Faisal, Fahim, Mirza Muntasir Nishat, Md Ashif Mahbub, Md Minhajul Islam Shawon, and Md Mahbub-Ul-Huq Alvi, “Covid-19 and its impact on school closures: a predictive analysis using machine learning algorithms,” In 2021 International Conference on Science and Contemporary Technologies (ICSCT), IEEE, 2021.
[26] Nishat, M. M., Faisal, F., Hasan, T., Nasrullah, S. M., Bristy, A. H., Shawon, M. I., and Hoque, A., “Detection of Autism Spectrum Disorder by Discriminant Analysis Algorithm,” In Proceedings of the International Conference on Big Data, IoT, and Machine Learning, pp. 473-482, Springer, Singapore, 2022.
[27] Ali, Abdelmgeid A., “Stroke Prediction using Distributed Machine Learning Based on Apache Spark,” Stroke 28(15), pp. 89-97, 2019.
[28] Shanthi, D., Sahoo, G., and Saravanan, N., “Designing an artificial neural network model for the prediction of thrombo-embolic stroke,” International Journals of Biometric and Bioinformatics (IJBB), 3(1), pp. 10-18, 2009.
[29] Cheng, C. A., Lin, Y. C., and Chiu, H. W., “Prediction of the prognosis of ischemic stroke patients after intravenous thrombolysis using artificial neural networks,” In ICIMTH, pp. 115-118, 2014.
[30] Kansadub, T., Thammaboosadee, S., Kiattisin, S., and Jalayondeja, C., “Stroke risk prediction model based on demographic data,” In 2015 8th Biomedical Engineering International Conference (BMEiCON), pp. 1-3, IEEE, 2015.
[31] Linder, R., König, I. R., Weimar, C., Diener, H. C., Pöppl, S. J., and Ziegler, A., “Two models for outcome prediction,” Methods of Information in Medicine, 45(5), pp. 536-540, 2006.
[32] Sung, S. F., Hsieh, C. Y., Yang, Y. H. K., Lin, H. J., Chen, C. H., Chen, Y. W., and Hu, Y. H., “Developing a stroke severity index based on administrative data was feasible using data mining techniques,” Journal of Clinical Epidemiology, 68(11), pp. 1292-1300, 2015.
[33] https://www.kaggle.com/fedesoriano/stroke-prediction-dataset
[34] Ali, S. and Majid, A., “Can–Evo–Ens: Classifier stacking based evolutionary ensemble system for prediction of human breast cancer using amino acid sequences,” Journal of Biomedical Informatics, 54, pp. 256-269, 2015.
[35] Ting, K.M. and Witten, I.H., “Issues in stacked generalization,” Journal of Artificial Intelligence Research, 10, pp. 271-289, 1999.
[36] Džeroski, S. and Ženko, B., “Stacking with multi-response model trees,” In International Workshop on Multiple Classifier Systems, pp. 201-211, Springer, 2002.
[37] Zhu, D., “A hybrid approach for efficient ensembles,” Decision Support Systems, 48(3), pp. 480-487, 2010.
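As a supplementary illustration, the preprocessing workflow and the stacking classifier described in Section III can be sketched with scikit-learn roughly as follows. This is a minimal sketch under stated assumptions, not the authors' code: the data are synthetic (make_classification stands in for the Kaggle stroke dataset), the values n_neighbors=5, the f_classif score function, the tree counts and the LogisticRegression final estimator are illustrative choices that the paper does not state, and oversample_minority is a hypothetical helper that mimics RandomOverSampler with a sampling strategy of 0.7.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import KNNImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler


def oversample_minority(X, y, ratio=0.7, seed=0):
    """Duplicate random minority rows until minority/majority is about `ratio`.

    Hypothetical stand-in for imblearn's RandomOverSampler.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    target = int(ratio * counts.max())
    extra = max(target - counts.min(), 0)
    idx = rng.choice(np.flatnonzero(y == minority), size=extra, replace=True)
    return np.vstack([X, X[idx]]), np.concatenate([y, y[idx]])


# Synthetic, imbalanced stand-in for the stroke data (assumption).
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           weights=[0.95], random_state=0)
X[::50, 0] = np.nan  # inject a few missing values to exercise the imputer

X = KNNImputer(n_neighbors=5).fit_transform(X)       # n_neighbors=5 is assumed
X = SelectKBest(f_classif, k=6).fit_transform(X, y)  # keep the top six features

# Stratified 80/20 split, then oversample the training part only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
X_train, y_train = oversample_minority(X_train, y_train, ratio=0.7)

# Scale features to the 0-1 range, fitting the scaler on training data.
scaler = MinMaxScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("et", ExtraTreesClassifier(n_estimators=50, random_state=0)),
        ("gbc", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ],
    final_estimator=LogisticRegression(),  # assumed meta-learner
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
print(f"test accuracy: {accuracy:.4f}")
```

The step order follows the paper (imputation, feature selection, split, oversampling, scaling). Selecting features before the split lets test data influence the selection, so a stricter evaluation would fit SelectKBest on the training part only.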

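As a worked check of how the Section IV metrics follow from a confusion matrix, the sketch below recomputes accuracy and specificity from the GBC counts reported before and after tuning. The assignment of the four counts to true negatives, false positives, false negatives and true positives is an assumption; it is chosen because it reproduces the accuracy and specificity values listed for GBC in Table X. The function name metrics_from_counts is hypothetical.

```python
def metrics_from_counts(tn, fp, fn, tp):
    """Return accuracy, specificity and sensitivity from confusion-matrix counts."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        "specificity": tn / (tn + fp),  # true negative rate
        "sensitivity": tp / (tp + fn),  # true positive rate
    }


# GBC counts from the paper; the cell-to-role mapping is assumed.
gbc_before = metrics_from_counts(tn=822, fp=150, fn=15, tp=35)
gbc_after = metrics_from_counts(tn=911, fp=61, fn=30, tp=20)

for name, m in [("GBC", gbc_before), ("GBC (tuned)", gbc_after)]:
    print(f"{name}: accuracy={m['accuracy']:.4f}, "
          f"specificity={m['specificity']:.4f}")
# GBC: accuracy=0.8386, specificity=0.8457
# GBC (tuned): accuracy=0.9110, specificity=0.9372
```

Under this reading, each matrix covers 1022 test samples, of which 972 are actual negatives and 50 are actual positives.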