Data Science for Business
Decision Analytic Thinking I: What Is a Good Model?
Decision Analytic Thinking II: Toward Analytical Engineering (Chapter 11)
Lecturer: VAN CHAU
Email: vvchauit@gmail.com
Zalo: 0918080300

Decision Analytic Thinking
Chapter: WHAT IS A GOOD MODEL?
01 Evaluating Classifiers
02 Plain Accuracy and Its Problems
03 The Confusion Matrix
04 Problems with Unbalanced Classes
05 Problems with Unequal Costs and Benefits
06 A Key Analytical Framework: Expected Value
07 Using Expected Value to Frame Classifier Use
08 Using Expected Value to Frame Classifier Evaluation
09 Evaluation, Baseline Performance, and Implications for Investments in Data
10 Summary

Evaluating Classifiers
Consider binary classification, for which the classes are often simply called "positive" and "negative." How shall we evaluate how well such a model performs? In an earlier chapter we discussed how, for evaluation, we should use a holdout test set to assess the generalization performance of the model. But how should we measure generalization performance? Let's look at the basic confusion matrix first.

Evaluating Classifiers (cont.)
24 Evaluation Metrics for Binary Classification (And When to Use Them)
1 Confusion matrix
2 False positive rate | Type I error
3 False negative rate | Type II error
4 True negative rate | Specificity
5 Negative predictive value
6 False discovery rate
7 True positive rate | Recall | Sensitivity
8 Positive predictive value | Precision
9 Accuracy
10 F beta score
11 F1 score
12 F2 score
13 Cohen Kappa
14 Matthews correlation coefficient
15 ROC curve
16 ROC AUC score
17 Precision-Recall curve
18 PR AUC | Average precision
19 Log loss
20 Brier score
21 Cumulative gain chart
22 Lift curve | Lift chart
23 Kolmogorov-Smirnov plot
24 Kolmogorov-Smirnov statistic
Source: https://neptune.ai/blog/evaluation-metrics-binary-classification

Plain Accuracy and Its Problems
Up to this point we have assumed that some simple metric, such as classifier error rate or accuracy, was being used to measure a model's performance.

Accuracy = (number of correct decisions made) / (total number of decisions made)
Error rate = 1 - Accuracy

Accuracy is a common evaluation metric that is often used in data mining studies because it reduces classifier performance to a single number and it is very easy to measure. Unfortunately, it is usually too simplistic for applications of data mining techniques to real business problems.

Plain Accuracy and Its Problems (cont.)
Let's try calculating accuracy for the following model that classified 100 tumors as malignant (the positive class) or benign (the negative class):

• True Positive (TP): Reality: Malignant. ML model predicted: Malignant. Number of TP results: 1
• False Positive (FP): Reality: Benign. ML model predicted: Malignant. Number of FP results: 1
• False Negative (FN): Reality: Malignant. ML model predicted: Benign. Number of FN results: 8
• True Negative (TN): Reality: Benign. ML model predicted: Benign. Number of TN results: 90

Accuracy = (TP + TN) / (TP + FP + FN + TN) = (1 + 90) / 100 = 0.91

Accuracy comes out to 0.91, or 91% (91 correct predictions out of 100 total examples). That means our tumor classifier is doing a great job of identifying malignancies, right?
Of the 91 benign tumors, the model correctly identifies 90 as benign. That's good. However, of the 9 malignant tumors, the model correctly identifies only 1 as malignant. This is a terrible outcome: 8 out of 9 malignancies go undiagnosed!

The Confusion Matrix (CM)
A confusion matrix for a problem involving n classes is an n × n matrix with the columns labeled with actual classes and the rows labeled with predicted classes. A confusion matrix separates out the decisions made by the classifier, making explicit how one class is being confused for another. In this way different sorts of errors may be dealt with separately. Here is the 2 × 2 confusion matrix.

The Confusion Matrix (cont.)
                     Actual Classes
Predicted Classes    p                  n
Y                    True Positives     False Positives
N                    False Negatives    True Negatives

In the table, there are four terms we need to pay attention to:
True Positive (TP): patients who are predicted to have the illness are indeed carriers.
True Negative (TN): patients who are predicted not to have the illness are truly healthy.
False Positive (FP): patients who are predicted to have the illness are in fact healthy.
False Negative (FN): patients who are predicted not to have the illness are actually carriers.
FP and FN are also known as Type I error and Type II error, respectively.

The Confusion Matrix (cont.)
With the CM, we can calculate two important quantities, Precision and Recall.
Precision: the ratio of patients who actually have the disease to all patients predicted to have it. In other words, how many of the positive predictions are actually correct?

Precision = TP / (TP + FP)

Recall (also called Sensitivity): of those who actually have the disease, how many are correctly predicted by our model? In other words, how many of the actual positives does our model find?

Recall = TP / (TP + FN)

The Confusion Matrix (cont.)
[Figure: Type I Error (False Positive) and Type II Error (False Negative). Source: medium.com; image from Effect Size FAQs by Paul Ellis.]
Note:
True / False: indicates whether what we predicted is correct or not.
Positive / Negative: indicates what we predicted (yes or no).
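The accuracy, precision, and recall definitions above can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the class name ConfusionCounts and the helper _ratio are hypothetical, and the tumor counts (TP = 1, FP = 1, FN = 8, TN = 90) are the values consistent with the slide's figures (100 tumors, 91 benign, 90 true negatives, accuracy 0.91). The "always benign" classifier is an extra illustration, assumed here only to show why accuracy alone can mislead.

```python
import math
from dataclasses import dataclass


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the ratio is undefined."""
    return numerator / denominator if denominator else math.nan


@dataclass(frozen=True)
class ConfusionCounts:
    """Cell counts of a 2 x 2 confusion matrix (hypothetical helper class)."""
    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative (Type I error)
    fn: int  # predicted negative, actually positive (Type II error)
    tn: int  # predicted negative, actually negative

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    @property
    def accuracy(self) -> float:
        # Correct decisions divided by all decisions.
        return _ratio(self.tp + self.tn, self.total)

    @property
    def precision(self) -> float:
        # Of everything predicted positive, the fraction that is truly positive.
        return _ratio(self.tp, self.tp + self.fp)

    @property
    def recall(self) -> float:
        # Of everything truly positive, the fraction the model found.
        return _ratio(self.tp, self.tp + self.fn)


# Counts assumed from the tumor example (positive class = malignant).
tumor_model = ConfusionCounts(tp=1, fp=1, fn=8, tn=90)

# Illustrative comparison: a classifier that labels every tumor benign.
always_benign = ConfusionCounts(tp=0, fp=0, fn=9, tn=91)

for name, counts in [("tumor model", tumor_model), ("always benign", always_benign)]:
    print(
        f"{name}: accuracy={counts.accuracy:.2f} "
        f"precision={counts.precision:.2f} recall={counts.recall:.2f}"
    )
```

Under these assumed counts the tumor model has accuracy 0.91 but precision 0.5 and recall 1/9 (about 0.11). The always-benign classifier reaches the same accuracy with recall 0, which is the problem with plain accuracy on unbalanced classes.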