Severity Prediction based on Clinical Data and only using hospitalized Cases

Part of the document: Investigating the 2005 Singaporean dengue outbreak (pages 138 - 142)

3.1 Decision Tree Analyses of Clinical Data

3.1.2 Prediction of Disease Severity in Dengue Patients

3.1.2.6 Severity Prediction based on Clinical Data and only using hospitalized Cases

The second tree (SEVHOSP_TOTAL_71) was constructed using only first-visit data from hospitalized cases, giving a dataset of 71 samples. Pruning confidence was set to the standard value of 25% and “minimum cases” was set to 4. The overall performance of the decision tree excluding body temperature (TEMP_1) was slightly worse, with a higher sensitivity (84%) but a lower specificity (70%) (Table 3.44; Figure 3.23). The AUC of the ROC curve was 0.80 for both the more severe cases (95% CI: 0.69, 0.90) and the mild cases (95% CI: 0.69, 0.91). The classifier correctly predicted 55 cases, with an overall misclassification rate of 22.7%.

Interestingly, the tree (Figure 3.22) still used PLT_1 <= 108 (OR: 25.45; 95% CI: 17.41, 33.50) as the main decision criterion to detect the low cases. Patients with higher platelet counts were further separated into secondary and primary infections. Patients with DV_IG_G_1 = ‘positive’ (OR: 5.63; 95% CI: 2.21, 9.05) were sub-grouped first by CT_1st_COLLECTION <= 20.94 (OR: 29.75; 95% CI: 19.14, 40.36) and then by MCHC_1 <= 34.5 (OR: 32.00; 95% CI: 18.00, 46.00) as the last decision node, both criteria indicating classification into the low group. Cases categorized as primary infections (DV_IG_G_1 = ‘negative’) were further grouped using DIASTOLIC1 > 67.0 (OR: 11.38; 95% CI: 1.67, 21.08) as a threshold for low cases; all cases at or below this value fell into the high group (Figure 3.22). Patients with the higher diastolic blood pressure were then separated by systolic blood pressure, and SYSTOLICBP_1 <= 119.0 (OR: 14.00; 95% CI: 1.67, 26.33) was an indicator for the more severe group, low (see footnote 12) (Table 3.42; Table 3.43).

12 PLT=platelet count; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; CT_1st_COLLECTION=viral load whereby a high Ct-value indicates a low viral load; MCHC=mean corpuscular hemoglobin concentration; DIASTOLIC=diastolic blood

ROOT: 42 LOW / 29 high
    PLT_1 <= 108: 20 LOW / 1 high
    PLT_1 > 108: 22 low / 28 HIGH
        DV_IG_G_1 = positive: 16 LOW / 9 high
            CT_1ST_COLLECTION <= 20.94: 16 LOW / 3 high
                MCHC_1 <= 34.5: 15 LOW / 0 high
                MCHC_1 > 34.5: 1 low / 3 HIGH
            CT_1ST_COLLECTION > 20.94: 0 low / 6 HIGH
        DV_IG_G_1 = negative: 6 low / 19 HIGH
            DIASTOLIC1 <= 67.0: 0 low / 12 HIGH
            DIASTOLIC1 > 67.0: 6 low / 7 HIGH
                SYSTOLICBP_1 <= 119.0: 6 LOW / 2 high
                SYSTOLICBP_1 > 119.0: 0 low / 5 HIGH

Figure 3.22: SEVHOSP_TOTAL_71: Decision tree for severity prediction calculated on 71 hospitalized patients excluding cytokine data. PLT=platelet count; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; CT_1st_COLLECTION=viral load whereby a high Ct-value indicates a low viral load; MCHC=mean corpuscular hemoglobin concentration; DIASTOLIC=diastolic blood pressure; SYSTOLICBP=systolic blood pressure; 1=1st visit data.
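The splits shown in Figure 3.22 can be read as a chain of nested rules. The following is a minimal sketch of that reading, not the software used in the thesis: the function name `predict_severity` and its argument names are illustrative assumptions, and the thresholds are copied from the figure.

```python
def predict_severity(plt_1, dv_ig_g_1, ct_1st_collection, mchc_1,
                     diastolic1, systolicbp_1):
    """Return 'low' or 'high' by following the splits of Figure 3.22.

    Hypothetical helper: the name and signature are assumptions made
    for illustration; only the thresholds come from the figure.
    """
    # Main split: low platelet count goes straight to the 'low' class.
    if plt_1 <= 108:
        return "low"

    if dv_ig_g_1 == "positive":  # secondary infection
        # A low Ct value (high viral load) is refined further by MCHC.
        if ct_1st_collection <= 20.94:
            return "low" if mchc_1 <= 34.5 else "high"
        return "high"

    # Primary infection (DV_IG_G_1 = 'negative').
    if diastolic1 <= 67.0:
        return "high"
    return "low" if systolicbp_1 <= 119.0 else "high"


# Made-up first-visit values, for illustration only.
print(predict_severity(95, "negative", 25.0, 33.0, 70.0, 110.0))
print(predict_severity(150, "positive", 18.5, 34.0, 70.0, 110.0))
print(predict_severity(150, "negative", 25.0, 33.0, 60.0, 110.0))
```

The three calls exercise the platelet branch, the secondary-infection branch, and the diastolic branch of the tree respectively.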

Table 3.42: SEVHOSP_TOTAL_71: Decision tree for severity prediction calculated on 71 hospitalized patients excluding cytokine data. Statistical analysis of splitting criteria performed on the whole dataset. PLT=platelet count; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; CT_1st_COLLECTION=viral load whereby a high Ct-value indicates a low viral load; MCHC=mean corpuscular hemoglobin concentration; DIASTOLIC=diastolic blood pressure; SYSTOLICBP=systolic blood pressure; 1=1st visit data; RR=relative risk; OR=odds ratio; CI=confidence interval.

Decision Node Feature | Cut-off value | RR | OR | 95% CI (OR) | p value
PLT_1 [*1000/mm3] | <= 108 | 2.16 | 25.45 | 17.41, 33.50 | < 0.001
DV_IG_G_1 | = positive | 1.65 | 3.42 | 0.72, 6.12 | 0.017
CT_1ST_COLLECTION | <= 20.94 | 1.74 | 3.02 | -0.78, 6.83 | 0.11
MCHC_1 | <= 34.5 | 0.78 | 0.52 | -2.71, 3.75 | 0.397
DIASTOLIC1 | > 67.0 | 1.21 | 1.57 | -1.11, 4.26 | 0.451
SYSTOLICBP_1 | <= 119.0 | 0.96 | 0.90 | -1.74, 3.54 | 0.451

Table 3.43: SEVHOSP_TOTAL_71: Decision tree for severity prediction calculated on 71 patients excluding cytokine data. Statistical analysis of splitting criteria performed on each subgroup at the decision nodes. Where the original contingency table contained a 0 value, OR calculations were adjusted by adding 1 to each table value (features marked +1). PLT=platelet count; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; CT_1st_COLLECTION=viral load whereby a high Ct-value indicates a low viral load; MCHC=mean corpuscular hemoglobin concentration; DIASTOLIC=diastolic blood pressure; SYSTOLICBP=systolic blood pressure; 1=1st visit data; RR=relative risk; OR=odds ratio; CI=confidence interval.

Decision Node Feature | Cut-off value | RR | OR | 95% CI (OR) | p value
PLT_1 [*1000/mm3] | <= 108 | 2.16 | 25.45 | 17.41, 33.50 | < 0.001
DV_IG_G_1 | = positive | 2.67 | 5.63 | 2.21, 9.05 | 0.01
CT_1ST_COLLECTION +1 | <= 20.94 | 6.48 | 29.75 | 19.14, 40.36 | < 0.001
MCHC_1 [g/dl] +1 | <= 34.5 | 2.82 | 32.00 | 18.00, 46.00 | < 0.001
DIASTOLIC1 [mmHg] +1 | > 67.0 | 6.53 | 11.38 | 1.67, 21.08 | 0.015
SYSTOLICBP_1 [mmHg] +1 | <= 119.0 | 1.28 | 14.00 | 1.67, 26.33 | 0.015
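To make the +1 adjustment concrete, the sketch below computes the OR and RR for a single split from its 2x2 table of low/high counts, adding 1 to every cell when any cell is 0. The function name `odds_ratio_and_relative_risk` and its argument layout are assumptions for illustration. The confidence intervals and p values are not reproduced, because the text does not state how they were calculated.

```python
def odds_ratio_and_relative_risk(a, b, c, d):
    """Return (OR, RR) for the 'low' class at one decision node.

    a, b: low / high counts in the branch that meets the cut-off
    c, d: low / high counts in the other branch
    Hypothetical helper; the layout of the arguments is an assumption.
    """
    # Adjustment described in the caption of Table 3.43: if any cell
    # is 0, add 1 to each of the four table values.
    if 0 in (a, b, c, d):
        a, b, c, d = a + 1, b + 1, c + 1, d + 1

    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))
    return odds_ratio, relative_risk


# Counts read from the nodes of Figure 3.22.
splits = {
    "PLT_1 <= 108": (20, 1, 22, 28),
    "CT_1ST_COLLECTION <= 20.94": (16, 3, 0, 6),
    "MCHC_1 <= 34.5": (15, 0, 1, 3),
}
for name, counts in splits.items():
    odds, risk = odds_ratio_and_relative_risk(*counts)
    print(f"{name}: OR={odds:.4f}, RR={risk:.4f}")
```

With the node counts from Figure 3.22, these three splits give values that agree, after rounding to two decimals, with the OR and RR columns of Table 3.43 for PLT_1, CT_1ST_COLLECTION and MCHC_1.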

Table 3.44: SEVHOSP_TOTAL_71: Summary of K-fold (k=10) cross validation for severity prediction based on 71 hospitalized patients excluding cytokine data.

Overall Evaluation | Value (n=71)
Total misclassifications | 16.0
Overall error rate | 22.679%
SE of error rate | 19.380
Average profit | 0.546
SE of profit | 0.388
AUC high | 0.7991 (95% CI: 0.69, 0.91)
AUC low | 0.7962 (95% CI: 0.69, 0.90)

Confusion Matrix
Actual Class | Predicted high | Predicted low
high | 20 (70%) | 9 (30%)
low | 7 (17%) | 35 (83%)
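The summary figures can be related back to the confusion matrix with a few lines of arithmetic. This is a sketch only: it assumes that the more severe class, low, is treated as the positive class, and the helper name `summarize` is hypothetical. Values pooled from the matrix in this way can differ slightly from the fold-averaged figures reported by the cross validation.

```python
# Confusion matrix from Table 3.44: outer key = actual class,
# inner key = predicted class.
confusion = {
    "high": {"high": 20, "low": 9},
    "low": {"high": 7, "low": 35},
}


def summarize(matrix, positive="low"):
    """Pooled accuracy measures for a two-class confusion matrix.

    Hypothetical helper; treating 'low' as the positive class is an
    assumption, not something the table states.
    """
    negative = next(label for label in matrix if label != positive)
    tp = matrix[positive][positive]
    fn = matrix[positive][negative]
    tn = matrix[negative][negative]
    fp = matrix[negative][positive]
    total = tp + fn + tn + fp
    return {
        "correct": tp + tn,
        "error_rate": (fp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


stats = summarize(confusion)
print(stats["correct"])
print(f"{stats['error_rate']:.3f}")
print(f"{stats['sensitivity']:.3f}")
print(f"{stats['specificity']:.3f}")
```

Pooling the matrix gives 55 correct predictions and 16 misclassifications out of 71, in line with the counts quoted for this tree.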

Figure 3.23: SEVHOSP_TOTAL_71: Receiver operating characteristics (ROC) curve for severity prediction calculated on 71 hospitalized patients excluding cytokine data.
