Severity Prediction based on Cytokine and Clinical Data

Part of the document Investigating the 2005 Singaporean dengue outbreak (pages 142 - 149)

3.1 Decision Tree Analyses of Clinical Data

3.1.2 Prediction of Disease Severity in Dengue Patients

3.1.2.7 Severity Prediction based on Cytokine and Clinical Data

We also examined the influence of cytokines on classifier performance, in particular to assess whether specific cytokines could serve as markers for more severe infections. To allow a comparison, we first calculated a decision tree on the 89 dengue-positive patients with cytokine data, excluding the cytokine features themselves. The remaining 36 patients were excluded because no cytokine data were available. The resulting tree (SEVERE_EXCYT_89) (Figure 3.24), calculated without TEMP_1 and with a pruning confidence of 25% and “minimum cases” set to 10, was identical to the tree (Figure 3.20) calculated on the total of 133 dengue patients. It used PLT_1 <= 108 (OR: 40.53; 95%CI: 32.40, 48.65), CT_1ST_COLLECTION <= 20.9 (OR: 13.53; 95%CI: 5.55, 21.51) and DV_IG_G_1 = ‘positive’ (OR: 9.75; 95%CI: 6.04, 13.46)¹³ as the classification criteria for more severe infections (Table 3.45; Table 3.46). The tree had a misclassification rate of 20.28% with a sensitivity of 77% and a specificity of 83% (Table 3.47; Figure 3.25). The AUC for the high classification (0.7933; 95%CI: 0.70, 0.88) was minimally higher than the AUC for the low classification (0.7901; 95%CI: 0.69, 0.89), indicating a similar overall performance of 0.79. The average profit of the chosen probabilistic classifier was 0.594.

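13 TEMP=body temperature; PLT=platelet count; CT_1st_COLLECTION=viral load whereby a high Ct-value indicates a low viral load; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; 1=1st visit data.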

ROOT: 33 low / 56 HIGH
    PLT_1 <= 108: 14 LOW / 1 high
    PLT_1 > 108: 19 low / 55 HIGH
        CT_1ST_COLLECTION <= 20.9: 19 low / 33 HIGH
            DV_IG_G_1 = positive: 13 LOW / 6 high
            DV_IG_G_1 = negative: 6 low / 27 HIGH
        CT_1ST_COLLECTION > 20.9: 0 low / 22 HIGH

Figure 3.24: SEVERE_EXCYT_89: Decision tree for severity prediction calculated on 89 patients excluding cytokine data. PLT=platelet count; CT_1st_COLLECTION=viral load whereby a high Ct-value indicates a low viral load; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; 1=1st visit data.
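The tree in Figure 3.24 can be read as a short chain of rules. The following is a minimal sketch of that reading, not the software used in the thesis; the function name and argument names are illustrative, and the returned labels are the majority classes ("low" or "high") shown at each leaf of the figure.

```python
def predict_severity_excyt(plt_1, ct_1st_collection, dv_igg_positive):
    """Return the majority class of the leaf a patient falls into.

    plt_1: platelet count at the first visit [x1000/mm3]
    ct_1st_collection: Ct value at first collection (high Ct = low viral load)
    dv_igg_positive: True if DV_IG_G_1 is positive (secondary infection)
    """
    if plt_1 <= 108:
        return "low"    # leaf: 14 low / 1 high
    if ct_1st_collection > 20.9:
        return "high"   # leaf: 0 low / 22 high
    if dv_igg_positive:
        return "low"    # leaf: 13 low / 6 high
    return "high"       # leaf: 6 low / 27 high


# Hypothetical patient, values chosen only to exercise the third rule.
print(predict_severity_excyt(plt_1=150, ct_1st_collection=18.0, dv_igg_positive=True))
```

The rules are checked in the order in which the splits appear in the tree, so a patient with PLT_1 <= 108 is classified at the first split regardless of the other two features.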

Table 3.45: SEVERE_EXCYT_89: Decision tree for severity prediction calculated on 89 patients excluding cytokine data. Statistical analysis of splitting criteria performed on the whole dataset. PLT=platelet count; CT_1st_COLLECTION=viral load whereby a high Ct-value indicates a low viral load; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; 1=1st visit data; RR=relative risk; OR=odds ratio; CI=confidence interval.

| Decision node feature | Cut-off value | RR | OR | 95% CI (OR) | p value |
|---|---|---|---|---|---|
| PLT_1 [×1000/mm³] | <= 108 | 3.64 | 40.53 | 32.40, 48.65 | < 0.001 |
| CT_1ST_COLLECTION | <= 20.9 | 6.05 | 10.80 | 6.20, 15.40 | < 0.001 |
| DV_IG_G_1 | = positive | 1.91 | 2.90 | 0.43, 5.30 | 0.026 |

Table 3.46: SEVERE_EXCYT_89: Decision tree for severity prediction calculated on 89 patients excluding cytokine data. Statistical analysis of splitting criteria performed on each subgroup at the decision nodes. Where the original contingency table contained a 0 value, OR calculations were adjusted by adding 1 to each table value (marked +1). PLT=platelet count; CT_1st_COLLECTION=viral load whereby a high Ct-value indicates a low viral load; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; 1=1st visit data; RR=relative risk; OR=odds ratio; CI=confidence interval.

| Decision node feature | Cut-off value | RR | OR | 95% CI (OR) | p value |
|---|---|---|---|---|---|
| PLT_1 [×1000/mm³] | <= 108 | 3.64 | 40.53 | 32.40, 48.65 | < 0.001 |
| CT_1ST_COLLECTION (+1) | <= 20.9 | 8.89 | 13.53 | 5.55, 21.51 | < 0.001 |
| DV_IG_G_1 | = positive | 3.76 | 9.75 | 6.04, 13.46 | 0.001 |
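The RR and OR values in Table 3.46 can be checked against the node counts in Figure 3.24. The sketch below assumes that each 2x2 table has the cut-off branch versus the other branch as rows and low versus high as columns; this orientation is inferred from the figure and is not stated in the text. The function name is illustrative, and confidence intervals are left out because the method used for them is not described in this section.

```python
def risk_measures(a, b, c, d):
    """Relative risk and odds ratio for a 2x2 table.

    a, b: low / high counts in the branch that meets the cut-off
    c, d: low / high counts in the other branch
    If any cell is 0, 1 is added to every cell (the "+1" adjustment).
    """
    if 0 in (a, b, c, d):
        a, b, c, d = a + 1, b + 1, c + 1, d + 1
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return round(rr, 2), round(odds_ratio, 2)


# Counts read from Figure 3.24.
print(risk_measures(14, 1, 19, 55))   # PLT_1 <= 108 vs > 108 -> (3.64, 40.53)
print(risk_measures(19, 33, 0, 22))   # CT_1ST_COLLECTION <= 20.9 vs > 20.9 -> (8.89, 13.53)
```

Both results agree with the PLT_1 and CT_1ST_COLLECTION rows of Table 3.46, which supports the assumed table orientation.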

Table 3.47: SEVERE_EXCYT_89: Summary of K-fold (k=10) cross validation for severity prediction based on 89 patients excluding cytokine data.

| Overall evaluation (n=89) | Value |
|---|---|
| Total misclassifications | 18.0 |
| Overall error rate | 20.278% |
| SE of error rate | 8.886 |
| Average profit | 0.594 |
| SE of profit | 0.178 |
| AUC high | 0.7933 (95%CI: 0.70, 0.88) |
| AUC low | 0.7901 (95%CI: 0.69, 0.89) |

| Confusion matrix | Predicted high | Predicted low |
|---|---|---|
| Actual high | 46 (83%) | 10 (17%) |
| Actual low | 8 (23%) | 25 (77%) |
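As a rough check, the summary figures can be recomputed from the pooled counts of the confusion matrix in Table 3.47. This sketch treats "low" as the positive class, which is inferred from the reported sensitivity of 77% matching the low row. The pooled values come out close to, but not identical with, the reported ones; one possible reason (an assumption, not stated in the text) is that the reported rates are averaged over the 10 folds.

```python
# Keys are (actual class, predicted class); counts from Table 3.47.
confusion = {
    ("high", "high"): 46, ("high", "low"): 10,
    ("low", "high"): 8,   ("low", "low"): 25,
}

total = sum(confusion.values())
errors = confusion[("high", "low")] + confusion[("low", "high")]

error_rate = errors / total
sensitivity = confusion[("low", "low")] / (confusion[("low", "low")] + confusion[("low", "high")])
specificity = confusion[("high", "high")] / (confusion[("high", "high")] + confusion[("high", "low")])

print(f"misclassified: {errors} of {total}")
print(f"error rate:  {error_rate:.1%}")    # reported: 20.278%
print(f"sensitivity: {sensitivity:.1%}")   # reported: 77%
print(f"specificity: {specificity:.1%}")   # reported: 83%
```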

Figure 3.25: SEVERE_EXCYT_89: Receiver operating characteristics (ROC) curve for severity prediction calculated on 89 patients excluding cytokine data.
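A decision tree yields an ROC curve once each leaf is given a score, for example the proportion of "low" cases in that leaf. The sketch below computes the area under such a curve directly from the leaf counts in Figure 3.24, as the probability that a randomly chosen low case receives a higher score than a randomly chosen high case, with ties counted as one half. Scoring by leaf proportion is an assumption, and the result is a resubstitution estimate on the data the tree was built from, so it is expected to be more optimistic than the cross-validated AUC of about 0.79.

```python
# (low, high) counts per leaf, read from Figure 3.24.
leaves = [(14, 1), (13, 6), (6, 27), (0, 22)]


def resubstitution_auc(leaves):
    """AUC for the 'low' class when each leaf is scored by its share of low cases."""
    scored = [(low / (low + high), low, high) for low, high in leaves]
    n_low = sum(low for _, low, _ in scored)
    n_high = sum(high for _, _, high in scored)
    wins = 0.0
    for score_i, low_i, _ in scored:
        for score_j, _, high_j in scored:
            if score_i > score_j:
                wins += low_i * high_j          # low case ranked above high case
            elif score_i == score_j:
                wins += 0.5 * low_i * high_j    # same leaf: tie
    return wins / (n_low * n_high)


print(round(resubstitution_auc(leaves), 2))  # about 0.90 on the training counts
```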

Including the cytokine data and leaving the technical tree parameters unchanged (“minimum cases” set to 10 with a pruning confidence of 25%) did not change the tree splitting criteria, but the overall performance of the tree decreased, suggesting noise caused by interference between cytokine and clinical data. Therefore, we constructed a tree that excluded TEMP_1 and CT_1ST_COLLECTION. The resulting tree (SEVERE_INCYTA_89) (Figure 3.26) was similar to the one calculated without the cytokine data (Figure 3.24), with the CT_1ST_COLLECTION split replaced by IP_10_1 (OR: 12.75; 95%CI: 7.98, 17.52)¹⁴ (Table 3.48; Table 3.49). The chosen classifier had a higher profit (0.617 versus 0.594) and a higher specificity (86% versus 83%) but a lower sensitivity (74% versus 77%), and the resulting overall error rate was 19.17% (Table 3.50; Figure 3.27). The AUC for both the low and the high classification was 0.79, but the two groups showed slightly different confidence intervals (low 95%CI: 0.68, 0.89; high 95%CI: 0.69, 0.88).

14 TEMP=body temperature; CT_1st_COLLECTION=viral load whereby a high Ct-value indicates a low viral load; PLT=platelet count; IP_10=interferon-inducible protein 10; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; 1=1st visit data.

ROOT: 33 low / 56 HIGH
    PLT_1 <= 108: 14 LOW / 1 high
    PLT_1 > 108: 19 low / 55 HIGH
        IP_10_1 > 1697.9: 17 low / 22 HIGH
            DV_IG_G_1 = positive: 12 LOW / 4 high
            DV_IG_G_1 = negative: 5 low / 18 HIGH
        IP_10_1 <= 1697.9: 2 low / 33 HIGH

Figure 3.26: SEVERE_INCYTA_89: Decision tree for severity prediction calculated on 89 patients including cytokine data. PLT=platelet count; IP_10=interferon-inducible protein 10; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; 1=1st visit data.

Table 3.48: SEVERE_INCYTA_89: Decision tree for severity prediction calculated on 89 patients including cytokine data. Statistical analysis of splitting criteria performed on the whole dataset. PLT=platelet count; IP_10=interferon-inducible protein 10; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; 1=1st visit data; RR=relative risk; OR=odds ratio; CI=confidence interval.

| Decision node feature | Cut-off value | RR | OR | 95% CI (OR) | p value |
|---|---|---|---|---|---|
| PLT_1 [×1000/mm³] | <= 108 | 3.64 | 40.53 | 32.40, 48.65 | < 0.001 |
| IP_10_1 [pg/ml] | > 1697.9 | 5.40 | 11.20 | 7.97, 14.44 | < 0.001 |
| DV_IG_G_1 | = positive | 1.91 | 2.87 | 0.43, 5.30 | 0.026 |

Table 3.49: SEVERE_INCYTA_89: Decision tree for severity prediction calculated on 89 patients including cytokine data. Statistical analysis of splitting criteria performed on each subgroup at the decision nodes. PLT=platelet count; IP_10=interferon-inducible protein 10; DV_IG_G=indicator for primary/secondary infection whereby a positive result indicates a secondary infection; 1=1st visit data; RR=relative risk; OR=odds ratio; CI=confidence interval.

| Decision node feature | Cut-off value | RR | OR | 95% CI (OR) | p value |
|---|---|---|---|---|---|
| PLT_1 [×1000/mm³] | <= 108 | 3.64 | 40.53 | 32.40, 48.65 | < 0.001 |
| IP_10_1 [pg/ml] | <= 1697.9 | 7.63 | 12.75 | 7.98, 17.52 | < 0.001 |
| DV_IG_G_1 | = positive | 3.45 | 10.80 | 6.30, 15.30 | 0.003 |

Table 3.50: SEVERE_INCYTA_89: Summary of K-fold (k=10) cross validation for severity prediction based on 89 patients including cytokine data.

| Overall evaluation (n=89) | Value |
|---|---|
| Total misclassifications | 17.0 |
| Overall error rate | 19.167% |
| SE of error rate | 7.686 |
| Average profit | 0.617 |
| SE of profit | 0.154 |
| AUC high | 0.7865 (95%CI: 0.69, 0.88) |
| AUC low | 0.7874 (95%CI: 0.68, 0.89) |

| Confusion matrix | Predicted high | Predicted low |
|---|---|---|
| Actual high | 48 (86%) | 8 (14%) |
| Actual low | 9 (26%) | 24 (74%) |

Figure 3.27: SEVERE_INCYTA_89: Receiver operating characteristics (ROC) curve for severity
