The following code uses PROC TABULATE to create the decile analysis, a table that calculates the number of observations (records) in each decile, the average predicted probability per decile, the percent active of responders (the target of Method 2, model 2), the response rate (the target of Method 2, model 1), and the active rate (the target of Method 1).

title1 "Decile Analysis - Activation Model - One Step";
title2 "Model Data - Score Selection";

proc tabulate data=acqmod.mod_dec;
weight smp_wgt;
class mod_dec;
var respond active pred records activ_r;
table mod_dec='Decile' all='Total',
      records='Prospects'*sum=' '*f=comma10.
      pred='Predicted Probability'*(mean=' '*f=11.5)
      activ_r='Percent Active of Responders'*(mean=' '*f=11.5)
      respond='Percent Respond'*(mean=' '*f=11.5)
      active='Percent Active'*(mean=' '*f=11.5)
      /rts = 9 row=float;
run;

In Figure 5.5, the decile analysis shows the model's ability to rank order the prospects by their active behavior. To clarify, each prospect's predicted probability of becoming active determines its rank. The goal of the model is to rank order the prospects so that the true actives concentrate in the lowest (best) deciles. At first glance we can see that the best decile (0) has 17.5 times as many actives as the worst decile (9). And as we go from decile 0 to decile 9, the percent active value decreases monotonically, with a strong drop in the first three deciles. The only exception is two deciles in the middle that share the same percent active rate. This is not unusual, since the model is most powerful in deciles 0 and 9, where it gets the best separation. Overall, the model score does a good job of targeting active accounts. Because the model was built on the data used in Figure 5.5, a better test will be on the validation data set.

Figure 5.5 Decile analysis using model data.

Another consideration is how closely the "Percent Active" matches the "Predicted Probability." The values in these columns for each decile are not as close as they could be. If my sample had been larger, they would probably be more equal. I will look for similar behavior in the decile analysis for the validation data set.

Preliminary Evaluation

Because I carried the validation data through the model using the missing weights, each time the model is processed, the validation data set is scored along with the model data. By creating a decile analysis on the validation data set, we can evaluate how well the model will transfer its results to similar data. As mentioned earlier, a model that works well on alternate data is said to be robust. In chapter 6, I will discuss additional methods for validation that go beyond simple decile analysis.

The next code listing creates the same table for the validation data set. This provides our first analysis of the model's ability to rank order data other than the model development data. It is a good test of the robustness of the model, or its ability to perform on other prospect data. The code is the same as for the model data decile analysis except for the (where=(splitwgt = .)) option, which accesses the "hold out" sample, or validation data set.
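As a reminder of why that filter isolates the hold-out records: splitwgt was presumably assigned when the modeling file was split, with the development half carrying the sampling weight and the validation half carrying a missing value. A minimal sketch of that convention follows; the input data set name, the seed, and the 50/50 split are my assumptions, not taken from the text:

data acqmod.model2;
   set acqmod.model_file;                         /* hypothetical input file */
   if ranuni(5555) < .5 then splitwgt = smp_wgt;  /* modeling half */
   else splitwgt = .;                             /* validation (hold-out) half */
run;

With that convention, where=(splitwgt = .) selects exactly the validation half, and using splitwgt as the weight in PROC LOGISTIC gives the hold-out records no influence on the fit while still scoring them.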
proc univariate data=acqmod.out_act1 (where=(splitwgt = .)) noprint;
weight smp_wgt;
var pred active;
output out=preddata sumwgt=sumwgt;
run;

data acqmod.val_dec;
set acqmod.out_act1(where=(splitwgt = .)); /* assumes out_act1 is already sorted by descending pred */
if (_n_ eq 1) then set preddata;
retain sumwgt;
number+smp_wgt;
if number < .1*sumwgt then val_dec = 0; else
if number < .2*sumwgt then val_dec = 1; else
if number < .3*sumwgt then val_dec = 2; else
if number < .4*sumwgt then val_dec = 3; else
if number < .5*sumwgt then val_dec = 4; else
if number < .6*sumwgt then val_dec = 5; else
if number < .7*sumwgt then val_dec = 6; else
if number < .8*sumwgt then val_dec = 7; else
if number < .9*sumwgt then val_dec = 8; else
val_dec = 9;
activ_r = (activate = '1'); /* flag for active responders */
run;

title1 "Decile Analysis - Activation Model - One Step";
title2 "Validation Data - Score Selection";

proc tabulate data=acqmod.val_dec;
weight smp_wgt;
class val_dec;
var respond active pred records activ_r;
table val_dec='Decile' all='Total',
      records='Prospects'*sum=' '*f=comma10.
      pred='Predicted Probability'*(mean=' '*f=11.5)
      activ_r='Percent Active of Responders'*(mean=' '*f=11.5)
      respond='Percent Respond'*(mean=' '*f=11.5)
      active='Percent Active'*(mean=' '*f=11.5)
      /rts = 9 row=float;
run;

The validation decile analysis seen in Figure 5.6 shows slight degradation from the original model. This is to be expected, but the rank ordering is still strong, with the best decile attracting almost seven times as many actives as the worst decile.

Figure 5.6 Decile analysis using validation data.

We see the same degree of difference between the "Predicted Probability" and the actual "Percent Active" as we saw in the decile analysis of the model data in Figure 5.5. Decile 0 shows the most dramatic difference, but the other deciles follow a pattern similar to the model data. There is also a little flip-flop between deciles 5 and 6, but the degree is minor and probably reflects nuances in the data. In chapter 6, I will perform some more general types of validation, which will determine whether this is a real problem.

Method 2: Two Models — Response

The process for two models is similar to the process for the single model. The only difference is that the response and activation models are processed separately through the stepwise, backward, and score selection methods. The code differences are highlighted here:

proc logistic data=acqmod.model2 (keep=variables) descending;
weight splitwgt;
model respond = variables
/selection = stepwise sle=.3 sls=.3;
run;

proc logistic data=acqmod.model2 (keep=variables) descending;
weight splitwgt;
model respond = variables
/selection = backward sls=.3;
run;

proc logistic data=acqmod.model2 (keep=variables) descending;
weight splitwgt;
model respond = variables
/selection = score best=2;
run;

The output from the Method 2 response models is similar to the Method 1 approach. Figure 5.7 shows the decile analysis for the response model calculated on the validation data set. It shows strong rank ordering for response. The rank ordering for activation is a little weaker, which is to be expected, because there are different drivers for response and activation. Because activation is strongly driven by response, though, the ranking for activation is still strong.

Figure 5.7 Method 2 response model decile analysis.

Method 2: Two Models — Activation

As I process the model for predicting activation given response (active|response), recall that I can use the variable activate because it has a missing value for nonresponders. This means that the nonresponders will be eliminated from the model processing.
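The dependent variable was presumably set up along these lines when the modeling file was built. The exact statements are my assumption (the activ_r flag above suggests activate is a one-byte character variable), but the convention of a missing value for nonresponders is what causes PROC LOGISTIC to drop them:

data acqmod.model2;
   length activate $ 1;
   set acqmod.model2;
   /* nonresponders get a missing target, so they fall out of */
   /* the activation (active|response) model automatically    */
   if respond = 0 then activate = ' ';
   else if active = 1 then activate = '1';
   else activate = '0';
run;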
The following code processes the model:

proc logistic data=acqmod.model2 (keep=variables) descending;
weight splitwgt;
model activate = variables
/selection = stepwise sle=.3 sls=.3;
run;

proc logistic data=acqmod.model2 (keep=variables) descending;
weight splitwgt;
model activate = variables
/selection = backward sls=.3;
run;

proc logistic data=acqmod.model2 (keep=variables) descending;
weight splitwgt;
model activate = variables
/selection = score best=2;
run;

The output from the Method 2 activation models is similar to the Method 1 approach. Figure 5.8 shows the decile analysis for the activation model calculated on the validation data set. It shows strong rank ordering for activation given response. As expected, it is weak when predicting activation for the entire file. Our next step is to compare the results of the two methods.

Figure 5.8 Method 2 activation model decile analysis.

Comparing Method 1 and Method 2

At this point, I have several options for the final model. I have a single model that predicts the probability of an active account, created using Method 1, the single-model approach. And I have two models from Method 2: one that predicts the probability of response, and one that predicts the probability of an active account given response (active|response). To compare the performance of the two methods, I must combine the models developed in Method 2. To do this, I use a simplified form of Bayes' Theorem. Let's say:

P(R) = the probability of response (model 1 in Method 2)
P(A|R) = the probability of becoming active given response (model 2 in Method 2)
P(A and R) = the probability of responding and becoming active

Then:

P(A and R) = P(R) * P(A|R)

Therefore, to get the probability of responding and becoming active, I multiply the probabilities created in model 1 and model 2. For example, a prospect with a predicted response probability of .05 and a predicted activation-given-response probability of .40 has a combined probability of .05 * .40 = .02 of becoming active.

Following the processing of the score selection for each of the two models in Method 2, I reran the models with the final variables and created two output data sets that contain the predicted scores: acqmod.out_rsp2 and acqmod.out_act2. The following code takes the output data sets from the Method 2 models built using the score option. The where=(splitwgt = .) option designates that both probabilities are taken from the validation data set. Because the same sample was used to build both models in Method 2, the records should match up exactly when merged by pros_id. The rename=(pred=predrsp) option creates a different name for each model's predicted value.

proc sort data=acqmod.out_rsp2 out=acqmod.validrsp
   (where=(splitwgt = .) rename=(pred=predrsp));
by pros_id;
run;

proc sort data=acqmod.out_act2 out=acqmod.validact
   (where=(splitwgt = .) rename=(pred=predact));
by pros_id;
run;
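Because the match-merge that follows assumes a one-to-one pairing on pros_id, a quick count of both sorted files is a cheap safeguard. This check is my own addition, not part of the original program:

proc sql;
   /* the two counts should be identical before merging */
   select count(*) as n_validrsp from acqmod.validrsp;
   select count(*) as n_validact from acqmod.validact;
quit;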
data acqmod.blend;
merge acqmod.validrsp acqmod.validact;
by pros_id;
run;

data acqmod.blend;
set acqmod.blend;
predact2 = predrsp*predact; /* P(A and R) = P(R) * P(A|R) */
run;

To compare the models, I create a decile analysis for the probability of becoming active derived using Method 2 (predact2) with the following code:

proc sort data=acqmod.blend;
by descending predact2;
run;

proc univariate data=acqmod.blend noprint;
weight smp_wgt;
var predact2;
output out=preddata sumwgt=sumwgt;
run;

data acqmod.blend;
set acqmod.blend;
if (_n_ eq 1) then set preddata;
retain sumwgt;
number+smp_wgt;
if number < .1*sumwgt then act2dec = 0; else
if number < .2*sumwgt then act2dec = 1; else
if number < .3*sumwgt then act2dec = 2; else
if number < .4*sumwgt then act2dec = 3; else
if number < .5*sumwgt then act2dec = 4; else
if number < .6*sumwgt then act2dec = 5; else
if number < .7*sumwgt then act2dec = 6; else
if number < .8*sumwgt then act2dec = 7; else
if number < .9*sumwgt then act2dec = 8; else
act2dec = 9;
run;

title1 "Decile Analysis - Model Comparison";
title2 "Validation Data - Two Step Model";

proc tabulate data=acqmod.blend;
weight smp_wgt;
class act2dec;
var active predact2 records;
table act2dec='Decile' all='Total',
      records='Prospects'*sum=' '*f=comma10.
      predact2='Predicted Probability'*(mean=' '*f=11.5)
      active='Percent Active'*(mean=' '*f=11.5)
      /rts = 9 row=float;
run;

In Figure 5.9, the decile analysis of the combined scores on the validation data for the two-model approach shows slightly better performance than the one-model approach in Figure 5.6. This provides confidence in our results. At first glance, it's difficult to pick the winner.

Figure 5.9 Combined model decile analysis.

Summary

This chapter allowed us to enjoy the fruits of our labor. I built several models with strong power to rank order the prospects by their propensity to become active. We saw that many of the segmented and transformed variables dominated the models. And we explored several methods for finding the best-fitting model using two distinct methodologies. In the next chapter, I will measure the robustness of our models and select a winner.

Chapter 6 — Validating the Model

The masterpiece is out of the oven! Now we want to ensure that it was cooked to perfection. It's time for the taste test! Validating the model is a critical step in the process. It allows us to determine whether we've successfully performed all the prior steps. If a model does not validate well, it can be due to data problems, poorly fitting variables, or problematic techniques. There are several methods for validating models. In this chapter, I begin with the basic tools for validating the model: gains tables and gains charts. Marketers and managers love them because they take the modeling results right to the bottom line. Next, I test the results of the model algorithm on an alternate data set. A major section of the chapter focuses on the steps for creating confidence intervals around the model estimates using resampling, which is gaining popularity as an excellent method for determining the robustness of a model. In the final section, I discuss ways to validate the model by measuring its effect on key market drivers.

Gains Tables and Charts

A gains table is an excellent tool for evaluating the performance of a model. It contains actionable information that can be easily understood and used by non- [...]
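Central to the gains table is the lift measure: a decile's active rate divided by the overall active rate, times 100, so that any lift above 100 beats a random mailing. A minimal sketch of that calculation on the validation deciles built earlier; the output data set and variable names are my own, chosen for illustration:

proc means data=acqmod.val_dec noprint;
   weight smp_wgt;
   class val_dec;
   var active;
   output out=bydec mean=actmean;   /* _TYPE_=0 row holds the overall rate */
run;

data liftcalc;
   if _n_ = 1 then set bydec(where=(_type_ = 0) rename=(actmean=actomean));
   retain actomean;
   set bydec(where=(_type_ = 1));   /* one row per decile */
   lift = 100*actmean/actomean;     /* e.g., 2.77 times the average = lift of 277 */
run;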
[...] two or three deciles and sample the remaining deciles. Then develop a new model with the results of the campaign. At this point the Method 1 model and the Method 2 model are running neck and neck. The next section will describe some powerful tools to test the robustness of the models and select a winner.

Resampling

Resampling is a common-sense, nonstatistical technique for estimating and validating models [...]

[...] for evaluating and comparing models. At each decile it demonstrates the model's power to beat the random approach, or average performance. In Figure 6.1, we see that decile 0 is 2.77 times the average. Up to and including decile 3, the model performs better than average. Another way to use lift is in cumulative lift. This means that down to a given "Depth of File," the model performs better than random. If we go through [...]

[...] this point, the macro processing is finished, and we have our 25 bootstrap samples. The following code combines the 25 bootstrap samples with the data set containing the values for the original validation sample (acqmod.fullmean) using the decile value. It then calculates the mean and standard deviation for each estimate: active rate, predicted probability, and lift. Following the formula for bootstrap [...]

[...] observation in the data set. In our case study, the model was developed on a 50% random sample, presumed to be representative of the entire campaign data set. A 50% random sample was held out for validation. In this section, I use jackknifing to estimate the predicted probability of active, the actual active rate, and the lift for each decile, using 100 samples that each contain 99% of the data. I will show the code for this [...]

[...] appended to the data sets, and the lifts are calculated:

data acqmod.jkmns&prcnt;
set acqmod.jkmns&prcnt;
if (_n_ eq 1) then set actomean;
retain actom&prcnt;
liftd&prcnt = 100*actmn&prcnt/actom&prcnt;
run;

After this process is repeated 100 times, the macro is terminated:

%end;
%mend;
%jackknif;

The 100 output files are merged together by decile. The values for the mean and standard deviation are calculated [...]

[...] following formula: [formula not legible in this excerpt] In order to calculate a confidence interval, I must derive a standard error for the bootstrap. I use the standard deviation of the set of bootstrap estimates (BSi). A 95% bootstrap confidence interval is derived using the following formula: [formula not legible in this excerpt] Due to the large samples in marketing data, it is often impractical and unnecessary to pull a number of samples equal to the number of observations [...]

[...] validation technique. This allows us to calculate confidence intervals around our estimates. Two main types of resampling techniques are used in database marketing: jackknifing and bootstrapping. The following discussion and examples highlight and compare the power of these two techniques.

Jackknifing

In its purest form, jackknifing is a resampling technique based on the "leave-one-out" principle. So, if N is [...]
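The two formulas referenced in the fragments above are not legible in this excerpt. In their standard form, the bootstrap standard error is estimated by the standard deviation of the bootstrap estimates, and a 95% confidence interval takes 1.96 standard errors on either side of the estimate. A minimal sketch under that assumption; the data set bs_est and its variables are hypothetical stand-ins for the 25 bootstrap estimates per decile described above:

proc means data=bs_est noprint;
   class decile;
   var actmn;                          /* bootstrap estimates of the active rate */
   output out=bs_ci mean=bsmean std=bsse;
run;

data bs_ci;
   set bs_ci(where=(decile ne .));     /* keep the by-decile rows */
   lci95 = bsmean - 1.96*bsse;         /* lower 95% confidence bound */
   uci95 = bsmean + 1.96*bsse;         /* upper 95% confidence bound */
run;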
[...] distinguishing actives from nonresponders and nonactive responders. The lift measures mimic the similarity.

Figure 6.2 Validation gains chart.

Recall that the response model in the two-model approach did a good job of predicting actives overall. The gains chart in Figure 6.4 compares the activation model from the one-model approach to the response and combination models developed using the [...]

Figure 6.4 Gains chart comparing models from Method 1 and Method 2.

Scoring Alternate Data Sets

In business, the typical purpose of developing a targeting model is to predict behavior on a data set other than that on which the model was developed. In our case study, I have our "hold out" sample that I used for validation. But because it was randomly sampled from the same data set as the model [...]

[...] samples. I will show the code for this process for the Method 1 model. The process will be repeated for the Method 2 model, and the results will be compared. The program begins with the logistic regression to create an output file (acqmod.resamp) that contains only the validation data and a few key variables. Each record is scored with a predicted value (pred).

proc logistic data=acqmod.model2 descending; [...]
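The excerpt breaks off inside that PROC LOGISTIC step. Based on the description, a plausible continuation is sketched below; the target, the variable list, and the output options are assumptions carried over from the earlier listings, not taken from the missing page:

proc logistic data=acqmod.model2 descending;
   weight splitwgt;
   model active = variables;   /* final variables from the score selection */
   /* score every record, but keep only the validation half */
   output out=acqmod.resamp(where=(splitwgt = .)) pred=pred;
run;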