... substitution. For each variable, I also create a new variable to identify which records have missing values. The following code replaces the missing values for pid80c4 and ppbluec and creates four new variables: pid80crn, pc4_miss, ppbluecn, and pec_miss. (Note: I'm using the first and last two letters to create a three-character reference variable.)

Figure 9.2 Univariate chi-square
Figure 9.3 Means analysis of continuous...

... full list of candidate variables. As I did in part 2, I am running logistic regression with three different selection options. First, I will run the backward and stepwise selection on all variables. I will then take the combination of winning variables from both methods and put it into a logistic regression using the score selection. I will begin looking at models with about 20 variables and find where...

... Headquarters; 2 = Branch. To capture the values, I create two indicator variables, bus_sgle and bus_hdqt, each having the values of 0 and 1. If they are both equal to 0, then the value = 2 or Branch. I run frequencies for all categorical variables and analyze the results. I create indicator variables for each categorical variable to capture...

Figure 9.6 Frequency of business level by weighted response

... are the natural form, psy_sq and pamsy13 (pamsy < 13). These two forms will be candidates for the final model. This process is repeated for the remaining 32 variables. The winning transformations for each continuous variable are combined into a new data set called acqmod.down_mod.

Figure 9.5 Variable transformation selection

Table 9.2 is a summary of the continuous variables and the transformations that...
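The missing-value step in the first excerpt above (fill the gaps, and keep a companion flag marking which records were filled) can be sketched in a few lines. This is a Python illustration, not the book's SAS code. The excerpt does not say which substitution is used, so mean substitution is an assumption here, and the function name `impute_with_flag` is hypothetical.

```python
from statistics import mean


def impute_with_flag(values):
    """Fill missing entries (None) and return (filled_values, missing_flags).

    Assumption: the fill value is the mean of the observed entries; the
    source excerpt does not state which substitution it applies.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    # The flag records which rows were originally missing (1) or present (0).
    flags = [1 if v is None else 0 for v in values]
    filled = [fill if v is None else v for v in values]
    return filled, flags


# Hypothetical column with two missing records.
filled, flags = impute_with_flag([10.0, None, 20.0, None])
```

Keeping the flag as a separate variable lets a later model test whether "was missing" is itself predictive, which is the reason the excerpt gives for creating one flag per variable.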
... variables are shown in bold. These can be used to assist in solicitation design and other marketing decisions. The validation gains table created in proc tabulate can be seen in Figure 9.9. These values are expanded into a spreadsheet that can be seen in Figure 9.10. This table calculates cumulative values for response, 12-month sales, and lift. Notice how the lift is much greater for 12-month sales than it is...

... PU6_CURT WHITECO2

The next code performs a logistic regression on the same variables with the selection=stepwise option. In this model, I used the options sle=.0001 and sls=.0001. These options specify the level of significance to enter and stay in the model.

    proc logistic data=ch09.down_mod descending;
    weight modwgt;
    model respond = AUTONENT BRANCH BUS_HDQT EMPL_500
    ...
    PU6_LOGI...

... backward selection has many more variables. But the stepwise selection has five variables that were not selected by the backward selection method: ppipro, pm3_cu, pppa2, ptc_sq, and pu1_curt. I add these five variables to the list and run the logistic regression using the score selection. I want the best model based on the highest score, so I use the option best=1:

    proc logistic data=ch09.down_mod descending;...

... set that contains the predicted values. My goal is to create a gains table so that I can evaluate the model's ability to rank responders and 12-month sales. The following code runs the logistic regression, creates an output scoring file (ch09.coeff), creates deciles, and builds a gains table. Notice that I do not use the weight in the validation gains table:

    proc logistic data=ch09.down_mod descending...
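The gains-table idea described above (rank records by predicted score, cut them into deciles, then accumulate response and lift) can be sketched as follows. This is a Python illustration, not the book's proc logistic / proc tabulate code. The function name `gains_table`, the row keys, and the convention of reporting lift as 100 times the ratio of the cumulative response rate to the overall rate are assumptions made for this example.

```python
def gains_table(scores, responses, n_groups=10):
    """Build a gains table from model scores and 0/1 responses.

    Records are sorted by descending score and split into n_groups
    near-equal groups (deciles by default). Assumed convention:
    lift = 100 * cumulative response rate / overall response rate.
    """
    n = len(scores)
    if n != len(responses) or n < n_groups:
        raise ValueError("need matching inputs with at least one record per group")
    ranked = sorted(zip(scores, responses), key=lambda pair: pair[0], reverse=True)
    overall_rate = sum(r for _, r in ranked) / n
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no responders")

    rows, cum_resp, cum_n = [], 0, 0
    for g in range(n_groups):
        # Integer boundaries keep group sizes within one record of each other.
        group = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        resp = sum(r for _, r in group)
        cum_resp += resp
        cum_n += len(group)
        rows.append({
            "group": g + 1,
            "count": len(group),
            "response_rate": resp / len(group),
            "cum_response_rate": cum_resp / cum_n,
            "cum_lift": 100 * (cum_resp / cum_n) / overall_rate,
        })
    return rows


# Hypothetical scores and responses, split into five groups for brevity.
table = gains_table(
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05],
    [1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
    n_groups=5,
)
```

A model that ranks well shows response rates that fall steadily from the first group to the last, and cumulative lift that starts well above 100 and decays to exactly 100 in the final group. The same loop could accumulate a sales amount instead of a 0/1 response to produce a sales lift.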
... logistic regression uses only numeric variables. And it sees all variables as continuous. In order to use categorical variables, I must create indicator, or binary, variables to state whether a given situation is true or false. I have 15 categorical variables. The first step is to run a frequency against the weighted response variable. I also request the missing option and a chi-square test of significance:

    proc...

... the weighting used to attract big-spending responders. In Figure 9.9 the model shows good rank ordering of prospects by response and sales. The deciles are all monotonically decreasing with a strong decline, especially in 12-month sales. In Figure 9.10 the cumulative values and lift measures show the true power of this model. Notice how the lift for sales is so much better than the lift for response. This ...

... prospecting and more successful customer relationship management. You can segment and profile your customer base to uncover those profit drivers using your knowledge of your customers, products, and ...

... programming to create the clusters is very simple. I designate three clusters with random seeds (random=5555). Replace=full directs the program to replace all the seeds with the cluster means.

... 9.1 shows that over 50% of Downing's customers have 12-month sales of less than $100. And over 25% have 12-month sales of less than $50. To help the model target the higher-dollar customers, I will use ...
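The indicator-variable step from the first excerpt above (turn each category level into a 0/1 variable so that logistic regression can use it) can be sketched like this. This is a Python illustration, not the book's SAS code. The function name `make_indicators` is hypothetical, and the codes 0 and 1 for the first two business levels are assumptions: the excerpts only state that 2 = Branch.

```python
def make_indicators(codes, levels):
    """Create one 0/1 indicator list per named level.

    `levels` maps an indicator name to the category code it flags.
    Any level left out of `levels` becomes the baseline: its rows are
    0 on every indicator.
    """
    return {
        name: [1 if c == code else 0 for c in codes]
        for name, code in levels.items()
    }


# Assumed coding: 0 and 1 are placeholders for the two levels the
# excerpt elides; only 2 = Branch comes from the source.
business_level = [0, 1, 2, 1]
indicators = make_indicators(business_level, {"bus_sgle": 0, "bus_hdqt": 1})
```

Leaving one level out as the baseline avoids a redundant column: a record with 0 on both indicators can only be the remaining level, which matches the excerpt's remark that both indicators equal to 0 means Branch.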