How can you tell this has happened to you? If response rates seem extremely low but still have somewhat of a pulse, and if the offer is a proven one, this may be an area you want to investigate further. How can you confirm it? First, take the mail file and have this group's data (the data that would have been used to score them) appended. Score the model or apply the schema. Are the names in the correct deciles/groups? If the answer is yes, you may need to look elsewhere for the source of your problem. If the answer is no, perform one other check: go back to the main file/database where these persons' scores are stored, pull the names that were mailed, and confirm that they belong to the deciles/groups they should. This two-part validation answers two questions: Was the data scored properly to begin with, and was the model inverted? (A sketch of this check appears after tip 6.)

In the example of the direct marketing agency, the problem lay in having two databases in two different IT environments. The mainframe held the main database and was where the models were scored. A copy of these scores and deciles was extracted and given to an IT group in a relational setting. When the scores were added to the relational environment, the programmers ignored the decile codes and re-deciled, assigning the highest scores to decile 10 instead of decile 1. In revalidating and investigating all the efforts, if we had compared only the individual scores on the file in the relational setting, without comparing back to the mainframe, we would have missed the problem. The cautionary tale here is that it can happen, so be careful not to let it.

6. Like a good farmer, check your crop rotation. This is another elementary point in database management, but again it can be overlooked. I was once asked if "list fatigue" existed, and I believe it does, but it can be avoided or minimized. One tactic is to develop sound business rules that allow you to systematically rotate your lists. In direct marketing, the rule of thumb is usually 90-day intervals, though there are exceptions. With in-house files/databases, in-depth profiling will tell you what your frequency should be for talking to the customer. Some customers love constant communications (frequent purchasers, heavy users), while others would prefer you never talk to them (the opt-outs). E-mail solicitations have become very popular, mainly due to the low cost of producing them, but caution should be exercised in how often you fill up someone's inbox with offers. Even though we have all become somewhat numb to the amount of mailbox stuffers we receive, e-mail solicitations have a slightly more invasive feel than direct mail, similar to telemarketing calls. I often wonder how businesses I haven't bought from get my e-mail address. If we as direct marketers can appreciate this distinction with e-mail and refrain from spamming our hearts out, we can probably assure ourselves that we won't be regulated in how often we can e-mail people, and we will preserve a low-cost alternative for talking to our customers.

How can you tell if list fatigue is setting in? Are response and conversion rates gradually declining in a nice steady curve? Can you tell me the average number of times a person is mailed, and with what frequency? Do you have business rules that prevent you from over-communicating to customers? If the answers to these questions are yes, no, and no, chances are you aren't rotating your crops enough.
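The two-part validation described above is easy to automate. Here is a minimal SAS sketch, under assumed names (a mail file mail.scored carrying the raw model score score and the stored assignment decile_stored): re-decile from the raw score and cross-tabulate the result against what was stored. A clean diagonal means the deciles match; a reversed diagonal is the signature of an inverted model. Running the same comparison against the main database covers the first half of the check.

   proc rank data=mail.scored out=check groups=10 descending;
      var score;
      ranks decile_new;
   run;

   data check;
      set check;
      decile_new = decile_new + 1;  /* PROC RANK groups run 0-9; shift so the best scores land in decile 1 */
   run;

   proc freq data=check;
      tables decile_stored*decile_new / norow nocol nopercent;
   run;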
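The 90-day rotation rule from tip 6 can be enforced systematically rather than by memory. A minimal sketch, assuming a hypothetical customer file db.customers with a contact-history field last_mail_date:

   data mailable;
      set db.customers;
      /* keep names never mailed, or last mailed at least 90 days ago */
      if last_mail_date = . or intck('day', last_mail_date, today()) >= 90;
   run;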
7. Does your model/schema have external validity? This is a question that sometimes is forgotten. You have great analysts who build technically perfect models, but can anyone interpret them in the context of the business? If the answer is no, your models/schemas do not have external validity. External validity in modeling is analogous to common sense. For example, take a financial services model in which one of the factors predicting demand for a high-interest-rate mortgage is the prospect's FICO score, and FICO is weighted positively, which would be interpreted to mean that someone with a really high FICO score is more likely to convert. Any mortgage banker in the crowd will tell you that goes against what really happens: people with high FICO scores have excellent credit and therefore would most likely not be interested in, or likely to borrow at, high interest rates. Try evaluating and interpreting analytical work from a marketing manager's perspective; it will help you judge whether your model/schema has external validity. (A sketch of this sign check follows tip 8.)

8. Does your model have good internal validity? By internal validity, I mean the validity of the model/schema-building process itself. There are many ways to prevent a badly built model/schema from ever seeing the light of day. One good approach is to formalize the model/schema-building process, with validation checks and reviews built in. Good modelers always keep a "hold-out" sample for validating their work (a sketch of a simple hold-out split also follows). Documenting every step of the process means that if something goes wrong, you can follow the model-building process much like a story. Not every modeler is thorough, so a formalized documentation process helps to avoid errors, and having modelers review each other's work is also helpful. Often, I am asked to decide whether a model is "good" by just looking at the algorithm. That in itself is not enough to determine the quality of the model; understanding the underlying data, as well as the process by which the modeler built the algorithm, is crucial. In one such case, the model seemed to be valid. On reviewing the data, however, I found the culprit: the algorithm included an occupation code variable, but that variable was an alphanumeric code that would have had to be transformed to be of any use in a model, and that hadn't happened. This example brings up a related issue. With the explosion in the importance of and demand for data miners, there are many groups and people out there who are less than thorough when building models/schemas. If someone builds you a model, ask him or her to detail the process by which it was built and the standards by which it was evaluated. If you aren't sure how to evaluate the work, hire or find someone who can.
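The sign check from tip 7 can be made routine: fit the model, print the parameter estimates, and compare each sign against business intuition. A minimal sketch with hypothetical dataset and variable names; nothing here is specific to the mortgage example beyond the FICO variable:

   proc logistic data=develop descending;
      model convert = fico_score income mortgage_amt;
   run;
   /* Review the "Analysis of Maximum Likelihood Estimates" table: a positive
      estimate on fico_score for a high-interest-rate offer contradicts what a
      mortgage banker knows and warrants investigation. */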
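Carving out the hold-out sample recommended in tip 8 takes a single data step. A minimal sketch, assuming a hypothetical modeling file db.modeldata:

   data develop holdout;
      set db.modeldata;
      if ranuni(1234) <= .70 then output develop;  /* 70% for development */
      else output holdout;                         /* 30% held out for validation */
   run;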
9. Bad ingredients make bad models. Nothing will ruin a model or campaign faster than bad data. Model-building software has become so automated that anyone can build a model with a point and click, but the real service an experienced analyst brings is the ability to detect bad data early on. EDA, or exploratory data analysis, is the first step toward building a good model/schema and avoiding the bad-data experience. If you are the analyst, don't take someone's word that the data is what it is; check it out for yourself. Know your data inside and out. (A first-pass EDA sketch follows tip 10.) I once had an experience where the client gave me all the nonresponders but told me they were responders; only when I checked my external validity did I find the problem and correct it. If you work in database marketing, don't assume that others understand data the same way. Confirm how samples are pulled, confirm data content, and examine files very closely. If you are working with appended data, make sure the data is clean. This is more difficult because you may not be as familiar with it. Ask for the range of values for each field, and for the mean scores/frequencies for the entire database the data came from. A related issue with appended data is that it should make sense for what you are trying to predict. Financial data is a very powerful ingredient in a model/schema to predict demand for financial services; as a predictor of toothpaste purchase behavior, it is not. Choose your ingredients wisely.

10. Sometimes good models, like good horses, need to be put out to pasture. Good models, built on well-chosen data, will perform over time. But like all good things, models have a life cycle. Because not every market is the same and consumers tend to change over time, the process of prediction will almost never be a one-time event. How can you tell if it is time to refresh or rebuild your model? Have you seen a complete drop-off in response/conversion without a change in your creative/offer or in the market at large? If yes, it's time to rebuild. But nobody wants to wait until that happens; you would prefer to be proactive about rebuilding models. So, that said, how do you know when it's time? The first clue is to look at the market itself. Is it volatile and unpredictable, or staid and flat? Has something changed in the marketplace recently (e.g., legislation, new competitors, product improvements, new usage) that has changed overall demand? Are you communicating or distributing through new channels (e.g., the Internet)? Have you changed the offer or creative? All of the preceding questions will help you determine how often and when new models should be built. If you are proactive by watching the market, the customers, and the campaigns, you will know when it is time. One suggestion is to always be testing a "challenger" against the "established champ." When the challenger starts to outperform the champ consistently, it's time to retire the champ. (A sketch of a champion/challenger split also follows.)
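The first-pass EDA recommended in tip 9 is largely mechanical. A minimal SAS sketch, assuming a hypothetical appended file db.append with an occupation code variable occ_code; the first step surfaces ranges, means, and missing counts, and the second would immediately expose an untransformed alphanumeric code like the one in the story above:

   proc means data=db.append n nmiss min max mean;
      var _numeric_;              /* ranges and missing counts for every numeric field */
   run;

   proc freq data=db.append;
      tables occ_code / missing;  /* a character code shows up here, not in PROC MEANS */
   run;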
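One way to keep a challenger in the race, per tip 10, is to reserve a random slice of each campaign for it and compare results over several mailings. A minimal sketch with assumed file names and an assumed 90/10 split:

   data champ chall;
      set db.mailfile;
      if ranuni(2001) <= .90 then output champ;  /* 90% selected by the champion model */
      else output chall;                         /* 10% reserved for the challenger   */
   run;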
Back-end Validation

In my opinion, the most exciting and stressful part of the modeling process is waiting for the results to come in. I usually set up a daily monitoring program to track the results. That approach can be dangerous, though, because you can't determine the true performance until you have a decent sample size. My advice is to set up a tracking program and then be patient: wait until you have at least a couple hundred responders before you celebrate. In the case study, I am predicting the probability of a prospect becoming an active account, which presumes that the prospect responds. I can multiply the number of early responders by the expected active rate, given response, to get a rough idea of how the campaign is performing. (With illustrative numbers only: 400 early responders at a 70% expected active-given-response rate projects to roughly 280 actives.) Once all of the results are in, it is critical to document the campaign performance. It is good to have a standard report; this becomes part of a model log (described in the next section).

For the case study, the company mailed deciles 1 through 5 and sampled deciles 6 through 10. In Figure 7.9, the back-end validation report, the model results are compared with the expected performance shown in Figure 7.6, component by component within each decile. We notice a slight difference between the expected performance and the actual performance, but overall model performance is good: for both the active rate and the average NPV, the rank ordering is strong and the variation from expected performance is at or below 10%.
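A standard report like Figure 7.9 can be produced directly from the campaign results. A minimal sketch, assuming a hypothetical file db.results with one record per mailed name, the assigned decile, a 0/1 active flag, and the realized NPV; the actual active rate and average NPV by decile are then set beside the expected values from Figure 7.6:

   proc means data=db.results n mean;
      class decile;
      var active npv;  /* mean of the 0/1 flag = actual active rate; mean of npv = average NPV */
   run;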
Model Maintenance

I have worked with modelers, marketers, and managers for many years, and I am always amazed at how little is known about what models exist within the corporation, how they were built, and how they have been used to date. After all the time and effort spent developing and validating a model, it is worth the extra effort to document and track the model's origin and utilization. The first step is to determine the expected life of the model.

Model Life

The life of a model depends on a couple of factors. One of the main factors is the target. If you are modeling response, it is possible to redevelop the model within a few months. If the target is risk, it is difficult to know how the model performs for a couple of years. If the model has an expected life of several years, it is always possible to track its performance along the way.

Benchmarking

As in our case study, most predictive models are developed on data with performance appended. If the performance window is three years, it should contain all the activity for the three-year period. In other words, say you want to predict bankruptcy over a three-year period. You would take all names that are current at time T; the performance is then measured in the period between T + 6 and T + 36 months. So when the model is implemented on a new file, the performance can be measured or benchmarked at each six-month period. (For example, a file scored in January 2000 would be benchmarked in July 2000 at T + 6, January 2001 at T + 12, and so on through January 2003 at T + 36.) If the model is not performing as expected, the choice has to be made whether to continue use, rebuild, or refresh.

Rebuild or Refresh?

When a model begins to degrade, the decision must be made to rebuild or refresh the model. To rebuild means to start from scratch, as I did in chapter 3: use new data, build new variables, and rework the entire process. To refresh means to keep the current variables and rerun the model on new data. It usually makes sense to refresh the model unless there is an opportunity to introduce new predictive information. For example, if a new data source becomes available, it might make sense to incorporate that information into a new model. If a model is very old, it is often advisable to test building a new one. And finally, if there are strong shifts in the marketplace, a full-scale model redevelopment may be warranted. This happened in the credit card industry when low introductory rates were launched; the key drivers for response and balance transfers changed with the drop in rates.

Model Log

A model log is a register that contains information about each model, such as development details, key features, and an implementation log. Table 7.2 is an example of a model log for our case study. A model log saves hours of time and effort, serving as a quick reference for managers, marketers, and analysts to see what's available, how models were developed, who the target audience is, and more.

Table 7.2 Sample Model Log

Name of model:                        LIFEA2000
Dates of development:                 3/00–4/00
Model developer:                      O. Parr Rud
Overall objective:                    Increase NPV
Specific target:                      Accounts with premium amount > 0
Model development data (date):        NewLife600 (6/99)
First campaign implementation:        NewLife750 (6/00)
Implementation date:                  6/15/00
Score distribution (validation):      Mean = .037, St Dev = .00059, Min = .00001, Max = .683
Score distribution (implementation):  Mean = .034, St Dev = .00085, Min = .00001, Max = .462
Selection criteria:                   Decile 5
Selection business logic:             > $.05 NPV
Preselects:                           Age 25–65; minimum risk screening
Expected performance:                 $726M NPV
Actual performance:                   $703M NPV
Model details:                        Sampled lower deciles for model validation and redevelopment
Key drivers:                          Population density, life stage variables

The log tracks models over the long term with details such as the following:

Model name or number. Select a name that reflects the objective or product; combining it with a number allows for tracking redevelopment models.
Date of model development. Range of development time.
Model developer. Name of the person who developed the model.
Model development data. Campaign used for model development.
Overall objective. Reason for model development.
Specific target. Specific group of interest or value estimated.
Development data. Campaign used for development.
Initial campaign. Initial implementation campaign.
Implementation date. First use date.
Score distribution (validation). Mean, standard deviation, minimum and maximum values of the score on the validation sample.
Score distribution (implementation). Mean, standard deviation, minimum and maximum values of the score on the implementation sample.
Selection criteria. Score cut-off or depth of file.
Selection business logic. Reason for the selection criteria.
Preselects. Cuts prior to scoring.
Expected performance. Expected rate of the target variable: response, approval, active, etc.
Actual performance. Actual rate of the target variable: response, approval, active, etc.
Model details. Characteristics of the model development that might be unique or unusual.
Key drivers. Key predictors in the model.

I recommend a spreadsheet with separate pages for each model; one page might look something like Table 7.2. A new page should be added each time a model is used, including the target population, date of score, date of mailing, score distribution parameters, preselects, cut-off score, product code or codes, and results.

Summary

In this chapter, I estimated the financial impact of the model by calculating net present value. This allowed me to assess the model's impact on the company's bottom line. Using decile analysis, the marketers and managers are able to select the number of names to solicit to best meet their business goals. As with any great meal, there is also the clean-up! In our case, tracking results and recording model development are critical to the long-term efficiency of using targeting models.

PART THREE: RECIPES FOR EVERY OCCASION

Do you like holiday dinners? Are you a vegetarian? Do you have special dietary restrictions? When deciding what to cook, you have many choices! Targeting models also serve a variety of marketing tastes. Determining who will respond, who is low risk, who will be active, loyal, and, above all, profitable: these are all activities for which segmentation and targeting can be valuable. In this part of the book, I cover a variety of modeling objectives for several industries.
In chapter 8, I begin with profiling and segmentation, a prudent first step in any customer analysis project. I provide examples for both the catalog and financial services industries, using both data-driven and market-driven techniques. In chapter 9, I detail the steps for developing a response model for a business-to-business application. In chapter 10, I develop a risk model for the telecommunications industry. And in chapter 11, I develop a churn or attrition model for the credit card industry. Chapter 12 continues the case study from chapters 3 through 7 with the development of a lifetime value model for the direct-mail life insurance industry. If your work schedule is anything like mine, you must eat fast food once in a while. Well, that's how I like to describe modeling on the Web: it's designed to handle large amounts of data very quickly and can't really be done by hand. In chapter 13, I discuss how the Web is changing the world of marketing. With the help of some contributions from leading thinkers in the field, I discuss how modeling, both traditional and interactive, can be used on a Web site for marketing, risk, and customer relationship management.
[...]

Chapter 8. Understanding Your Customer: Profiling and Segmentation

...describe several techniques and applications for understanding your customer. Common sense tells us that it's a good first step to successful customer relationship management. It is also an...

[...]

...operations, and risk management. This will vary by organization and industry. 3. Review and evaluate your data requirements. Make sure you have considered all necessary data elements for analysis and...

[...]

...demographic and psychographic characteristics as well as a multitude of buying behaviors, risk patterns, and levels of profitability among the members of your database. This is the beauty of segmentation and profiling. Once you understand the distinct groups within the database, you can use this knowledge for product development, customer service customization, media and channel selection, and targeting...

[...]

...experience and analysis. In this case, you know the data, you work with a limited number of variables, and you determine a limited number of segments. For example, in Sal and Pat's business, we've had experience working with purchase inactive segments, potential attriter segments, and potential credit usage segments. The appropriate segments will be defined and selected based on the business objective and your...

[...]

...testing, and analysis stages, you are more assured of meeting your goals, maximizing your profitability, and improving your customers' long-term behavior.

Profiling and Penetration Analysis of a Catalog Company's Customers

Southern Area Merchants (SAM) is a catalog company specializing in gifts and tools for the home and garden. It has been running a successful business for more than 10 years and...

[...]

...bankruptcies and charge-offs can quickly erode small profit margins. To understand our customer base with respect to revenue and risk, I perform a customer value analysis. This allows me to segment the customer base with respect to profitability, leading to improved customer relationship management.

Customer Value Analysis

Credit card profitability is achieved by balancing revenue (less costs) and risk. An...

[...]

...a combination of risk and net revenue. The first step is to determine the splitting values for risk and revenue. In this case, I use $150 for revenue and 650 for risk. These values are not cast in stone. The revenue value of $150 represents some revolving activity or high transaction activity; accounts with revenues higher than $150 are worth considering for more marketing efforts. The risk score of 650... Accounts with scores below 650 are considered high risk. The following code uses PROC FORMAT to split the population into two groups by both revenue and risk (the code is cut off in this excerpt; a reconstruction under stated assumptions appears at the end):

   proc format;
      value revenue low- ...

[...]

... table statement crosses revenue (acctrev) by risk score (riskscr) with the number and percent of customers (records):

   proc tabulate data=ch08.profit;
      format acctrev revenue. riskscr risk.;
      class acctrev riskscr;
      var records;
      table (acctrev=' ' all='Total'),
            (riskscr=' ' all='Total')*(records='#'*sum=' '*f=comma8.
             records='%'*pctsum=' '*f=8.2)
            /rts=15 box='Customer Value Matrix';
   run;

The results of the...

[...]

...This matrix gives us an instant view of the customer database with respect to revenue and risk. We see that over 66% are considered high revenue and almost 53% are low risk. Our best customers, low risk and high revenue, make up 33% of our customer base.

Figure 8.7 Customer value matrix.

The next step is to see what they look like. I will profile the customers within each segment. The following...
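The truncated PROC FORMAT step above can be inferred from the surrounding text: revenue splits at $150 and risk splits at 650, with scores below 650 treated as high risk. A minimal reconstruction under those assumptions; the format names match the later PROC TABULATE step, but the labels and exact ranges are guesses, not the book's original code:

   proc format;
      value revenue low-150   = 'Low Revenue'     /* assumed label */
                    150<-high = 'High Revenue';
      value risk    low-<650  = 'High Risk'       /* scores below 650 = high risk */
                    650-high  = 'Low Risk';
   run;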