470643 c11.qxd 3/8/04 11:17 AM Page 380 380 Chapter 11

Using Thematic Clusters to Adjust Zone Boundaries

The goal of the clustering project was to validate editorial zones that already existed. Each editorial zone consisted of a set of towns assigned one of the four clusters described above. The next step was to manually increase each zone’s purity by swapping towns with adjacent zones. For example, Table 11.1 shows that all of the towns in the City zone are in Cluster 1B except Brookline, which is Cluster 2. In the neighboring West 1 zone, all the towns are in Cluster 2 except for Waltham and Watertown, which are in Cluster 1B. Swapping Brookline into West 1 and Watertown and Waltham into City would make it possible for both editorial zones to be pure in the sense that all the towns in each zone would share the same cluster assignment. The new West 1 would be all Cluster 2, and the new City would be all Cluster 1B. As can be seen in the map in Figure 11.12, the new zones are still geographically contiguous.

Having editorial zones composed of similar towns makes it easier for the Globe to provide sharper editorial focus in its localized content, which should lead to higher circulation and better advertising sales.

Table 11.1 Towns in the City and West 1 Editorial Zones

TOWN          EDITORIAL ZONE    CLUSTER ASSIGNMENT
Brookline     City              2
Boston        City              1B
Cambridge     City              1B
Somerville    City              1B
Needham       West 1            2
Newton        West 1            2
Wellesley     West 1            2
Waltham       West 1            1B
Weston        West 1            2
Watertown     West 1            1B

Lessons Learned

Automatic cluster detection is an undirected data mining technique that can be used to learn about the structure of complex databases. By breaking complex datasets into simpler clusters, automatic clustering can be used to improve the performance of more directed techniques. By choosing different distance measures, automatic clustering can be applied to almost any kind of data.
It is as easy to find clusters in collections of news stories or insurance claims as in astronomical or financial data. Clustering algorithms rely on a similarity metric of some kind to indicate whether two records are close or distant. Often, a geometric interpretation of distance is used, but there are other possibilities, some of which are more appropriate when the records to be clustered contain non-numeric data.

One of the most popular algorithms for automatic cluster detection is K-means. The K-means algorithm is an iterative approach to finding K clusters based on distance. The chapter also introduced several other clustering algorithms. Gaussian mixture models are a variation on the K-means idea that allows for overlapping clusters. Divisive clustering builds a tree of clusters by successively dividing an initial large cluster. Agglomerative clustering starts with many small clusters and gradually combines them until there is only one cluster left. Divisive and agglomerative approaches allow the data miner to use external criteria to decide which level of the resulting cluster tree is most useful for a particular application.

This chapter introduced some technical measures for cluster fitness, but the most important measure for clustering is how useful the clusters turn out to be for furthering some business goal.

CHAPTER 12

Knowing When to Worry: Hazard Functions and Survival Analysis in Marketing

Hazards. Survival. These very terms conjure up scary images, whether a shimmering-blue, ball-eating golf hazard or something a bit more frightful from a Stephen King novel, a hatchet movie, or some reality television show. Perhaps such dire associations explain why these techniques are not frequently associated with marketing. If so, this is a shame. Survival analysis, which is also called time-to-event analysis, is nothing to worry about.
Exactly the opposite: survival analysis is very valuable for understanding customers. Although the roots and terminology come from medical research and failure analysis in manufacturing, the concepts are tailor-made for marketing. Survival tells us when to start worrying about customers doing something important, such as ending their relationship. It tells us which factors are most correlated with the event. Hazards and survival curves also provide snapshots of customers and their life cycles, answering questions such as: “How much should we worry that this customer is going to leave in the near future?” or “This customer has not made a purchase recently; is it time to start worrying that the customer will not return?”

The survival approach is centered on the most important facet of customer behavior: tenure. How long customers have been around provides a wealth of information, especially when tied to particular business problems. How long customers will remain customers in the future is a mystery, but a mystery that past customer behavior can help illuminate. Almost every business recognizes the value of customer loyalty. As we see later in this chapter, a guiding principle of loyalty (that the longer customers stay around, the less likely they are to stop at any particular point in time) is really a statement about hazards.

The world of marketing is a bit different from the world of medical research. For one thing, the consequences of our actions are much less dire: a patient may die from poor treatment, whereas the consequences in marketing are merely measured in dollars and cents. Another important difference is the volume of data. The largest medical studies have a few tens of thousands of participants, and many draw conclusions from just a few hundred.
When trying to determine mean time between failure (MTBF) or mean time to failure (MTTF), manufacturing lingo for how long to wait until an expensive piece of machinery breaks down, conclusions are often based on no more than a few dozen failures. In the world of customers, tens of thousands is the lower limit, since customer databases often contain data on millions of customers and former customers. Much of the statistical background of survival analysis is focused on extracting every last bit of information out of a few hundred data points. In data mining applications, the volumes of data are so large that statistical concerns about confidence and accuracy are replaced by concerns about managing large volumes of data.

The importance of survival analysis is that it provides a way of understanding time-to-event characteristics, such as:

■■ When a customer is likely to leave
■■ The next time a customer is likely to migrate to a new customer segment
■■ The next time a customer is likely to broaden or narrow the customer relationship
■■ The factors in the customer relationship that increase or decrease likely tenure
■■ The quantitative effect of various factors on customer tenure

These insights into customers feed directly into the marketing process. They make it possible to understand how long different groups of customers are likely to be around, and hence how profitable these segments are likely to be. They make it possible to forecast numbers of customers, taking into account both new acquisition and the decline of the current base. Survival analysis also makes it possible to determine which factors, both those at the beginning of customers’ relationships and later experiences, have the biggest effect on customers’ staying around the longest. And the analysis can be applied to things other than the end of the customer tenure, making it possible to determine when another event (such as a customer returning to a Web site) is no longer likely to occur.
A good place to start with survival is with visualizing customer retention, which is a rough approximation of survival. After this discussion, we move on to hazards, the building blocks of survival. These are in turn combined into survival curves, which are similar to retention curves but more useful. The chapter ends with a discussion of Cox Proportional Hazard Regression and other applications of survival analysis. Along the way, the chapter provides particular applications of survival in the business context. As with all statistical methods, there is a depth to survival that goes far beyond this introductory chapter, which is consciously trying to avoid the complex mathematics underlying these techniques.

Customer Retention

Customer retention is a concept familiar to most businesses that are concerned about their customers, so it is a good place to start. Retention is actually a close approximation to survival, especially when considering a group of customers who all start at about the same time. Retention provides a familiar framework to introduce some key concepts of survival analysis such as customer half-life and average truncated customer tenure.

Calculating Retention

How long do customers stay around? This seemingly simple question becomes more complicated when applied to the real world. Understanding customer retention requires two pieces of information:

■■ When each customer started
■■ When each customer stopped

The difference between these two values is the customer tenure, a good measurement of customer retention. Any reasonable database that purports to be about customers should have this data readily accessible. Of course, marketing databases are rarely simple. There are two challenges with these concepts. The first challenge is deciding on what is a start and stop, a decision that often depends on the type of business and available data.
The second challenge is technical: finding these start and stop dates in available data may be less obvious than it first appears.

For subscription and account-based businesses, start and stop dates are well understood. Customers start magazine subscriptions at a particular point in time and end them when they no longer want to pay for the magazine. Customers sign up for telephone service, a banking account, ISP service, cable service, an insurance policy, or electricity service on a particular date and cancel on another date. In all of these cases, the beginning and end of the relationship is well defined.

Other businesses do not have such a continuous relationship. This is particularly true of transactional businesses, such as retailing, Web portals, and catalogers, where each customer’s purchases (or visits) are spread out over time, or may be one-time only. The beginning of the relationship is clear: usually the first purchase or visit to a Web site. The end is more difficult but is sometimes created through business rules. For instance, a customer who has not made a purchase in the previous 12 months may be considered lapsed. Customer retention analysis can produce useful results based on these definitions. A similar area of application is determining the point in time after which a customer is no longer likely to return (there is an example of this later in the chapter).

The technical side can be more challenging. Consider magazine subscriptions. Do customers start on the date when they sign up for the subscription? Do customers start when the magazine first arrives, which may be several weeks later? Or do they start when the promotional period is over and they start paying?

Although all three questions are interesting aspects of the customer relationship, the focus is usually on the economic aspects of the relationship.
Costs and/or revenue begin when the account starts being used (that is, on the issue date of the magazine) and end when the account stops. For understanding customers, it is definitely interesting to have the original contact date and time, in addition to the first issue date (are customers who sign up on weekdays different from customers who sign up on weekends?), but this is not the beginning of the economic relationship. As for the end of the promotional period, this is really an initial condition or time-zero covariate on the customer relationship. When the customer signs up, the initial promotional period is known. Survival analysis can take advantage of such initial conditions for refining models.

What a Retention Curve Reveals

Once tenures can be calculated, they can be plotted on a retention curve, which shows the proportion of customers that are retained for a particular period of time. This is actually a cumulative histogram, because customers who have tenures of 3 months are included in the proportions for 1 month and 2 months. Hence, a retention curve always starts at 100 percent.

For now, let’s assume that all customers start at the same time. Figure 12.1, for instance, compares the retention of two groups of customers who started at about the same point in time 10 years ago. The points on the curve show the proportion of customers who were retained for 1 year, for 2 years, and so on. Such a curve starts at 100 percent and gradually slopes downward. When a retention curve represents customers who all started at about the same time, as in this case, it is a close approximation to the survival curve.

Differences in retention among different groups are clearly visible in the chart. These differences can be quantified. The simplest measure is to look at retention at particular points in time. After 10 years, for instance, 24 percent of the regular customers are still around, and only about a third of them even make it to 5 years.
Premium customers do much better. Over half make it to 5 years, and 42 percent have a customer lifetime of at least 10 years.

[Figure 12.1 Retention curves show that high-end customers stay around longer. Vertical axis: Percent Survived (0% to 100%); horizontal axis: Tenure (Months after Start), 0 to 120; series: High End and Regular.]

Another way to compare the different groups is by asking how long it takes for half the customers to leave: the customer half-life (although the statistical term is the median customer lifetime). The median is a useful measure because the few customers who have very long or very short lifetimes do not affect it. In general, medians are not sensitive to a few outliers.

Figure 12.2 illustrates how to find the customer half-life using a retention curve. This is the point where exactly 50 percent of the customers remain, which is where the 50 percent horizontal grid line intersects the retention curve. The customer half-life for the two groups shows a much starker difference than the 10-year survival: the premium customers have a median lifetime of close to 7 years, whereas the regular customers have a median of roughly 2 years.

Finding the Average Tenure from a Retention Curve

The customer half-life is useful for comparisons and easy to calculate, so it is a valuable tool. It does not, however, answer an important question: “How much, on average, were customers worth during this period of time?” Answering this question requires having an average customer worth per time and an average retention for all the customers. The median cannot provide this information because the median only describes what happens to the one customer in the middle, the customer at exactly the 50 percent rank. A question about average customer worth requires an estimate of the average remaining lifetime for all customers.
There is an easy way to find the average remaining lifetime: average customer lifetime during the period is the area under the retention curve. There is a clever way of visualizing this calculation, which Figure 12.3 walks through.

[Figure 12.2 The median customer lifetime is where the retention curve crosses the 50 percent point. Vertical axis: Percent Survived (0% to 100%); horizontal axis: Tenure (Months after Start), 0 to 120; series: High End and Regular.]

First, imagine that the customers all lie down with their feet lined up on the left. Their heads represent their tenure, so there are customers of all different heights (or widths, because they are horizontal) for customers of all different tenures. For the sake of visualization, the longer tenured customers lie at the bottom holding up the shorter tenured ones. The line that connects their noses counts the number of customers who are retained for a particular period of time (remember the assumption that all customers started at about the same point in time). The area under this curve is the sum of all the customers’ tenures, since every customer lying horizontally is being counted.

Dividing the vertical axis by the total count produces a retention curve. Instead of count, there is a percentage. The area under the curve is the total tenure divided by the count of customers: voilà, the average customer tenure during the period of time covered by the chart.

TIP The area under the customer retention curve is the average customer lifetime for the period of time in the curve. For instance, for a retention curve that has 2 years of data, the area under the curve represents the two-year average tenure.

This simple observation explains how to obtain an estimate of the average customer lifetime. There is one caveat when some customers are still active. The average is really an average for the period of time under the retention curve.
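The retention curve, the customer half-life, and the area-under-the-curve average can all be checked on a handful of customers. What follows is only a minimal sketch under stated assumptions: the tenures are made-up illustrative numbers (not data from the chapter), tenure is measured in whole months, every customer started at least `horizon` months ago, and the function names are our own.

```python
def retention_curve(tenures, horizon):
    """Proportion of customers retained for at least t months, for t = 0..horizon.

    Assumes all customers started at about the same time, at least
    `horizon` months ago, so every tenure up to the horizon is observable.
    """
    n = len(tenures)
    return [sum(1 for x in tenures if x >= t) / n for t in range(horizon + 1)]


def half_life(curve):
    """First month at which 50 percent or fewer of the customers remain."""
    for t, retained in enumerate(curve):
        if retained <= 0.5:
            return t
    return None  # more than half are still around at the end of the curve


def truncated_mean_tenure(curve):
    """Area under the retention curve: one strip of width 1 month per period."""
    return sum(curve[1:])


# Hypothetical tenures in months; 150 stands for a customer still active
# beyond the 120-month window covered by the curve.
tenures = [3, 8, 14, 20, 26, 31, 45, 60, 90, 150]
horizon = 120

curve = retention_curve(tenures, horizon)
median_months = half_life(curve)
area = truncated_mean_tenure(curve)

# The area should match the direct average of tenures capped at the horizon.
direct = sum(min(x, horizon) for x in tenures) / len(tenures)

print(median_months, round(area, 1), round(direct, 1))
```

The comparison with `direct` is the "customers lying down" picture in code: summing the curve period by period counts each customer once for every month he or she was retained, up to the end of the chart. For a customer still active at the horizon, only the first 120 months are counted, which is why the result is an average for the period rather than a true average lifetime.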
Consider the earlier retention curves in this chapter. These retention curves were for 10 years, so the area under the curves is an estimate of the average customer lifetime during the first 10 years of their relationship. For customers who are still active at 10 years, there is no way of knowing whether they will all leave at 10 years plus one day, or whether they will all stick around for another century. For this reason, it is not possible to determine the real average until all customers have left.

[Figure 12.3 Average customer tenure is calculated from the area under the retention curve. Panel notes: A group of customers with different tenures are stacked on top of each other along a time axis, and each bar represents one customer. At each point in time, the edges count the number of customers active at that time; the sum of all the areas is the sum of all the customer tenures. Making the vertical axis a proportion instead of a count produces a curve that looks the same. This is a retention curve, and the area under it is the average customer tenure.]

This value, called truncated mean lifetime by statisticians, is very useful. As shown in Figure 12.4, the better customers have an average 10-year lifetime of 6.1 years; the other group has an average of 3.7 years. If, on average, a customer is worth, say, $100 per year, then the premium customers are worth $610 - $370 = $240 more than the regular customers during the 10 years after they start, or about $24 per year. This $24 might represent the return on a retention program designed specifically for the premium customers, or it might give an upper limit of how much to budget for such retention programs.

Looking at Retention as Decay

Although we don’t generally advocate comparing customers to radioactive materials, the comparison is useful for understanding retention.
Think of customers as a lump of uranium that is slowly, radioactively decaying into lead. Our “good” customers are the uranium; the ones who have left are the lead. Over time, the amount of uranium left in the lump looks something like our retention curves, with the perhaps subtle difference that the timeframe for uranium is measured in billions of years, as opposed to smaller time scales.

[...]

... that customers who have been around for a while are actually better customers than new customers. For whatever reason, longer tenured customers have stuck around in the past and are probably a bit less likely than new customers to leave in the future. Exponential decay is a bad situation, because it assumes the opposite: that the tenure of the customer relationship has no effect on the rate that customers...

... or campaign). There is no problem when a customer is included in the population count up to that customer’s tenure, and the customer could have stopped on any day before then and still be in the data set. An example of what not to do is to take a subset of customers who have stopped during some period of time, say in the past year. What is the problem? Consider a customer who stopped yesterday with 2 years...

... of customers leaving is exactly the same, no matter how long the customers have been around. This looks like a horizontal line on a graph. Say the hazard is being measured by days, and it is a constant 0.1 percent. That is, one customer out of every thousand leaves every day. After a year (365 days), this means that about 30.6 percent of the customers have left. It takes about 692 days for half the customers...
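The constant-hazard figures quoted above are easy to verify. This is a small sketch, assuming that a constant daily hazard means each remaining customer independently has the same 0.1 percent chance of leaving on any given day, so that the proportion surviving d days is (1 - 0.001) raised to the power d. The variable and function names are ours.

```python
import math

daily_hazard = 0.001  # a constant 0.1 percent per day, as in the text


def survival_after(days, hazard):
    """Proportion of customers still around after `days` days of constant hazard."""
    return (1 - hazard) ** days


# Proportion of customers who have left after one year.
left_after_year = 1 - survival_after(365, daily_hazard)

# Number of days until half the customers have left (the half-life).
half_life_days = math.log(0.5) / math.log(1 - daily_hazard)

print(f"left after 365 days: {left_after_year:.1%}")
print(f"half-life: {half_life_days:.1f} days")
```

Under these assumptions the year-end loss comes out at about 30.6 percent and the half-life falls between 692 and 693 days, in line with the numbers in the passage. Note that the loss is less than 365 times 0.1 percent, because each day's hazard applies only to the customers who are still left.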
... contract is up, customers often rush to leave, and the higher rate continues for a while because customers have been liberated from the contract. Once the contract has expired, there may be other reasons, such as the product or service no longer being competitively priced, that cause customers to stop. Markets change and customers respond to these changes. As telephone charges drop, customers are more...

... set of customers and what happens at the beginning and end of their relationship. In particular, the end is shown with a small circle that is either open or closed. When the circle is open, the customer has already left and their exact tenure is known since the stop date is known. A closed circle means that the customer has survived to the analysis date, so the stop date is not yet known. This customer or...

... is that marketing efforts change over time, attracting different qualities of customers. For instance, customers arriving by different channels often have different retention characteristics, and the mix of customers from different channels is likely to change over time.

Survival

Hazards give the probability that a customer might stop at a particular point in time. Survival, on the other hand, gives...

... limits customers to having started at a particular point in time. Also, because a survival curve always slopes downward, calculations of customer half-life and average customer tenure are more accurate. By incorporating more information, survival provides a more accurate, smoother picture of customer retention. When analyzing customers, both hazards and survival provide valuable information about customers...
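To make the relationship between hazards and survival concrete, here is a minimal sketch of one common discrete (life-table style) formulation; it is an assumption on our part rather than a formula quoted from the passages above. The hazard at tenure t is taken to be the number of customers who stopped at exactly tenure t divided by the number whose tenure is at least t, and survival is the running product of one minus the hazard. The records are made-up; `stopped=False` marks a customer who is still active on the analysis date (the closed circle described above), so only a lower bound on the tenure is known.

```python
def hazards(records, max_tenure):
    """Empirical hazard for each tenure 0..max_tenure.

    records: list of (tenure, stopped) pairs. stopped=True means the stop
    date is known; stopped=False means the customer is still active
    (censored) at that tenure.
    """
    result = []
    for t in range(max_tenure + 1):
        # Everyone observed for at least t periods is at risk at tenure t,
        # whether or not they eventually stopped.
        at_risk = sum(1 for tenure, _ in records if tenure >= t)
        stops = sum(1 for tenure, stopped in records if stopped and tenure == t)
        result.append(stops / at_risk if at_risk else 0.0)
    return result


def survival(hazard_by_tenure):
    """Survival curve: S[0] = 1 and S[t + 1] = S[t] * (1 - h[t])."""
    curve = [1.0]
    for h in hazard_by_tenure:
        curve.append(curve[-1] * (1 - h))
    return curve


# Hypothetical customers: three stopped, three still active.
records = [(1, True), (2, True), (2, False), (3, True), (4, False), (4, False)]

h = hazards(records, 4)
s = survival(h)

for t, (hazard, surv) in enumerate(zip(h, s)):
    print(f"tenure {t}: hazard {hazard:.3f}, survival at start of period {surv:.3f}")
```

The active customers contribute to the at-risk counts for every tenure they have actually reached but never count as stops, which is how this kind of calculation uses their information without pretending to know when they will leave. Because each hazard lies between 0 and 1, the resulting survival curve can never rise.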
... hazard curves and to derive some sort of average for the overall risk. Figure 12.11 provides an illustration from the world of marketing. It shows two sets of hazard probabilities, one for customers who joined from a telephone solicitation and the other from direct mail. Once again, how someone became a customer is an example of an initial condition. The hazards for the telemarketing customers are higher;...

... attrition, whenever a customer is forced to leave, the customer is included in the analysis until he or she leaves; at that point, the customer is censored. This makes sense. Up to the point when the customer was forced to leave, the customer did not leave voluntarily. This approach can be extended for other purposes. Once upon a time, the authors were trying to understand different groups of customers at a news...

... behavior arises only because there are so few customers in this simple example. Similarly, lining up customers in a table is useful for didactic purposes to demonstrate the calculation on a manageable set of data. In the real world, such a presentation is not feasible, since there are likely to be thousands or millions of customers going down and hundreds or thousands of days going across. It is also worth...

... significant number of customers leave at this time, the hazard probability spikes up.

[Figure: Weekly Hazard (0% to 7%) plotted against Tenure (Weeks), 0 to 76.]

[Figure 12.4 Average customer lifetime for different groups of customers can be... Annotation: average tenure high end customers = 73 months (6.1 years). Horizontal axis: Tenure (Months after Start), 0 to 120.]

...
35–39 yrs    0.16%
40–44 yrs    0.24%
45–49 yrs    0.36%
50–54 yrs    0.52%
55–59 yrs    0.80%
60–64 yrs    1.26%
65–69 yrs    1.93%
70–74 yrs    2.97%
75–79 yrs    4.56%
80–84 yrs    7.40%
85+ yrs      15.32%

A life table...