Data Mining:
The Key to Profitable Customer Relationship Management

by Herbert Edelstein

February 2002

Customer Relationship Management (CRM) helps companies do a better job of matching products and sales campaigns to customers and prospects. This personalization results in improved profitability.

Until recently most CRM software has focused on simplifying the organization and management of customer information. Such software, called operational CRM, creates a database that presents a consistent picture of the customer’s relationship with the company. These databases are used in applications involving direct contact with customers; for example, sales force automation and customer service.

Although an enormous amount of money and time has been spent on operational CRM, the payback has been limited. Operations became more efficient (that is, the usual tasks were done faster or with fewer people), but not more intelligent. Newer techniques, however, improve profitability by optimizing customer interaction through the entire customer life cycle.

The customer life cycle has three main phases. In the first phase, companies try to acquire new customers and increase the number of relationships they have. In the second phase, companies try to increase the profitability of their relationships by selling more products and services to existing customers. In the third phase, companies try to increase the duration of their profitable relationships by making sure that desirable customers continue doing business with them. The central goal in all three phases is to make the right offer at the right time to the right prospect.

Data mining makes it possible to achieve this goal, by making sense of large amounts of complex data on customers and transactions. Data mining is a process that uses a variety of analysis and modeling techniques to find patterns and relationships in data. These patterns can be used to make accurate predictions about customer behavior.

CRM applications that use data mining are called analytic CRM. Analytic CRM makes it easier to select the right prospects from a large list of potential customers (life cycle phase 1). Data mining can help companies offer the most appealing array of products to existing customers (phase 2), or identify customers the company is at risk of losing (phase 3). The result is improved revenue because of a greatly improved ability to respond to each individual contact in the best way, and reduced costs due to properly allocating resources.

However, data mining is a tool, not a magic solution. It won’t reside in your database watching what happens and notifying you when it sees an interesting pattern. It doesn’t eliminate the need to know your business, to understand your data, or to understand analytical methods. Data mining helps find predictive patterns and relationships in the data, but you must verify the accuracy of these predictions in the real world. Also, keep in mind that successful data mining requires using a broad spectrum of analytical tools. Do not expect one particular algorithm to be sufficient.

The most important key to success is to follow a methodical process. In Part I of this two-part article on data mining and CRM, we will examine the first three steps of the following seven-step data mining process:

  1. Define the business problem
  2. Build the data mining database
  3. Explore the data
  4. Prepare the data for modeling
  5. Build a model
  6. Evaluate the model
  7. Act on the results

1. Define the business problem

First and foremost, the prerequisite to data mining is to understand your data and your business. You must have this understanding in order to identify the problems you’re trying to solve, prepare the data for mining, correctly interpret the results, and have confidence in the relevance of your predictions.

To make the best use of data mining you must prepare a clear statement of your objectives. Typical goals might include: “to target prospects who are most likely to respond to a marketing campaign,” “to sell additional products to existing customers,” or “to develop campaigns to reduce attrition.” Objectives such as “increasing the response rate” and “increasing the value of a response” may look similar, but each would require building a very different model.

An effective statement of the problem will include a way to measure the results of your project. It may also include a cost justification or ROI (Return on Investment) analysis. Without a well-defined goal and a method for determining whether you’ve reached that goal, you won’t benefit from data mining.

2. Build the data mining database

This step, along with the next two, constitutes the core of data preparation. Together, they take more time and effort than all the other steps combined. For CRM data, these steps typically consume 60% to 95% of a project’s time and resources.

The data to be mined should be collected in a database. (Depending on the amount of the data, the complexity of the data, and the uses to which it is to be put, the database might be organized using a database management system, a flat file, or even a spreadsheet.)

You will generally be better off creating a separate data mart instead of using your corporate data warehouse. Data mining often involves joining many tables together and accessing substantial portions of the database. A single trial model may require many passes through much of the database. By using a data mart, you will avoid putting too much of a strain on the computing resources of the data warehouse.

Data warehouse administrators also don’t like the data to be changed. Yet almost certainly you will need to modify the data from the data warehouse, plus perhaps bring in data from outside sources, and you may want to add new fields computed from existing fields. Other people building models from the data warehouse will want to make similar alterations. Creating a separate data mart will avoid these problems.

Even if you don’t have a data warehouse, data mining can increase the profits from your existing customers and greatly improve the efficiency of acquiring new customers. Of course the more complete your data and the better its quality, the better your results will be. But don’t wait for perfection.

3. Explore the data

Before you can build good predictive models, you must understand your data. Start by gathering a variety of numerical summaries (including descriptive statistics such as averages, standard deviations and so forth) and looking at the distribution of the data. You may want to produce cross tabulations (pivot tables) for multi-dimensional data. The goal is to identify the most important fields in predicting an outcome, and determine which derived values may be useful.
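As a rough illustration of the summaries described above, the following sketch computes descriptive statistics for one numeric field and a simple cross tabulation of two fields. It uses only the Python standard library; the customer records, the field layout, and the helper names `summarize` and `cross_tab` are hypothetical and chosen purely for illustration.

```python
import statistics
from collections import Counter

# Hypothetical customer records: (age, region, responded_to_last_mailing)
customers = [
    (34, "North", True),
    (51, "South", False),
    (27, "North", False),
    (45, "South", True),
    (62, "North", True),
    (38, "South", False),
]

def summarize(values):
    """Return basic descriptive statistics for a numeric field."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

def cross_tab(rows, row_index, col_index):
    """Count records for each (row value, column value) pair."""
    return Counter((r[row_index], r[col_index]) for r in rows)

age_summary = summarize([c[0] for c in customers])
region_by_response = cross_tab(customers, 1, 2)

print(age_summary)
for (region, responded), n in sorted(region_by_response.items()):
    print(f"{region:<6} responded={responded!s:<5} count={n}")
```

On real data the same two helpers would be run over many fields in turn, which is where a fast turnaround starts to matter.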

In a data set with hundreds or possibly thousands of columns, exploring the data can be time-consuming and labor-intensive. You need a good interface and a fast computer processor. Quicker turnaround leads to better results, because you can try many different approaches.

Graphing and visualization tools are a vital aid in data preparation, and their importance to effective data analysis cannot be overemphasized. Data visualization most often provides the key to new insights. Some of the common and very useful graphical displays of data are histograms or box plots that display distributions of values. You may also want to look at scatter plots (in two or three dimensions) of different pairs of variables. The usefulness of some graphs may be enhanced if you are able to add a third, overlay variable, or to split a graph into multiple graphs based on the values of a variable.

4. Prepare the data for modeling

This is the final data preparation step before building models. There are four main tasks in this step:

  1. Select variables. Ideally, you feed all your variables into the data mining tool and let it determine which are the best predictors. In practice, this doesn’t work very well. The time it takes to build a model increases with the number of variables. Also, blindly including extraneous columns can lead to incorrect models. You must select the most likely variables, based on your knowledge of the problem domain. For example, you might exclude “identification number” because it has no value as a predictor variable, and may reduce the weight of other important variables. You would check to see that correlated variables (such as “age” and “date of birth”) are not both included.
  2. Select rows. As in the case of selecting variables, you would like to use all the rows you have to build models. If you have a lot of data, however, this may take too long or require buying a bigger computer than you would like. The solution is sampling — that is, to select a random subset of the data. For most business problems, this yields no loss of information. One large company known for its excellent analytics, with millions of customer records, builds most of its models on samples of 50,000 or fewer customers.
    Given a choice of either investigating a few models built on all the data, or investigating more models built on a sample of the data, the latter approach will usually help you develop a more accurate and robust model.
  3. Construct new variables. It is often necessary to construct new predictors derived from the raw data. Certain variables that have little effect alone may need to be combined with others, using various arithmetic or algebraic operations. For example, forecasting credit risk using the ratio of debt to income may produce better results than using debt and income by themselves as predictor variables. In other cases, a variable with a wide range of values may be modified to construct a better predictor, such as using the logarithm of income instead of the actual income value.
  4. Transform variables. The tool you choose may dictate how you represent your data. Neural nets, for instance, require a categorical explosion, in which each categorical variable is expanded into a set of 0/1 indicator variables, one per category. Variables may also be scaled to fall within a limited range, such as 0 to 1. Many decision trees used for classification require continuous data such as income to be grouped in ranges (bins) such as High, Medium, and Low. The choices you make, however (deciding at what point “Medium” becomes “High,” for example), may change the outcome of your model.
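The four tasks above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a description of any particular data mining tool. The records, the field names, and the bin cut points (40,000 and 80,000) are assumptions made up for the example.

```python
import math
import random

# Hypothetical raw records; field names are illustrative only.
records = [
    {"id": 1, "income": 28000.0, "debt": 7000.0, "region": "North"},
    {"id": 2, "income": 54000.0, "debt": 27000.0, "region": "South"},
    {"id": 3, "income": 120000.0, "debt": 12000.0, "region": "West"},
    {"id": 4, "income": 75000.0, "debt": 30000.0, "region": "North"},
]

def sample_rows(rows, n, seed=0):
    """Select rows: draw a random sample without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))

def min_max_scale(values):
    """Transform: rescale values to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def income_bin(income, medium_from=40000.0, high_from=80000.0):
    """Transform: group income into ranges; the cut points are arbitrary."""
    if income >= high_from:
        return "High"
    if income >= medium_from:
        return "Medium"
    return "Low"

def one_hot(value, categories):
    """Transform: expand one categorical value into 0/1 indicator fields."""
    return [1 if value == c else 0 for c in categories]

regions = sorted({r["region"] for r in records})
scaled_income = min_max_scale([r["income"] for r in records])

prepared = []
for r, scaled in zip(records, scaled_income):
    prepared.append({
        # Select variables: "id" is dropped because it has no predictive value.
        "debt_to_income": r["debt"] / r["income"],  # constructed variable
        "log_income": math.log(r["income"]),        # constructed variable
        "income_scaled": scaled,
        "income_bin": income_bin(r["income"]),
        "region_flags": one_hot(r["region"], regions),
    })

subset = sample_rows(prepared, 2)
for row in prepared:
    print(row)
```

Changing `medium_from` or `high_from` moves records between bins, which is one concrete way the binning choice can alter a model built on the binned field.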

5. Build a model

The most important thing to remember about model building is that it is an iterative process. You will need to explore alternative models to find the one that is most useful in solving your business problem. What you learn in searching for a good model may cause you to go back and modify the data you are using, or perhaps even to revise your problem statement.

There are two basic types of predictions. The first type, classification, predicts into what category or class a case falls. For example, how do you choose which of several offers (categories) would be most appealing to a prospective customer (case)? The second type, regression or estimation, predicts a number, such as the value of orders to expect from a particular customer.

The process of building predictive models requires a well-defined training and validation protocol in order to generate the most accurate and robust predictions. This kind of protocol is sometimes called supervised learning. The essence of supervised learning is to train (estimate) your model on a portion of the data, then test and validate it on the remainder of the data.

In a direct mail campaign, your goal may be to select the best targets from your mailing list. You perform a preliminary mailing to a portion of your mailing list, then build a model using the results of that mailing.

There is a risk that the preliminary mailing results will be somehow atypical and not applicable to the mailing list as a whole. To minimize this possibility, set aside a randomly-chosen subset of the data as a test database, and do not use it in the model building and estimation. Dividing the data randomly will help ensure that the training and test data sets are equally good representations of the data being modeled.

After building the model on the training portion of the data, the model is used to predict the classes or values of the rest of the database (test portion). Since you already know the correct answers (the actual responses to your preliminary mailing), you can calculate the accuracy of your predictive model.
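The train-and-test protocol described above might look like the following sketch. The "model" here is deliberately trivial (an estimated response rate per customer segment, with a 0.5 threshold) and stands in for whatever algorithm is actually used. The mailing results are synthetic, and the segment names, rates, and 30% test fraction are assumptions for illustration.

```python
import random

def split_train_test(rows, test_fraction=0.3, seed=42):
    """Randomly divide rows into training and test sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def train_segment_model(train_rows):
    """Estimate the response rate for each segment from the training data."""
    totals, responders = {}, {}
    for segment, responded in train_rows:
        totals[segment] = totals.get(segment, 0) + 1
        responders[segment] = responders.get(segment, 0) + int(responded)
    return {s: responders[s] / totals[s] for s in totals}

def predict(model, segment, threshold=0.5):
    """Predict a response when the segment's estimated rate meets the threshold."""
    return model.get(segment, 0.0) >= threshold

def accuracy(model, rows):
    """Fraction of rows whose predicted class matches the known response."""
    correct = sum(predict(model, seg) == resp for seg, resp in rows)
    return correct / len(rows)

# Synthetic preliminary-mailing results: (segment, responded).
rng = random.Random(0)
true_rates = {"A": 0.8, "B": 0.3, "C": 0.1}
mailing = [(seg, rng.random() < true_rates[seg])
           for seg in rng.choices(list(true_rates), k=300)]

train, test = split_train_test(mailing)
model = train_segment_model(train)      # built on the training portion only
test_accuracy = accuracy(model, test)   # measured on data the model never saw
print(f"train={len(train)} test={len(test)} accuracy={test_accuracy:.2f}")
```

The key point is that `test` plays no part in `train_segment_model`, so the accuracy figure is not inflated by the model having already seen the answers.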

6. Evaluate the model

After building a model, you must evaluate its results and interpret their significance. Remember that the accuracy rate found during testing applies only to the data on which the model was built. In practice, the accuracy may vary if the data to which the model is applied differs significantly from the original data.

More importantly, accuracy by itself is not necessarily the right measure for selecting the best model. You also need to know more about the type of errors, and the costs associated with alternative actions. For example, say the model has a tendency to predict “yes” when the real answer is “no” — you may risk losing a customer whom you have offended by making the wrong assumption about what products would be of interest to him. Another mistake might be to increase revenues by more effectively identifying prospects, but at much greater expense than before, thus resulting in lower profits.
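To make the point about accuracy versus cost concrete, the sketch below compares two hypothetical models on the same ten prospects. The predictions, the revenue per response (50), and the cost per contact (5) are invented for illustration; in practice these figures would come from your own campaign economics.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for yes/no predictions."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["tp"] += 1
        elif p and not a:
            counts["fp"] += 1   # predicted "yes" when the real answer is "no"
        elif a:
            counts["fn"] += 1   # missed a real responder
        else:
            counts["tn"] += 1
    return counts

def accuracy(counts):
    """Share of all predictions that were correct."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())

def net_profit(counts, revenue_per_response, cost_per_contact):
    """Profit from contacting everyone the model predicts as 'yes'."""
    contacted = counts["tp"] + counts["fp"]
    return counts["tp"] * revenue_per_response - contacted * cost_per_contact

# Hypothetical outcomes for ten prospects and two candidate models.
actual  = [True, False, False, True, False, False, False, True, False, False]
model_a = [True, True, True, True, True, False, False, True, False, False]
model_b = [True, False, False, False, False, False, False, False, False, False]

REVENUE_PER_RESPONSE = 50.0  # assumed value
COST_PER_CONTACT = 5.0       # assumed value

counts_a = confusion_counts(actual, model_a)
counts_b = confusion_counts(actual, model_b)
profit_a = net_profit(counts_a, REVENUE_PER_RESPONSE, COST_PER_CONTACT)
profit_b = net_profit(counts_b, REVENUE_PER_RESPONSE, COST_PER_CONTACT)

print(f"A: accuracy={accuracy(counts_a):.1f} profit={profit_a}")
print(f"B: accuracy={accuracy(counts_b):.1f} profit={profit_b}")
```

With these assumed numbers, model B is the more accurate of the two (0.8 versus 0.7) yet model A earns more (120 versus 45), because B misses two of the three real responders. A different cost structure, such as a heavy penalty for each false positive, could reverse that ranking.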

Even the best and most accurate models may not reflect the real world. One reason is that there are always assumptions implicit in the model. Perhaps a model predicting the likelihood that an individual will make a certain purchase did not include the inflation rate as a variable, because inflation was assumed to be fairly constant. If inflation increases dramatically, however, purchasing behavior will likely change in ways that the model cannot predict. Another source of error arises when the data used to build the model turn out not to be representative after all.

Therefore it is important to test a model in the real world. If a model is used to select the “best” names from a mailing list, do a test mailing to verify the selection. If a model is used to predict credit risk, try the model on a small set of applicants before full deployment. The higher the risk associated with an incorrect model, the more important it is to construct an experiment to check the model results.

7. Act on the results

Once a data mining model has been built and validated, it can be used as a general guideline for action or it can be applied as a batch process. As an example of the first kind of use, a business analyst may review the classification rules produced by a model, and use those rules to select a mailing list or to identify credit risks.

The second way is to automatically apply the model to different data sets. The model could flag records based on their classification; assign a score such as the probability of responding to a direct mail solicitation; or select some records from the database and subject these to further analyses with an OLAP tool.
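A batch application of this kind could be sketched as follows. The model is represented here as a simple lookup of response rates by segment, which is an assumption made for brevity; the record layout, the 0.5 flag threshold, and the function names are likewise hypothetical.

```python
def score_record(record, segment_rates, default_rate=0.0):
    """Assign a score: the estimated probability of responding."""
    return segment_rates.get(record["segment"], default_rate)

def apply_model(records, segment_rates, flag_threshold=0.5):
    """Score every record and flag those at or above the threshold."""
    scored = []
    for record in records:
        score = score_record(record, segment_rates)
        scored.append({**record, "score": score, "flag": score >= flag_threshold})
    return scored

def select_for_review(scored, top_n):
    """Select the highest-scoring records for further analysis."""
    return sorted(scored, key=lambda r: r["score"], reverse=True)[:top_n]

# Hypothetical model output and a new data set to score.
segment_rates = {"A": 0.8, "B": 0.3, "C": 0.1}
new_records = [
    {"id": 101, "segment": "A"},
    {"id": 102, "segment": "C"},
    {"id": 103, "segment": "B"},
    {"id": 104, "segment": "Z"},  # segment unseen by the model
]

scored = apply_model(new_records, segment_rates)
review = select_for_review(scored, top_n=2)

for row in scored:
    print(row)
```

The same three outputs named above appear here: a flag per record, a numeric score, and a selected subset that could be handed to another tool.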

Often the models are incorporated into a business application such as risk analysis, credit authorization or fraud detection. For instance, a predictive model may be integrated into a contact management application that tells an Internet application which offer to make. Or a model might be embedded in an inventory ordering system that automatically generates an order when the forecast inventory levels drop below a threshold.

The data mining model is often applied to one event or transaction at a time. The amount of time to process each new transaction, and the rate at which new transactions arrive, will determine whether a parallelized algorithm is needed. Monitoring credit card transactions or cellular telephone calls for fraud would require a parallel system to deal with the high transaction rate. By contrast, loan applications can easily be scored for risk on modest-sized computers.
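A back-of-the-envelope check of whether parallel processing is needed multiplies the arrival rate by the processing time per transaction. The sketch below does only that; it ignores bursts, queueing delays, and safety margins, and the transaction rates and timings are invented for illustration.

```python
import math

def workers_needed(arrivals_per_second, ms_per_transaction):
    """Minimum number of parallel workers to keep up with the arrival rate.

    The product of arrival rate and processing time is the average number
    of transactions in progress at once; round it up to whole workers.
    """
    load = arrivals_per_second * ms_per_transaction / 1000
    return max(1, math.ceil(load))

# Assumed figures: a high-volume fraud monitor versus loan scoring.
fraud_workers = workers_needed(arrivals_per_second=2000, ms_per_transaction=5)
loan_workers = workers_needed(arrivals_per_second=0.01, ms_per_transaction=2000)

print(f"fraud monitoring: {fraud_workers} workers")
print(f"loan scoring: {loan_workers} worker")
```

Under these assumptions the fraud monitor needs about ten workers running in parallel, while loan scoring fits comfortably on a single one even though each loan takes far longer to score.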

To summarize: Data mining offers great promise in helping organizations uncover patterns hidden in their data that can be used to predict the behavior of customers, products and processes. However, data mining tools need to be guided by users who understand the business, the data, and the general nature of the analytical methods involved. Realistic expectations can yield rewarding results across a wide range of applications, from improving revenues to reducing costs.