Realities of Datamining

Jan 01, 1999 10:30 PM

That’s where datamining comes in. Put simply, datamining is an automated way to extract previously unknown information from a database. To be more specific, datamining uses a number of techniques, as explained here:

Technique        What it does

With datamining, you don’t need to pick just one of these techniques; instead, you can select a datamining suite, a software package that combines several of these techniques (see “Suite suggestions,” below). The market leaders are IBM Intelligent Data Miner, SAS Enterprise Miner, and SPSS. Links to the websites for most of these tools are available at www.kdnuggets.com.

Yet while the suites are meant to work with minimal human intervention, you still have to supply the starting point for the software. You need to determine, for instance, how you want the software to measure customer satisfaction and how best to compute profitability. You also need to determine whether market share is more important than net income, and the criteria for declaring a customer inactive.

Once you’ve set the parameters, the software will select a measurement system to evaluate the potential models that it develops. For a specific objective, such as improving retention in a certain segment of your database, the measurement system will involve some form of lift, or gains, analysis, which quantifies how much better one model predicts the objective than another.
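As a rough illustration of lift analysis, the sketch below ranks customers by each model's score, takes the top-scoring slice, and compares the response rate in that slice with the overall rate. The function name `lift_at`, the scores, and the outcomes are all invented for illustration; they are not taken from any of the suites named in this article, which may compute lift differently.

```python
def lift_at(scores, outcomes, fraction=0.2):
    """Lift of the top-scoring `fraction` of records over the overall rate.

    Hypothetical helper: `scores` are model scores, `outcomes` are 1/0 flags
    for whether the objective (for example, retention) actually occurred.
    """
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and equal in length")
    # Rank records from highest to lowest score.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    overall_rate = sum(outcomes) / len(outcomes)
    if overall_rate == 0:
        raise ValueError("no positive outcomes, so lift is undefined")
    return top_rate / overall_rate


# Made-up data: 1 = customer retained, 0 = customer lost.
outcomes = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
model_a = [0.9, 0.2, 0.8, 0.3, 0.1, 0.7, 0.4, 0.2, 0.3, 0.6]
model_b = [0.5, 0.9, 0.4, 0.8, 0.3, 0.6, 0.2, 0.7, 0.1, 0.5]

lift_a = lift_at(model_a, outcomes, fraction=0.2)
lift_b = lift_at(model_b, outcomes, fraction=0.2)
print(f"Model A lift: {lift_a:.2f}, Model B lift: {lift_b:.2f}")
```

A lift above 1 means the model's top-ranked names respond at a higher rate than the file as a whole; comparing the two numbers is one simple way to say that one model "does a better job" than another on this objective.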

Models used to understand general business objectives, such as defining market segments served, will also use a form of gains analysis that simultaneously considers all the attributes you think important. Let’s say you want to segment your credit users into regular users, seasonal users, infrequent users, and inactives. A good clustering technique will indicate how often an infrequent credit user is misidentified as a seasonal user. The marketing implication is important here, since seasonal credit users are good prospects for skip-a-payment holiday promotions, while infrequent users, who tend to rely on credit to help them through significant life events such as moving, marriage, and divorce, are not.
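To make the misidentification idea concrete, here is a minimal sketch that counts how often customers known to be in one segment were assigned to another. It assumes you have a small hand-checked sample with trusted segment labels to compare against; the function name `misidentification_rate` and the sample data are hypothetical.

```python
from collections import Counter


def misidentification_rate(actual, assigned, true_segment, wrong_segment):
    """Share of customers truly in `true_segment` that were assigned to `wrong_segment`."""
    if len(actual) != len(assigned):
        raise ValueError("actual and assigned must be the same length")
    # Count every (true segment, assigned segment) pair.
    pairs = Counter(zip(actual, assigned))
    total = sum(count for (truth, _), count in pairs.items() if truth == true_segment)
    if total == 0:
        raise ValueError(f"no customers in segment {true_segment!r}")
    return pairs[(true_segment, wrong_segment)] / total


# Made-up, hand-checked sample of eight credit users.
actual = ["regular", "seasonal", "infrequent", "infrequent",
          "inactive", "infrequent", "seasonal", "infrequent"]
assigned = ["regular", "seasonal", "seasonal", "infrequent",
            "inactive", "infrequent", "seasonal", "infrequent"]

rate = misidentification_rate(actual, assigned, "infrequent", "seasonal")
print(f"Infrequent users labeled as seasonal: {rate:.0%}")
```

In this invented sample, one of the four infrequent users lands in the seasonal segment, so a skip-a-payment mailing to the seasonal group would reach a quarter of the infrequent users by mistake.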

You generally don’t need to see details of the measurement calculations, but you do need to ensure that the system produces performance measures that relate directly to your business objectives. For instance, if you want to enroll customers in a rewards program to increase revenue, you’ll want a liberal measurement system that includes borderline names. But if you’re managing credit risks, you’ll want a conservative measurement system that errs on the side of caution, eliminating borderline individuals.
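The difference between a liberal and a conservative measurement system can be pictured as a score cutoff. The sketch below is only an illustration under that assumption: the names, scores, and cutoff values are invented, and a real suite may express the trade-off in other ways.

```python
def select_names(scored_customers, cutoff):
    """Return the names whose model score meets or exceeds `cutoff`."""
    return [name for name, score in scored_customers if score >= cutoff]


# Made-up model scores between 0 and 1 (higher = better prospect).
scored = [("Avery", 0.91), ("Blake", 0.62), ("Casey", 0.48), ("Devon", 0.33)]

REWARDS_CUTOFF = 0.45  # liberal: borderline names are included
CREDIT_CUTOFF = 0.80   # conservative: borderline names are excluded

rewards_list = select_names(scored, REWARDS_CUTOFF)
credit_list = select_names(scored, CREDIT_CUTOFF)
print("Rewards program:", rewards_list)
print("Credit offer:", credit_list)
```

The same scored file yields a longer list for the rewards program and a shorter one for the credit decision; where to set each cutoff is a business judgment, not something the software decides for you.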

The human factor

There’s no doubt that datamining can improve marketing results. Still, it’s important to keep its abilities in perspective. While these tools can produce significant improvements in predicting scientific physical processes (the lifetime of a light bulb, for instance), when it comes to human behavior, datamining’s success depends on human interpretation of its results and the ability to maintain realistic expectations.

Everyone knows that behavioral processes are next to impossible to predict with a high degree of accuracy. Emotions make it possible for a person to make a decision for no obvious reason and then to change his mind seconds later. It’s tough working in a business where a mere matter of seconds can make the difference between someone responding to a catalog and not responding.

With this in mind, you must realize that datamining is not a panacea. You need to weigh your expectations of datamining against what you know about human nature, particularly its unpredictability. We simply don’t know enough about the many factors that govern human behavior, and we can’t learn them from new datamining technologies.

Yes, datamining can tell us whether a prospect is the right age, ethnic group, or income level for a particular offer; it can correlate a prospect’s most recent purchase behavior with that of other prospects and predict the next purchase cycle; it can indicate the type of purchase a customer might make next and how much he might spend. Indeed, datamining can help guide you in terms of the type and timing of offers, but it won’t necessarily improve the quality of response predictions.

As much as we would like them to, datamining tools and techniques can’t tell us whether a prospect has had a bad day at work, has had a fight with his spouse, or is feeling depressed, ill, angry, tired, or stressed. But datamining will make it easier for you to build a large number of models quickly. How you use these models, and how you interpret the results of the subsequent data analysis they perform, is up to you.

Suite suggestions

You can no longer use a lack of datamining tools as an excuse for not implementing advanced database techniques. Below are some of the more popular commercial datamining software packages. Links to the websites for most of these tools are available at www.kdnuggets.com.

Darwin

DataCruncher

Data Detective

Data Engine 2.1

Datamind

Data Miner Software Kit

Data Sage

Decision Centre

Delta Miner

Gainsmarts

Hyperparallel//Discovery

IBM Intelligent Data Miner

IDIS Data Mining Suite

Inspect

Kepler

Knowledge Shadow

Kwiz

Magnify Pattern

NeoVista

Nuggets

Orchestrate

Pilot Discovery Server

Polyanalyst 3.3

Polyanalyst Lite

SAS Enterprise Miner

SGI Mineset

SPSS

Zoom ‘n View