Welcome to a new quarterly column by Jim Wheaton, cofounder/principal of Wheaton Group, a Chicago-based data management, data mining, and decision sciences practice that also offers list processing capabilities and campaign management software through its Daystar Wheaton Group joint venture. Wheaton defines data mining as “all the analytical methods available to transform data into insight.” In each column he’ll discuss how to make the most of those methods.
Myriad methods exist to turn data into actionable insight. Examples include statistics-based predictive models, homogeneous groupings (also known as clusters), cohort analyses such as lifetime value, quantitative approaches to optimizing contact strategies across multiple channels, and the creation of report packages and key-metrics dashboards.
But we won’t spend a lot of time comparing predictive modeling techniques and software packages. Much has been written, for example, about the merits of regression vs. neural networks. Having participated in countless model builds, I know firsthand that technique plays only a secondary role in the success or failure of a predictive model.
Discussions about modeling techniques have always reminded me of the theological debate that took place many centuries ago about how many angels can dance on the head of a pin. Today’s data miners are fixated on their own pins and angels when they wrangle about techniques!
A byproduct of this wrangling is the set of fantastic claims made by proponents of some of these techniques. Unfortunately such claims are pabulum for the gullible. The inconvenient truth, to borrow a phrase from a prominent national politician, is that technique has very little impact on results. There is only so much variance in the data, and the stark reality is that new techniques are not going to drastically improve the power of predictive models.
Instead, this column will focus on the truly important issues — namely, just about everything else having to do with data mining. This month, for example, we’ll look at the significant improvements that are possible for optimizing the raw inputs to the data mining process. The ultimate goal is to perform data mining off a platform that we at Wheaton Group refer to as “best-practices marketing database content.” This, in turn, supports deep insight into the behavior patterns that form the foundation for data-driven decision-making.
For starters, best-practices database content provides a consolidated view of all customers and inquirers across all channels — direct mail, e-commerce, brick-and-mortar retail, telesales, field sales. Sometimes — particularly in business-to-business and business-to-institution environments — prospects are included.
This content is as robust as the underlying methods of data collection are capable of supporting. The complete history of transactional detail must be captured. Everything within reason must be kept, even if its value is not immediately apparent.
For example, one multichannel marketer failed to forward noncash transactions from its brick-and-mortar operation to its marketing database. This became a problem when a test was done to determine the effectiveness of coupons sent to customers, which were good for free samples of selected merchandise. The goal was to determine whether these coupons would economically stimulate store traffic. But because the coupon transactions did not involve cash, there was no way to mine the database for insights into which customers had taken advantage of the offer and what the corresponding effect was on long-term demand.
There are 10 commandments that, if followed, will ensure best-practices marketing database content. Five are discussed this month, and the balance will be covered in the next column.
Number 1: The data must be maintained at the atomic level. All customer events, such as the purchase of products and services, must be maintained at the lowest feasible level. This is vital because, although you can always aggregate, you can never disaggregate. Robust event detail provides the necessary input for seminal data mining exercises such as product affinity analysis.
Avoid data “buckets” and other such accumulations of data. This is particularly important for businesses that are rapidly expanding, where it can be impossible to audit and maintain summary data approaches across ever-increasing numbers of divisions.
One firm learned the hard way about the need to maintain atomic-level detail when it discovered that its aggregated merchandise data did not support deep-dive product affinity analysis. This is because, by definition, it was impossible to understand purchase patterns within each aggregated merchandise category. For example, with no detail beyond “jewelry,” there was no way to identify patterns across subcategories such as watches, fine/fashion merchandise, bridal diamonds, fashion diamonds, pearls/stones, accessories, and loose goods.
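The asymmetry behind this commandment can be sketched in a few lines. The snippet below uses hypothetical item-level transactions (all customer IDs and category names are illustrative, not from the source): rolling atomic detail up to a category total is trivial, but a subcategory affinity question can only be answered because the atomic rows were kept in the first place.

```python
from collections import Counter

# Hypothetical atomic-level transactions: one row per item purchased,
# with subcategory detail retained.
transactions = [
    {"customer": "C001", "category": "jewelry", "subcategory": "watches"},
    {"customer": "C001", "category": "jewelry", "subcategory": "bridal diamonds"},
    {"customer": "C002", "category": "jewelry", "subcategory": "pearls/stones"},
    {"customer": "C002", "category": "jewelry", "subcategory": "watches"},
]

# Aggregating upward is always possible...
by_category = Counter(t["category"] for t in transactions)
print(by_category)  # Counter({'jewelry': 4})

# ...but the reverse is not. Subcategory affinity analysis works here
# only because the atomic detail was kept; a database holding nothing
# beyond "jewelry" totals could never reconstruct this.
by_subcategory = Counter(t["subcategory"] for t in transactions)
print(by_subcategory.most_common(1))  # [('watches', 2)]
```

The design point is the one the commandment makes: you can always aggregate, but you can never disaggregate.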
Number 2: The data must not be archived or deleted except under rare circumstances. Ideally you should retain even ancient data because you never know when you might need them. The elimination of older data is perhaps the most common shortcoming of today’s marketing databases — an ironic development given that disk space is cheaper today than it was 10 or 20 years ago.
Data mining can be severely hampered when the data do not extend significantly back in time. One database marketing firm experienced this when it tried to build a model to predict which customers would respond to a holiday promotion. Unfortunately all data content older than 36 months was rolled off the database on a regular basis; it wasn’t even archived. So the database reflected only three years of history for a customer who had been purchasing for 10 years.
The only way to build the holiday model, of course, was to go back to the previous holiday promotion. This reduced to 24 months the historical data available to drive the model. More problematic was the need to validate the model off another holiday promotion, the most recent of which had, by definition, taken place two years earlier. This, in turn, reduced to 12 months the amount of available data. As you can imagine, the resulting model was far from optimal in its effectiveness!
Number 3: The data must be time-stamped. The use of time-stamped data to describe orders, items, promotions, and the like facilitates an understanding of the sequence of progression for customers who have been cross-sold. This is also true if customers are found to have purchased across multiple divisions during the incorporation of acquired companies. Corresponding data mining applications include product affinity analysis and next-most-likely-purchase modeling.
Number 4: The semantics of the data must be consistent and accurate. Descriptive information on products and services must be easily identifiable over time despite any changes that might have taken place in naming conventions. Consider how untenable analysis would be if the data semantics were so inconsistent that “item number 1956” referenced a type of necktie several years ago but an umbrella now. Also, the reconciliation of different product and service coding schemes must be appropriate to the data-driven marketing needs of the overall business, not merely to the individual divisions.
Number 5: The data must not be overwritten. Deep-dive data mining is predicated upon the re-creation of past-point-in-time “views.” For example, a model to predict who is most likely to respond to a summer clearance offer will be based on the historical information available at the time of an earlier summer clearance promotion. The re-creation of point-in-time views is problematic when data are overwritten.
A major financial institution learned this in conjunction with a comprehensive database that it built to facilitate prospecting. After months of work, the prospect database was ready to launch. The internal sponsors of the project, anxious to display immediate payback to senior management, convened a two-day summit meeting to develop a comprehensive, data-driven strategy.
One hour into the meeting, the brainstorming came to an abrupt and premature end. The technical folks, in their quest for processing efficiency, had not included in the database a running history of several fields critical to the execution of any data mining work. Instead, the values in these fields had been overwritten during each update cycle.
The incorporation of this running history necessitated a redesign of the prospect database. The unfortunate result was a two-month delay, a loss of credibility in the eyes of senior management, and a substantial decline in momentum.
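The running history that was missing can be sketched as an append-only table: each update adds a dated row rather than replacing the prior value, so any past point-in-time view can be re-created. The field, customer, and dates below are hypothetical, chosen only to illustrate the pattern.

```python
from datetime import date

# Hypothetical append-only history: each update cycle adds a dated
# row instead of overwriting the prior value.
credit_limit_history = [
    {"customer": "C001", "as_of": date(2005, 1, 1), "credit_limit": 5000},
    {"customer": "C001", "as_of": date(2005, 7, 1), "credit_limit": 8000},
    {"customer": "C001", "as_of": date(2006, 2, 1), "credit_limit": 12000},
]

def value_as_of(history, customer, view_date):
    """Re-create a past point-in-time view: the latest value
    recorded on or before the requested date."""
    rows = [r for r in history
            if r["customer"] == customer and r["as_of"] <= view_date]
    return max(rows, key=lambda r: r["as_of"])["credit_limit"] if rows else None

# The view a model built off an August 2005 promotion would need:
print(value_as_of(credit_limit_history, "C001", date(2005, 8, 15)))  # 8000
```

Had each update cycle overwritten the single stored value, only the current 12000 would survive, and the August 2005 view would be unrecoverable.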
My next column, scheduled for the May issue, will focus on the final five commandments of best-practices marketing database content. In the meantime, consider whether your marketing database violates any of the first five commandments. The extent to which it does is the extent to which your firm’s revenue and profits are being artificially limited.