Data Analysis is No Game For Amateurs

Jul 22, 2008 12:05 AM  By

What sorts of hands-on data investigation should marketers do themselves, and what should they leave to trained data miners?

Today’s database access tools provide marketers with the hands-on ability to access, manipulate and draw conclusions from their own data. Unfortunately, these access tools also have great potential for misuse when they fall into the hands of individuals with sub-par analytical training.

False “insight” is worse than no insight at all. Therefore, wise marketers know the difference between the data investigation they can do themselves and the data investigation they should do themselves.

One rule of thumb is that marketers should think twice before tackling data investigation that requires the definition, creation, manipulation and comparison of multiple past-point-in-time views.

The following two analytical questions posed by a hypothetical office products retailer provide some clarity:

· How many customers on my database have purchased laser printer cartridges over the past twelve months?
· What is the rate of cartridge consumption subsequent to the purchase of a laser printer, how has the rate changed over the years, and is this rate affected by key variables such as printer price point and model, and customer type (e.g., B2C versus B2B)?

The first analytical question is a straightforward database query that has nothing to do with multiple past-point-in-time views. The grist for the query is the current marketing database “view”; that is, how each customer “looks” as of the most recent database update.
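To make the contrast concrete, here is a minimal sketch of the first question in Python. The transaction rows, the category labels and the function name count_recent_buyers are all hypothetical, invented purely for illustration; a real marketing database would have its own layout. The point is only that the query needs nothing more than the current view and a date cutoff.

```python
from datetime import date, timedelta

# Hypothetical transaction rows: (customer_id, product_category, purchase_date).
transactions = [
    ("C1", "laser_cartridge", date(2008, 3, 14)),
    ("C1", "laser_cartridge", date(2008, 6, 2)),
    ("C2", "laser_printer", date(2008, 1, 20)),
    ("C3", "laser_cartridge", date(2006, 11, 5)),
    ("C4", "laser_cartridge", date(2007, 9, 30)),
]


def count_recent_buyers(rows, category, as_of, window_days=365):
    """Count distinct customers who bought `category` in the window ending at `as_of`."""
    cutoff = as_of - timedelta(days=window_days)
    buyers = {
        customer
        for customer, purchased_category, when in rows
        if purchased_category == category and cutoff < when <= as_of
    }
    return len(buyers)


recent_cartridge_buyers = count_recent_buyers(
    transactions, "laser_cartridge", as_of=date(2008, 7, 22)
)
print(recent_cartridge_buyers)
```

Customers are counted once no matter how many cartridges they bought, and the twelve-month window is approximated here as 365 days.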

The second question involves multiple past-point-in-time views. Over-time cohorts must be defined, created, manipulated and compared. Any number of marketing database content issues must be controlled for, such as differences in the rate of data capture, both over time and by merchandise type. For example, perhaps the capture rate of cartridge purchases has increased over time compared with printer purchases.
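One way to picture a past-point-in-time view is as the database rebuilt to look the way it would have looked on an earlier date. The sketch below assumes a hypothetical purchase log and a hypothetical helper, view_as_of, that ignores everything recorded after the chosen date. It illustrates the idea only; it does not address the data capture issues described above, which need separate handling.

```python
from datetime import date

# Hypothetical purchase log: (customer_id, product_category, purchase_date).
purchases = [
    ("C1", "laser_printer", date(2006, 1, 15)),
    ("C1", "laser_cartridge", date(2006, 4, 1)),
    ("C1", "laser_cartridge", date(2006, 9, 1)),
    ("C2", "laser_printer", date(2007, 2, 10)),
    ("C2", "laser_cartridge", date(2007, 8, 20)),
    ("C2", "laser_cartridge", date(2008, 5, 5)),
]


def view_as_of(log, as_of):
    """Rebuild each customer's summary using only purchases on or before `as_of`."""
    view = {}
    for customer, category, when in log:
        if when > as_of:
            continue  # this purchase had not happened yet at the view date
        summary = view.setdefault(
            customer, {"first_purchase": when, "cartridges": 0}
        )
        summary["first_purchase"] = min(summary["first_purchase"], when)
        if category == "laser_cartridge":
            summary["cartridges"] += 1
    return view


# Two past views of the same log, ready to be compared with each other.
view_2006 = view_as_of(purchases, date(2006, 12, 31))
view_2007 = view_as_of(purchases, date(2007, 12, 31))
```

In this toy data, customer C2 does not exist in the end-of-2006 view and has only one cartridge purchase in the end-of-2007 view. Comparing such views correctly is where the analytical training comes in.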

Retailers know all too well the many ways that this sort of anomaly can creep into a marketing database. The following is a cautionary tale of the perils of powerful access tools in the hands of analytically under-trained marketers:

Several years ago, a direct agency employed a leading database access tool to perform extensive analysis for one of its clients.

Unfortunately, the analysis was deeply flawed. One “crisis” highlighted for the client was a dramatic decline in the percentage of customers who were ordering a second time.

But this was an artifact of failing to control for time-on-file. The rate of second-purchase is always lower for customers who are relatively new-to-file because they have had less time to reorder.
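The effect is easy to reproduce with made-up numbers. The sketch below is not the agency's analysis; the order histories, the 180-day window and the function second_purchase_rates are hypothetical choices made for illustration. It compares a naive "ever reordered" rate by year of first order with a rate that gives every customer the same amount of time to reorder and leaves out customers who have not yet been on file that long.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical order histories: customer_id -> order dates.
orders = {
    "A": [date(2006, 2, 1), date(2006, 5, 1)],
    "B": [date(2006, 3, 1), date(2007, 6, 1)],  # reordered, but after 180 days
    "C": [date(2007, 1, 10), date(2007, 3, 1)],
    "D": [date(2007, 4, 1)],
    "E": [date(2008, 6, 1)],  # on file less than 180 days
    "F": [date(2008, 7, 1)],  # on file less than 180 days
    "G": [date(2008, 1, 5), date(2008, 2, 20)],
    "H": [date(2008, 1, 15)],
}


def second_purchase_rates(histories, as_of, window_days=None):
    """Second-purchase rate by year of first order.

    window_days=None gives the naive "ever reordered" rate. Otherwise a
    customer counts only if on file at least `window_days` as of `as_of`,
    and only reorders within `window_days` of the first order are credited.
    """
    totals = defaultdict(int)
    repeaters = defaultdict(int)
    for dates in histories.values():
        dates = sorted(dates)
        first = dates[0]
        if window_days is None:
            eligible = True
            reordered = len(dates) > 1
        else:
            window = timedelta(days=window_days)
            eligible = as_of - first >= window
            reordered = len(dates) > 1 and dates[1] - first <= window
        if eligible:
            totals[first.year] += 1
            repeaters[first.year] += int(reordered)
    return {year: repeaters[year] / totals[year] for year in sorted(totals)}


as_of = date(2008, 7, 22)
naive = second_purchase_rates(orders, as_of)
controlled = second_purchase_rates(orders, as_of, window_days=180)
print(naive)       # {2006: 1.0, 2007: 0.5, 2008: 0.25}
print(controlled)  # {2006: 0.5, 2007: 0.5, 2008: 0.5}
```

In this toy data the naive rate appears to collapse from 100% to 25%, while the rate measured over an equal 180-day window is flat at 50%. The apparent "crisis" comes entirely from newer customers having had less time to reorder.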

In other words, the agency had failed its client because it had failed to properly handle multiple past-point-in-time views.

Jim Wheaton is a co-founder of Daystar Wheaton Group.