Data Tech: The Case for Database Marketing

One would think that the catalog industry would be awash with database marketing success stories. But co-ops notwithstanding, few of the major database marketing service bureaus have a significant roster of catalog clients. Those that do serve one or two dozen mailers at the most. And the percentage of catalogers pursuing database marketing inhouse appears to be equally small. So why are most catalogers only dabbling in database?

Most likely, many mailers believe that database marketing is more trouble than it’s worth. For one thing, as the name implies, you must start out with a marketing database — a daunting hurdle if your transaction system doesn’t readily support the routine extraction of relevant data from your customer and order files.

But the majority of the commercially available catalog order processing systems do support such extractions of data, though you will need to gather the information from the system on a regular basis — usually no more than once a week.

Getting the data you need may also require setting up extractions from multiple files, such as order fulfillment, and customer records. You must also determine how to structure the external database to optimize queries, since you are building a “data warehouse” rather than a contact database. For instance, you may assemble and store product purchase categories in a single file, or you may tally purchases in dollar-bracket categories rather than in exact amounts for easier reference and analysis.

For certain, the lack of a database foundation “is a significant barrier” for catalogers that want to conduct database marketing, says David Hay, director of database marketing at Madison, VA-based Plow & Hearth, a cataloger of home and gift items. “The ultimate value in database marketing comes from the statistical modeling you can do with your customer data, but you can’t get to that stage without a solid database to work from.”

Case study: Plow & Hearth

Before Plow & Hearth set up its database marketing program more than a decade ago, Hay recalls, “it was almost impossible to do any sort of interactive querying on our transaction system,” a legacy UNIX application called MOSIS. By exporting transaction data to a simple database platform, Plow & Hearth was able to tap a wealth of information about customer activity, from response by book by product category to recency/frequency/monetary value (RFM) analysis.

When Plow & Hearth decided to create a marketing database in the early 1990s, it selected FoxPro as the tool on which to build. Though it is not a highly scalable system, FoxPro was inexpensive (less than $5,000 for user licenses) and easy to use and maintain. Plow & Hearth continues to use the FoxPro platform to this day to manage a database of some 3 million names. Says Hay, “You don’t need IT expertise to manipulate the data,” which allows employees in the marketing and circulation departments to undertake database queries as often as they need to.

Plow & Hearth designed the database system inhouse, but it contracted the programming to a local third-party FoxPro specialist. Even so, it took more than a year to get the system fine-tuned. “We started out with first order/last order/life-to-date dollar totals,” says Hay, “and as we went along we added more transaction detail data and calculated fields like ‘average dollars per month’ to make the database more useful.”

From the outset, Plow & Hearth intended to engage in sophisticated database modeling activities to segment the customer file, forecast response, and design more effective circulation strategies. Modeling also helped the company in 1999 launch a successful spin-off, Plow & Hearth Country Home, by identifying target customers for the mailing.

The cataloger selected modeling tool ModelMAX from Pittsburgh-based Advanced Software Applications, since Plow & Hearth has used the company’s database analysis software since 1994. Bundling several statistical methods — regression, neural networks, and CHAID — into an automated desktop solution, ModelMAX supports both descriptive and predictive modeling. Using a Windows interface, the system lets you create and save two- or three-dimensional cross-tabs. An SQL Query Builder automates database queries, and you can even create macros (stored procedures) that allow you to refresh models with a mouse click.

A separate module imports data from any Open Database Connectivity (ODBC)-compliant database, SAS files, spreadsheets, or even flat ASCII and ISAM files, while providing a method to export model scores back to your marketing database.

Finally, ModelMAX offers a Workbench module that manages the creation of “frozen” history files for comparative analysis. These allow you to re-create the status of the business at previous points in time for comparative purposes. The module also combines data from multiple databases (including e-commerce logs) and supports a long list of predefined algorithms for counts and statistical analysis as well as templates and job-activity management and tracking.

ModelMAX has helped Plow & Hearth mail more profitably to its house file. “We build nine or 10 models per season,” Hay says, “which helps us to identify pockets of nonresponders and trim them off some of our mailings. We are also more consistent now in the way we do segmentation season by season and year by year.” And with the customer file divided into 40-50 segments — including those predicated on product purchase history — Plow & Hearth can generate much more accurate forecasts for each mailing.

What about RFM?

Many companies that tried to implement a database marketing program have come to the conclusion that simple RFM analysis works just as well, if not better, in determining whom to mail and how often.

If your database has no more than a few hundred thousand names or your product line has only three or four categories, RFM may indeed be more cost-effective than statistical modeling. But RFM analysis is in itself a form of modeling, and it is encumbered with a basic shortcoming that you can’t get around.

In an RFM model, you create cells of customers based on how recently they’ve bought, how frequently they’ve bought, and how much they’ve spent. The challenge is determining how many cells to create. How many periods of recency and frequency and which aggregates of monetary value do you account for?

RFM modeling also tends to skew or ignore issues of seasonality and pays no attention whatsoever to response rates, number of times mailed, or large order values that don’t reflect the customer’s total dollars spent. And almost by definition, RFM leads to smaller, albeit more profitable, mailings, which ultimately shrinks the house file.

Not that statistical modeling or profiling is a panacea. For while profiling your house file can lead to more-targeted test-mailings, these can be frustratingly nonproductive; after all, a profile alone is not a good predictor of behavior. A profile assumes unknowable intangibles, such as “a sense of fun” or a propensity for bargain-hunting, which may have brought buyers onto your house file to begin with.

That’s why so many catalogers participate in co-op databases, in which cross-profiling can put a large pool of prime prospects right at your fingertips. But even co-op databases yield diminishing returns. In our next column, we’ll discuss getting past these obstacles.


Ernie Schell is president of systems consultancy Marketing Systems Analysis in Southampton, PA.

Descriptive vs. Predictive Modeling

Catalogers have two basic objectives for statistical modeling: determining the relevant characteristics for managing customer segmentation (in other words, “descriptive” modeling) and scoring database characteristics to forecast which ones will determine response to a particular promotion and what that response will be (“predictive” modeling).

Both types of modeling involve statistical algorithms and computations. But neither descriptive nor segmentation modeling has a “dependent variable” such as response rate or average order size.

Descriptive analysis focuses on which data elements are statistically linked to which other elements to create valid customer groupings. You could overlay demographic data to these segments and still stay within the boundaries of descriptive modeling.

But if you start with RFM and add product category purchase history to determine which RFM cells, if any, are associated with purchases of which product types, you are predictive modeling. You might extend the analysis to how these segments responded to a particular promotion, then complete the circle by scoring the database using that response analysis to determine the statistical probability of each name or segment having a similar response to future similar promotions.
ES