The Best and Worst Ways to Analyze Zip Code

Oct 28, 2008 12:46 AM

The distance a customer lives from a store is generally predictive of future purchase behavior. Typically, close proximity is indicative of higher revenue. And, distant customers who are loyal often are stocker-uppers—they shop less frequently than other loyal customers but purchase more per visit.

Distance can be calculated many ways, and accuracy varies by methodology. The most accurate are rooftop-to-rooftop calculations, some of which take into account roadways and natural barriers such as lakes and rivers. Somewhat less accurate are methods based on small-area geographic units such as zip+4s, census block groups and carrier routes. In these methods, what is calculated is the distance between the centers (centroids) of the geographic units that contain the customer address and the store address.

The least accurate – and all-too-common – way to calculate distance is based on zip code. This is problematic in many ways, particularly in rural areas, where individual zip codes span many miles. Frequently, for example, a customer will live within the same zip code as the store. In these instances, the zip-level distance-to-store will be 0, which of course has no relation to reality.
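To make the same-zip problem concrete, here is a minimal sketch that compares a zip-level distance with a rooftop-level distance using the standard haversine (great-circle) formula. The coordinates are invented for illustration, and the function and variable names are hypothetical. Straight-line distance also ignores roadways and natural barriers, so this is a simplification of the rooftop methods described above.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Hypothetical coordinates, invented for illustration only.
zip_centroid = (44.00, -100.00)      # centroid of one large rural zip code
store_rooftop = (44.05, -100.10)     # store address inside that zip
customer_rooftop = (43.90, -99.85)   # customer address inside the same zip

# Zip-level method: store and customer both collapse to the same centroid.
zip_distance = haversine_miles(*zip_centroid, *zip_centroid)

# Rooftop-level method: use the two actual address locations.
rooftop_distance = haversine_miles(*store_rooftop, *customer_rooftop)

print(f"zip-level distance:     {zip_distance:.1f} miles")
print(f"rooftop-level distance: {rooftop_distance:.1f} miles")
```

With these made-up points the zip-level value is exactly 0, while the rooftop value is well over 10 miles, which is the kind of gap the rural example describes.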

Consider what will happen if the distance field in a marketing database is changed from zip to rooftop. This will be a significant improvement in accuracy. However, the distance value for essentially all customers will change, and often quite dramatically. For example, a rural customer within the same zip as the store might have his or her field value increased from 0 to 15 miles.

Suppose a predictive model includes distance-to-store as one of its predictor variables. The model was built on the old zip-level values, but the field values for the scored customers will have changed, so the predictive model will be compromised. In fact, the targeting accuracy of the model might very well be destroyed. Deploying such a model unchanged on the improved marketing database can result in a serious financial setback.
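The effect on a single customer's score can be sketched as follows. This is an illustration under stated assumptions, not any particular production model: the logistic form, the intercept and the distance coefficient are all hypothetical, chosen only to show how a score moves when the same customer's distance field jumps from 0 to 15 miles.

```python
import math

# Hypothetical coefficients, standing in for a model that was fit
# while the distance field held zip-level values.
INTERCEPT = -1.0
DISTANCE_COEF = -0.08  # assumed change in log-odds per mile


def response_score(distance_miles):
    """Logistic score: predicted response probability for a given distance."""
    z = INTERCEPT + DISTANCE_COEF * distance_miles
    return 1.0 / (1.0 + math.exp(-z))


# The same rural customer, before and after the database change.
score_zip = response_score(0.0)       # zip-level field value
score_rooftop = response_score(15.0)  # rooftop-level field value

print(f"score with zip-level distance:     {score_zip:.3f}")
print(f"score with rooftop-level distance: {score_rooftop:.3f}")
```

Nothing about the customer has changed, yet the score drops sharply because the model's coefficient was calibrated to the old field definition. Under these assumptions, the model would need to be rebuilt or at least revalidated against the new distance values before it is used for targeting.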

The specific antidote is careful, up-front coordination with the data mining team, rigorous quality assurance procedures, and, importantly, those rare data management professionals who are more than technicians; that is, who have developed deep direct and database marketing expertise.

Jim Wheaton is co-founder of Daystar Wheaton Group.