Modeling Toward True Optimization

In last week’s article, “Using Finite State Machines to Manage Customer Relations,” we introduced finite state machines, a technology new to marketing that provides a flexible way to manage complex, longitudinal customer management programs. With state machines, a marketing program is modeled as a combination of states, transition rules that move customers between states, and actions, such as sending a direct mail piece.

As powerful as they are, state machines have two limitations. First, they require us to determine and implement every possible transition rule, which for large marketing programs can be time-consuming. Second, they do not truly let us optimize marketing around particular strategic goals in the face of real-world constraints. In this installment, we introduce an extension of the state-machine concept that solves both problems.

If you meet someone who claims to be optimizing all of their customer communications, you can draw one of two conclusions: they’re misusing the term “optimization,” or they have several of the world’s most powerful supercomputers working around the clock to optimize their programs.

If by optimization we mean creating customer treatments that tailor the timing, frequency, channel, and other aspects of each promotion to each customer, then we create an enormous combinatorial problem that current computing power cannot solve. For an example of a typical problem, grab a calculator and start multiplying the following figures: 10 million customers times 52 weekly campaigns times three versions for each campaign times four communication channels times any other variation you might introduce.
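To see how quickly, here is that back-of-the-envelope arithmetic as a few lines of Python, using only the figures above:

```python
# Multiply out the treatment dimensions listed above.
customers = 10_000_000   # customer base
campaigns = 52           # weekly campaigns in a year
versions  = 3            # creative versions per campaign
channels  = 4            # communication channels

combinations = customers * campaigns * versions * channels
print(f"{combinations:,}")   # 6,240,000,000 customer-treatment cells
```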

As you can see, the number of possible combinations rapidly reaches gigantic proportions, and you still need to find the right one for each customer. Moreover, we face real economic obstacles to producing all of these variations cost-effectively.

Fortunately, we can extend the concept of state machines with a technique used to optimize processes in many other industries: the Markov Decision Process (MDP). Like a state machine, a decision process has a series of states and transitions between states. But there is a key difference: where we had to determine and specify each transition in our finite state machine, here each transition is assigned a probability.

For example, if on average 60% of our customers purchase only once, we can create two states: first-time buyer and second-time buyer. The remaining 40% do go on to buy again, so the transition probability from first-time to second-time buyer is 40%, while the reverse transition has a 0% probability.
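Written out, that example is nothing more than a small matrix whose rows sum to 1. A minimal sketch in Python follows; treating “remains a first-time buyer” as a self-loop back to the same state is our own modeling assumption:

```python
import numpy as np

# Rows are the current state, columns the next state.
# 60% of first-time buyers never buy again and stay put (self-loop);
# 40% move on to second-time; the reverse transition never happens.
P = np.array([
    [0.6, 0.4],   # first-time  -> (first-time, second-time)
    [0.0, 1.0],   # second-time -> (first-time, second-time)
])
assert np.allclose(P.sum(axis=1), 1.0)   # each row is a probability distribution
```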

In a typical MDP, we might create between 50 and 100 states, although it is possible to use several thousand. Because we optimize over these states rather than over millions of individual customers, we reduce the size of the problem by several orders of magnitude.

To optimize our customer management programs, we need to add two more elements to our MDP.

The first is a set of actions available in each state. This is relatively simple to assemble, since the campaigns and programs we have executed over time are precisely those “actions.”

The final element is a reward structure. Again, this can be constructed relatively simply from our marketing data: the rewards can come from the financial records of past responses and other customer behavior. If we took a particular action – say, we sent out a mailing piece – our reward is the purchase amount less promotional and fulfillment costs for those who responded (say, $25 on average), and the cost of the mailing for those who did not purchase (say, a $1 loss).
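To make the arithmetic concrete, here is that mailing reward written as an expected value per piece mailed. The 5% response rate is a hypothetical figure of our own, not one from the data above:

```python
# Expected reward of one mailing, using the figures above plus an
# assumed 5% response rate (hypothetical, for illustration only).
response_rate = 0.05
profit_if_response = 25.0    # purchase less promotional and fulfillment costs
loss_if_no_response = -1.0   # cost of the mailing piece

expected_reward = (response_rate * profit_if_response
                   + (1 - response_rate) * loss_if_no_response)
print(expected_reward)       # 0.30 dollars per piece mailed
```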

We can, of course, build more sophisticated reward structures. For example, we might use the profit on each purchase rather than its amount, since some customers respond by buying higher-margin goods. We can also use a global reward structure based on customer lifetime value, in which case a customer’s retention contributes to the reward as well.

Once we have these four elements – states, a transition probability matrix, actions, and rewards – the entire process can be solved for an optimal outcome. That is, the MDP tells us which treatment in each state yields the maximum expected long-term reward. This also means that if your company decides to change strategy, we can rerun the MDP with a new reward structure – say, one that maximizes acquisition of new customers – and rapidly implement the new strategy across the portfolio.
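To illustrate what “solved” means here, the sketch below runs value iteration, a standard MDP solution method, on our two-state buyer example. The “no mail” action, its transition probabilities, the 5% response rate, and the 0.9 discount factor are hypothetical numbers chosen for illustration, not figures from the text:

```python
import numpy as np

states  = ["first_time", "second_time"]
actions = ["mail", "no_mail"]

# P[a][s, s'] = probability of moving from state s to s' under action a.
P = {
    "mail":    np.array([[0.6, 0.4],     # mailing moves 40% to a second purchase
                         [0.0, 1.0]]),   # second-time buyers stay put
    "no_mail": np.array([[0.9, 0.1],     # assumed: few repurchase unprompted
                         [0.0, 1.0]]),
}

# R[a][s] = expected immediate reward of action a in state s:
# mail = 5% respond at $25 profit, 95% cost us the $1 piece; no mail = $0.
R = {
    "mail":    np.array([0.05 * 25 + 0.95 * -1.0] * 2),
    "no_mail": np.array([0.0, 0.0]),
}

gamma = 0.9                  # discount factor on future rewards
V = np.zeros(len(states))    # value of being in each state

for _ in range(1000):        # value iteration to convergence
    Q = np.array([R[a] + gamma * (P[a] @ V) for a in actions])
    V_new = Q.max(axis=0)
    if np.abs(V_new - V).max() < 1e-9:
        break
    V = V_new

best = Q.argmax(axis=0)      # index of the best action per state
policy = {s: actions[i] for s, i in zip(states, best)}
print(policy, V)
```

Under these made-up numbers the solver recommends mailing in both states; change the reward structure and the recommended policy changes with it.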

MDPs also help solve a couple of additional problems that traditional campaign management promises to address but falls short of delivering on.

The first is marketing automation. In campaign management, the automation is fairly trivial: we can set up campaigns and reuse them. Finite state machines, as we showed last week, offer better automation, but like traditional campaign managers they need strategies formulated through outside analysis.

With MDPs, we can do something more radical and important: if we know the best action to take, we can automatically take that action each time a customer enters that state. Naturally, we want to continue testing new concepts, since nothing is truly static, but we can now do our testing within a customer management framework rather than as a set of isolated events.
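In practice, that automation amounts to little more than a table lookup once the MDP is solved. In the purely illustrative sketch below, both the policy values and the trigger_campaign hook are hypothetical stand-ins for a real execution system:

```python
# The solved MDP gives a policy: a simple state -> action table.
policy = {"first_time": "mail", "second_time": "no_mail"}

def trigger_campaign(customer_id: str, action: str) -> None:
    # Hypothetical hook into the campaign-execution system.
    print(f"executing '{action}' for customer {customer_id}")

def on_state_entered(customer_id: str, state: str) -> None:
    # Each time a customer enters a state, look up and take the best action.
    action = policy[state]
    if action != "no_mail":
        trigger_campaign(customer_id, action)

on_state_entered("C-1001", "first_time")   # -> executes the mailing
```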

The second problem is strategy evaluation, which matters because certain promotions can actually decrease customer value. Let’s revisit our first-time and second-time buyers. If we wanted to increase new customer acquisition, we could offer new customers a discount, say $5 off. The promotion should significantly increase acquisition, but our average profit per response falls from $25 to $20 because of the additional promotional expense, assuming the new customers purchase the same amount.

In many cases, we would worry about two negative consequences of such a strategy: the average purchase amount might be lower, and the repurchase rate might fall. Or those two effects may be mild, yet attrition rates may climb six months out. These effects, along with others, become apparent within the MDP as customers migrate through the system, letting us evaluate the new strategy from a wider perspective.

Finally, our two customer management approaches – finite state machines and MDPs – can also be used together. For example, we can set up the MDP to provide the overall strategic framework and use our rules-based state machines to implement the tactical marketing programs. We then have the best of both approaches: the analytically driven power of a Markov process coupled with the execution capabilities of state machines.

Whether used separately or in combination, these two technologies are finally enabling marketers to realize the full promise of data-driven marketing.

David King is CEO of New York-based marketing solutions provider Fulcrum.