Catalog Tech: A System to Simplify Data Analysis

Dec 01, 2003 10:30 PM  By Ernie Schell

Data analysis typically involves a major commitment of time and effort. Just getting the input data into the right format is a huge undertaking, and matching the type of data with the appropriate analysis tool requires a nuanced appreciation of both. The analysis itself may be a multistage process, requiring interim interpretation and analysis to produce the final results. And even then, further interpretation may require a statistician’s eye for the relevant detail to make sense of the data.

But KXEN has rewritten the old cumbersome analytical rulebook. Though it sounds like the call letters of a West Coast hard-rock radio station, KXEN stands for “Knowledge eXtraction ENgines.”

Founded in 1998 in Paris by two engineers and a mathematician, KXEN has its U.S. headquarters in San Francisco. The company has more than a dozen developers and 150 installations worldwide. Its system focuses on improving the user’s customer relationship management, including customer profiling and attrition modeling.

KXEN provides a data-mining analytic framework that builds models rapidly without the need for data normalization or standardization, and without help from a staff of statisticians. The traditional data-modeling approach can take several weeks and a team of analysts to review and prepare the data, build and test the model, interpret the results, and apply the model. A marketer using KXEN, by contrast, can do all of this in just a few hours with no help from analytical support staff: pose a business question, build a model from it, and either apply the model to data in a multitude of formats or export it as scoring code (SQL, SAS, C, Java, Visual Basic, or Predictive Model Markup Language, PMML) for integration directly into operational systems.
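To make the idea of exported scoring code concrete, here is a minimal sketch in Python of how a simple linear model might be rendered as a SQL statement. It is an illustration only, not KXEN's export format: the function names, the customers table, the customer_id column, and the weights are all hypothetical.

```python
def linear_model_to_sql(intercept, weights, table):
    """Render a linear scoring model as a SQL SELECT statement.

    Hypothetical illustration: assumes the table has a customer_id column
    and one numeric column per model variable.
    """
    terms = [repr(float(intercept))]
    for column, weight in sorted(weights.items()):
        terms.append(f"({float(weight)!r} * {column})")
    expression = " + ".join(terms)
    return f"SELECT customer_id, {expression} AS score FROM {table}"


def score_row(intercept, weights, row):
    """Score one record in Python, to cross-check the SQL expression."""
    return intercept + sum(w * row[c] for c, w in weights.items())


# Made-up weights for two made-up variables.
weights = {"orders_last_year": 0.25, "days_since_last_order": -0.01}
print(linear_model_to_sql(1.5, weights, "customers"))
```

Once a model is reduced to an expression like this, the database can score every customer in place, with no separate modeling tool in the loop.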

In fact, some models can be built in about 10 minutes. Training a KXEN model (that is, testing and fine-tuning it) usually takes only a few minutes, or even a few seconds, as in one example that analyzed 50,000 lines and 20 variables. This allows you to build models for projects that would not otherwise be worth the time or effort.

In short, KXEN lets you make data mining one of your daily activities. If you want, you can run hundreds of models each month.

Predictive and descriptive models

The KXEN analytic framework consists of predictive and descriptive modeling engines for scoring, classification, clustering, and variable management. By automating the preprocessing of the data, it eliminates the bottleneck where modelers usually spend the majority of their time. You can work with thousands of data points, with no limitation on the number of input variables.

The system also handles data “outliers” well, so that exceptional, unrepresentative values do not skew your results. By the same token, the outlier handling can be applied to tasks such as identifying or preventing fraud, or warding off enterprise server attacks by looking for anomalies in server loads and usage.
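For readers who want a feel for how anomalies in server loads might be flagged, the following Python sketch uses a common robust technique: the modified z-score, built on the median and the median absolute deviation. This is a generic method chosen for illustration and is not a description of KXEN's internals; the function name, threshold, and load figures are assumptions.

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return the positions of values whose modified z-score exceeds threshold.

    The median and the median absolute deviation (MAD) are used instead of
    the mean and standard deviation, so a few extreme values cannot hide
    themselves by inflating the yardstick they are measured against.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - center) / mad > threshold
    ]


# Made-up hourly server load percentages with one suspicious spike.
hourly_load = [41, 39, 44, 40, 42, 38, 43, 97, 41, 40]
print(flag_anomalies(hourly_load))
```

In this made-up series only the eighth reading (position 7, counting from zero) stands far enough from the rest to be flagged.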

In addition, KXEN tells you exactly how accurate and reliable each model is. A “deviation detection” utility provides ongoing monitoring to warn you when a model degrades, so that you can re-create the model with fresh data.
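The article does not say how the deviation detection utility works. As a rough sketch of the general idea, the Python below compares a model's accuracy on recently scored records against the accuracy measured when the model was built, and raises a flag when the gap exceeds a tolerance. The function names, the tolerance, and the data are hypothetical.

```python
def accuracy(predictions, outcomes):
    """Fraction of predictions that matched the observed outcomes."""
    matches = sum(1 for p, o in zip(predictions, outcomes) if p == o)
    return matches / len(outcomes)


def has_degraded(baseline_accuracy, predictions, outcomes, tolerance=0.05):
    """True when recent accuracy falls more than `tolerance` below baseline."""
    return baseline_accuracy - accuracy(predictions, outcomes) > tolerance


# Made-up check: the model was 90% accurate when built.
recent_predictions = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
recent_outcomes = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
if has_degraded(0.90, recent_predictions, recent_outcomes):
    print("Model has degraded; rebuild it with fresh data.")
```

Run on a schedule, a check like this turns model maintenance into a routine alert rather than an occasional audit.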

The program in use

KXEN components can be easily integrated into your existing call center or order management system for real-time analysis to predict the most attractive product to offer each customer or customer type for cross-selling or upselling. Alternatively, you might compare return histories for customers by product type to flag potential returns at order entry and prompt with an additional question or two to determine a customer’s expectations about the product.

You can also integrate KXEN models into your Website for real-time click-stream analysis to predict when a customer is about to leave without making a purchase so that you can offer a tailor-made incentive to stay and buy.

But you don’t need an ongoing project to use the system. KXEN lends itself well to ad hoc modeling in a simulation mode for a single input data set to predict the score for an individual business question.

In reporting mode, KXEN graphically displays data and lets you “drill down” for progressively more detailed views. Of course, drill-down functionality, the hallmark of online analytical processing (OLAP), is only as useful as the information it yields. If you have dozens of “dimensions” in your data, it can be difficult to find the needle of information in the data haystack.

To get actionable information you need to assess the importance of each data variable. To that end, a company called Addinsoft has used KXEN’s analytic framework to create an “intelligent OLAP,” or IOLAP, application that can easily be integrated with other OLAP tools or manipulated on Addinsoft’s Excel platform.

IOLAP identifies and displays drill-down variables by their degree of relevance to a given business question. The system ranks each contributing variable as you drill down in the hierarchy and also rates the quality and reliability of the information obtained, helping you avoid focusing on data that contain no useful information.
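How IOLAP measures relevance is not described here. As a simple stand-in, the Python sketch below ranks variables by the strength of their linear correlation with the outcome of interest. The technique (absolute Pearson correlation), the function names, and the data are assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / sqrt(var_x * var_y)


def rank_variables(columns, target):
    """Return variable names ordered from most to least relevant."""
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up data: did each of six customers buy (1) or not (0)?
purchased = [0, 0, 1, 1, 1, 0]
columns = {
    "age": [30, 52, 45, 38, 47, 40],
    "region_code": [3, 3, 3, 3, 3, 3],
    "catalog_opens": [1, 2, 5, 6, 7, 2],
}
print(rank_variables(columns, purchased))
```

A ranking of this kind tells you which dimension to drill into first and which, like the constant region code above, can safely be ignored.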

Under the hood

KXEN is a highly user-friendly system with an equally user-friendly price: A one-time system license is $60,000 for use on a single PC, $100,000 for use on two PCs, and $160,000 for four users. The IOLAP system costs $1,000 per PC user.

KXEN consists of the following major components:

  • an event log, which allows relative aggregation of transaction data (and correlation with demographic or descriptive data) based on points in time. This is useful for determining, for instance, what events occurred at specified time periods prior to a customer’s defection.

  • a sequence coder, which performs relative aggregation of data based on events and can be used to model behavioral data (such as click-streams on a Website).

  • a time series module, which identifies trends, cycles, and patterns in a series of data.

  • a consistent coder, which prepares raw data, filling in missing values and detecting out-of-range values.

  • a robust regression module, which builds the models.

  • a smart segmenter, which creates meaningful groups, such as customers with similar attributes.
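To illustrate the kind of time-relative aggregation attributed to the event log above, the following Python sketch counts each type of event in a fixed window before a reference date, such as the day a customer defected. It is a generic illustration rather than KXEN code; the function name, the event types, and the dates are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def events_before(events, reference_date, days):
    """Count event types in the `days` days leading up to reference_date.

    `events` is a list of (date, event_type) pairs for one customer.
    The reference date itself is excluded from the window.
    """
    window_start = reference_date - timedelta(days=days)
    return Counter(
        kind for when, kind in events
        if window_start <= when < reference_date
    )


# Made-up history for one customer who defected on Nov. 30, 2003.
history = [
    (date(2003, 6, 2), "order"),
    (date(2003, 9, 15), "return"),
    (date(2003, 10, 20), "complaint"),
    (date(2003, 11, 3), "return"),
]
print(events_before(history, date(2003, 11, 30), days=90))
```

Because the window is anchored to each customer's own reference date, histories that end at different times can be compared on equal footing.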

Because KXEN can handle large numbers of variables, the sequence coder and event log module can be used to track not only what products a customer has purchased but also the order in which they purchased them — over time and on each order. This can help you fine-tune cross-sell and upsell recommendations.
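A small Python sketch shows why purchase order matters. It counts how often one product directly follows another across customers' histories, then suggests the most common follow-up to a customer's latest purchase. This is a deliberately simple stand-in, not the sequence coder itself; the function names and the product data are hypothetical.

```python
from collections import Counter


def transition_counts(purchase_histories):
    """Count how often product B was bought directly after product A."""
    counts = Counter()
    for history in purchase_histories:
        counts.update(zip(history, history[1:]))
    return counts


def next_best_offer(counts, last_product):
    """Most frequent follow-up to last_product, or None if never seen."""
    followers = {b: n for (a, b), n in counts.items() if a == last_product}
    return max(followers, key=followers.get) if followers else None


# Made-up purchase histories, each listed in the order bought.
histories = [
    ["tent", "sleeping bag", "stove"],
    ["tent", "sleeping bag"],
    ["tent", "lantern"],
    ["boots", "socks"],
]
counts = transition_counts(histories)
print(next_best_offer(counts, "tent"))
```

Here a customer who has just bought a tent would be offered a sleeping bag, since that is the follow-up seen most often.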

KXEN can run on Microsoft Windows or UNIX/Linux platforms. The system comes with sample source code for graphical user interfaces, to ease integration work for users of KXEN components. A control application programming interface (API) is available for developers or users with programming experience. It grants access to the complete range of functionality and fine-grained parameterization of KXEN components. In addition, the system allows customized integration of KXEN components with other applications or program packages.

The process of drilling down for actionable information in your database underscores the truism about marketing voiced by the 19th-century retail pioneer John Wanamaker: “Half of my advertising is wasted, but I don’t know which half.” In the past 25 years, database marketing has allowed us to accumulate massive amounts of data to try to resolve Wanamaker’s dilemma, but data alone don’t do the job. You need tools that let you derive answers to an ongoing series of questions about customer behavior that the data themselves often do more to hide than to reveal.

By giving end users the ability to focus on the data that really matter, modeling tools such as KXEN and its IOLAP cousin give you virtual “mouse click” access to the key variables that drive your business — and just as important, help you rank the importance of each of these variables in achieving your marketing goals.


Ernie Schell is president of Marketing Systems Analysis, a Southampton, PA-based consultancy.

Data-Mining & OLAP Tools
Brio (brio.com)
Business Objects (businessobjects.com)
Cognos (cognos.com)
E.piphany (epiphany.com)
Magnify (magnify.com)
Microstrategy (microstrategy.com)
Oracle (oracle.com)
Salford Systems (salford-systems.com)
SAS (sas.com)
SPSS (spss.com)
Statsoft (stat-soft.com)

KXEN, 650 Townsend St., Suite 300, San Francisco, CA 94103; www.kxen.com; 415-503-4168