February 15, 2004

On Data Mining

Over at Crooked Timber, social scientists and economists take potshots at data mining. Over here in the Business Intelligence field, there's a lot more potential for it.

As a value-added part of a product suite that I am developing, I will be offering some data mining. In the realm of enterprise computing, data mining has a mixed reputation, but that stems primarily from how difficult it is to implement and maintain as a system: it takes technical expertise both to interpret the data properly and to weed through the BS products.

Of the many DW projects I've been on, only a very few customers have expressed interest in data mining, and only one or two have invested. Part of the reason is that, unlike social scientists, the folks I deal with have more real information than they need to make intelligent decisions. Furthermore, their business models are not immediately amenable to new discovery.

In the first case, as is well known in retail, when you have millions of transactions at the checkout counter at your disposal, you already have more than enough information to handle most inventory and product profitability problems. Since the retailer's problem is pricing according to supply and demand, mining things like purchase affinity is only icing on the cake. For the most part, all the merchandise in the store is going to remain in the same aisles and there are only endcaps to change. So figuring out the 28 products to feature there (especially considering the market research already done by suppliers of endcap goods like potato chips and soda) doesn't require extraordinary precision.

Despite the apocryphal tales of diapers and beer sold together by dads making a run, there aren't a great number of data mining success stories in retail. Mining doesn't make that much of a difference to the bottom line. Speaking of hearsay, a certain large retailer has confided that they have many, many terabytes of data, and it's difficult enough for them just to store it, much less make mining passes over it for interpretation.
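
For readers who haven't seen it, purchase affinity is usually scored with something like "lift": how much more often two products land in the same basket than chance would suggest. Here is a minimal sketch in Python. The baskets are invented for illustration and the function name pair_lift is my own, not taken from any product.

```python
from collections import Counter
from itertools import combinations

def pair_lift(baskets):
    """Return {(a, b): lift} for every product pair seen in the same basket."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))          # dedupe and fix pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    lifts = {}
    for (a, b), together in pair_counts.items():
        # lift = P(a and b) / (P(a) * P(b)); above 1 means the pair
        # shows up together more often than independence would predict
        expected = (item_counts[a] / n) * (item_counts[b] / n)
        lifts[(a, b)] = (together / n) / expected
    return lifts

# Invented sample data, not real checkout transactions
baskets = [
    ["diapers", "beer", "chips"],
    ["diapers", "beer"],
    ["chips", "soda"],
    ["beer", "chips"],
    ["diapers", "soda"],
]
lifts = pair_lift(baskets)
for pair, lift in sorted(lifts.items(), key=lambda kv: -kv[1]):
    print(pair, round(lift, 2))
```

On five baskets this is trivial. The point about terabytes is that the same counting pass over every pair in millions of transactions is where the storage and compute pain shows up.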

In the second case, there is always a proverbial prophet crying in the wilderness about some problem with a company's business model. The example I love to give is what actually happened with one of the biggest and best systems I put together back in the early 90s. At Philip Morris USA, the proud owner of the world's second most powerful brand behind Coca-Cola, there was some slight fear about the market share dominance of Marlboro. I designed and built their tracking system, which gave monthly market share numbers (back when monthly was the most frequent that numbers were published). With the aid of an economist and *the* statistics text, we programmed some modified confidence intervals. These told us that it was reasonable to assume that the newly arrived bargain brands would actually eat Marlboro's lunch.
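
To give a flavor of the idea, here is a plain textbook confidence interval for a market share, sketched in Python. This is not the modified version we programmed, which I am not reproducing here. It assumes the share is estimated from a sample of tracked unit sales, and the figures are invented.

```python
import math

def share_interval(brand_units, total_units, z=1.96):
    """Normal-approximation interval for a brand's share of units sold.

    z=1.96 corresponds to roughly 95% confidence.
    """
    p = brand_units / total_units
    half_width = z * math.sqrt(p * (1 - p) / total_units)
    return p - half_width, p + half_width

# Invented numbers: 300 of 1,000 sampled units belong to the brand
low, high = share_interval(300, 1000)

# A later month's observed share (also invented)
later_share = 0.26

# If the later share falls below the interval, the drop is hard to
# dismiss as month-to-month noise
drop_outside_interval = later_share < low
print(round(low, 4), round(high, 4), drop_outside_interval)
```

The interval only says whether a movement is bigger than noise. Whether management acts on it is another matter, as the rest of the story shows.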

At the time, such a thing as discounting Marlboro was practically unthinkable. PM had declared as much publicly, and it was well known that if PM ever reduced the price of its premium cigarettes, it would spell the beginning of the end for the entire industry's legendary profitability. Considering all that, it really didn't matter what our fancy computer projections said about the impact of 'Basic' and other generic tobacco products.

Some time later, however (we like to think based upon the information we were able to show), PM actually did discount Marlboro. The stock dropped several percentage points and the industry swooned. Then people got over themselves and adjusted to the new normality. Nevertheless, these changes took place in spite of the psychographic data, and the company's sense of that data, which said brand loyalty would survive price competition.

I primarily think of data mining in the context of multidimensional analysis. 'Bucket Shaping' is how I will use it in my next application. Predicting which factors people use as customers is a dicey business, and it's reasonable to pay a marginal amount to gain a marginal edge. Honing that edge and finding the real cost benefit is no simple matter, and it certainly isn't aimed at the independently verifiable theoretical ends that social scientists pursue, but marketing is non-trivial work, and marketing managers do buy it.

Posted by mbowen at February 15, 2004 09:15 AM
