Predictive analytics and ‘big data’: The good, the bad and the ugly

Nicole Laskowski, News Editor

When David Menninger and his colleagues saw the figures from analytics research they conducted in July, something just didn’t add up. Of the 2,600 businesses surveyed, only 13% reported using predictive analytics, whereas 37% described predictive analytics as being important to their organizations.

“Clearly there’s a mismatch between value and deployment,” said Menninger, vice president and research director for information technologies at San Ramon, Calif.-based Ventana Research Inc.

The results, along with what Menninger calls a “renewed interest” in predictive analytics due to the onslaught of data rising in volume, velocity and variety -- or “big data” -- inspired him to take a closer look at how businesses are using predictive analytics, a form of data mining, and what hurdles may be holding businesses back.

While he’s still in the process of collecting survey information, Menninger recently sat down with SearchBusinessAnalytics.com to talk about predictive analytics in the age of big data.

Walk me through the general steps of how to perform predictive analytics.

David Menninger: First, select the data on which you’re going to perform the analysis. The important thing about selecting data is that it needs to be truly random and representative, because you’re selecting a set of data to develop a model. Then you develop a model and you apply that model to new data as it arrives in your organization.

If we translate that into an example: You take the past purchasing behavior of your customer base; prepare a model of that purchasing behavior to see what characteristics caused people to purchase certain products; then take the model you developed and, as a new person presents themselves to potentially purchase from your organization, you make some specific product recommendations or put certain advertisements in front of them based on the scores that this model suggests for that person.
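The three steps Menninger describes (select a sample, develop a model, apply it to new data) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the record fields (`segment`, `product`, `purchased`), the function names and the data are all hypothetical, and the "model" is nothing more than an observed purchase rate per customer segment, not a technique Menninger or Ventana Research names.

```python
import random
from collections import defaultdict


def draw_random_sample(records, sample_size, seed=0):
    """Step 1: draw a simple random sample of past purchase records."""
    rng = random.Random(seed)
    return rng.sample(records, min(sample_size, len(records)))


def fit_purchase_rates(sample):
    """Step 2: the 'model' is the purchase rate per (segment, product)."""
    seen = defaultdict(int)
    bought = defaultdict(int)
    for rec in sample:
        key = (rec["segment"], rec["product"])
        seen[key] += 1
        bought[key] += rec["purchased"]  # 1 if the offer led to a purchase
    return {key: bought[key] / seen[key] for key in seen}


def recommend(model, segment, top_n=1):
    """Step 3: score products for a new visitor and return the best ones."""
    scores = {
        product: rate
        for (seg, product), rate in model.items()
        if seg == segment
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical history: each record is one offer shown to one customer.
history = [
    {"segment": "student", "product": "laptop", "purchased": 1},
    {"segment": "student", "product": "laptop", "purchased": 1},
    {"segment": "student", "product": "printer", "purchased": 0},
    {"segment": "student", "product": "printer", "purchased": 1},
    {"segment": "retiree", "product": "printer", "purchased": 1},
    {"segment": "retiree", "product": "laptop", "purchased": 0},
] * 50

sample = draw_random_sample(history, sample_size=100)
model = fit_purchase_rates(sample)
print(recommend(model, "student"))
```

A production model would use far richer customer characteristics and a real learning algorithm, but the shape is the same: history in, scores out, recommendations made as each new person arrives.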

How difficult is it to put together an unbiased sampling so that you get very good predictive results?

Menninger: Many people don’t understand how to identify and create a random sample of data. For instance, if you took only the sales transactions from the last month without consideration for what promotions were running or where you were running those promotions, you could end up with a bias in your model. Suppose you were running a promotion in New York but you didn’t realize that as you were selecting your sample. You then created a model that shows people in New York are more likely to buy this product.
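The New York example can be made concrete with made-up numbers. In the sketch below, every figure, region name and field name is an assumption chosen for illustration: a promotion lifts New York's purchase rate last month, so a sample drawn only from that month suggests a regional effect that disappears once promoted transactions are set aside.

```python
def make_rows(region, promo, buyers, total):
    """Build `total` hypothetical transactions, `buyers` of them purchases."""
    return [
        {"region": region, "promo": promo, "purchased": 1 if i < buyers else 0}
        for i in range(total)
    ]


def rate_by_region(rows):
    """Share of transactions that ended in a purchase, per region."""
    totals, buyers = {}, {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + 1
        buyers[row["region"]] = buyers.get(row["region"], 0) + row["purchased"]
    return {region: buyers[region] / totals[region] for region in totals}


# Made-up data: last month a promotion ran in New York only.
last_month = make_rows("New York", True, 30, 100) + make_rows("Chicago", False, 10, 100)
# Earlier months, with no promotion running anywhere.
earlier = make_rows("New York", False, 10, 100) + make_rows("Chicago", False, 10, 100)

# Sampling only last month makes New York look three times as likely to buy.
biased = rate_by_region(last_month)
print(biased)  # {'New York': 0.3, 'Chicago': 0.1}

# Looking only at transactions made without a promotion removes the gap.
without_promo = rate_by_region([r for r in last_month + earlier if not r["promo"]])
print(without_promo)  # {'Chicago': 0.1, 'New York': 0.1}
```

Excluding promoted transactions is just one simple way to expose the bias; recording the promotion as a characteristic in the model is another.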

Part of what we’re hoping to find in the research is whether or not those skill sets exist in organizations that are trying to do predictive analytics. The question we’ll ask of the research findings is, Is skill a big obstacle, and if it is a big obstacle, how can organizations overcome that obstacle?

How has big data impacted predictive analytics?

Menninger: It’s a prerequisite that you need to have the predictive analytics to make full use of the big data you collect, because you can’t swim through all of that big data and find the interesting bits. You’ve got to have technology assist you. There are a couple of technologies that would help in that process: One is predictive analytics and the other is visualization. If you use some visualization techniques that let you see lots of data -- things like heat maps and geographic plotting of the data -- that can help you identify some of the trends or some of the interesting observations in the data. But a lot of them would be hard to find visually, so predictive analytics, I think, is really the only way to get maximum value out of your data.

What are some of the benefits of incorporating big data into predictive analytics?

Menninger: The benefits clearly are the opportunity to improve your revenue stream. Fundamentally, you’re going to sell more product, whether by identifying new customers to engage with or identifying products to sell to your existing customers. There’s also a cost savings side of it. If you look on the marketing side -- and we can apply this same logic to other industries, but let me stick with the commercial, sales aspect -- you can reduce your spending on marketing activities if you can identify the right segment of customers to target with your ads. If you have an ad that you know is going to appeal to 8% of your overall market and you only send the ad to 8%, that reduces the cost of that marketing campaign.
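The arithmetic behind the 8% example is straightforward. The sketch below uses a hypothetical market of 100,000 contacts and a hypothetical cost of 50 cents per ad; only the 8% figure comes from Menninger. It also assumes, optimistically, that the model picks out exactly the customers the ad appeals to.

```python
def campaign_cost_cents(market_size, cost_per_contact_cents, target_percent=100):
    """Cost of sending one ad to `target_percent` of the market, in cents."""
    contacts = market_size * target_percent // 100
    return contacts * cost_per_contact_cents


# Hypothetical figures: 100,000 potential customers, 50 cents per ad sent.
full = campaign_cost_cents(100_000, 50)
targeted = campaign_cost_cents(100_000, 50, target_percent=8)

print(full // 100)               # 50000 (dollars, whole market)
print(targeted // 100)           # 4000 (dollars, targeted 8%)
print((full - targeted) // 100)  # 46000 (dollars saved)
```

In practice no model is perfectly precise, so the real saving depends on how many of the right customers the targeted segment actually contains.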

What about drawbacks? Do you see any?

Menninger: So I’m here at the Teradata conference right now, and Steven Levitt, the [co-]author of Freakonomics, gave a presentation. I think that one of the observations he shared is a potential risk or downside of predictive analytics. If we become so automated and robotic about our analysis, might we miss some obvious observations? Maybe. Maybe not. Now, you might say, well, wouldn’t the predictive analytics catch that? And maybe it will. But I still think there is this risk. We still want to ask questions; we still want to be inquisitive. I think there’s some risk that we stop looking at the data, not necessarily that we lose the ability, but we may be less motivated or less inclined to look at data in a critical way, or even the observations in a critical way. What Steven Levitt said was every now and then ask yourself: Does this make sense? It’s a valuable exercise.