How do new predictive analytics tools help businesses?
Traditional predictive analytics, represented by statistical tools and rules-based systems -- previously known as expert systems -- is stuck in the 1990s. Standard predictive analytics tools, such as SAS and the R programming language, are decades old. The new way is machine learning on "big data," and it is a disruptive change.
First of all, traditional predictive analytics is based on data in relational data warehouses, which handle only structured data collected in batch. Signals from real-time data such as the location of a mobile phone or signals available in social data such as tweets or text data in customer service are not considered, thus reducing precision. Further, many of the tools out there are severely limited by scale and only handle data that fits in memory, which forces analysts to work with samples rather than the full data. Sampling captures the strongest signals in data but misses out on the long tail of weaker signals, thus losing precision.
Manual processes rife in traditional predictive analytics models
Secondly, traditional predictive analytics requires feature design, a manual process in which an analyst proposes the features that will drive predictions and then tests each hypothesis. For example, for predicting retail purchases, the analyst might hypothesize three features: total amount spent by the customer in the past; total number of times the customer has purchased in the last year; and the last date on which they made a purchase. The analyst then tests which of these features carry predictive power and experiments with various predictive algorithms. Now, machine learning has put us on the path of increasing automation in feature design, which is a fundamental advance.
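To make the retail example concrete, here is a minimal sketch of what hand-designing those three features might look like. The `Purchase` and `CustomerFeatures` types, the `build_features` function and the sample purchase history are all hypothetical names and data invented for illustration; they do not come from any particular tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Purchase:
    day: date      # date of the purchase
    amount: float  # amount spent


@dataclass
class CustomerFeatures:
    total_spent: float             # total amount spent in the past
    purchases_last_year: int       # number of purchases in the last 365 days
    last_purchase: Optional[date]  # most recent purchase date, if any


def build_features(purchases: List[Purchase], today: date) -> CustomerFeatures:
    """Compute the three hand-designed features for one customer."""
    one_year_ago = today - timedelta(days=365)
    total_spent = sum(p.amount for p in purchases)
    purchases_last_year = sum(
        1 for p in purchases if one_year_ago < p.day <= today
    )
    last_purchase = max((p.day for p in purchases), default=None)
    return CustomerFeatures(total_spent, purchases_last_year, last_purchase)


# Made-up purchase history for a single customer.
history = [
    Purchase(date(2013, 1, 15), 40.0),
    Purchase(date(2013, 11, 2), 25.5),
    Purchase(date(2014, 3, 20), 60.0),
]
features = build_features(history, today=date(2014, 4, 1))
print(features)
```

Each feature here is a decision the analyst made by hand. Adding a fourth one, such as average basket size, means another round of coding and testing, which is the manual effort that automated feature design aims to reduce.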
Third, most traditional predictive analytics tools fail to adapt when customer behavior changes. Predictive models are typically implemented as code inside applications, which makes it difficult to even monitor their performance, much less adapt them to change. The world is moving faster, and consumers change their behavior based on what they see from competitors. Machine learning offers the fundamental advantage that it learns from data and so can adapt as that data changes.
How businesses used to set up analytics
Typically, in the past, a predictive analytics problem was addressed by getting business, IT and the analytics team together, which, in itself, is a big feat in a large corporation. The first goal was helping the analysts understand the business problem. Then, the analysts would investigate and identify a few attributes. Those attributes could be, say, the number of times the consumer has made a purchase or called in. Once they had those attributes, they'd work with IT to program them and then build predictive models using a tool from a provider such as SAS.
Using the predictive model, an analytics tool mines and rates the variables related to the selected attributes and produces a score, essentially a propensity score. That information is handed back to IT, where it is hard-coded into an application. That's the traditional way.
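As a rough illustration of what "hard-coded into an application" can mean, the sketch below scores a customer with fixed weights. It assumes a logistic model, which is one common way to produce a propensity score but is not something the process above specifies; the weights, the bias and the attribute names `purchases_last_year` and `service_calls` are hypothetical.

```python
import math
from typing import Dict

# Hypothetical weights, as if copied out of a model fitted offline.
# Once pasted into application code, they no longer change with the data.
WEIGHTS: Dict[str, float] = {
    "purchases_last_year": 0.4,
    "service_calls": -0.3,
}
BIAS = -1.0


def propensity_score(attributes: Dict[str, float]) -> float:
    """Return a score between 0 and 1 from a fixed logistic formula."""
    z = BIAS + sum(
        weight * attributes.get(name, 0.0) for name, weight in WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


score = propensity_score({"purchases_last_year": 5, "service_calls": 2})
print(round(score, 3))
```

Because the weights live in the application source, updating the model after customer behavior shifts means another pass through the analysts and IT.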
The main problems with this legacy process are that it is slow and that the attributes and information gathered are not very detailed.
Real-time demands call for new predictive analytics approaches
Today's predictive analytics tools can provide real-time analysis and historical information. The tools needed to successfully predict sales trends and consumer behavior include APIs, machine learning and Hadoop.
Hadoop is used to gather data from all your enterprise systems and liberate, if you will, that information. Hadoop solutions are complementary to classic data warehouses, which have been used for more structured purposes. With Hadoop, you can utilize all your data, be it structured or unstructured.
There are important reasons why businesses should automate their business analytics. Today, mobile computing has certainly driven demand for real-time responsiveness. When you use your phone, you want every bit of information and every transaction to be available right now.
Competitiveness is another factor driving predictive analytics advances and the need to adopt those new technologies. If your competitors have real-time responsiveness, then you have to have it too; otherwise, you are not serving the end customer as well. APIs operate in real time and can be used to tap into data as it flows. Combining API servers and machine learning enables real-time predictive business intelligence (BI) delivery to businesses, which can pass that along speedily to consumers. Also, businesses can use that BI to reach out to customers quickly and in a more targeted and meaningful way.
Editor's note: This expert advice is taken from an interview by executive editor Jan Stafford.
About the author:
Waqar Hasan is Head of Big Data Business at Apigee, which purchased his company, InsightsOne, a cloud-based predictive analytics solutions provider, in January 2014. InsightsOne's technologies were rolled into the April 14, 2014, release of Apigee Insights, a big data analytics solution. Previously, as VP of Data Systems Engineering, Hasan ran the Yahoo data platform for five years. He has also served as Architect for Informix and Research Scientist for Hewlett-Packard Labs.