Predictive Analytics on Small Data

This is a small note on small data. I hope it has a big impact. The common understanding is that predictive and prescriptive analytics belong to big data: a vast amount of data is classified and grouped, and analytics are run on it to predict the next action that one or more members of a group will take. Predictive analytics of this kind allows the right products to be pushed to e-commerce shoppers. I am sure you have all experienced this on large e-commerce sites and enjoyed it.


In the world of predictive and prescriptive analytics on small data for big impact, one needs to work hard on acquiring the small data and ensuring its validity. With processing power now available at a scale never experienced before, some traditional methods that were given up because of limited processing power can be revived. So there are no new approaches here.

For the sake of this small note, we will take a segment of electrical fittings, such as switches. If we want to predict how this market is moving, the only data we can pull from our ERP is the past sales of different types of switches, broken down by a geographical hierarchy.

Public data will provide the regional demographics and GDP. Election data and public initiatives will also give a sense of public participation.

Some data, like Government spending in the region, may or may not be available.

Now our problem is to predict which price band of electrical switches is going to grow in the next three years, assuming there is no disruptive change in technology.

So the first step is to find two or more products that have a visible connection with switches and for which data is published. This could be sales data of publicly listed companies, compiled data from companies that are in the business of selling data and, finally, the market-size data provided by your sales team.

With the above data, we have to create a polynomial equation which can stand the test of multiple variables. That sounds outrageously simple, and it is. We want to find the factor by which to multiply the company’s sales in currency, plus or minus a constant, so that the result equals an expression in three or more variables, say, production of paint, production of cement and per capita GDP. The only way to get there is by trying a significant number of options. You may have 20 candidate variables, and all you have to do is try again and again until you find the combination that balances best.
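The search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed method: it fits sales as a first-degree expression in three indicators plus a constant (a rearrangement of the "multiply sales by a factor, add or subtract a constant" form), tries every three-indicator combination, and keeps the one with the smallest squared error. The indicator names, the helper functions `fit_subset` and `best_subset`, and all the numbers are hypothetical; the data is synthetic and stands in for real ERP and public figures. Higher-degree terms could be added as extra columns in the same way.

```python
from itertools import combinations

import numpy as np


def fit_subset(X, y, cols):
    """Least-squares fit of y on the chosen columns plus a constant term."""
    A = np.column_stack([X[:, cols], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(resid @ resid)


def best_subset(X, y, names, size=3):
    """Try every combination of `size` indicators; keep the smallest squared error."""
    best = None
    for cols in combinations(range(X.shape[1]), size):
        coef, sse = fit_subset(X, y, list(cols))
        if best is None or sse < best[2]:
            best = ([names[c] for c in cols], coef, sse)
    return best


# Synthetic stand-in data (NOT real market figures): 12 yearly observations
# of five hypothetical indicators.
names = ["paint", "cement", "gdp_per_capita", "steel", "rainfall"]
rng = np.random.default_rng(0)
X = rng.normal(size=(12, len(names)))

# Sales are constructed from the first three indicators plus small noise,
# so the search has a known answer to recover.
sales = (2.0 * X[:, 0] + 1.5 * X[:, 1] + 3.0 * X[:, 2] + 10.0
         + rng.normal(scale=0.01, size=12))

chosen, coef, sse = best_subset(X, sales, names, size=3)
print("best indicators:", chosen)
print("coefficients, then constant:", np.round(coef, 2))
```

With 20 candidate variables there are 1,140 three-variable combinations, so an exhaustive search of this kind is well within reach of an ordinary laptop.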

Now you have the computing power and the Smarten – Advanced Data Discovery tools, which can make a difference.