The amount and richness of data being generated today is overwhelming to control and understand, and it is transforming marketing decision-making as a whole.
Big data is commonly defined by three components: volume, variety and velocity. Let's discuss each of them.
The volume of data being generated today is beyond imagination. The numbers are in petabytes, exabytes or zettabytes.
Let's put these terms in perspective.
1 petabyte = 20 million traditional filing cabinets full of text.
Walmart, as a company, generates 2.5 petabytes of data every hour.
Now you can imagine!!!
In 2013, the whole world's data amounted to 4.4 zettabytes, and this number is expected to reach 44 zettabytes by 2020.
For reference, 1 zettabyte is equivalent to 250 billion DVDs.
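To make these numbers a little more concrete, here is a small back-of-the-envelope sketch in Python. It uses only the figures quoted above. The decimal unit definitions (1 petabyte = 10^15 bytes, 1 zettabyte = 10^21 bytes) and the idea of steady compounded growth between 2013 and 2020 are my own assumptions for illustration.

```python
# Back-of-the-envelope arithmetic on the figures quoted above.
# Assumption: decimal (SI) units, i.e. 1 PB = 10**15 bytes, 1 ZB = 10**21 bytes.
PETABYTE = 10**15
ZETTABYTE = 10**21

# Walmart figure quoted above: 2.5 petabytes per hour.
walmart_pb_per_hour = 2.5
walmart_pb_per_day = walmart_pb_per_hour * 24

# World data quoted above: 4.4 ZB in 2013, expected 44 ZB by 2020.
world_2013_zb = 4.4
world_2020_zb = 44
years = 2020 - 2013
growth_factor = world_2020_zb / world_2013_zb
# Assumption: the growth is compounded evenly across the years.
annual_growth_rate = growth_factor ** (1 / years) - 1

# 1 ZB is quoted as 250 billion DVDs; this is the DVD size that figure implies.
dvds_per_zb = 250e9
implied_dvd_gb = ZETTABYTE / dvds_per_zb / 10**9

# How many petabytes fit into one zettabyte.
petabytes_per_zettabyte = ZETTABYTE // PETABYTE

print(f"Walmart per day: {walmart_pb_per_day:.0f} PB")
print(f"World data growth 2013-2020: {growth_factor:.1f}x")
print(f"Implied yearly growth: {annual_growth_rate:.1%}")
print(f"Implied DVD size: {implied_dvd_gb:.1f} GB")
print(f"Petabytes in a zettabyte: {petabytes_per_zettabyte:,}")
```

Under these assumptions, a tenfold increase over seven years works out to roughly 39 percent growth every year, which is why volume alone is such a challenge.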
The market for people who deal with big data is expected to double every couple of years.
The major chunk of data comes from the IoT (Internet of Things), through the small and large computers embedded in our day-to-day products such as cars, goggles, watches and mobile phones. The new trend is to bring everyday products under the IoT so that the consumer can connect seamlessly. But again, these data streams are both an opportunity and a problem for a marketer who has to take decisions based on them. Almost 32 billion objects are expected to be connected online by 2020.
Velocity means the relentless rapidity of data creation. To better understand the difference, compare India's population census data with the data from any e-commerce website. The data we receive from the latter includes location, brand preferences in a specific category (which indicate the depth of awareness, or brand salience), opinions on a brand, attitude towards a brand, purchasing capacity and a lot more. For the sake of analysis, we can easily define the actual and desired personality from these data points.
My point is that volume alone is not sufficient; the data should also have the character that enables you to make decisions as a marketer.
For a very long time in our marketing history, we have worked only with structured data rather than unstructured data, because unstructured data is complex and we were unable to use it. If I had to define the difference between data and big data, this is exactly the difference I am talking about. Structured data is data that has been captured from sensors, record files and databases. Its quantifiable nature makes it easy to analyze.
Unstructured data, on the other hand, is more qualitative in nature. We can collect it as textual data from WhatsApp messages, SMS messages, e-mails, conversations in any chat box and many other sources.
Yes!!! You may recall the case of Cambridge Analytica.
Another type is non-textual data, which is collected from YouTube, Facebook or WhatsApp videos and any other video source, as well as from images in different mediums and audio recordings.
For example, the Parliament office in India has a special security wing which regularly collects this type of data from social media to stop lynching in India or to curb terrorism in Jammu and Kashmir.
This data contains personal information about your behavior, your family, the type of content you consume and so on.
Then there is semi-structured data, which needs tools to give it structure, such as software built on Standard Generalized Markup Language (SGML). SGML is a standard for tagging content with markup, and SGML-based tools can help organizations capture data from sources such as videos, which benefits them.
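The difference between the three kinds of data is easier to see with a toy example. The sketch below is a minimal illustration in Python. The customer record, the payload, the chat message and the brand list are all made up for this example. I use JSON for the semi-structured part rather than SGML simply because Python can parse it out of the box, and the keyword matching is a deliberately naive stand-in for real text analytics.

```python
import json

# Structured: fixed fields, ready for analysis (hypothetical record).
structured_row = {"customer_id": 101, "city": "Pune", "order_value": 1499.0}

# Semi-structured: tagged, nested text whose fields must be pulled out first.
semi_structured = (
    '{"customer": {"id": 102, "city": "Delhi"}, '
    '"cart": [{"price": 500.0}, {"price": 250.0}]}'
)

def flatten_order(payload: str) -> dict:
    """Turn the nested JSON payload into a flat row shaped like structured_row."""
    doc = json.loads(payload)
    return {
        "customer_id": doc["customer"]["id"],
        "city": doc["customer"]["city"],
        "order_value": sum(item["price"] for item in doc["cart"]),
    }

# Unstructured: free text with no fields at all (hypothetical chat message).
message = "Loved the new BrandA shoes, but BrandB delivery was terrible."
KNOWN_BRANDS = ["BrandA", "BrandB", "BrandC"]  # hypothetical brand list

def brands_mentioned(text: str) -> list:
    """Naive keyword match; real text analytics needs far more than this."""
    lowered = text.lower()
    return [brand for brand in KNOWN_BRANDS if brand.lower() in lowered]

flat_row = flatten_order(semi_structured)
mentions = brands_mentioned(message)
print(flat_row)
print(mentions)
```

Notice how much work each step needs: the structured row can be analyzed as it is, the semi-structured payload needs parsing, and the free text yields only a rough signal even after processing.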
Recently, two more V's have been added: veracity and value.
Yes, I agree that you have collected it all, but the problem with big data is that it might be inaccurate. The 3 V's mentioned above are growing rapidly, so the problem of inaccurate data is even more relevant in the current scenario.
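One simple way to act on this is to check records for obvious problems before analyzing them. The following is a minimal sketch in Python, assuming made-up customer records and made-up validity rules (an age between 0 and 120 and a non-empty city). Real accuracy checks depend entirely on your own data and context.

```python
# Hypothetical customer records; some are deliberately inaccurate or incomplete.
records = [
    {"customer_id": 1, "age": 34, "city": "Mumbai"},
    {"customer_id": 2, "age": -5, "city": "Delhi"},      # impossible age
    {"customer_id": 3, "age": 41, "city": ""},           # missing city
    {"customer_id": 4, "age": 230, "city": "Chennai"},   # impossible age
    {"customer_id": 5, "age": 27, "city": "Kolkata"},
]

def is_plausible(record: dict) -> bool:
    """Assumed rules: age within 0-120 and a non-empty city."""
    return 0 <= record["age"] <= 120 and bool(record["city"].strip())

# Split the records into those we keep and those we set aside for review.
clean = [r for r in records if is_plausible(r)]
rejected = [r for r in records if not is_plausible(r)]
share_rejected = len(rejected) / len(records)

print(f"Kept {len(clean)} of {len(records)} records")
print(f"Rejected share: {share_rejected:.0%}")
```

The share of rejected records is itself useful to a marketer: if it is high, any conclusion drawn from the remaining data deserves extra caution.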
The next big question is the value of that data: will it be useful for your marketing decisions? You don't want to collect data that might not be useful to you. That is a big question that needs to be addressed.
Presently, a lot of data scientists face the problem of arriving at a solution without understanding its context. It is up to you, as a marketing manager, to combine your domain knowledge with the data. There is a reason we call it marketing analytics instead of just analytics.
I remember a famous quote from an economist, which is apt for this situation.
"If you torture the data long enough, it will confess." - Ronald Coase (Nobel Prize in Economics)