Big Data…Big Deal? Maybe, if Used with Caution. « Statistical …

Big Data…Big Deal? Maybe, if Used with Caution.

Posted on 27 April 2014, 11:23 am

As we have witnessed, the term “big data” has been thrust into the zeitgeist over the past several years. However, when one pushes beyond the hype, there often seems to be little substance there. We’ve always had “data,” so what is so unique about it this time? Yes, we recognize it’s “big,” but is anything else really different?

I’ve spent some time thinking about this, and the answer seems to be yes. It falls along three dimensions:

Capturing Conversations & Relationships: Individuals have always communicated with one another, but now we can capture some of that conversation through email, blogs, and social media (Facebook, Twitter, Pinterest), and we can do the same with machines via sensors, i.e., the “internet of things” we hear so much about;
Granularity: We can now understand individuals at a much finer level of analysis. No longer do we need to rely on a sample of 500 people to “represent” the nation; instead, we can access data on millions; and
Real time: Because computing power has vastly increased (e.g., clustering and parallel processing) and the cost of storing data and accessing computing power (i.e., cloud computing) has fallen tremendously in the past 5+ years, we can analyze large volumes of data closer to real time, which has a profound impact on businesses, governments, and universities. With individuals (as well as hardware, i.e., the internet of things) continuously generating data that can be captured and analyzed, the question becomes how we can engage individuals in real time to purchase a product, change an opinion, modify a behavior, cure an illness, etc. As you can imagine, this is a tremendously important question for businesses, but it is just as important (if not more so) for policymakers and for the universities that help shape policy.

However, as I remind my friends in computer science, engineering, and mathematics, one thing that all this disruption has not changed is the need to think really hard about a problem and to understand the underlying mechanisms that drive the processes generating the data. Data by itself does not yield insights.

A classic (simplistic) example is the relationship between the number of fire trucks and the intensity of fires. If we collected the data and plotted that relationship, we would see that more trucks go together with more intense fires. Without understanding the mechanism linking the two (bigger fires draw more trucks, not the other way around), one could incorrectly and dangerously advocate reducing the number of fire trucks in order to reduce the intensity of fires. As we venture into this brave new “big data” world, we can imagine data scientists, working without the context that deep experts in law, business, policy, arts, psychology, economics, sociology, political science, etc. provide, making similarly bad choices with data and analysis.
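To make the fire truck example concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made up for illustration: the variable names, the coefficients, and the data-generating process itself (bigger fires get more trucks, and each truck reduces intensity). No real fire data is involved. It compares the naive correlation between trucks and intensity with the effect of trucks once the initial size of the fire is accounted for.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def correlation(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def residuals(y, x):
    """What is left of y after removing its linear dependence on x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def simulate(n=5000, seed=0):
    """Hypothetical fires: size drives dispatch, trucks reduce intensity."""
    rng = random.Random(seed)
    initial_size, trucks, intensity = [], [], []
    for _ in range(n):
        size = rng.uniform(1, 10)                            # assumed fire size
        sent = max(1, round(0.5 * size + rng.gauss(0, 1)))   # bigger fire, more trucks
        burn = 3.0 * size - 1.0 * sent + rng.gauss(0, 1)     # each truck lowers intensity
        initial_size.append(size)
        trucks.append(sent)
        intensity.append(burn)

    # Naive view: just correlate trucks with intensity.
    naive = correlation(trucks, intensity)

    # Adjusted view: remove the part of each variable explained by fire size,
    # then relate what is left (the partial regression slope for trucks).
    adjusted = slope(residuals(trucks, initial_size),
                     residuals(intensity, initial_size))
    return naive, adjusted


if __name__ == "__main__":
    naive, adjusted = simulate()
    print(f"naive correlation (trucks vs. intensity): {naive:.2f}")
    print(f"effect of one more truck, holding size fixed: {adjusted:.2f}")
```

Under these assumptions, the naive correlation comes out positive even though every additional truck lowers intensity by construction. The adjusted slope recovers the negative effect, but only because we knew to account for fire size. The data alone would not have told us that; understanding the mechanism did.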

Data and algorithms alone will not fulfill the promises of “big data.” Instead, it is creative humans who need to think very hard about a problem and the underlying mechanisms that drive those processes. It is this intersection of creative critical thinking coupled with data and algorithms that will ultimately fulfill the promise of “big data.”

Filed under Causal Inference
