Big Data Shrinks to Grow – NYTimes.com
What’s next for Big Data? Maybe a little thinning down of the industry and a little more focus.
In fact, it may be underway. Google Trends shows searches for the term “Big Data” peaked in October, ending a nearly ceaseless climb that began three years earlier. Google doesn’t offer absolute numbers, but in that time there was a hundredfold increase in interest in the term.
That seems a reasonable match to the number of start-ups, public relations pitches and gurus using the term over this period.
All high-tech hype cycles must end, and usually when they do there’s a crash. Some pundits gloat over bad ideas that got funded and fell apart. The true believers set to building something durable from the remnants of the boom. In some cases, like the ’90s Internet hype, the outcome more than fulfills the promise, but few see it coming.
In the case of Big Data, this probably means less focus on back-end technologies like new types of storage or database frameworks, and a rethinking of how best to integrate human knowledge, algorithms and diverse sets of data.
In December, the company Kaggle changed its Big Data business model in important ways. Kaggle had been a darling of Big Data followers for its contests that sought the world’s best statistical analysts.
Now Kaggle has decided to focus on a few specific industries, starting with oil and gas.
“We liked to say ‘It’s all about the data,’ but the reality is that you have to understand enough about the domain in order to make a business,” said Anthony Goldbloom, Kaggle’s founder and chief executive. “What a pharmaceutical company thinks a prediction about a chemical’s toxicity is worth is very different from what Clorox thinks shelf space is worth. There is a lot to learn in each area.”
Oil and gas, which for Kaggle means mostly fracking wells in the United States, have well-defined data sets and a clear need to find working wells. While the data used in traditional oil drilling is understood, fracking is a somewhat different process. Variables like how long deep rocks have been cooked in the earth may matter. So does which teams are working the fields, meaning early-stage proprietary knowledge is also in play. That makes it a good field to go into and standardize.
Kaggle may also work in the drug industry again, as well as insurance, Mr. Goldbloom said. “We will still do competitions, they are a great engine for finding talent,” he said, “but our returns as a business will be higher if we focus.”
There is reason to think so. Kaggle’s sharper focus follows the odyssey of the Climate Corporation, which in its original incarnation as WeatherBill tried to sell predictive data about weather to farmers, house painters and golf courses. After it focused on agriculture, the Climate Corporation was bought in October by Monsanto for $930 million.
Another company that has looked at Big Data problems in an industry-specific way is Palantir. In December, Palantir, which initially focused on government security work and has moved into areas like finance, disaster relief and pharmaceuticals, raised $100 million on a valuation of $9 billion. In September, Palantir had raised money at a valuation of $6 billion.
Not everyone believes Big Data has to narrow its focus. “Predictive modeling is still going to change the world in every area,” said Jeremy Howard, a close collaborator of Mr. Goldbloom who left Kaggle over the change in direction. He said he was focusing on building new kinds of software that could better learn about the data it was crunching and offer its human owners insights on any subject.
“A lone wolf data scientist can still apply his knowledge to any industry,” he said. “I’m spending time in areas where I have no industrial knowledge and finding things. I’m going to have to build a company, but first I have to spend time as a lone wolf.”