For all its vaunted importance to large companies, working with big data is kind of hard. Everybody sort of gets the point: Mine the huge troves of data gathered over time in the course of normal business operations, and eventually you recognize patterns that yield new insights, which can lead to changes that either make or save money.
It’s working with the data that’s hard. Companies are quickly learning that the first step is to move their data into systems running Hadoop, the open-source analytics engine, and HBase, the open-source database. The tricky part is working with that data in such a way that you can get useful information out of it.
It’s the kind of problem that Jonathan Gray solved for Facebook, where he built an internal system called Puma, a platform that let developers easily build applications to work with that data without first becoming data scientists.
It’s the basic idea behind Gray’s new company, Continuuity. Its proposition is simple: If you know Java, you can build applications that work with your company’s data sets. Or, as Gray recently told me at a dinner party in New York: “Just build the app. Don’t worry about the other stuff.”
That pitch worked for Mike Dauber, a principal at Battery Ventures, who led an investment in Continuuity. Ignition Partners was also a lead investor in a $10 million Series A round, with participation from Andreessen Horowitz, Data Collective and Amplify Partners. “Companies are still getting comfortable with the idea that they can put their data into Hadoop,” Dauber told me recently. “They’re not comfortable with doing anything meaningful with it yet.”
One example that Dauber told me about: A large telecom company that’s an early Continuuity customer, but which he couldn’t name, is starting to embrace Hadoop. “They’ve got maybe five people in the whole company who know Hadoop, but thousands who can work with Java … They’re trying to do for Big Data what WebLogic did for Java.”
Beam that statement back to the mid-1990s, when Paul Ambrose and a bunch of young Java enthusiasts founded a company called WebLogic in his living room. At the time, Java was a curiosity, used mainly for making animated graphics on websites. Ambrose and his team saw it differently. They saw the potential for building real enterprise-grade applications that could be written once and then run on a whole range of different, incompatible hardware.
It caught on, and in 1998 WebLogic was acquired by BEA Systems for a little less than $200 million. Ambrose, who had been WebLogic’s CEO, became BEA’s CTO. Within a decade, software giant Oracle came calling and acquired BEA for $8.5 billion. One big motivation for the deal was Oracle’s desire to get its hands on WebLogic.
I met Ambrose at the same dinner where I met Gray, and learned that he has joined Continuuity’s board of directors. And he’s still acting like an enthusiast. “He started coming to our earliest developer meetups,” Gray told me.
For Ambrose, it’s a lot like those early days with Java. “We saw the potential for Java back then, and now it’s the basis for this whole new range of big-data applications. It’s exciting,” he said. “At WebLogic we brought the power of the Internet to enterprise developers, and I think Continuuity can play the same role for big data. Most companies are just starting to realize there may be a treasure trove hiding inside the data they already have.”
Continuuity’s links with WebLogic run deeper still. Bob Pasker, who had been WebLogic’s CTO, is an early investor and an adviser.
Ambrose’s joining the board is a quiet bit of news that Continuuity hasn’t officially shared yet. That’s because it had other news to announce. First, there was the release of version 2.0 of Reactor, its application server for Hadoop, which aims to help businesses create large-scale applications that turn those huge caches of data in Hadoop into something useful. The company also announced a big collaboration with cloud-services provider Rackspace.
Right now, Reactor isn’t available as a cloud service. Eventually, it will be, and that’s where Rackspace comes in. With Reactor running in Rackspace’s OpenStack, developers can deploy their applications to a free sandbox environment where they can try them out. From there, an application can be pushed to a public-cloud or private-cloud environment. The plan is to have a platform-as-a-service akin to Salesforce.com’s Heroku, which focuses on Ruby on Rails development. The lack of a full cloud service is fine for now, because most companies using Hadoop are doing so on their own infrastructure anyway.