
Terabytes is not big data, petabytes is

I often wonder what's behind the growing interest in Hadoop and other NoSQL technologies. I realize that if you're Yahoo, such technology makes sense. I don't get why everyone else wants to use it.

Reading Stephen O'Grady's self-review of his predictions for 2010 gave me, for the first time, some insight into how such people think:

Democratization of Big Data

Consider that RedMonk, a four person analyst shop, has the technical wherewithal to attack datasets ranging from gigabytes to terabytes in size. Unless you’re making institutional money, budgets historically have not permitted this. The tools of Big Data have never been more accessible than they are today.

Google even searches the future now

Wired has already reported that Google Trends could have been used to detect the swine flu epidemic in Mexico weeks before it was reported in the news media. Then, in anticipation of the Eurovision Song Contest 2009, Google engineers created a widget that takes Google Trends data as input (per country) and transforms the search activity in each country into Eurovision points from 1 to 12. I copied the prediction to my Facebook page just as the Eurovision final was starting:

