I have said it before and will say it again: you don’t have to be a Fortune 500 company to use Big Data. Big Data is more about understanding your data than it is about how big it is. It means understanding all your different data sources and gathering them into one place so that you can analyze and understand them better.
At Clearwire we have a big data challenge: processing millions of unique usage records, comprising terabytes of data, for millions of customers every week. Historically, we used massive purpose-built database solutions to process this data, but they weren’t particularly fast, nor did they lend themselves to analysis. As mobile data volumes increase exponentially, we needed a scalable solution that could process usage data for billing, provide a data analysis platform, and inexpensively store the data indefinitely. The solution? A Hadoop-based platform that allowed us to architect and deploy an end-to-end solution, built on a combination of physical data nodes and virtual edge nodes, in less than six months. This solution allowed us to turn off our legacy usage processing system and reduce processing times from hours to as little as 15 minutes. This improvement has enabled Clearwire to deliver actionable usage data to partners faster and more predictably than ever before. Usage processing was just the beginning; we’re now turning to the raw data stored in Hadoop, adding new data sources, and starting to analyze the data. Clearwire is now able to put multiple data sources in the hands of our analysts for further discovery and actionable intelligence.
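To give a feel for what "processing usage records" means in a Hadoop-style job, here is a minimal sketch of the map and reduce steps in plain Python. It is not Clearwire's actual pipeline: the record layout (`customer_id,timestamp,bytes_used`), the sample data, and the function names are all assumptions made up for illustration, and it runs in a single process rather than on a cluster.

```python
from collections import defaultdict

# Hypothetical record layout (an assumption): "customer_id,timestamp,bytes_used"
SAMPLE_RECORDS = [
    "cust-001,2013-07-01T00:05:00,1500",
    "cust-002,2013-07-01T00:06:00,700",
    "cust-001,2013-07-01T00:20:00,2500",
    "malformed line",
]


def map_record(line):
    """Map step: emit (customer_id, bytes_used) pairs, or nothing for bad input."""
    parts = line.strip().split(",")
    if len(parts) != 3:
        return []
    customer_id, _timestamp, bytes_used = parts
    try:
        return [(customer_id, int(bytes_used))]
    except ValueError:
        # Skip records whose byte count is not an integer.
        return []


def reduce_usage(pairs):
    """Reduce step: sum the bytes used per customer."""
    totals = defaultdict(int)
    for customer_id, bytes_used in pairs:
        totals[customer_id] += bytes_used
    return dict(totals)


def total_usage(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = [pair for line in lines for pair in map_record(line)]
    return reduce_usage(pairs)


if __name__ == "__main__":
    for customer_id, total in sorted(total_usage(SAMPLE_RECORDS).items()):
        print(customer_id, total)
```

On a real cluster the map step would run in parallel across the data nodes and the framework would group the pairs by customer before the reduce step. The logic per record stays about this simple, which is a big part of why this kind of workload scales so well.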
Lately a lot of my co-workers have asked me whether Hadoop runs on Windows. After going to the Hadoop Summit last month, I have been able to tell them about Azure HDInsight, which is basically Apache Hadoop running on Windows Azure.
It appears that Microsoft has been working with Hortonworks to bring Apache Hadoop to Windows, and here is the end product.