Wrangling Customer Usage Data with Hadoop

Here is our session from the Hadoop Summit 2013.


Title: Wrangling Customer Usage Data with Hadoop

Slides: http://www.slideshare.net/Hadoop_Summit/hall-johnson-june271100amroom211v2


At Clearwire we have a big data challenge: Processing millions of unique usage records comprising terabytes of data for millions of customers every week. Historically, massive purpose-built database solutions were used to process data, but weren’t particularly fast, nor did they lend themselves to analysis. As mobile data volumes increase exponentially, we needed a scalable solution that could process usage data for billing, provide a data analysis platform, and inexpensively store the data indefinitely. The solution? A Hadoop-based platform allowed us to architect and deploy an end-to-end solution based on a combination of physical data nodes and virtual edge nodes in less than six months. This solution allowed us to turn off our legacy usage processing solution and reduce processing times from hours to as little as 15-min. This improvement has enabled Clearwire to deliver actionable usage data to partners faster and more predictably than ever before. Usage processing was just the beginning; we’re now turing to the raw data stored in Hadoop, adding new data sources, and starting to anlyze the data. Clearwire is now able to put multiple data sources in the hands of our analysts for further discovery and actionable intelligence.