Klout’s Big Data Collection Pipelines with Play Framework
The fantastic engineering folks at Klout wrote a great blog post last Friday on their use of Iteratees in Play Framework for calculating the social influence of users across several social networks.
Our users and clients expect data to be up to date and accurate, and it has been a significant technical challenge to reliably meet these goals. In this blog post we describe the usage of Play! Iteratees in our redesigned data collection pipeline. This post is not meant to be a tutorial on the concept of Iteratees, for which there are many great posts already, such as James Roper's post and Josh Suereth's post. Rather, this post is a detailed look at how Klout uses Iteratees in the context of large-scale data collection and why it is an appropriate and effective programming abstraction for this use case. In a later post we will describe our distributed Akka-based messaging infrastructure, which allowed us to scale and distribute our Iteratee-based collectors across clusters of machines.
Head over to the Klout blog to read the rest of the post. Thanks to Naveen Gattu for providing such an excellent post!