Blog
Posted 2012-08-07 15:22:00.0
In our fifth installment of this series we showed how to implement TF-IDF in a Cascading application. If you haven’t read that yet, it’s probably best to start there.
Today’s post extends the TF-IDF app to show best practices for test-driven develop
more »
Posted 2012-07-31 18:17:00.0
In our fourth installment of this series we showed how to use HashJoin on two pipes to perform “stop words” filtering at scale in a Cascading 2.0 application. If you haven’t read that yet, it’s probably best to start there.
more »
Posted 2012-07-24 10:13:00.0
In our third installment of this series we showed how to write a custom Operation for a Cascading 2.0 application. If you haven’t read that yet, it’s probably best to start there.
more »
Presentations
Introduction to Cascading, an application framework for Java developers to deploy robust, enterprise-grade applications on Apache Hadoop. We'll start with the simplest Cascading program possible (file copy in a distributed file system) and progress in sma
more »