Chapter 5. Distributed Data Processing with Cascalog
In this chapter, we will cover the following recipes:
- Initializing Cascalog and Hadoop for distributed processing
- Querying data with Cascalog
- Distributing data with Apache HDFS
- Parsing CSV files with Cascalog
- Executing complex queries with Cascalog
- Aggregating data with Cascalog
- Defining new Cascalog operators
- Composing Cascalog queries
- Transforming data with Cascalog