An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem.
- John Tukey
In this chapter, you will learn about data analysis and big data, including the challenges that big data poses and how they are addressed. You will also learn about distributed computing and the approach suggested by functional programming. We then introduce Google's MapReduce, Apache Hadoop, and finally Apache Spark, and see how they embrace this approach and these techniques.
In a nutshell, this chapter covers the following topics:
- Introduction to data analytics
- Introduction to big data
- Distributed computing using Apache Hadoop
- Here comes Apache Spark