Chapter 9. Testing and Debugging Spark
In an ideal world, we write perfect Spark codes and everything runs perfectly all the time, right? Just kidding; in practice, we know that working with large-scale datasets is hardly ever that easy, and there are inevitably some data points that will expose any corner cases with your code.
Considering the aforementioned challenges, therefore, in this chapter, we will see how difficult it can be to test an application if it is distributed; then, we will see some ways to tackle this. In a nutshell, the following topics will be cover throughout this chapter:
- Testing in a distributed environment
- Testing Spark application
- Debugging Spark application