Summary
In this chapter, you saw how difficult testing and debugging Spark applications can be, and how much more critical these tasks become in a distributed environment. We also discussed some advanced ways to tackle them. In summary, you first learned how to test in a distributed environment. Then you learned a better way of testing your Spark application. Finally, we discussed some advanced ways of debugging Spark applications.
This is more or less the end of our little journey through advanced topics on Spark. Our general suggestion to you as a reader, especially if you are relatively new to data science, data analytics, machine learning, Scala, or Spark, is to first understand what type of analytics you want to perform. To be more specific, if your problem is a machine learning problem, try to work out which type of learning algorithm would be the best fit, that is, classification, clustering, regression, recommendation, or frequent pattern...