Summary
In this chapter, we learned how to transform our data using Scala with Spark. You now understand the difference between transformations and actions. You’ve learned how to use select, selectExpr, filter, join, and sort to reduce data to just what you need for your transformation. You’ve worked with various types of complex data and generated aggregations using group by and windows. We’ve covered a lot, and now you’ll be able to take what you’ve learned and apply it to a real-world scenario.
In the next chapter, we will cover how to work with various sources and sinks, including object data and streaming data.