Summary
The recipes in this chapter covered importing and preparing non-tabular data in a variety of forms, including JSON and HTML. We introduced Spark for working with big data and discussed how to persist tabular and non-tabular data. We also examined how to create a data lake for versioning. In the next chapter, we will learn how to take the measure of our data.