Summary
In this chapter, we learned how storage has evolved to manage large volumes of data with distributed file systems. These systems have progressed over time from data lakes to lakehouses, which use cloud object storage to efficiently store a variety of data types and volumes in a way that is not possible with data warehouses.
We also learned how to read, write, and modify data stored in these systems. We then learned about streaming systems and the various ways to use them to enrich and store data incrementally. All of this is fundamental knowledge for the Scala data engineer.
In the next chapter, we’ll dive deep into how to further transform and use data.