Chapter 12: Taming Big Data with SQL Server
In this final chapter, we will work with data that lives outside of SQL Server. We will introduce technologies for accessing external data that also provide big data processing capabilities. One of the newest features of SQL Server 2019 is Big Data Clusters, which combines SQL Server workloads, scalable storage filesystems, and the Spark engine in containers managed by Kubernetes. This takes us away from the relational data approach we are used to in SQL Server.
In this chapter, we will cover the following main topics:
- Big data overview
- Accessing external data with PolyBase
- Explaining the SQL Server Big Data Clusters architecture and deployment
- Working with a SQL Server Big Data Clusters workload
Let's get started!