Spark SQL language reference
Being a part of the overarching Hadoop ecosystem, Spark has traditionally been Hive-compliant. While the Hive query language diverges greatly from ANSI SQL standards, Spark SQL in Spark 3.0 can be made ANSI SQL-compliant by setting the spark.sql.ansi.enabled configuration. With this configuration enabled, Spark SQL uses an ANSI SQL-compliant dialect instead of a Hive dialect.
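As a minimal sketch of what this setting changes, the following statements turn ANSI mode on for the current session and then run an invalid cast. The literal 'abc' is only an illustrative value. Under the default Hive-style behavior, such a cast is expected to return NULL, whereas with ANSI mode enabled Spark is expected to raise a runtime error instead; the exact error message depends on the Spark version.

```sql
-- Enable ANSI SQL compliance for the current session
SET spark.sql.ansi.enabled = true;

-- Illustrative invalid cast: with ANSI mode on, this should fail at runtime
-- rather than silently returning NULL
SELECT CAST('abc' AS INT);

-- Switch back to the default (Hive-style) behavior
SET spark.sql.ansi.enabled = false;
```

The same property can also be set when the Spark session is created, so that it applies to every query in the application.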
Even with ANSI SQL compliance enabled, Spark SQL may not entirely conform to the ANSI SQL dialect. In this section, we will explore some of the prominent DDL and DML syntax of Spark SQL.
Spark SQL DDL
The syntax for creating a database and a table using Spark SQL is presented as follows:
CREATE DATABASE IF NOT EXISTS feature_store;

CREATE TABLE IF NOT EXISTS feature_store.retail_features
USING DELTA
LOCATION '/FileStore/shared_uploads/delta/retail_features.delta';
In the previous code block, we do the following:
- First, we create a database if it doesn't...