Using Apache Spark to prepare data for storage on Snowflake
This recipe provides an example of how Apache Spark and Snowflake can be combined to play to each system's strengths. It walks through a scenario in which data is read from Snowflake into a Spark DataFrame, transformed, and then written back to Snowflake from a Spark DataFrame.
Getting ready
You will need to be connected to your Snowflake instance via the Web UI or the SnowSQL client to execute this recipe.
It is assumed that you have already configured the Snowflake Connector for Spark and can connect to the Snowflake instance successfully through Spark.
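For reference, the Snowflake Connector for Spark is typically configured through an options map that is passed to Spark's DataFrame reader and writer. The following is a minimal sketch of such a map; the URL, credentials, and warehouse are placeholders, and the sample database and schema names are assumptions based on Snowflake's standard sample data:

```scala
// Connector options for the Snowflake Connector for Spark.
// All values are placeholders -- substitute your own account details.
val sfOptions = Map(
  "sfURL"       -> "<account_identifier>.snowflakecomputing.com", // placeholder
  "sfUser"      -> "<username>",                                  // placeholder
  "sfPassword"  -> "<password>",                                  // placeholder
  "sfDatabase"  -> "SNOWFLAKE_SAMPLE_DATA", // Snowflake's sample database
  "sfSchema"    -> "TPCH_SF1",              // assumed sample schema
  "sfWarehouse" -> "<warehouse_name>"       // placeholder
)
```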
How to do it
We will read data from the Snowflake sample tables and transform it before writing it back to Snowflake as a new table. The code shown in the following steps should be added to a single Scala file called snowflake_transform.scala, since we will be calling that file from within spark-shell (for example, with the shell's :load command).
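Before the individual steps, it helps to see the overall shape the finished file will take: read from Snowflake, transform in Spark, write back to Snowflake. The sketch below is an assumption rather than the recipe's exact code; it relies on the spark session that spark-shell provides, the sfOptions map sketched in the Getting ready section, and illustrative table names:

```scala
// snowflake_transform.scala -- overall read/transform/write skeleton (a sketch).
// Assumes the `spark` session provided by spark-shell and the sfOptions map
// from the Getting ready section.
import org.apache.spark.sql.DataFrame

val SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"

// 1. Read a sample table from Snowflake into a Spark DataFrame.
val customerDf: DataFrame = spark.read
  .format(SNOWFLAKE_SOURCE_NAME)
  .options(sfOptions)
  .option("dbtable", "CUSTOMER") // illustrative table name
  .load()

// 2. Transform the data in Spark (placeholder transformation).
val transformed = customerDf.filter("C_ACCTBAL > 0")

// 3. Write the result back to Snowflake as a new table.
transformed.write
  .format(SNOWFLAKE_SOURCE_NAME)
  .options(sfOptions)
  .option("dbtable", "CUSTOMER_TRANSFORMED") // illustrative target table
  .mode("overwrite")
  .save()
```

With that shape in mind, the steps are as follows: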
- Let's start by creating a new database to hold the output table, as sketched below...
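A minimal sketch of this step, assuming the sfOptions map from the Getting ready section, uses the connector's Utils.runQuery helper to execute DDL against Snowflake; the database name SPARK_RECIPE_DB is a placeholder, not the recipe's actual name:

```scala
import net.snowflake.spark.snowflake.Utils

// Create a new database to receive the transformed data.
// SPARK_RECIPE_DB is a placeholder name.
Utils.runQuery(sfOptions, "CREATE DATABASE IF NOT EXISTS SPARK_RECIPE_DB")
```

Alternatively, the same CREATE DATABASE statement can be run directly from the Web UI or SnowSQL session mentioned in the Getting ready section.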