Encoding and decoding data
In this section, we will see how to handle character encodings such as American Standard Code for Information Interchange (ASCII), Unicode Transformation Format 8 (UTF-8), and UTF-16 when reading data from, or writing data to, different sources. As before, we will look at examples using Spark, SQL, and ADF.
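Before turning to the individual services, it may help to see what an encoding mismatch looks like in practice. The following is a minimal, service-independent sketch in plain Python; the sample string and variable names are illustrative assumptions and do not come from any Synapse, Spark, or ADF API.

```python
# Illustrative only: how one string maps to bytes under different encodings.
text = "café"

utf8_bytes = text.encode("utf-8")       # 'é' takes two bytes in UTF-8
utf16_bytes = text.encode("utf-16-le")  # two bytes per character here, no BOM

print(len(utf8_bytes))   # 5
print(len(utf16_bytes))  # 8

# Decoding with the same encoding that was used for writing restores the text.
assert utf8_bytes.decode("utf-8") == text

# Decoding with a different encoding silently produces garbled text.
garbled = utf8_bytes.decode("latin-1")
print(garbled)  # cafÃ©

# ASCII cannot represent 'é', so a strict encode raises an error.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

The same principle applies to every tool covered in this section: the reader has to be told, or has to correctly infer, the encoding that the writer used.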
Encoding and decoding using SQL
In Synapse SQL, the collation defines how string data is encoded, sorted, and compared. Collation can be set at both the database level and the table level (per column, in the table definition). At the database level, you can set the collation as shown here:
CREATE DATABASE TripsDB COLLATE Latin1_General_100_BIN2_UTF8;
At the table level, you set the collation on individual columns in the table definition, as shown here:
CREATE EXTERNAL TABLE FactTrips ( [tripId] VARCHAR (40) COLLATE Latin1_General_100_BIN2_UTF8, . . . )
Once you define the right collation, Synapse SQL takes care of storing the data in the right format and using the right...