6. Big Data File Formats
Activity 6.01: Selecting an Appropriate Big Data File Format for Game Logs
Solution
- In the Chapter06 directory, create the Activity06.01 directory to store the files for this activity.
- Move the session_log file into the Chapter06/Data directory.
- Open your Terminal (macOS or Linux) or Command Prompt window (Windows), move to the installation directory, and open the Spark shell in it using the following command:
spark-shell --packages org.apache.spark:spark-avro_2.11:2.4.5
You should get the following output:
This command launches the Spark shell with the spark-avro package loaded. Next, we load the dataset from the CSV file.
- Load the session_log.csv dataset:
val df_ses_log_csv = spark.read.options(Map("inferSchema" -> "true", "delimiter" -> ",", "header" -> "true")).csv("F:/Chapter06/Data/session_log.csv")
Note
Update the input path of the file according...
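Before moving on, it can help to confirm that the CSV file was read as expected. The following is a minimal sketch, not part of the original solution. It assumes it is run in the same spark-shell session and that df_ses_log_csv was created by the load step above. The column names and row count depend on your copy of session_log.csv, so no expected output is shown.

```scala
// Assumption: run inside the same spark-shell session,
// after df_ses_log_csv has been loaded by the step above.

// Print the column names and the types that inferSchema assigned.
df_ses_log_csv.printSchema()

// Display the first five rows without truncating long values.
df_ses_log_csv.show(5, false)

// Count the rows; this triggers a full read of the CSV file.
println(s"Rows loaded: ${df_ses_log_csv.count()}")
```

If the schema shows a single column holding the whole line, or the header row appears as data, the delimiter and header options in the load step are the first things to check.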