A solution walkthrough for sportstickets.com
We will walk through a fictional example, sportstickets.com, a sports-ticketing franchise. The company manages sporting events and sells tickets for them at a discounted rate. The business analysts at sportstickets.com want to set up an end-to-end data-wrangling pipeline for ticket sales analysis.
We will explore the different phases of the data-wrangling pipeline and explain how the Pandas library helps perform the operations in each phase effectively and efficiently.
Figure 9.1: Different phases of the data-wrangling pipeline
Prerequisites for data ingestion
To perform data-wrangling activities for the preceding use case, we first need to ingest data into a data lake. To ingest data from on-premises databases into a cloud environment, we have the following options:
- Extract data programmatically using SQL queries from...