Portfolio data engineering project
In this section, we will walk through an example Azure data engineering project in sports analytics: building a pipeline that ingests, cleans, and visualizes data.
Scenario
You were recently hired by a company, Connect, that has provided all kinds of data-related services to its clients for the past 2 months. You have been working as part of a team, but today you have been given your first solo project.
A new season of the English Premier League has just commenced, and your company has assigned you to a data engineering job posted by a client. The client provided several website links from which to acquire the data. Your job is to extract, transform, and load all of this data into cloud storage and a PostgreSQL database every Saturday and Sunday until the end of the season.
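To make the requirement concrete, here is a minimal sketch of the weekend extract, transform, and load flow in plain Python. It is only an illustration under stated assumptions: every function name is hypothetical, the sources are in-memory placeholders rather than the client's links, the "team" field is an assumed column, and plain lists stand in for cloud storage and the PostgreSQL database.

```python
from datetime import date


def is_run_day(day: date) -> bool:
    """Return True when the pipeline should run (Saturday or Sunday)."""
    # date.weekday() counts Monday as 0, so Saturday is 5 and Sunday is 6.
    return day.weekday() in (5, 6)


def extract(sources):
    """Collect rows from every source (placeholder for fetching the client's links)."""
    return [row for rows in sources.values() for row in rows]


def transform(rows):
    """Drop rows with a missing or blank 'team' value and strip stray whitespace.

    The 'team' field is an assumption made for this sketch only.
    """
    return [
        {**row, "team": row["team"].strip()}
        for row in rows
        if row.get("team") and row["team"].strip()
    ]


def load(rows, targets):
    """Append the cleaned rows to every target.

    Lists stand in for cloud storage and PostgreSQL in this sketch.
    """
    for target in targets:
        target.extend(rows)


def run_pipeline(day, sources, targets):
    """Run extract -> transform -> load on scheduled days; return the number of rows loaded."""
    if not is_run_day(day):
        return 0
    rows = transform(extract(sources))
    load(rows, targets)
    return len(rows)
```

Calling run_pipeline with a weekday date returns 0 and leaves the targets untouched, while a Saturday or Sunday date loads the cleaned rows into each target. In a real project the schedule would more likely live in an orchestration tool than in an in-code date check; the check is kept here only to mirror the requirement.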
As a data engineer, you should be able to assess the requirements and decide which tool will make each task most efficient. For example, using Spark...