This section starts by laying out the implementation infrastructure for Chapter 4, Building a Spam Classification Pipeline. The goal of this section is to begin developing a data pipeline that analyzes the flight-on-time dataset. Before implementation, the first step is to set up the prerequisites, which the next subsection covers.
Getting started
Setting up prerequisite software
The following prerequisites, or prerequisite checks, are recommended. MongoDB is new to this list:
- Increase Java memory
- Review JDK version
- Self-contained Scala application based on Simple Build Tool (SBT), where all dependencies are wired into the build.sbt file
- MongoDB
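To make the third item on the list concrete, here is a minimal sketch of what a self-contained build.sbt could look like. It is illustrative only: the project name, the Scala version, the choice of the MongoDB Scala driver, its version, and the heap size are all assumptions, not values taken from this chapter, and sbt 1.x slash syntax is assumed. Adjust them to match your environment.

```scala
// build.sbt: a hypothetical sketch, not the chapter's actual build file.

// Assumed project name and version (placeholders).
name := "flight-pipeline-sketch"
version := "0.1.0"

// Assumed Scala version; use the one your toolchain expects.
ThisBuild / scalaVersion := "2.12.18"

// All dependencies are declared here, so the build is self-contained.
// The MongoDB Scala driver and its version are assumptions for illustration.
libraryDependencies ++= Seq(
  "org.mongodb.scala" %% "mongo-scala-driver" % "2.9.0"
)

// One possible way to give the application more Java memory when launched
// with `sbt run`: fork a separate JVM and pass it a larger maximum heap.
// javaOptions only take effect when fork is enabled. 2G is an assumed value.
run / fork := true
run / javaOptions += "-Xmx2G"
```

With a file like this in the project root, `sbt run` resolves the declared dependencies and starts the application in a forked JVM with the larger heap.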
We start by detailing the steps to increase...