Walking through the data science workflow
Although the path you take will vary from problem to problem, you can expect to follow the same rough outline for most of them. The following diagram shows the flow we will use in this chapter; it is the same flow you will use for most problems that you come across:
The data science flow shown in Figure 9.1 consists of the following steps:
- Understanding the problem space.
- Data exploration/preprocessing/manipulation. We combine these into a single step, but each one has distinct parts that we will dive into.
- Feature selection/extraction.
- Predictive modeling.
- Project outcomes and conclusion.
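To make the outline concrete, the steps above can be sketched as a chain of small functions. This is only a toy illustration in plain Python, not the approach used later in the chapter: the dataset, the feature names, and every function name here (`preprocess`, `select_feature`, `fit_line`, `run_workflow`) are hypothetical, and the "model" is a one-variable least-squares line chosen purely for brevity.

```python
from statistics import mean

# Step 1 (understanding the problem space): assume we want to predict an
# exam score from study habits. The rows below are made-up illustration data:
# (hours_studied, hours_slept, exam_score); None marks a missing value.
RAW_ROWS = [
    (1.0, 8.0, 52.0),
    (2.0, 6.0, 57.0),
    (3.0, None, 61.0),
    (4.0, 7.0, 68.0),
    (5.0, 5.0, 71.0),
    (6.0, 8.0, 79.0),
]
FEATURE_NAMES = ["hours_studied", "hours_slept"]


def preprocess(rows):
    """Step 2: drop any row that contains a missing value."""
    return [row for row in rows if None not in row]


def correlation(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sum((x - mx) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def select_feature(rows):
    """Step 3: return the index of the feature most correlated with the target."""
    target = [row[-1] for row in rows]
    scores = [
        abs(correlation([row[i] for row in rows], target))
        for i in range(len(FEATURE_NAMES))
    ]
    return scores.index(max(scores))


def fit_line(xs, ys):
    """Step 4: fit y = slope * x + intercept by ordinary least squares."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def run_workflow(rows):
    """Run steps 2 to 5 in order and return a small summary of the outcome."""
    clean = preprocess(rows)
    idx = select_feature(clean)
    xs = [row[idx] for row in clean]
    ys = [row[-1] for row in clean]
    slope, intercept = fit_line(xs, ys)
    # Step 5 (outcomes): mean absolute error, measured on the training rows
    # only, which is optimistic; a real project would hold out test data.
    mae = mean(abs(slope * x + intercept - y) for x, y in zip(xs, ys))
    return {
        "rows_kept": len(clean),
        "feature": FEATURE_NAMES[idx],
        "slope": slope,
        "intercept": intercept,
        "mae": mae,
    }


if __name__ == "__main__":
    print(run_workflow(RAW_ROWS))
```

Each function maps to one step of the flow, so swapping in a more realistic technique for any single step leaves the overall shape unchanged.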
These steps will become very familiar to you in this and the following chapters. Let's now look at the first step in this journey.