Summary
At this point, you have a much better grasp of how to look at a real-world problem and understand the full workflow needed to solve it.
Using the goal of making a high-quality wine as the backdrop, you first saw why understanding the problem space matters for framing the work ahead. One way to build that understanding was to examine what each column in the dataset represents.
After that, we looked at how to further explore and clean the data. You saw that data cleaning can be split into two parts: a first part in which the data needs to stay human-readable so you can explore it, and a second part in which the focus shifts to building a good model. Steps such as scaling the data belong to the second part, once you have a good understanding of what you are looking at.
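As a reminder of why the ordering matters, here is a minimal sketch that inspects a column in its raw units and only then scales it. The column name, its values, and the helper `standardize` are hypothetical and chosen for illustration; they are not taken from the chapter's dataset or code, and standardization is only one of several possible scaling methods.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread, so map every value to 0.0.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Hypothetical raw values for a single column (illustrative only).
alcohol = [9.4, 9.8, 10.5, 11.2, 12.0]

# Exploration phase: keep the raw units so the numbers stay human-readable.
print("min:", min(alcohol), "max:", max(alcohol))

# Modeling phase: scale only once the column is understood.
alcohol_scaled = standardize(alcohol)
print("scaled:", [round(z, 3) for z in alcohol_scaled])
```

After scaling, the values no longer read as familiar units, which is why it helps to finish exploring the data first.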
In the pre-training data phase, we set up a conda environment with everything we needed, including Jupyter notebooks. Once the notebook was running, the first thing we did was load our two datasets and combine them...