Approaching Data Science Problems
It's important to have a well-structured plan for your data science project before you start the analysis and modeling phases. We'll outline some factors to keep in mind when making this plan, and then cover the technical details of preparing data for modeling in the next section.
Since this book is centered around Jupyter Notebooks, we'll start by highlighting how useful they are for the planning phase of a data science project. They offer a convenient medium for documenting your analysis and modeling plans, for example, by writing rough notes about the data or listing the models you are interested in training. Keeping these notes in the same place as the analysis that follows helps others understand what you're doing when they see your work, and gives you context when you return to it after some time away.
A large part of data science involves the use of machine learning to build predictive...