Machine learning isn't perfect
There are many caveats of machine learning. Many are specific to different models being implemented, but there are some assumptions that are universal for any machine learning model:
- The data used, for the most part, is preprocessed and cleaned using the methods we outlined in earlier chapters. Almost no machine learning model will tolerate dirty data with missing values or categorical values. Use dummy variables and filling/dropping techniques to handle these discrepancies.
- Each row of a cleaned dataset represents a single observation of the environment we are trying to model.
- If our goal is to find relationships between variables, then there is an assumption that there is some kind of relationship between these variables.
This assumption is particularly important. Many machine learning models take this assumption very seriously. These models are not able to communicate that there might not be a relationship.
- Machine learning models are generally considered...