A more structured approach to predictive analytics
We near to finally draw conclusions from our predictive analyses, so I think it is the right moment to make you clear that the one we have followed here is a simplified approach to such kind of analyses.
We already talked before about the difference between training and testing dataset, let me place this into the context of predictive analysis.
When you estimate a model for the purpose of gaining prediction about the future, you are basically assuming that the relative importance of variables and all the other circumstances observed in your estimation sample will be found as well in future data and that therefore we can draw prediction about the future from past data.
What if this is not true? In this case, we would have a model able to describe the past but not to predict the future.
In a similar way, if we specify our model in a way which is too much related to the actual data available within our estimation sample, we are going to incur in...