Understanding model performance, data drift, and MLOps
In machine learning, you can train a model once and use it forever, just like how in software engineering, you can usually write a program once and use it forever – provided the underlying operating system remains operational and supported. However, models are built from data. That data represents observations from a certain time period.
As time moves forward, the data you provided the model during training may start to resemble the real world less and less accurately as time goes on. For example, if I trained a regression model to predict what price a used car would sell for based on its make, model, and mileage, those assumptions would initially be quite valid. However, as time moved forward, new vehicles entered the market, and overall market trends evolved, the prices my model predicted would get further and further from realistic prices.
A somewhat recent example of this was the worldwide shortage of semiconductor...