Introduction to Statistical Drift
Statistical drift refers to changes in the underlying data distribution itself. It can affect the input features, the target variable, or both. Such drift does not always degrade the model's performance, but tracking it is still important because it shows how the data the model operates on is changing.
To identify statistical drift, monitor summary metrics such as:
- Mean and Standard Deviation: A significant change in either suggests that the center or spread of the distribution has moved.
- Kurtosis and Skewness: Changes indicate that the shape of the distribution has altered, for example its tail weight or asymmetry.
- Quantile Statistics: Track changes in selected quantiles, for example the 25th, 50th, and 75th percentiles.
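The metrics above can be computed for a reference sample and a more recent sample and then compared. The following is a minimal sketch using only the Python standard library. The function names `summary_stats` and `compare_stats` and the sample values are hypothetical, chosen for illustration. It uses population moments and excess kurtosis, which is one convention among several.

```python
import statistics


def summary_stats(values):
    """Compute the drift-monitoring statistics for one numeric sample."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        raise ValueError("skewness and kurtosis are undefined for a constant sample")
    # Standardized third and fourth central moments (population form).
    skewness = sum((x - mean) ** 3 for x in values) / (n * std ** 3)
    kurtosis = sum((x - mean) ** 4 for x in values) / (n * std ** 4) - 3.0  # excess
    # Quartiles: 25th, 50th, and 75th percentiles.
    q25, q50, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "q25": q25,
        "q50": q50,
        "q75": q75,
    }


def compare_stats(reference, current):
    """Return current minus reference for every monitored statistic."""
    ref = summary_stats(reference)
    cur = summary_stats(current)
    return {name: cur[name] - ref[name] for name in ref}


# Made-up illustrative samples, e.g. customer ages in two time windows.
reference_ages = [22, 25, 27, 30, 31, 35, 38, 41, 45, 52]
current_ages = [29, 33, 36, 38, 42, 44, 47, 51, 58, 63]

for name, delta in compare_stats(reference_ages, current_ages).items():
    print(f"{name}: {delta:+.3f}")
```

What counts as a "significant" difference depends on the feature and the sample size, so any alerting threshold applied to these deltas would need to be chosen per use case.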
To understand how model drift and statistical drift are connected, consider the following key points:
- Cause and Effect Relationship: Statistical drift in either the features or the target variable frequently serves as a precursor to model drift. For example, should the age demographic of your customer base shift (indicative...