Testing your ML solution by design
Beyond regular software development tests, such as unit, integration, system, and acceptance tests, ML solutions need further tests because they involve data and ML models, both of which change over time. Here are some concepts for testing by design; applying them to your use cases helps you produce robust ML solutions.
Data testing
The goal of data testing is to ensure that the data is of high enough quality for ML model training. In general, the better the quality of the data, the better the models trained on it perform on the given tasks. So how do we assess data quality? We can inspect the following five factors:
- Accuracy
- Completeness (no missing values)
- Consistency (in terms of expected data format and volume)
- Relevance (data should meet the intended need and requirements)
- Timeliness (the latest or up-to-date data)
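Some of these factors can be checked automatically before training. The following is a minimal sketch in plain Python, not a definitive implementation: the function name `check_data_quality`, its parameters, the column names, and the thresholds are all hypothetical and chosen only for illustration. It covers completeness, consistency, and timeliness directly, and uses a simple value-range check as a rough proxy for accuracy. Relevance depends on the business requirement and is not automated here.

```python
from datetime import datetime, timedelta


def check_data_quality(rows, schema, min_rows, time_col, max_age, now, ranges=None):
    """Return a dict mapping each check name to True (passed) or False (failed).

    rows   : list of dict records (hypothetical input format)
    schema : dict mapping column name -> expected Python type
    ranges : optional dict mapping column name -> (low, high) plausible bounds
    """
    ranges = ranges or {}
    report = {}

    # Completeness: every expected column is present and not None in every row.
    report["completeness"] = all(
        row.get(col) is not None for row in rows for col in schema
    )

    # Consistency: enough rows, and every non-missing value has the expected type.
    report["consistency"] = len(rows) >= min_rows and all(
        isinstance(row[col], typ)
        for row in rows
        for col, typ in schema.items()
        if row.get(col) is not None
    )

    # Accuracy (proxy only): non-missing numeric values fall within plausible bounds.
    report["accuracy"] = all(
        isinstance(row[col], (int, float)) and low <= row[col] <= high
        for row in rows
        for col, (low, high) in ranges.items()
        if row.get(col) is not None
    )

    # Timeliness: the newest record is no older than max_age relative to `now`.
    stamps = [row[time_col] for row in rows if isinstance(row.get(time_col), datetime)]
    report["timeliness"] = bool(stamps) and (now - max(stamps)) <= max_age

    return report


# Illustrative records: one missing age and one implausible age.
rows = [
    {"user_id": 1, "age": 34, "event_time": datetime(2024, 1, 10)},
    {"user_id": 2, "age": None, "event_time": datetime(2024, 1, 11)},
    {"user_id": 3, "age": 230, "event_time": datetime(2024, 1, 12)},
]
schema = {"user_id": int, "age": int, "event_time": datetime}

report = check_data_quality(
    rows,
    schema,
    min_rows=3,
    time_col="event_time",
    max_age=timedelta(days=7),
    now=datetime(2024, 1, 15),
    ranges={"age": (0, 120)},
)
print(report)
```

With this sample data, the completeness check fails because of the missing age and the accuracy proxy fails because of the out-of-range age, while the consistency and timeliness checks pass. In practice, a failed check would typically block the training step or raise an alert.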
Based...