Chapter 3, Explaining Machine Learning with Facets
- Datasets in real-life projects are rarely reliable. (True|False)
True. In most cases, datasets require a fair amount of quality control before they can be used as input data for ML models. Only in rare cases, at companies that constantly check the quality of their data, is the data perfect.
- In a real-life project, there are no missing records in a dataset. (True|False)
False in most cases: data is usually missing.
True in some critical areas, such as aerospace projects, where the data is clean.
- The distribution distance is the distance between two data points. (True|False)
False. The distribution distance is measured between two data distributions.
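As a minimal sketch of this idea, the Python snippet below compares the value distributions of one feature in two datasets rather than two individual points. It uses the total variation distance, which is only one of several possible measures and is chosen here for illustration; it is not necessarily the metric Facets computes. The function names and the sample values are hypothetical.

```python
from collections import Counter


def category_distribution(values):
    """Return the relative frequency of each distinct value."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def total_variation_distance(sample_a, sample_b):
    """Half the sum of absolute differences between two frequency tables.

    The result is 0.0 for identical distributions and 1.0 for
    distributions that share no values at all.
    """
    dist_a = category_distribution(sample_a)
    dist_b = category_distribution(sample_b)
    categories = set(dist_a) | set(dist_b)
    return 0.5 * sum(
        abs(dist_a.get(c, 0.0) - dist_b.get(c, 0.0)) for c in categories
    )


# Hypothetical values of one feature in a training set and a test set
train_colors = ["red", "red", "blue", "green"]
test_colors = ["red", "blue", "blue", "blue"]

distance = total_variation_distance(train_colors, test_colors)
print(distance)
```

The distance depends only on how often each value occurs in each dataset, not on how far apart any two records are, which is why the statement in the question is false.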
- Non-uniformity does not affect an ML model. (True|False)
False. Non-uniformity has profound effects on the outputs of an ML model. However, in some cases, non-uniform datasets reflect the reality of the problem...