Mitigating bias
It is important to know whether your dataset is biased, because a biased dataset may not be representative, and a model trained on it may treat different groups unfairly. For example, suppose your AI model forecasts supermarket sales accurately in cities but underperforms in towns. If you use demand forecasting to plan the supplies to send to each supermarket, this can lead to recurring supply shortages in the supermarkets located in towns. (We'll talk about forecasting in Chapter 4, Forecasting Time-Series Data.)
When we talk about bias, we often mean imbalanced data. To check whether a dataset is imbalanced, we mostly look at histograms, box plots, and the distribution of values. If some groups or value ranges have far fewer records than others, this can be a sign of bias in our data.
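A quick numeric check can complement the plots. The following is a minimal sketch, not a definitive method: it counts how often each category appears and reports each category's share and the ratio between the largest and smallest group. The function names and the city/town sample data are hypothetical, chosen only to mirror the supermarket example.

```python
from collections import Counter


def category_shares(values):
    """Return the fraction of records that fall into each category."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def imbalance_ratio(values):
    """Return the size of the largest category divided by the smallest."""
    counts = Counter(values)
    return max(counts.values()) / min(counts.values())


# Hypothetical sample: 8 supermarkets in cities, 2 in towns.
location_types = ["city"] * 8 + ["town"] * 2

shares = category_shares(location_types)
ratio = imbalance_ratio(location_types)

for category, share in shares.items():
    print(f"{category}: {share:.0%}")
print(f"imbalance ratio: {ratio:.1f}")
```

A ratio close to 1 suggests the categories are roughly balanced. What counts as "too imbalanced" depends on the problem, so treat any threshold as a judgment call rather than a rule.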
Bias is a complex problem and does not have one golden solution. When your data is biased, understanding...