Data anomalies
Bad data comes in many flavors. From misspellings to improper encoding, some data quality issues cannot be avoided. However, denormalized designs make it possible to walk headlong into several well-known and preventable blunders.
To understand how normalization prevents data anomalies, we need to unpack the dual dangers it mitigates, redundancy and dependency:
- Redundancy: Repeated data, whether within one table or across several. When data values are duplicated, keeping every copy in sync through DML operations becomes harder.
- Dependency: When the value of one attribute depends on the value of another. Dependencies can be functional (such as a person’s age attribute depending on their name) or multivalued (such as name, age, and hobby stored in a single table, which would make it impossible to delete a hobby without also deleting all the people who practice it).
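To make the two dangers concrete, here is a minimal sketch using Python's built-in sqlite3 module. The person_hobby table and its rows are hypothetical, invented only to mirror the name, age, and hobby example above; they are not taken from any real schema.

```python
import sqlite3

# Hypothetical denormalized table: one row per (person, hobby) pair,
# so a person's age is repeated for every hobby they practice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person_hobby (name TEXT, age INTEGER, hobby TEXT)")
conn.executemany(
    "INSERT INTO person_hobby VALUES (?, ?, ?)",
    [
        ("Ana", 34, "chess"),
        ("Ana", 34, "cycling"),
        ("Ben", 41, "knitting"),
    ],
)

# Redundancy: Ana's age lives in two rows. Updating only one of them
# leaves the table with two conflicting ages for the same person.
conn.execute(
    "UPDATE person_hobby SET age = 35 WHERE name = 'Ana' AND hobby = 'chess'"
)
ana_ages = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT age FROM person_hobby WHERE name = 'Ana'"
    )
)
print(ana_ages)  # [34, 35]

# Dependency: Ben exists only through his single hobby row, so
# deleting the hobby also deletes every trace of Ben.
conn.execute("DELETE FROM person_hobby WHERE hobby = 'knitting'")
remaining_names = sorted(
    row[0] for row in conn.execute("SELECT DISTINCT name FROM person_hobby")
)
print(remaining_names)  # ['Ana']
```

In a normalized design, people and hobbies would presumably live in separate tables joined by a linking table, so that an age is stored once and removing a hobby leaves the person intact.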
With these dangers in mind, let’s review the kinds of data anomalies that have the...