Modeling relational data
Our application currently stores data in a single CSV file; a file like this is often called a flat file, because the data has been flattened to two dimensions. While this format works acceptably for our application and could be translated directly to an SQL table, a more accurate and useful data model requires more complexity.
Normalization
The process of breaking out a flat data file into multiple tables is called normalization. Normalization is a process involving a series of levels called normal forms which progressively remove duplication and create a more precise model of the data we're storing. Although there are many normal forms, most issues encountered in common business data can be handled by conforming to the first three.
Roughly speaking, that requires the following conditions:
- The first normal form requires that each field contains only one value, and that repeating columns must be eliminated.
- The second normal form additionally requires that every value...