Data modeling
Data modeling is the conceptual and logical representation of the proposed physical database provided in a visual format using entity relationship (ER) diagrams. An ER diagram represents all the database entities in a way that defines their relationships and properties. The goal of the ER diagram is to lay out the structure of the entities such that they are easy to understand and are implemented later in the database system.
To understand data modeling, there are two crucial concepts you need to be aware of. The first is the primary key. Primary keys are used to uniquely identify a specific record or row in your database. For now, you should know that it enforces the table to have no duplicate rows with the same key. The other concept is the foreign key. The foreign key allows you to link tables together with a field or collection of fields that refer to a primary key of another table.
The preceding screenshot shows you parts of the data model for the sakila
database. It shows how different tables are connected and what their relationships are. You can read the relationships through the fields shared between the connected tables. For example, the rental
table and category
table are connected by the last_update
field. The category
table is then connected to the country
table through the same last_update
field. This demonstrates the general structure of the table relationships.
The data model ensures that all the required data objects (including tables, primary keys, foreign keys, and stored procedures) are represented and that the relationships between them are correctly defined. The data model also helps to identify missing or redundant data.
MySQL offers an Enhanced Entity Relationship Diagram for data modeling with which you can interact directly to add, modify, and remove the database objects and set the relationships and indexes. This can be accessed through the Workbench (this is explained in detail in the next chapter). When the model is completed, it can then be used to create the physical database if it does not exist or update an existing physical database.
The following steps describe the process by which a database comes into existence:
- Someone gets an idea for a database and application creation.
- A database analyst or developer is hired to create the database.
- An analysis is performed to determine what data must be stored. This source information could come from another system, documents, or verbal requirements.
- The analyst then normalizes the data to define the tables.
- The database is modeled using the normalized tables.
- The database is created.
- Applications that use the database for reporting, processing, and computation are developed.
- The database goes live.
For example, suppose that you are working on a system that stores videos for users. First, you need to determine how the database will be structured. This includes determining what data needs to be stored, what fields are relevant, what data types the fields should have, and the relationships between the data. For your video database example, you may want to store the video's location on the server, the name of the video, and a description of the video. This might link into a database table that contains ratings and comments for the video. Once this is produced, you can create a database that matches the proposed structure. Finally, you can place the database on a server so that it is live and accessible for users.
In the next section, you will learn about database normalization, which is the act of creating an optimized database schema with as few redundancies as possible with the help of constraints and removing functional dependency by breaking up the database into smaller tables.