Preventing duplicate rows
Preventing duplicate rows is one of the most important aspects of data quality for any database. PostgreSQL offers some useful features in this area, extending beyond most relational databases.
Getting ready
Identify the set of columns that you wish to make unique. Does this apply to all rows, or just a subset of rows?
Let's start with our example table:
postgres=# SELECT * FROM new_cust; customerid ------------ 1 2 3 4 (4 rows)
How to do it…
To prevent duplicate rows, we need to create a unique index that the database server can use to enforce uniqueness of a particular set of columns. We can do this in the following three similar ways for basic data types:
- Create a primary key constraint on the set of columns. We are allowed only one of these per table. The values of the data rows must not be
NULL
, as we force the columns to beNOT NULL
if they aren't already:
ALTER TABLE new_cust ADD PRIMARY KEY(customerid);
- This creates a new...