SELECT is used to restrict the tuples returned from a relation. SELECT always returns a unique set of tuples; it inherits this property from the entity integrity constraint on the input relation. For example, the query "give me the customer information where the customer_id equals 2" is written as follows:
σcustomer_id =2 customer
The selection, as mentioned earlier, is commutative; the query "give me all customers where the customer's email is known and the customer's first name is kim" can be written in three different ways, as follows:
σemail is not null(σfirst_name =kim customer)
σfirst_name =kim(σemail is not null customer)
σfirst_name =kim and email is not null (customer)
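The equivalence of the three expressions above can be checked with a minimal sketch using Python's built-in sqlite3 module. The table schema and the sample rows are assumptions made for illustration; they are not the data set used in the text.

```python
import sqlite3

# Hypothetical schema and rows, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1, "thomas", "sieh", "thomas@example.com"),
        (2, "kim", "wang", "kim@example.com"),
        (3, "kim", "lee", None),  # email unknown
    ],
)

# Selection on first_name first, then on email.
q1 = """SELECT * FROM (SELECT * FROM customer WHERE first_name = 'kim') AS c
        WHERE email IS NOT NULL ORDER BY customer_id"""
# Selection on email first, then on first_name.
q2 = """SELECT * FROM (SELECT * FROM customer WHERE email IS NOT NULL) AS c
        WHERE first_name = 'kim' ORDER BY customer_id"""
# Both predicates combined in a single selection.
q3 = """SELECT * FROM customer
        WHERE first_name = 'kim' AND email IS NOT NULL ORDER BY customer_id"""

results = [conn.execute(q).fetchall() for q in (q1, q2, q3)]
print(results[0] == results[1] == results[2])  # True
```

With these sample rows, all three forms return only the customer with customer_id 2.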
The comparison operators available in a selection predicate depend on the data types of the attributes involved. For numeric data types, the comparison operator might be ≠, =, <, >, ≥, or ≤. The predicate expression can also contain complex expressions and functions. The equivalent SQL statement for the SELECT operator is the SELECT * statement, and the predicate is defined in the WHERE clause.
The * symbol means all the relation attributes; note that in a production environment, it is not recommended to use *. Instead, one should list all the relation attributes explicitly. Using * in production code could easily break the application, since the order and the types of the expected result are given implicitly. This situation can occur when one renames a table attribute or adds a new column.
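The fragility of * can be illustrated with a small sketch, again using sqlite3 and a hypothetical customer table. After a column is added, the row returned by SELECT * changes shape, while the row returned by an explicit attribute list does not.

```python
import sqlite3

# Hypothetical table, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("INSERT INTO customer VALUES (2, 'kim', 'wang')")

explicit_sql = "SELECT customer_id, first_name, last_name FROM customer"
star_before = conn.execute("SELECT * FROM customer").fetchone()
explicit_before = conn.execute(explicit_sql).fetchone()

# A schema change that the application code does not know about.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")

star_after = conn.execute("SELECT * FROM customer").fetchone()
explicit_after = conn.execute(explicit_sql).fetchone()

print(len(star_before), len(star_after))  # the * row grew by one field
print(explicit_before == explicit_after)  # the explicit list is unaffected
```

Any application code that unpacks the * row into a fixed number of variables would fail after the schema change, which is the kind of breakage the paragraph above warns about.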
The following SELECT statement is equivalent to the relational algebra expression σcustomer_id =2 customer:
SELECT * FROM customer WHERE customer_id = 2;
The PROJECT operation can be visualized as a vertical slicing of the table. The query "give me the customer names" is written in relational algebra as follows:
π first_name, last_name customer
The following is the result of the projection expression:
first_name | last_name
thomas | sieh
wang | kim
Duplicate tuples are not allowed in the formal relational model, so the number of tuples returned by the PROJECT operator is always less than or equal to the total number of tuples in the relation. If a PROJECT operator's attribute list contains a primary key, then the resultant relation has the same number of tuples as the original relation.
The projection operator can also be optimized. For example, cascading projections can be collapsed into a single projection, provided that the outer attribute list is a subset of the inner one:
πa(πa,b(R)) = πa(R)
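The cascading rule can be checked with a short sketch in sqlite3. Here a stands for first_name and b for last_name; the table and its rows are hypothetical. SELECT DISTINCT is used so that the SQL results behave like sets, as the relational model requires.

```python
import sqlite3

# Hypothetical table and rows, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "thomas", "sieh"), (2, "kim", "wang"), (3, "kim", "lee")],
)

# Outer projection on first_name applied to an inner projection
# on (first_name, last_name).
cascaded = conn.execute(
    """SELECT DISTINCT first_name
       FROM (SELECT DISTINCT first_name, last_name FROM customer) AS p
       ORDER BY first_name"""
).fetchall()

# Single projection on first_name applied directly to the relation.
direct = conn.execute(
    "SELECT DISTINCT first_name FROM customer ORDER BY first_name"
).fetchall()

print(cascaded == direct)  # True
```

Both queries return the same set of first names, so the inner projection does no useful work and can be dropped.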
The SQL equivalent for the PROJECT operator is SELECT DISTINCT. The DISTINCT keyword is used to eliminate duplicates. To get the result of the preceding projection on first_name and last_name, one could execute the following SQL statement:
SELECT DISTINCT first_name, last_name FROM customer;
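The effect of DISTINCT on the number of returned tuples, and the role of the primary key, can be seen in the following sketch. It uses sqlite3 with hypothetical rows in which two customers share the same name.

```python
import sqlite3

# Hypothetical table and rows; two customers deliberately share a name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "thomas", "sieh"), (2, "kim", "wang"), (3, "kim", "wang")],
)

total = conn.execute("SELECT count(*) FROM customer").fetchone()[0]

# Without DISTINCT, SQL keeps duplicates (bag semantics).
bag = conn.execute("SELECT first_name, last_name FROM customer").fetchall()

# With DISTINCT, duplicates are removed, matching the PROJECT operator.
names = conn.execute(
    "SELECT DISTINCT first_name, last_name FROM customer"
).fetchall()

# Including the primary key keeps every tuple distinct.
with_key = conn.execute(
    "SELECT DISTINCT customer_id, first_name, last_name FROM customer"
).fetchall()

print(total, len(bag), len(names), len(with_key))  # 3 3 2 3
```

The projection on the names returns fewer tuples than the relation holds, while the projection that includes customer_id returns exactly as many.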
The order in which the PROJECT and SELECT operations are executed is interchangeable in some cases, namely when the selection predicate refers only to attributes that the projection keeps. The query "give me the ID and name of the customer with customer_id equal to 2" could be written in either of the following ways:
σcustomer_id =2 (π customer_id, first_name, last_name customer)
π customer_id, first_name, last_name(σcustomer_id =2 customer)
In other cases, the PROJECT and SELECT operators must be applied in a specific order; otherwise, the expression is incorrect. In the query "give me the last name of the customers where the first name is kim", the selection must come first, because the projection discards first_name and the predicate could no longer be evaluated afterward:
π last_name(σfirst_name=kim customer)
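The following sketch shows why the order matters here, using sqlite3 and hypothetical rows. Selecting first and then projecting works; projecting last_name first and then filtering on first_name fails, because the attribute no longer exists in the intermediate result.

```python
import sqlite3

# Hypothetical table and rows, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "thomas", "sieh"), (2, "kim", "wang"), (3, "kim", "lee")],
)

# Selection first, then projection: a valid expression.
valid = conn.execute(
    """SELECT DISTINCT last_name FROM customer
       WHERE first_name = 'kim' ORDER BY last_name"""
).fetchall()

# Projection first, then selection: first_name has been projected away.
try:
    conn.execute(
        """SELECT * FROM (SELECT DISTINCT last_name FROM customer) AS p
           WHERE first_name = 'kim'"""
    )
    reversed_ok = True
except sqlite3.OperationalError as exc:
    reversed_ok = False
    error_message = str(exc)

print(valid)
print(reversed_ok)  # False
```

The database rejects the reversed form outright, which mirrors the relational algebra: a selection predicate can only refer to attributes present in its input relation.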