|
As shown in the previous section, the table expression in the SELECT
command constructs an intermediate virtual table by possibly combining tables and views,
eliminating rows, grouping, etc. This table is finally passed on to processing by the select list, which determines which columns of the intermediate table are actually output.
The simplest kind of select list is *, which emits all columns
that the table expression produces. Otherwise, a select list is a comma-separated list of
value expressions (as defined in Section 1.2). For
instance, it could be a list of column names:
SELECT a, b, c FROM ...
The column names a, b, and c are either the actual names of the columns of tables referenced in
the FROM clause, or the aliases given to them as explained in Section
4.2.1.2. The name space available in the select list is the same as in the WHERE clause, unless grouping is used, in which case it is the same as
in the HAVING clause.
If more than one table has a column of the same name, the table name must also be given,
as in
SELECT tbl1.a, tbl2.b, tbl1.c FROM ...
(See also Section
4.2.2.)
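The need for table-qualified names can be tried out with a small script. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in database; the tables tbl1 and tbl2 and their contents are hypothetical, and the exact error text of other database systems may differ.

```python
import sqlite3

# Hypothetical tables for illustration; SQLite is only a stand-in here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (a INTEGER, c TEXT);
    CREATE TABLE tbl2 (a INTEGER, b TEXT);
    INSERT INTO tbl1 VALUES (1, 'c1');
    INSERT INTO tbl2 VALUES (1, 'b1');
""")

# Column "a" exists in both tables, so an unqualified reference is rejected.
try:
    conn.execute("SELECT a FROM tbl1, tbl2")
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Qualifying each column with its table name resolves the ambiguity.
qualified_rows = conn.execute(
    "SELECT tbl1.a, tbl2.b, tbl1.c FROM tbl1, tbl2 WHERE tbl1.a = tbl2.a"
).fetchall()

print(ambiguous_error)
print(qualified_rows)
```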
If an arbitrary value expression is used in the select list, it conceptually adds a new
virtual column to the returned table. The value expression is evaluated once for each
retrieved row, with the row's values substituted for any column references. But the
expressions in the select list do not have to reference any columns in the table expression
of the FROM clause; they could be constant arithmetic expressions
as well, for instance.
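As an illustration of per-row evaluation, the sketch below runs one select list that references columns and one that is a constant arithmetic expression. It assumes Python's built-in sqlite3 module as a stand-in database and a hypothetical table t with columns a, b, and c.

```python
import sqlite3

# Hypothetical table for illustration; SQLite is only a stand-in here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER);
    INSERT INTO t VALUES (1, 10, 100);
    INSERT INTO t VALUES (2, 20, 200);
""")

# b + c is evaluated once per retrieved row and appears as a new virtual column.
computed_rows = conn.execute("SELECT a, b + c FROM t ORDER BY a").fetchall()

# A constant expression references no column, yet is still emitted once per row.
constant_rows = conn.execute("SELECT 3 * 4 FROM t").fetchall()

print(computed_rows)
print(constant_rows)
```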
The entries in the select list can be assigned names for further processing. Here, "further processing" means an optional sort
specification and the client application (e.g., column headers for display). For example:
SELECT a AS value, b + c AS sum FROM ...
If no output column name is specified via AS, the system assigns a default name. For
simple column references, this is the name of the referenced column. For function calls,
this is the name of the function. For complex expressions, the system will generate a
generic name.
Note: The naming of output columns here is different from that done in the FROM clause (see Section
4.2.1.2). This pipeline will in fact allow you to rename the same column twice, but
the name chosen in the select list is the one that will be passed on.
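The effect of output column names can be observed from a client. The sketch below assumes Python's built-in sqlite3 module as a stand-in database and a hypothetical table t; the alias total is used instead of sum purely for illustration. Only explicitly assigned names are inspected, because the default names a system generates vary between database systems.

```python
import sqlite3

# Hypothetical table for illustration; SQLite is only a stand-in here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER);
    INSERT INTO t VALUES (1, 10, 100);
    INSERT INTO t VALUES (2, 20, 200);
""")

# The client sees the names assigned with AS (e.g., for column headers).
cur = conn.execute("SELECT a AS value, b + c AS total FROM t")
column_names = [d[0] for d in cur.description]

# The same output name can be used in a sort specification.
sorted_rows = conn.execute(
    "SELECT a AS value, b + c AS total FROM t ORDER BY total DESC"
).fetchall()

# A column renamed once in a subselect and again in the outer select list:
# the name from the outermost select list is the one passed on.
cur = conn.execute("SELECT first AS value FROM (SELECT a AS first FROM t)")
renamed_twice = [d[0] for d in cur.description]

print(column_names, sorted_rows, renamed_twice)
```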
After the select list has been processed, the result table may optionally be subject to
the elimination of duplicates. The DISTINCT key word is written
directly after the SELECT to enable this:
SELECT DISTINCT select_list ...
(Instead of DISTINCT the word ALL can
be used to select the default behavior of retaining all rows.)
Two rows are considered distinct if they differ in at least one column value.
Null values are considered equal in this comparison.
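The following sketch shows duplicate elimination, including the treatment of null values as equal. It assumes Python's built-in sqlite3 module as a stand-in database (which also treats nulls as equal for DISTINCT) and a hypothetical table t.

```python
import sqlite3

# Hypothetical table for illustration; SQLite is only a stand-in here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b TEXT);
    INSERT INTO t VALUES (1, NULL);
    INSERT INTO t VALUES (1, NULL);
    INSERT INTO t VALUES (1, 'x');
    INSERT INTO t VALUES (2, NULL);
""")

# ALL (the default) retains every row, duplicates included.
all_rows = conn.execute("SELECT ALL a, b FROM t").fetchall()

# DISTINCT collapses the two (1, NULL) rows into one, since the nulls
# compare as equal for this purpose.
distinct_rows = conn.execute("SELECT DISTINCT a, b FROM t").fetchall()

print(len(all_rows), sorted(distinct_rows, key=repr))
```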
Alternatively, an arbitrary expression can determine what rows are to be considered
distinct:
SELECT DISTINCT ON (expression [, expression ...]) select_list ...
Here expression is an arbitrary value expression that
is evaluated for all rows. Rows for which all the expressions are equal are
considered duplicates, and only the first row of each such set is kept in the output. Note that
the "first row" of a set is unpredictable unless the
query is sorted on enough columns to guarantee a unique ordering of the rows arriving at the
DISTINCT filter. (DISTINCT ON processing
occurs after ORDER BY sorting.)
The DISTINCT ON clause is not part of the SQL standard and is
sometimes considered bad style because of the potentially indeterminate nature of its
results. With judicious use of GROUP BY and subselects in FROM, the construct can be avoided, but it is often the most convenient
alternative.
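One way to avoid DISTINCT ON is sketched below: a subselect in FROM uses GROUP BY to pick, for each group, the key of the row to keep, and the outer query joins back to fetch that row. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in database (SQLite does not support DISTINCT ON), a hypothetical table weather_reports, and it assumes that (location, reported_at) is unique so the result is deterministic.

```python
import sqlite3

# Hypothetical table for illustration; SQLite is only a stand-in here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE weather_reports (location TEXT, reported_at INTEGER, report TEXT);
    INSERT INTO weather_reports VALUES ('Oslo', 1, 'rain');
    INSERT INTO weather_reports VALUES ('Oslo', 2, 'snow');
    INSERT INTO weather_reports VALUES ('Lima', 1, 'sun');
""")

# Roughly corresponds to:
#   SELECT DISTINCT ON (location) location, reported_at, report
#   FROM weather_reports ORDER BY location, reported_at DESC
# The subselect finds the latest timestamp per location; the join then
# retrieves the full row carrying that timestamp.
latest_rows = conn.execute("""
    SELECT w.location, w.reported_at, w.report
    FROM weather_reports AS w
    JOIN (SELECT location, max(reported_at) AS latest
          FROM weather_reports
          GROUP BY location) AS m
      ON w.location = m.location AND w.reported_at = m.latest
    ORDER BY w.location
""").fetchall()

print(latest_rows)
```

Unlike DISTINCT ON, this form returns more than one row per location if several rows share the latest timestamp, which is why the uniqueness assumption matters.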
|