aboutsummaryrefslogtreecommitdiff
path: root/doc/src
diff options
context:
space:
mode:
authorTom Lane <tgl@sss.pgh.pa.us>2010-10-15 15:48:45 -0400
committerTom Lane <tgl@sss.pgh.pa.us>2010-10-15 15:48:45 -0400
commit71d24466fb84bf51c479dcc85b52a0c2b71b9c50 (patch)
tree951b251cc99fd6c709d5329edfb56c759f183a9c /doc/src
parent0c9b166db5329119b6553e0f38fe486521f1352f (diff)
downloadpostgresql-71d24466fb84bf51c479dcc85b52a0c2b71b9c50.tar.gz
postgresql-71d24466fb84bf51c479dcc85b52a0c2b71b9c50.zip
Document the DISTINCT noise word in the UNION/INTERSECT/EXCEPT constructs.
I also rearranged the order of the sections to match the logical order of processing steps: the distinct-elimination implied by SELECT DISTINCT happens before, not after, any UNION/INTERSECT/EXCEPT combination. Per a suggestion from Hitoshi Harada.
Diffstat (limited to 'doc/src')
-rw-r--r--doc/src/sgml/ref/select.sgml126
-rw-r--r--doc/src/sgml/ref/select_into.sgml2
2 files changed, 69 insertions, 59 deletions
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index e840070873f..24f82497139 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -40,7 +40,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
[ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
[ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ]
[ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
- [ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ]
+ [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] <replaceable class="parameter">select</replaceable> ]
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ]
[ OFFSET <replaceable class="parameter">start</replaceable> [ ROW | ROWS ] ]
@@ -106,7 +106,7 @@ TABLE { [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] |
<listitem>
<para>
If the <literal>GROUP BY</literal> clause is specified, the
- output is divided into groups of rows that match on one or more
+ output is combined into groups of rows that match on one or more
values. If the <literal>HAVING</literal> clause is present, it
eliminates groups that do not satisfy the given condition. (See
<xref linkend="sql-groupby" endterm="sql-groupby-title"> and
@@ -118,7 +118,7 @@ TABLE { [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] |
<para>
The actual output rows are computed using the
<command>SELECT</command> output expressions for each selected
- row. (See
+ row or row group. (See
<xref linkend="sql-select-list" endterm="sql-select-list-title">
below.)
</para>
@@ -126,6 +126,17 @@ TABLE { [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] |
<listitem>
<para>
+ <literal>SELECT DISTINCT</literal> eliminates duplicate rows from the
+ result. <literal>SELECT DISTINCT ON</literal> eliminates rows that
+ match on all the specified expressions. <literal>SELECT ALL</literal>
+ (the default) will return all candidate rows, including
+ duplicates. (See <xref linkend="sql-distinct"
+ endterm="sql-distinct-title"> below.)
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
Using the operators <literal>UNION</literal>,
<literal>INTERSECT</literal>, and <literal>EXCEPT</literal>, the
output of more than one <command>SELECT</command> statement can
@@ -136,7 +147,11 @@ TABLE { [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] |
strictly in both result sets. The <literal>EXCEPT</literal>
operator returns the rows that are in the first result set but
not in the second. In all three cases, duplicate rows are
- eliminated unless <literal>ALL</literal> is specified. (See
+ eliminated unless <literal>ALL</literal> is specified. The noise
+ word <literal>DISTINCT</> can be added to explicitly specify
+ eliminating duplicate rows. Notice that <literal>DISTINCT</> is
+ the default behavior here, even though <literal>ALL</literal> is
+ the default for <command>SELECT</> itself. (See
<xref linkend="sql-union" endterm="sql-union-title">, <xref
linkend="sql-intersect" endterm="sql-intersect-title">, and
<xref linkend="sql-except" endterm="sql-except-title"> below.)
@@ -155,17 +170,6 @@ TABLE { [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] |
<listitem>
<para>
- <literal>DISTINCT</literal> eliminates duplicate rows from the
- result. <literal>DISTINCT ON</literal> eliminates rows that
- match on all the specified expressions. <literal>ALL</literal>
- (the default) will return all candidate rows, including
- duplicates. (See <xref linkend="sql-distinct"
- endterm="sql-distinct-title"> below.)
- </para>
- </listitem>
-
- <listitem>
- <para>
If the <literal>LIMIT</literal> (or <literal>FETCH FIRST</literal>) or <literal>OFFSET</literal>
clause is specified, the <command>SELECT</command> statement
only returns a subset of the result rows. (See <xref
@@ -219,7 +223,7 @@ TABLE { [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] |
subquery to reference itself by name. Such a subquery must have
the form
<synopsis>
-<replaceable class="parameter">non_recursive_term</replaceable> UNION [ ALL ] <replaceable class="parameter">recursive_term</replaceable>
+<replaceable class="parameter">non_recursive_term</replaceable> UNION [ ALL | DISTINCT ] <replaceable class="parameter">recursive_term</replaceable>
</synopsis>
where the recursive self-reference must appear on the right-hand
side of the <literal>UNION</>. Only one recursive self-reference
@@ -755,13 +759,52 @@ UNBOUNDED FOLLOWING
</para>
</refsect2>
+ <refsect2 id="sql-distinct">
+ <title id="sql-distinct-title"><literal>DISTINCT</literal> Clause</title>
+
+ <para>
+ If <literal>SELECT DISTINCT</> is specified, all duplicate rows are
+ removed from the result set (one row is kept from each group of
+ duplicates). <literal>SELECT ALL</> specifies the opposite: all rows are
+ kept; that is the default.
+ </para>
+
+ <para>
+ <literal>SELECT DISTINCT ON ( <replaceable
+ class="parameter">expression</replaceable> [, ...] )</literal>
+ keeps only the first row of each set of rows where the given
+ expressions evaluate to equal. The <literal>DISTINCT ON</literal>
+ expressions are interpreted using the same rules as for
+ <literal>ORDER BY</> (see above). Note that the <quote>first
+ row</quote> of each set is unpredictable unless <literal>ORDER
+ BY</> is used to ensure that the desired row appears first. For
+ example:
+<programlisting>
+SELECT DISTINCT ON (location) location, time, report
+ FROM weather_reports
+ ORDER BY location, time DESC;
+</programlisting>
+ retrieves the most recent weather report for each location. But
+ if we had not used <literal>ORDER BY</> to force descending order
+ of time values for each location, we'd have gotten a report from
+ an unpredictable time for each location.
+ </para>
+
+ <para>
+ The <literal>DISTINCT ON</> expression(s) must match the leftmost
+ <literal>ORDER BY</> expression(s). The <literal>ORDER BY</> clause
+ will normally contain additional expression(s) that determine the
+ desired precedence of rows within each <literal>DISTINCT ON</> group.
+ </para>
+ </refsect2>
+
<refsect2 id="SQL-UNION">
<title id="sql-union-title"><literal>UNION</literal> Clause</title>
<para>
The <literal>UNION</literal> clause has this general form:
<synopsis>
-<replaceable class="parameter">select_statement</replaceable> UNION [ ALL ] <replaceable class="parameter">select_statement</replaceable>
+<replaceable class="parameter">select_statement</replaceable> UNION [ ALL | DISTINCT ] <replaceable class="parameter">select_statement</replaceable>
</synopsis>
<replaceable class="parameter">select_statement</replaceable> is
any <command>SELECT</command> statement without an <literal>ORDER
@@ -791,6 +834,8 @@ UNBOUNDED FOLLOWING
<literal>ALL</> prevents elimination of duplicates. (Therefore,
<literal>UNION ALL</> is usually significantly quicker than
<literal>UNION</>; use <literal>ALL</> when you can.)
+ <literal>DISTINCT</> can be written to explicitly specify the
+ default behavior of eliminating duplicate rows.
</para>
<para>
@@ -812,7 +857,7 @@ UNBOUNDED FOLLOWING
<para>
The <literal>INTERSECT</literal> clause has this general form:
<synopsis>
-<replaceable class="parameter">select_statement</replaceable> INTERSECT [ ALL ] <replaceable class="parameter">select_statement</replaceable>
+<replaceable class="parameter">select_statement</replaceable> INTERSECT [ ALL | DISTINCT ] <replaceable class="parameter">select_statement</replaceable>
</synopsis>
<replaceable class="parameter">select_statement</replaceable> is
any <command>SELECT</command> statement without an <literal>ORDER
@@ -833,6 +878,8 @@ UNBOUNDED FOLLOWING
With <literal>ALL</>, a row that has <replaceable>m</> duplicates in the
left table and <replaceable>n</> duplicates in the right table will appear
min(<replaceable>m</>,<replaceable>n</>) times in the result set.
+ <literal>DISTINCT</> can be written to explicitly specify the
+ default behavior of eliminating duplicate rows.
</para>
<para>
@@ -858,7 +905,7 @@ UNBOUNDED FOLLOWING
<para>
The <literal>EXCEPT</literal> clause has this general form:
<synopsis>
-<replaceable class="parameter">select_statement</replaceable> EXCEPT [ ALL ] <replaceable class="parameter">select_statement</replaceable>
+<replaceable class="parameter">select_statement</replaceable> EXCEPT [ ALL | DISTINCT ] <replaceable class="parameter">select_statement</replaceable>
</synopsis>
<replaceable class="parameter">select_statement</replaceable> is
any <command>SELECT</command> statement without an <literal>ORDER
@@ -878,6 +925,8 @@ UNBOUNDED FOLLOWING
With <literal>ALL</>, a row that has <replaceable>m</> duplicates in the
left table and <replaceable>n</> duplicates in the right table will appear
max(<replaceable>m</>-<replaceable>n</>,0) times in the result set.
+ <literal>DISTINCT</> can be written to explicitly specify the
+ default behavior of eliminating duplicate rows.
</para>
<para>
@@ -987,45 +1036,6 @@ SELECT name FROM distributors ORDER BY code;
</para>
</refsect2>
- <refsect2 id="sql-distinct">
- <title id="sql-distinct-title"><literal>DISTINCT</literal> Clause</title>
-
- <para>
- If <literal>DISTINCT</> is specified, all duplicate rows are
- removed from the result set (one row is kept from each group of
- duplicates). <literal>ALL</> specifies the opposite: all rows are
- kept; that is the default.
- </para>
-
- <para>
- <literal>DISTINCT ON ( <replaceable
- class="parameter">expression</replaceable> [, ...] )</literal>
- keeps only the first row of each set of rows where the given
- expressions evaluate to equal. The <literal>DISTINCT ON</literal>
- expressions are interpreted using the same rules as for
- <literal>ORDER BY</> (see above). Note that the <quote>first
- row</quote> of each set is unpredictable unless <literal>ORDER
- BY</> is used to ensure that the desired row appears first. For
- example:
-<programlisting>
-SELECT DISTINCT ON (location) location, time, report
- FROM weather_reports
- ORDER BY location, time DESC;
-</programlisting>
- retrieves the most recent weather report for each location. But
- if we had not used <literal>ORDER BY</> to force descending order
- of time values for each location, we'd have gotten a report from
- an unpredictable time for each location.
- </para>
-
- <para>
- The <literal>DISTINCT ON</> expression(s) must match the leftmost
- <literal>ORDER BY</> expression(s). The <literal>ORDER BY</> clause
- will normally contain additional expression(s) that determine the
- desired precedence of rows within each <literal>DISTINCT ON</> group.
- </para>
- </refsect2>
-
<refsect2 id="SQL-LIMIT">
<title id="sql-limit-title"><literal>LIMIT</literal> Clause</title>
diff --git a/doc/src/sgml/ref/select_into.sgml b/doc/src/sgml/ref/select_into.sgml
index 715d64097e4..787c106a49d 100644
--- a/doc/src/sgml/ref/select_into.sgml
+++ b/doc/src/sgml/ref/select_into.sgml
@@ -30,7 +30,7 @@ SELECT [ ALL | DISTINCT [ ON ( <replaceable class="parameter">expression</replac
[ GROUP BY <replaceable class="parameter">expression</replaceable> [, ...] ]
[ HAVING <replaceable class="parameter">condition</replaceable> [, ...] ]
[ WINDOW <replaceable class="parameter">window_name</replaceable> AS ( <replaceable class="parameter">window_definition</replaceable> ) [, ...] ]
- [ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="parameter">select</replaceable> ]
+ [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] <replaceable class="parameter">select</replaceable> ]
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
[ LIMIT { <replaceable class="parameter">count</replaceable> | ALL } ]
[ OFFSET <replaceable class="parameter">start</replaceable> [ ROW | ROWS ] ]