Redesign DISTINCT ON as discussed in pgsql-sql 1/25/00: syntax is now

SELECT DISTINCT ON (expr [, expr ...]) targetlist ... and there is a check to make sure that the user didn't specify an ORDER BY that's incompatible with the DISTINCT operation. Reimplement nodeUnique and nodeGroup to use the proper datatype-specific equality function for each column being compared --- they used to do bitwise comparisons or convert the data to text strings and strcmp(). (To add insult to injury, they'd look up the conversion functions once for each tuple...) Parse/plan representation of DISTINCT is now a list of SortClause nodes. initdb forced by querytree change...
author: Tom Lane <tgl@sss.pgh.pa.us> 2000-01-27 18:11:50 +0000
committer: Tom Lane <tgl@sss.pgh.pa.us> 2000-01-27 18:11:50 +0000
commit: dd979f66be20fc54aad06da743f788fbc505bbe1 (patch)
tree: 4d6b578e0afc7dc0198296940cd77c79c5eff66c /doc/src
parent: 3f0074e403ac3a145c5e43db3348f45fe084703d (diff)
download: postgresql-dd979f66be20fc54aad06da743f788fbc505bbe1.tar.gz
postgresql-dd979f66be20fc54aad06da743f788fbc505bbe1.zip
1 files changed, 21 insertions, 8 deletions
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 070f8b43d0f..ed180ac91c1 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/ref/select.sgml,v 1.23 1999/12/13 17:39:38 tgl Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/ref/select.sgml,v 1.24 2000/01/27 18:11:25 tgl Exp $
 Postgres documentation
 -->
 
@@ -22,7 +22,7 @@ Postgres documentation
    <date>1999-07-20</date>
   </refsynopsisdivinfo>
   <synopsis>
-SELECT [ ALL | DISTINCT [ ON <replaceable class="PARAMETER">column</replaceable> ] ]
+SELECT [ ALL | DISTINCT [ ON ( <replaceable class="PARAMETER">expression</replaceable> [, ...] ) ] ]
     <replaceable class="PARAMETER">expression</replaceable> [ AS <replaceable class="PARAMETER">name</replaceable> ] [, ...]
     [ INTO [ TEMPORARY | TEMP ] [ TABLE ] <replaceable class="PARAMETER">new_table</replaceable> ]
     [ FROM <replaceable class="PARAMETER">table</replaceable> [ <replaceable class="PARAMETER">alias</replaceable> ] [, ...] ]
@@ -201,17 +201,30 @@ SELECT [ ALL | DISTINCT [ ON <replaceable class="PARAMETER">column</replaceable>
   </para>
 
   <para>
-   <command>DISTINCT</command> will eliminate all duplicate rows from the
+   <command>DISTINCT</command> will eliminate duplicate rows from the
    result.
-   <command>DISTINCT ON <replaceable class="PARAMETER">column</replaceable></command>
-   will eliminate all duplicates in the specified column; this is
-   similar to using
-   <command>GROUP BY <replaceable class="PARAMETER">column</replaceable></command>.
-   <command>ALL</command> will return all candidate rows,
+   <command>ALL</command> (the default) will return all candidate rows,
    including duplicates.
   </para>
 
   <para>
+   <command>DISTINCT ON</command> eliminates rows that match on all the
+   specified expressions, keeping only the first row of each set of
+   duplicates.  Note that "the first row" of each set is unpredictable
+   unless <command>ORDER BY</command> is used to ensure that the desired
+   row appears first.  For example,
+   <programlisting>
+        SELECT DISTINCT ON (location) location, time, report
+        FROM weatherReports
+        ORDER BY location, time DESC;
+   </programlisting>
+   retrieves the most recent weather report for each location.  But if
+   we had not used ORDER BY to force descending order of time values
+   for each location, we'd have gotten a report of unpredictable age
+   for each location.
+  </para>
+
+  <para>
    The GROUP BY clause allows a user to divide a table
    conceptually into groups.
    (See <xref linkend="sql-groupby" endterm="sql-groupby-title">.)
author	Tom Lane <tgl@sss.pgh.pa.us>	2000-01-27 18:11:50 +0000
committer	Tom Lane <tgl@sss.pgh.pa.us>	2000-01-27 18:11:50 +0000
commit	dd979f66be20fc54aad06da743f788fbc505bbe1 (patch)
tree	4d6b578e0afc7dc0198296940cd77c79c5eff66c /doc/src
parent	3f0074e403ac3a145c5e43db3348f45fe084703d (diff)
download	postgresql-dd979f66be20fc54aad06da743f788fbc505bbe1.tar.gz postgresql-dd979f66be20fc54aad06da743f788fbc505bbe1.zip