author: Jeff Davis <jdavis@postgresql.org>, 2024-03-02 13:37:43 -0800
committer: Jeff Davis <jdavis@postgresql.org>, 2024-03-02 13:37:43 -0800
commit: 875e46a0a246e416b12a9debe084ede9d02f1b5d
tree: ec4b814224b9906bc730849650257159cce25831 (/doc/src)
parent: 1e013746544bd1f9df70f5547894fd72719c4b85
Documentation update for Standard Collations.

Correct out-of-date text that said the "default" collation is always
based on LC_COLLATE and LC_CTYPE. Also reformat into a list to make it
easier to understand and compare the available collations, and briefly
document the stability characteristics of each one.

Discussion: https://postgr.es/m/4a69d067374d2f6bfb66f5bfb2ab9a020493d49f.camel@j-davis.com
Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/charset.sgml | 72
1 files changed, 45 insertions, 27 deletions
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 74783d148fe..4fc143025ef 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -788,37 +788,19 @@ SELECT * FROM test1 ORDER BY a || b COLLATE "fr_FR";
<title>Standard Collations</title>
<para>
- On all platforms, the collations named <literal>default</literal>,
- <literal>C</literal>, and <literal>POSIX</literal> are available. Additional
- collations may be available depending on operating system support.
- The <literal>default</literal> collation selects the <symbol>LC_COLLATE</symbol>
- and <symbol>LC_CTYPE</symbol> values specified at database creation time.
- The <literal>C</literal> and <literal>POSIX</literal> collations both specify
- <quote>traditional C</quote> behavior, in which only the ASCII letters
- <quote><literal>A</literal></quote> through <quote><literal>Z</literal></quote>
- are treated as letters, and sorting is done strictly by character
- code byte values.
- </para>
-
- <note>
- <para>
- The <literal>C</literal> and <literal>POSIX</literal> locales may behave
- differently depending on the database encoding.
- </para>
- </note>
-
- <para>
- Additionally, two SQL standard collation names are available:
+ On all platforms, the following collations are supported:
<variablelist>
<varlistentry>
<term><literal>unicode</literal></term>
<listitem>
<para>
- This collation sorts using the Unicode Collation Algorithm with the
- Default Unicode Collation Element Table. It is available in all
- encodings. ICU support is required to use this collation. (This
- collation has the same behavior as the ICU root locale; see <xref
+ This SQL standard collation sorts using the Unicode Collation
+ Algorithm with the Default Unicode Collation Element Table. It is
+ available in all encodings. ICU support is required to use this
+ collation, and behavior may change if Postgres is built with a
+ different version of ICU. (This collation has the same behavior as
+ the ICU root locale; see <xref
linkend="collation-managing-predefined-icu-und-x-icu"/>.)
</para>
</listitem>
@@ -828,15 +810,51 @@ SELECT * FROM test1 ORDER BY a || b COLLATE "fr_FR";
<term><literal>ucs_basic</literal></term>
<listitem>
<para>
- This collation sorts by Unicode code point. It is only available for
- encoding <literal>UTF8</literal>. (This collation has the same
+ This SQL standard collation sorts using the Unicode code point values
+ rather than natural language order, and only the ASCII letters
+ <quote><literal>A</literal></quote> through
+ <quote><literal>Z</literal></quote> are treated as letters. The
+ behavior is efficient and stable across all versions. Only available
+ for encoding <literal>UTF8</literal>. (This collation has the same
behavior as the libc locale specification <literal>C</literal> in
<literal>UTF8</literal> encoding.)
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>C</literal> (equivalent to <literal>POSIX</literal>)</term>
+ <listitem>
+ <para>
+ The <literal>C</literal> and <literal>POSIX</literal> collations are
+ based on <quote>traditional C</quote> behavior. They sort by byte
+ values rather than natural language order, and only the ASCII letters
+ <quote><literal>A</literal></quote> through
+ <quote><literal>Z</literal></quote> are treated as letters. The
+ behavior is efficient and stable across all versions for a given
+ database encoding, but behavior may vary between different database
+ encodings.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>default</literal></term>
+ <listitem>
+ <para>
+ The <literal>default</literal> collation selects the locale specified
+ at database creation time.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
+
+ <para>
+ Additional collations may be available depending on operating system
+ support. The efficiency and stability of these additional collations
+ depend on the collation provider, the provider version, and the locale.
+ </para>
</sect3>
<sect3 id="collation-managing-predefined">
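
Note on the new text: the patch says that ucs_basic sorts by Unicode code point, that C/POSIX sort by byte values, and that C/POSIX behavior "may vary between different database encodings". The following is a minimal Python sketch of why both statements can hold. It is not PostgreSQL code; it only simulates byte-wise comparison by sorting on encoded bytes. The helper names (byte_order, code_point_order) and the sample strings are hypothetical, and ISO-8859-15 is used here as a stand-in for a LATIN9 database encoding.

```python
def byte_order(strings, encoding):
    """Sort strings by their encoded bytes (a simulated byte-wise comparison)."""
    return sorted(strings, key=lambda s: s.encode(encoding))


def code_point_order(strings):
    """Sort strings by their sequence of Unicode code points."""
    return sorted(strings, key=lambda s: [ord(ch) for ch in s])


# Hypothetical sample data: ASCII words plus two non-ASCII words.
sample = ["zebra", "Apple", "apple", "éclair", "€uro", "Zürich"]

# Byte order under UTF-8 coincides with code point order.
utf8_sorted = byte_order(sample, "utf-8")
cp_sorted = code_point_order(sample)

# Byte order under ISO-8859-15 differs: the euro sign is byte 0xA4 there,
# which sorts before 0xE9 (e with acute), although U+20AC > U+00E9.
latin9_sorted = byte_order(sample, "iso8859_15")

print("code point:", cp_sorted)
print("UTF-8 bytes:", utf8_sorted)
print("ISO-8859-15 bytes:", latin9_sorted)
```

In this sketch the UTF-8 byte order matches the code point order, which is consistent with the statement that ucs_basic behaves like the libc C locale in UTF8 encoding, while the single-byte encoding places "€uro" before "éclair". Uppercase ASCII also sorts before lowercase in every case, since no natural language rules are applied.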