aboutsummaryrefslogtreecommitdiff
path: root/doc/src
diff options
context:
space:
mode:
authorJeff Davis <jdavis@postgresql.org>2024-03-13 23:33:44 -0700
committerJeff Davis <jdavis@postgresql.org>2024-03-13 23:33:44 -0700
commit2d819a08a1cbc11364e36f816b02e33e8dcc030b (patch)
tree1a8d3b459866d7df936faffa0e64f5e339e6a6c2 /doc/src
parent6ab2e8385d55e0b73bb8bbc41d9c286f5f7f357f (diff)
downloadpostgresql-2d819a08a1cbc11364e36f816b02e33e8dcc030b.tar.gz
postgresql-2d819a08a1cbc11364e36f816b02e33e8dcc030b.zip
Introduce "builtin" collation provider.
New provider for collations, like "libc" or "icu", but without any external dependency. Initially, the only locale supported by the builtin provider is "C", which is identical to the libc provider's "C" locale. The libc provider's "C" locale has always been treated as a special case that uses an internal implementation, without using libc at all -- so the new builtin provider uses the same implementation. The builtin provider's locale is independent of the server environment variables LC_COLLATE and LC_CTYPE. Using the builtin provider, the database collation locale can be "C" while LC_COLLATE and LC_CTYPE are set to "en_US", which is impossible with the libc provider. By offering a new builtin provider, it clarifies that the semantics of a collation using this provider will never depend on libc, and makes it easier to document the behavior. Discussion: https://postgr.es/m/ab925f69-5f9d-f85e-b87c-bd2a44798659@joeconway.com Discussion: https://postgr.es/m/dd9261f4-7a98-4565-93ec-336c1c110d90@manitou-mail.org Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Daniel Vérité, Peter Eisentraut, Jeremy Schneider
Diffstat (limited to 'doc/src')
-rw-r--r--doc/src/sgml/charset.sgml90
-rw-r--r--doc/src/sgml/ref/create_collation.sgml11
-rw-r--r--doc/src/sgml/ref/create_database.sgml7
-rw-r--r--doc/src/sgml/ref/createdb.sgml2
-rw-r--r--doc/src/sgml/ref/initdb.sgml17
5 files changed, 104 insertions, 23 deletions
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 4fc143025ef..7114eb7b522 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -342,22 +342,14 @@ initdb --locale=sv_SE
<title>Locale Providers</title>
<para>
- <productname>PostgreSQL</productname> supports multiple <firstterm>locale
- providers</firstterm>. This specifies which library supplies the locale
- data. One standard provider name is <literal>libc</literal>, which uses
- the locales provided by the operating system C library. These are the
- locales used by most tools provided by the operating system. Another
- provider is <literal>icu</literal>, which uses the external
- ICU<indexterm><primary>ICU</primary></indexterm> library. ICU locales can
- only be used if support for ICU was configured when PostgreSQL was built.
+ A locale provider specifies which library defines the locale behavior for
+ collations and character classifications.
</para>
<para>
The commands and tools that select the locale settings, as described
- above, each have an option to select the locale provider. The examples
- shown earlier all use the <literal>libc</literal> provider, which is the
- default. Here is an example to initialize a database cluster using the
- ICU provider:
+ above, each have an option to select the locale provider. Here is an
+ example to initialize a database cluster using the ICU provider:
<programlisting>
initdb --locale-provider=icu --icu-locale=en
</programlisting>
@@ -370,12 +362,76 @@ initdb --locale-provider=icu --icu-locale=en
</para>
<para>
- Which locale provider to use depends on individual requirements. For most
- basic uses, either provider will give adequate results. For the libc
- provider, it depends on what the operating system offers; some operating
- systems are better than others. For advanced uses, ICU offers more locale
- variants and customization options.
+ Regardless of the locale provider, the operating system is still used to
+ provide some locale-aware behavior, such as messages (see <xref
+ linkend="guc-lc-messages"/>).
</para>
+
+ <para>
+ The available locale providers are listed below:
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>builtin</literal></term>
+ <listitem>
+ <para>
+ The <literal>builtin</literal> provider uses built-in operations. Only
+ the <literal>C</literal> locale is supported for this provider.
+ </para>
+ <para>
+ The <literal>C</literal> locale behavior is identical to the
+ <literal>C</literal> locale in the libc provider. When using this
+ locale, the behavior may depend on the database encoding.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>icu</literal></term>
+ <listitem>
+ <para>
+ The <literal>icu</literal> provider uses the external
+ ICU<indexterm><primary>ICU</primary></indexterm>
+ library. <productname>PostgreSQL</productname> must have been
+ configured with support.
+ </para>
+ <para>
+ ICU provides collation and character classification behavior that is
+ independent of the operating system and database encoding, which is
+ preferable if you expect to transition to other platforms without any
+ change in results. <literal>LC_COLLATE</literal> and
+ <literal>LC_CTYPE</literal> can be set independently of the ICU
+ locale.
+ </para>
+ <note>
+ <para>
+ For the ICU provider, results may depend on the version of the ICU
+ library used, as it is updated to reflect changes in natural language
+ over time.
+ </para>
+ </note>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>libc</literal></term>
+ <listitem>
+ <para>
+ The <literal>libc</literal> provider uses the operating system's C
+ library. The collation and character classification behavior is
+ controlled by the settings <literal>LC_COLLATE</literal> and
+ <literal>LC_CTYPE</literal>, so they cannot be set independently.
+ </para>
+ <note>
+ <para>
+ The same locale name may have different behavior on different
+ platforms when using the libc provider.
+ </para>
+ </note>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</sect2>
<sect2 id="icu-locales">
diff --git a/doc/src/sgml/ref/create_collation.sgml b/doc/src/sgml/ref/create_collation.sgml
index 5cf9777764b..98cd7d56be9 100644
--- a/doc/src/sgml/ref/create_collation.sgml
+++ b/doc/src/sgml/ref/create_collation.sgml
@@ -96,6 +96,11 @@ CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replace
<replaceable>locale</replaceable>, you cannot specify either of those
parameters.
</para>
+ <para>
+ If <replaceable>provider</replaceable> is <literal>builtin</literal>,
+ then <replaceable>locale</replaceable> must be specified and set to
+ <literal>C</literal>.
+ </para>
</listitem>
</varlistentry>
@@ -129,9 +134,9 @@ CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replace
<listitem>
<para>
Specifies the provider to use for locale services associated with this
- collation. Possible values are
- <literal>icu</literal><indexterm><primary>ICU</primary></indexterm>
- (if the server was built with ICU support) or <literal>libc</literal>.
+ collation. Possible values are <literal>builtin</literal>,
+ <literal>icu</literal><indexterm><primary>ICU</primary></indexterm> (if
+ the server was built with ICU support) or <literal>libc</literal>.
<literal>libc</literal> is the default. See <xref
linkend="locale-providers"/> for details.
</para>
diff --git a/doc/src/sgml/ref/create_database.sgml b/doc/src/sgml/ref/create_database.sgml
index 72927960ebb..6c1fd95602d 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -162,6 +162,11 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
linkend="create-database-lc-ctype"/>, or <xref
linkend="create-database-icu-locale"/> individually.
</para>
+ <para>
+ If <xref linkend="create-database-locale-provider"/> is
+ <literal>builtin</literal>, then <replaceable>locale</replaceable>
+ must be specified and set to <literal>C</literal>.
+ </para>
<tip>
<para>
The other locale settings <xref linkend="guc-lc-messages"/>, <xref
@@ -243,7 +248,7 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
<listitem>
<para>
Specifies the provider to use for the default collation in this
- database. Possible values are
+ database. Possible values are <literal>builtin</literal>,
<literal>icu</literal><indexterm><primary>ICU</primary></indexterm>
(if the server was built with ICU support) or <literal>libc</literal>.
By default, the provider is the same as that of the <xref
diff --git a/doc/src/sgml/ref/createdb.sgml b/doc/src/sgml/ref/createdb.sgml
index e4647d5ce71..d3e815f659c 100644
--- a/doc/src/sgml/ref/createdb.sgml
+++ b/doc/src/sgml/ref/createdb.sgml
@@ -171,7 +171,7 @@ PostgreSQL documentation
</varlistentry>
<varlistentry>
- <term><option>--locale-provider={<literal>libc</literal>|<literal>icu</literal>}</option></term>
+ <term><option>--locale-provider={<literal>builtin</literal>|<literal>libc</literal>|<literal>icu</literal>}</option></term>
<listitem>
<para>
Specifies the locale provider for the database's default collation.
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index cd75cae10e2..4760570f6ab 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -286,6 +286,11 @@ PostgreSQL documentation
environment that <command>initdb</command> runs in. Locale
support is described in <xref linkend="locale"/>.
</para>
+ <para>
+ If <option>--locale-provider</option> is <literal>builtin</literal>,
+ <option>--locale</option> must be specified and set to
+ <literal>C</literal>.
+ </para>
</listitem>
</varlistentry>
@@ -314,8 +319,18 @@ PostgreSQL documentation
</listitem>
</varlistentry>
+ <varlistentry id="app-initdb-builtin-locale">
+ <term><option>--builtin-locale=<replaceable>locale</replaceable></option></term>
+ <listitem>
+ <para>
+ Specifies the locale name when the builtin provider is used. Locale support
+ is described in <xref linkend="locale"/>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="app-initdb-option-locale-provider">
- <term><option>--locale-provider={<literal>libc</literal>|<literal>icu</literal>}</option></term>
+ <term><option>--locale-provider={<literal>builtin</literal>|<literal>libc</literal>|<literal>icu</literal>}</option></term>
<listitem>
<para>
This option sets the locale provider for databases created in the new