author | Jeff Davis <jdavis@postgresql.org> | 2024-03-13 23:33:44 -0700 |
---|---|---|
committer | Jeff Davis <jdavis@postgresql.org> | 2024-03-13 23:33:44 -0700 |
commit | 2d819a08a1cbc11364e36f816b02e33e8dcc030b | |
tree | 1a8d3b459866d7df936faffa0e64f5e339e6a6c2 /doc/src | |
parent | 6ab2e8385d55e0b73bb8bbc41d9c286f5f7f357f | |
Introduce "builtin" collation provider.

New provider for collations, like "libc" or "icu", but without any
external dependency.

Initially, the only locale supported by the builtin provider is "C",
which is identical to the libc provider's "C" locale. The libc
provider's "C" locale has always been treated as a special case that
uses an internal implementation, without using libc at all -- so the
new builtin provider uses the same implementation.

The builtin provider's locale is independent of the server environment
variables LC_COLLATE and LC_CTYPE. Using the builtin provider, the
database collation locale can be "C" while LC_COLLATE and LC_CTYPE are
set to "en_US", which is impossible with the libc provider.

By offering a new builtin provider, it clarifies that the semantics of
a collation using this provider will never depend on libc, and makes
it easier to document the behavior.

Discussion: https://postgr.es/m/ab925f69-5f9d-f85e-b87c-bd2a44798659@joeconway.com
Discussion: https://postgr.es/m/dd9261f4-7a98-4565-93ec-336c1c110d90@manitou-mail.org
Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com
Reviewed-by: Daniel Vérité, Peter Eisentraut, Jeremy Schneider
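
As an illustration only (this is not part of the commit), here is a minimal sketch of how a collation using the new provider might be created and used. The collation name `builtin_c` and the table `items` with its `name` column are hypothetical. The `provider = builtin` and `locale = 'C'` settings follow the requirement stated in the `CREATE COLLATION` documentation change in this patch.

```sql
-- Hypothetical collation name. Per the CREATE COLLATION documentation
-- change in this patch, the builtin provider requires that locale be
-- specified and set to C.
CREATE COLLATION builtin_c (provider = builtin, locale = 'C');

-- Hypothetical table and column, sorted using the builtin "C" collation
-- rather than the database default collation.
SELECT name FROM items ORDER BY name COLLATE builtin_c;
```

Because the builtin provider does not use libc, the result of such a sort should not depend on the operating system's locale data, although it may still depend on the database encoding.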
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/charset.sgml | 90 |
-rw-r--r-- | doc/src/sgml/ref/create_collation.sgml | 11 |
-rw-r--r-- | doc/src/sgml/ref/create_database.sgml | 7 |
-rw-r--r-- | doc/src/sgml/ref/createdb.sgml | 2 |
-rw-r--r-- | doc/src/sgml/ref/initdb.sgml | 17 |
5 files changed, 104 insertions, 23 deletions
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 4fc143025ef..7114eb7b522 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -342,22 +342,14 @@ initdb --locale=sv_SE
    <title>Locale Providers</title>
 
    <para>
-    <productname>PostgreSQL</productname> supports multiple <firstterm>locale
-    providers</firstterm>. This specifies which library supplies the locale
-    data. One standard provider name is <literal>libc</literal>, which uses
-    the locales provided by the operating system C library. These are the
-    locales used by most tools provided by the operating system. Another
-    provider is <literal>icu</literal>, which uses the external
-    ICU<indexterm><primary>ICU</primary></indexterm> library. ICU locales can
-    only be used if support for ICU was configured when PostgreSQL was built.
+    A locale provider specifies which library defines the locale behavior for
+    collations and character classifications.
    </para>
 
    <para>
     The commands and tools that select the locale settings, as described
-    above, each have an option to select the locale provider. The examples
-    shown earlier all use the <literal>libc</literal> provider, which is the
-    default. Here is an example to initialize a database cluster using the
-    ICU provider:
+    above, each have an option to select the locale provider. Here is an
+    example to initialize a database cluster using the ICU provider:
 <programlisting>
 initdb --locale-provider=icu --icu-locale=en
 </programlisting>
@@ -370,12 +362,76 @@ initdb --locale-provider=icu --icu-locale=en
    </para>
 
    <para>
-    Which locale provider to use depends on individual requirements. For most
-    basic uses, either provider will give adequate results. For the libc
-    provider, it depends on what the operating system offers; some operating
-    systems are better than others. For advanced uses, ICU offers more locale
-    variants and customization options.
+    Regardless of the locale provider, the operating system is still used to
+    provide some locale-aware behavior, such as messages (see <xref
+    linkend="guc-lc-messages"/>).
    </para>
+
+   <para>
+    The available locale providers are listed below:
+   </para>
+
+   <variablelist>
+    <varlistentry>
+     <term><literal>builtin</literal></term>
+     <listitem>
+      <para>
+       The <literal>builtin</literal> provider uses built-in operations. Only
+       the <literal>C</literal> locale is supported for this provider.
+      </para>
+      <para>
+       The <literal>C</literal> locale behavior is identical to the
+       <literal>C</literal> locale in the libc provider. When using this
+       locale, the behavior may depend on the database encoding.
+      </para>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><literal>icu</literal></term>
+     <listitem>
+      <para>
+       The <literal>icu</literal> provider uses the external
+       ICU<indexterm><primary>ICU</primary></indexterm>
+       library. <productname>PostgreSQL</productname> must have been
+       configured with support.
+      </para>
+      <para>
+       ICU provides collation and character classification behavior that is
+       independent of the operating system and database encoding, which is
+       preferable if you expect to transition to other platforms without any
+       change in results. <literal>LC_COLLATE</literal> and
+       <literal>LC_CTYPE</literal> can be set independently of the ICU
+       locale.
+      </para>
+      <note>
+       <para>
+        For the ICU provider, results may depend on the version of the ICU
+        library used, as it is updated to reflect changes in natural language
+        over time.
+       </para>
+      </note>
+     </listitem>
+    </varlistentry>
+
+    <varlistentry>
+     <term><literal>libc</literal></term>
+     <listitem>
+      <para>
+       The <literal>libc</literal> provider uses the operating system's C
+       library. The collation and character classification behavior is
+       controlled by the settings <literal>LC_COLLATE</literal> and
+       <literal>LC_CTYPE</literal>, so they cannot be set independently.
+      </para>
+      <note>
+       <para>
+        The same locale name may have different behavior on different
+        platforms when using the libc provider.
+       </para>
+      </note>
+     </listitem>
+    </varlistentry>
+   </variablelist>
   </sect2>
 
   <sect2 id="icu-locales">
diff --git a/doc/src/sgml/ref/create_collation.sgml b/doc/src/sgml/ref/create_collation.sgml
index 5cf9777764b..98cd7d56be9 100644
--- a/doc/src/sgml/ref/create_collation.sgml
+++ b/doc/src/sgml/ref/create_collation.sgml
@@ -96,6 +96,11 @@ CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replace
        <replaceable>locale</replaceable>, you cannot specify either of those
        parameters.
       </para>
+      <para>
+       If <replaceable>provider</replaceable> is <literal>builtin</literal>,
+       then <replaceable>locale</replaceable> must be specified and set to
+       <literal>C</literal>.
+      </para>
      </listitem>
     </varlistentry>
 
@@ -129,9 +134,9 @@ CREATE COLLATION [ IF NOT EXISTS ] <replaceable>name</replaceable> FROM <replace
      <listitem>
       <para>
        Specifies the provider to use for locale services associated with this
-       collation. Possible values are
-       <literal>icu</literal><indexterm><primary>ICU</primary></indexterm>
-       (if the server was built with ICU support) or <literal>libc</literal>.
+       collation. Possible values are <literal>builtin</literal>,
+       <literal>icu</literal><indexterm><primary>ICU</primary></indexterm> (if
+       the server was built with ICU support) or <literal>libc</literal>.
        <literal>libc</literal> is the default. See <xref
        linkend="locale-providers"/> for details.
       </para>
diff --git a/doc/src/sgml/ref/create_database.sgml b/doc/src/sgml/ref/create_database.sgml
index 72927960ebb..6c1fd95602d 100644
--- a/doc/src/sgml/ref/create_database.sgml
+++ b/doc/src/sgml/ref/create_database.sgml
@@ -162,6 +162,11 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
         linkend="create-database-lc-ctype"/>, or <xref
         linkend="create-database-icu-locale"/> individually.
        </para>
+       <para>
+        If <xref linkend="create-database-locale-provider"/> is
+        <literal>builtin</literal>, then <replaceable>locale</replaceable>
+        must be specified and set to <literal>C</literal>.
+       </para>
        <tip>
         <para>
          The other locale settings <xref linkend="guc-lc-messages"/>, <xref
@@ -243,7 +248,7 @@ CREATE DATABASE <replaceable class="parameter">name</replaceable>
       <listitem>
        <para>
         Specifies the provider to use for the default collation in this
-        database. Possible values are
+        database. Possible values are <literal>builtin</literal>,
         <literal>icu</literal><indexterm><primary>ICU</primary></indexterm>
         (if the server was built with ICU support) or <literal>libc</literal>.
         By default, the provider is the same as that of the <xref
diff --git a/doc/src/sgml/ref/createdb.sgml b/doc/src/sgml/ref/createdb.sgml
index e4647d5ce71..d3e815f659c 100644
--- a/doc/src/sgml/ref/createdb.sgml
+++ b/doc/src/sgml/ref/createdb.sgml
@@ -171,7 +171,7 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
-      <term><option>--locale-provider={<literal>libc</literal>|<literal>icu</literal>}</option></term>
+      <term><option>--locale-provider={<literal>builtin</literal>|<literal>libc</literal>|<literal>icu</literal>}</option></term>
       <listitem>
        <para>
         Specifies the locale provider for the database's default collation.
diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index cd75cae10e2..4760570f6ab 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -286,6 +286,11 @@ PostgreSQL documentation
         environment that <command>initdb</command> runs in. Locale
         support is described in <xref linkend="locale"/>.
        </para>
+       <para>
+        If <option>--locale-provider</option> is <literal>builtin</literal>,
+        <option>--locale</option> must be specified and set to
+        <literal>C</literal>.
+       </para>
       </listitem>
      </varlistentry>
 
@@ -314,8 +319,18 @@ PostgreSQL documentation
       </listitem>
      </varlistentry>
 
+     <varlistentry id="app-initdb-builtin-locale">
+      <term><option>--builtin-locale=<replaceable>locale</replaceable></option></term>
+      <listitem>
+       <para>
+        Specifies the locale name when the builtin provider is used. Locale support
+        is described in <xref linkend="locale"/>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="app-initdb-option-locale-provider">
-      <term><option>--locale-provider={<literal>libc</literal>|<literal>icu</literal>}</option></term>
+      <term><option>--locale-provider={<literal>builtin</literal>|<literal>libc</literal>|<literal>icu</literal>}</option></term>
       <listitem>
        <para>
         This option sets the locale provider for databases created in the new