author | Peter Eisentraut <peter@eisentraut.org> | 2023-08-23 08:12:50 +0200 |
---|---|---|
committer | Peter Eisentraut <peter@eisentraut.org> | 2023-08-23 08:12:50 +0200 |
commit | ed9330cff57bf38df2c38ea4d4cbb44e1378d4d4 (patch) | |
tree | 1a53c8e29ab285fd28faa742c8eda7fe654f91aa /doc/src | |
parent | ae556c44163b900aa5dfcca023dfbcdb1e2fd5fd (diff) | |
Improve vertical spacing of documentation markup
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/charset.sgml | 60 |
1 files changed, 46 insertions, 14 deletions
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index ed844659967..22721b105ff 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -377,10 +377,13 @@ initdb --locale-provider=icu --icu-locale=en
variants and customization options. </para> </sect2> + <sect2 id="icu-locales"> <title>ICU Locales</title> + <sect3 id="icu-locale-names"> <title>ICU Locale Names</title> + <para> The ICU format for the locale name is a <link linkend="icu-language-tag">Language Tag</link>.
@@ -412,16 +415,19 @@ NOTICE: using standard form "de-DE" for locale "de_DE.utf8"
linkend="icu-language-tag">language tag</link> instead of relying on the transformation. </para> + <para> A locale with no language name, or the special language name <literal>root</literal>, is transformed to have the language <literal>und</literal> ("undefined"). </para> + <para> ICU can transform most libc locale names, as well as some other formats, into language tags for easier transition to ICU. If a libc locale name is used in ICU, it may not have precisely the same behavior as in libc. </para> + <para> If there is a problem interpreting the locale name, or if the locale name represents a language or region that ICU does not recognize, you will see
@@ -442,10 +448,12 @@ CREATE COLLATION
<sect3 id="icu-language-tag"> <title>Language Tag</title> + <para> A language tag, defined in BCP 47, is a standardized identifier used to identify languages, regions, and other information about a locale. </para> + <para> Basic language tags are simply <replaceable>language</replaceable><literal>-</literal><replaceable>region</replaceable>;
@@ -457,6 +465,7 @@ CREATE COLLATION
<literal>ja-JP</literal>, <literal>de</literal>, or <literal>fr-CA</literal>. </para> + <para> Collation settings may be included in the language tag to customize collation behavior.
ICU allows extensive customization, such as
@@ -464,6 +473,7 @@ CREATE COLLATION
treatment of digits within text; and many other options to satisfy a variety of uses. </para> + <para> To include this additional collation information in a language tag, append <literal>-u</literal>, which indicates there are additional
@@ -477,6 +487,7 @@ CREATE COLLATION
<literal>-</literal><replaceable>value</replaceable>, which implies a value of <literal>true</literal>. </para> + <para> For example, the language tag <literal>en-US-u-kn-ks-level2</literal> means the locale with the English language in the US region, with
@@ -500,6 +511,7 @@ SELECT 'N-45' < 'N-123' COLLATE mycollation5 as result;
(1 row) </screen> </para> + <para> See <xref linkend="icu-custom-collations"/> for details and additional examples of using language tags with custom collation information for the
@@ -507,6 +519,7 @@ SELECT 'N-45' < 'N-123' COLLATE mycollation5 as result;
</para> </sect3> </sect2> + <sect2 id="locale-problems"> <title>Problems</title>
@@ -1100,6 +1113,7 @@ CREATE COLLATION ignore_accents (provider = icu, locale = 'und-u-ks-level1-kc-tr
</tip> </sect3> </sect2> + <sect2 id="icu-custom-collations"> <title>ICU Custom Collations</title>
@@ -1129,8 +1143,10 @@ SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
linkend="icu-collation-settings"/>, or see <xref linkend="icu-external-references"/> for more details. </para> + <sect3 id="icu-collation-comparison-levels"> <title>ICU Comparison Levels</title> + <para> Comparison of two strings (collation) in ICU is determined by a multi-level process, where textual features are grouped into
@@ -1138,6 +1154,7 @@ SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
linkend="icu-collation-settings-table">collation settings</link>. Higher levels correspond to finer textual features.
</para> + <para> <xref linkend="icu-collation-levels"/> shows which textual feature differences are considered significant when determining equality at the
@@ -1145,7 +1162,7 @@ SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
invisible separator, and as seen in the table, is ignored for at all levels of comparison less than <literal>identic</literal>. </para> - <para> + <table id="icu-collation-levels"> <title>ICU Collation Levels</title> <tgroup cols="8">
@@ -1157,6 +1174,7 @@ SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
<colspec colname="col6" colwidth="1*"/> <colspec colname="col7" colwidth="1*"/> <colspec colname="col8" colwidth="1*"/> + <thead> <row> <entry>Level</entry>
@@ -1169,6 +1187,7 @@ SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
<entry><literal>'y' = 'z'</literal></entry> </row> </thead> + <tbody> <row> <entry>level1</entry>
@@ -1224,6 +1243,7 @@ SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
</tgroup> </table> + <para> At every level, even with full normalization off, basic normalization is performed. For example, <literal>'á'</literal> may be composed of the code points <literal>U&'\0061\0301'</literal> or the single code
@@ -1233,9 +1253,9 @@ SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
created with <symbol>deterministic</symbol> set to <literal>true</literal>.
</para> + <sect4 id="icu-collation-level-examples"> <title>Collation Level Examples</title> - <para> <programlisting> CREATE COLLATION level3 (provider = icu, deterministic = false, locale = 'und-u-ka-shifted-ks-level3');
@@ -1251,18 +1271,18 @@ SELECT 'x-y' = 'x_y' COLLATE level3; -- true
SELECT 'x-y' = 'x_y' COLLATE level4; -- false </programlisting> - </para> </sect4> </sect3> <sect3 id="icu-collation-settings"> <title>Collation Settings for an ICU Locale</title> + <para> <xref linkend="icu-collation-settings-table"/> shows the available collation settings, which can be used as part of a language tag to customize a collation. </para> - <para> + <table id="icu-collation-settings-table"> <title>ICU Collation Settings</title> <tgroup cols="4">
@@ -1270,6 +1290,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
<colspec colname="col2" colwidth="2*"/> <colspec colname="col3" colwidth="2*"/> <colspec colname="col4" colwidth="5*"/> + <thead> <row> <entry>Key</entry>
@@ -1278,6 +1299,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
<entry>Description</entry> </row> </thead> + <tbody> <row> <entry><literal>co</literal></entry>
@@ -1287,6 +1309,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
Collation type. See <xref linkend="icu-external-references"/> for additional options and details. </entry> </row> + <row> <entry><literal>ka</literal></entry> <entry><literal>noignore</literal>, <literal>shifted</literal></entry>
@@ -1299,6 +1322,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
character classes are ignored. </entry> </row> + <row> <entry><literal>kb</literal></entry> <entry><literal>true</literal>, <literal>false</literal></entry>
@@ -1309,6 +1333,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
before <literal>'aé'</literal>.
</entry> </row> + <row> <entry><literal>kc</literal></entry> <entry><literal>true</literal>, <literal>false</literal></entry>
@@ -1325,6 +1350,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
</para> </entry> </row> + <row> <entry><literal>kf</literal></entry> <entry>
@@ -1339,6 +1365,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
the rules of the locale. </entry> </row> + <row> <entry><literal>kn</literal></entry> <entry><literal>true</literal>, <literal>false</literal></entry>
@@ -1350,6 +1377,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
<literal>'id-123'</literal>. </entry> </row> + <row> <entry><literal>kk</literal></entry> <entry><literal>true</literal>, <literal>false</literal></entry>
@@ -1373,6 +1401,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
</para> </entry> </row> + <row> <entry><literal>kr</literal></entry> <entry>
@@ -1398,6 +1427,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
</para> </entry> </row> + <row> <entry><literal>ks</literal></entry> <entry><literal>level1</literal>, <literal>level2</literal>, <literal>level3</literal>, <literal>level4</literal>, <literal>identic</literal></entry>
@@ -1409,6 +1439,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
<xref linkend="icu-collation-levels"/> for details. </entry> </row> + <row> <entry><literal>kv</literal></entry> <entry>
@@ -1429,10 +1460,13 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
</tbody> </tgroup> </table> - Defaults may depend on locale. The above table is not meant to be - complete. See <xref linkend="icu-external-references"/> for additional - options and details. + + <para> + Defaults may depend on locale. The above table is not meant to be + complete. See <xref linkend="icu-external-references"/> for additional + options and details.
</para> + <note> <para> For many collation settings, you must create the collation with
@@ -1448,7 +1482,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
<sect3 id="icu-locale-examples"> <title>Examples</title> - <para> + <variablelist> <varlistentry id="collation-managing-create-icu-de-u-co-phonebk-x-icu"> <term><literal>CREATE COLLATION "de-u-co-phonebk-x-icu" (provider = icu, locale = 'de-u-co-phonebk');</literal></term>
@@ -1494,22 +1528,21 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
</listitem> </varlistentry> </variablelist> - </para> </sect3> <sect3 id="icu-external-references"> <title>External References for ICU</title> + <para> This section (<xref linkend="icu-custom-collations"/>) is only a brief overview of ICU behavior and language tags. Refer to the following documents for technical details, additional options, and new behavior: </para> + <itemizedlist> <listitem> <para> - <ulink - url="https://www.unicode.org/reports/tr35/tr35-collation.html">Unicode - Technical Standard #35</ulink> + <ulink url="https://www.unicode.org/reports/tr35/tr35-collation.html">Unicode Technical Standard #35</ulink> </para> </listitem> <listitem>
@@ -1519,8 +1552,7 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
</listitem> <listitem> <para> - <ulink url="https://github.com/unicode-org/cldr/blob/master/common/bcp47/collation.xml">CLDR - repository</ulink> + <ulink url="https://github.com/unicode-org/cldr/blob/master/common/bcp47/collation.xml">CLDR repository</ulink> </para> </listitem> <listitem>