author Tom Lane <tgl@sss.pgh.pa.us> 2010-08-25 02:12:00 +0000
committer Tom Lane <tgl@sss.pgh.pa.us> 2010-08-25 02:12:00 +0000
commit 7fc614c6982702d411d82ee07240bd93134ba692
tree a615e621efa0bbdfb5ae7460413fddb7fe2b38d2 /doc/src
parent 1dab218a69831b396faec553bf967d75abcc7ebc
Docs review for unaccent: fix grammar, markup, etc.
Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/unaccent.sgml | 96
1 files changed, 51 insertions, 45 deletions
diff --git a/doc/src/sgml/unaccent.sgml b/doc/src/sgml/unaccent.sgml
index ff6a2989dd4..6c73c3f2986 100644
--- a/doc/src/sgml/unaccent.sgml
+++ b/doc/src/sgml/unaccent.sgml
@@ -1,3 +1,5 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/unaccent.sgml,v 1.6 2010/08/25 02:12:00 tgl Exp $ -->
+
<sect1 id="unaccent">
<title>unaccent</title>
@@ -6,24 +8,24 @@
</indexterm>
<para>
- <filename>unaccent</> removes accents (diacritic signs) from a lexeme.
- It's a filtering dictionary, that means its output is
- always passed to the next dictionary (if any), contrary to the standard
- behavior. Currently, it supports most important accents from European
- languages.
+ <filename>unaccent</> is a text search dictionary that removes accents
+ (diacritic signs) from lexemes.
+ It's a filtering dictionary, which means its output is
+ always passed to the next dictionary (if any), unlike the normal
+ behavior of dictionaries. This allows accent-insensitive processing
+ for full text search.
</para>
<para>
- Limitation: Current implementation of <filename>unaccent</>
- dictionary cannot be used as a normalizing dictionary for
- <filename>thesaurus</filename> dictionary.
+ The current implementation of <filename>unaccent</> cannot be used as a
+ normalizing dictionary for the <filename>thesaurus</filename> dictionary.
</para>
-
+
<sect2>
<title>Configuration</title>
<para>
- A <literal>unaccent</> dictionary accepts the following options:
+ An <literal>unaccent</> dictionary accepts the following options:
</para>
<itemizedlist>
<listitem>
@@ -43,23 +45,27 @@
<itemizedlist>
<listitem>
<para>
- Each line represents pair: character_with_accent character_without_accent
+ Each line represents a pair, consisting of a character with accent
+ followed by a character without accent. The first is translated into
+ the second. For example,
<programlisting>
&Agrave; A
&Aacute; A
-&Acirc; A
+&Acirc; A
&Atilde; A
-&Auml; A
-&Aring; A
-&AElig; A
+&Auml; A
+&Aring; A
+&AElig; A
</programlisting>
</para>
</listitem>
</itemizedlist>
<para>
- Look at <filename>unaccent.rules</>, which is installed in
- <filename>$SHAREDIR/tsearch_data/</>, for an example.
+ A more complete example, which is directly useful for most European
+ languages, can be found in <filename>unaccent.rules</>, which is installed
+ in <filename>$SHAREDIR/tsearch_data/</> when the <filename>unaccent</>
+ module is installed.
</para>
</sect2>
@@ -67,66 +73,66 @@
<title>Usage</title>
<para>
- Running the installation script creates a text search template
- <literal>unaccent</> and a dictionary <literal>unaccent</>
+ Running the installation script <filename>unaccent.sql</> creates a text
+ search template <literal>unaccent</> and a dictionary <literal>unaccent</>
based on it, with default parameters. You can alter the
parameters, for example
<programlisting>
-=# ALTER TEXT SEARCH DICTIONARY unaccent (RULES='my_rules');
+mydb=# ALTER TEXT SEARCH DICTIONARY unaccent (RULES='my_rules');
</programlisting>
or create new dictionaries based on the template.
</para>
<para>
- To test the dictionary, you can try
-
+ To test the dictionary, you can try:
<programlisting>
-=# select ts_lexize('unaccent','Hôtel');
- ts_lexize
+mydb=# select ts_lexize('unaccent','H&ocirc;tel');
+ ts_lexize
-----------
{Hotel}
(1 row)
</programlisting>
</para>
-
+
<para>
- Filtering dictionary are useful for correct work of
- <function>ts_headline</function> function.
+ Here is an example showing how to insert the
+ <filename>unaccent</> dictionary into a text search configuration:
<programlisting>
-=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
-=# ALTER TEXT SEARCH CONFIGURATION fr
+mydb=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
+mydb=# ALTER TEXT SEARCH CONFIGURATION fr
ALTER MAPPING FOR hword, hword_part, word
WITH unaccent, french_stem;
-=# select to_tsvector('fr','Hôtels de la Mer');
- to_tsvector
+mydb=# select to_tsvector('fr','H&ocirc;tels de la Mer');
+ to_tsvector
-------------------
'hotel':1 'mer':4
(1 row)
-=# select to_tsvector('fr','Hôtel de la Mer') @@ to_tsquery('fr','Hotels');
- ?column?
+mydb=# select to_tsvector('fr','H&ocirc;tel de la Mer') @@ to_tsquery('fr','Hotels');
+ ?column?
----------
t
(1 row)
-=# select ts_headline('fr','Hôtel de la Mer',to_tsquery('fr','Hotels'));
- ts_headline
+
+mydb=# select ts_headline('fr','H&ocirc;tel de la Mer',to_tsquery('fr','Hotels'));
+ ts_headline
------------------------
- &lt;b&gt;Hôtel&lt;/b&gt;de la Mer
+ &lt;b&gt;H&ocirc;tel&lt;/b&gt; de la Mer
(1 row)
-
</programlisting>
</para>
</sect2>
<sect2>
- <title>Function</title>
+ <title>Functions</title>
<para>
- <function>unaccent</> function removes accents (diacritic signs) from
- argument string. Basically, it's a wrapper around
- <filename>unaccent</> dictionary.
+ The <function>unaccent()</> function removes accents (diacritic signs) from
+ a given string. Basically, it's a wrapper around the
+ <filename>unaccent</> dictionary, but it can be used outside normal
+ text search contexts.
</para>
<indexterm>
@@ -134,14 +140,14 @@
</indexterm>
<synopsis>
-unaccent(<optional><replaceable class="PARAMETER">dictionary</replaceable>, </optional> <replaceable class="PARAMETER">string</replaceable>)
-returns <type>text</type>
+unaccent(<optional><replaceable class="PARAMETER">dictionary</replaceable>, </optional> <replaceable class="PARAMETER">string</replaceable>) returns <type>text</type>
</synopsis>
<para>
+ For example:
<programlisting>
-SELECT unaccent('unaccent', 'Hôtel');
-SELECT unaccent('Hôtel');
+SELECT unaccent('unaccent', 'H&ocirc;tel');
+SELECT unaccent('H&ocirc;tel');
</programlisting>
</para>
</sect2>
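
The rules-file format described in the patched documentation (one pair per line, a character with accent followed by the character without accent) can be illustrated with a small sketch. This is not PostgreSQL's implementation; the names `parse_rules` and `unaccent_text` are hypothetical, and the `ô o` rule is an assumption added so that the `Hôtel` example from the patch can be reproduced.

```python
def parse_rules(text):
    """Build a translation table from rules text.

    Each usable line holds exactly two whitespace-separated fields:
    the accented character, then its unaccented replacement.
    """
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # this sketch skips blank or malformed lines
        accented, plain = parts
        table[accented] = plain
    return table


def unaccent_text(text, table):
    """Replace every character that has a rule; leave the rest unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


# The first seven pairs mirror the example in the patch; the last pair
# (ô -> o) is an assumed rule added for illustration only.
RULES = """\
À A
Á A
Â A
Ã A
Ä A
Å A
Æ A
ô o
"""

if __name__ == "__main__":
    print(unaccent_text("Hôtel", parse_rules(RULES)))
```

A plain per-character lookup keeps the sketch close to the documented format. Characters without a rule pass through untouched, which matches the idea of a filtering step whose output is handed on to the next stage.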