diff options
author | Tom Lane <tgl@sss.pgh.pa.us> | 2016-01-19 12:04:32 -0500 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2016-01-19 12:04:36 -0500 |
commit | 9ff60273e35cad6e9d3a4adf59d5c2455afe9d9e (patch) | |
tree | 197b0e0ba7124ba99aa77b0344f418a7b6fbb9cf /doc/src | |
parent | 53c949c1be2f43cd47cb433923e76ea00e9222bc (diff) | |
download | postgresql-9ff60273e35cad6e9d3a4adf59d5c2455afe9d9e.tar.gz postgresql-9ff60273e35cad6e9d3a4adf59d5c2455afe9d9e.zip |
Fix assorted inconsistencies in GiST opclass support function declarations.
The conventions specified by the GiST SGML documentation were widely
ignored. For example, the strategy-number argument for "consistent" and
"distance" functions is specified to be a smallint, but most of the
built-in support functions declared it as an integer, and for that matter
the core code passed it using Int32GetDatum not Int16GetDatum. None of
that makes any real difference at runtime, but it's quite confusing for
newcomers to the code, and it makes it very hard to write an amvalidate()
function that checks support function signatures. So let's try to instill
some consistency here.
Another similar issue is that the "query" argument is not of a single
well-defined type, but could have different types depending on the strategy
(corresponding to search operators with different righthand-side argument
types). Some of the functions threw up their hands and declared the query
argument as being of "internal" type, which surely isn't right ("any" would
have been more appropriate); but the majority position seemed to be to
declare it as being of the indexed data type, corresponding to a search
operator with both input types the same. So I've specified a convention
that that's what to do always.
Also, the result of the "union" support function actually must be of the
index's storage type, but the documentation suggested declaring it to
return "internal", and some of the functions followed that. Standardize
on telling the truth, instead.
Similarly, standardize on declaring the "same" function's inputs as
being of the storage type, not "internal".
Also, somebody had forgotten to add the "recheck" argument to both
the documentation of the "distance" support function and all of their
SQL declarations, even though the C code was happily using that argument.
Clean that up too.
Fix up some other omissions in the docs too, such as documenting that
union's second input argument is vestigial.
So far as the errors in core function declarations go, we can just fix
pg_proc.h and bump catversion. Adjusting the erroneous declarations in
contrib modules is more debatable: in principle any change in those
scripts should involve an extension version bump, which is a pain.
However, since these changes are purely cosmetic and make no functional
difference, I think we can get away without doing that.
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/gist.sgml | 75 | ||||
-rw-r--r-- | doc/src/sgml/ref/create_opclass.sgml | 2 |
2 files changed, 58 insertions, 19 deletions
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml index 2d1a5aa863f..b3cc347e5cc 100644 --- a/doc/src/sgml/gist.sgml +++ b/doc/src/sgml/gist.sgml @@ -359,9 +359,20 @@ my_consistent(PG_FUNCTION_ARGS) the value being looked up in the index. The <literal>StrategyNumber</> parameter indicates which operator of your operator class is being applied — it matches one of the operator numbers in the - <command>CREATE OPERATOR CLASS</> command. Depending on what operators - you have included in the class, the data type of <varname>query</> could - vary with the operator, but the above skeleton assumes it doesn't. + <command>CREATE OPERATOR CLASS</> command. + </para> + + <para> + Depending on which operators you have included in the class, the data + type of <varname>query</> could vary with the operator, since it will + be whatever type is on the righthand side of the operator, which might + be different from the indexed data type appearing on the lefthand side. + (The above code skeleton assumes that only one type is possible; if + not, fetching the <varname>query</> argument value would have to depend + on the operator.) It is recommended that the SQL declaration of + the <function>consistent</> function use the opclass's indexed data + type for the <varname>query</> argument, even though the actual type + might be something else depending on the operator. </para> </listitem> @@ -381,7 +392,7 @@ my_consistent(PG_FUNCTION_ARGS) <programlisting> CREATE OR REPLACE FUNCTION my_union(internal, internal) -RETURNS internal +RETURNS storage_type AS 'MODULE_PATHNAME' LANGUAGE C STRICT; </programlisting> @@ -434,9 +445,21 @@ my_union(PG_FUNCTION_ARGS) </para> <para> - The <function>union</> implementation function should return a - pointer to newly <function>palloc()</>ed memory. You can't just - return whatever the input is. + The result of the <function>union</> function must be a value of the + index's storage type, whatever that is (it might or might not be + different from the indexed column's type). The <function>union</> + function should return a pointer to newly <function>palloc()</>ed + memory. You can't just return the input value as-is, even if there is + no type change. + </para> + + <para> + As shown above, the <function>union</> function's + first <type>internal</> argument is actually + a <structname>GistEntryVector</> pointer. The second argument is a + pointer to an integer variable, which can be ignored. (It used to be + required that the <function>union</> function store the size of its + result value into that variable, but this is no longer necessary.) </para> </listitem> </varlistentry> @@ -576,6 +599,12 @@ my_penalty(PG_FUNCTION_ARGS) PG_RETURN_POINTER(penalty); } </programlisting> + + For historical reasons, the <function>penalty</> function doesn't + just return a <type>float</> result; instead it has to store the value + at the location indicated by the third argument. The return + value per se is ignored, though it's conventional to pass back the + address of that argument. </para> <para> @@ -615,9 +644,9 @@ Datum my_picksplit(PG_FUNCTION_ARGS) { GistEntryVector *entryvec = (GistEntryVector *) PG_GETARG_POINTER(0); + GIST_SPLITVEC *v = (GIST_SPLITVEC *) PG_GETARG_POINTER(1); OffsetNumber maxoff = entryvec->n - 1; GISTENTRY *ent = entryvec->vector; - GIST_SPLITVEC *v = (GIST_SPLITVEC *) PG_GETARG_POINTER(1); int i, nbytes; OffsetNumber *left, @@ -683,6 +712,11 @@ my_picksplit(PG_FUNCTION_ARGS) PG_RETURN_POINTER(v); } </programlisting> + + Notice that the <function>picksplit</> function's result is delivered + by modifying the passed-in <structname>v</> structure. The return + value per se is ignored, though it's conventional to pass back the + address of <structname>v</>. </para> <para> @@ -700,13 +734,15 @@ my_picksplit(PG_FUNCTION_ARGS) <listitem> <para> Returns true if two index entries are identical, false otherwise. + (An <quote>index entry</> is a value of the index's storage type, + not necessarily the original indexed column's type.) </para> <para> The <acronym>SQL</> declaration of the function must look like this: <programlisting> -CREATE OR REPLACE FUNCTION my_same(internal, internal, internal) +CREATE OR REPLACE FUNCTION my_same(storage_type, storage_type, internal) RETURNS internal AS 'MODULE_PATHNAME' LANGUAGE C STRICT; @@ -731,7 +767,9 @@ my_same(PG_FUNCTION_ARGS) For historical reasons, the <function>same</> function doesn't just return a Boolean result; instead it has to store the flag - at the location indicated by the third argument. + at the location indicated by the third argument. The return + value per se is ignored, though it's conventional to pass back the + address of that argument. </para> </listitem> </varlistentry> @@ -756,7 +794,7 @@ my_same(PG_FUNCTION_ARGS) The <acronym>SQL</> declaration of the function must look like this: <programlisting> -CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid) +CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid, internal) RETURNS float8 AS 'MODULE_PATHNAME' LANGUAGE C STRICT; @@ -824,7 +862,7 @@ my_distance(PG_FUNCTION_ARGS) <term><function>fetch</></term> <listitem> <para> - Converts the compressed index representation of the data item into the + Converts the compressed index representation of a data item into the original data type, for index-only scans. The returned data must be an exact, non-lossy copy of the originally indexed value. </para> @@ -840,11 +878,12 @@ LANGUAGE C STRICT; </programlisting> The argument is a pointer to a <structname>GISTENTRY</> struct. On - entry, its 'key' field contains a non-NULL leaf datum in its + entry, its <structfield>key</> field contains a non-NULL leaf datum in compressed form. The return value is another <structname>GISTENTRY</> - struct, whose 'key' field contains the same datum in the original, - uncompressed form. If the opclass' compress function does nothing for - leaf entries, the fetch method can return the argument as is. + struct, whose <structfield>key</> field contains the same datum in its + original, uncompressed form. If the opclass's compress function does + nothing for leaf entries, the <function>fetch</> method can return the + argument as-is. </para> <para> @@ -879,8 +918,8 @@ my_fetch(PG_FUNCTION_ARGS) <para> If the compress method is lossy for leaf entries, the operator class - cannot support index-only scans, and must not define a 'fetch' - function. + cannot support index-only scans, and must not define + a <function>fetch</> function. </para> </listitem> diff --git a/doc/src/sgml/ref/create_opclass.sgml b/doc/src/sgml/ref/create_opclass.sgml index b21a1a13b32..527f33c7a80 100644 --- a/doc/src/sgml/ref/create_opclass.sgml +++ b/doc/src/sgml/ref/create_opclass.sgml @@ -289,7 +289,7 @@ CREATE OPERATOR CLASS gist__int_ops OPERATOR 7 @>, OPERATOR 8 <@, OPERATOR 20 @@ (_int4, query_int), - FUNCTION 1 g_int_consistent (internal, _int4, int, oid, internal), + FUNCTION 1 g_int_consistent (internal, _int4, smallint, oid, internal), FUNCTION 2 g_int_union (internal, internal), FUNCTION 3 g_int_compress (internal), FUNCTION 4 g_int_decompress (internal), |