author | Heikki Linnakangas <heikki.linnakangas@iki.fi> | 2015-05-15 17:59:46 +0300 |
---|---|---|
committer | Heikki Linnakangas <heikki.linnakangas@iki.fi> | 2015-05-15 18:09:31 +0300 |
commit | 98edd617f3b62a02cb2df9b418fcc4ece45c7ec0 (patch) | |
tree | 6fdfbfe88d8e2aa0e43bbbc827957459e404adf1 /src/backend/access/gist/gistscan.c | |
parent | a868931fecdf93f3ceb1c9431bb93757b706269d (diff) | |
Fix datatype confusion with the new lossy GiST distance functions.

We can only support a lossy distance function when the distance function's
datatype is comparable with the original ordering operator's datatype.
The distance function always returns a float8, so we are limited to float8,
and float4 (by a hard-coded cast of the float8 to float4).
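The float8/float4 limitation can be illustrated with a small standalone sketch. It is not the PostgreSQL code: `ResultType`, `OrderByValue` and `convert_distance` are hypothetical names invented here, standing in for the Oid and Datum handling the real backend would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ordering operator's result type. */
typedef enum { RESULT_FLOAT8, RESULT_FLOAT4, RESULT_OTHER } ResultType;

/* Hypothetical value handed back to the executor. */
typedef struct
{
    bool   isnull;
    double as_float8;
    float  as_float4;
} OrderByValue;

/*
 * Convert the float8 distance into the ordering operator's result type.
 * Returns false when the type cannot represent a lossy distance.
 */
bool
convert_distance(double distance, ResultType type, OrderByValue *out)
{
    assert(out != NULL);

    out->isnull = false;
    out->as_float8 = 0.0;
    out->as_float4 = 0.0f;

    switch (type)
    {
        case RESULT_FLOAT8:
            out->as_float8 = distance;          /* same type, no conversion */
            return true;
        case RESULT_FLOAT4:
            out->as_float4 = (float) distance;  /* hard-coded narrowing cast */
            return true;
        default:
            out->isnull = true;                 /* no known conversion */
            return false;
    }
}
```

Any other result type falls through to the unsupported branch, which is the case the commit message says cannot be handled for lossy distances.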

In light of this limitation, it seems like a good idea to have a separate
'recheck' flag for the ORDER BY expressions, so that if you have a non-lossy
distance function, it still works with lossy quals. There are no cases like
that with the built-in or contrib opclasses, but it's plausible.

There was a hidden assumption that the ORDER BY values returned by GiST
match the original ordering operator's return type, but there are plenty
of examples where that's not true, e.g. in btree_gist and pg_trgm. As long
as the distance function is not lossy, we can tolerate that and just not
return the distance to the executor (or rather, always return NULL). The
executor doesn't need the distances if there are no lossy results.
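The rule in this paragraph amounts to a three-way decision, sketched below in standalone C. The enum and the function `prepare_orderby_value` are hypothetical and only model the logic described in the message; they are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of preparing one ORDER BY value for the executor. */
typedef enum
{
    ORDERBY_NULL,           /* exact distance: executor needs no value */
    ORDERBY_DISTANCE,       /* lossy distance in a comparable type */
    ORDERBY_UNSUPPORTED     /* lossy distance, incompatible result type */
} OrderByAction;

/*
 * Decide what to hand to the executor for one ORDER BY expression.
 * "lossy" models the distance function having set its recheck flag;
 * "result_is_float" models the ordering operator returning float4/float8.
 */
OrderByAction
prepare_orderby_value(double distance, bool lossy, bool result_is_float,
                      double *value, bool *isnull)
{
    assert(value != NULL && isnull != NULL);

    /* Default: return NULL, which is always safe for exact distances. */
    *value = 0.0;
    *isnull = true;

    if (!lossy)
        return ORDERBY_NULL;
    if (!result_is_float)
        return ORDERBY_UNSUPPORTED;

    *value = distance;
    *isnull = false;
    return ORDERBY_DISTANCE;
}
```

With an exact distance function the result type never matters, which is why opclasses such as those in btree_gist and pg_trgm keep working.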

There was another little bug: the recheck variable was not initialized
before calling the distance function. That revealed the bigger issue,
as the executor tried to reorder tuples that didn't need reordering, and
that failed because of the datatype mismatch.
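The initialization bug can be shown in isolation. The sketch below assumes, as the message implies, that a distance function only writes the recheck flag when its result is lossy; the callback type and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical distance callback: sets *recheck only when lossy. */
typedef double (*distance_fn) (double key, double query, bool *recheck);

/* Exact distance: never touches the recheck flag. */
double
exact_distance(double key, double query, bool *recheck)
{
    (void) recheck;
    return key > query ? key - query : query - key;
}

/* Lossy distance: returns a lower bound and asks for a recheck. */
double
lossy_distance(double key, double query, bool *recheck)
{
    *recheck = true;
    return 0.5 * (key > query ? key - query : query - key);
}

/*
 * Call a distance function safely. Without the initialization below,
 * an exact function would leave whatever value *recheck held before,
 * and the caller might reorder tuples that need no reordering.
 */
double
call_distance(distance_fn fn, double key, double query, bool *recheck)
{
    assert(fn != NULL && recheck != NULL);

    *recheck = false;           /* initialize before the call */
    return fn(key, query, recheck);
}
```

Here a stale `true` left in the flag is overwritten before the exact function runs, so the caller sees `false` afterwards.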
Diffstat (limited to 'src/backend/access/gist/gistscan.c')
-rw-r--r-- | src/backend/access/gist/gistscan.c | 16 |
1 files changed, 16 insertions, 0 deletions
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index 099849a606b..d0fea2b86ce 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "access/relscan.h"
+#include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
 
@@ -263,6 +264,8 @@ gistrescan(PG_FUNCTION_ARGS)
 		memmove(scan->orderByData, orderbys,
 				scan->numberOfOrderBys * sizeof(ScanKeyData));
 
+		so->orderByTypes = (Oid *) palloc(scan->numberOfOrderBys * sizeof(Oid));
+
 		/*
 		 * Modify the order-by key so that the Distance method is called for
 		 * all comparisons. The original operator is passed to the Distance
@@ -281,6 +284,19 @@ gistrescan(PG_FUNCTION_ARGS)
 					 GIST_DISTANCE_PROC, skey->sk_attno,
 					 RelationGetRelationName(scan->indexRelation));
 
+			/*
+			 * Look up the datatype returned by the original ordering operator.
+			 * GiST always uses a float8 for the distance function, but the
+			 * ordering operator could be anything else.
+			 *
+			 * XXX: The distance function is only allowed to be lossy if the
+			 * ordering operator's result type is float4 or float8. Otherwise
+			 * we don't know how to return the distance to the executor. But
+			 * we cannot check that here, as we won't know if the distance
+			 * function is lossy until it returns *recheck = true for the
+			 * first time.
+			 */
+			so->orderByTypes[i] = get_func_rettype(skey->sk_func.fn_oid);
 			fmgr_info_copy(&(skey->sk_func), finfo, so->giststate->scanCxt);
 
 			/* Restore prior fn_extra pointers, if not first time */