diff options
author | Alexander Korotkov <akorotkov@postgresql.org> | 2022-02-07 23:20:42 +0300 |
---|---|---|
committer | Alexander Korotkov <akorotkov@postgresql.org> | 2022-02-07 23:20:42 +0300 |
commit | f1ea98a7975e15cefdb446385880a2f55224ee7d (patch) | |
tree | 8efc57dc7b2480397fc838baa0dd328630f611b6 /src/backend/access/gist/README | |
parent | 42a9e88bf6a809e6023c9d50f58cc6b9446f229d (diff) | |
download | postgresql-f1ea98a7975e15cefdb446385880a2f55224ee7d.tar.gz postgresql-f1ea98a7975e15cefdb446385880a2f55224ee7d.zip |
Reduce non-leaf keys overlap in GiST indexes produced by a sorted build
The GiST sorted build currently chooses split points according to the only page
space utilization. That may lead to higher non-leaf keys overlap and, in turn,
slower search query answers.
This commit makes the sorted build use the opclass's picksplit method. Once
four pages at the level are accumulated, the picksplit method is applied until
each split partition fits the page. Some of our split algorithms could show
significant performance degradation while processing 4-times more data at once.
But those opclasses haven't received the sorted build support and shouldn't
receive it before their split algorithms are improved.
Discussion: https://postgr.es/m/CAHqSB9jqtS94e9%3D0vxqQX5dxQA89N95UKyz-%3DA7Y%2B_YJt%2BVW5A%40mail.gmail.com
Author: Aliaksandr Kalenik, Sergei Shoulbakov, Andrey Borodin
Reviewed-by: Björn Harrtell, Darafei Praliaskouski, Andres Freund
Reviewed-by: Alexander Korotkov
Diffstat (limited to 'src/backend/access/gist/README')
-rw-r--r-- | src/backend/access/gist/README | 16 |
1 files changed, 16 insertions, 0 deletions
diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README index 25cab0047b6..efb2730e181 100644 --- a/src/backend/access/gist/README +++ b/src/backend/access/gist/README @@ -26,6 +26,7 @@ The current implementation of GiST supports: * Concurrency * Recovery support via WAL logging * Buffering build algorithm + * Sorted build method The support for concurrency implemented in PostgreSQL was developed based on the paper "Access Methods for Next-Generation Database Systems" by @@ -414,6 +415,21 @@ emptied yet; tuples never move upwards in the tree. The final emptying loops through buffers at a given level until all buffers at that level have been emptied, and then moves down to the next level. +Sorted build method +------------------- + +Sort all input tuples, pack them into GiST leaf pages in the sorted order, +and create downlinks and internal pages as we go. This method builds the index +from the bottom up, similar to how the B-tree index is built. + +The sorted method is used if the operator classes for all columns have a +"sortsupport" defined. Otherwise, we fall back on inserting tuples one by one +with optional buffering. + +Sorting GiST build requires good linearization of the sort opclass. That is not +always the case in multidimensional data. To tackle the anomalies, we buffer +index tuples and apply a picksplit function that can be multidimensional-aware. + Bulk delete algorithm (VACUUM) ------------------------------ |