aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access/gist/README
diff options
context:
space:
mode:
authorAlexander Korotkov <akorotkov@postgresql.org>2022-02-07 23:20:42 +0300
committerAlexander Korotkov <akorotkov@postgresql.org>2022-02-07 23:20:42 +0300
commitf1ea98a7975e15cefdb446385880a2f55224ee7d (patch)
tree8efc57dc7b2480397fc838baa0dd328630f611b6 /src/backend/access/gist/README
parent42a9e88bf6a809e6023c9d50f58cc6b9446f229d (diff)
downloadpostgresql-f1ea98a7975e15cefdb446385880a2f55224ee7d.tar.gz
postgresql-f1ea98a7975e15cefdb446385880a2f55224ee7d.zip
Reduce non-leaf keys overlap in GiST indexes produced by a sorted build
The GiST sorted build currently chooses split points according to the only page space utilization. That may lead to higher non-leaf keys overlap and, in turn, slower search query answers. This commit makes the sorted build use the opclass's picksplit method. Once four pages at the level are accumulated, the picksplit method is applied until each split partition fits the page. Some of our split algorithms could show significant performance degradation while processing 4-times more data at once. But those opclasses haven't received the sorted build support and shouldn't receive it before their split algorithms are improved. Discussion: https://postgr.es/m/CAHqSB9jqtS94e9%3D0vxqQX5dxQA89N95UKyz-%3DA7Y%2B_YJt%2BVW5A%40mail.gmail.com Author: Aliaksandr Kalenik, Sergei Shoulbakov, Andrey Borodin Reviewed-by: Björn Harrtell, Darafei Praliaskouski, Andres Freund Reviewed-by: Alexander Korotkov
Diffstat (limited to 'src/backend/access/gist/README')
-rw-r--r--src/backend/access/gist/README16
1 files changed, 16 insertions, 0 deletions
diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README
index 25cab0047b6..efb2730e181 100644
--- a/src/backend/access/gist/README
+++ b/src/backend/access/gist/README
@@ -26,6 +26,7 @@ The current implementation of GiST supports:
* Concurrency
* Recovery support via WAL logging
* Buffering build algorithm
+ * Sorted build method
The support for concurrency implemented in PostgreSQL was developed based on
the paper "Access Methods for Next-Generation Database Systems" by
@@ -414,6 +415,21 @@ emptied yet; tuples never move upwards in the tree. The final emptying loops
through buffers at a given level until all buffers at that level have been
emptied, and then moves down to the next level.
+Sorted build method
+-------------------
+
+Sort all input tuples, pack them into GiST leaf pages in the sorted order,
+and create downlinks and internal pages as we go. This method builds the index
+from the bottom up, similar to how the B-tree index is built.
+
+The sorted method is used if the operator classes for all columns have a
+"sortsupport" defined. Otherwise, we fall back on inserting tuples one by one
+with optional buffering.
+
+Sorting GiST build requires good linearization of the sort opclass. That is not
+always the case in multidimensional data. To tackle the anomalies, we buffer
+index tuples and apply a picksplit function that can be multidimensional-aware.
+
Bulk delete algorithm (VACUUM)
------------------------------