diff options
author | Tom Lane <tgl@sss.pgh.pa.us> | 2010-12-03 23:49:06 -0500 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2010-12-03 23:49:06 -0500 |
commit | b576757d7ee064ada5351c2e6a36c2f7234aa1d4 (patch) | |
tree | 5ca6b09b224f7e5a970bed4a9a58f59a47824521 /src/backend/access/gist | |
parent | 04910a3ad5cd2901558da2a4fad9a2e2819348aa (diff) | |
download | postgresql-b576757d7ee064ada5351c2e6a36c2f7234aa1d4.tar.gz postgresql-b576757d7ee064ada5351c2e6a36c2f7234aa1d4.zip |
Add external documentation for KNNGIST.
Diffstat (limited to 'src/backend/access/gist')
-rw-r--r-- | src/backend/access/gist/README | 94 |
1 files changed, 49 insertions, 45 deletions
diff --git a/src/backend/access/gist/README b/src/backend/access/gist/README index 66d559d33ba..03069f02685 100644 --- a/src/backend/access/gist/README +++ b/src/backend/access/gist/README @@ -20,6 +20,7 @@ The current implementation of GiST supports: * Variable length keys * Composite keys (multi-key) + * Ordered search (nearest-neighbor search) * provides NULL-safe interface to GiST core * Concurrency * Recovery support via WAL logging @@ -32,8 +33,8 @@ Marcel Kornaker: The original algorithms were modified in several ways: -* They should be adapted to PostgreSQL conventions. For example, the SEARCH - algorithm was considerably changed, because in PostgreSQL function search +* They had to be adapted to PostgreSQL conventions. For example, the SEARCH + algorithm was considerably changed, because in PostgreSQL the search function should return one tuple (next), not all tuples at once. Also, it should release page locks between calls. * Since we added support for variable length keys, it's not possible to @@ -41,12 +42,12 @@ The original algorithms were modified in several ways: defined function picksplit doesn't have information about size of tuples (each tuple may contain several keys as in multicolumn index while picksplit could work with only one key) and pages. -* We modified original INSERT algorithm for performance reason. In particular, +* We modified original INSERT algorithm for performance reasons. In particular, it is now a single-pass algorithm. * Since the papers were theoretical, some details were omitted and we - have to find out ourself how to solve some specific problems. + had to find out ourself how to solve some specific problems. -Because of the above reasons, we have to revised interaction of GiST +Because of the above reasons, we have revised the interaction of GiST core and PostgreSQL WAL system. Moreover, we encountered (and solved) a problem of uncompleted insertions when recovering after crash, which was not touched in the paper. @@ -54,46 +55,49 @@ was not touched in the paper. Search Algorithm ---------------- -Function gettuple finds a tuple which satisfies the search -predicate. It store their state and returns next tuple under -subsequent calls. Stack contains page, its LSN and LSN of parent page -and currentposition is saved between calls. - -gettuple(search-pred) - if ( firsttime ) - push(stack, [root, 0, 0]) // page, LSN, parentLSN - currentposition=0 - end - ptr = top of stack - while(true) - latch( ptr->page, S-mode ) - if ( ptr->page->lsn != ptr->lsn ) - ptr->lsn = ptr->page->lsn - currentposition=0 - if ( ptr->parentlsn < ptr->page->nsn ) - add to stack rightlink - else - currentposition++ - end - - while(true) - currentposition = find_first_match( currentposition ) - if ( currentposition is invalid ) - unlatch( ptr->page ) - pop stack - ptr = top of stack - if (ptr is NULL) - return NULL - break loop - else if ( ptr->page is leaf ) - unlatch( ptr->page ) - return tuple - else - add to stack child page - end - currentposition++ - end - end +The search code maintains a queue of unvisited items, where an "item" is +either a heap tuple known to satisfy the search conditions, or an index +page that is consistent with the search conditions according to inspection +of its parent page's downlink item. Initially the root page is searched +to find unvisited items in it. Then we pull items from the queue. A +heap tuple pointer is just returned immediately; an index page entry +causes that page to be searched, generating more queue entries. + +The queue is kept ordered with heap tuple items at the front, then +index page entries, with any newly-added index page entry inserted +before existing index page entries. This ensures depth-first traversal +of the index, and in particular causes the first few heap tuples to be +returned as soon as possible. That is helpful in case there is a LIMIT +that requires only a few tuples to be produced. + +To implement nearest-neighbor search, the queue entries are augmented +with distance data: heap tuple entries are labeled with exact distance +from the search argument, while index-page entries must be labeled with +the minimum distance that any of their children could have. Then, +queue entries are retrieved in smallest-distance-first order, with +entries having identical distances managed as stated in the previous +paragraph. + +The search algorithm keeps an index page locked only long enough to scan +its entries and queue those that satisfy the search conditions. Since +insertions can occur concurrently with searches, it is possible for an +index child page to be split between the time we make a queue entry for it +(while visiting its parent page) and the time we actually reach and scan +the child page. To avoid missing the entries that were moved to the right +sibling, we detect whether a split has occurred by comparing the child +page's NSN to the LSN that the parent had when visited. If it did, the +sibling page is immediately added to the front of the queue, ensuring that +its items will be scanned in the same order as if they were still on the +original child page. + +As is usual in Postgres, the search algorithm only guarantees to find index +entries that existed before the scan started; index entries added during +the scan might or might not be visited. This is okay as long as all +searches use MVCC snapshot rules to reject heap tuples newer than the time +of scan start. In particular, this means that we need not worry about +cases where a parent page's downlink key is "enlarged" after we look at it. +Any such enlargement would be to add child items that we aren't interested +in returning anyway. Insert Algorithm |