aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access/gin/ginbtree.c
Commit message (Collapse)AuthorAge
...
* Get rid of the post-recovery cleanup step of GIN page splits.Heikki Linnakangas2013-11-27
| | | | | | | | | | | | | | | | | | Replace it with an approach similar to what GiST uses: when a page is split, the left sibling is marked with a flag indicating that the parent hasn't been updated yet. When the parent is updated, the flag is cleared. If an insertion steps on a page with the flag set, it will finish split before proceeding with the insertion. The post-recovery cleanup mechanism was never totally reliable, as insertion to the parent could fail e.g because of running out of memory or disk space, leaving the tree in an inconsistent state. This also divides the responsibility of WAL-logging more clearly between the generic ginbtree.c code, and the parts specific to entry and posting trees. There is now a common WAL record format for insertions and deletions, which is written by ginbtree.c, followed by tree-specific payload, which is returned by the placetopage- and split- callbacks.
* More GIN refactoring.Heikki Linnakangas2013-11-27
| | | | | | | | | | | | Separate the insertion payload from the more static portions of GinBtree. GinBtree now only contains information related to searching the tree, and the information of what to insert is passed separately. Add root block number to GinBtree, instead of passing it around all the functions as argument. Split off ginFinishSplit() from ginInsertValue(). ginFinishSplit is responsible for finding the parent and inserting the downlink to it.
* More GIN refactoring.Heikki Linnakangas2013-11-20
| | | | | | | | | | | Split off the portion of ginInsertValue that inserts the tuple to current level into a separate function, ginPlaceToPage. ginInsertValue's charter is now to recurse up the tree to insert the downlink, when a page split is required. This is in preparation for a patch to change the way incomplete splits are handled, which will need to do these operations separately. And IMHO makes the code more readable anyway.
* Refactor the internal GIN B-tree interface for forming a downlink.Heikki Linnakangas2013-11-20
| | | | | | This creates a new gin-btree callback function for creating a downlink for a page. Previously, ginxlog.c duplicated the logic used during normal operation.
* Further GIN refactoring.Heikki Linnakangas2013-11-20
| | | | | Merge some functions that were always called together. Makes the code little bit more readable.
* Fix race condition in GIN posting tree page deletion.Heikki Linnakangas2013-11-08
| | | | | | | | | | | | | | | | | | If a page is deleted, and reused for something else, just as a search is following a rightlink to it from its left sibling, the search would continue scanning whatever the new contents of the page are. That could lead to incorrect query results, or even something more curious if the page is reused for a different kind of a page. To fix, modify the search algorithm to lock the next page before releasing the previous one, and refrain from deleting pages from the leftmost branch of the tree. Add a new Concurrency section to the README, explaining why this works. There is a lot more one could say about concurrency in GIN, but that's for another patch. Backpatch to all supported versions.
* Misc GIN refactoring.Heikki Linnakangas2013-11-06
| | | | | | | | | | | | | | | Merge the isEnoughSpace and placeToPage functions in the b-tree interface into one function that tries to put a tuple on page, and returns false if it doesn't fit. Move createPostingTree function to gindatapage.c, and change its contract so that it can be passed more items than fit on the root page. It's in a better position than the callers to know how many items fit. Move ginMergeItemPointers out of gindatapage.c, into a separate file. These changes make no difference now, but reduce the footprint of Alexander Korotkov's upcoming patch to pack item pointers more tightly.
* pgindent run for release 9.3Bruce Momjian2013-05-29
| | | | | This is the first run of the Perl-based pgindent script. Also update pgindent instructions.
* Remove PageSetTLI and rename pd_tli to pd_checksumSimon Riggs2013-03-18
| | | | | | | | | | | | | | Remove use of PageSetTLI() from all page manipulation functions and adjust README to indicate change in the way we make changes to pages. Repurpose those bytes into the pd_checksum field and explain how that works in comments about page header. Refactoring ahead of actual feature patch which would make use of the checksum field, arriving later. Jeff Davis, with comments and doc changes by Simon Riggs Direction suggested by Robert Haas; many others providing review comments.
* Update copyrights for 2013Bruce Momjian2013-01-01
| | | | | Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
* Remove unreachable codePeter Eisentraut2012-07-16
| | | | | | | The Solaris Studio compiler warns about these instances, unlike more mainstream compilers such as gcc. But manual inspection showed that the code is clearly not reachable, and we hope no worthy compiler will complain about removing this code.
* Cosmetic cleanup of ginInsertValue().Tom Lane2012-07-13
| | | | | | | | Make it clearer that the passed stack mustn't be empty, and that we are not supposed to fall off the end of the stack in the main loop. Tighten the loop that extracts the root block number, too. Markus Wanner and Tom Lane
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Remove unnecessary #include references, per pgrminclude script.Bruce Momjian2011-09-01
|
* Fix GIN to support null keys, empty and null items, and full index scans.Tom Lane2011-01-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Per my recent proposal(s). Null key datums can now be returned by extractValue and extractQuery functions, and will be stored in the index. Also, placeholder entries are made for indexable items that are NULL or contain no keys according to extractValue. This means that the index is now always complete, having at least one entry for every indexed heap TID, and so we can get rid of the prohibition on full-index scans. A full-index scan is implemented much the same way as partial-match scans were already: we build a bitmap representing all the TIDs found in the index, and then drive the results off that. Also, introduce a concept of a "search mode" that can be requested by extractQuery when the operator requires matching to empty items (this is just as cheap as matching to a single key) or requires a full index scan (which is not so cheap, but it sure beats failing or giving wrong answers). The behavior remains backward compatible for opclasses that don't return any null keys or request a non-default search mode. Using these features, we can now make the GIN index opclass for anyarray behave in a way that matches the actual anyarray operators for &&, <@, @>, and = ... which it failed to do before in assorted corner cases. This commit fixes the core GIN code and ginarrayprocs.c, updates the documentation, and adds some simple regression test cases for the new behaviors using the array operators. The tsearch and contrib GIN opclass support functions still need to be looked over and probably fixed. Another thing I intend to fix separately is that this is pretty inefficient for cases where more than one scan condition needs a full-index search: we'll run duplicate GinScanEntrys, each one of which builds a large bitmap. There is some existing logic to merge duplicate GinScanEntrys but it needs refactoring to make it work for entries belonging to different scan keys. Note that most of gin.h has been split out into a new file gin_private.h, so that gin.h doesn't export anything that's not supposed to be used by GIN opclasses or the rest of the backend. I did quite a bit of other code beautification work as well, mostly fixing comments and choosing more appropriate names for things.
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Generalize concept of temporary relations to "relation persistence".Robert Haas2010-12-13
| | | | | | | | | | | | | | | This commit replaces pg_class.relistemp with pg_class.relpersistence; and also modifies the RangeVar node type to carry relpersistence rather than istemp. It also removes removes rd_istemp from RelationData and instead performs the correct computation based on relpersistence. For clarity, we add three new macros: RelationNeedsWAL(), RelationUsesLocalBuffers(), and RelationUsesTempNamespace(), so that we can clarify the purpose of each check that previous depended on rd_istemp. This is intended as infrastructure for the upcoming unlogged tables patch, as well as for future possible work on global temporary tables.
* Fix a passel of inappropriately-named global functions in GIN.Tom Lane2010-10-17
| | | | | | | | | | | | | | | | The GIN code has absolutely no business exporting GIN-specific functions with names as generic as compareItemPointers() or newScanKey(); that's just trouble waiting to happen. I got annoyed about this again just now and decided to fix it. This commit ensures that all global symbols defined in access/gin/ have names including "gin" or "Gin". There were a couple of cases, like names involving "PostingItem", where arguably the names were already sufficiently nongeneric; but I figured as long as I was risking creating merge problems for unapplied GIN patches I might as well impose a uniform policy. I didn't touch any static symbol names. There might be some places where it'd be appropriate to rename some static functions to match siblings that are exported, but I'll leave that for another time.
* Improve GIN indexscan cost estimation.Tom Lane2010-10-17
| | | | | | | | | | | | | The better estimate requires more statistics than we previously stored: in particular, counts of "entry" versus "data" pages within the index, as well as knowledge of the number of distinct key values. We collect this information during initial index build and update it during VACUUM, storing the info in new fields on the index metapage. No initdb is required because these fields will read as zeroes in a pre-existing index, and the new gincostestimate code is coded to behave (reasonably) sanely if they are zeroes. Teodor Sigaev, reviewed by Jan Urbanski, Tom Lane, and Itagaki Takahiro.
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Rewrite the rbtree routines so that an RBNode is the first field of theTom Lane2010-08-01
| | | | | | | | | | | | | | | | | | | | struct representing a tree entry, rather than being a separately allocated piece of storage. This API is at least as clean as the old one (if not more so --- there were some bizarre choices in there) and it permits a very substantial memory savings, on the order of 2X in ginbulk.c's usage. Also, fix minor memory leaks in code called by ginEntryInsert, in particular in ginInsertValue and entryFillRoot, as well as ginEntryInsert itself. These leaks resulted in the GIN index build context continuing to bloat even after we'd filled it to maintenance_work_mem and started to dump data out to the index. In combination these fixes restore the GIN index build code to honoring the maintenance_work_mem limit about as well as it did in 8.4. Speed seems on par with 8.4 too, maybe even a bit faster, for a non-pathological case in which HEAD was formerly slower. Back-patch to 9.0 so we don't have a performance regression from 8.4.
* Update copyright for the year 2010.Bruce Momjian2010-01-02
|
* Update copyright for 2009.Bruce Momjian2009-01-01
|
* Improve our #include situation by moving pointer types away from theAlvaro Herrera2008-06-19
| | | | | | | corresponding struct definitions. This allows other headers to avoid including certain highly-loaded headers such as rel.h and relscan.h, instead using just relcache.h, heapam.h or genam.h, which are more lightweight and thus cause less unnecessary dependencies.
* Restructure some header files a bit, in particular heapam.h, by removing someAlvaro Herrera2008-05-12
| | | | | | | | | | | | unnecessary #include lines in it. Also, move some tuple routine prototypes and macros to htup.h, which allows removal of heapam.h inclusion from some .c files. For this to work, a new header file access/sysattr.h needed to be created, initially containing attribute numbers of system columns, for pg_dump usage. While at it, make contrib ltree, intarray and hstore header files more consistent with our header style.
* Update copyrights in source tree to 2008.Bruce Momjian2008-01-01
|
* pgindent run for 8.3.Bruce Momjian2007-11-15
|
* Move call of MarkBufferDirty() before XLogInsert() as required.Teodor Sigaev2007-06-05
| | | | | Many thanks to Heikki Linnakangas <heikki@enterprisedb.com> for his sharp eyes.
* Fix a few typos in comments in GiN.Neil Conway2007-02-01
|
* Update CVS HEAD for 2007 copyright. Back branches are typically notBruce Momjian2007-01-05
| | | | back-stamped for this.
* Fix some typos in comments.Neil Conway2006-11-12
|
* pgindent run for 8.2.Bruce Momjian2006-10-04
|
* Remove 576 references of include files that were not needed.Bruce Momjian2006-07-14
|
* Correct cheking in findParents(). iTeodor Sigaev2006-05-29
| | | | From Andreas Seltenreich <andreas+pg@gate450.dyndns.org>
* Fix findParents() in case of multiple levels to find.Teodor Sigaev2006-05-26
| | | | By Andreas Seltenreich <andreas+pg@gate450.dyndns.org>
* GIN: Generalized Inverted iNdex.Teodor Sigaev2006-05-02
text[], int4[], Tsearch2 support for GIN.