aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access/index/indexam.c
Commit message (Collapse)AuthorAge
...
* HOT updates. When we update a tuple without changing any of its indexedTom Lane2007-09-20
| | | | | | | | | | | | columns, and the new version can be stored on the same heap page, we no longer generate extra index entries for the new version. Instead, index searches follow the HOT-chain links to ensure they find the correct tuple version. In addition, this patch introduces the ability to "prune" dead tuples on a per-page basis, without having to do a complete VACUUM pass to recover space. VACUUM is still needed to clean up dead index entries, however. Pavan Deolasee, with help from a bunch of other people.
* Fix up pgstats counting of live and dead tuples to recognize that committedTom Lane2007-05-27
| | | | | | | | | | | and aborted transactions have different effects; also teach it not to assume that prepared transactions are always committed. Along the way, simplify the pgstats API by tying counting directly to Relations; I cannot detect any redeeming social value in having stats pointers in HeapScanDesc and IndexScanDesc structures. And fix a few corner cases in which counts might be missed because the relation's pgstat_info pointer hadn't been set.
* Update CVS HEAD for 2007 copyright. Back branches are typically notBruce Momjian2007-01-05
| | | | back-stamped for this.
* Restructure operator classes to allow improved handling of cross-data-typeTom Lane2006-12-23
| | | | | | | | | | | | | | | | cases. Operator classes now exist within "operator families". While most families are equivalent to a single class, related classes can be grouped into one family to represent the fact that they are semantically compatible. Cross-type operators are now naturally adjunct parts of a family, without having to wedge them into a particular opclass as we had done originally. This commit restructures the catalogs and cleans up enough of the fallout so that everything still works at least as well as before, but most of the work needed to actually improve the planner's behavior will come later. Also, there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way to create a new family right now is to allow CREATE OPERATOR CLASS to make one by default. I owe some more documentation work, too. But that can all be done in smaller pieces once this infrastructure is in place.
* pgindent run for 8.2.Bruce Momjian2006-10-04
|
* Change the relation_open protocol so that we obtain lock on a relationTom Lane2006-07-31
| | | | | | | | | | | | (table or index) before trying to open its relcache entry. This fixes race conditions in which someone else commits a change to the relation's catalog entries while we are in process of doing relcache load. Problems of that ilk have been reported sporadically for years, but it was not really practical to fix until recently --- for instance, the recent addition of WAL-log support for in-place updates helped. Along the way, remove pg_am.amconcurrent: all AMs are now expected to support concurrent update.
* Rewrite btree index scans to work a page at a time in all cases (bothTom Lane2006-05-07
| | | | | | | | | | | | | | | | btgettuple and btgetmulti). This eliminates the problem of "re-finding" the exact stopping point, since the stopping point is effectively always a page boundary, and index items are never moved across pre-existing page boundaries. A small penalty is that the keys_are_unique optimization is effectively disabled (and, therefore, is removed in this patch), causing us to apply _bt_checkkeys() to at least one more tuple than necessary when looking up a unique key. However, the advantages for non-unique cases seem great enough to accept this tradeoff. Aside from simplifying and (sometimes) speeding up the indexscan code, this will allow us to reimplement btbulkdelete as a largely sequential scan instead of index-order traversal, thereby significantly reducing the cost of VACUUM. Those changes will come in a separate patch. Original patch by Heikki Linnakangas, rework by Tom Lane.
* Clean up API for ambulkdelete/amvacuumcleanup as per today's discussion.Tom Lane2006-05-02
| | | | | | This formulation requires every AM to provide amvacuumcleanup, unlike before, but it's surely a whole lot cleaner. Also, add an 'amstorage' column to pg_am so that we can get rid of hardwired knowledge in DefineOpClass().
* Update copyright for 2006. Update scripts.Bruce Momjian2006-03-05
|
* Skip ambulkdelete scan if there's nothing to delete and the index is notTom Lane2006-02-11
| | | | | | | | | | | partial. None of the existing AMs do anything useful except counting tuples when there's nothing to delete, and we can get a tuple count from the heap as long as it's not a partial index. (hash actually can skip anyway because it maintains a tuple count in the index metapage.) GIST is not currently able to exploit this optimization because, due to failure to index NULLs, GIST is always effectively partial. Possibly we should fix that sometime. Simon Riggs w/ some review by Tom Lane.
* Revert based on Tom's recommendation:Bruce Momjian2006-02-11
| | | | | > Allow VACUUM to complete faster by avoiding scanning the indexes when no > rows were removed from the heap by the VACUUM.
* Allow VACUUM to complete faster by avoiding scanning the indexes when noBruce Momjian2006-02-11
| | | | | | rows were removed from the heap by the VACUUM. Simon Riggs
* Tweak indexscan machinery to avoid taking an AccessShareLock on an indexTom Lane2005-12-03
| | | | | | | if we already have a stronger lock due to the index's table being the update target table of the query. Same optimization I applied earlier at the table level. There doesn't seem to be much interest in the more radical idea of not locking indexes at all, so do what we can ...
* Standard pgindent run for 8.1.Bruce Momjian2005-10-15
|
* Revise pgstats stuff to fix the problems with not counting accessesTom Lane2005-10-06
| | | | | | | generated by bitmap index scans. Along the way, simplify and speed up the code for counting sequential and index scans; it was both confusing and inefficient to be taking care of that in the per-tuple loops, IMHO. initdb forced because of internal changes in pg_stat view definitions.
* Concurrency for GiSTTeodor Sigaev2005-06-27
| | | | | | | | | | | | | | | | | | - full concurrency for insert/update/select/vacuum: - select and vacuum never locks more than one page simultaneously - select (gettuple) hasn't any lock across it's calls - insert never locks more than two page simultaneously: - during search of leaf to insert it locks only one page simultaneously - while walk upward to the root it locked only parent (may be non-direct parent) and child. One of them X-lock, another may be S- or X-lock - 'vacuum full' locks index - improve gistgetmulti - simplify XLOG records Fix bug in index_beginscan_internal: LockRelation may clean rd_aminfo structure, so move GET_REL_PROCEDURE after LockRelation
* Change the planner to allow indexscan qualification clauses to useTom Lane2005-06-13
| | | | | | | | | nonconsecutive columns of a multicolumn index, as per discussion around mid-May (pghackers thread "Best way to scan on-disk bitmaps"). This turns out to require only minimal changes in btree, and so far as I can see none at all in GiST. btcostestimate did need some work, but its original assumption that index selectivity == heap selectivity was quite bogus even before this.
* Arrange to cache fmgr lookup information for an index's access methodTom Lane2005-05-27
| | | | | | | | | | | | routines in the index's relcache entry, instead of doing a fresh fmgr_info on every index access. We were already doing this for the index's opclass support functions; not sure why we didn't think to do it for the AM functions too. This supersedes the former method of caching (only) amgettuple in indexscan scan descriptors; it's an improvement because the function lookup can be amortized across multiple statements instead of being repeated for each statement. Even though lookup for builtin functions is pretty cheap, this seems to drop a percent or two off some simple benchmarks.
* Fix latent bug in ExecSeqRestrPos: it leaves the plan node's result slotTom Lane2005-05-15
| | | | | | | in an inconsistent state. (This is only latent because in reality ExecSeqRestrPos is dead code at the moment ... but someday maybe it won't be.) Add some comments about what the API for plan node mark/restore actually is, because it's not immediately obvious.
* Completion of project to use fixed OIDs for all system catalogs andTom Lane2005-04-14
| | | | | | | indexes. Replace all heap_openr and index_openr calls by heap_open and index_open. Remove runtime lookups of catalog OID numbers in various places. Remove relcache's support for looking up system catalogs by name. Bulky but mostly very boring patch ...
* First steps towards index scans with heap access decoupled from indexTom Lane2005-03-27
| | | | | | | | | | access: define new index access method functions 'amgetmulti' that can fetch multiple TIDs per call. (The functions exist but are totally untested as yet.) Since I was modifying pg_am anyway, remove the no-longer-needed 'rel' parameter from amcostestimate functions, and also remove the vestigial amowner column that was creating useless work for Alvaro's shared-object-dependencies project. Initdb forced due to changes in pg_am.
* Convert index-related tuple handling routines from char 'n'/' ' to boolTom Lane2005-03-21
| | | | | | | | | | convention for isnull flags. Also, remove the useless InsertIndexResult return struct from index AM aminsert calls --- there is no reason for the caller to know where in the index the tuple was inserted, and we were wasting a palloc cycle per insert to deliver this uninteresting value (plus nontrivial complexity in some AMs). I forced initdb because of the change in the signature of the aminsert routines, even though nothing really looks at those pg_proc entries...
* Tag appropriate files for rc3PostgreSQL Daemon2004-12-31
| | | | | | | | Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...
* Repair possible failure to update hint bits back to disk, perTom Lane2004-10-15
| | | | | | | | | | http://archives.postgresql.org/pgsql-hackers/2004-10/msg00464.php. This fix is intended to be permanent: it moves the responsibility for calling SetBufferCommitInfoNeedsSave() into the tqual.c routines, eliminating the requirement for callers to test whether t_infomask changed. Also, tighten validity checking on buffer IDs in bufmgr.c --- several routines were paranoid about out-of-range shared buffer numbers but not about out-of-range local ones, which seems a tad pointless.
* Adjust index locking rules as per my proposal of earlier today. YouTom Lane2004-09-30
| | | | | | now are supposed to take some kind of lock on an index whenever you are going to access the index contents, rather than relying only on a lock on the parent table.
* Update copyright to 2004.Bruce Momjian2004-08-29
|
* Tweak indexscan and seqscan code to arrange that steps from one page toTom Lane2004-04-21
| | | | | | | | | the next are handled by ReleaseAndReadBuffer rather than separate ReleaseBuffer and ReadBuffer calls. This cuts the number of acquisitions of the BufMgrLock by a factor of 2 (possibly more, if an indexscan happens to pull successive rows from the same heap page). Unfortunately this doesn't seem enough to get us out of the recently discussed context-switch storm problem, but it's surely worth doing anyway.
* $Header: -> $PostgreSQL Changes ...PostgreSQL Daemon2003-11-29
|
* Message editing: remove gratuitous variations in message wording, standardizePeter Eisentraut2003-09-25
| | | | | terms, add some clarifications, fix some untranslatable attempts at dynamic message building.
* Another pgindent run with updated typedefs.Bruce Momjian2003-08-08
|
* Update copyrights to 2003.Bruce Momjian2003-08-04
|
* pgindent run.Bruce Momjian2003-08-04
|
* Error message editing in backend/access.Tom Lane2003-07-21
|
* Modify keys_are_unique optimization to release buffer pins before itTom Lane2003-03-24
| | | | | returns NULL. This avoids out-of-buffers failures during many-way indexscans, as in Shraibman's complaint of 21-Mar.
* Adjust amrescan code so that it's allowed to call index_rescan with aTom Lane2003-03-23
| | | | | | NULL key pointer, indicating that the existing scan key should be reused. This behavior isn't used yet but will be needed for my planned fix to the keys_are_unique code.
* More infrastructure for btree compaction project. Tree-traversal codeTom Lane2003-02-22
| | | | | | | | now knows what to do upon hitting a dead page (in theory anyway, it's untested...). Add a post-VACUUM-cleanup entry point for index AMs, to provide a place for dead-page scavenging to happen. Also, fix oversight that broke btpo_prev links in temporary indexes. initdb forced due to additions in pg_am.
* Fix for bug #866. 7.3 contains new logic for avoiding redundant calls toTom Lane2003-01-08
| | | | | | the index AM when we know we are fetching a unique row. However, this logic did not consider the possibility that it would be asked to fetch backwards. Also fix mark/restore to work correctly in this scenario.
* pgindent run.Bruce Momjian2002-09-04
|
* Update copyright to 2002.Bruce Momjian2002-06-20
|
* Wups, managed to break ANALYZE with one aspect of that heap_fetch change.Tom Lane2002-05-24
|
* Mark index entries "killed" when they are no longer visible to anyTom Lane2002-05-24
| | | | | | | | | | | | | | | transaction, so as to avoid returning them out of the index AM. Saves repeated heap_fetch operations on frequently-updated rows. Also detect queries on unique keys (equality to all columns of a unique index), and don't bother continuing scan once we have found first match. Killing is implemented in the btree and hash AMs, but not yet in rtree or gist, because there isn't an equally convenient place to do it in those AMs (the outer amgetnext routine can't do it without re-pinning the index page). Did some small cleanup on APIs of HeapTupleSatisfies, heap_fetch, and index_insert to make this a little easier.
* Restructure indexscan API (index_beginscan, index_getnext) perTom Lane2002-05-20
| | | | | | | yesterday's proposal to pghackers. Also remove unnecessary parameters to heap_beginscan, heap_rescan. I modified pg_proc.h to reflect the new numbers of parameters for the AM interface routines, but did not force an initdb because nothing actually looks at those fields.
* Opclasses live in namespaces. I also took the opportunity to createTom Lane2002-04-17
| | | | | | | an 'opclass owner' column in pg_opclass. Nothing is done with it at present, but since there are plans to invent a CREATE OPERATOR CLASS command soon, we'll probably want DROP OPERATOR CLASS too, which suggests that a notion of ownership would be a good idea.
* pg_class has a relnamespace column. You can create and access tablesTom Lane2002-03-26
| | | | | | in schemas other than the system namespace; however, there's no search path yet, and not all operations work yet on tables outside the system namespace.
* Fix problem reported by Alex Korn: if a relation has been dropped andTom Lane2001-11-02
| | | | | | | | | | | | | | | | | | recreated since the start of our transaction, our first reference to it errored out because we'd try to reuse our old relcache entry for it. Do this by accepting SI inval messages just before relcache search in heap_openr, so that dead relcache entries will be flushed before we search. Also, break heap_open/openr into two pairs of routines, relation_open(r) and heap_open(r). The relation_open routines make no tests on relkind and so can be used to open anything that has a pg_class entry. The heap_open routines are wrappers that add a relkind test to preserve their established behavior. Use the relation_open routines in several places that had various kluge solutions for opening rels that might be either heap or index rels. Also, remove the old 'heap stats' code that's been superseded by Jan's stats collector, and clean up some inconsistencies in error reporting between the different types of ALTER TABLE.
* pgindent run on all C files. Java run to follow. initdb/regressionBruce Momjian2001-10-25
| | | | tests pass.
* Rearrange fmgr.c and relcache so that it's possible to keep FmgrInfoTom Lane2001-10-06
| | | | | | | | | lookup info in the relcache for index access method support functions. This makes a huge difference for dynamically loaded support functions, and should save a few cycles even for built-in ones. Also tweak dfmgr.c so that load_external_function is called only once, not twice, when doing fmgr_info for a dynamically loaded function. All per performance gripe from Teodor Sigaev, 5-Oct-01.
* Restructure index AM interface for index building and index tuple deletion,Tom Lane2001-07-15
| | | | | | | | | | | | | | | | | | | | | | | | | per previous discussion on pghackers. Most of the duplicate code in different AMs' ambuild routines has been moved out to a common routine in index.c; this means that all index types now do the right things about inserting recently-dead tuples, etc. (I also removed support for EXTEND INDEX in the ambuild routines, since that's about to go away anyway, and it cluttered the code a lot.) The retail indextuple deletion routines have been replaced by a "bulk delete" routine in which the indexscan is inside the access method. I haven't pushed this change as far as it should go yet, but it should allow considerable simplification of the internal bookkeeping for deletions. Also, add flag columns to pg_am to eliminate various hardcoded tests on AM OIDs, and remove unused pg_am columns. Fix rtree and gist index types to not attempt to store NULLs; before this, gist usually crashed, while rtree managed not to crash but computed wacko bounding boxes for NULL entries (which might have had something to do with the performance problems we've heard about occasionally). Add AtEOXact routines to hash, rtree, and gist, all of which have static state that needs to be reset after an error. We discovered this need long ago for btree, but missed the other guys. Oh, one more thing: concurrent VACUUM is now the default.
* Statistical system views (yet without the config stuff, butJan Wieck2001-06-22
| | | | | | | it's hard to keep such massive changes in sync with the tree so I need to get it in and work from there now). Jan
* Clean up some minor problems exposed by further thought about Panon's bugTom Lane2001-06-01
| | | | | | | | | | | | | | report on old-style functions invoked by RI triggers. We had a number of other places that were being sloppy about which memory context FmgrInfo subsidiary data will be allocated in. Turns out none of them actually cause a problem in 7.1, but this is for arcane reasons such as the fact that old-style triggers aren't supported anyway. To avoid getting burnt later, I've restructured the trigger support so that we don't keep trigger FmgrInfo structs in relcache memory. Some other related cleanups too: it's not really necessary to call fmgr_info at all while setting up the index support info in relcache entries, because those ScanKeyEntry structs are never used to invoke the functions. This should speed up relcache initialization a tiny bit.