aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access
Commit message (Collapse)AuthorAge
* Restore correct btree preprocessing of "indexedcol IS NULL" conditions.Tom Lane2011-06-29
| | | | | | | | | | | | Such a condition is unsatisfiable in combination with any other type of btree-indexable condition (since we assume btree operators are always strict). 8.3 and 8.4 had an explicit test for this, which I removed in commit 29c4ad98293e3c5cb3fcdd413a3f4904efff8762, mistakenly thinking that the case would be subsumed by the more general handling of IS (NOT) NULL added in that patch. Put it back, and improve the comments about it, and add a regression test case. Per bug #6079 from Renat Nasyrov, and analysis by Dean Rasheed.
* Move the PredicateLockRelation() call from nodeSeqscan.c to heapam.c. It'sHeikki Linnakangas2011-06-29
| | | | | | | | | | | | | | | | | | | | more consistent that way, since all the other PredicateLock* calls are made in various heapam.c and index AM functions. The call in nodeSeqscan.c was unnecessarily aggressive anyway, there's no need to try to lock the relation every time a tuple is fetched, it's enough to do it once. This has the user-visible effect that if a seq scan is initialized in the executor, but never executed, we now acquire the predicate lock on the heap relation anyway. We could avoid that by taking the lock on the first heap_getnext() call instead, but it doesn't seem worth the trouble given that it feels more natural to do it in heap_beginscan(). Also, remove the retail PredicateLockTuple() calls from heap_getnext(). In a seqscan, started with heap_begin(), we're holding a whole-relation predicate lock on the heap so there's no need to lock the tuples individually. Kevin Grittner and me
* Unify spelling of "canceled", "canceling", "cancellation"Peter Eisentraut2011-06-29
| | | | | We had previously (af26857a2775e7ceb0916155e931008c2116632f) established the U.S. spellings as standard.
* Introduce compact WAL record for the common case of commit (non-DDL).Simon Riggs2011-06-28
| | | | | | | | XLOG_XACT_COMMIT_COMPACT leaves out invalidation messages and relfilenodes, saving considerable space for the vast majority of transaction commits. XLOG_XACT_COMMIT keeps same definition as XLOG_PAGE_MAGIC 0xD067 and earlier. Leonardo Francalanci and Simon Riggs
* Reduce impact of btree page reuse on Hot Standby by fixing off-by-1 error.Simon Riggs2011-06-27
| | | | | | | | | WAL records of type XLOG_BTREE_REUSE_PAGE were generated using a latestRemovedXid one higher than actually needed because xid used was page opaque->btpo.xact rather than an actually removed xid. Noticed on an otherwise quiet system by Noah Misch. Noah Misch and Simon Riggs
* Allow callers to pass a missing_ok flag when opening a relation.Robert Haas2011-06-27
| | | | | | | | | | | | | Since the names try_relation_openrv() and try_heap_openrv() don't seem quite appropriate, rename the functions to relation_openrv_extended() and heap_openrv_extended(). This is also more general, if we have a future need for additional parameters that are of interest to only a few callers. This is infrastructure for a forthcoming patch to allow get_object_address() to take a missing_ok argument as well. Patch by me, review by Noah Misch.
* Try again to make the visibility map crash safe.Robert Haas2011-06-27
| | | | | My previous attempt was quite a bit less than half-baked with respect to heap_update().
* Avoid having two copies of the HOT-chain search logic.Robert Haas2011-06-27
| | | | | | | | | | | | | It's been like this since HOT was originally introduced, but the logic is complex enough that this is a recipe for bugs, as we've already found out with SSI. So refactor heap_hot_search_buffer() so that it can satisfy the needs of index_getnext(), and make index_getnext() use that rather than duplicating the logic. This change was originally proposed by Heikki Linnakangas as part of a larger refactoring oriented towards allowing index-only scans. I extracted and adjusted this part, since it seems to have independent merit. Review by Jeff Davis.
* Make the visibility map crash-safe.Robert Haas2011-06-21
| | | | | | | | | | | | | | | | | | | | This involves two main changes from the previous behavior. First, when we set a bit in the visibility map, emit a new WAL record of type XLOG_HEAP2_VISIBLE. Replay sets the page-level PD_ALL_VISIBLE bit and the visibility map bit. Second, when inserting, updating, or deleting a tuple, we can no longer get away with clearing the visibility map bit after releasing the lock on the corresponding heap page, because an intervening crash might leave the visibility map bit set and the page-level bit clear. Making this work requires a bit of interface refactoring. In passing, a few minor but related cleanups: change the test in visibilitymap_set and visibilitymap_clear to throw an error if the wrong page (or no page) is pinned, rather than silently doing nothing; this case should never occur. Also, remove duplicate definitions of InvalidXLogRecPtr. Patch by me, review by Noah Misch.
* Message style and spelling improvementsPeter Eisentraut2011-06-22
|
* pgindent run of recent SSI changes. Also, remove an unnecessary #include.Heikki Linnakangas2011-06-16
| | | | Kevin Grittner
* Respect Hot Standby controls while recycling btree index pages.Simon Riggs2011-06-16
| | | | | | | | | | | | | | Btree pages were recycled after VACUUM deletes all records on a page and then a subsequent VACUUM occurs after the RecentXmin horizon is reached. Using RecentXmin meant that we did not respond correctly to the user controls provide to avoid Hot Standby conflicts and so spurious conflicts could be generated in some workload combinations. We now reuse pages only when we reach RecentGlobalXmin, which can be much later in the presence of long running queries and is also controlled by vacuum_defer_cleanup_age and hot_standby_feedback. Noah Misch and Simon Riggs
* Make non-MVCC snapshots exempt from predicate locking. Scans with non-MVCCHeikki Linnakangas2011-06-15
| | | | | | | | snapshots, like in REINDEX, are basically non-transactional operations. The DDL operation itself might participate in SSI, but there's separate functions for that. Kevin Grittner and Dan Ports, with some changes by me.
* Oops, forgot to change the order of entries in 2PC callback arrays when IHeikki Linnakangas2011-06-14
| | | | renumbered the resource managers. This should fix the buildfarm..
* Work around gcc 4.6.0 bug that breaks WAL replay.Tom Lane2011-06-10
| | | | | | | | | | | | | ReadRecord's habit of using both direct references to tmpRecPtr and references to *RecPtr (which is pointing at tmpRecPtr) triggers an optimization bug in gcc 4.6.0, which apparently has forgotten about aliasing rules. Avoid the compiler bug, and make the code more readable to boot, by getting rid of the direct references. Improve the comments while at it. Back-patch to all supported versions, in case they get built with 4.6.0. Tom Lane, with some cosmetic suggestions from Alex Hunsaker
* Pgindent run before 9.1 beta2.Bruce Momjian2011-06-09
|
* Protect GIST logic that assumes penalty values can't be negative.Tom Lane2011-05-31
| | | | | | | | | | Apparently sane-looking penalty code might return small negative values, for example because of roundoff error. This will confuse places like gistchoose(). Prevent problems by clamping negative penalty values to zero. (Just to be really sure, I also made it force NaNs to zero.) Back-patch to all supported branches. Alexander Korotkov
* The row-version chaining in Serializable Snapshot Isolation was still wrong.Heikki Linnakangas2011-05-30
| | | | | | | | | | On further analysis, it turns out that it is not needed to duplicate predicate locks to the new row version at update, the lock on the version that the transaction saw as visible is enough. However, there was a different bug in the code that checks for dangerous structures when a new rw-conflict happens. Fix that bug, and remove all the row-version chaining related code. Kevin Grittner & Dan Ports, with some comment editorialization by me.
* Spell checking and markup refinementPeter Eisentraut2011-05-19
|
* Fix assorted typosAlvaro Herrera2011-05-12
|
* Shut down WAL receiver if it's still running at end of recovery. We used toHeikki Linnakangas2011-05-11
| | | | | just check that it's not running and PANIC if it was, but that can rightfully happen if recovery stops at recovery target.
* Move RegisterPredicateLockingXid() call to a safer place.Tom Lane2011-05-06
| | | | | | | | | | | | | | | | | | | The SSI patch inserted a call of RegisterPredicateLockingXid into GetNewTransactionId, which was a bad idea on a couple of grounds. First, it's not necessary to hold XidGenLock while manipulating that shared memory, and doing so is bad because XidGenLock is a high-contention lock that should be held for as short a time as possible. (Not to mention that it adds an entirely unnecessary deadlock hazard, since we must take SerializableXactHashLock as well.) Second, the specific place where it was put was between extending CLOG and advancing nextXid, which could result in unpleasant behavior in case of a failure there. Pull the call out to AssignTransactionId, which is much safer and arguably better from a modularity standpoint too. There is more work to do to clean up the failure-before-advancing-nextXid issue, but that is a separate change that will need to be back-patched. So for the moment I just want to make GetNewTransactionId look the same as it did in prior versions.
* Fix SSI-related assertion failure.Robert Haas2011-04-25
| | | | | | Bug #5899, reported by Marko Tiikkaja. Heikki Linnakangas, reviewed by Kevin Grittner and Dan Ports.
* Hash indexes had better pass the index collation to support functions, too.Tom Lane2011-04-23
| | | | | Per experimentation with contrib/citext, whose hash function assumes that it'll be passed a collation.
* Make GIN and GIST pass the index collation to all their support functions.Tom Lane2011-04-22
| | | | | | | Experimentation with contrib/btree_gist shows that the majority of the GIST support functions potentially need collation information. Safest policy seems to be to pass it to all of them, instead of making assumptions about which ones could possibly need it.
* Make a code-cleanup pass over the collations patch.Tom Lane2011-04-22
| | | | | | | This patch is almost entirely cosmetic --- mostly cleaning up a lot of neglected comments, and fixing code layout problems in places where the patch made lines too long and then pgindent did weird things with that. I did find a bug-of-omission in equalTupleDescs().
* recoveryStopsHere() must check the resource manager ID.Robert Haas2011-04-18
| | | | | | | | | | Before commit c016ce728139be95bb0dc7c4e5640507334c2339, this wasn't needed, but now that multiple resource manager IDs can percolate down through here, we have to make sure we know which one we've got. Otherwise, we can confuse (for example) an XLOG_XACT_COMMIT record with an XLOG_CHECKPOINT_SHUTDOWN record. Review by Jaime Casanova
* Add an Assert that indexam.c isn't used on an index awaiting reindexing.Tom Lane2011-04-16
| | | | | | | This might have caught the recent embarrassment over trying to modify pg_index while its indexes were being rebuilt. Noah Misch
* Revert the patch to check if we've reached end-of-backup also when doingHeikki Linnakangas2011-04-13
| | | | | | | | | crash recovery, and throw an error if not. hubert depesz lubaczewski pointed out that that situation also happens in the crash recovery following a system crash that happens during an online backup. We might want to do something smarter in 9.1, like put the check back for backups taken with pg_basebackup, but that's for another patch.
* Pass collations to functions in FunctionCallInfoData, not FmgrInfo.Tom Lane2011-04-12
| | | | | | | | | | | Since collation is effectively an argument, not a property of the function, FmgrInfo is really the wrong place for it; and this becomes critical in cases where a cached FmgrInfo is used for varying purposes that might need different collation settings. Fix by passing it in FunctionCallInfoData instead. In particular this allows a clean fix for bug #5970 (record_cmp not working). This requires touching a bit more code than the original method, but nobody ever thought that collations would not be an invasive patch...
* Clean up most -Wunused-but-set-variable warnings from gcc 4.6Peter Eisentraut2011-04-11
| | | | | | This warning is new in gcc 4.6 and part of -Wall. This patch cleans up most of the noise, but there are some still warnings that are trickier to remove.
* pgindent run before PG 9.1 beta 1.Bruce Momjian2011-04-10
|
* Tweak collation setup for GIN index comparison functions.Tom Lane2011-04-08
| | | | | | | Honor index column's collation spec if there is one, don't go to the expense of calling get_typcollation when we can reasonably assume that all GIN storage types will use default collation, and be sure to set a collation for the comparePartialFn too.
* Revise the API for GUC variable assign hooks.Tom Lane2011-04-07
| | | | | | | | | | | | | | | | | The previous functions of assign hooks are now split between check hooks and assign hooks, where the former can fail but the latter shouldn't. Aside from being conceptually clearer, this approach exposes the "canonicalized" form of the variable value to guc.c without having to do an actual assignment. And that lets us fix the problem recently noted by Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus log messages about "parameter "wal_buffers" cannot be changed without restarting the server". There may be some speed advantage too, because this design lets hook functions avoid re-parsing variable values when restoring a previous state after a rollback (they can store a pre-parsed representation of the value instead). This patch also resolves a longstanding annoyance about custom error messages from variable assign hooks: they should modify, not appear separately from, guc.c's own message about "invalid parameter value".
* Avoid assuming there will be only 3 states for synchronous_commit.Simon Riggs2011-04-04
| | | | | | Also avoid hardcoding the current default state by giving it the name "on" and replace with a meaningful name that reflects its behaviour. Coding only, no change in behaviour.
* Merge synchronous_replication setting into synchronous_commit.Robert Haas2011-04-04
| | | | | | | | This means one less thing to configure when setting up synchronous replication, and also avoids some ambiguity around what the behavior should be when the settings of these variables conflict. Fujii Masao, with additional hacking by me.
* Improve error message when WAL ends before reaching end of online backup.Heikki Linnakangas2011-03-31
|
* Check that we've reached end-of-backup also when we're not performingHeikki Linnakangas2011-03-30
| | | | | | | | | | | | | | archive recovery. It's possible to restore an online backup without recovery.conf, by simply copying all the necessary WAL files to pg_xlog. "pg_basebackup -x" does that too. That's the use case where this cross-check is useful. Backpatch to 9.0. We used to do this in earlier versins, but in 9.0 the code was inadvertently changed so that the check is only performed after archive recovery. Fujii Masao.
* Clean up cruft around collation initialization for tupdescs and scankeys.Tom Lane2011-03-26
| | | | | I found actual bugs in GiST and plpgsql; the rest of this is cosmetic but meant to decrease the odds of future bugs of omission.
* Minor changes to recovery pause behaviour.Simon Riggs2011-03-23
| | | | | | | | | Change location LOG message so it works each time we pause, not just for final pause. Ensure that we pause only if we are in Hot Standby and can connect to allow us to run resume function. This change supercedes the code to override parameter recoveryPauseAtTarget to false if not attempting to enter Hot Standby, which is now removed.
* Prevent intermittent hang in recovery from bgwriter interaction.Simon Riggs2011-03-23
| | | | | | Startup process waited for cleanup lock but when hot_standby = off the pid was not registered, so that the bgwriter would not wake the waiting process as intended.
* When two base backups are started at the same time with pg_basebackup,Heikki Linnakangas2011-03-21
| | | | | | | | ensure that they use different checkpoints as the starting point. We use the checkpoint redo location as a unique identifier for the base backup in the end-of-backup record, and in the backup history file name. Bug spotted by Fujii Masao.
* Revise collation derivation method and expression-tree representation.Tom Lane2011-03-19
| | | | | | | | | | | | | | | | | | | All expression nodes now have an explicit output-collation field, unless they are known to only return a noncollatable data type (such as boolean or record). Also, nodes that can invoke collation-aware functions store a separate field that is the collation value to pass to the function. This avoids confusion that arises when a function has collatable inputs and noncollatable output type, or vice versa. Also, replace the parser's on-the-fly collation assignment method with a post-pass over the completed expression tree. This allows us to use a more complex (and hopefully more nearly spec-compliant) assignment rule without paying for it in extra storage in every expression node. Fix assorted bugs in the planner's handling of collations by making collation one of the defining properties of an EquivalenceClass and by converting CollateExprs into discardable RelabelType nodes during expression preprocessing.
* Remove bogus semicolons in recoveryPausesHere.Robert Haas2011-03-18
| | | | | Without this, the startup process goes into a tight loop, consuming 100% of one CPU and failing to respond to interrupts.
* Add pause_at_recovery_target to recovery.conf.sample; improve docs.Robert Haas2011-03-17
| | | | | Fujii Masao, but with the proposed behavior change reverted, and the rest adjusted accordingly.
* Clarify C comment that O_SYNC/O_FSYNC are really the same settting, asBruce Momjian2011-03-10
| | | | opposed to O_DSYNC.
* Emit a LOG message when pausing at the recovery target.Robert Haas2011-03-10
| | | | Fujii Masao
* Remove collation information from TypeName, where it does not belong.Tom Lane2011-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The initial collations patch treated a COLLATE spec as part of a TypeName, following what can only be described as brain fade on the part of the SQL committee. It's a lot more reasonable to treat COLLATE as a syntactically separate object, so that it can be added in only the productions where it actually belongs, rather than needing to reject it in a boatload of places where it doesn't belong (something the original patch mostly failed to do). In addition this change lets us meet the spec's requirement to allow COLLATE anywhere in the clauses of a ColumnDef, and it avoids unfriendly behavior for constructs such as "foo::type COLLATE collation". To do this, pull collation information out of TypeName and put it in ColumnDef instead, thus reverting most of the collation-related changes in parse_type.c's API. I made one additional structural change, which was to use a ColumnDef as an intermediate node in AT_AlterColumnType AlterTableCmd nodes. This provides enough room to get rid of the "transform" wart in AlterTableCmd too, since the ColumnDef can carry the USING expression easily enough. Also fix some other minor bugs that have crept in in the same areas, like failure to copy recently-added fields of ColumnDef in copyfuncs.c. While at it, document the formerly secret ability to specify a collation in ALTER TABLE ALTER COLUMN TYPE, ALTER TYPE ADD ATTRIBUTE, and ALTER TYPE ALTER ATTRIBUTE TYPE; and correct some misstatements about what the default collation selection will be when COLLATE is omitted. BTW, the three-parameter form of format_type() should go away too, since it just contributes to the confusion in this area; but I'll do that in a separate patch.
* Truncate predicate lock manager's SLRU lazily at checkpoint. That's saferHeikki Linnakangas2011-03-08
| | | | | | | | than doing it aggressively whenever the tail-XID pointer is advanced, because this way we don't need to do it while holding SerializableXactHashLock. This also fixes bug #5915 spotted by YAMAMOTO Takashi, and removes an obsolete comment spotted by Kevin Grittner.
* If recovery_target_timeline is set to 'latest' and standby mode is enabled,Heikki Linnakangas2011-03-07
| | | | | | | | | | | | | | | | | periodically rescan the archive for new timelines, while waiting for new WAL segments to arrive. This allows you to set up a standby server that follows the TLI change if another standby server is promoted to master. Before this, you had to restart the standby server to make it notice the new timeline. This patch only scans the archive for TLI changes, it won't follow a TLI change in streaming replication. That is much needed too, but it would be a much bigger patch than I dare to sneak in this late in the release cycle. There was discussion on improving the sanity checking of the WAL segments so that the system would notice more reliably if the new timeline isn't an ancestor of the current one, but that is not included in this patch. Reviewed by Fujii Masao.