postgresql - postgresql mirror

	Commit message	Author	Age
*	Properly set relpersistence for fake relcache entries.	Robert Haas	2012-09-14
\| \| \| \| \| \| \|	This can result in buffers failing to be properly flushed at checkpoint time, leading to data loss. Report, diagnosis, and patch by Jeff Davis.
*	Fix inappropriate error messages for Hot Standby misconfiguration errors.	Tom Lane	2012-09-05
\| \| \| \| \| \| \| \|	Give the correct name of the GUC parameter being complained of. Also, emit a more suitable SQLSTATE (INVALID_PARAMETER_VALUE, not the default INTERNAL_ERROR). Gurjeet Singh, errcode adjustment by me
*	fsync backup_label after pg_start_backup()	Simon Riggs	2012-08-07
\| \| \| \|	Dave Kerr, backpatched by Simon Riggs
*	Initialize shared memory copy of ckptXidEpoch correctly when not in recovery.	Heikki Linnakangas	2012-06-29
\| \| \| \| \| \| \|	This bug was introduced by commit 20d98ab6e4110087d1816cd105a40fcc8ce0a307, so backpatch this to 9.0-9.2 like that one. This fixes bug #6710, reported by Tarvi Pillessaar
*	Wake WALSender to reduce data loss at failover for async commit.	Simon Riggs	2012-06-07
\| \| \| \| \| \| \| \| \|	WALSender now woken up after each background flush by WALwriter, avoiding multi-second replication delay for an all-async commit workload. Replication delay reduced from 7s with default settings to 200ms, allowing significantly reduced data loss at failover. Andres Freund and Simon Riggs
*	Revert back-branch changes in behavior of age(xid).	Tom Lane	2012-05-31
\| \| \| \| \| \| \| \|	Per discussion, it does not seem like a good idea to change the behavior of age(xid) in a minor release, even though the old definition causes the function to fail on hot standby slaves. Therefore, revert commit 5829387381d2e4edf84652bb5a712f6185860670 and follow-on commits in the back branches only.
*	Teach AbortOutOfAnyTransaction to clean up partially-started transactions.	Tom Lane	2012-05-28
\| \| \| \| \| \| \| \| \| \| \| \|	AbortOutOfAnyTransaction failed to do anything if the state it saw on entry corresponded to failing partway through StartTransaction. I fixed AbortCurrentTransaction to cope with that case way back in commit 60b2444cc3ba037630c9b940c3c9ef01b954b87b, but evidently overlooked that AbortOutOfAnyTransaction should do likewise. Back-patch to all supported branches. It's not clear that this omission has any more-than-cosmetic consequences, but it's also not clear that it doesn't, so back-patching seems the least risky choice.
*	Ensure backwards compatibility for GetStableLatestTransactionId()	Simon Riggs	2012-05-12
\|
*	Ensure age() returns a stable value rather than the latest value	Simon Riggs	2012-05-11
\|
*	Don't wait for the commit record to be replicated if we wrote no WAL.	Heikki Linnakangas	2012-04-17
\| \| \| \| \| \| \| \|	When using synchronous replication, we waited for the commit record to be replicated, but if our transaction didn't write any other WAL records, that's not required because we don't even flush the WAL locally to disk in that case. This led to long waits when committing a transaction that only modified a temporary table. Bug spotted by Thom Brown.
*	Correct epoch of txid_current() when executed on a Hot Standby server.	Simon Riggs	2012-03-29
\| \| \| \| \| \| \| \| \|	Initialise ckptXidEpoch from starting checkpoint and maintain the correct value as we roll forwards. This allows GetNextXidAndEpoch() to return the correct epoch when executed during recovery. Backpatch to 9.0, where the problem is first observable by a user. Bug report from Daniel Farina
*	Correctly initialise shared recoveryLastRecPtr in recovery.	Simon Riggs	2012-02-22
\| \| \| \| \| \| \| \|	Previously we used ReadRecPtr rather than EndRecPtr, which was not a serious error but caused pg_stat_replication to report incorrect replay_location until at least one WAL record is replayed. Fujii Masao
*	Avoid problems with OID wraparound during WAL replay.	Tom Lane	2012-02-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a longstanding thinko in replay of NEXTOID and checkpoint records: we tried to advance nextOid only if it was behind the value in the WAL record, but the comparison would draw the wrong conclusion if OID wraparound had occurred since the previous value. Better to just unconditionally assign the new value, since OID assignment shouldn't be happening during replay anyway. The consequences of a failure to update nextOid would be pretty minimal, since we have long had the code set up to obtain another OID and try again if the generated value is already in use. But in the worst case there could be significant performance glitches while such loops iterate through many already-used OIDs before finding a free one. The odds of a wraparound happening during WAL replay would be small in a crash-recovery scenario, and the length of any ensuing OID-assignment stall quite limited anyway. But neither of these statements hold true for a replication slave that follows a WAL stream for a long period; its behavior upon going live could be almost unboundedly bad. Hence it seems worth back-patching this fix into all supported branches. Already fixed in HEAD in commit c6d76d7c82ebebb7210029f7382c0ebe2c558bca.
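The comparison problem described above can be sketched in a few lines. This is only an illustration, not the PostgreSQL replay code: the `Oid` typedef and both function names are assumptions made for the example, and the only property relied on is that the counter is an unsigned 32-bit value that wraps around.

```c
#include <stdint.h>

/* Assumption for this sketch: the counter is a 32-bit unsigned value. */
typedef uint32_t Oid;

/*
 * Hypothetical "advance only if behind" replay step.  After a wraparound
 * the value in the WAL record is numerically small, so the comparison
 * wrongly concludes that the current counter is already ahead.
 */
static Oid
replay_if_behind(Oid current, Oid from_wal)
{
    return (current < from_wal) ? from_wal : current;
}

/*
 * Unconditional assignment: always take the value from the record.
 * No comparison is made, so wraparound cannot mislead it.
 */
static Oid
replay_unconditional(Oid current, Oid from_wal)
{
    (void) current;             /* deliberately ignored */
    return from_wal;
}
```

With a stale counter of 4000000000 and a post-wraparound record value of 20000, the first function keeps the stale value while the second adopts the new one.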
*	Fix transient clobbering of shared buffers during WAL replay.	Tom Lane	2012-02-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	RestoreBkpBlocks was in the habit of zeroing and refilling the target buffer; which was perfectly safe when the code was written, but is unsafe during Hot Standby operation. The reason is that we have coding rules that allow backends to continue accessing a tuple in a heap relation while holding only a pin on its buffer. Such a backend could see transiently zeroed data, if WAL replay had occasion to change other data on the page. This has been shown to be the cause of bug #6425 from Duncan Rance (who deserves kudos for developing a sufficiently-reproducible test case) as well as Bridget Frey's re-report of bug #6200. It most likely explains the original report as well, though we don't yet have confirmation of that. To fix, change the code so that only bytes that are supposed to change will change, even transiently. This actually saves cycles in RestoreBkpBlocks, since it's not writing the same bytes twice. Also fix seq_redo, which has the same disease, though it has to work a bit harder to meet the requirement. So far as I can tell, no other WAL replay routines have this type of bug. In particular, the index-related replay routines, which would certainly be broken if they had to meet the same standard, are not at risk because we do not have coding rules that allow access to an index page when not holding a buffer lock on it. Back-patch to 9.0 where Hot Standby was added.
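The difference between "zero and refill" and "change only the bytes that are supposed to change" might look roughly like the sketch below. It is not the RestoreBkpBlocks code. It assumes, for illustration only, a page image stored with an omitted "hole" of zero bytes in the middle, and the names `SKETCH_PAGE_SIZE`, `restore_zero_then_fill` and `restore_in_place` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192   /* assumed page size for this sketch */

/*
 * Zero the whole page, then copy the two halves of the image around the
 * hole.  Between the memset and the copies, every byte of the page is
 * transiently zero, including bytes whose final value is unchanged.
 */
static void
restore_zero_then_fill(unsigned char *page, const unsigned char *image,
                       size_t hole_offset, size_t hole_length)
{
    size_t tail = hole_offset + hole_length;

    memset(page, 0, SKETCH_PAGE_SIZE);
    memcpy(page, image, hole_offset);
    memcpy(page + tail, image + hole_offset, SKETCH_PAGE_SIZE - tail);
}

/*
 * Write each region exactly once.  A byte whose old and new values are
 * equal is overwritten with the same value, and only the hole is zeroed.
 */
static void
restore_in_place(unsigned char *page, const unsigned char *image,
                 size_t hole_offset, size_t hole_length)
{
    size_t tail = hole_offset + hole_length;

    memcpy(page, image, hole_offset);
    memset(page + hole_offset, 0, hole_length);
    memcpy(page + tail, image + hole_offset, SKETCH_PAGE_SIZE - tail);
}
```

Both functions leave the page in the same final state. The second also writes each byte once rather than twice, which matches the remark above about saving cycles.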
*	Avoid crashing when we have problems unlinking files post-commit.	Tom Lane	2011-12-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	smgrdounlink takes care to not throw an ERROR if it fails to unlink something, but that caution was rendered useless by commit 3396000684b41e7e9467d1abc67152b39e697035, which put an smgrexists call in front of it; smgrexists does throw error if anything looks funny, such as getting a permissions error from trying to open the file. If that happens post-commit, you get a PANIC, and what's worse the same logic appears in the WAL replay code, so the database even fails to restart. Restore the intended behavior by removing the smgrexists call --- it isn't accomplishing anything that we can't do better by adjusting mdunlink's ideas of whether it ought to warn about ENOENT or not. Per report from Joseph Shraibman of unrecoverable crash after trying to drop a table whose FSM fork had somehow gotten chmod'd to 000 permissions. Backpatch to 8.4, where the bogus coding was introduced.
*	Don't set reachedMinRecoveryPoint during crash recovery. In crash recovery,	Heikki Linnakangas	2011-12-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	we don't reach consistency before replaying all of the WAL. Rename the variable to reachedConsistency, to make its intention clearer. In master, that was an active bug because of the recent patch to immediately PANIC if a reference to a missing page is found in WAL after reaching consistency, as Tom Lane's test case demonstrated. In 9.1 and 9.0, the only consequence was a misleading "consistent recovery state reached at %X/%X" message in the log at the beginning of crash recovery (the database is not consistent at that point yet). In 8.4, the log message was not printed in crash recovery, even though there was a similar reachedMinRecoveryPoint local variable that was also set early. So, backpatch to 9.1 and 9.0.
*	Derive oldestActiveXid at correct time for Hot Standby.	Simon Riggs	2011-11-02
\| \| \| \| \| \| \| \| \|	There was a timing window between when oldestActiveXid was derived and when it should have been derived that only shows itself under heavy load. Move code around to ensure correct timing of derivation. No change to StartupSUBTRANS() code, which is where this failed. Bug report by Chris Redekop
*	Fix timing of Startup CLOG and MultiXact during Hot Standby	Simon Riggs	2011-11-02
\| \| \| \|	Patch by me, bug report by Chris Redekop, analysis by Florian Pflug
*	Adjust translator comment format to xgettext expectations	Alvaro Herrera	2011-09-05
\|
*	Mark some untranslatable messages with errmsg_internal	Alvaro Herrera	2011-09-05
\|
*	If backup-end record is not seen, and we reach end of recovery from a	Heikki Linnakangas	2011-08-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	streamed backup, throw an error and refuse to start up. The restore has not finished correctly in that case and the data directory is possibly corrupt. We already errored out in case of archive recovery, but could not during crash recovery because we couldn't distinguish between the case that pg_start_backup() was called and the database then crashed (must not error, data is OK), and the case that we're restoring from a backup and not all the needed WAL was replayed (data can be corrupt). To distinguish those cases, add a line to backup_label to indicate whether the backup was taken with pg_start/stop_backup(), or by streaming (ie. pg_basebackup). This is a different implementation than what I committed to 9.2 a week ago. That implementation was not back-patchable because it required re-initdb. Fujii Masao
*	Fix race condition in relcache init file invalidation.	Tom Lane	2011-08-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous code tried to synchronize by unlinking the init file twice, but that doesn't actually work: it leaves a window wherein a third process could read the already-stale init file but miss the SI messages that would tell it the data is stale. The result would be bizarre failures in catalog accesses, typically "could not read block 0 in file ..." later during startup. Instead, hold RelCacheInitLock across both the unlink and the sending of the SI messages. This is more straightforward, and might even be a bit faster since only one unlink call is needed. This has been wrong since it was put in (in 2002!), so back-patch to all supported releases.
*	Back-patch assorted latch-related fixes.	Tom Lane	2011-08-10
\| \| \| \| \| \| \| \| \| \|	Fix a whole bunch of signal handlers that had been hacked to do things that might change errno, without adding the necessary save/restore logic for errno. Also make some minor fixes in unix_latch.c, and clean up bizarre and unsafe scheme for disowning the process's latch. While at it, rename the PGPROC latch field to procLatch for consistency with 9.2. Issues noted while reviewing a patch by Peter Geoghegan.
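The save/restore logic for errno mentioned above follows a common POSIX pattern, sketched here. The handler name and flag are hypothetical, and the failing `write()` merely stands in for any call inside a handler that may overwrite errno.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;

/*
 * Signal handler that preserves errno.  Without the save/restore, code
 * interrupted between a failing system call and its errno check could
 * see the handler's errno value instead of its own.
 */
static void
handle_signal(int signo)
{
    int save_errno = errno;

    (void) signo;
    got_signal = 1;

    /* Stand-in for handler work that clobbers errno: fails with EBADF. */
    if (write(-1, "x", 1) < 0)
    {
        /* failure is expected in this sketch */
    }

    errno = save_errno;
}
```

Calling `handle_signal` directly with errno set to `ENOENT` leaves errno at `ENOENT` afterwards, even though the `write()` inside it failed.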
*	Measure WaitLatch's timeout parameter in milliseconds, not microseconds.	Tom Lane	2011-08-09
\| \| \| \| \| \| \| \| \| \| \| \|	The original definition had the problem that timeouts exceeding about 2100 seconds couldn't be specified on 32-bit machines. Milliseconds seem like sufficient resolution, and finer grain than that would be fantasy anyway on many platforms. Back-patch to 9.1 so that this aspect of the latch API won't change between 9.1 and later releases. Peter Geoghegan
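The "about 2100 seconds" figure follows from the largest value a signed 32-bit integer can hold. The arithmetic can be checked with the small sketch below, which assumes the timeout is passed in a 32-bit signed type on such machines. The helper name is hypothetical.

```c
#include <stdint.h>

/*
 * Longest timeout, in whole seconds, that fits in a signed 32-bit
 * integer when the value is expressed in the given unit.
 */
static int64_t
max_timeout_seconds(int64_t units_per_second)
{
    return INT32_MAX / units_per_second;
}
```

With microseconds (1000000 units per second) the limit is 2147 seconds, or roughly 36 minutes. With milliseconds (1000 units per second) it is 2147483 seconds, or nearly 25 days.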
*	Unify spelling of "canceled", "canceling", "cancellation"	Peter Eisentraut	2011-07-02
\| \| \| \| \|	We had previously (af26857a2775e7ceb0916155e931008c2116632f) established the U.S. spellings as standard.
*	pgindent run of recent SSI changes. Also, remove an unnecessary #include.	Heikki Linnakangas	2011-06-16
\| \| \| \|	Kevin Grittner
*	Oops, forgot to change the order of entries in 2PC callback arrays when I	Heikki Linnakangas	2011-06-14
\| \| \| \|	renumbered the resource managers. This should fix the buildfarm.
*	Work around gcc 4.6.0 bug that breaks WAL replay.	Tom Lane	2011-06-10
\| \| \| \| \| \| \| \| \| \| \| \| \|	ReadRecord's habit of using both direct references to tmpRecPtr and references to *RecPtr (which is pointing at tmpRecPtr) triggers an optimization bug in gcc 4.6.0, which apparently has forgotten about aliasing rules. Avoid the compiler bug, and make the code more readable to boot, by getting rid of the direct references. Improve the comments while at it. Back-patch to all supported versions, in case they get built with 4.6.0. Tom Lane, with some cosmetic suggestions from Alex Hunsaker
*	Pgindent run before 9.1 beta2.	Bruce Momjian	2011-06-09
\|
*	Fix assorted typos	Alvaro Herrera	2011-05-12
\|
*	Shut down WAL receiver if it's still running at end of recovery. We used to	Heikki Linnakangas	2011-05-11
\| \| \| \| \|	just check that it's not running and PANIC if it was, but that can rightfully happen if recovery stops at recovery target.
*	Move RegisterPredicateLockingXid() call to a safer place.	Tom Lane	2011-05-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SSI patch inserted a call of RegisterPredicateLockingXid into GetNewTransactionId, which was a bad idea on a couple of grounds. First, it's not necessary to hold XidGenLock while manipulating that shared memory, and doing so is bad because XidGenLock is a high-contention lock that should be held for as short a time as possible. (Not to mention that it adds an entirely unnecessary deadlock hazard, since we must take SerializableXactHashLock as well.) Second, the specific place where it was put was between extending CLOG and advancing nextXid, which could result in unpleasant behavior in case of a failure there. Pull the call out to AssignTransactionId, which is much safer and arguably better from a modularity standpoint too. There is more work to do to clean up the failure-before-advancing-nextXid issue, but that is a separate change that will need to be back-patched. So for the moment I just want to make GetNewTransactionId look the same as it did in prior versions.
*	recoveryStopsHere() must check the resource manager ID.	Robert Haas	2011-04-18
\| \| \| \| \| \| \| \| \| \|	Before commit c016ce728139be95bb0dc7c4e5640507334c2339, this wasn't needed, but now that multiple resource manager IDs can percolate down through here, we have to make sure we know which one we've got. Otherwise, we can confuse (for example) an XLOG_XACT_COMMIT record with an XLOG_CHECKPOINT_SHUTDOWN record. Review by Jaime Casanova
*	Revert the patch to check if we've reached end-of-backup also when doing	Heikki Linnakangas	2011-04-13
\| \| \| \| \| \| \| \| \|	crash recovery, and throw an error if not. hubert depesz lubaczewski pointed out that that situation also happens in the crash recovery following a system crash that happens during an online backup. We might want to do something smarter in 9.1, like put the check back for backups taken with pg_basebackup, but that's for another patch.
*	pgindent run before PG 9.1 beta 1.	Bruce Momjian	2011-04-10
\|
*	Revise the API for GUC variable assign hooks.	Tom Lane	2011-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous functions of assign hooks are now split between check hooks and assign hooks, where the former can fail but the latter shouldn't. Aside from being conceptually clearer, this approach exposes the "canonicalized" form of the variable value to guc.c without having to do an actual assignment. And that lets us fix the problem recently noted by Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus log messages about "parameter "wal_buffers" cannot be changed without restarting the server". There may be some speed advantage too, because this design lets hook functions avoid re-parsing variable values when restoring a previous state after a rollback (they can store a pre-parsed representation of the value instead). This patch also resolves a longstanding annoyance about custom error messages from variable assign hooks: they should modify, not appear separately from, guc.c's own message about "invalid parameter value".
*	Avoid assuming there will be only 3 states for synchronous_commit.	Simon Riggs	2011-04-04
\| \| \| \| \| \|	Also avoid hardcoding the current default state by giving it the name "on" and replace with a meaningful name that reflects its behaviour. Coding only, no change in behaviour.
*	Merge synchronous_replication setting into synchronous_commit.	Robert Haas	2011-04-04
\| \| \| \| \| \| \| \|	This means one less thing to configure when setting up synchronous replication, and also avoids some ambiguity around what the behavior should be when the settings of these variables conflict. Fujii Masao, with additional hacking by me.
*	Improve error message when WAL ends before reaching end of online backup.	Heikki Linnakangas	2011-03-31
\|
*	Check that we've reached end-of-backup also when we're not performing	Heikki Linnakangas	2011-03-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	archive recovery. It's possible to restore an online backup without recovery.conf, by simply copying all the necessary WAL files to pg_xlog. "pg_basebackup -x" does that too. That's the use case where this cross-check is useful. Backpatch to 9.0. We used to do this in earlier versions, but in 9.0 the code was inadvertently changed so that the check is only performed after archive recovery. Fujii Masao.
*	Minor changes to recovery pause behaviour.	Simon Riggs	2011-03-23
\| \| \| \| \| \| \| \| \|	Change location LOG message so it works each time we pause, not just for final pause. Ensure that we pause only if we are in Hot Standby and can connect to allow us to run resume function. This change supersedes the code to override parameter recoveryPauseAtTarget to false if not attempting to enter Hot Standby, which is now removed.
*	Prevent intermittent hang in recovery from bgwriter interaction.	Simon Riggs	2011-03-23
\| \| \| \| \| \|	Startup process waited for cleanup lock but when hot_standby = off the pid was not registered, so that the bgwriter would not wake the waiting process as intended.
*	When two base backups are started at the same time with pg_basebackup,	Heikki Linnakangas	2011-03-21
\| \| \| \| \| \| \| \|	ensure that they use different checkpoints as the starting point. We use the checkpoint redo location as a unique identifier for the base backup in the end-of-backup record, and in the backup history file name. Bug spotted by Fujii Masao.
*	Remove bogus semicolons in recoveryPausesHere.	Robert Haas	2011-03-18
\| \| \| \| \|	Without this, the startup process goes into a tight loop, consuming 100% of one CPU and failing to respond to interrupts.
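A stray semicolon after a `while` condition can produce this kind of tight loop, as the sketch below shows. It is an illustration of the general C pitfall and does not reproduce recoveryPausesHere. All names are hypothetical, and the "paused" condition is simulated with a countdown so that both variants terminate.

```c
#include <stdbool.h>

static int polls_remaining;

/* Simulated pause check: reports "paused" for a fixed number of polls. */
static bool
still_paused(void)
{
    return polls_remaining-- > 0;
}

/*
 * Buggy form: the semicolon ends the loop, so the loop body is empty and
 * the condition is polled back to back.  The braced block is an ordinary
 * statement that runs once, after the loop has finished.
 */
static int
wait_buggy(int polls)
{
    int body_runs = 0;

    polls_remaining = polls;
    while (still_paused());
    {
        body_runs++;            /* would be the sleep / interrupt check */
    }
    return body_runs;
}

/* Fixed form: the block is the loop body and runs on every iteration. */
static int
wait_fixed(int polls)
{
    int body_runs = 0;

    polls_remaining = polls;
    while (still_paused())
    {
        body_runs++;
    }
    return body_runs;
}
```

For five polls, the block in `wait_buggy` runs once and the block in `wait_fixed` runs five times. If the block is where the sleep and the interrupt check live, the buggy form spins without sleeping or responding to interrupts.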
*	Add pause_at_recovery_target to recovery.conf.sample; improve docs.	Robert Haas	2011-03-17
\| \| \| \| \|	Fujii Masao, but with the proposed behavior change reverted, and the rest adjusted accordingly.
*	Clarify C comment that O_SYNC/O_FSYNC are really the same setting, as	Bruce Momjian	2011-03-10
\| \| \| \|	opposed to O_DSYNC.
*	Emit a LOG message when pausing at the recovery target.	Robert Haas	2011-03-10
\| \| \| \|	Fujii Masao
*	Truncate predicate lock manager's SLRU lazily at checkpoint. That's safer	Heikki Linnakangas	2011-03-08
\| \| \| \| \| \| \| \|	than doing it aggressively whenever the tail-XID pointer is advanced, because this way we don't need to do it while holding SerializableXactHashLock. This also fixes bug #5915 spotted by YAMAMOTO Takashi, and removes an obsolete comment spotted by Kevin Grittner.
*	If recovery_target_timeline is set to 'latest' and standby mode is enabled,	Heikki Linnakangas	2011-03-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	periodically rescan the archive for new timelines, while waiting for new WAL segments to arrive. This allows you to set up a standby server that follows the TLI change if another standby server is promoted to master. Before this, you had to restart the standby server to make it notice the new timeline. This patch only scans the archive for TLI changes, it won't follow a TLI change in streaming replication. That is much needed too, but it would be a much bigger patch than I dare to sneak in this late in the release cycle. There was discussion on improving the sanity checking of the WAL segments so that the system would notice more reliably if the new timeline isn't an ancestor of the current one, but that is not included in this patch. Reviewed by Fujii Masao.
*	Efficient transaction-controlled synchronous replication.	Simon Riggs	2011-03-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a standby is broadcasting reply messages and we have named one or more standbys in synchronous_standby_names then allow users who set synchronous_replication to wait for commit, which then provides strict data integrity guarantees. Design avoids sending and receiving transaction state information so minimises bookkeeping overheads. We synchronize with the highest priority standby that is connected and ready to synchronize. Other standbys can be defined to take over in case of standby failure. This version has very strict behaviour; more relaxed options may be added at a later date. Simon Riggs and Fujii Masao, with reviews by Yeb Havinga, Jaime Casanova, Heikki Linnakangas and Robert Haas, plus the assistance of many other design reviewers.