aboutsummaryrefslogtreecommitdiff
path: root/src/backend/postmaster/checkpointer.c
Commit message (Collapse)AuthorAge
* pgindent run for release 9.3Bruce Momjian2013-05-29
| | | | | This is the first run of the Perl-based pgindent script. Also update pgindent instructions.
* Update copyrights for 2013Bruce Momjian2013-01-01
| | | | | Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
* Don't set ThisTimeLineID in checkpointer & bgwriter during recovery.Heikki Linnakangas2012-12-20
| | | | | | | | | | | We used to set it to the current recovery target timeline, but the recovery target timeline can change during recovery, leaving ThisTimeLineID at an old value. That seems worse than always leaving it at zero to begin with. AFAICS there was no good reason to set it in the first place. ThisTimeLineID is not needed in checkpointer or bgwriter process, until it's time to write the end-of-recovery checkpoint, and at that point ThisTimeLineID is updated anyway.
* Make xlog_internal.h includable in frontend context.Heikki Linnakangas2012-12-13
| | | | | | | This makes unnecessary the ugly hack used to #include postgres.h in pg_basebackup. Based on Alvaro Herrera's patch
* Close un-owned SMgrRelations at transaction end.Tom Lane2012-10-17
| | | | | | | | | | | | | | | | | | If an SMgrRelation is not "owned" by a relcache entry, don't allow it to live past transaction end. This design allows the same SMgrRelation to be used for blind writes of multiple blocks during a transaction, but ensures that we don't hold onto such an SMgrRelation indefinitely. Because an SMgrRelation typically corresponds to open file descriptors at the fd.c level, leaving it open when there's no corresponding relcache entry can mean that we prevent the kernel from reclaiming deleted disk space. (While CacheInvalidateSmgr messages usually fix that, there are cases where they're not issued, such as DROP DATABASE. We might want to add some more sinval messaging for that, but I'd be inclined to keep this type of logic anyway, since allowing VFDs to accumulate indefinitely for blind-written relations doesn't seem like a good idea.) This code replaces a previous attempt towards the same goal that proved to be unreliable. Back-patch to 9.1 where the previous patch was added.
* Split resowner.hAlvaro Herrera2012-08-28
| | | | | This lets files that are mere users of ResourceOwner not automatically include the headers for stuff that is managed by the resowner mechanism.
* Fix statistics breakage from bgwriter/checkpointer process split.Tom Lane2012-07-18
| | | | | | | | | | | | ForwardFsyncRequest() supposed that it could only be called in regular backends, which used to be true; but since the splitup of bgwriter and checkpointer, it is also called in the bgwriter. We do not want to count such calls in pg_stat_bgwriter.buffers_backend statistics, so fix things so that they aren't. (It's worth noting here that this implies an alarmingly large increase in the expected amount of cross-process fsync request traffic, which may well mean that the process splitup was not such a hot idea.)
* Fix management of pendingOpsTable in auxiliary processes.Tom Lane2012-07-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mdinit() was misusing IsBootstrapProcessingMode() to decide whether to create an fsync pending-operations table in the current process. This led to creating a table not only in the startup and checkpointer processes as intended, but also in the bgwriter process, not to mention other auxiliary processes such as walwriter and walreceiver. Creation of the table in the bgwriter is fatal, because it absorbs fsync requests that should have gone to the checkpointer; instead they just sit in bgwriter local memory and are never acted on. So writes performed by the bgwriter were not being fsync'd which could result in data loss after an OS crash. I think there is no live bug with respect to walwriter and walreceiver because those never perform any writes of shared buffers; but the potential is there for future breakage in those processes too. To fix, make AuxiliaryProcessMain() export the current process's AuxProcType as a global variable, and then make mdinit() test directly for the types of aux process that should have a pendingOpsTable. Having done that, we might as well also get rid of the random bool flags such as am_walreceiver that some of the aux processes had grown. (Note that we could not have fixed the bug by examining those variables in mdinit(), because it's called from BaseInit() which is run by AuxiliaryProcessMain() before entering any of the process-type-specific code.) Back-patch to 9.2, where the problem was introduced by the split-up of bgwriter and checkpointer processes. The bogus pendingOpsTable exists in walwriter and walreceiver processes in earlier branches, but absent any evidence that it causes actual problems there, I'll leave the older branches alone.
* Improve coding around the fsync request queue.Tom Lane2012-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In all branches back to 8.3, this patch fixes a questionable assumption in CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue that there are no uninitialized pad bytes in the request queue structs. This would only cause trouble if (a) there were such pad bytes, which could happen in 8.4 and up if the compiler makes enum ForkNumber narrower than 32 bits, but otherwise would require not-currently-planned changes in the widths of other typedefs; and (b) the kernel has not uniformly initialized the contents of shared memory to zeroes. Still, it seems a tad risky, and we can easily remove any risk by pre-zeroing the request array for ourselves. In addition to that, we need to establish a coding rule that struct RelFileNode can't contain any padding bytes, since such structs are copied into the request array verbatim. (There are other places that are assuming this anyway, it turns out.) In 9.1 and up, the risk was a bit larger because we were also effectively assuming that struct RelFileNodeBackend contained no pad bytes, and with fields of different types in there, that would be much easier to break. However, there is no good reason to ever transmit fsync or delete requests for temp files to the bgwriter/checkpointer, so we can revert the request structs to plain RelFileNode, getting rid of the padding risk and saving some marginal number of bytes and cycles in fsync queue manipulation while we are at it. The savings might be more than marginal during deletion of a temp relation, because the old code transmitted an entirely useless but nonetheless expensive-to-process ForgetRelationFsync request to the background process, and also had the background process perform the file deletion even though that can safely be done immediately. In addition, make some cleanup of nearby comments and small improvements to the code in CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue.
* Replace XLogRecPtr struct with a 64-bit integer.Heikki Linnakangas2012-06-24
| | | | | | | | | | | | | | This simplifies code that needs to do arithmetic on XLogRecPtrs. To avoid changing on-disk format of data pages, the LSN on data pages is still stored in the old format. That should keep pg_upgrade happy. However, we have XLogRecPtrs embedded in the control file, and in the structs that are sent over the replication protocol, so this changes breaks compatibility of pg_basebackup and server. I didn't do anything about this in this patch, per discussion on -hackers, the right thing to do would to be to change the replication protocol to be architecture-independent, so that you could use a newer version of pg_receivexlog, for example, against an older server version.
* Don't waste the last segment of each 4GB logical log file.Heikki Linnakangas2012-06-24
| | | | | | | | | | | | | | | | The comments claimed that wasting the last segment made it easier to do calculations with XLogRecPtrs, because you don't have problems representing last-byte-position-plus-1 that way. In my experience, however, it only made things more complicated, because the there was two ways to represent the boundary at the beginning of a logical log file: logid = n+1 and xrecoff = 0, or as xlogid = n and xrecoff = 4GB - XLOG_SEG_SIZE. Some functions were picky about which representation was used. Also, use a 64-bit segment number instead of the log/seg combination, to point to a certain WAL segment. We assume that all platforms have a working 64-bit integer type nowadays. This is an incompatible change in WAL format, so bumping WAL version number.
* Run pgindent on 9.2 source tree in preparation for first 9.3Bruce Momjian2012-06-10
| | | | commit-fest.
* After any checkpoint, close all smgr files handles in bgwriterSimon Riggs2012-06-01
|
* Provide interim statistics while in mid-checkpoint.Simon Riggs2012-06-01
| | | | | | | Re-implements similar functionality in 9.1 and previously which was removed during split of checkpointer and bgwriter. Requested/spotted by Magnus Hagander
* Make WaitLatch's WL_POSTMASTER_DEATH result trustworthy; simplify callers.Tom Lane2012-05-10
| | | | | | | | Per a suggestion from Peter Geoghegan, make WaitLatch responsible for verifying that the WL_POSTMASTER_DEATH bit it returns is truthful (by testing PostmasterIsAlive). Then simplify its callers, who no longer need to do that for themselves. Remove weasel wording about falsely-set result bits from WaitLatch's API contract.
* Improve tests for postmaster death in auxiliary processes.Tom Lane2012-05-10
| | | | | | | | | | | | | In checkpointer and walwriter, avoid calling PostmasterIsAlive unless WaitLatch has reported WL_POSTMASTER_DEATH. This saves a kernel call per iteration of the process's outer loop, which is not all that much, but a cycle shaved is a cycle earned. I had already removed the unconditional PostmasterIsAlive calls in bgwriter and pgstat in previous patches, but forgot that WL_POSTMASTER_DEATH is supposed to be treated as untrustworthy (per comment in unix_latch.c); so adjust those two cases to match. There are a few other places where the same idea might be applied, but only after substantial code rearrangement, so I didn't bother.
* Further tweaking of nomenclature in checkpointer.c.Tom Lane2012-05-10
| | | | | | Get rid of some more naming choices that only make sense if you know that this code used to be in the bgwriter, as well as some stray comments referencing the bgwriter.
* Rename BgWriterShmem/Request to CheckpointerShmem/RequestSimon Riggs2012-05-09
|
* Rename BgWriterCommLock to CheckpointerCommLockSimon Riggs2012-05-09
|
* Reduce idle power consumption of walwriter and checkpointer processes.Tom Lane2012-05-08
| | | | | | | | | | | | | | | | | | | | | | | This patch modifies the walwriter process so that, when it has not found anything useful to do for many consecutive wakeup cycles, it extends its sleep time to reduce the server's idle power consumption. It reverts to normal as soon as it's done any successful flushes. It's still true that during any async commit, backends check for completed, unflushed pages of WAL and signal the walwriter if there are any; so that in practice the walwriter can get awakened and returned to normal operation sooner than the sleep time might suggest. Also, improve the checkpointer so that it uses a latch and a computed delay time to not wake up at all except when it has something to do, replacing a previous hardcoded 0.5 sec wakeup cycle. This also is primarily useful for reducing the server's power consumption when idle. In passing, get rid of the dedicated latch for signaling the walwriter in favor of using its procLatch, since that comports better with possible generic signal handlers using that latch. Also, fix a pre-existing bug with failure to save/restore errno in walwriter's signal handlers. Peter Geoghegan, somewhat simplified by Tom
* Lots of doc corrections.Robert Haas2012-04-23
| | | | Josh Kupershmidt
* Minor bug fix and cleanup from self-review of sync rep queues patch.Simon Riggs2012-01-30
|
* Various minor comments changes from bgwriter to checkpointer.Simon Riggs2012-01-30
|
* Allow pg_basebackup from standby node with safety checking.Simon Riggs2012-01-25
| | | | | | | Base backup follows recommended procedure, plus goes to great lengths to ensure that partial page writes are avoided. Jun Ishizuka and Fujii Masao, with minor modifications
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Have checkpointer send stats once each processing loop.Simon Riggs2011-11-01
| | | | Noted by Fujii Masao
* Add new file for checkpointer.cSimon Riggs2011-11-01