aboutsummaryrefslogtreecommitdiff
path: root/src/backend/port/unix_latch.c
Commit message (Collapse)AuthorAge
* Update copyrights for 2013Bruce Momjian2013-01-01
| | | | | Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
* Assert that WaitLatch's timeout is not more than INT_MAX milliseconds.Tom Lane2012-11-18
| | | | | | | | | | | | The behavior with larger values is unspecified by the Single Unix Spec. It appears that BSD-derived kernels report EINVAL, although Linux does not. If waiting for longer intervals is desired, the calling code has to do something to limit the delay; we can't portably fix it here since "long" may not be any wider than "int" in the first place. Part of response to bug #7670, though this change doesn't fix that (in fact, it converts the problem from an ERROR into an Assert failure). No back-patch since it's just an assertion addition.
* Fix WaitLatch() to return promptly when the requested timeout expires.Tom Lane2012-11-08
| | | | | | | | | | | | | | | | If the sleep is interrupted by a signal, we must recompute the remaining time to wait; otherwise, a steady stream of non-wait-terminating interrupts could delay return from WaitLatch indefinitely. This has been shown to be a problem for the autovacuum launcher, and there may well be other places now or in the future with similar issues. So we'd better make the function robust, even though this'll add at least one gettimeofday call per wait. Back-patch to 9.2. We might eventually need to fix 9.1 as well, but the code is quite different there, and the usage of WaitLatch in 9.1 is so limited that it's not clearly important to do so. Reported and diagnosed by Jeff Janes, though I rewrote his patch rather heavily.
* Split up process latch initialization for more-fail-soft behavior.Tom Lane2012-10-14
| | | | | | | | | | | | | | | | | | | | In the previous coding, new backend processes would attempt to create their self-pipe during the OwnLatch call in InitProcess. However, pipe creation could fail if the kernel is short of resources; and the system does not recover gracefully from a FATAL error right there, since we have armed the dead-man switch for this process and not yet set up the on_shmem_exit callback that would disarm it. The postmaster then forces an unnecessary database-wide crash and restart, as reported by Sean Chittenden. There are various ways we could rearrange the code to fix this, but the simplest and sanest seems to be to split out creation of the self-pipe into a new function InitializeLatchSupport, which must be called from a place where failure is allowed. For most processes that gets called in InitProcess or InitAuxiliaryProcess, but processes that don't call either but still use latches need their own calls. Back-patch to 9.1, which has only a part of the latch logic that 9.2 and HEAD have, but nonetheless includes this bug.
* Make walsender more responsive.Robert Haas2012-07-02
| | | | | | | | | | | Per testing by Andres Freund, this improves replication performance and reduces replication latency and latency jitter. I was a bit concerned about moving more work into XLogInsert, but testing seems to show that it's not a problem in practice. Along the way, improve comments for WaitLatchOrSocket. Andres Freund. Review and stylistic cleanup by me.
* Run pgindent on 9.2 source tree in preparation for first 9.3Bruce Momjian2012-06-10
| | | | commit-fest.
* Assert that WaitLatchOrSocket callers cannot wait only for writability.Tom Lane2012-05-14
| | | | | | | | | | | | | Since we have chosen to report socket EOF and error conditions via the WL_SOCKET_READABLE flag bit, it's unsafe to wait only for WL_SOCKET_WRITEABLE; the caller would never be notified of the socket condition, and in some of these implementations WaitLatchOrSocket would busy-wait until something else happens. Add this restriction to the API specification, and add Asserts to check that callers don't try to do that. At some point we might want to consider adjusting the API to relax this restriction, but until we have an actual use case for waiting on a write-only socket, it seems premature to design a solution.
* Fix WaitLatchOrSocket to handle EOF on socket correctly.Tom Lane2012-05-12
| | | | | | | | | | When using poll(), EOF on a socket is reported with the POLLHUP not POLLIN flag (at least on Linux). WaitLatchOrSocket failed to check this bit, causing it to go into a busy-wait loop if EOF occurs. We earlier fixed the same mistake in the test for the state of the postmaster_alive socket, but missed it for the caller-supplied socket. Fortunately, this error is new in 9.2, since 9.1 only had a select() based code path not a poll() based one.
* Make WaitLatch's WL_POSTMASTER_DEATH result trustworthy; simplify callers.Tom Lane2012-05-10
| | | | | | | | Per a suggestion from Peter Geoghegan, make WaitLatch responsible for verifying that the WL_POSTMASTER_DEATH bit it returns is truthful (by testing PostmasterIsAlive). Then simplify its callers, who no longer need to do that for themselves. Remove weasel wording about falsely-set result bits from WaitLatch's API contract.
* Fix poll() implementation of WaitLatchOrSocket to notice postmaster death.Heikki Linnakangas2012-01-15
| | | | | | | When the remote end of the pipe is closed, select() reports the fd as readable, but poll() has a separate POLLHUP return code for that. Spotted by Peter Geoghegan.
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Teach unix_latch.c to use poll() where available.Tom Lane2011-08-11
| | | | | | | | | poll() is preferred over select() on platforms where both are available, because it tends to be a bit faster and it doesn't have an arbitrary limit on the range of FD numbers that can be accessed. The FD range limit does not appear to be a risk factor for any 9.1 usages, so this doesn't need to be back-patched, but we need to have it in place if we keep on expanding the uses of WaitLatch.
* Change the autovacuum launcher to use WaitLatch instead of a poll loop.Tom Lane2011-08-10
| | | | | | | | | | | | | | | | | In pursuit of this (and with the expectation that WaitLatch will be needed in more places), convert the latch field that was already added to PGPROC for sync rep into a generic latch that is activated for all PGPROC-owning processes, and change many of the standard backend signal handlers to set that latch when a signal happens. This will allow WaitLatch callers to be wakened properly by these signals. In passing, fix a whole bunch of signal handlers that had been hacked to do things that might change errno, without adding the necessary save/restore logic for errno. Also make some minor fixes in unix_latch.c, and clean up bizarre and unsafe scheme for disowning the process's latch. Much of this has to be back-patched into 9.1. Peter Geoghegan, with additional work by Tom
* Measure WaitLatch's timeout parameter in milliseconds, not microseconds.Tom Lane2011-08-09
| | | | | | | | | | | | The original definition had the problem that timeouts exceeding about 2100 seconds couldn't be specified on 32-bit machines. Milliseconds seem like sufficient resolution, and finer grain than that would be fantasy anyway on many platforms. Back-patch to 9.1 so that this aspect of the latch API won't change between 9.1 and later releases. Peter Geoghegan
* Documentation improvement and minor code cleanups for the latch facility.Tom Lane2011-08-09
| | | | | | | | | | | | | | Improve the documentation around weak-memory-ordering risks, and do a pass of general editorialization on the comments in the latch code. Make the Windows latch code more like the Unix latch code where feasible; in particular provide the same Assert checks in both implementations. Fix poorly-placed WaitLatch call in syncrep.c. This patch resolves, for the moment, concerns around weak-memory-ordering bugs in latch-related code: we have documented the restrictions and checked that existing calls meet them. In 9.2 I hope that we will install suitable memory barrier instructions in SetLatch/ResetLatch, so that their callers don't need to be quite so careful.
* Introduce a pipe between postmaster and each backend, which can be used toHeikki Linnakangas2011-07-08
| | | | | | | | | | | | | | | | | | | | | | | | | detect postmaster death. Postmaster keeps the write-end of the pipe open, so when it dies, children get EOF in the read-end. That can conveniently be waited for in select(), which allows eliminating some of the polling loops that check for postmaster death. This patch doesn't yet change all the loops to use the new mechanism, expect a follow-on patch to do that. This changes the interface to WaitLatch, so that it takes as argument a bitmask of events that it waits for. Possible events are latch set, timeout, postmaster death, and socket becoming readable or writeable. The pipe method behaves slightly differently from the kill() method previously used in PostmasterIsAlive() in the case that postmaster has died, but its parent has not yet read its exit code with waitpid(). The pipe returns EOF as soon as the process dies, but kill() continues to return true until waitpid() has been called (IOW while the process is a zombie). Because of that, change PostmasterIsAlive() to use the pipe too, otherwise WaitLatch() would return immediately with WL_POSTMASTER_DEATH, while PostmasterIsAlive() would claim it's still alive. That could easily lead to busy-waiting while postmaster is in zombie state. Peter Geoghegan with further changes by me, reviewed by Fujii Masao and Florian Pflug.
* pgindent run before PG 9.1 beta 1.Bruce Momjian2011-04-10
|
* Automatically terminate replication connections that are idle for moreHeikki Linnakangas2011-03-30
| | | | | | | | | than replication_timeout (a new GUC) milliseconds. The TCP timeout is often too long, you want the master to notice a dead connection much sooner. People complained about that in 9.0 too, but with synchronous replication it's even more important to notice dead connections promptly. Fujii Masao and Heikki Linnakangas
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Simplify Windows implementation of latches. There's no need to keep aHeikki Linnakangas2010-09-15
| | | | | | | | dynamic pool of event handles, we can permanently assign one for each shared latch. Thanks to that, we no longer need a separate shared memory block for latches, and we don't need to know in advance how many shared latches there is, so you no longer need to remember to update NumSharedLatches when you introduce a new latch to the system.
* Add a comment noting that the owner_pid test in OwnLatch is just a sanityHeikki Linnakangas2010-09-13
| | | | check, per request by Jeff Davis.
* Add missing #includes, needed on some platforms. This should makeHeikki Linnakangas2010-09-11
| | | | the unixware buildfarm animals happy again.
* Introduce latches. A latch is a boolean variable, with the capability toHeikki Linnakangas2010-09-11
wait until it is set. Latches can be used to reliably wait until a signal arrives, which is hard otherwise because signals don't interrupt select() on some platforms, and even when they do, there's race conditions. On Unix, latches use the so called self-pipe trick under the covers to implement the sleep until the latch is set, without race conditions. On Windows, Windows events are used. Use the new latch abstraction to sleep in walsender, so that as soon as a transaction finishes, walsender is woken up to immediately send the WAL to the standby. This reduces the latency between master and standby, which is good. Preliminary work by Fujii Masao. The latch implementation is by me, with helpful comments from many people.