| Commit message (Collapse) | Author | Age |
|
|
|
|
| |
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
|
|
|
|
| |
Commit 81107282a changed it in assert.c, but overlooked this other file.
|
|
|
|
|
|
|
|
|
|
|
|
| |
The behavior with larger values is unspecified by the Single Unix Spec.
It appears that BSD-derived kernels report EINVAL, although Linux does not.
If waiting for longer intervals is desired, the calling code has to do
something to limit the delay; we can't portably fix it here since "long"
may not be any wider than "int" in the first place.
Part of response to bug #7670, though this change doesn't fix that
(in fact, it converts the problem from an ERROR into an Assert failure).
No back-patch since it's just an assertion addition.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the sleep is interrupted by a signal, we must recompute the remaining
time to wait; otherwise, a steady stream of non-wait-terminating interrupts
could delay return from WaitLatch indefinitely. This has been shown to be
a problem for the autovacuum launcher, and there may well be other places
now or in the future with similar issues. So we'd better make the function
robust, even though this'll add at least one gettimeofday call per wait.
Back-patch to 9.2. We might eventually need to fix 9.1 as well, but the
code is quite different there, and the usage of WaitLatch in 9.1 is so
limited that it's not clearly important to do so.
Reported and diagnosed by Jeff Janes, though I rewrote his patch rather
heavily.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the previous coding, new backend processes would attempt to create their
self-pipe during the OwnLatch call in InitProcess. However, pipe creation
could fail if the kernel is short of resources; and the system does not
recover gracefully from a FATAL error right there, since we have armed the
dead-man switch for this process and not yet set up the on_shmem_exit
callback that would disarm it. The postmaster then forces an unnecessary
database-wide crash and restart, as reported by Sean Chittenden.
There are various ways we could rearrange the code to fix this, but the
simplest and sanest seems to be to split out creation of the self-pipe into
a new function InitializeLatchSupport, which must be called from a place
where failure is allowed. For most processes that gets called in
InitProcess or InitAuxiliaryProcess, but processes that don't call either
but still use latches need their own calls.
Back-patch to 9.1, which has only a part of the latch logic that 9.2 and
HEAD have, but nonetheless includes this bug.
|
|
|
|
|
|
| |
Since the request size will now be ~48 bytes regardless of how
shared_buffers et. al. are set, much of this advice is no longer
relevant.
|
|
|
|
|
|
|
|
|
|
|
| |
Per testing by Andres Freund, this improves replication performance
and reduces replication latency and latency jitter. I was a bit
concerned about moving more work into XLogInsert, but testing seems
to show that it's not a problem in practice.
Along the way, improve comments for WaitLatchOrSocket.
Andres Freund. Review and stylistic cleanup by me.
|
|
|
|
|
|
|
|
| |
The original coding had it as "PGShmemHeader *", but that doesn't offer any
notational benefit because we don't dereference it. And it was resulting
in compiler warnings on some platforms, notably buildfarm member
castoroides, where mmap() and munmap() are evidently declared to take and
return "char *".
|
|
|
|
| |
Noted by Andres Freund. Also improve a couple of comments.
|
|
|
|
|
|
| |
On old HPUX this has to be #defined to -1. It might be that other values
are required on other dinosaur systems, but we'll worry about that when
and if we get reports.
|
|
|
|
|
| |
Per an observation by Thom Brown that my previous commit made an
overly large shmem allocation crash the server, on Linux.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Except when compiling with EXEC_BACKEND, we'll now allocate only a tiny
amount of System V shared memory (as an interlock to protect the data
directory) and allocate the rest as anonymous shared memory via mmap.
This will hopefully spare most users the hassle of adjusting operating
system parameters before being able to start PostgreSQL with a
reasonable value for shared_buffers.
There are a bunch of documentation updates needed here, and we might
need to adjust some of the HINT messages related to shared memory as
well. But it's not 100% clear how portable this is, so before we
write the documentation, let's give it a spin on the buildfarm and
see what turns red.
|
|
|
|
| |
commit-fest.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we have chosen to report socket EOF and error conditions via the
WL_SOCKET_READABLE flag bit, it's unsafe to wait only for
WL_SOCKET_WRITEABLE; the caller would never be notified of the socket
condition, and in some of these implementations WaitLatchOrSocket would
busy-wait until something else happens. Add this restriction to the API
specification, and add Asserts to check that callers don't try to do that.
At some point we might want to consider adjusting the API to relax this
restriction, but until we have an actual use case for waiting on a
write-only socket, it seems premature to design a solution.
|
|
|
|
|
| |
These should have been removed when the BeOS port was removed in
44f90212236bfb6fc1279e95dc8fa315104d964e.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure WaitLatchOrSocket regards FD_CLOSE as a read-ready condition.
We might want to tweak this further, but it was surely wrong as-is.
Make pgwin32_waitforsinglesocket detach its private event object from the
passed socket before returning. I suspect that failure to do so leads
to race conditions when other code (such as WaitLatchOrSocket) attaches
a different event object to the same socket. Moreover, the existing
coding meant that repeated calls to pgwin32_waitforsinglesocket would
perform ResetEvent on an event actively connected to a socket, which
is rumored to be an unsafe practice; the WSAEventSelect documentation
appears to recommend against this, though it does not say not to do it
in so many words.
Also, uniformly use the coding pattern "WSAEventSelect(s, NULL, 0)" to
detach events from sockets, rather than passing the event in the second
parameter. The WSAEventSelect documentation says that the second parameter
is ignored if the third is 0, so theoretically this should make no
difference. However, elsewhere on the same reference page the use of NULL
in this context is recommended, and I have found suggestions on the net
that some versions of Windows have bugs with a non-NULL second parameter
in this usage.
Some other mostly-cosmetic cleanup, such as using the right one of
WSAGetLastError and GetLastError for reporting errors from these functions.
|
|
|
|
|
|
|
|
|
|
| |
When using poll(), EOF on a socket is reported with the POLLHUP not
POLLIN flag (at least on Linux). WaitLatchOrSocket failed to check
this bit, causing it to go into a busy-wait loop if EOF occurs.
We earlier fixed the same mistake in the test for the state of the
postmaster_alive socket, but missed it for the caller-supplied socket.
Fortunately, this error is new in 9.2, since 9.1 only had a select()
based code path not a poll() based one.
|
|
|
|
|
|
|
|
| |
Per a suggestion from Peter Geoghegan, make WaitLatch responsible for
verifying that the WL_POSTMASTER_DEATH bit it returns is truthful (by
testing PostmasterIsAlive). Then simplify its callers, who no longer
need to do that for themselves. Remove weasel wording about falsely-set
result bits from WaitLatch's API contract.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original coding failed to reset ImmediateInterruptOK before returning,
which would potentially allow a subsequent query-cancel interrupt to be
accepted at an unsafe point. This is a really nasty bug since it's so hard
to predict the consequences, but they could be unpleasant.
Also, ensure that signal handlers are serviced before this function
returns, even if the semaphore is already set. This should make the
behavior more like Unix.
Back-patch to all supported versions.
|
|
|
|
|
|
|
|
|
|
| |
Ensure that signal handlers are serviced before this function returns.
This should make the behavior more like Unix. Also, add some more
error checking, and make some other cosmetic improvements.
No back-patch since it's not clear whether this is fixing any live bug
that would affect 9.1. I'm more concerned about 9.2 anyway given our
considerable recent expansions in the usage of WaitLatch.
|
|
|
|
| |
Postgres 9.2, and perhaps no existing users either.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the following ports:
- dgux
- nextstep
- sunos4
- svr4
- ultrix4
- univel
These are obsolete and not worth rescuing. In most cases, there is
circumstantial evidence that they wouldn't work anymore anyway.
|
|
|
|
| |
Josh Kupershmidt
|
| |
|
| |
|
|
|
|
|
|
|
| |
When the remote end of the pipe is closed, select() reports the fd as
readable, but poll() has a separate POLLHUP return code for that.
Spotted by Peter Geoghegan.
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
Use something like "error code %lu" for reporting GetLastError()
values on Windows. Previously, a mix of different wordings and
formats were in use.
|
|
|
|
|
|
|
|
|
| |
poll() is preferred over select() on platforms where both are available,
because it tends to be a bit faster and it doesn't have an arbitrary limit
on the range of FD numbers that can be accessed. The FD range limit does
not appear to be a risk factor for any 9.1 usages, so this doesn't need to
be back-patched, but we need to have it in place if we keep on expanding
the uses of WaitLatch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In pursuit of this (and with the expectation that WaitLatch will be needed
in more places), convert the latch field that was already added to PGPROC
for sync rep into a generic latch that is activated for all PGPROC-owning
processes, and change many of the standard backend signal handlers to set
that latch when a signal happens. This will allow WaitLatch callers to be
wakened properly by these signals.
In passing, fix a whole bunch of signal handlers that had been hacked to do
things that might change errno, without adding the necessary save/restore
logic for errno. Also make some minor fixes in unix_latch.c, and clean
up bizarre and unsafe scheme for disowning the process's latch. Much of
this has to be back-patched into 9.1.
Peter Geoghegan, with additional work by Tom
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original definition had the problem that timeouts exceeding about 2100
seconds couldn't be specified on 32-bit machines. Milliseconds seem like
sufficient resolution, and finer grain than that would be fantasy anyway
on many platforms.
Back-patch to 9.1 so that this aspect of the latch API won't change between
9.1 and later releases.
Peter Geoghegan
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improve the documentation around weak-memory-ordering risks, and do a pass
of general editorialization on the comments in the latch code. Make the
Windows latch code more like the Unix latch code where feasible; in
particular provide the same Assert checks in both implementations.
Fix poorly-placed WaitLatch call in syncrep.c.
This patch resolves, for the moment, concerns around weak-memory-ordering
bugs in latch-related code: we have documented the restrictions and checked
that existing calls meet them. In 9.2 I hope that we will install suitable
memory barrier instructions in SetLatch/ResetLatch, so that their callers
don't need to be quite so careful.
|
|
|
|
|
| |
They are identical, but the overwhelming majority of the code uses %d,
so standardize on that.
|
|
|
|
|
| |
It is set correctly on the only path that uses it, but the
compiler can't know that.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
detect postmaster death. Postmaster keeps the write-end of the pipe open,
so when it dies, children get EOF in the read-end. That can conveniently
be waited for in select(), which allows eliminating some of the polling
loops that check for postmaster death. This patch doesn't yet change all
the loops to use the new mechanism, expect a follow-on patch to do that.
This changes the interface to WaitLatch, so that it takes as argument a
bitmask of events that it waits for. Possible events are latch set, timeout,
postmaster death, and socket becoming readable or writeable.
The pipe method behaves slightly differently from the kill() method
previously used in PostmasterIsAlive() in the case that postmaster has died,
but its parent has not yet read its exit code with waitpid(). The pipe
returns EOF as soon as the process dies, but kill() continues to return
true until waitpid() has been called (IOW while the process is a zombie).
Because of that, change PostmasterIsAlive() to use the pipe too, otherwise
WaitLatch() would return immediately with WL_POSTMASTER_DEATH, while
PostmasterIsAlive() would claim it's still alive. That could easily lead to
busy-waiting while postmaster is in zombie state.
Peter Geoghegan with further changes by me, reviewed by Fujii Masao and
Florian Pflug.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
For consistency, have all non-ASCII characters from contributors'
names in the source be in UTF-8. But remove some other more
gratuitous uses of non-ASCII characters.
|
|
|
|
|
|
| |
Also, we removed the display of the current value of
max_connections/MaxBackends from some messages earlier, because it was
confusing, so do that in the remaining one as well.
|
| |
|
|
|
|
| |
Mostly to do with macro redefinitions or object signedness.
|
|
|
|
|
|
|
| |
Most of these cast DWORD to int or unsigned int for printf type handling.
This is safe even on 64 bit architectures because a DWORD is always 32 bits.
In one case a variable is initialised to keep the compiler happy.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Remove the hard-wired assumption that __mips__ (and only __mips__) lacks
dlopen in FreeBSD and OpenBSD. This assumption is outdated at least for
OpenBSD, as per report from an anonymous 9.1 tester. We can perfectly well
use HAVE_DLOPEN instead to decide which code to use.
Some other cosmetic adjustments to make freebsd.c, netbsd.c, and openbsd.c
exactly alike.
|
|
|
|
| |
Josh Kupershmidt
|
|
|
|
|
|
|
|
|
| |
than replication_timeout (a new GUC) milliseconds. The TCP timeout is often
too long, you want the master to notice a dead connection much sooner.
People complained about that in 9.0 too, but with synchronous replication
it's even more important to notice dead connections promptly.
Fujii Masao and Heikki Linnakangas
|