path: root/contrib/postgres_fdw/postgres_fdw.c
author: Tom Lane <tgl@sss.pgh.pa.us> 2019-06-12 17:29:48 -0400
committer: Tom Lane <tgl@sss.pgh.pa.us> 2019-06-12 17:29:48 -0400
commit: 9346d396fd4a643653b5f3822dbfbd9968b32679
tree: aeea1e152e415c535f550d0c78fecf0be7263ed8 /contrib/postgres_fdw/postgres_fdw.c
parent: 0b6edb9fb3d24c73b21917945e830e1c84135575
In walreceiver, don't try to do ereport() in a signal handler.

This is quite unsafe, even for the case of ereport(FATAL) where we won't return control to the interrupted code, and despite this code's use of a flag to restrict the areas where we'd try to do it. It's possible for example that we interrupt malloc or free while that's holding a lock that's meant to protect against cross-thread interference. Then, any attempt to do malloc or free within ereport() will result in a deadlock, preventing the walreceiver process from exiting in response to SIGTERM. We hypothesize that this explains some hard-to-reproduce failures seen in the buildfarm.

Hence, get rid of the immediate-exit code in WalRcvShutdownHandler, as well as the logic associated with WalRcvImmediateInterruptOK. Instead, we need to take care that potentially-blocking operations in the walreceiver's data transmission logic (libpqwalreceiver.c) will respond reasonably promptly to the process's latch becoming set and then call ProcessWalRcvInterrupts. Much of the needed code for that was already present in libpqwalreceiver.c. I refactored things a bit so that all the uses of PQgetResult use latch-aware waiting, but didn't need to do much more.

These changes should be enough to ensure that libpqwalreceiver.c will respond promptly to SIGTERM whenever it's waiting to receive data. In principle, it could block for a long time while waiting to send data too, and this patch does nothing to guard against that. I think that that hazard is mostly theoretical though: such blocking should occur only if we fill the kernel's data transmission buffers, and we don't generally send enough data to make that happen without waiting for input. If we find out that the hazard isn't just theoretical, we could fix it by using PQsetnonblocking, but that would require more ticklish changes than I care to make now.

Back-patch of commit a1a789eb5. This problem goes all the way back to the origins of walreceiver; but given the substantial reworking the module received during the v10 cycle, it seems unsafe to assume that our testing on HEAD validates this patch for pre-v10 branches. And we'd need to back-patch some prerequisite patches (at least 597a87ccc and its followups, maybe other things), increasing the risk of problems. Given the dearth of field reports matching this problem, it's not worth much risk. Hence back-patch to v10 and v11 only.

Patch by me; thanks to Thomas Munro for review.

Discussion: https://postgr.es/m/20190416070119.GK2673@paquier.xyz
Diffstat (limited to 'contrib/postgres_fdw/postgres_fdw.c')
0 files changed, 0 insertions, 0 deletions