diff options
author | Tom Lane <tgl@sss.pgh.pa.us> | 2016-05-25 12:39:57 -0400 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2016-05-25 12:40:12 -0400 |
commit | 9abd64ec997cc5f0bac485aa1585064308f73c83 (patch) | |
tree | b514e97d727d71f367268c0734ccdc3d4c578cfc /src/bin/pg_dump/pg_backup_utils.c | |
parent | 627e360358e3beb67cd2f54393835f979c5e30b7 (diff) | |
download | postgresql-9abd64ec997cc5f0bac485aa1585064308f73c83.tar.gz postgresql-9abd64ec997cc5f0bac485aa1585064308f73c83.zip |
Fix broken error handling in parallel pg_dump/pg_restore.
In the original design for parallel dump, worker processes reported errors
by sending them up to the master process, which would print the messages.
This is unworkably fragile for a couple of reasons: it risks deadlock if a
worker sends an error at an unexpected time, and if the master has already
died for some reason, the user will never get to see the error at all.
Revert that idea and go back to just always printing messages to stderr.
This approach means that if all the workers fail for similar reasons (eg,
bad password or server shutdown), the user will see N copies of that
message, not only one as before. While that's slightly annoying, it's
certainly better than not seeing any message; not to mention that we
shouldn't assume that only the first failure is interesting.
An additional problem in the same area was that the master failed to
disable SIGPIPE (at least until much too late), which meant that sending a
command to an already-dead worker would cause the master to crash silently.
That was bad enough in itself but was made worse by the total reliance on
the master to print errors: even if the worker had reported an error, you
would probably not see it, depending on timing. Instead disable SIGPIPE
right after we've forked the workers, before attempting to send them
anything.
Additionally, the master relies on seeing socket EOF to realize that a
worker has exited prematurely --- but on Windows, there would be no EOF
since the socket is attached to the process that includes both the master
and worker threads, so it remains open. Make archive_close_connection()
close the worker end of the sockets so that this acts more like the Unix
case. It's not perfect, because if a worker thread exits without going
through exit_nicely() the closures won't happen; but that's not really
supposed to happen.
This has been wrong all along, so back-patch to 9.3 where parallel dump
was introduced.
Report: <2458.1450894615@sss.pgh.pa.us>
Diffstat (limited to 'src/bin/pg_dump/pg_backup_utils.c')
-rw-r--r-- | src/bin/pg_dump/pg_backup_utils.c | 32 |
1 files changed, 31 insertions, 1 deletions
diff --git a/src/bin/pg_dump/pg_backup_utils.c b/src/bin/pg_dump/pg_backup_utils.c index 0aa13cda433..5d1d87565a2 100644 --- a/src/bin/pg_dump/pg_backup_utils.c +++ b/src/bin/pg_dump/pg_backup_utils.c @@ -93,6 +93,23 @@ vwrite_msg(const char *modulename, const char *fmt, va_list ap) vfprintf(stderr, _(fmt), ap); } +/* + * Fail and die, with a message to stderr. Parameters as for write_msg. + * + * Note that on_exit_nicely callbacks will get run. + */ +void +exit_horribly(const char *modulename, const char *fmt,...) +{ + va_list ap; + + va_start(ap, fmt); + vwrite_msg(modulename, fmt, ap); + va_end(ap); + + exit_nicely(1); +} + /* Register a callback to be run when exit_nicely is invoked. */ void on_exit_nicely(on_exit_nicely_callback function, void *arg) @@ -106,7 +123,20 @@ on_exit_nicely(on_exit_nicely_callback function, void *arg) /* * Run accumulated on_exit_nicely callbacks in reverse order and then exit - * quietly. This needs to be thread-safe. + * without printing any message. + * + * If running in a parallel worker thread on Windows, we only exit the thread, + * not the whole process. + * + * Note that in parallel operation on Windows, the callback(s) will be run + * by each thread since the list state is necessarily shared by all threads; + * each callback must contain logic to ensure it does only what's appropriate + * for its thread. On Unix, callbacks are also run by each process, but only + * for callbacks established before we fork off the child processes. (It'd + * be cleaner to reset the list after fork(), and let each child establish + * its own callbacks; but then the behavior would be completely inconsistent + * between Windows and Unix. For now, just be sure to establish callbacks + * before forking to avoid inconsistency.) */ void exit_nicely(int code) |