diff options
author | Tom Lane <tgl@sss.pgh.pa.us> | 2019-03-19 16:20:20 -0400 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2019-03-19 16:20:28 -0400 |
commit | 1f39a1c0641531e0462a4822f2dba904c5d4d699 (patch) | |
tree | 125928ffda2fbd24e09508e4def3bbc1cc15b70c /src/interfaces/libpq/fe-misc.c | |
parent | 5e28b778bf9a5835e702277119c5f92b4dbab45e (diff) | |
download | postgresql-1f39a1c0641531e0462a4822f2dba904c5d4d699.tar.gz postgresql-1f39a1c0641531e0462a4822f2dba904c5d4d699.zip |
Restructure libpq's handling of send failures.
Originally, if libpq got a failure (e.g., ECONNRESET) while trying to
send data to the server, it would just report that and wash its hands
of the matter. It was soon found that that wasn't a very pleasant way
of coping with server-initiated disconnections, so we introduced a hack
(pqHandleSendFailure) in the code that sends queries to make it peek
ahead for server error reports before reporting the send failure.
It now emerges that related cases can occur during connection setup;
in particular, as of TLS 1.3 it's unsafe to assume that SSL connection
failures will be reported by SSL_connect rather than during our first
send attempt. We could have fixed that in a hacky way by applying
pqHandleSendFailure after a startup packet send failure, but
(a) pqHandleSendFailure explicitly disclaims suitability for use in any
state except query startup, and (b) the problem still potentially exists
for other send attempts in libpq.
Instead, let's fix this in a more general fashion by eliminating
pqHandleSendFailure altogether, and instead arranging to postpone
all reports of send failures in libpq until after we've made an
attempt to read and process server messages. The send failure won't
be reported at all if we find a server message or detect input EOF.
(Note: this removes one of the reasons why libpq typically overwrites,
rather than appending to, conn->errorMessage: pqHandleSendFailure needed
that behavior so that the send failure report would be replaced if we
got a server message or read failure report. Eventually I'd like to get
rid of that overwrite behavior altogether, but today is not that day.
For the moment, pqSendSome is assuming that its callees will overwrite
not append to conn->errorMessage.)
Possibly this change should get back-patched someday; but it needs
testing first, so let's not consider that till after v12 beta.
Discussion: https://postgr.es/m/CAEepm=2n6Nv+5tFfe8YnkUm1fXgvxR0Mm1FoD+QKG-vLNGLyKg@mail.gmail.com
Diffstat (limited to 'src/interfaces/libpq/fe-misc.c')
-rw-r--r-- | src/interfaces/libpq/fe-misc.c | 53 |
1 files changed, 45 insertions, 8 deletions
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c index e5ef8d44bd4..ea4c9d2ee0c 100644 --- a/src/interfaces/libpq/fe-misc.c +++ b/src/interfaces/libpq/fe-misc.c @@ -824,6 +824,13 @@ definitelyFailed: * * Return 0 on success, -1 on failure and 1 when not all data could be sent * because the socket would block and the connection is non-blocking. + * + * Upon write failure, conn->write_failed is set and the error message is + * saved in conn->write_err_msg, but we clear the output buffer and return + * zero anyway; this is because callers should soldier on until it's possible + * to read from the server and check for an error message. write_err_msg + * should be reported only when we are unable to obtain a server error first. + * (Thus, a -1 result is returned only for an internal *read* failure.) */ static int pqSendSome(PGconn *conn, int len) @@ -832,13 +839,32 @@ pqSendSome(PGconn *conn, int len) int remaining = conn->outCount; int result = 0; + /* + * If we already had a write failure, we will never again try to send data + * on that connection. Even if the kernel would let us, we've probably + * lost message boundary sync with the server. conn->write_failed + * therefore persists until the connection is reset, and we just discard + * all data presented to be written. + */ + if (conn->write_failed) + { + /* conn->write_err_msg should be set up already */ + conn->outCount = 0; + return 0; + } + if (conn->sock == PGINVALID_SOCKET) { printfPQExpBuffer(&conn->errorMessage, libpq_gettext("connection not open\n")); + conn->write_failed = true; + /* Transfer error message to conn->write_err_msg, if possible */ + /* (strdup failure is OK, we'll cope later) */ + conn->write_err_msg = strdup(conn->errorMessage.data); + resetPQExpBuffer(&conn->errorMessage); /* Discard queued data; no chance it'll ever be sent */ conn->outCount = 0; - return -1; + return 0; } /* while there's still data to send */ @@ -876,17 +902,24 @@ pqSendSome(PGconn *conn, int len) default: /* pqsecure_write set the error message for us */ + conn->write_failed = true; /* - * We used to close the socket here, but that's a bad idea - * since there might be unread data waiting (typically, a - * NOTICE message from the backend telling us it's - * committing hara-kiri...). Leave the socket open until - * pqReadData finds no more data can be read. But abandon - * attempt to send data. + * Transfer error message to conn->write_err_msg, if + * possible (strdup failure is OK, we'll cope later). + * + * Note: this assumes that pqsecure_write and its children + * will overwrite not append to conn->errorMessage. If + * that's ever changed, we could remember the length of + * conn->errorMessage at entry to this routine, and then + * save and delete just what was appended. */ + conn->write_err_msg = strdup(conn->errorMessage.data); + resetPQExpBuffer(&conn->errorMessage); + + /* Discard queued data; no chance it'll ever be sent */ conn->outCount = 0; - return -1; + return 0; } } else @@ -921,6 +954,9 @@ pqSendSome(PGconn *conn, int len) * can do, and works pretty well in practice. (The documentation * used to say that you only need to wait for write-ready, so * there are still plenty of applications like that out there.) + * + * Note that errors here don't result in write_failed becoming + * set. */ if (pqReadData(conn) < 0) { @@ -956,6 +992,7 @@ pqSendSome(PGconn *conn, int len) * * Return 0 on success, -1 on failure and 1 when not all data could be sent * because the socket would block and the connection is non-blocking. + * (See pqSendSome comments about how failure should be handled.) */ int pqFlush(PGconn *conn) |