diff options
author | Thomas Munro <tmunro@postgresql.org> | 2022-09-08 20:25:20 +1200 |
---|---|---|
committer | Thomas Munro <tmunro@postgresql.org> | 2022-09-08 21:44:55 +1200 |
commit | adb466150b44d1eaf43a2d22f58ff4c545a0ed3f (patch) | |
tree | 0c3b8dbf28df0032efefc9a3a7930269b0da33a8 /src/backend/access/transam/xlogreader.c | |
parent | 12d40d4a8d0495cf2c7b564daa8aaa7f107a6c56 (diff) | |
download | postgresql-adb466150b44d1eaf43a2d22f58ff4c545a0ed3f.tar.gz postgresql-adb466150b44d1eaf43a2d22f58ff4c545a0ed3f.zip |
Fix recovery_prefetch with low maintenance_io_concurrency.
We should process completed IOs *before* trying to start more, so that
it is always possible to decode one more record when the decoded record
queue is empty, even if maintenance_io_concurrency is set so low that a
single earlier WAL record might have saturated the IO queue.
That bug was hidden because the effect of maintenance_io_concurrency was
arbitrarily clamped to be at least 2. Fix the ordering, and also remove
that clamp. We need a special case for 0, which is now treated the same
as recovery_prefetch=off, but otherwise the number is used directly.
This allows for testing with 1, which would have made the problem
obvious in simple test scenarios.
Also add an explicit error message for missing contrecords. It was a
bit strange that we didn't report an error already, and became a latent
bug with prefetching, since the internal state that tracks aborted
contrecords would not survive retrying, as revealed by
026_overwrite_contrecord.pl with this adjustment. Reporting an error
prevents that.
Back-patch to 15.
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20220831140128.GS31833%40telsasoft.com
Diffstat (limited to 'src/backend/access/transam/xlogreader.c')
-rw-r--r-- | src/backend/access/transam/xlogreader.c | 23 |
1 files changed, 19 insertions, 4 deletions
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index cdcacc78037..cd3dd8cc5c0 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -275,22 +275,24 @@ XLogBeginRead(XLogReaderState *state, XLogRecPtr RecPtr) } /* - * See if we can release the last record that was returned by - * XLogNextRecord(), if any, to free up space. + * Release the last record that was returned by XLogNextRecord(), if any, to + * free up space. Returns the LSN past the end of the record. */ -void +XLogRecPtr XLogReleasePreviousRecord(XLogReaderState *state) { DecodedXLogRecord *record; + XLogRecPtr next_lsn; if (!state->record) - return; + return InvalidXLogRecPtr; /* * Remove it from the decoded record queue. It must be the oldest item * decoded, decode_queue_head. */ record = state->record; + next_lsn = record->next_lsn; Assert(record == state->decode_queue_head); state->record = NULL; state->decode_queue_head = record->next; @@ -336,6 +338,8 @@ XLogReleasePreviousRecord(XLogReaderState *state) state->decode_buffer_tail = state->decode_buffer; } } + + return next_lsn; } /* @@ -907,6 +911,17 @@ err: */ state->abortedRecPtr = RecPtr; state->missingContrecPtr = targetPagePtr; + + /* + * If we got here without reporting an error, report one now so that + * XLogPrefetcherReadRecord() doesn't bring us back a second time and + * clobber the above state. Otherwise, the existing error takes + * precedence. + */ + if (!state->errormsg_buf[0]) + report_invalid_record(state, + "missing contrecord at %X/%X", + LSN_FORMAT_ARGS(RecPtr)); } if (decoded && decoded->oversized) |