aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access/transam/xlog.c
Commit message (Collapse)AuthorAge
* Rename two functions that wake up other processesHeikki Linnakangas2024-11-01
| | | | | | | | | | | Instead of talking about setting latches, which is a pretty low-level mechanism, emphasize that they wake up other processes. This is in preparation for replacing Latches with a new abstraction. That's still work in progress, but this seems a little tidier anyway, so let's get this refactoring out of the way already. Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
* Use ProcNumbers instead of direct Latch pointers to address other procsHeikki Linnakangas2024-11-01
| | | | | | | | This is in preparation for replacing Latches with a new abstraction. That's still work in progress, but this seems a little tidier anyway, so let's get this refactoring out of the way already. Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
* Fix WAL_DEBUG buildPeter Eisentraut2024-10-28
| | | | | | broken by commit e18512c000e Reported-by: Peter Geoghegan <pg@bowt.ie>
* Remove unused #include's from backend .c filesPeter Eisentraut2024-10-27
| | | | | | | | as determined by IWYU These are mostly issues that are new since commit dbbca2cf299. Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
* Move LSN waiting declarations and definitions to better placeAlexander Korotkov2024-10-24
| | | | | | | | | | | | | | | | | | | 3c5db1d6b implemented the pg_wal_replay_wait() stored procedure. Due to the patch development history, the implementation resided in src/backend/commands/waitlsn.c (src/include/commands/waitlsn.h for headers). 014f9f34d moved pg_wal_replay_wait() itself to src/backend/access/transam/xlogfuncs.c near to the WAL-manipulation functions. But most of the implementation stayed in place. The code in src/backend/commands/waitlsn.c has nothing to do with commands, but is related to WAL. So, this commit moves this code into src/backend/access/transam/xlogwait.c (src/include/access/xlogwait.h for headers). Reported-by: Peter Eisentraut Discussion: https://postgr.es/m/18c0fa64-0475-415e-a1bd-665d922c5201%40eisentraut.org Reviewed-by: Pavel Borisov
* Fix inconsistent reporting of checkpointer stats.Fujii Masao2024-10-02
| | | | | | | | | | | | | | | | | | | | | | | Previously, the pg_stat_checkpointer view and the checkpoint completion log message could show different numbers for buffers written during checkpoints. The view only counted shared buffers, while the log message included both shared and SLRU buffers, causing inconsistencies. This commit resolves the issue by updating both the view and the log message to separately report shared and SLRU buffers written during checkpoints. A new slru_written column is added to the pg_stat_checkpointer view to track SLRU buffers, while the existing buffers_written column now tracks only shared buffers. This change would help users distinguish between the two types of buffers, in the pg_stat_checkpointer view and the checkpoint complete log message, respectively. Bump catalog version. Author: Nitin Jadhav Reviewed-by: Bharath Rupireddy, Michael Paquier, Kyotaro Horiguchi, Robert Haas Reviewed-by: Andres Freund, vignesh C, Fujii Masao Discussion: https://postgr.es/m/CAMm1aWb18EpT0whJrjG+-nyhNouXET6ZUw0pNYYAe+NezpvsAA@mail.gmail.com
* Add num_done counter to the pg_stat_checkpointer view.Fujii Masao2024-09-30
| | | | | | | | | | | | | | | Checkpoints can be skipped when the server is idle. The existing num_timed and num_requested counters in pg_stat_checkpointer track both completed and skipped checkpoints, but there was no way to count only the completed ones. This commit introduces the num_done counter, which tracks only completed checkpoints, making it easier to see how many were actually performed. Bump catalog version. Author: Anton A. Melnikov Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/9ea77f40-818d-4841-9dee-158ac8f6e690@oss.nttdata.com
* Minor cleanup related to pg_wal_replay_wait() procedureAlexander Korotkov2024-09-17
| | | | | | | | | | | | * Rename $node_standby1 to $node_standby in 043_wal_replay_wait.pl as there is only one standby. * Remove useless debug printing in 043_wal_replay_wait.pl. * Fix typo in one check description in 043_wal_replay_wait.pl. * Fix some wording in comments and documentation. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/1d7b08f2-64a2-77fb-c666-c9a74c68eeda%40gmail.com Reviewed-by: Alexander Lakhin
* Unify some error messages to ease work of translatorsMichael Paquier2024-09-04
| | | | | | | | | This commit updates a couple of error messages around control file data, GUCs and server settings, unifying to the same message where possible. This reduces the translation burden a bit. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
* Apply more quoting to GUC names in messagesMichael Paquier2024-09-04
| | | | | | | | | This is a continuation of 17974ec25946. More quotes are applied to GUC names in error messages and hints, taking care of what seems to be all the remaining holes currently in the tree for the GUCs. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
* Add const qualifiers to XLogRegister*() functionsPeter Eisentraut2024-09-03
| | | | | | | | Add const qualifiers to XLogRegisterData() and XLogRegisterBufData(). Several unconstify() calls can be removed. Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://www.postgresql.org/message-id/dd889784-9ce7-436a-b4f1-52e4a5e577bd@eisentraut.org
* Define PG_TBLSPC_DIR for path pg_tblspc/ in data folderMichael Paquier2024-09-03
| | | | | | | | | | | | | | Similarly to 2065ddf5e34c, this introduces a define for "pg_tblspc". This makes the style more consistent with the existing PG_STAT_TMP_DIR, for example. There is a difference with the other cases with the introduction of PG_TBLSPC_DIR_SLASH, required in two places for recovery and backups. Author: Bertrand Drouvot Reviewed-by: Ashutosh Bapat, Álvaro Herrera, Yugo Nagata, Michael Paquier Discussion: https://postgr.es/m/ZryVvjqS9SnV1GPP@ip-10-97-1-34.eu-west-3.compute.internal
* Implement pg_wal_replay_wait() stored procedureAlexander Korotkov2024-08-02
| | | | | | | | | | | | | | | | | | | | | | | | | | pg_wal_replay_wait() is to be used on standby and specifies waiting for the specific WAL location to be replayed. This option is useful when the user makes some data changes on primary and needs a guarantee to see these changes are on standby. The queue of waiters is stored in the shared memory as an LSN-ordered pairing heap, where the waiter with the nearest LSN stays on the top. During the replay of WAL, waiters whose LSNs have already been replayed are deleted from the shared memory pairing heap and woken up by setting their latches. pg_wal_replay_wait() needs to wait without any snapshot held. Otherwise, the snapshot could prevent the replay of WAL records, implying a kind of self-deadlock. This is why it is only possible to implement pg_wal_replay_wait() as a procedure working without an active snapshot, not a function. Catversion is bumped. Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru Author: Kartyshov Ivan, Alexander Korotkov Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira Reviewed-by: Heikki Linnakangas, Kyotaro Horiguchi
* Add redo LSN to pgstats filesMichael Paquier2024-08-02
| | | | | | | | | | | | | | | | This is used in the startup process to check that the pgstats file we are reading includes the redo LSN referring to the shutdown checkpoint where it has been written. The redo LSN in the pgstats file needs to match with what the control file has. This is intended to be used for an upcoming change that will extend the write of the stats file to happen during checkpoints, rather than only shutdown sequences. Bump PGSTAT_FILE_FORMAT_ID. Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/Zp8o6_cl0KSgsnvS@paquier.xyz
* Replace static buf with a stack-allocated one in ReadControlFileHeikki Linnakangas2024-07-30
| | | | | | | It's only used very locally within the function. Reviewed-by: Robert Haas Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
* Replace static buf with palloc in str_time()Heikki Linnakangas2024-07-30
| | | | | | | | | | The function is used only once in the startup process, so the leak into current memory context is harmless. This is a tiny step in making the server thread-safe. Reviewed-by: Robert Haas Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
* Wait for WAL summarization to catch up before creating .partial file.Robert Haas2024-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a standby is promoted, CleanupAfterArchiveRecovery() may decide to rename the final WAL file from the old timeline by adding ".partial" to the name. If WAL summarization is enabled and this file is renamed before its partial contents are summarized, WAL summarization breaks: the summarizer gets stuck at that point in the WAL stream and just errors out. To fix that, first make the startup process wait for WAL summarization to catch up before renaming the file. Generally, this should be quick, and if it's not, the user can shut off summarize_wal and try again. To make this fix work, also teach the WAL summarizer that after a promotion has occurred, no more WAL can appear on the previous timeline: previously, the WAL summarizer wouldn't switch to the new timeline until we actually started writing WAL there, but that meant that when the startup process was waiting for the WAL summarizer, it was waiting for an action that the summarizer wasn't yet prepared to take. In the process of fixing these bugs, I realized that the logic to wait for WAL summarization to catch up was spread out in a way that made it difficult to reuse properly, so this code refactors things to make it easier. Finally, add a test case that would have caught this bug and the previously-fixed bug that WAL summarization sometimes needs to back up when the timeline changes. Discussion: https://postgr.es/m/CA+TgmoZGEsZodXC4f=XZNkAeyuDmWTSkpkjCEOcF19Am0mt_OA@mail.gmail.com
* Get rid of a global variablePeter Eisentraut2024-07-23
| | | | | | | | bootstrap_data_checksum_version can just as easily be passed to where it is used via function arguments. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
* Initialize wal_level in the initial checkpoint record.Robert Haas2024-07-22
| | | | | | As per Coverity and Tom Lane, commit 402b586d0 (back-patched to v17 as 2b5819e2b) forgot to initialize this new structure member in this code path.
* Do not summarize WAL if generated with wal_level=minimal.Robert Haas2024-07-18
| | | | | | | | | | | | | | | | | | | | | | | | | | To do this, we must include the wal_level in the first WAL record covered by each summary file; so add wal_level to struct Checkpoint and the payload of XLOG_CHECKPOINT_REDO and XLOG_END_OF_RECOVERY. This, in turn, requires bumping XLOG_PAGE_MAGIC and, since the Checkpoint is also stored in the control file, also PG_CONTROL_VERSION. It's not great to do that so late in the release cycle, but the alternative seems to ship v17 without robust protections against this scenario, which could result in corrupted incremental backups. A side effect of this patch is that, when a server with wal_level=replica is started with summarize_wal=on for the first time, summarization will no longer begin with the oldest WAL that still exists in pg_wal, but rather from the first checkpoint after that. This change should be harmless, because a WAL summary for a partial checkpoint cycle can never make an incremental backup possible when it would otherwise not have been. Report by Fujii Masao. Patch by me. Review and/or testing by Jakub Wartak and Fujii Masao. Discussion: http://postgr.es/m/6e30082e-041b-4e31-9633-95a66de76f5d@oss.nttdata.com
* Use safe string copy routineDaniel Gustafsson2024-07-02
| | | | | | | | | | | Using memcpy with strlen as the size parameter will not take the NULL terminator into account, relying instead on the destination buffer being properly initialized. Replace with strlcpy which is a safer alternative, and more in line with how we handle copying strings elsewhere. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQApAsbLsQ+gGiw-hT+JwGhgogFa_=5NUkgFO6kOPxyNidQ@mail.gmail.com
* Prevent summarizer hang when summarize_wal turned off and back on.Robert Haas2024-06-28
| | | | | | | | | | | | | | | Before this commit, when the WAL summarizer started up or recovered from an error, it would resume summarization from wherever it left off. That was OK normally, but wrong if summarize_wal=off had been turned off temporary, allowing some WAL to be removed, and then turned back on again. In such cases, the WAL summarizer would simply hang forever. This commit changes the reinitialization sequence for WAL summarizer to rederive the starting position in the way we were already doing at initial startup, fixing the problem. Per report from Israel Barth Rubio. Reviewed by Tom Lane. Discussion: http://postgr.es/m/CA+TgmoYN6x=YS+FoFOS6=nr6=qkXZFWhdiL7k0oatGwug2hcuA@mail.gmail.com
* Fix MVCC bug with prepared xact with subxacts on standbyHeikki Linnakangas2024-06-27
| | | | | | | | | | | | | | | | | | We did not recover the subtransaction IDs of prepared transactions when starting a hot standby from a shutdown checkpoint. As a result, such subtransactions were considered as aborted, rather than in-progress. That would lead to hint bits being set incorrectly, and the subtransactions suddenly becoming visible to old snapshots when the prepared transaction was committed. To fix, update pg_subtrans with prepared transactions's subxids when starting hot standby from a shutdown checkpoint. The snapshots taken from that state need to be marked as "suboverflowed", so that we also check the pg_subtrans. Backport to all supported versions. Discussion: https://www.postgresql.org/message-id/6b852e98-2d49-4ca1-9e95-db419a2696e0@iki.fi
* Fix bugs in MultiXact truncationHeikki Linnakangas2024-06-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. TruncateMultiXact() performs the SLRU truncations in a critical section. Deleting the SLRU segments calls ForwardSyncRequest(), which will try to compact the request queue if it's full (CompactCheckpointerRequestQueue()). That in turn allocates memory, which is not allowed in a critical section. Backtrace: TRAP: failed Assert("CritSectionCount == 0 || (context)->allowInCritSection"), File: "../src/backend/utils/mmgr/mcxt.c", Line: 1353, PID: 920981 postgres: autovacuum worker template0(ExceptionalCondition+0x6e)[0x560a501e866e] postgres: autovacuum worker template0(+0x5dce3d)[0x560a50217e3d] postgres: autovacuum worker template0(ForwardSyncRequest+0x8e)[0x560a4ffec95e] postgres: autovacuum worker template0(RegisterSyncRequest+0x2b)[0x560a50091eeb] postgres: autovacuum worker template0(+0x187b0a)[0x560a4fdc2b0a] postgres: autovacuum worker template0(SlruDeleteSegment+0x101)[0x560a4fdc2ab1] postgres: autovacuum worker template0(TruncateMultiXact+0x2fb)[0x560a4fdbde1b] postgres: autovacuum worker template0(vac_update_datfrozenxid+0x4b3)[0x560a4febd2f3] postgres: autovacuum worker template0(+0x3adf66)[0x560a4ffe8f66] postgres: autovacuum worker template0(AutoVacWorkerMain+0x3ed)[0x560a4ffe7c2d] postgres: autovacuum worker template0(+0x3b1ead)[0x560a4ffecead] postgres: autovacuum worker template0(+0x3b620e)[0x560a4fff120e] postgres: autovacuum worker template0(+0x3b3fbb)[0x560a4ffeefbb] postgres: autovacuum worker template0(+0x2f724e)[0x560a4ff3224e] /lib/x86_64-linux-gnu/libc.so.6(+0x27c8a)[0x7f62cc642c8a] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7f62cc642d45] postgres: autovacuum worker template0(_start+0x21)[0x560a4fd16f31] To fix, bail out in CompactCheckpointerRequestQueue() without doing anything, if it's called in a critical section. That covers the above call path, as well as any other similar cases where RegisterSyncRequest might be called in a critical section. 2. After fixing that, another problem became apparent: Autovacuum process doing that truncation can deadlock with the checkpointer process. TruncateMultiXact() sets "MyProc->delayChkptFlags |= DELAY_CHKPT_START". If the sync request queue is full and cannot be compacted, the process will repeatedly sleep and retry, until there is room in the queue. However, if the checkpointer is trying to start a checkpoint at the same time, and is waiting for the DELAY_CHKPT_START processes to finish, the queue will never shrink. More concretely, the autovacuum process is stuck here: #0 0x00007fc934926dc3 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x000056220b24348b in WaitEventSetWaitBlock (set=0x56220c2e4b50, occurred_events=0x7ffe7856d040, nevents=1, cur_timeout=<optimized out>) at ../src/backend/storage/ipc/latch.c:1570 #2 WaitEventSetWait (set=0x56220c2e4b50, timeout=timeout@entry=10, occurred_events=<optimized out>, occurred_events@entry=0x7ffe7856d040, nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=150994949) at ../src/backend/storage/ipc/latch.c:1516 #3 0x000056220b243224 in WaitLatch (latch=<optimized out>, latch@entry=0x0, wakeEvents=wakeEvents@entry=40, timeout=timeout@entry=10, wait_event_info=wait_event_info@entry=150994949) at ../src/backend/storage/ipc/latch.c:538 #4 0x000056220b26cf46 in RegisterSyncRequest (ftag=ftag@entry=0x7ffe7856d0a0, type=type@entry=SYNC_FORGET_REQUEST, retryOnError=true) at ../src/backend/storage/sync/sync.c:614 #5 0x000056220af9db0a in SlruInternalDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1495 #6 0x000056220af9dab1 in SlruDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1566 #7 0x000056220af98e1b in PerformMembersTruncation (oldestOffset=<optimized out>, newOldestOffset=<optimized out>) at ../src/backend/access/transam/multixact.c:3006 #8 TruncateMultiXact (newOldestMulti=newOldestMulti@entry=3221225472, newOldestMultiDB=newOldestMultiDB@entry=4) at ../src/backend/access/transam/multixact.c:3201 #9 0x000056220b098303 in vac_truncate_clog (frozenXID=749, minMulti=<optimized out>, lastSaneFrozenXid=749, lastSaneMinMulti=3221225472) at ../src/backend/commands/vacuum.c:1917 #10 vac_update_datfrozenxid () at ../src/backend/commands/vacuum.c:1760 #11 0x000056220b1c3f76 in do_autovacuum () at ../src/backend/postmaster/autovacuum.c:2550 #12 0x000056220b1c2c3d in AutoVacWorkerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/autovacuum.c:1569 and the checkpointer is stuck here: #0 0x00007fc9348ebf93 in clock_nanosleep () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007fc9348fe353 in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x000056220b40ecb4 in pg_usleep (microsec=microsec@entry=10000) at ../src/port/pgsleep.c:50 #3 0x000056220afb43c3 in CreateCheckPoint (flags=flags@entry=108) at ../src/backend/access/transam/xlog.c:7098 #4 0x000056220b1c6e86 in CheckpointerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/checkpointer.c:464 To fix, add AbsorbSyncRequests() to the loops where the checkpointer waits for DELAY_CHKPT_START or DELAY_CHKPT_COMPLETE operations to finish. Backpatch to v14. Before that, SLRU deletion didn't call RegisterSyncRequest, which avoided this failure. I'm not sure if there are other similar scenarios on older versions, but we haven't had any such reports. Discussion: https://www.postgresql.org/message-id/ccc66933-31c1-4f6a-bf4b-45fef0d4f22e@iki.fi
* Revise GUC names quoting in messages againPeter Eisentraut2024-05-17
| | | | | | | | | | | | | | | After further review, we want to move in the direction of always quoting GUC names in error messages, rather than the previous (PG16) wildly mixed practice or the intermittent (mid-PG17) idea of doing this depending on how possibly confusing the GUC name is. This commit applies appropriate quotes to (almost?) all mentions of GUC names in error messages. It partially supersedes a243569bf65 and 8d9978a7176, which had moved things a bit in the opposite direction but which then were abandoned in a partial state. Author: Peter Smith <smithpb2250@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w%40mail.gmail.com
* Fix an assortment of typosDavid Rowley2024-05-04
| | | | | Author: Alexander Lakhin Discussion: https://postgr.es/m/ae9f2fcb-4b24-5bb0-4240-efbbbd944ca1@gmail.com
* Revert: Implement pg_wal_replay_wait() stored procedureAlexander Korotkov2024-04-11
| | | | | | | This commit reverts 06c418e163, e37662f221, bf1e650806, 25f42429e2, ee79928441, and 74eaf66f98 per review by Heikki Linnakangas. Discussion: https://postgr.es/m/b155606b-e744-4218-bda5-29379779da1a%40iki.fi
* Add XLogCtl->logInsertResultAlvaro Herrera2024-04-07
| | | | | | | | | | | | | | | | | | This tracks the position of WAL that's been fully copied into WAL buffers by all processes emitting WAL. (For some reason we call that "WAL insertion"). This is updated using atomic monotonic advance during WaitXLogInsertionsToFinish, which is not when the insertions actually occur, but it's the only place where we know where have all the insertions have completed. This value is useful in WALReadFromBuffers, which can verify that callers don't try to read past what has been inserted. (However, more infrastructure is needed in order to actually use WAL after the flush point, since it could be lost.) The value is also useful in WaitXLogInsertionsToFinish() itself, since we can now exit quickly when all WAL has been already inserted, without even having to take any locks.
* Operate XLogCtl->log{Write,Flush}Result with atomicsAlvaro Herrera2024-04-05
| | | | | | | | | | | | | | | | | This removes the need to hold both the info_lck spinlock and WALWriteLock to update them. We use stock atomic write instead, with WALWriteLock held. Readers can use atomic read, without any locking. This allows for some code to be reordered: some places were a bit contorted to avoid repeated spinlock acquisition, but that's no longer a concern, so we can turn them to more natural coding. Some further changes are possible (maybe to performance wins), but in this commit I did rather minimal ones only, to avoid increasing the blast radius. Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Jeff Davis <pgsql@j-davis.com> Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions) Discussion: https://postgr.es/m/20200831182156.GA3983@alvherre.pgsql
* Split XLogCtl->LogwrtResult into separate struct membersAlvaro Herrera2024-04-03
| | | | | | | | | | | | | | | | | | | After this change we have XLogCtl->logWriteResult and ->logFlushResult. There's no functional change, other than the fact that the assignment from shared memory to local is no longer done via struct assignment, but instead using a macro that copies each member separately. The current representation is inconvenient going forward; notably, we would like to add a new member "Copy" (to keep track of the last position copied into WAL buffers), so the symmetry between the values in shared memory vs. those in local would be lost. This also gives us freedom to later change the concurrency model for the values in shared memory: we can make them use atomics instead of relying on the info_lck spinlock. Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/202404031119.cd2kugjk2vho@alvherre.pgsql
* Add error codes to some PANIC/FATAL errors reportsDaniel Gustafsson2024-04-03
| | | | | | | | | | | This adds errcodes to a set of PANIC and FATAL errors in xlog.c and relcache.c, which previously had no errcode at all set, in order to make fleetwide analysis of errorlogs easier. There are many more ereport/elogs left which could benefit from having an errcode but this at least makes a dent in the issue. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CAN55FZ1k8LgLEqncPGmz_fWnrobV6bjABOTH4tOWta6xNcPQig@mail.gmail.com
* Implement pg_wal_replay_wait() stored procedureAlexander Korotkov2024-04-02
| | | | | | | | | | | | | | | | | | | | | | | | pg_wal_replay_wait() is to be used on standby and specifies waiting for the specific WAL location to be replayed before starting the transaction. This option is useful when the user makes some data changes on primary and needs a guarantee to see these changes on standby. The queue of waiters is stored in the shared memory array sorted by LSN. During replay of WAL waiters whose LSNs are already replayed are deleted from the shared memory array and woken up by setting of their latches. pg_wal_replay_wait() needs to wait without any snapshot held. Otherwise, the snapshot could prevent the replay of WAL records implying a kind of self-deadlock. This is why it is only possible to implement pg_wal_replay_wait() as a procedure working in a non-atomic context, not a function. Catversion is bumped. Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru Author: Kartyshov Ivan, Alexander Korotkov Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira
* Remove unused #include's from backend .c filesPeter Eisentraut2024-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | as determined by include-what-you-use (IWYU) While IWYU also suggests to *add* a bunch of #include's (which is its main purpose), this patch does not do that. In some cases, a more specific #include replaces another less specific one. Some manual adjustments of the automatic result: - IWYU currently doesn't know about includes that provide global variable declarations (like -Wmissing-variable-declarations), so those includes are being kept manually. - All includes for port(ability) headers are being kept for now, to play it safe. - No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the patch from exploding in size. Note that this patch touches just *.c files, so nothing declared in header files changes in hidden ways. As a small example, in src/backend/access/transam/rmgr.c, some IWYU pragma annotations are added to handle a special case there. Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
* Add regression test for restart points during promotionMichael Paquier2024-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | This test serves as a way to demonstrate how to use the features introduced in 37b369dc67bc, while providing coverage for 7863ee4def65 that caused the startup process to throw "PANIC: could not locate a valid checkpoint record" when starting recovery. The test checks that a node is able to properly restart following a crash when a restart point was finishing across a promotion, with an injection point added in the middle of CreateRestartPoint() to stop the restartpoint in flight. Note that this test fails when 7863ee4def65 is reverted. Kyotaro Horiguchi is the original author of this test, that has been originally posted on the thread where 7863ee4def65 was discussed. I have just upgraded and polished it to rely on injection points, making it much cheaper to reproduce the failure. This test requires injection points to be enabled in the builds, hence meson and ./configure need an update to pass this knowledge down to the test. The name of the new injection point follows the same naming convention as 6a1ea02c491d. The Makefile's EXTRA_INSTALL of recovery TAP tests is updated to include modules/injection_points. Author: Kyotaro Horiguchi, Michael Paquier Reviewed-by: Andrey Borodin, Bertrand Drouvot Discussion: https://postgr.es/m/ZdLuxBk5hGpol91B@paquier.xyz
* Convert unloggedLSN to an atomic variable.Nathan Bossart2024-02-29
| | | | | | | | | | | | | Currently, this variable is an XLogRecPtr protected by a spinlock. By converting it to an atomic variable, we can remove the spinlock, which saves a small amount of shared memory space. Since this code is not performance-critical, we use atomic operations with full barrier semantics to make it easy to reason about correctness. Author: John Morris Reviewed-by: Michael Paquier, Robert Haas, Andres Freund, Stephen Frost, Bharath Rupireddy Discussion: https://postgr.es/m/BYAPR13MB26772534335255E50318C574A0409%40BYAPR13MB2677.namprd13.prod.outlook.com Discussion: https://postgr.es/m/MN2PR13MB2688FD8B757316CB5C54C8A2A0DDA%40MN2PR13MB2688.namprd13.prod.outlook.com
* Remove superfluous 'pgprocno' field from PGPROCHeikki Linnakangas2024-02-22
| | | | | | | | | It was always just the index of the PGPROC entry from the beginning of the proc array. Introduce a macro to compute it from the pointer instead. Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/8171f1aa-496f-46a6-afc3-c46fe7a9b407@iki.fi
* Add assert to WALReadFromBuffers().Jeff Davis2024-02-16
| | | | | | Per suggestion from Andres. Discussion: https://postgr.es/m/20240214025508.6mcblauossthvaw3@awork3.anarazel.de
* Read WAL directly from WAL buffers.Jeff Davis2024-02-12
| | | | | | | | | | | | If available, read directly from WAL buffers, avoiding the need to go through the filesystem. Only for physical replication for now, but can be expanded to other callers. In preparation for replicating unflushed WAL data. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACXKKK%3DwbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54%2BNa%3DQ%40mail.gmail.com Reviewed-by: Andres Freund, Alvaro Herrera, Nathan Bossart, Dilip Kumar, Nitin Jadhav, Melih Mutlu, Kyotaro Horiguchi
* Update copyright for 2024Bruce Momjian2024-01-03
| | | | | | | | Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
* Fix incorrect data type choices in some read and write calls.Tom Lane2023-12-27
| | | | | | | | | | | | | | | | | | | | | | | | Recently-introduced code in reconstruct.c was using "unsigned" to store the result of read(), pg_pread(), or write(). This is completely bogus: it breaks subsequent tests for the result being negative, as we're being reminded of by a chorus of buildfarm warnings. Switch to "int" as was doubtless intended. (There are several other uses of "unsigned" in this file that also look poorly chosen to me, but for now I'm just trying to clean up the buildfarm.) A larger problem is that "int" is not necessarily wide enough to hold the result: per POSIX, all these functions return ssize_t. In places where the requested read or write length clearly fits in int, that's academic. It may be academic anyway as long as we constrain individual data files to 1GB, since even a readv or writev-like operation would then not be responsible for transferring more than 1GB. Nonetheless it seems like trouble waiting to happen, so I made a pass over readv and writev calls and fixed the result variables where that seemed appropriate. We might want to think about changing some of the fd.c functions to return ssize_t too, for future-proofing; but I didn't tackle that here. Discussion: https://postgr.es/m/1672202.1703441340@sss.pgh.pa.us
* Add a new WAL summarizer process.Robert Haas2023-12-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When active, this process writes WAL summary files to $PGDATA/pg_wal/summaries. Each summary file contains information for a certain range of LSNs on a certain TLI. For each relation, it stores a "limit block" which is 0 if a relation is created or destroyed within a certain range of WAL records, or otherwise the shortest length to which the relation was truncated during that range of WAL records, or otherwise InvalidBlockNumber. In addition, it stores a list of blocks which have been modified during that range of WAL records, but excluding blocks which were removed by truncation after they were modified and never subsequently modified again. In other words, it tells us which blocks need to copied in case of an incremental backup covering that range of WAL records. But this doesn't yet add the capability to actually perform an incremental backup; the next patch will do that. A new parameter summarize_wal enables or disables this new background process. The background process also automatically deletes summary files that are older than wal_summarize_keep_time, if that parameter has a non-zero value and the summarizer is configured to run. Patch by me, with some design help from Dilip Kumar and Andres Freund. Reviewed by Matthias van de Meent, Dilip Kumar, Jakub Wartak, Peter Eisentraut, and Álvaro Herrera. Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com
* Additional write barrier in AdvanceXLInsertBuffer().Jeff Davis2023-12-19
| | | | | | | | | | | | First, mark the xlblocks member with InvalidXLogRecPtr, then issue a write barrier, then initialize it. That ensures that the xlblocks member doesn't appear valid while the contents are being initialized. In preparation for reading WAL buffer contents without a lock. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACVfFMfqD5oLzZSQQZWfXiJqd-NdX0_317veP6FuB31QWA@mail.gmail.com Reviewed-by: Andres Freund
* Use 64-bit atomics for xlblocks array elements.Jeff Davis2023-12-19
| | | | | | | | | | In preparation for reading the contents of WAL buffers without a lock. Also, avoids the previously-needed comment in GetXLogBuffer() explaining why it's safe from torn reads. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACVfFMfqD5oLzZSQQZWfXiJqd-NdX0_317veP6FuB31QWA@mail.gmail.com Reviewed-by: Andres Freund
* Remove trace_recovery_messagesMichael Paquier2023-12-11
| | | | | | | | | | | | This GUC was intended as a debugging help in the 9.0 area when hot standby and streaming replication were being developped, able to offer more information at LOG level rather than DEBUGn. There are more tools available these days that are able to offer rather equivalent information, like pg_waldump introduced in 9.3. It is not obvious how this facility is useful these days, so let's remove it. Author: Bharath Rupireddy Discussion: https://postgr.es/m/ZXEXEAUVFrvpquSd@paquier.xyz
* Rename ShmemVariableCache to TransamVariablesHeikki Linnakangas2023-12-08
| | | | | | | | The old name was misleading: It's not a cache, the values kept in the struct are the authoritative source. Reviewed-by: Tristan Partin, Richard Guo Discussion: https://www.postgresql.org/message-id/6537d63d-4bb5-46f8-9b5d-73a8ba4720ab@iki.fi
* Fix compilation on Windows with WAL_DEBUGMichael Paquier2023-12-06
| | | | | | | | | | This has been broken since b060dbe0001a that has reworked the callback mechanism of XLogReader, most likely unnoticed because any form of development involving WAL happens on platforms where this compiles fine. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACVF14WKQMFwcJ=3okVDhiXpuK5f7YdT+BdYXbbypMHqWA@mail.gmail.com Backpatch-through: 13
* Apply quotes more consistently to GUC names in logsMichael Paquier2023-11-30
| | | | | | | | | | | | | | Quotes are applied to GUCs in a very inconsistent way across the code base, with a mix of double quotes or no quotes used. This commit removes double quotes around all the GUC names that are obviously referred to as parameters with non-English words (use of underscore, mixed case, etc). This is the result of a discussion with Álvaro Herrera, Nathan Bossart, Laurenz Albe, Peter Eisentraut, Tom Lane and Daniel Gustafsson. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
* Reduce rate of walwriter wakeups due to async commits.Heikki Linnakangas2023-11-27
| | | | | | | | | | | | | | | | | XLogSetAsyncXactLSN(), called at asynchronous commit, would wake up walwriter every time the LSN advances, but walwriter doesn't actually do anything unless it has at least 'wal_writer_flush_after' full blocks of WAL to write. Repeatedly waking up walwriter to do nothing is a waste of CPU cycles in both walwriter and the backends doing the wakeups. To fix, apply the same logic in XLogSetAsyncXactLSN() to decide whether to wake up walwriter, as walwriter uses to determine if it has any work to do. In the passing, rename misleadingly named 'flushbytes' local variable to 'flushblocks'. Author: Andres Freund, Heikki Linnakangas Discussion: https://www.postgresql.org/message-id/20231024230929.vsc342baqs7kmbte@awork3.anarazel.de
* C comment: clarify that WAL files can be _recycled_ or removedBruce Momjian2023-11-25
| | | | | | | | | | Reported-by: Michael Paquier Discussion: https://postgr.es/m/CAB7nPqSDdF0heotQU3gsepgqx+9c+6KjLd3R6aNYH7KKfDd2ig@mail.gmail.com Author: Michael Paquier Backpatch-through: master
* Prohibit max_slot_wal_keep_size to value other than -1 during upgrade.Amit Kapila2023-11-10
| | | | | | | | | | | We don't want existing slots in the old cluster to get invalidated during the upgrade. During an upgrade, we set this variable to -1 via the command line in an attempt to prevent such invalidations, but users have ways to override it. This patch ensures the value is not overridden by the user. Author: Kyotaro Horiguchi Reviewed-by: Peter Smith, Alvaro Herrera Discussion: http://postgr.es/m/20231027.115759.2206827438943188717.horikyota.ntt@gmail.com