postgresql - postgresql mirror

	Commit message (Collapse)	Author	Age
*	Clarify nbtree array preprocessing comment.	Peter Geoghegan	2024-11-01
\| \| \| \|	Oversight in commit 5bf748b8.
*	Rename two functions that wake up other processes	Heikki Linnakangas	2024-11-01
\| \| \| \| \| \| \| \| \| \| \|	Instead of talking about setting latches, which is a pretty low-level mechanism, emphasize that they wake up other processes. This is in preparation for replacing Latches with a new abstraction. That's still work in progress, but this seems a little tidier anyway, so let's get this refactoring out of the way already. Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
*	Use ProcNumbers instead of direct Latch pointers to address other procs	Heikki Linnakangas	2024-11-01
\| \| \| \| \| \| \| \|	This is in preparation for replacing Latches with a new abstraction. That's still work in progress, but this seems a little tidier anyway, so let's get this refactoring out of the way already. Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
*	Remove use of pg_memory_is_all_zeros() in bufpage.c	Michael Paquier	2024-11-01
\| \| \| \| \| \| \| \| \| \| \| \| \|	After a closer lookup, this makes the all-zero check of the page more expensive, so let's remove the new function call in bufpage.c. The maths of the check were also incorrect, checking that the page was full of zeros only for the first 1kB. This brings back the code to the state it was at 49d6c7d8daba. Per discussion with David Rowley and Bertrand Drouvot. Discussion: https://postgr.es/m/CAApHDvrXzPAr3FxoBuB7b3D-okNoNA2jxLun1rW8Yw5wkbqusw@mail.gmail.com
*	Add pg_memory_is_all_zeros() in memutils.h	Michael Paquier	2024-11-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This new function tests if a memory region starting at a given location for a defined length is made only of zeroes. This unifies in a single path the all-zero checks that were happening in a couple of places of the backend code: - For pgstats entries of relation, checkpointer and bgwriter, where some "all_zeroes" variables were previously used with memcpy(). - For all-zero buffer pages in PageIsVerifiedExtended(). This new function uses the same forward scan as the check for all-zero buffer pages, applying it to the three pgstats paths mentioned above. Author: Bertrand Drouvot Reviewed-by: Peter Eisentraut, Heikki Linnakangas, Peter Smith Discussion: https://postgr.es/m/ZupUDDyf1hHI4ibn@ip-10-97-1-34.eu-west-3.compute.internal
*	Add SQL function array_reverse()	Michael Paquier	2024-11-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This function takes in input an array, and reverses the position of all its elements. This operation only affects the first dimension of the array, like array_shuffle(). The implementation structure is inspired by array_shuffle(), with a subroutine called array_reverse_n() that may come in handy in the future, should more functions able to reverse portions of arrays be introduced. Bump catalog version. Author: Aleksander Alekseev Reviewed-by: Ashutosh Bapat, Tom Lane, Vladlen Popolitov Discussion: https://postgr.es/m/CAJ7c6TMpeO_ke+QGOaAx9xdJuxa7r=49-anMh3G5476e3CX1CA@mail.gmail.com
*	Make all ereport() calls within gram.y provide error locations.	Tom Lane	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch responds to a comment that I (tgl) made in the discussion leading up to 774171c4f, that really all errors occurring during raw parsing should provide error cursors. Syntax errors reported by Bison will have one, and most of the handwritten ereport's in gram.y already provide one, but there were a few stragglers. (It is not claimed that this handles every failure reachable during raw parsing --- out-of-memory is an obvious exception. But this makes a good start on cases that are likely to occur.) While we're at it, clean up the reported positions for errors associated with LIMIT/OFFSET clauses. Previously we were relying on applying exprLocation() to the contained expressions, but that leads to slightly odd cursor placement, e.g. regression=# (select * from foo limit 10) limit 10; ERROR: multiple LIMIT clauses not allowed LINE 1: (select * from foo limit 10) limit 10; ^ We can afford to keep a little more state in the transient SelectLimit structs in order to make that better. Jian He and Tom Lane (extracted from a larger patch by Jian, with some additional work by me) Discussion: https://postgr.es/m/CACJufxEmONE3P2En=jopZy1m=cCCUs65M4+1o52MW5og9oaUPA@mail.gmail.com
*	Add a parse location field to struct FunctionParameter.	Tom Lane	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows an error cursor to be supplied for a bunch of bad-function-definition errors that previously lacked one, or that cheated a bit by pointing at the contained type name when the error isn't really about that. Bump catversion from an abundance of caution --- I don't think this node type can actually appear in stored views/rules, but better safe than sorry. Jian He and Tom Lane (extracted from a larger patch by Jian, with some additional work by me) Discussion: https://postgr.es/m/CACJufxEmONE3P2En=jopZy1m=cCCUs65M4+1o52MW5og9oaUPA@mail.gmail.com
*	Fix refreshing physical relfilenumber on shared index	Heikki Linnakangas	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Buildfarm member 'prion', which is configured with -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, failed with errors like this: ERROR: could not read blocks 0..0 in file "global/2672": read only 0 of 8192 bytes while running a parallel test group that includes VACUUM FULL on some catalog tables among other things. I was not able to reproduce that just by running the tests with -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, even though 'prion' hit it on first run after commit 2b9b8ebbf8, so there might be something else that makes it more susceptible to the race. However, I was able to reproduce it by adding another test to the same test group that runs "vacuum full pg_database" repeatedly. The problem is that RelationReloadIndexInfo() no longer calls RelationInitPhysicalAddr() on a nailed, shared index, when an invalidation happens early during backend startup, before the critical relcaches have been built. Before commit 2b9b8ebbf8, that was done by RelationReloadNailed(), but it went missing from that path. Add it back as an explicit step. Broken by commit 2b9b8ebbf8, which refactored these functions. Discussion: https://www.postgresql.org/message-id/db876575-8f5b-4193-a538-df7e1f92d47a%40iki.fi
*	Remove duplicate words in comments	Daniel Gustafsson	2024-10-31
\| \| \| \| \| \| \| \|	A few comments contained duplicate "the" in sentences, fix by removing one occurrence. Author: Vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CALDaNm2aEEiPwGJmPdzBxROVvs8n75yCjKz4K1f1B2TdWpzxTA@mail.gmail.com
*	Split RelationClearRelation into three different functions	Heikki Linnakangas	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old RelationClearRelation function did different things depending on the arguments and circumstances. It could: a) remove the relation completely from relcache (rebuild == false), b) mark the entry as invalid (rebuild == true, but not in xact), or c) rebuild the entry (rebuild == true). Different callers used it for different purposes, and often assumed a particular behavior, which was confusing. Split it into three different functions, one for each of the above actions (one of them, RelationInvalidateRelation, was already added in commit e6cd857726). Move the responsibility of choosing the action and calling the right function to the callers. Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/9c9e8908-7b3e-4ce7-85a8-00c0e165a3d6%40iki.fi
*	Simplify call to rebuild relcache entry for indexes	Heikki Linnakangas	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	RelationClearRelation(rebuild == true) calls RelationReloadIndexInfo() for indexes. We can rely on that in RelationIdGetRelation(), instead of calling RelationReloadIndexInfo() directly. That simplifies the code a little. In the passing, add a comment in RelationBuildLocalRelation() explaining why it doesn't call RelationInitIndexAccessInfo(). It's because at index creation, it's called before the pg_index row has been created. That's also the reason that RelationClearRelation() still needs a special case to go through the full-blown rebuild if the index support information in the relcache entry hasn't been populated yet. Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/9c9e8908-7b3e-4ce7-85a8-00c0e165a3d6%40iki.fi
*	Remove unused field from SubPlanState struct	David Rowley	2024-10-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bf6c614a2 did some conversion work to use ExprState instead of manually calling equality functions to check if one set of values is not distinct from another set. That patch removed many of the fields that became redundant as a result of that change, but it forgot to remove SubPlanState.tab_eq_funcs. Fix that. In passing, fix the header comment for TupleHashEntryData to correctly spell the field name it's talking about. Author: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/CA+FpmFeycdombFzrjZw7Rmc29CVm4OOzCWwu=dVBQ6q=PX8SvQ@mail.gmail.com Discussion: https://postgr.es/m/CAApHDvrWR2jYVhec=COyF2g2BE_ns91NDsCHAMFiXbyhEujKdQ@mail.gmail.com
*	nbtree: assert no scheduled primscan between pages.	Peter Geoghegan	2024-10-30
\| \| \| \| \| \| \| \| \| \|	Follow-up to bugfix commit 763d65ae. Technically this new assertion is redundant with the assertion recently added to _bt_readpage by that same commit, but it seems like a good idea to have both. The new assertion makes it clear that we expect to call _bt_readnextpage when there's another primitive index scan scheduled, though only when needed as the final step of ending the current primitive scan.
*	Clarify nbtree array exhaustion comments.	Peter Geoghegan	2024-10-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Strictly speaking, we only need to make sure to leave the scan's array keys in their final positions (final for the current scan direction) to handle SAOP array exhaustion because btgettuple might only return a subset of the items for the final page (final for the current scan direction), before the scan changes direction. While it's typical for so->currPos to be invalidated shortly after the scan's arrays are first exhausted, and while so->currPos invalidation does obviate the need to leave the scan's arrays in any particular state, we can't rely on any of that actually happening when handling array exhaustion. Adjust comments to make all of that a lot clearer. Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution.
*	Fix bug in nbtree array primitive scan scheduling.	Peter Geoghegan	2024-10-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A bug in nbtree's handling of primitive index scan scheduling could lead to wrong answers when a scrollable cursor was used with an index scan that had a SAOP index qual. Wrong answers were only possible when the scan direction changed after a primitive scan was scheduled, but before _bt_next was asked to fetch the next tuple in line (i.e. for things to break, _bt_next had to be denied the opportunity to step off the page in the same direction as the one used when the primscan was scheduled). Furthermore, the issue only occurred when the page in question happened to be the first page to be visited by the entire top-level scan; the issue hinged upon the cursor backing up to the absolute beginning of the key space that it returns tuples from (fetching in the opposite scan direction across a "primitive scan boundary" always worked correctly). To fix, make _bt_next unset the "needs primitive index scan" flag when it detects that the current scan direction is not the one that was used by _bt_readpage back when the primitive scan in question was scheduled. This fixes the cases that are known to be faulty, and also seems like a good idea on general robustness grounds. Affected scrollable cursor cases now avoid a spurious primitive index scan when they fetch backwards to the absolute start of the key space to be visited by their cursor. Fetching backwards now only returns those tuples at the start of the scan, as expected. It'll also be okay to once again fetch forwards from the start at that point, since the scan will be left in a state that's exactly consistent with the state it was in before any tuples were ever fetched, as expected. Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-Wznv49bFsE2jkt4GuZ0tU2C91dEST=50egzjY2FeOcHL4Q@mail.gmail.com Backpatch: 17-, where commit 5bf748b8 first appears.
*	Fix some more bugs in foreign keys connecting partitioned tables	Álvaro Herrera	2024-10-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* In DetachPartitionFinalize() we were applying a tuple conversion map to tuples that didn't need one, which can lead to erratic behavior if a partitioned table has a partition with a different column order, as reported by Alexander Lakhin. This was introduced by 53af9491a043. Don't do that. Also, modify a recently added test case to exercise this. * The same function as well as CloneFkReferenced() were acquiring AccessShareLock on a partition, only to have CreateTrigger() later acquire ShareRowExclusiveLock on it. This can lead to deadlock by lock escalation, unnecessarily. Avoid that by acquiring the stronger lock to begin with. This probably dates back to branch 12, but I have never seen a report of this being a problem in the field. * Innocuous but wasteful: also introduced by 53af9491a043, we were reading a pg_constraint tuple from syscache that we don't need, as reported by Tender Wang. Don't. Backpatch to 15. Discussion: https://postgr.es/m/461e9c26-2076-8224-e119-84998b6a784e@gmail.com
*	Replicate generated columns when specified in the column list.	Amit Kapila	2024-10-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit allows logical replication to publish and replicate generated columns when explicitly listed in the column list. We also ensured that the generated columns were copied during the initial tablesync when they were published. We will allow to replicate generated columns even when they are not specified in the column list (via a new publication option) in a separate commit. The motivation of this work is to allow replication for cases where the client doesn't have generated columns. For example, the case where one is trying to replicate data from Postgres to the non-Postgres database. Author: Shubham Khanna, Vignesh C, Hou Zhijie Reviewed-by: Peter Smith, Hayato Kuroda, Shlok Kyal, Amit Kapila Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
*	Add missing CommandCounterIncrement() in stats import functions.	Jeff Davis	2024-10-29
\| \| \| \| \|	Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/98b2fcf0-f701-369e-d63d-6be9739ce17c@gmail.com
*	Unpin buffer before inplace update waits for an XID to end.	Noah Misch	2024-10-29
\| \| \| \| \| \| \| \| \| \| \| \|	Commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257 changed inplace updates to wait for heap_update() commands like GRANT TABLE and GRANT DATABASE. By keeping the pin during that wait, a sequence of autovacuum workers and an uncommitted GRANT starved one foreground LockBufferForCleanup() for six minutes, on buildfarm member sarus. Prevent, at the cost of a bit of complexity. Back-patch to v12, like the earlier commit. That commit and heap_inplace_lock() have not yet appeared in any release. Discussion: https://postgr.es/m/20241026184936.ae.nmisch@google.com
*	Reduce variable scope and possibly useless palloc	David Rowley	2024-10-30
\| \| \| \| \| \| \| \|	Move the CreateStmt down to the branch that it's used in, thus preventing the makeNode() call in cases where the CreateStmt isn't used. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQAq=06YPWPhS+yyTbCwv5JLKRz8rm3dWx6JR5Uj_d_fQDA@mail.gmail.com
*	Fix dependency of partitioned table and table AM with CREATE TABLE .. USING	Michael Paquier	2024-10-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A pg_depend entry between a partitioned table and its table access method was missing when using CREATE TABLE .. USING with an unpinned access method. DROP ACCESS METHOD could be used, while it should be blocked if CASCADE is not specified, even if there was a partitioned table that depends on the table access method. pg_class.relam would then hold an orphaned OID value still pointing to the AM dropped. The problem is fixed by adding a dependency between the partitioned table and its table access method if set when the relation is created. A test checking the contents of pg_depend in this case is added. Issue introduced in 374c7a229042, that has added support for CREATE TABLE .. USING for partitioned tables. Reviewed-by: Alexander Lakhin Discussion: https://postgr.es/m/18674-1ef01eceec278fab@postgresql.org Backpatch-through: 17
*	Ensure we have a snapshot when updating pg_index in index_drop().	Nathan Bossart	2024-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	I assumed that all index_drop() callers set an active snapshot beforehand, but that is evidently not true. One counterexample is autovacuum, which doesn't set an active snapshot when cleaning up orphan temp indexes. To fix, unconditionally push an active snapshot before updating pg_index in index_drop(). Oversight in commit b52adbad46. Reported-by: Masahiko Sawada Reviewed-by: Stepan Neretin, Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoBgF9etQrXbN9or_YHsmBRJHHNUEkhHp9rGK9CyQv5aTQ%40mail.gmail.com
*	Strip Windows newlines from extension script files manually.	Tom Lane	2024-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Revert commit 924e03917 in favor of adding code to convert \r\n to \n explicitly, on Windows only. The idea of letting text mode do the work fails for a couple of reasons: * Per Microsoft documentation, text mode also causes control-Z to be interpreted as end-of-file. While it may be unlikely that extension scripts contain control-Z, we've historically allowed it, and breaking the case doesn't seem wise. * Apparently, on some Windows configurations, "r" mode is interpreted as binary not text mode. We could force it with "rt" but that would be inconsistent with our code elsewhere, and it would still require Windows-specific coding. Thanks to Alexander Lakhin for investigation. Discussion: https://postgr.es/m/79284195-4993-7b00-f6df-8db28ca60fa3@gmail.com
*	Fix WAL_DEBUG build	Peter Eisentraut	2024-10-28
\| \| \| \| \| \|	broken by commit e18512c000e Reported-by: Peter Geoghegan <pg@bowt.ie>
*	nbtree: Minor sibling link traversal tweaks.	Peter Geoghegan	2024-10-28
\| \| \| \| \| \| \|	Tweak some code comments for clarity, and relocate some local variable declarations to the scope where they're actually used. Follow-up to recent commit 1bd4bc85.
*	Change the default value of the streaming option to 'parallel'.	Amit Kapila	2024-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously the default value of streaming option for a subscription was 'off'. The parallel option indicates that the changes in large transactions (greater than logical_decoding_work_mem) are to be applied directly via one of the parallel apply workers, if available. The parallel mode was introduced in 16, but we refrain from enabling it by default to avoid seeing any unpleasant behavior in the existing applications. However we haven't found any such report yet, so this is a good time to enable it by default. Reported-by: Vignesh C Author: Hayato Kuroda, Masahiko Sawada, Peter Smith, Amit Kapila Discussion: https://postgr.es/m/CALDaNm1=MedhW23NuoePJTmonwsMSp80ddsw+sEJs0GUMC_kqQ@mail.gmail.com
*	Set query ID for inner queries of CREATE TABLE AS and DECLARE	Michael Paquier	2024-10-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some utility statements contain queries that can be planned and executed: CREATE TABLE AS and DECLARE CURSOR. This commit adds query ID computation for the inner queries executed by these two utility commands, with and without EXPLAIN. This change leads to four new callers of JumbleQuery() and post_parse_analyze_hook() so as extensions can decide what to do with this new data. Previously, extensions relying on the query ID, like pg_stat_statements, were not able to track these nested queries as the query_id was 0. For pg_stat_statements, this commit leads to additions under !toplevel when pg_stat_statements.track is set to "all", as shown in its regression tests. The output of EXPLAIN for these two utilities gains a "Query Identifier" if compute_query_id is enabled. Author: Anthonin Bonnefoy Reviewed-by: Michael Paquier, Jian He Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
*	Fix obsolete nbtree split buffer comment.	Peter Geoghegan	2024-10-27
\| \| \| \|	Oversight in commit d088ba5a.
*	Remove unused #include's from backend .c files	Peter Eisentraut	2024-10-27
\| \| \| \| \| \| \| \|	as determined by IWYU These are mostly issues that are new since commit dbbca2cf299. Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
*	Refactor the code to create a pg_locale_t into new function.	Jeff Davis	2024-10-25
\| \| \| \| \|	Reviewed-by: Andreas Karlsson Discussion: https://postgr.es/m/59da7ee4-5e1a-4727-b464-a603c6ed84cd@proxel.se
*	Read extension script files in text not binary mode.	Tom Lane	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change affects only Windows, where it should cause DOS-style newlines (\r\n) to be converted to plain \n during script loading. This eliminates one potential discrepancy in the behavior of extension script files between Windows and non-Windows. While there's a small chance that this might cause undesirable behavior changes for some extensions, it can also be argued that this may remove behavioral surprises for others. An example is that in the buildfarm, we are getting different results for the tests added by commit 774171c4f depending on whether our git tree has been checked out with Unix or DOS newlines. The choice to use binary mode goes all the way back to our invention of extensions in commit d9572c4e3. However, I suspect it was not thought through carefully but was just a side-effect of the ready availability of an almost-suitable function read_binary_file(). On balance, changing to text mode seems like a better answer than other ways in which we might fix the inconsistent test results. Discussion: https://postgr.es/m/2480333.1729784872@sss.pgh.pa.us
*	Make table_scan_bitmap_next_block() async-friendly	Melanie Plageman	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move all responsibility for indicating a block is exhuasted into table_scan_bitmap_next_tuple() and advance the main iterator in heap-specific code. This flow control makes more sense and is a step toward using the read stream API for bitmap heap scans. Previously, table_scan_bitmap_next_block() returned false to indicate table_scan_bitmap_next_tuple() should not be called for the tuples on the page. This happened both when 1) there were no visible tuples on the page and 2) when the block returned by the iterator was past the end of the table. BitmapHeapNext() (generic bitmap table scan code) handled the case when the bitmap was exhausted. It makes more sense for table_scan_bitmap_next_tuple() to return false when there are no visible tuples on the page and table_scan_bitmap_next_block() to return false when the bitmap is exhausted or there are no more blocks in the table. As part of this new design, TBMIterateResults are no longer used as a flow control mechanism in BitmapHeapNext(), so we removed table_scan_bitmap_next_tuple's TBMIterateResult parameter. Note that the prefetch iterator is still saved in the BitmapHeapScanState node and advanced in generic bitmap table scan code. This is because 1) it was not necessary to change the prefetch iterator location to change the flow control in BitmapHeapNext() 2) modifying prefetch iterator management requires several more steps better split over multiple commits and 3) the prefetch iterator will be removed once the read stream API is used. Author: Melanie Plageman Reviewed-by: Tomas Vondra, Andres Freund, Heikki Linnakangas, Mark Dilger Discussion: https://postgr.es/m/063e4eb4-32d9-439e-a0b1-75565a9835a8%40iki.fi
*	Move EXPLAIN counter increment to heapam_scan_bitmap_next_block	Melanie Plageman	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Increment the lossy and exact page counters for EXPLAIN of bitmap heap scans in heapam_scan_bitmap_next_block(). Note that other table AMs will need to do this as well Pushing the counters into heapam_scan_bitmap_next_block() is required to be able to use the read stream API for bitmap heap scans. The bitmap iterator must be advanced from inside the read stream callback, so TBMIterateResults cannot be used as a flow control mechanism in BitmapHeapNext(). Author: Melanie Plageman Reviewed-by: Tomas Vondra, Heikki Linnakangas Discussion: https://postgr.es/m/063e4eb4-32d9-439e-a0b1-75565a9835a8%40iki.fi
*	WAL-log inplace update before revealing it to other sessions.	Noah Misch	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \|	A buffer lock won't stop a reader having already checked tuple visibility. If a vac_update_datfrozenid() and then a crash happened during inplace update of a relfrozenxid value, datfrozenxid could overtake relfrozenxid. That could lead to "could not access status of transaction" errors. Back-patch to v12 (all supported versions). In v14 and earlier, this also back-patches the assertion removal from commit 7fcf2faf9c7dd473208fd6d5565f88d7f733782b. Discussion: https://postgr.es/m/20240620012908.92.nmisch@google.com
*	For inplace update, send nontransactional invalidations.	Noah Misch	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The inplace update survives ROLLBACK. The inval didn't, so another backend's DDL could then update the row without incorporating the inplace update. In the test this fixes, a mix of CREATE INDEX and ALTER TABLE resulted in a table with an index, yet relhasindex=f. That is a source of index corruption. Back-patch to v12 (all supported versions). The back branch versions don't change WAL, because those branches just added end-of-recovery SIResetAll(). All branches change the ABI of extern function PrepareToInvalidateCacheTuple(). No PGXN extension calls that, and there's no apparent use case in extensions. Reviewed by Nitin Motiani and (in earlier versions) Andres Freund. Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com
*	Refactor code converting a publication name List to a StringInfo	Michael Paquier	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existing get_publications_str() is renamed to GetPublicationsStr() and is moved to pg_subscription.c, so as it is possible to reuse it at two locations of the tablesync code where the same logic was duplicated. fetch_remote_table_info() was doing two List->StringInfo conversions when dealing with a server of version 15 or newer. The conversion happens only once now. This refactoring leads to less code overall. Author: Peter Smith Reviewed-by: Michael Paquier, Masahiko Sawada Discussion: https://postgr.es/m/CAHut+PtJMk4bKXqtpvqVy9ckknCgK9P6=FeG8zHF=6+Em_Snpw@mail.gmail.com
*	Remove the RTE_GROUP RTE if we drop the groupClause	Richard Guo	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For an EXISTS subquery, the only thing that matters is whether it returns zero or more than zero rows. Therefore, we remove certain SQL features that won't affect that, among them the GROUP BY clauses. After we drop the groupClause, we'd better remove the RTE_GROUP RTE and clear the hasGroupRTE flag, as they depend on the groupClause. Failing to do so could result in a bogus RTE_GROUP entry in the parent query, leading to an assertion failure on the hasGroupRTE flag. Reported-by: David Rowley Author: Richard Guo Discussion: https://postgr.es/m/CAApHDvp2_yht8uPLyWO-kVGWZhYvx5zjGfSrg4fBQ9fsC13V0g@mail.gmail.com
*	Add functions pg_restore_relation_stats(), pg_restore_attribute_stats().	Jeff Davis	2024-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to the pg_set_*_stats() functions, except with a variadic signature that's designed to be more future-proof. Additionally, most problems are reported as WARNINGs rather than ERRORs, allowing most stats to be restored even if some cannot. These functions are intended to be called from pg_dump to avoid the need to run ANALYZE after an upgrade. Author: Corey Huinker Discussion: https://postgr.es/m/CADkLM=eErgzn7ECDpwFcptJKOk9SxZEk5Pot4d94eVTZsvj3gw@mail.gmail.com
*	Fix parallel worker tracking of new catalog relfilenumbers.	Noah Misch	2024-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reunite RestorePendingSyncs() with RestoreRelationMap(). If RelationInitPhysicalAddr() ran after RestoreRelationMap() but before RestorePendingSyncs(), the relcache entry could cause RelationNeedsWAL() to return true erroneously. Trouble required commands of the current transaction to include REINDEX or CLUSTER of a system catalog. The parallel leader correctly derived RelationNeedsWAL()==false from the new relfilenumber, but the worker saw RelationNeedsWAL()==true. Worker MarkBufferDirtyHint() then wrote unwanted WAL. Recovery of that unwanted WAL could lose tuples like the system could before commit c6b92041d38512a4176ed76ad06f713d2e6c01a8 introduced this tracking. RestorePendingSyncs() and RestoreRelationMap() were adjacent till commit 126ec0bc76d044d3a9eb86538b61242bf7da6db4, so no back-patch for now. Reviewed by Tom Lane. Discussion: https://postgr.es/m/20241019232815.c6.nmisch@google.com
*	Stop reading uninitialized memory in heap_inplace_lock().	Noah Misch	2024-10-24
\| \| \| \| \| \| \| \| \| \|	Stop computing a never-used value. This removes the read; the read had no functional implications. Back-patch to v12, like commit a07e03fd8fa7daf4d1356f7cb501ffe784ea6257. Reported by Alexander Lakhin. Discussion: https://postgr.es/m/6c92f59b-f5bc-e58c-9bdd-d1f21c17c786@gmail.com
*	Refactor GetLockStatusData() to skip backends/groups without fast-path locks.	Fujii Masao	2024-10-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, GetLockStatusData() checked all slots for every backend to gather fast-path lock data, which could be inefficient. This commit refactors it by skipping backends with PID=0 (since they don't hold fast-path locks) and skipping groups with no registered fast-path locks, improving efficiency. This refactoring is particularly beneficial, for example when max_connections and max_locks_per_transaction are set high, as it reduces unnecessary checks across numerous slots. Author: Fujii Masao Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/a0a00c44-31e9-4c67-9846-fb9636213ac9@oss.nttdata.com
*	Support configuring TLSv1.3 cipher suites	Daniel Gustafsson	2024-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ssl_ciphers GUC can only set cipher suites for TLSv1.2, and lower, connections. For TLSv1.3 connections a different OpenSSL API must be used. This adds a new GUC, ssl_tls13_ciphers, which can be used to configure a colon separated list of cipher suites to support when performing a TLSv1.3 handshake. Original patch by Erica Zhang with additional hacking by me. Author: Erica Zhang <ericazhangy2021@qq.com> Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/tencent_063F89FA72CCF2E48A0DF5338841988E9809@qq.com
*	Support configuring multiple ECDH curves	Daniel Gustafsson	2024-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ssl_ecdh_curve GUC only accepts a single value, but the TLS handshake can list multiple curves in the groups extension (the extension has been renamed to contain more than elliptic curves). This changes the GUC to accept a colon-separated list of curves. This commit also renames the GUC to ssl_groups to match the new nomenclature for the TLS extension. Original patch by Erica Zhang with additional hacking by me. Author: Erica Zhang <ericazhangy2021@qq.com> Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/tencent_063F89FA72CCF2E48A0DF5338841988E9809@qq.com
*	Add 'no_error' argument to pg_wal_replay_wait()	Alexander Korotkov	2024-10-24
\| \| \| \| \| \| \| \| \| \| \|	This argument allow skipping throwing an error. Instead, the result status can be obtained using pg_wal_replay_wait_status() function. Catversion is bumped. Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZtUF17gF0pNpwZDI%40paquier.xyz Reviewed-by: Pavel Borisov
*	Refactor WaitForLSNReplay() to return the result of waiting	Alexander Korotkov	2024-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, WaitForLSNReplay() immediately throws an error if waiting for LSN replay is not successful. This commit teaches WaitForLSNReplay() to return the result of waiting, while making pg_wal_replay_wait() responsible for throwing an appropriate error. This is preparation to adding 'no_error' argument to pg_wal_replay_wait() and new function pg_wal_replay_wait_status(), which returns the last wait result status. Additionally, we stop distinguishing situations when we find our instance to be not in a recovery state before entering the waiting loop and inside the waiting loop. Standby promotion may happen at any moment, even between issuing a procedure call statement and pg_wal_replay_wait() doing a first check of recovery status. Thus, there is no pointing distinguishing these situations. Also, since we may exit the waiting loop and see our instance not in recovery without throwing an error, we need to deleteLSNWaiter() in that case. We do this unconditionally for the sake of simplicity, even if standby was already promoted after reaching the target LSN, the startup process surely already deleted us. Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZtUF17gF0pNpwZDI%40paquier.xyz Reviewed-by: Michael Paquier, Pavel Borisov
*	Make WaitForLSNReplay() issue FATAL on postmaster death	Alexander Korotkov	2024-10-24
\| \| \| \| \| \|	Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZvY2C8N4ZqgCFaLu%40paquier.xyz Reviewed-by: Pavel Borisov
*	Move LSN waiting declarations and definitions to better place	Alexander Korotkov	2024-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	3c5db1d6b implemented the pg_wal_replay_wait() stored procedure. Due to the patch development history, the implementation resided in src/backend/commands/waitlsn.c (src/include/commands/waitlsn.h for headers). 014f9f34d moved pg_wal_replay_wait() itself to src/backend/access/transam/xlogfuncs.c near to the WAL-manipulation functions. But most of the implementation stayed in place. The code in src/backend/commands/waitlsn.c has nothing to do with commands, but is related to WAL. So, this commit moves this code into src/backend/access/transam/xlogwait.c (src/include/access/xlogwait.h for headers). Reported-by: Peter Eisentraut Discussion: https://postgr.es/m/18c0fa64-0475-415e-a1bd-665d922c5201%40eisentraut.org Reviewed-by: Pavel Borisov
*	Avoid looping over all type cache entries in TypeCacheRelCallback()	Alexander Korotkov	2024-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, when a single relcache entry gets invalidated, TypeCacheRelCallback() has to loop over all type cache entries to find appropriate typentry to invalidate. Unfortunately, using the syscache here is impossible, because this callback could be called outside a transaction and this makes impossible catalog lookups. This is why present commit introduces RelIdToTypeIdCacheHash to map relation OID to its composite type OID. We are keeping RelIdToTypeIdCacheHash entry while corresponding type cache entry have something to clean. Therefore, RelIdToTypeIdCacheHash shouldn't get bloat in the case of temporary tables flood. There are many places in lookup_type_cache() where syscache invalidation, user interruption, or even error could occur. In order to handle this, we keep an array of in-progress type cache entries. In the case of lookup_type_cache() interruption this array is processed to keep RelIdToTypeIdCacheHash in a consistent state. Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru Author: Teodor Sigaev Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov Reviewed-by: Andrei Lepikhov, Pavel Borisov, Jian He, Alexander Lakhin Reviewed-by: Artur Zakirov
*	Update header comment for lookup_type_cache()	Alexander Korotkov	2024-10-24
\| \| \| \| \| \| \|	Describe the way we handle concurrent invalidation messages. Discussion: https://postgr.es/m/CAPpHfdsQhwUrnB3of862j9RgHoJM--eRbifvBMvtQxpC57dxCA%40mail.gmail.com Reviewed-by: Andrei Lepikhov, Artur Zakirov, Pavel Borisov