aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
...
* reindexdb: move PQfinish() calls to the right placeÁlvaro Herrera2025-03-06
| | | | | | | | | | | | | | | | get_parallel_object_list() has no business closing a connection it did not create. Make things more sensible by closing the connection at the level where it is created, in reindex_one_database(). Extracted from a larger patch by the same author. However, the patch as submitted not only was not described as containing this change, but in addition it contained a fatal flaw whereby reindexdb would crash and fail across all of its TAP test, which is why I list myself as co-author. Author: Ranier Vilela <ranier.vf@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/CAEudQArfqr0-s0VVPSEh=0kgOgBJvFNdGW=xSL5rBcr0WDMQYQ@mail.gmail.com
* Fix some performance issues in GIN query startup.Tom Lane2025-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a GIN index search had a lot of search keys (for example, "jsonbcol ?| array[]" with tens of thousands of array elements), both ginFillScanKey() and startScanKey() took O(N^2) time. Worse, those loops were uncancelable for lack of CHECK_FOR_INTERRUPTS. The problem in ginFillScanKey() is the brute-force search key de-duplication done in ginFillScanEntry(). The most expedient solution seems to be to just stop trying to de-duplicate once there are "too many" search keys. We could imagine working harder, say by using a sort-and-unique algorithm instead of brute force compare-all-the-keys. But it seems unlikely to be worth the trouble. There is no correctness issue here, since the code already allowed duplicate keys if any extra_data is present. The problem in startScanKey() is the loop that attempts to identify the first non-required search key. In the submitted test case, that vainly tests all the key positions, and each iteration takes O(N) time. One part of that is that it's reinitializing the entryRes[] array from scratch each time, which is entirely unnecessary given that the triConsistentFn isn't supposed to scribble on its input. We can easily adjust the array contents incrementally instead. The other part of it is that the triConsistentFn may itself take O(N) time (and does in this test case). This is all extremely brute force: in simple cases with AND or OR semantics, we could know without any looping whatever that all or none of the keys are required. But GIN opclasses don't have any API for exposing that knowledge, so at least in the short run there is little to be done about that. Put in a CHECK_FOR_INTERRUPTS so that at least the loop is cancelable. These two changes together resolve the primary complaint that the test query doesn't respond promptly to cancel interrupts. Also, while they don't completely eliminate the O(N^2) behavior, they do provide quite a nice speedup for mid-sized examples. Bug: #18831 Reported-by: Niek <niek.brasa@hitachienergy.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18831-e845ac44ebc5dd36@postgresql.org Backpatch-through: 13
* Avoid invalidating all RelationSyncCache entries on publication change.Amit Kapila2025-03-06
| | | | | | | | | | | | | | | | | | | | | On change of publication via ALTER PUBLICATION ... SET/ADD/DROP commands, we were invalidating all the relations present in relation sync cache maintained by pgoutput. We need to invalidate only the relation entries that are changed as part of publication DDL. We have ensured that the publication DDL execution generated the invalidations required to invalidate impacted relation sync entries in RelationSyncCache. This improves the performance by avoiding building the cache entries for the cases where a publication has many tables but only one of them is dropped. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OSCPR01MB14966C09AA201EFFA706576A7F5C92@OSCPR01MB14966.jpnprd01.prod.outlook.com
* Organize and deduplicate statistics import tests.Jeff Davis2025-03-06
| | | | | | | Author: Corey Huinker <corey.huinker@gmail.com> Reported-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CAAKRu_bWEqUfxhODfJ-XbZC75vq=P6DYOKK6biyey=yM1Ah3Hg@mail.gmail.com Discussion: https://postgr.es/m/CADkLM=f1n2_Vomq0gKab7xdxDHmJGgn=DE48P8fzQOp3Mrs1Qg@mail.gmail.com
* Address stats export review comments.Jeff Davis2025-03-06
| | | | | | | | | Per discussion, did not use Jian He's patch exactly. Reported-by: jian he <jian.universality@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CACJufxFVq=tq9u1zrHWYSbMi1T07gS9Ff0LJScMco4HZmtZ1xw@mail.gmail.com Discussion: https://postgr.es/m/CADkLM=f1n2_Vomq0gKab7xdxDHmJGgn=DE48P8fzQOp3Mrs1Qg@mail.gmail.com
* Address stats import review comments.Jeff Davis2025-03-05
| | | | | Reported-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxHG9MBQozbJQ4JRBcRbUO+t+sx4qLZX092rS_9b4SR_EA@mail.gmail.com
* Fix compiler warnings about typedef redefinitionsHeikki Linnakangas2025-03-06
| | | | | | | | Clang with -Wtypedef-redefinition produced warnings: src/include/storage/latch.h:122:3: error: redefinition of typedef 'Latch' is a C11 feature [-Werror,-Wtypedef-redefinition] Per buildfarm
* Add more monitoring data for WAL writes in the WAL receiverMichael Paquier2025-03-06
| | | | | | | | | | | | | | | | | | | | | | This commit adds two improvements related to the monitoring of WAL writes for the WAL receiver. First, write counts and timings are now counted in pg_stat_io for the WAL receiver. These have been discarded from pg_stat_wal in ff99918c625a due to performance concerns, related to the fact that we still relied on an on-disk file for the stats back then, even with track_wal_io_timing to avoid the overhead of the timestamp calculations. This implementation is simpler than the original proposal as it is possible to rely on the APIs of pgstat_io.c to do the job. Like the fsync and read data, track_wal_io_timing needs to be enabled to track the timings. Second, a wait event is added around the pg_pwrite() call in charge of the writes, using the exiting WAIT_EVENT_WAL_WRITE. This is useful as the WAL receiver data is tracked in pg_stat_activity. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/Z8gFnH4o3jBm5BRz@ip-10-97-1-34.eu-west-3.compute.internal
* Split WaitEventSet functions to separate source fileHeikki Linnakangas2025-03-06
| | | | | | | | | latch.c now only contains the Latch related functions, which build on the WaitEventSet abstraction. Most of the platform-dependent stuff is now in waiteventset.c. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/8a507fb6-df28-49d3-81a5-ede180d7f0fb@iki.fi
* Use ModifyWaitEvent to update exit_on_postmaster_deathHeikki Linnakangas2025-03-06
| | | | | | | | | | This is in preparation for splitting WaitEventSet related functions to a separate source file. That will hide the details of WaitEventSet from WaitLatch, so it must use an exposed function instead of modifying WaitEventSet->exit_on_postmaster_death directly. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/8a507fb6-df28-49d3-81a5-ede180d7f0fb@iki.fi
* ecpg: Fix compiler warning in ecpg build with Meson.Fujii Masao2025-03-06
| | | | | | | | | | | | | | Previously, Meson could produce a warning about the use of 'deps' in ecpg: WARNING: Project targets '>=0.54' but uses a feature introduced in '0.60.0': list.<plus>. The right-hand operand was not a list. The right-hand operand of 'deps' should be a list. This commit fixes the warning by wrapping it with square brackets. This issue was introduced in commit 28f04984f0c. Author: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/CAOYmi+ks8wO06Ymxduw2h_eQJ_D4_jHGeyMK0P=p5Q3psnEdMA@mail.gmail.com
* Remove unused ShutdownLatchSupport() functionHeikki Linnakangas2025-03-05
| | | | | | | | | The only caller was removed in commit 80a8f95b3b. I don't foresee needing it any time soon, and I'm working on some big changes in this area, so let's remove it out of the way. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/8a507fb6-df28-49d3-81a5-ede180d7f0fb@iki.fi
* Revert "Show index search count in EXPLAIN ANALYZE."Peter Geoghegan2025-03-05
| | | | | | | This reverts commit 5ead85fbc81162ab1594f656b036a22e814f96b3. This commit shows test failures with debug_parallel_query=regress. The underlying issue needs to be debugged, so revert for now.
* Allow json{b}_strip_nulls to remove null array elementsAndrew Dunstan2025-03-05
| | | | | | | | | | | An additional paramater ("strip_in_arrays") is added to these functions. It defaults to false. If true, then null array elements are removed as well as null valued object fields. JSON that just consists of a single null is not affected. Author: Florents Tselai <florents.tselai@gmail.com> Discussion: https://postgr.es/m/4BCECCD5-4F40-4313-9E98-9E16BEB0B01D@gmail.com
* Show index search count in EXPLAIN ANALYZE.Peter Geoghegan2025-03-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Expose the count of index searches/index descents in EXPLAIN ANALYZE's output for index scan nodes. This information is particularly useful with scans that use ScalarArrayOp quals, where the number of index scans isn't predictable in advance (at least not with optimizations like the one added to nbtree by Postgres 17 commit 5bf748b8). It will also be useful when EXPLAIN ANALYZE shows details of an nbtree index scan that uses skip scan optimizations set to be introduced by an upcoming patch. The instrumentation works by teaching index AMs to increment a new nsearches counter whenever a new index search begins. The counter is incremented at exactly the same point that index AMs must already increment the index's pg_stat_*_indexes.idx_scan counter (we're counting the same event, but at the scan level rather than the relation level). The new counter is stored in the scan descriptor (IndexScanDescData), which explain.c reaches by going through the scan node's PlanState. This approach doesn't match the approach used when tracking other index scan specific costs (e.g., "Rows Removed by Filter:"). It is similar to the approach used in other cases where we must track costs that are only readily accessible inside an access method, and not from the executor (e.g., "Heap Blocks:" output for a Bitmap Heap Scan). It is inherently necessary to maintain a counter that can be incremented multiple times during a single amgettuple call (or amgetbitmap call), and directly exposing PlanState.instrument to index access methods seems unappealing. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Tomas Vondra <tomas@vondra.me> Reviewed-By: Robert Haas <robertmhaas@gmail.com> Reviewed-By: Masahiro Ikeda <ikedamsh@oss.nttdata.com> Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
* Rename some signal and interrupt handling functions for consistencyHeikki Linnakangas2025-03-05
| | | | | | | | | | | | | | | | | | | | | | | | The usual pattern for handling a signal is that the signal handler sets a flag and calls SetLatch(MyLatch), and CHECK_FOR_INTERRUPTS() or other code that is part of a wait loop calls another function to deal with it. The naming of the functions involved was a bit inconsistent, however. CHECK_FOR_INTERRUPTS() calls ProcessInterrupts() to do the heavy-lifting, but the analogous functions in aux processes were called HandleMainLoopInterrupts(), HandleStartupProcInterrupts(), etc. Similarly, most subroutines of ProcessInterrupts() were called Process*(), but some were called Handle*(). To make things less confusing, rename all the functions that are part of the overall signal/interrupt handling system but are not executed in a signal handler to e.g. ProcessSomething(), rather than HandleSomething(). The "Process" prefix is now consistently used in the non-signal-handler functions, and the "Handle" prefix in functions that are part of signal handlers, except for some completely unrelated functions that clearly have nothing to do with signal or interrupt handling. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/8a384b26-1499-41f6-be33-64b801fb98b8@iki.fi
* Add ALTER TABLE ... ALTER CONSTRAINT ... SET [NO] INHERITÁlvaro Herrera2025-03-05
| | | | | | | | | | | | | | This allows to redefine an existing non-inheritable constraint to be inheritable, which allows to straighten up situations with NO INHERIT constraints so that thay can become normal constraints without having to re-verify existing data. For existing inheritance children this may require creating additional constraints, if they don't exist already. It also allows to do the opposite, if only for symmetry. Author: Suraj Kharage <suraj.kharage@enterprisedb.com> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CAF1DzPVfOW6Kk=7SSh7LbneQDJWh=PbJrEC_Wkzc24tHOyQWGg@mail.gmail.com
* Fix some gaps in pg_stat_io with WAL receiver and WAL summarizerMichael Paquier2025-03-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The WAL receiver and WAL summarizer processes gain each one a call to pgstat_report_wal(), to make sure that they report their WAL statistics to pgstats, gathering data for pg_stat_io. In the WAL receiver, the stats reports are timed with status updates sent to the primary, that depend on wal_receiver_status_interval and wal_receiver_timeout. This is a conservative choice, but perhaps we could be more aggressive with the frequency of the stats reports. An interesting historical fact is that the WAL receiver does writes and syncs of WAL, but it has never reported its statistics to pgstats in pg_stat_wal. In the WAL summarizer, the stats reports are done each time the process waits for WAL. While on it, pg_stat_io is adjusted so as these two processes do not report any rows when IOObject is not WAL, making the view easier to use with less rows. Two tests are added in TAP, checking statistics for the WAL summarizer and the WAL receiver. Status updates in the WAL receiver are currently possible in the recovery test 001_stream_rep.pl. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/Z8UKZyVSHUUQJHNb@paquier.xyz
* psql: Fix memory leak with \gx used within a pipelineMichael Paquier2025-03-05
| | | | | | | | | While inside a pipeline, \gx is currently forbidden and will make exec_command_g() exit early. There was a memory leak in this code path, so let's fix it. Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://postgr.es/m/CAO6_XqqFVQjLjZQiL7xdwLpzZEy1ghO_JWvCFPM_OmwF9s7XdA@mail.gmail.com
* Enforce memory limit during parallel GIN buildsTomas Vondra2025-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Index builds are expected to respect maintenance_work_mem, just like other maintenance operations. For serial builds this is done simply by flushing the buffer in ginBuildCallback() into the index. But with parallel builds it's more complicated, because there are multiple places that can allocate memory. ginBuildCallbackParallel() does the same thing as ginBuildCallback(), except that the accumulated items are written into tuplesort. Then the entries with the same key get merged - first in the worker, then in the leader - and the TID lists may get (arbitrarily) long. It's unlikely it would exceed the memory limit, but it's possible. We address this by evicting some of the data if the list gets too long. We can't simply dump the whole in-memory TID list. The GIN index bulk insert code expects to see TIDs in monotonic order; it may fail if the TIDs go backwards. If the TID lists overlap, evicting the whole current TID list would break this (a later entry might add "old" TID values into the already-written part). In the workers this is not an issue, because the lists never overlap. But the leader may see overlapping lists produced by the workers. We can however derive a safe "horizon" TID - the entries (for a given key) are sorted by (key, first TID), which means no future list can add values before the last "first TID" we've seen. This patch tracks the "frozen" part of the TID list, which we know can't change by merging additional TID lists. If needed, we can evict this part of the list. We don't want to do this too often - the smaller lists we evict, the more expensive it'll be to merge them in the next step (especially in the leader). Therefore we only trim the list if we have at least 1024 frozen items, and if the whole list is at least 64kB large. These thresholds are somewhat arbitrary and conservative. We might calculate the values from maintenance_work_mem, but tests show that does not really improve anything (time, compression ratio, ...). So we stick to these conservative values to release memory faster. Author: Tomas Vondra Reviewed-by: Matthias van de Meent, Andy Fan, Kirill Reshke Discussion: https://postgr.es/m/6ab4003f-a8b8-4d75-a67f-f25ad98582dc%40enterprisedb.com
* pg_upgrade: Check for the expected error message in TAP tests.Masahiko Sawada2025-03-04
| | | | | | | | | | | Since pg_upgrade prints its error messages on stdout, we can't use command_fails_like() to check if it fails for the right reason. This commit uses command_checks_all() in pg_upgrade TAP tests to check the exit status and stdout, enabling proper verification of error reasons. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/87tt8h1vb7.fsf@wibble.ilmari.org
* Fix ALTER TABLE error messageÁlvaro Herrera2025-03-04
| | | | | | | | | | | | | | This bogus error message was introduced in 2013 by commit f177cbfe676d, because of misunderstanding the processCASbits() API; at the time, no test cases were added that would be affected by this change. Only in ca87c415e2fc was one added (along with a couple of typos), with an XXX note that the error message was bogus. Fix the whole, add some test cases. Backpatch all the way back. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/202503041822.aobpqke3igvb@alvherre.pgsql
* Refactor Copy{From|To}GetRoutine() to use pass-by-reference argument.Masahiko Sawada2025-03-04
| | | | | | | | | | | | | | | | | The change improves efficiency by eliminating unnecessary copying of CopyFormatOptions. The coverity also complained about inefficiencies caused by pass-by-value. Oversight in 7717f6300 and 2e4127b6d. Reported-by: Junwang Zhao <zhjwpku@gmail.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> (per reports from coverity) Author: Sutou Kouhei <kou@clear-code.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAEG8a3L6YCpPksTQMzjD_CvwDEhW3D_t=5md9BvvdOs5k+TA=Q@mail.gmail.com
* Compress TID lists when writing GIN tuples to diskTomas Vondra2025-03-04
| | | | | | | | | | | | | | | | | | | | | | | When serializing GIN tuples to tuplesorts during parallel index builds, we can significantly reduce the amount of data by compressing the TID lists. The GIN opclasses may produce a lot of data (depending on how many keys are extracted from each row), and the TID compression is very efficient and effective. If the number of distinct keys is high, the first worker pass (reading data from the table and writing them into a private tuplesort) may not benefit from the compression very much. It is likely to spill data to disk before the TID lists get long enough for the compression to help. The second pass (writing the merged data into the shared tuplesort) is more likely to benefit from compression. The compression can be seen as a way to reduce the amount of disk space needed by the parallel builds, because the data is written twice. First into the per-worker tuplesorts, then into the shared tuplesort. Author: Tomas Vondra Reviewed-by: Matthias van de Meent, Andy Fan, Kirill Reshke Discussion: https://postgr.es/m/6ab4003f-a8b8-4d75-a67f-f25ad98582dc%40enterprisedb.com
* Add .gitignore entry for ecpg test detritus.Tom Lane2025-03-04
| | | | Oversight in commit 28f04984f.
* Make FP_LOCK_SLOTS_PER_BACKEND look like a functionTomas Vondra2025-03-04
| | | | | | | | | | | | | The FP_LOCK_SLOTS_PER_BACKEND macro looks like a constant, but it depends on the max_locks_per_transaction GUC, and thus can change. This is non-obvious and confusing, so make it look more like a function by renaming it to FastPathLockSlotsPerBackend(). While at it, use the macro when initializing fast-path shared memory, instead of using the formula. Reported-by: Andres Freund Discussion: https://postgr.es/m/ffiwtzc6vedo6wb4gbwelon5nefqg675t5c7an2ta7pcz646cg%40qwmkdb3l4ett
* Add regression tests for pg_stat_progress_copy.tuples_skipped.Fujii Masao2025-03-04
| | | | | | | | | | | | | This commit adds tests to verify that tuples_skipped in pg_stat_progress_copy works as expected. While existing tests checked other fields, tuples_skipped was previously untested. This improves test coverage and ensures accurate tracking of skipped tuples. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Josef Šimánek <josef.simanek@gmail.com> Discussion: https://postgr.es/m/CACJufxFazq-bfyhiO0KBojR=yOr84E25Rqf6mHB0Ow0KPidkKw@mail.gmail.com
* Fix outdated commentHeikki Linnakangas2025-03-04
| | | | | | | Commit bc971f4025 replaced the latch-setting mechanism that the comment talked about with a condition variable. And before that, commit 2258e76f90 moved the code so that the comment got detached from the loop that it talked about, so move the comment closer to the loop.
* Fix accidental use of = instead of ==Peter Eisentraut2025-03-04
| | | | | | | | | Fix for commit 630f9a43cec. It used = instead of ==. The result would be an incorrect error message. Author: Jacob Brazeal <jacob.brazeal@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/CA%2BCOZaC-JMbhQ4O0Q8V1Bxa0R%2BNex_RN9D6UyuLPiEx_CK4Heg%40mail.gmail.com
* Fix ALTER TABLE ADD VIRTUAL GENERATED COLUMN when table rewritePeter Eisentraut2025-03-04
| | | | | | | | | | | | | | | | | demo: CREATE TABLE gtest20a (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 2) VIRTUAL); ALTER TABLE gtest20a ADD COLUMN c float8 DEFAULT RANDOM() CHECK (b < 60); ERROR: no generation expression found for column number 2 of table "pg_temp_17306" In ATRewriteTable, the variable OIDNewHeap (if valid) corresponding pg_attrdef default expression entry was not populated. So OIDNewHeap cannot be used to call expand_generated_columns_in_expr or build_generation_expression. Therefore in ATRewriteTable, we can only use the existing relation to expand the generated expression. Author: jian he <jian.universality@gmail.com> Reviewed-by: Srinath Reddy <srinath2133@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CACJufxEJ%3DFoajabWXjszo_yrQeKSxdZ87KJqBW373rSbajKGAA%40mail.gmail.com
* Avoid NullTest deduction for clone clausesRichard Guo2025-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit b262ad440, we introduced an optimization that reduces an IS NOT NULL qual on a column defined as NOT NULL to constant true, and an IS NULL qual on a NOT NULL column to constant false, provided we can prove that the input expression of the NullTest is not nullable by any outer join. This deduction happens after we have generated multiple clones of the same qual condition to cope with commuted-left-join cases. However, performing the NullTest deduction for clone clauses can be unsafe, because we don't have a reliable way to determine if the input expression of a NullTest is non-nullable: nullingrel bits in clone clauses may not reflect reality, so we dare not draw conclusions from clones about whether Vars are guaranteed not-null. To fix, we check whether the given RestrictInfo is a clone clause in restriction_is_always_true and restriction_is_always_false, and avoid performing any reduction if it is. There are several ensuing plan changes in predicate.out, and we have to modify the tests to ensure that they continue to test what they are intended to. Additionally, this fix causes the test case added in f00ab1fd1 to no longer trigger the bug that commit fixed, so we also remove that test case. Back-patch to v17 where this bug crept in. Reported-by: Ronald Cruz <cruz@rentec.com> Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/f5320d3d-77af-4ce8-b9c3-4715ff33f213@rentec.com Backpatch-through: 17
* ecpg: Add TAP test for the ecpg command.Fujii Masao2025-03-04
| | | | | | | | | | | | | | This commit adds a TAP test to verify that the ecpg command correctly detects unsupported or disallowed statements in input files and reports the appropriate error or warning messages. This test helps catch bugs like the one introduced in commit 3d009e45bd, which broke ecpg's handling of unsupported COPY FROM STDIN statements, later fixed by commit 94b914f601b. Author: Ryo Kanbayashi <kanbayashi.dev@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CANOn0EzoMyxA1m-quDS1UeQUq6FNki6+GGiGucgr9tm2R78rKw@mail.gmail.com
* Split pgstat_bestart() into three different routinesMichael Paquier2025-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pgstat_bestart(), used post-authentication to set up a backend entry in the PgBackendStatus array, so as its data becomes visible in pg_stat_activity and related catalogs, has its logic divided into three routines with this commit, called in order at different steps of the backend initialization: * pgstat_bestart_initial() sets up the backend entry with a minimal amount of information, reporting it with a new BackendState called STATE_STARTING while waiting for backend initialization and client authentication to complete. The main benefit that this offers is observability, so as it is possible to monitor the backend activity during authentication. This step happens earlier than in the logic prior to this commit. pgstat_beinit() happens earlier as well, before authentication. * pgstat_bestart_security() reports the SSL/GSS status of the connection, once authentication completes. Auxiliary processes, for example, do not need to call this step, hence it is optional. This step is called after performing authentication, same as previously. * pgstat_bestart_final() reports the user and database IDs, takes the entry out of STATE_STARTING, and reports its application_name. This is called as the last step of the three, once authentication completes. An injection point is added, with a test checking that the "starting" phase of a backend entry is visible in pg_stat_activity. Some follow-up patches are planned to take advantage of this refactoring with more information provided in backend entries during authentication (LDAP hanging was a problem for the author, initially). Author: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAOYmi+=60deN20WDyCoHCiecgivJxr=98s7s7-C8SkXwrCfHXg@mail.gmail.com
* Add more assertions in palloc0() and palloc_extended()Michael Paquier2025-03-04
| | | | | | | | | | | | | | palloc() includes an assertion checking that an alloc() implementation never returns NULL for all MemoryContextMethods. This commit adds a similar assertion in palloc0(). In palloc_extend(), a different assertion is added, checking that MCXT_ALLOC_NO_OOM is set when an alloc() routine returns NULL. These additions can be useful to catch errors when implementing a new set of MemoryContextMethods routines. Author: Andreas Karlsson <andreas@proxel.se> Discussion: https://postgr.es/m/507e8eba-2035-4a12-a777-98199a66beb8@proxel.se
* Allow => syntax for named cursor arguments in plpgsql.Tom Lane2025-03-03
| | | | | | | | | | | | We've traditionally accepted "name := value" syntax for cursor arguments in plpgsql. But it turns out that the equivalent statements in Oracle use "name => value". Since we accept both forms of punctuation for function arguments, it makes sense to do the same here. Author: Pavel Stehule <pavel.stehule@gmail.com> Reviewed-by: Gilles Darold <gilles@darold.net> Discussion: https://postgr.es/m/CAFj8pRA3d0ARQEMbABa1n6q25AUdNmyO8aGs56XNf9pD4sRMjQ@mail.gmail.com
* ci: Use a RAM disk for NetBSD and OpenBSD.Thomas Munro2025-03-04
| | | | | | | | | | | | | | | | Put the RAM disk setup for all three *BSD CI tasks into a common script, replacing the old FreeBSD-specific one from commit 0265e5c1. This makes them run 3 times and a bit over 2 times faster, respectively. NetBSD and FreeBSD now share the same one-liner to mount tmpfs. OpenBSD needs a GCP-image specific recipe that knows where to steal an unused disk partition needed to reserve swap space for an mfs RAM disk, because its tmpfs is deprecated and currently broken. The configured size is enough for our current tests but could potentially need future expansion. Thanks to Bilal for the disklabel incantation. Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGJJ-XrPhN%2BQA4ZUfYAAXcwOSDty9t0vE9Z8__AdacKnQg%40mail.gmail.com
* Trigger more frequent autovacuums with relallfrozenMelanie Plageman2025-03-03
| | | | | | | | | | | | | | | | | | | | | | | | | Calculate the insert threshold for triggering an autovacuum of a relation based on the number of unfrozen pages. By only considering the unfrozen portion of the table when calculating how many tuples to add to the insert threshold, we can trigger more frequent vacuums of insert-heavy tables. This increases the chances of vacuuming those pages when they still reside in shared buffers This also increases the number of autovacuums triggered by tuples inserted and not by wraparound risk. We prefer to freeze these pages during insert-triggered autovacuums, as anti-wraparound vacuums are not automatically canceled by conflicting lock requests. We calculate the unfrozen percentage of the table using the recently added (99f8f3fbbc8f) relallfrozen column of pg_class. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_aj-P7YyBz_cPNwztz6ohP%2BvWis%3Diz3YcomkB3NpYA--w%40mail.gmail.com
* Simplify some logic around setting pg_attribute.atthasdef.Tom Lane2025-03-03
| | | | | | | | | | | | | | | | | | | | | | | | DefineRelation was of the opinion that it could usefully pre-fill atthasdef flags to eliminate work for StoreAttrDefault. This is not the case, however: the tupledesc that it's filling is not the one that InsertPgAttributeTuples will work from. The tupledesc used there is made by RelationBuildLocalRelation, which deliberately doesn't copy atthasdef. Moreover, if this did happen as the code thinks, it would be wrong for the case of plain "DEFAULT NULL" clauses, since we detect and ignore simple-null-Const defaults later on. Hence, remove the useless code. It also emerges that it's not really worth a special-case path in StoreAttrDefault() for atthasdef already being set, because as far as we can see that never happens: cases where an existing default gets updated always do RemoveAttrDefault first, so as to clean up possibly-no-longer-correct dependency entries. If it were the case the code would still work, anyway. Also remove a nearby comment made moot by 5eaa0e92e. Author: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
* Remove now-dead code in StoreAttrDefault().Tom Lane2025-03-03
| | | | | | | | | | | | | | | | | StoreAttrDefault() is no longer responsible for filling attmissingval, so remove the code for that. Get rid of RawColumnDefault.missingMode, too, as we no longer need that to pass information around. While here, clean up some sloppy coding in StoreAttrDefault(), such as failure to use XXXGetDatum macros. These aren't bugs but they're not good code either. Reported-by: jian he <jian.universality@gmail.com> Author: jian he <jian.universality@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
* Fix broken handling of domains in atthasmissing logic.Tom Lane2025-03-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If a domain type has a default, adding a column of that type (without any explicit DEFAULT clause) failed to install the domain's default value in existing rows, instead leaving the new column null. This is unexpected, and it used to work correctly before v11. The cause is confusion in the atthasmissing mechanism about which default value to install: we'd only consider installing an explicitly-specified default, and then we'd decide that no table rewrite is needed. To fix, take the responsibility for filling attmissingval out of StoreAttrDefault, and instead put it into ATExecAddColumn's existing logic that derives the correct value to fill the new column with. Also, centralize the logic that determines the need for default-related table rewriting there, instead of spreading it over four or five places. In the back branches, we'll leave the attmissingval-filling code in StoreAttrDefault even though it's now dead, for fear that some extension may be depending on that functionality to exist there. A separate HEAD-only patch will clean up the now-useless code. Reported-by: jian he <jian.universality@gmail.com> Author: jian he <jian.universality@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com Backpatch-through: 13
* Add relallfrozen to pg_classMelanie Plageman2025-03-03
| | | | | | | | | | | | | | | | | | | | | | | | | | Add relallfrozen, an estimate of the number of pages marked all-frozen in the visibility map. pg_class already has relallvisible, an estimate of the number of pages in the relation marked all-visible in the visibility map. This is used primarily for planning. relallfrozen, together with relallvisible, is useful for estimating the outstanding number of all-visible but not all-frozen pages in the relation for the purposes of scheduling manual VACUUMs and tuning vacuum freeze parameters. A future commit will use relallfrozen to trigger more frequent vacuums on insert-focused workloads with significant volume of frozen data. Bump catalog version Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_aj-P7YyBz_cPNwztz6ohP%2BvWis%3Diz3YcomkB3NpYA--w%40mail.gmail.com
* Allow parallel CREATE INDEX for GIN indexesTomas Vondra2025-03-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow using parallel workers to build a GIN index, similarly to BTREE and BRIN. For large tables this may result in significant speedup when the build is CPU-bound. The work is divided so that each worker builds index entries on a subset of the table, determined by the regular parallel scan used to read the data. Each worker uses a local tuplesort to sort and merge the entries for the same key. The TID lists do not overlap (for a given key), which means the merge sort simply concatenates the two lists. The merged entries are written into a shared tuplesort for the leader. The leader needs to merge the sorted entries again, before writing them into the index. But this way a significant part of the work happens in the workers, and the leader is left with merging fewer large entries, which is more efficient. Most of the parallelism infrastructure is a simplified copy of the code used by BTREE indexes, omitting the parts irrelevant for GIN indexes (e.g. uniqueness checks). Original patch by me, with reviews and substantial improvements by Matthias van de Meent, certainly enough to make him a co-author. Author: Tomas Vondra, Matthias van de Meent Reviewed-by: Matthias van de Meent, Andy Fan, Kirill Reshke Discussion: https://postgr.es/m/6ab4003f-a8b8-4d75-a67f-f25ad98582dc%40enterprisedb.com
* Handle auxiliary processes in SQL functions of backend statisticsMichael Paquier2025-03-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | This commit impacts the following SQL functions, authorizing the access to the PGPROC entries of auxiliary processes when attempting to fetch or reset backend-level pgstats entries: - pg_stat_reset_backend_stats() - pg_stat_get_backend_io() This is relevant since a051e71e28a1 for at least the WAL summarizer, WAL receiver and WAL writer processes, that has changed the backend statistics to authorize these three following the addition of WAL I/O statistics in pg_stat_io and backend statistics. The code is more flexible with future changes written this way, adapting automatically to any updates done in pgstat_tracks_backend_bktype(). While on it, pgstat_report_wal() gains a call to pgstat_flush_backend(), making sure that backend I/O statistics are updated when calling this routine. This makes the statistics report correctly for the WAL writer. WAL receiver and WAL summarizer do not call pgstat_report_wal() yet (spoiler: both should). It should be possible to lift some of the existing restrictions for other auxiliary processes, as well, but this is left as future work. Reported-by: Rahila Syed <rahilasyed90@gmail.com> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/CAH2L28v9BwN8_y0k6FQ591=0g2Hj_esHLGj3bP38c9nmVykoiA@mail.gmail.com
* Use PRI*64 instead of "ll*" in format strings (minimal trial)Peter Eisentraut2025-03-02
| | | | | | | | | | | | | | | | | | | | | | | | | | Old: errmsg("hello %llu", (unsigned long long) x) New: errmsg("hello %" PRIu64, x) And likewise for everything printf-like. In the past we had to use long long so localized format strings remained architecture independent in message catalogs. Although long long is expected to be 64 bit everywhere, if we hadn't also cast the int64 values, we'd have generated compiler warnings on systems where int64 was long. Now that int64 is int64_t, C99 understand how to format them using <inttypes.h> macros, the casts are not necessary, and the gettext() tools recognize the macros and defer expansion until load time. (And if we ever manage to get -Wformat-signedness to work for us, that'd help with these too, but not the type-system-clobbering casts.) This particular patch converts only pg_checksums.c to the new system, to allow testing of the translation toolchain for everyone. If this works okay, a later patch will convert most of the rest. Author: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/b936d2fb-590d-49c3-a615-92c3a88c6c19%40eisentraut.org
* Fix pg_strtof() to not crash on NULL endptr.Tom Lane2025-03-01
| | | | | | | | | | | | | | | | We had managed not to notice this simple oversight because none of our calls exercised the case --- until commit 8f427187d. That led to pg_dump crashing on any platform that uses this code (currently Cygwin and Mingw). Even though there's no immediate bug in the back branches, backpatch, because a non-POSIX-compliant strtof() substitute is trouble waiting to happen for extensions or future back-patches. Diagnosed-by: Alexander Lakhin <exclusion@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/339b3902-4e98-4e31-a744-94e43b7b9292@gmail.com Backpatch-through: 13
* Set amcancrosscompare to true for hashPeter Eisentraut2025-03-01
| | | | | | | | | This was missed in the refactoring in patch ce62f2f2a0a, which thus created a regression. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/E1tngY6-0000UL-2n%40gemulon.postgresql.org
* Work around OAuth/EVFILT_TIMER quirk on NetBSD.Thomas Munro2025-03-01
| | | | | | | | | | | | | | | | | | | | | | | NetBSD's EVFILT_TIMER doesn't like zero timeouts, as introduced by commit b3f0be788. Steal the workaround from the same problem on Linux from a few lines up: round zero up to one. Do this only for NetBSD, as the other systems with the kevent() API accept zero and shouldn't have to insert a small bogus wait. Future improvement ideas: * when NetBSD < 10 falls out of support, we could try NODE_ABSTIME for the "fire now" meaning if timeout == 0 * when libcurl tells us to start a 0ms timer and call it back, we could figure out how to handle that more directly without involving the kernel (the current architecture doesn't make that straightforward) Failures with EINVAL errors could be seen on the new optional NetBSD CI task that we're trying to keep green as a candidate for inclusion as default-enabled CI task. The NetBSD build farm animals aren't testing OAuth yet, so no breakage there. Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/CA%2BhUKGJ%2BWyJ26QGvO_nkgvbxgw%2B03U4EQ4Hxw%2BQBft6Np%2BXW7w%40mail.gmail.com
* Re-export NextCopyFromRawFields() to copy.h.Masahiko Sawada2025-02-28
| | | | | | | | | | | | | | | Commit 7717f630069 removed NextCopyFromRawFields() from copy.h. While it was hoped that NextCopyFrom() could serve as an alternative, certain use cases still require NextCopyFromRawFields(). For instance, extensions like file_text_array_fdw, which process source data with an unknown number of columns, rely on this function. Per buildfarm member crake. Reported-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Sutou Kouhei <kou@clear-code.com> Discussion: https://postgr.es/m/5c7e1ac8-5083-4c08-af19-cb9ade2f16ce@dunslane.net
* Tweak regex to avoid a bug in Perl 5.16.3.Tom Lane2025-02-28
| | | | | | | | | | For some reason, 5.16.3 (and perhaps slightly earlier/later versions) go into an infinite loop with the version-replacement regex installed by commit fc0d0ce97. We can work around that by using an explicit "\n" instead of the line-start metacharacter "^". Reported-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0u9dV3CdKqkqdusA_RdvBkwWe0c0rxcFWj++VYoutFYSw@mail.gmail.com
* Refactor COPY FROM to use format callback functions.Masahiko Sawada2025-02-28
| | | | | | | | | | | | | | | | | | | | | | | | This commit introduces a new CopyFromRoutine struct, which is a set of callback routines to read tuples in a specific format. It also makes COPY FROM with the existing formats (text, CSV, and binary) utilize these format callbacks. This change is a preliminary step towards making the COPY FROM command extensible in terms of input formats. Similar to 2e4127b6d2d, this refactoring contributes to a performance improvement by reducing the number of "if" branches that need to be checked on a per-row basis when sending field representations in text or CSV mode. The performance benchmark results showed ~5% performance gain in text or CSV mode. Author: Sutou Kouhei <kou@clear-code.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://postgr.es/m/20231204.153548.2126325458835528809.kou@clear-code.com