aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
* List 006_login_trigger.pl test for mesonAlexander Korotkov2023-10-16
| | | | | Reported-by: Thomas Munro Discussion: https://postgr.es/m/CA%2BhUKGLuqDUaYYhJnA1H1q5Z-k18kQHoEqZ5fiXtTi4038nspg%40mail.gmail.com
* worker_spi: Fix test failure with BGWORKER_BYPASS_ROLELOGINCHECKMichael Paquier2023-10-16
| | | | | | | | | | | | This is a consequence of 4817da51f69a that has bumped up max_worker_processes, where now the last worker started by the test would be able to start by itself a parallel worker because there are more slots available. This did not show up before as the number of bgworkers reached exactly 8, as known as the previous limit, at the end of the test. Per report from buildfarm member crake, reproducible with debug_parallel_query = regress in the same fashion as fd4d93d269c0.
* Try to handle torn reads of pg_control in frontend.Thomas Munro2023-10-16
| | | | | | | | | | | | | | | | | | | | | | | Some of our src/bin tools read the control file without any kind of interlocking against concurrent writes from the server. At least ext4 and ntfs can expose partially modified contents when you do that. For now, we'll try to tolerate this by retrying up to 10 times if the checksum doesn't match, until we get two reads in a row with the same bad checksum. This is not guaranteed to reach the right conclusion, but it seems very likely to. Thanks to Tom Lane for this suggestion. Various ideas for interlocking or atomicity were considered too complicated, unportable or expensive given the lack of field reports, but remain open for future reconsideration. Back-patch as far as 12. It doesn't seem like a good idea to put a heuristic change for a very rare problem into the final release of 11. Reviewed-by: Anton A. Melnikov <aamelnikov@inbox.ru> Reviewed-by: David Steele <david@pgmasters.net> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/20221123014224.xisi44byq3cf5psi%40awork3.anarazel.de
* worker_spi: Bump up max_worker_processes in TAP testsMichael Paquier2023-10-16
| | | | | | | | | | | | | | | | mamba has detected a failure in the last test that should start a bgworker while bypassing the role login check. The buildfarm did not provide any information about its failure in the logs, but I suspect that this is caused by an exhaustion of the max_worker_processes slots set at 8 by default. In "normal" test runs, the number of bgworkers running at this stage of the test is already 7, so, if one of them spawns for example a parallel worker all the slots would be taken, preventing the last worker of the test to start. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/ZSyebsiub88pyJJO@paquier.xyz
* Rename 005_login_trigger.pl to 006_login_trigger.plAlexander Korotkov2023-10-16
| | | | In order to avoid numbering collision with 005_sspi.pl.
* Fix role names in src/test/authentication/t/005_login_trigger.plAlexander Korotkov2023-10-16
| | | | Per buildfarm member longfin.
* Fix code indentation violations in e83d1b0c40ccMichael Paquier2023-10-16
| | | | | koel has not reported this one yet, I have just bumped on it while looking at a different patch.
* Fix comment from commit 22655aa231.Thomas Munro2023-10-16
| | | | | Per automated complaint from BF animal koel this needed to be re-indented, but there was also a typo. Back-patch to 16.
* Add support event triggers on authenticated loginAlexander Korotkov2023-10-16
| | | | | | | | | | | | | | | | | | | | | This commit introduces trigger on login event, allowing to fire some actions right on the user connection. This can be useful for logging or connection check purposes as well as for some personalization of environment. Usage details are described in the documentation included, but shortly usage is the same as for other triggers: create function returning event_trigger and then create event trigger on login event. In order to prevent the connection time overhead when there are no triggers the commit introduces pg_database.dathasloginevt flag, which indicates database has active login triggers. This flag is set by CREATE/ALTER EVENT TRIGGER command, and unset at connection time when no active triggers found. Author: Konstantin Knizhnik, Mikhail Gribkov Discussion: https://postgr.es/m/0d46d29f-4558-3af9-9c85-7774e14a7709%40postgrespro.ru Reviewed-by: Pavel Stehule, Takayuki Tsunakawa, Greg Nancarrow, Ivan Panchenko Reviewed-by: Daniel Gustafsson, Teodor Sigaev, Robert Haas, Andres Freund Reviewed-by: Tom Lane, Andrey Sokolov, Zhihong Yu, Sergey Shinderuk Reviewed-by: Gregory Stark, Nikita Malakhov, Ted Yu
* Acquire ControlFileLock in relevant SQL functions.Thomas Munro2023-10-16
| | | | | | | | | | | | | | Commit dc7d70ea added functions that read the control file, but didn't acquire ControlFileLock. With unlucky timing, file systems that have weak interlocking like ext4 and ntfs could expose partially overwritten contents, and the checksum would fail. Back-patch to all supported releases. Reviewed-by: David Steele <david@pgmasters.net> Reviewed-by: Anton A. Melnikov <aamelnikov@inbox.ru> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/20221123014224.xisi44byq3cf5psi%40awork3.anarazel.de
* Dissociate btequalimage() from interval_ops, ending its deduplication.Noah Misch2023-10-14
| | | | | | | | | | | | | | | | | | Under interval_ops, some equal values are distinguishable. One such pair is '24:00:00' and '1 day'. With that being so, btequalimage() breaches the documented contract for the "equalimage" btree support function. This can cause incorrect results from index-only scans. Users should REINDEX any btree indexes having interval-type columns. After updating, pg_amcheck will report an error for almost all such indexes. This fix makes interval_ops simply omit the support function, like numeric_ops does. Back-pack to v13, where btequalimage() first appeared. In back branches, for the benefit of old catalog content, btequalimage() code will return false for type "interval". Going forward, back-branch initdb will include the catalog change. Reviewed by Peter Geoghegan. Discussion: https://postgr.es/m/20231011013317.22.nmisch@google.com
* Don't spuriously report FD_SETSIZE exhaustion on Windows.Noah Misch2023-10-14
| | | | | | | | | | | | | | | | | Starting on 2023-08-03, this intermittently terminated a "pgbench -C" test in CI. It could affect a high-client-count "pgbench" without "-C". While parallel reindexdb and vacuumdb reach the same problematic check, sufficient client count and/or connection turnover is less plausible for them. Given the lack of examples from the buildfarm or from manual builds, reproducing this must entail rare operating system configurations. Also correct the associated error message, which was wrong for non-Windows. Back-patch to v12, where the pgbench check first appeared. While v11 vacuumdb has the problematic check, reaching it with typical vacuumdb usage is implausible. Reviewed by Thomas Munro. Discussion: https://postgr.es/m/CA+hUKG+JwvTNdcyJTriy9BbtzF1veSRQ=9M_ZKFn9_LqE7Kp7Q@mail.gmail.com
* Harden xxx_is_visible() functions against concurrent object drops.Tom Lane2023-10-14
| | | | | | | | | | | | | | | | | | | For the same reasons given in commit 403ac226d, adjust these functions to not assume that checking SearchSysCacheExists can guarantee success of a later fetch. This follows the same internal API choices made in the earlier commit: add a function XXXExt(oid, is_missing) and use that to eliminate the need for a separate existence check. The changes are very straightforward, though tedious. For the moment I just made the new functions static in namespace.c, but we could export them if a need emerges. Per bug #18014 from Alexander Lakhin. Given the lack of hard evidence that there's a bug in non-debug builds, I'm content to fix this only in HEAD. Discussion: https://postgr.es/m/18014-28c81cb79d44295d@postgresql.org
* Harden has_xxx_privilege() functions against concurrent object drops.Tom Lane2023-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The versions of these functions that accept object OIDs are supposed to return NULL, rather than failing, if the target object has been dropped. This makes it safe(r) to use them in queries that scan catalogs, since the functions will be applied to objects that are visible in the query's snapshot but might now be gone according to the catalog snapshot. In most cases we implemented this by doing a SearchSysCacheExists test and assuming that if that succeeds, we can safely invoke the appropriate aclchk.c function, which will immediately re-fetch the same syscache entry. It was argued that if the existence test succeeds then the followup fetch must succeed as well, for lack of any intervening AcceptInvalidationMessages call. Alexander Lakhin demonstrated that this is not so when CATCACHE_FORCE_RELEASE is enabled: the syscache entry will be forcibly dropped at the end of SearchSysCacheExists, and then it is possible for the catalog snapshot to get advanced while re-fetching the entry. Alexander's test case requires the operation to happen inside a parallel worker, but that seems incidental to the fundamental problem. What remains obscure is whether there is a way for this to happen in a non-debug build. Nonetheless, CATCACHE_FORCE_RELEASE is a very useful test methodology, so we'd better make the code safe for it. After some discussion we concluded that the most future-proof fix is to give up the assumption that checking SearchSysCacheExists can guarantee success of a later fetch. At best that assumption leads to fragile code --- for example, has_type_privilege appears broken for array types even if you believe the assumption holds. And it's not even particularly efficient. There had already been some work towards extending the aclchk.c APIs to include "is_missing" output flags, so this patch extends that work to cover all the aclchk.c functions that are used by the has_xxx_privilege() functions. (This allows getting rid of some ad-hoc decisions about not throwing errors in certain places in aclchk.c.) In passing, this fixes the has_sequence_privilege() functions to provide the same guarantees as their cousins: for some reason the SearchSysCacheExists tests never got added to those. There is more work to do to remove the unsafe coding pattern with SearchSysCacheExists in other places, but this is a pretty self-contained patch so I'll commit it separately. Per bug #18014 from Alexander Lakhin. Given the lack of hard evidence that there's a bug in non-debug builds, I'm content to fix this only in HEAD. (Perhaps we should clean up the has_sequence_privilege() oversight in the back branches, but in the absence of field complaints I'm not too excited about that either.) Discussion: https://postgr.es/m/18014-28c81cb79d44295d@postgresql.org
* Fix bulk table extension when copying into multiple partitionsAndres Freund2023-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When COPYing into a partitioned table that does now permit the use of table_multi_insert(), we could error out with ERROR: could not read block NN in file "base/...": read only 0 of 8192 bytes because BulkInsertState->next_free was not reset between partitions. This problem occurred only when not able to use table_multi_insert(), as a dedicated BulkInsertState for each partition is used in that case. The bug was introduced in 00d1e02be24, but it was hard to hit at that point, as commonly bulk relation extension is not used when not using table_multi_insert(). It became more likely after 82a4edabd27, which expanded the use of bulk extension. To fix the bug, reset the bulk relation extension state in BulkInsertState in ReleaseBulkInsertStatePin(). That was added (in b1ecb9b3fcf) to tackle a very similar issue. Obviously the name is not quite correct, but there might be external callers, and bulk insert state needs to be reset in precisely in the situations that ReleaseBulkInsertStatePin() already needed to be called. Medium term the better fix likely is to disallow reusing BulkInsertState across relations. Add a test that, without the fix, reproduces #18130 in most configurations. The test also catches the problem fixed in b1ecb9b3fcf when run with small shared_buffers. Reported-by: Ivan Kolombet <enderstd@gmail.com> Analyzed-by: Tom Lane <tgl@sss.pgh.pa.us> Analyzed-by: Andres Freund <andres@anarazel.de> Bug: #18130 Discussion: https://postgr.es/m/18130-7a86a7356a75209d%40postgresql.org Discussion: https://postgr.es/m/257696.1695670946%40sss.pgh.pa.us Backpatch: 16-
* Improve the naming in wal_sync_method code.Nathan Bossart2023-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | * sync_method is renamed to wal_sync_method. * sync_method_options[] is renamed to wal_sync_method_options[]. * assign_xlog_sync_method() is renamed to assign_wal_sync_method(). * The names of the available synchronization methods are now prefixed with "WAL_SYNC_METHOD_" and have been moved into a WalSyncMethod enum. * PLATFORM_DEFAULT_SYNC_METHOD is renamed to PLATFORM_DEFAULT_WAL_SYNC_METHOD, and DEFAULT_SYNC_METHOD is renamed to DEFAULT_WAL_SYNC_METHOD. These more descriptive names help distinguish the code for wal_sync_method from the code for DataDirSyncMethod (e.g., the recovery_init_sync_method configuration parameter and the --sync-method option provided by several frontend utilities). This change also prevents name collisions between the aforementioned sets of code. Since this only improves the naming of internal identifiers, there should be no behavior change. Author: Maxim Orlov Discussion: https://postgr.es/m/CACG%3DezbL1gwE7_K7sr9uqaCGkWhmvRTcTEnm3%2BX1xsRNwbXULQ%40mail.gmail.com
* psql: Add completion support for AT [ LOCAL | TIME ZONE ]Michael Paquier2023-10-13
| | | | | | | | | AT TIME ZONE is completed with a list of supported timezones, something not needed by AT LOCAL. Author: Dagfinn Ilmari Mannsåker Reviewed-by: Jim Jones Discussion: https://postgr.es/m/87jzyzsvgv.fsf@wibble.ilmari.org
* Add support for AT LOCALMichael Paquier2023-10-13
| | | | | | | | | | | | | | | When converting a timestamp to/from with/without time zone, the SQL Standard specifies an AT LOCAL variant of AT TIME ZONE which uses the session's time zone. This includes three system functions able to do the work in the same way as the existing flavors for AT TIME ZONE, except that these need to be marked as stable as they depend on the session's TimeZone GUC. Bump catalog version. Author: Vik Fearing Reviewed-by: Laurenz Albe, Cary Huang, Michael Paquier Discussion: https://postgr.es/m/8e25dec4-5667-c1a5-6581-167d710c2182@postgresfriends.org
* Add wait events for checkpoint delay mechanism.Thomas Munro2023-10-13
| | | | | | | | | | | When MyProc->delayChkptFlags is set to temporarily block phase transitions in a concurrent checkpoint, the checkpointer enters a sleep-poll loop to wait for the flag to be cleared. We should show that as a wait event in the pg_stat_activity view. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA%2BhUKGL7Whi8iwKbzkbn_1fixH3Yy8aAPz7mfq6Hpj7FeJrKMg%40mail.gmail.com
* Unify two isLogSwitch tests in XLogInsertRecord.Robert Haas2023-10-12
| | | | | | | | | | | | | | | | | An upcoming patch wants to introduce an additional special case in this function. To keep that as cheap as possible, minimize the amount of branching that we do based on whether this is an XLOG_SWITCH record. Additionally, and also in the interest of keeping the overhead of special-case code paths as low as possible, apply likely() to the non-XLOG_SWITCH case, since only a very tiny fraction of WAL records will be XLOG_SWITCH records. Patch by me, reviewed by Dilip Kumar, Amit Kapila, Andres Freund, and Michael Paquier. Discussion: http://postgr.es/m/CA+TgmoYy-Vc6G9QKcAKNksCa29cv__czr+N9X_QCxEfQVpp_8w@mail.gmail.com
* Fix runtime partition pruning for HASH partitioned tablesDavid Rowley2023-10-13
| | | | | | | | | | | | | | | | | | | | | | | This could only affect HASH partitioned tables with at least 2 partition key columns. If partition pruning was delayed until execution and the query contained an IS NULL qual on one of the partitioned keys, and some subsequent partitioned key was being compared to a non-Const, then this could result in a crash due to the incorrect keyno being used to calculate the stateidx for the expression evaluation code. Here we fix this by properly skipping partitioned keys which have a nullkey set. Effectively, this must be the same as what's going on inside perform_pruning_base_step(). Sergei Glukhov also provided a patch, but that's not what's being used here. Reported-by: Sergei Glukhov Reviewed-by: tender wang, Sergei Glukhov Discussion: https://postgr.es/m/d05b26fa-af54-27e1-f693-6c31590802fa@postgrespro.ru Backpatch-through: 11, where runtime partition pruning was added.
* Fix incorrect step generation in HASH partition pruningDavid Rowley2023-10-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | get_steps_using_prefix_recurse() incorrectly assumed that it could stop recursive processing of the 'prefix' list when cur_keyno was one before the step_lastkeyno. Since hash partition pruning can prune using IS NULL quals, and these IS NULL quals are not present in the 'prefix' list, then that logic could cause more levels of recursion than what is needed and lead to there being no more items in the 'prefix' list to process. This would manifest itself as a crash in some code that expected the 'start' ListCell not to be NULL. Here we adjust the logic so that instead of stopping recursion at 1 key before the step_lastkeyno, we just look at the llast(prefix) item and ensure we only recursively process up until just before whichever the last key is. This effectively allows keys to be missing in the 'prefix' list. This change does mean that step_lastkeyno is no longer needed, so we remove that from the static functions. I also spent quite some time reading this code and testing it to try to convince myself that there are no other issues. That resulted in the irresistible temptation of rewriting some comments, many of which were just not true or inconcise. Reported-by: Sergei Glukhov Reviewed-by: Sergei Glukhov, tender wang Discussion: https://postgr.es/m/2f09ce72-315e-2a33-589a-8519ada8df61@postgrespro.ru Backpatch-through: 11, where partition pruning was introduced.
* Add option to bgworkers to allow the bypass of role login checkMichael Paquier2023-10-12
| | | | | | | | | | | | | | | | | | | This adds a new option called BGWORKER_BYPASS_ROLELOGINCHECK to the flags available to BackgroundWorkerInitializeConnection() and BackgroundWorkerInitializeConnectionByOid(). This gives the possibility to bgworkers to bypass the role login check, making possible the use of a role that has no login rights while not being a superuser. PostgresInit() gains a new flag called INIT_PG_OVERRIDE_ROLE_LOGIN, taking advantage of the refactoring done in 4800a5dfb4c4. Regression tests are added to worker_spi to check the behavior of this new option with bgworkers. Author: Bertrand Drouvot Reviewed-by: Nathan Bossart, Michael Paquier, Bharath Rupireddy Discussion: https://postgr.es/m/bcc36259-7850-4882-97ef-d6b905d2fc51@gmail.com
* Reindent comment in GenericXLogFinish().Tom Lane2023-10-11
| | | | Restore pgindent cleanliness, per buildfarm member koel.
* Fix missed optimization in relation_excluded_by_constraints().Tom Lane2023-10-11
| | | | | | | | | | | | | | | | | | | | | | In commit 3fc6e2d7f, I (tgl) argued that we only need to check for a constant-FALSE restriction clause when there's exactly one restriction clause, on the grounds that const-folding would have thrown away anything ANDed with a Const FALSE. That's true just after const-folding has been applied, but subsequent processing such as equivalence class expansion could result in cases where a Const FALSE is ANDed with some other stuff. (Compare for instance joinrels.c's restriction_is_constant_false.) Hence, tweak this logic to check all the elements of the baserestrictinfo list, not just one; that's cheap enough to not be worth worrying about. There is one existing test case where this visibly improves the plan. There would not be any savings in runtime, but the planner effort and executor startup effort will be reduced, and anyway it's odd that we can detect related cases but not this one. Richard Guo (independently discovered by David Rowley) Discussion: https://postgr.es/m/CAMbWs4_x3-CnVVrCboS1LkEhB5V+W7sLSCabsRiG+n7+5_kqbg@mail.gmail.com
* Move canAcceptConnections check from ProcessStartupPacket to caller.Heikki Linnakangas2023-10-11
| | | | | | | | | | The check is not about processing the startup packet, so the calling function seems like a more natural place. I'm also working on a patch that moves 'canAcceptConnections' out of the Port struct, and this makes that refactoring more convenient. Reviewed-by: Tristan Partin Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi
* Improve some wording in pg_upgrade/IMPLEMENTATIONMichael Paquier2023-10-11
| | | | | Author: Gurjeet Singh Discussion: https://postgr.es/m/CABwTF4VFKtKrb78fBnMXwHvOu4a+-7y86siBSEety2knti2eGA@mail.gmail.com
* Refactor InitPostgres() to use bitwise option flagsMichael Paquier2023-10-11
| | | | | | | | | | | | | | | InitPostgres() has been using a set of boolean arguments to control its behavior, and a patch under discussion was aiming at expanding it with a third one. In preparation for expanding this area, this commit switches all the current boolean arguments of this routine to a single bits32 argument instead. Two values are currently supported for the flags: - INIT_PG_LOAD_SESSION_LIBS to load [session|local]_preload_libraries at startup. - INIT_PG_OVERRIDE_ALLOW_CONNS to allow connection to a database even if it has !datallowconn. This is used by bgworkers. Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/ZSTn66_BXRZCeaqS@paquier.xyz
* Fix bug in GenericXLogFinish().Jeff Davis2023-10-10
| | | | | | | | Mark the buffers dirty before writing WAL. Discussion: https://postgr.es/m/25104133-7df8-cae3-b9a2-1c0aaa1c094a@iki.fi Reviewed-by: Heikki Linnakangas Backpatch-through: 11
* Replace has_multiple_baserels() with a bitmap test on all_baserels.Tom Lane2023-10-10
| | | | | | | | | | | | Since we added the PlannerInfo.all_baserels set, it's not really necessary to grovel over the rangetable to count baserels in the current query. So let's drop has_multiple_baserels() in favor of a bms_membership() test. This might be microscopically faster, but the main point is to remove some unnecessary code. Richard Guo Discussion: https://postgr.es/m/CAMbWs4_8RcSbbfs1ASZLrMuL0c0EQgXWcoLTQD8swBRY_pQQiA@mail.gmail.com
* pg_resetwal: Corrections around -c optionPeter Eisentraut2023-10-10
| | | | | | | | | | | The present pg_resetwal code hardcodes the minimum value for -c as 2, which is FrozenTransactionId, but it's not clear why that is allowed. After some research, it was probably a mistake in the original patch. Change it to FirstNormalTransactionId, which matches other xid-related options in pg_resetwal. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/flat/d09f0e91-8757-642b-1a92-da9a52f5589a%40eisentraut.org
* Add const to values and nulls argumentsPeter Eisentraut2023-10-10
| | | | | | | This excludes any changes that would change the external AM APIs. Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://www.postgresql.org/message-id/flat/14c31f4a-0347-0805-dce8-93a9072c05a5%40eisentraut.org
* Fix possible crash in add_paths_to_append_rel()David Rowley2023-10-10
| | | | | | | | | | | | | | While working on a8a968a82, I failed to consider that cheapest_startup_path can be NULL when there is no non-parameterized path in the pathlist. This is well documented in set_cheapest(), I just failed to notice. Here we adjust the code to just check if the RelOptInfo has a cheapest_startup_path set before adding it to the startup_subpaths list. Reported-by: Richard Guo Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs49w3t03V69XhdCuw+GDwivny4uQUxrkVp6Gejaspt0wMQ@mail.gmail.com
* Revert "Optimize various aggregate deserialization functions"David Rowley2023-10-10
| | | | | | | | | | This reverts commit 608fd198def5390c3490bfe903730207dfd8eeb4. On 2nd thoughts, the StringInfo API requires that strings are NUL terminated and pointing directly to the data in a bytea Datum isn't NUL terminated. Discussion: https://postgr.es/m/CAApHDvorfO3iBZ=xpiZvp3uHtJVLyFaPBSvcAhAq2HPLnaNSwQ@mail.gmail.com
* worker_spi: Fix another stability issue with BGWORKER_BYPASS_ALLOWCONNMichael Paquier2023-10-10
| | | | | | | | | | | | | | | | | | | worker_spi_launch() may report that a worker stopped when it fails to connect on a database that does not allow connections if the worker exits before the SQL function checks for the current status of the worker. The test is switched to use Cluster::psql instead of safe_psql() so as it does not fail hard when this query errors. While on it, this removes a query that looks at pg_stat_activity to simplify the test, as a check on the contents of the server logs achieves the same when the worker cannot connect to the database without datallowconn. Per buildfarm members kestrel, mamba and serinus. Bonus thanks to Tom Lane for providing the logs of the failure from mamba that the buildfarm was not able to show up. Note that I have reproduced the failure with a hardcoded stop point. Discussion: https://postgr.es/m/3365937.1696801735@sss.pgh.pa.us
* Rename StartBackgroundWorker() to BackgroundWorkerMain().Heikki Linnakangas2023-10-09
| | | | | | | | | | | The comment claimed that it is "called from postmaster", but it is actually called in the child process, pretty early in the process initialization. I guess you could interpret "called from postmaster" to mean that, but it seems wrong to me. Rename the function to be consistent with other functions with similar role. Reviewed-by: Thomas Munro Discussion: https://www.postgresql.org/message-id/4f95c1fc-ad3c-7974-3a8c-6faa3931804c@iki.fi
* Allocate Backend structs in PostmasterContext.Heikki Linnakangas2023-10-09
| | | | | | | | | The child processes don't need them. By allocating them in PostmasterContext, the memory gets free'd and is made available for other stuff in the child processes. Reviewed-by: Thomas Munro Discussion: https://www.postgresql.org/message-id/4f95c1fc-ad3c-7974-3a8c-6faa3931804c@iki.fi
* Clarify the checks in RegisterBackgroundWorker.Heikki Linnakangas2023-10-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In EXEC_BACKEND or single-user mode, we process shared_preload_libraries at postmaster startup as usual, but also at backend startup. When a library calls RegisterBackgroundWorker() when being loaded into a backend process, we go through the motions to add the worker to BackgroundWorkerList, even though that is a postmaster-private data structure. Make it return early when called in a backend process, without changing BackgroundWorkerList. You could argue that it was intentional: In non-EXEC_BACKEND mode, the backend processes inherit BackgroundWorkerList at fork(), so it does make some sense to initialize it to the same state in EXEC_BACKEND mode, too. It's clearly a postmaster-private structure, though, and all the functions that use it are clearly marked as "should only be called in postmaster". You could also argue that libraries should not call RegisterBackgroundWorker() during backend startup. It's too late to correctly register any static background workers at that stage. But it's a common pattern in extensions, and it doesn't seem worth the churn to require all extensions to change it. Another sloppiness was the exception for "internal" background workers. We checked that RegisterBackgroundWorker() was called during shared_preload_libraries processing, or the background worker was an internal one. That exception was made in commit 665d1fad99 to allow postmaster to register the logical apply launcher in ApplyLauncherRegister(). The way the check was written, it would not complain if you registered an internal background worker in a regular backend process. But it would complain if postmaster registered a background worker defined in a shared library, outside shared_preload_libraries processing. I think the correct rule is that you can only register static background workers in the postmaster process, and only before the bgworker shared memory array has been initialized. Check for that more directly. Reviewed-by: Thomas Munro Discussion: https://www.postgresql.org/message-id/4f95c1fc-ad3c-7974-3a8c-6faa3931804c@iki.fi
* Optimize various aggregate deserialization functionsDavid Rowley2023-10-09
| | | | | | | | | | | | | | | The serialized representation of an internal aggregate state is a bytea value. In each deserial function, in order to "receive" the bytea value we appended it onto a short-lived StringInfoData using appendBinaryStringInfo. This was a little wasteful as it meant having to palloc memory, copy a (possibly long) series of bytes then later pfree that memory. Instead of going to this extra trouble, we can just fake up a StringInfoData and point the data directly at the bytea's payload. This should help increase the performance of internal aggregate deserialization. Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAApHDvr=e-YOigriSHHm324a40HPqcUhSp6pWWgjz5WwegR=cQ@mail.gmail.com
* Remove duplicate words in docs and code comments.Amit Kapila2023-10-09
| | | | | | | Additionally, add a missing "the" in a couple of places. Author: Vignesh C, Dagfinn Ilmari Mannsåker Discussion: http://postgr.es/m/CALDaNm28t+wWyPfuyqEaARS810Je=dRFkaPertaLAEJYY2cWYQ@mail.gmail.com
* Strip off ORDER BY/DISTINCT aggregate pathkeys in create_agg_pathDavid Rowley2023-10-09
| | | | | | | | | | | | | | | | | | | | | | | 1349d2790 added code to adjust the PlannerInfo.group_pathkeys so that ORDER BY / DISTINCT aggregate functions could obtain pre-sorted inputs to allow faster execution. That commit forgot to adjust the pathkeys in create_agg_path(). Some code in there assumed that it was always fine to make the AggPath's pathkeys the same as its subpath's. That seems to have been ok up until 1349d2790, but since that commit adds pathkeys for columns which are within the aggregate function, those columns won't be available above the aggregate node. This can result in "could not find pathkey item to sort" during create_plan(). The fix here is to strip off the additional pathkeys added by adjust_group_pathkeys_for_groupagg(). It seems that the pathkeys here will only ever be group_pathkeys, so all we need to do is check if the length of the pathkey list is longer than the num_groupby_pathkeys and get rid of the additional ones only if we see extras. Reported-by: Justin Pryzby Reviewed-by: Richard Guo Discussion: https://postgr.es/m/ZQhYYRhUxpW3PSf9%40telsasoft.com Backpatch-through: 16, where 1349d2790 was introduced
* Remove debug_print_rel and replace usages with pprintDavid Rowley2023-10-09
| | | | | | | | | | | | | | | | | | | | Going by c4a1933b4, b33ef397a and 05893712c (to name just a few), it seems that maintaining debug_print_rel() is often forgotten. In the case of c4a1933b4, it was several years before anyone noticed that a path type was not handled by debug_print_rel(). (debug_print_rel() is only compiled when building with OPTIMIZER_DEBUG). After a quick survey on the pgsql-hackers mailing list, nobody came forward to admit that they use OPTIMIZER_DEBUG. So to prevent any future maintenance neglect, let's just remove debug_print_rel() and have OPTIMIZER_DEBUG make use of pprint() instead (as suggested by Tom Lane). If anyone wants to come forward to claim they make use of OPTIMIZER_DEBUG in a way that they need debug_print_rel() then they have around 10 months remaining in the v17 cycle where we could revert this. If nobody comes forward in that time, then we can likely safely declare debug_print_rel() as not worth keeping. Discussion: https://postgr.es/m/CAApHDvoCdjo8Cu2zEZF4-AxWG-90S+pYXAnoDDa9J3xH-OrczQ@mail.gmail.com
* Fix another typo in e0b1ee17dcAlexander Korotkov2023-10-07
| | | | | Reported-by: Richard Guo Discussion: https://postgr.es/m/CAMbWs4_kHMJDak75y1kBTirv-drS1-knT-7Mpg5LprAjqRJDVA%40mail.gmail.com
* Restore proper linkage of pg_char_to_encoding() and friends.Tom Lane2023-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Back in the 8.3 era we discovered that it was problematic if libpq.so had encoding ID assignments different from the backend, which is possible because on some platforms libpq.so might be of a different major version from the calling programs. psql should use libpq's assignments, but initdb has to use the backend's, else it will put wrong values into pg_database. The solution devised in commit 8468146b0 relied on giving initdb its own copy of encnames.c rather than relying on the functions exported by libpq. Later, that metamorphosed into ensuring that libpgcommon got linked before libpq -- which made things OK for initdb but broke psql. We didn't notice for lack of any changes in enum pg_enc since then. Commit 06843df4a reversed that, fixing the latent bug in psql but adding one in initdb. The meson build infrastructure is also not being sufficiently careful about link order, and trying to make it so would be equally fragile. Hence, let's use a new scheme based on giving the libpq-exported symbols different real names than the same functions exported from libpgcommon.a or libpgcommon_srv.a. (We could distinguish those two cases as well, but there seems no need to.) libpq gets the official names to avoid an ABI break for libpq clients, while the other cases use #define's to make the real names "xxx_private" rather than "xxx". By controlling where the #define's are applied, we can force any particular client program to use one set or the other of the encnames.c functions. We cannot back-patch this, since it'd be an ABI break for backend loadable modules, but there seems little need to. We're just trying to ensure that the world is safe for hypothetical future additions to enum pg_enc. In passing this should fix "duplicate symbol" linker warnings that we've been seeing on AIX buildfarm members since commit 06843df4a. It's not very clear why that linker is complaining now, when there were strictly *more* duplicates visible before, but in any case this should remove the reason for complaint. Patch by me; thanks to Andres Freund for review. Discussion: https://postgr.es/m/2385119.1696354473@sss.pgh.pa.us
* Fix typos in e0b1ee17dcAlexander Korotkov2023-10-07
| | | | Reported-by: Alexander Lakhin
* Add test for checking the line length of --help outputPeter Eisentraut2023-10-06
| | | | | | | | | | | There was some discussion what the line length should be. Most output currently clearly targets around 80 columns, but the maximum in use currently is 95, so we set that as the current maximum. This just ensures that there is some guidance and there are no wild deviations. based on patch by Atsushi Torikoshi <torikoshia@oss.nttdata.com> Discussion: https://www.postgresql.org/message-id/flat/50ca8ff35a8dd8f9ec89963b503571a7@oss.nttdata.com
* Remove environment-variable-based defaults in psql --helpPeter Eisentraut2023-10-06
| | | | | | | | | | This seemed inconsistent with the --help output of other tools. Depending on the values, it can cause ugly formatting. Also, we're not getting the defaults from libpq, we're just emulating the methods libpq uses to derive these values, so they might not be 100% correct. Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Discussion: https://www.postgresql.org/message-id/flat/50ca8ff35a8dd8f9ec89963b503571a7@oss.nttdata.com
* Remove extra parenthesis from comment.Etsuro Fujita2023-10-06
|
* Skip checking of scan keys required for directional scan in B-treeAlexander Korotkov2023-10-06
| | | | | | | | | | | | | | | | | | | | | Currently, B-tree code matches every scan key to every item on the page. Imagine the ordered B-tree scan for the query like this. SELECT * FROM tbl WHERE col > 'a' AND col < 'b' ORDER BY col; The (col > 'a') scan key will be always matched once we find the location to start the scan. The (col < 'b') scan key will match every item on the page as long as it matches the last item on the page. This patch implements prechecking of the scan keys required for directional scan on beginning of page scan. If precheck is successful we can skip this scan keys check for the items on the page. That could lead to significant acceleration especially if the comparison operator is expensive. Idea from patch by Konstantin Knizhnik. Discussion: https://postgr.es/m/079c3f8e-3371-abe2-e93c-fc8a0ae3f571%40garret.ru Reviewed-by: Peter Geoghegan, Pavel Borisov
* Fix crash on syslogger startupHeikki Linnakangas2023-10-06
| | | | | | | | When syslogger starts up, ListenSockets is still NULL. Don't try to pfree it. Oversight in commit e29c464395. Reported-by: Michael Paquier Discussion: https://www.postgresql.org/message-id/ZR-uNkgL7m60lWUe@paquier.xyz