path: root/src
Commit message (Author, Age)
...
* Use CXXFLAGS instead of CFLAGS for linking C++ code (Peter Eisentraut, 2024-08-04)
  Otherwise, this would break if using C and C++ compilers from different families and they understand different options. It already used the right flags for compiling, this is only for linking. Also, the meson setup already did this correctly.
  Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
  Discussion: https://www.postgresql.org/message-id/228700.1722717983@sss.pgh.pa.us
* Fix incorrect format placeholders in pgstat.c (Michael Paquier, 2024-08-04)
  These should have been switched from %d to %u in 3188a4582a8c in the debugging elogs added in ca1ba50fcb6f. PgStat_Kind should never be higher than INT32_MAX, but let's be clean. Issue noticed while hacking more on this area.
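A minimal sketch of the placeholder issue, not the pgstat.c code itself: `StatKind` and `format_kind()` are hypothetical names standing in for PgStat_Kind and a debugging message. With %d, a value above INT32_MAX would typically print as a negative number; %u prints the unsigned value as intended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an unsigned 32-bit kind identifier. */
typedef uint32_t StatKind;

/*
 * Format a kind ID for a debug message.  %u is the placeholder that
 * matches an unsigned int argument; the cast makes the match explicit.
 */
int
format_kind(char *buf, size_t buflen, StatKind kind)
{
	assert(buf != NULL);
	return snprintf(buf, buflen, "stats kind %u", (unsigned int) kind);
}
```

The cast to unsigned int assumes a platform where unsigned int is at least 32 bits wide, which holds on all platforms PostgreSQL currently targets.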
* Add -Wmissing-variable-declarations to the standard compilation flags (Peter Eisentraut, 2024-08-03)
  This warning flag detects global variables not declared in header files. This is similar to what -Wmissing-prototypes does for functions. (More correctly, it is similar to what -Wmissing-declarations does for functions, but -Wmissing-prototypes is a superset of that in C.) This flag is new in GCC 14. Clang has supported it for a while. Several recent commits have cleaned up warnings triggered by this, so it should now be clean.
  Reviewed-by: Andres Freund <andres@anarazel.de>
  Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
* Small refactoring around ExecCreateTableAs(). (Jeff Davis, 2024-08-02)
  Since commit 4b74ebf726, the refresh logic is used to populate materialized views, so we can simplify the error message in ExecCreateTableAs(). Also, the call to RefreshMatViewByOid() is moved to just after the create_ctas_nodata() call to improve code readability.
  Author: Yugo Nagata
  Discussion: https://postgr.es/m/20240802161301.d975daca9ba7a706fa05ecd7@sraoss.co.jp
* Implement pg_wal_replay_wait() stored procedure (Alexander Korotkov, 2024-08-02)
  pg_wal_replay_wait() is to be used on a standby and waits for a specific WAL location to be replayed. This is useful when the user makes some data changes on the primary and needs a guarantee that these changes are visible on the standby. The queue of waiters is stored in shared memory as an LSN-ordered pairing heap, where the waiter with the nearest LSN stays on top. During the replay of WAL, waiters whose LSNs have already been replayed are deleted from the shared memory pairing heap and woken up by setting their latches. pg_wal_replay_wait() needs to wait without any snapshot held. Otherwise, the snapshot could prevent the replay of WAL records, implying a kind of self-deadlock. This is why it is only possible to implement pg_wal_replay_wait() as a procedure working without an active snapshot, not a function. Catversion is bumped.
  Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru
  Author: Kartyshov Ivan, Alexander Korotkov
  Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila
  Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira
  Reviewed-by: Heikki Linnakangas, Kyotaro Horiguchi
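The waiter queue described above can be sketched as follows. This is an illustration under stated assumptions, not the committed code: it swaps in a plain array-based binary min-heap for the pairing heap, uses a fixed-size local structure instead of shared memory, and records the IDs of woken waiters instead of setting latches. All names (`Waiter`, `WaiterHeap`, `waiter_heap_add`, `waiter_heap_wake`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WAITERS 64

/* Hypothetical waiter entry: the LSN it waits for and an identifier. */
typedef struct Waiter
{
	uint64_t	lsn;
	int			id;
} Waiter;

/* Binary min-heap ordered by LSN; the nearest LSN is at items[0]. */
typedef struct WaiterHeap
{
	Waiter		items[MAX_WAITERS];
	size_t		count;
} WaiterHeap;

static void
swap_waiters(Waiter *a, Waiter *b)
{
	Waiter		tmp = *a;

	*a = *b;
	*b = tmp;
}

/* Add a waiter; returns 0 on success, -1 if the heap is full. */
int
waiter_heap_add(WaiterHeap *heap, uint64_t lsn, int id)
{
	size_t		i;

	assert(heap != NULL);
	if (heap->count >= MAX_WAITERS)
		return -1;

	i = heap->count++;
	heap->items[i].lsn = lsn;
	heap->items[i].id = id;

	/* Sift the new entry up until its parent has a smaller or equal LSN. */
	while (i > 0)
	{
		size_t		parent = (i - 1) / 2;

		if (heap->items[parent].lsn <= heap->items[i].lsn)
			break;
		swap_waiters(&heap->items[parent], &heap->items[i]);
		i = parent;
	}
	return 0;
}

static void
sift_down(WaiterHeap *heap, size_t i)
{
	for (;;)
	{
		size_t		left = 2 * i + 1;
		size_t		right = 2 * i + 2;
		size_t		smallest = i;

		if (left < heap->count &&
			heap->items[left].lsn < heap->items[smallest].lsn)
			smallest = left;
		if (right < heap->count &&
			heap->items[right].lsn < heap->items[smallest].lsn)
			smallest = right;
		if (smallest == i)
			break;
		swap_waiters(&heap->items[i], &heap->items[smallest]);
		i = smallest;
	}
}

/*
 * Remove every waiter whose LSN is at or before replayed_lsn, in LSN
 * order, storing their IDs in woken[].  Returns the number removed.
 */
size_t
waiter_heap_wake(WaiterHeap *heap, uint64_t replayed_lsn,
				 int *woken, size_t max_woken)
{
	size_t		n = 0;

	assert(heap != NULL && woken != NULL);
	while (heap->count > 0 &&
		   heap->items[0].lsn <= replayed_lsn &&
		   n < max_woken)
	{
		woken[n++] = heap->items[0].id;
		heap->items[0] = heap->items[--heap->count];
		sift_down(heap, 0);
	}
	return n;
}
```

Because the heap keeps the nearest LSN on top, the replay side only needs to look at the top entry to decide whether anyone can be woken, which is the property the commit message relies on.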
* Fix NLS file reference in pg_createsubscriber (Alvaro Herrera, 2024-08-02)
  pg_createsubscriber is referring to a non-existent message translation file, causing NLS to not work correctly. This command should use the same file as pg_basebackup.
  Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
  Discussion: https://postgr.es/m/20240802.115717.1083441453338151622.horikyota.ntt@gmail.com
* pg_createsubscriber: Fix bogus error message (Alvaro Herrera, 2024-08-02)
  Also some desultory style improvement
* Include bison header files into implementation files (Peter Eisentraut, 2024-08-02)
  Before Bison 3.4, the generated parser implementation files run afoul of -Wmissing-variable-declarations (in spite of commit ab61c40bfa2) because declarations for yylval and possibly yylloc are missing. The generated header files contain an extern declaration, but the implementation files don't include the header files. Since Bison 3.4, the generated implementation files automatically include the generated header files, so then it works. To make this work with older Bison versions as well, include the generated header file from the .y file. (With older Bison versions, the generated implementation file contains effectively a copy of the header file pasted in, so including the header file is redundant. But we know this works anyway because the core grammar uses this arrangement already.)
  Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
* Minor refactoring of assign_backendlist_entry() (Heikki Linnakangas, 2024-08-01)
  Make assign_backendlist_entry() responsible just for allocating the Backend struct. Linking it to the RegisteredBgWorker is the caller's responsibility now. Seems more clear that way.
  Discussion: https://www.postgresql.org/message-id/835232c0-a5f7-4f20-b95b-5b56ba57d741@iki.fi
* Fix outdated comment; all running bgworkers are in BackendList (Heikki Linnakangas, 2024-08-01)
  Before commit 8a02b3d732, only bgworkers that connected to a database had an entry in the BackendList. Commit 8a02b3d732 changed that, but forgot to update this comment.
  Discussion: https://www.postgresql.org/message-id/835232c0-a5f7-4f20-b95b-5b56ba57d741@iki.fi
* Switch PgStat_Kind from an enum to a uint32 type (Michael Paquier, 2024-08-02)
  A follow-up patch is planned to make cumulative statistics pluggable, and using a type is useful in the internal routines used by pgstats as PgStat_Kind may have a value that was not originally in the enum removed here, once made pluggable. While on it, this commit switches pgstat_is_kind_valid() to use PgStat_Kind rather than an int, to be more consistent with its existing callers. Some loops based on the stats kind IDs are switched to use PgStat_Kind rather than int, for consistency with the new type.
  Author: Michael Paquier
  Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
  Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
* Add redo LSN to pgstats files (Michael Paquier, 2024-08-02)
  This is used in the startup process to check that the pgstats file we are reading includes the redo LSN referring to the shutdown checkpoint where it has been written. The redo LSN in the pgstats file needs to match with what the control file has. This is intended to be used for an upcoming change that will extend the write of the stats file to happen during checkpoints, rather than only shutdown sequences. Bump PGSTAT_FILE_FORMAT_ID.
  Reviewed-by: Bertrand Drouvot
  Discussion: https://postgr.es/m/Zp8o6_cl0KSgsnvS@paquier.xyz
* Convert some extern variables to static, Windows code (Peter Eisentraut, 2024-08-01)
  Similar to 720b0eaae9b, discovered by MinGW.
* Convert an extern variable to static (Peter Eisentraut, 2024-08-01)
  Similar to 720b0eaae9b, fixes new code from bd15b7db489.
* pg_createsubscriber: Rename option --socket-directory to --socketdir (Peter Eisentraut, 2024-08-01)
  For consistency with the equivalent option in pg_upgrade.
  Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
  Reviewed-by: Euler Taveira <euler@eulerto.com>
  Discussion: https://www.postgresql.org/message-id/flat/1ed82b9b-8e20-497d-a2f8-aebdd793d595%40eisentraut.org
* Update comment in portal.h. (Etsuro Fujita, 2024-08-01)
  We store tuples into the portal's tuple store for a PORTAL_ONE_MOD_WITH query as well. Back-patch to all supported branches. Reviewed by Andy Fan.
  Discussion: https://postgr.es/m/CAPmGK14HVYBZYZtHabjeCd-e31VT%3Dwx6rQNq8QfehywLcpZ2Hw%40mail.gmail.com
* Convert node test compile-time settings into run-time parameters (Peter Eisentraut, 2024-08-01)
  This converts COPY_PARSE_PLAN_TREES, WRITE_READ_PARSE_PLAN_TREES, and RAW_EXPRESSION_COVERAGE_TEST into the run-time parameters debug_copy_parse_plan_trees, debug_write_read_parse_plan_trees, and debug_raw_expression_coverage_test. They can be activated for tests using PG_TEST_INITDB_EXTRA_OPTS. The compile-time symbols are kept for build farm compatibility, but they now just determine the default value of the run-time settings. Furthermore, support for these settings is not compiled in at all unless assertions are enabled, or the new symbol DEBUG_NODE_TESTS_ENABLED is defined at compile time, or any of the legacy compile-time setting symbols are defined. So there is no run-time overhead in production builds. (This is similar to the handling of DISCARD_CACHES_ENABLED.)
  Discussion: https://www.postgresql.org/message-id/flat/30747bd8-f51e-4e0c-a310-a6e2c37ec8aa%40eisentraut.org
* Avoid duplicate table scans for cross-partition updates during logical replication. (Amit Kapila, 2024-08-01)
  When performing a cross-partition update in the apply worker, it needlessly scans the old partition twice, resulting in noticeable overhead. This commit optimizes it by removing the redundant table scan.
  Author: Hou Zhijie
  Reviewed-by: Hayato Kuroda, Amit Kapila
  Discussion: https://postgr.es/m/OS0PR01MB571623E39984D94CBB5341D994AB2@OS0PR01MB5716.jpnprd01.prod.outlook.com
* Evaluate arguments of correlated SubPlans in the referencing ExprState (Andres Freund, 2024-07-31)
  Until now we generated an ExprState for each parameter to a SubPlan and evaluated them one-by-one in ExecScanSubPlan. That's sub-optimal as creating lots of small ExprStates a) makes JIT compilation more expensive, b) wastes memory, and c) is a bit slower to execute. This commit arranges to evaluate parameters to a SubPlan as part of the ExprState referencing a SubPlan, using the new EEOP_PARAM_SET expression step. We emit one EEOP_PARAM_SET for each argument to a subplan, just before the EEOP_SUBPLAN step. It likely is worth using EEOP_PARAM_SET in other places as well, e.g. for SubPlan outputs, nestloop parameters and - more ambitiously - to get rid of ExprContext->domainValue/caseValue/ecxt_agg*. But that's for later.
  Author: Andres Freund <andres@anarazel.de>
  Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
  Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
  Discussion: https://postgr.es/m/20230225214401.346ancgjqc3zmvek@awork3.anarazel.de
* Revert "Allow parallel workers to cope with a newly-created session user ID." (Tom Lane, 2024-07-31)
  This reverts commit f5f30c22ed69fb37b896c4d4546b2ab823c3fd61. Some buildfarm animals are failing with "cannot change "client_encoding" during a parallel operation". It looks like assign_client_encoding is unhappy at being asked to roll back a client_encoding setting after a parallel worker encounters a failure. There must be more to it though: why didn't I see this during local testing? In any case, it's clear that moving the RestoreGUCState() call is not as side-effect-free as I thought. Given that the bug f5f30c22e intended to fix has gone unreported for years, it's not something that's urgent to fix; I'm not willing to risk messing with it further with only days to our next release wrap.
* Add is_create parameter to RefreshMatviewByOid(). (Jeff Davis, 2024-07-31)
  RefreshMatviewByOid is used for both REFRESH and CREATE MATERIALIZED VIEW. This flag is currently just used for handling internal error messages, but it is also intended to improve code readability.
  Author: Yugo Nagata
  Discussion: https://postgr.es/m/20240726122630.70e889f63a4d7e26f8549de8@sraoss.co.jp
* Remove unused ParamListInfo argument from ExecRefreshMatView. (Jeff Davis, 2024-07-31)
  Author: Yugo Nagata
  Discussion: https://postgr.es/m/20240726122630.70e889f63a4d7e26f8549de8@sraoss.co.jp
* Allow parallel workers to cope with a newly-created session user ID. (Tom Lane, 2024-07-31)
  Parallel workers failed after a sequence like "BEGIN; CREATE USER foo; SET SESSION AUTHORIZATION foo;" because check_session_authorization could not see the uncommitted pg_authid row for "foo". This is because we ran RestoreGUCState() in a separate transaction using an ordinary just-created snapshot. The same disease afflicts any other GUC that requires catalog lookups and isn't forgiving about the lookups failing. To fix, postpone RestoreGUCState() into the worker's main transaction after we've set up a snapshot duplicating the leader's. This affects check_transaction_isolation and check_transaction_deferrable, which think they should only run during transaction start. Make them act like check_transaction_read_only, which already knows it should silently accept the value when InitializingParallelWorker. Per bug #18545 from Andrey Rachitskiy. Back-patch to all supported branches, because this has been wrong for a while.
  Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
* Improve performance of dumpSequenceData(). (Nathan Bossart, 2024-07-31)
  As one might guess, this function dumps the sequence data. It is called once per sequence, and each such call executes a query to retrieve the relevant data for a single sequence. This can cause pg_dump to take significantly longer, especially when there are many sequences. This commit improves the performance of this function by gathering all the sequence data with a single query at the beginning of pg_dump. This information is stored in a sorted array that dumpSequenceData() can bsearch() for what it needs. This follows a similar approach as previous commits that introduced sorted arrays for role information, pg_class information, and sequence metadata. As with those commits, this patch will cause pg_dump to use more memory, but that isn't expected to be too egregious. Note that we use the brand new function pg_sequence_read_tuple() in the query that gathers all sequence data, so we must continue to use the preexisting query-per-sequence approach for versions older than 18.
  Reviewed-by: Euler Taveira, Michael Paquier, Tom Lane
  Discussion: https://postgr.es/m/20240503025140.GA1227404%40nathanxps13
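The sorted-array lookup described above can be sketched with the standard qsort() and bsearch() functions. This is a minimal illustration, not pg_dump's code: the `SeqData` struct, its fields, and the function names are hypothetical, and the array is filled by hand rather than from a query result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-sequence record, keyed by the sequence's OID. */
typedef struct SeqData
{
	uint32_t	oid;
	int64_t		last_value;
	int			is_called;
} SeqData;

/* Comparator on the OID, shared by qsort() and bsearch(). */
static int
seq_cmp(const void *a, const void *b)
{
	uint32_t	x = ((const SeqData *) a)->oid;
	uint32_t	y = ((const SeqData *) b)->oid;

	return (x > y) - (x < y);
}

/* Sort once, after all rows have been collected. */
void
seq_table_sort(SeqData *table, size_t n)
{
	assert(table != NULL || n == 0);
	qsort(table, n, sizeof(SeqData), seq_cmp);
}

/* Look up one sequence in O(log n); returns NULL if it is not present. */
const SeqData *
seq_table_find(const SeqData *table, size_t n, uint32_t oid)
{
	SeqData		key = {0};

	assert(table != NULL || n == 0);
	key.oid = oid;
	return bsearch(&key, table, n, sizeof(SeqData), seq_cmp);
}
```

The trade-off is the one the commit message names: one round trip and one sort up front, plus the memory for the whole array, in exchange for avoiding a query per sequence.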
* Introduce pg_sequence_read_tuple(). (Nathan Bossart, 2024-07-31)
  This new function returns the data for the given sequence, i.e., the values within the sequence tuple. Since this function is a substitute for SELECT from the sequence, the SELECT privilege is required on the sequence in question. It returns all NULLs for sequences for which we lack privileges, other sessions' temporary sequences, and unlogged sequences on standbys. This function is primarily intended for use by pg_dump in a follow-up commit that will use it to optimize dumpSequenceData(). Like pg_sequence_last_value(), which is a support function for the pg_sequences system view, pg_sequence_read_tuple() is left undocumented. Bumps catversion.
  Reviewed-by: Michael Paquier, Tom Lane
  Discussion: https://postgr.es/m/20240503025140.GA1227404%40nathanxps13
* Improve performance of dumpSequence(). (Nathan Bossart, 2024-07-31)
  This function dumps the sequence definitions. It is called once per sequence, and each such call executes a query to retrieve the metadata for a single sequence. This can cause pg_dump to take significantly longer, especially when there are many sequences. This commit improves the performance of this function by gathering all the sequence metadata with a single query at the beginning of pg_dump. This information is stored in a sorted array that dumpSequence() can bsearch() for what it needs. This follows a similar approach as commits d5e8930f50 and 2329cad1b9, which introduced sorted arrays for role information and pg_class information, respectively. As with those commits, this patch will cause pg_dump to use more memory, but that isn't expected to be too egregious. Note that before version 10, the sequence metadata was stored in the sequence relation itself, which makes it difficult to gather all the sequence metadata with a single query. For those older versions, we continue to use the preexisting query-per-sequence approach.
  Reviewed-by: Euler Taveira
  Discussion: https://postgr.es/m/20240503025140.GA1227404%40nathanxps13
* Parse sequence type and integer metadata in dumpSequence(). (Nathan Bossart, 2024-07-31)
  This commit modifies dumpSequence() to parse all the sequence metadata into the appropriate types instead of carting around string pointers to the PGresult data. Besides allowing us to free the PGresult storage earlier in the function, this eliminates the need to compare min_value and max_value to their respective defaults as strings. This is preparatory work for a follow-up commit that will improve the performance of dumpSequence() in a similar manner to how commit 2329cad1b9 optimized binary_upgrade_set_pg_class_oids().
  Reviewed-by: Euler Taveira
  Discussion: https://postgr.es/m/20240503025140.GA1227404%40nathanxps13
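A rough sketch of the idea, assuming the metadata arrives as text: parse each value once into an integer and compare integers afterwards, instead of comparing strings against a default. `parse_int64()` and `is_default_max()` are hypothetical helpers, not functions from pg_dump, and the default shown (the largest 64-bit value, as for an ascending bigint sequence) is used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a decimal string into an int64.  Returns false on empty input,
 * trailing garbage, or overflow, leaving *result untouched.
 */
bool
parse_int64(const char *text, int64_t *result)
{
	char	   *end;
	long long	val;

	assert(text != NULL && result != NULL);

	errno = 0;
	val = strtoll(text, &end, 10);
	if (errno != 0 || end == text || *end != '\0')
		return false;

	*result = (int64_t) val;
	return true;
}

/* Integer comparison against a default, replacing a string comparison. */
bool
is_default_max(int64_t max_value)
{
	return max_value == INT64_MAX;
}
```

Once the values are parsed, the buffer holding the original strings is no longer needed, which corresponds to freeing the PGresult earlier as the commit message describes.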
* Fix random failure in 021_twophase. (Amit Kapila, 2024-07-31)
  After disabling the subscription, the failed test was changing the two_phase option for the subscription. We can't change the two_phase option for a subscription as long as the corresponding apply worker is still active. The check to ensure that the replication apply worker has exited was incorrect.
  Author: Vignesh C
  Discussion: https://postgr.es/m/CALDaNm3YY+bzj+JWJbY+DsUgJ2mPk8OR1ttjVX2cywKr4BUgxw@mail.gmail.com
* Relax check for return value from second call of pg_strnxfrm(). (Jeff Davis, 2024-07-30)
  strxfrm() is not guaranteed to return the exact number of bytes needed to store the result; it may return a higher value.
  Discussion: https://postgr.es/m/32f85d88d1f64395abfe5a10dd97a62a4d3474ce.camel@j-davis.com
  Reviewed-by: Heikki Linnakangas
  Backpatch-through: 16
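The two-call pattern behind this fix can be sketched with plain strxfrm(); this is an illustration, not the pg_strnxfrm() code. The first call asks for the required size, the second call fills the buffer, and the relaxed check only requires that the second result fits in the space the first call asked for, rather than demanding equality. `transform_key()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'ed strxfrm() transformation of src, or NULL on failure.
 * The caller frees the result.
 */
char *
transform_key(const char *src)
{
	size_t		needed;
	size_t		written;
	char	   *buf;

	assert(src != NULL);

	/* First call: size estimate, not counting the terminating NUL. */
	needed = strxfrm(NULL, src, 0);

	buf = malloc(needed + 1);
	if (buf == NULL)
		return NULL;

	/* Second call: the actual transformation. */
	written = strxfrm(buf, src, needed + 1);

	/*
	 * Relaxed check: the estimate may have been higher than what is
	 * actually written, so only a result that does not fit is an error.
	 */
	if (written > needed)
	{
		free(buf);
		return NULL;
	}
	return buf;
}
```

Comparing two transformed keys with strcmp() gives the same ordering as strcoll() on the original strings, which is why the transformed form is useful as a sort key.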
* Refactor getWeights to write to caller-supplied buffer (Heikki Linnakangas, 2024-07-30)
  This gets rid of the static result buffer.
  Reviewed-by: Robert Haas
  Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
* Replace static buf with a stack-allocated one in ReadControlFile (Heikki Linnakangas, 2024-07-30)
  It's only used very locally within the function.
  Reviewed-by: Robert Haas
  Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
* Replace static buf with palloc in str_time() (Heikki Linnakangas, 2024-07-30)
  The function is used only once in the startup process, so the leak into current memory context is harmless. This is a tiny step in making the server thread-safe.
  Reviewed-by: Robert Haas
  Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
* Replace static bufs with a StringInfo in cash_words() (Heikki Linnakangas, 2024-07-30)
  For clarity. The code was correct, and the buffer was large enough, but string manipulation with no bounds checking is scary. This incurs an extra palloc+pfree to every call, but in quick performance testing, it doesn't seem to be significant.
  Reviewed-by: Robert Haas
  Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
* Remove leftover function declaration (Heikki Linnakangas, 2024-07-30)
  Commit 9d9b9d46f3 removed the function (or rather, moved it to a different source file and renamed it to SendCancelRequest), but forgot the declaration in the header file.
* Preserve tz when converting to jsonb timestamptz (Andrew Dunstan, 2024-07-30)
  This removes an inconsistency in the treatment of different datatypes by the jsonpath timestamp_tz() function. Conversions from data types that are not time-zone-aware, such as date and timestamp, are now treated consistently with conversions from those that are, such as timestamptz.
  Author: David Wheeler
  Reviewed-by: Junwang Zhao and Jeevan Chalke
  Discussion: https://postgr.es/m/7DE080CE-6D8C-4794-9BD1-7D9699172FAB%40justatheory.com
  Backpatch to release 17.
* Remove useless member of BackendParameters. (Thomas Munro, 2024-07-30)
  Oversight in e2562667, which stopped using SpinlockSemaArray but forgot to remove it from the struct.
  Reported-by: Heikki Linnakangas <hlinnaka@iki.fi>
  Discussion: https://postgr.es/m/310f4005-91d7-42b2-ac70-92624260dd28%40iki.fi
* Require memory barrier support. (Thomas Munro, 2024-07-30)
  Previously we had a fallback implementation that made a harmless system call, based on the assumption that system calls must contain a memory barrier. That shouldn't be reached on any current system, and it seems highly likely that we can easily find out how to request explicit memory barriers, if we've already had to find out how to do atomics on a hypothetical new system. The comments and function name removed here referred to a spinlock used for fallback memory barriers, but that changed in 1b468a13, which left some misleading words behind in a few places.
  Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
  Suggested-by: Andres Freund <andres@anarazel.de>
  Discussion: https://postgr.es/m/721bf39a-ed8a-44b0-8b8e-be3bd81db748%40technowledgy.de
  Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
* Require compiler barrier support. (Thomas Munro, 2024-07-30)
  Previously we had a fallback implementation of pg_compiler_barrier() that called an empty function across a translation unit boundary so the compiler couldn't see what it did. That shouldn't be needed on any current systems, and might not even work with a link time optimizer. Since we now require compiler-specific knowledge of how to implement atomics, we should also know how to implement compiler barriers on a hypothetical new system.
  Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
  Suggested-by: Andres Freund <andres@anarazel.de>
  Discussion: https://postgr.es/m/721bf39a-ed8a-44b0-8b8e-be3bd81db748%40technowledgy.de
  Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
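For illustration, a compiler barrier on GCC-compatible compilers is commonly written as an empty inline-assembly statement with a "memory" clobber. The sketch below assumes GCC or Clang and fails the build otherwise, in the spirit of requiring barrier support instead of falling back; `compiler_barrier()` and `publish()` are hypothetical names, not the PostgreSQL macros. A compiler barrier only constrains the compiler's reordering and does not order memory operations at the CPU level.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
/* Empty asm with a "memory" clobber: the compiler must not move memory
 * accesses across this point. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")
#else
#error "no compiler barrier implementation known for this compiler"
#endif

/*
 * Store a value, then set a ready flag.  The barrier keeps the compiler
 * from reordering the two stores; it says nothing about the hardware.
 */
int
publish(int *data, int *ready, int value)
{
	assert(data != NULL && ready != NULL);

	*data = value;
	compiler_barrier();
	*ready = 1;
	return *data;
}
```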
* Remove --disable-atomics, require 32 bit atomics. (Thomas Munro, 2024-07-30)
  Modern versions of all relevant architectures and tool chains have atomics support. Since edadeb07, there is no remaining reason to carry code that simulates atomic flags and uint32 imperfectly with spinlocks. 64 bit atomics are still emulated with spinlocks, if needed, for now. Any modern compiler capable of implementing C11 <stdatomic.h> must have the underlying operations we need, though we don't require C11 yet. We detect certain compilers and architectures, so hypothetical new systems might need adjustments here.
  Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
  Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (concept, not the patch)
  Reviewed-by: Andres Freund <andres@anarazel.de> (concept, not the patch)
  Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
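As a point of reference for the kind of operation being required, here is a 32-bit atomic counter written with C11 <stdatomic.h>. This is only an illustration: as the commit message notes, PostgreSQL does not require C11 and uses its own compiler- and architecture-specific implementations, so the `Counter` type and functions below are hypothetical and not part of its API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper around a native 32-bit atomic. */
typedef struct Counter
{
	_Atomic uint32_t value;
} Counter;

void
counter_init(Counter *c, uint32_t initial)
{
	assert(c != NULL);
	atomic_init(&c->value, initial);
}

/* Atomically add delta and return the value held before the addition. */
uint32_t
counter_fetch_add(Counter *c, uint32_t delta)
{
	assert(c != NULL);
	return atomic_fetch_add(&c->value, delta);
}

uint32_t
counter_read(Counter *c)
{
	assert(c != NULL);
	return atomic_load(&c->value);
}
```

No lock is taken here; the hardware performs the read-modify-write as one indivisible step, which is what a spinlock-based simulation can only approximate.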
* Remove --disable-spinlocks. (Thomas Munro, 2024-07-30)
  A later change will require atomic support, so it wouldn't make sense for a hypothetical new system not to be able to implement spinlocks.
  Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
  Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (concept, not the patch)
  Reviewed-by: Andres Freund <andres@anarazel.de> (concept, not the patch)
  Discussion: https://postgr.es/m/3351991.1697728588%40sss.pgh.pa.us
* pg_createsubscriber: Remove obsolete comment (Peter Eisentraut, 2024-07-30)
  This comment should have been removed by commit b9639138262. There is no replication slot check on the primary anymore.
  Author: Euler Taveira <euler@eulerto.com>
  Discussion: https://www.postgresql.org/message-id/697d692f-f9d3-41f6-9f0e-29a4fb18e544@app.fastmail.com
* Stabilize xid_wraparound tests (Andrew Dunstan, 2024-07-30)
  The tests had a race condition if autovacuum was set to off. Instead we create all the tables we are interested in with autovacuum disabled, so they are only ever touched when in danger of wraparound.
  Discussion: https://postgr.es/m/3e2cbd24-f45e-4b2b-ba83-8149214f0a4d@dunslane.net
  Masahiko Sawada (slightly tweaked by me)
  Backpatch to release 17 where these tests were introduced.
* pg_createsubscriber: Fix an unpredictable recovery wait time. (Amit Kapila, 2024-07-30)
  The problem is that the tool is using the LSN returned by pg_create_logical_replication_slot() as recovery_target_lsn. This LSN is ahead of the current WAL position and the recovery waits until the publisher writes a WAL record to reach the target and ends the recovery. On idle systems, this wait time is unpredictable and could lead to failure in promoting the subscriber. To avoid that, insert a harmless WAL record.
  Reported-by: Alexander Lakhin and Tom Lane
  Diagnosed-by: Hayato Kuroda
  Author: Euler Taveira
  Reviewed-by: Hayato Kuroda, Amit Kapila
  Backpatch-through: 17
  Discussion: https://postgr.es/m/2377319.1719766794%40sss.pgh.pa.us
  Discussion: https://postgr.es/m/CA+TgmoYcY+Wb67NAwaHT7MvxCSeV86oSc+va9hHKaasE42ukyw@mail.gmail.com
* Disallow setting MAX_PARTITION_BUFFERS to less than 2 (David Rowley, 2024-07-30)
  Add some comments to mention that this value must be at least 2 and also add a StaticAssertDecl to cause compilation failure if anyone tries to build with an invalid value. The multiInsertBuffers list must have at least two elements due to how the code in CopyMultiInsertInfoFlush() pushes the current ResultRelInfo's CopyMultiInsertBuffer to the end of the list. If the first element is also the last element, bad things will happen.
  Author: Zhang Mingli <avamingli@gmail.com>
  Discussion: https://postgr.es/m/CAApHDvpQ6t9ROcqbD-OgqR04Kfq4vQKw79Vo6r5j%2BciHwsSfkA%40mail.gmail.com
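The compile-time guard described above can be sketched with the C11 static_assert macro from <assert.h>, used here as a stand-in for PostgreSQL's StaticAssertDecl. The value 32 and the helper function are assumptions chosen for illustration only.

```c
#include <assert.h>

/* Illustrative value; the point is only that it must not be below 2. */
#define MAX_PARTITION_BUFFERS 32

/* Fails the build, with this message, if someone lowers the value. */
static_assert(MAX_PARTITION_BUFFERS >= 2,
			  "MAX_PARTITION_BUFFERS must be at least 2");

/* Hypothetical accessor so callers can read the limit at run time. */
int
max_partition_buffers(void)
{
	return MAX_PARTITION_BUFFERS;
}
```

Changing the define to 1 makes compilation stop at the static_assert line, so an invalid setting is caught at build time and never reaches the list-handling code.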
* Make collation not depend on setlocale(). (Jeff Davis, 2024-07-30)
  Now that the result of pg_newlocale_from_collation() is always non-NULL, we can move the collate_is_c and ctype_is_c flags into pg_locale_t. That simplifies the logic in lc_collate_is_c() and lc_ctype_is_c(), removing the dependence on setlocale(). This commit also eliminates the multi-stage initialization of the collation cache. As long as we have catalog access, it's now safe to call pg_newlocale_from_collation() without checking lc_collate_is_c() first.
  Discussion: https://postgr.es/m/cfd9eb85-c52a-4ec9-a90e-a5e4de56e57d@eisentraut.org
  Reviewed-by: Peter Eisentraut, Andreas Karlsson
* Fix partitionwise join with partially-redundant join clauses (Richard Guo, 2024-07-30)
  To determine if the two relations being joined can use partitionwise join, we need to verify the existence of equi-join conditions involving pairs of matching partition keys for all partition keys. Currently we do that by looking through the join's restriction clauses. However, it has been discovered that this approach is insufficient, because there might be partition keys known equal by a specific EC, but they do not form a join clause because it happens that other members of the EC than the partition keys are constrained to become a join clause. To address this issue, in addition to examining the join's restriction clauses, we also check if any partition keys are known equal by ECs, by leveraging function exprs_known_equal(). To accomplish this, we enhance exprs_known_equal() to check equality per the semantics of the opfamily, if provided. It could be argued that exprs_known_equal() could be called O(N^2) times, where N is the number of partition key expressions, resulting in noticeable performance costs if there are a lot of partition key expressions. But I think this is not a problem. The number of a joinrel's partition key expressions would only be equal to the join degree, since each base relation within the join contributes only one partition key expression. That is to say, it does not scale with the number of partitions. A benchmark with a query involving 5-way joins of partitioned tables, each with 3 partition keys and 1000 partitions, shows that the planning time is not significantly affected by this patch (within the margin of error), particularly when compared to the impact caused by partitionwise join. Thanks to Tom Lane for the idea of leveraging exprs_known_equal() to check if partition keys are known equal by ECs.
  Author: Richard Guo, Tom Lane
  Reviewed-by: Tom Lane, Ashutosh Bapat, Robert Haas
  Discussion: https://postgr.es/m/CAN_9JTzo_2F5dKLqXVtDX5V6dwqB0Xk+ihstpKEt3a1LT6X78A@mail.gmail.com
* Refactor the checks for parameterized partial paths (Richard Guo, 2024-07-30)
  Parameterized partial paths are not supported, and we have several checks in try_partial_xxx_path functions to enforce this. For a partial nestloop join path, we need to ensure that if the inner path is parameterized, the parameterization is fully satisfied by the proposed outer path. For a partial merge/hashjoin join path, we need to ensure that the inner path is not parameterized. In all cases, we need to ensure that the outer path is not parameterized. However, the comment in try_partial_hashjoin_path does not describe this correctly. This patch fixes that. In addition, this patch simplifies the checks performed in try_partial_hashjoin_path and try_partial_mergejoin_path with the help of macro PATH_REQ_OUTER, and also adds asserts that the outer path is not parameterized in try_partial_xxx_path functions.
  Author: Richard Guo
  Discussion: https://postgr.es/m/CAMbWs48mKJ6g_GnYNa7dnw04MHaMK-jnAEBrMVhTp2uUg3Ut4A@mail.gmail.com
* Short-circuit sort_inner_and_outer if there are no mergejoin clauses (Richard Guo, 2024-07-30)
  In sort_inner_and_outer, we create mergejoin join paths by explicitly sorting both relations on each possible ordering of the available mergejoin clauses. However, if there are no available mergejoin clauses, we can skip this process entirely. This patch introduces a check for mergeclause_list at the beginning of sort_inner_and_outer and exits the function if it is found to be empty. This might help skip all the statements that come before the call to select_outer_pathkeys_for_merge, including the build of UniquePaths in the case of JOIN_UNIQUE_OUTER or JOIN_UNIQUE_INNER. I doubt there's any measurable performance improvement, but throughout the run of the regression tests, sort_inner_and_outer is called a total of 44,424 times. Among these calls, there are 11,064 instances where mergeclause_list is found to be empty, which accounts for approximately one-fourth. I think this suggests that implementing this shortcut is worthwhile.
  Author: Richard Guo
  Reviewed-by: Ashutosh Bapat
  Discussion: https://postgr.es/m/CAMbWs48RKiZGFEd5A0JtztRY5ZdvVvNiHh0AKeuoz21F+0dVjQ@mail.gmail.com
* Add more debugging information when failing to read pgstats files (Michael Paquier, 2024-07-30)
  This is useful to know which part of a stats file is corrupted when reading it, adding to the server logs a WARNING with details about what could not be read before giving up with the remaining data in the file.
  Author: Michael Paquier
  Reviewed-by: Bertrand Drouvot
  Discussion: https://postgr.es/m/Zp8o6_cl0KSgsnvS@paquier.xyz
* SQL/JSON: Fix casting for integer EXISTS columns in JSON_TABLE (Amit Langote, 2024-07-30)
  The current method of coercing the boolean result value of JsonPathExists() to the target type specified for an EXISTS column, which is to call the type's input function via json_populate_type(), leads to an error when the target type is integer, because the integer input function doesn't recognize boolean literal values as valid. Instead use the boolean-to-integer cast function for coercion in that case so that using integer or domains thereof as type for EXISTS columns works. Note that coercion for ON ERROR values TRUE and FALSE already works like that because the parser creates a cast expression including the cast function, but the coercion of the actual result value is not handled by the parser. Tests by Jian He.
  Reported-by: Jian He <jian.universality@gmail.com>
  Author: Jian He <jian.universality@gmail.com>
  Author: Amit Langote <amitlangote09@gmail.com>
  Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
  Backpatch-through: 17