| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reunite RestorePendingSyncs() with RestoreRelationMap(). If
RelationInitPhysicalAddr() ran after RestoreRelationMap() but before
RestorePendingSyncs(), the relcache entry could cause RelationNeedsWAL()
to return true erroneously. Trouble required commands of the current
transaction to include REINDEX or CLUSTER of a system catalog. The
parallel leader correctly derived RelationNeedsWAL()==false from the new
relfilenumber, but the worker saw RelationNeedsWAL()==true. Worker
MarkBufferDirtyHint() then wrote unwanted WAL. Recovery of that
unwanted WAL could lose tuples like the system could before commit
c6b92041d38512a4176ed76ad06f713d2e6c01a8 introduced this tracking.
RestorePendingSyncs() and RestoreRelationMap() were adjacent till commit
126ec0bc76d044d3a9eb86538b61242bf7da6db4, so no back-patch for now.
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/20241019232815.c6.nmisch@google.com
|
|
|
|
|
|
|
|
|
|
| |
Stop computing a never-used value. This removes the read; the read had
no functional implications. Back-patch to v12, like commit
a07e03fd8fa7daf4d1356f7cb501ffe784ea6257.
Reported by Alexander Lakhin.
Discussion: https://postgr.es/m/6c92f59b-f5bc-e58c-9bdd-d1f21c17c786@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, GetLockStatusData() checked all slots for every backend
to gather fast-path lock data, which could be inefficient. This commit
refactors it by skipping backends with PID=0 (since they don't hold
fast-path locks) and skipping groups with no registered fast-path locks,
improving efficiency.
This refactoring is particularly beneficial, for example when
max_connections and max_locks_per_transaction are set high,
as it reduces unnecessary checks across numerous slots.
Author: Fujii Masao
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/a0a00c44-31e9-4c67-9846-fb9636213ac9@oss.nttdata.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ssl_ciphers GUC can only set cipher suites for TLSv1.2, and lower,
connections. For TLSv1.3 connections a different OpenSSL API must be
used. This adds a new GUC, ssl_tls13_ciphers, which can be used to
configure a colon separated list of cipher suites to support when
performing a TLSv1.3 handshake.
Original patch by Erica Zhang with additional hacking by me.
Author: Erica Zhang <ericazhangy2021@qq.com>
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/tencent_063F89FA72CCF2E48A0DF5338841988E9809@qq.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ssl_ecdh_curve GUC only accepts a single value, but the TLS
handshake can list multiple curves in the groups extension (the
extension has been renamed to contain more than elliptic curves).
This changes the GUC to accept a colon-separated list of curves.
This commit also renames the GUC to ssl_groups to match the new
nomenclature for the TLS extension.
Original patch by Erica Zhang with additional hacking by me.
Author: Erica Zhang <ericazhangy2021@qq.com>
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/tencent_063F89FA72CCF2E48A0DF5338841988E9809@qq.com
|
|
|
|
|
|
|
|
|
|
|
| |
This argument allow skipping throwing an error. Instead, the result status
can be obtained using pg_wal_replay_wait_status() function.
Catversion is bumped.
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZtUF17gF0pNpwZDI%40paquier.xyz
Reviewed-by: Pavel Borisov
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, WaitForLSNReplay() immediately throws an error if waiting for LSN
replay is not successful. This commit teaches WaitForLSNReplay() to return
the result of waiting, while making pg_wal_replay_wait() responsible for
throwing an appropriate error.
This is preparation to adding 'no_error' argument to pg_wal_replay_wait() and
new function pg_wal_replay_wait_status(), which returns the last wait result
status.
Additionally, we stop distinguishing situations when we find our instance to
be not in a recovery state before entering the waiting loop and inside
the waiting loop. Standby promotion may happen at any moment, even between
issuing a procedure call statement and pg_wal_replay_wait() doing a first
check of recovery status. Thus, there is no pointing distinguishing these
situations.
Also, since we may exit the waiting loop and see our instance not in recovery
without throwing an error, we need to deleteLSNWaiter() in that case. We do
this unconditionally for the sake of simplicity, even if standby was already
promoted after reaching the target LSN, the startup process surely already
deleted us.
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZtUF17gF0pNpwZDI%40paquier.xyz
Reviewed-by: Michael Paquier, Pavel Borisov
|
|
|
|
|
|
| |
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZvY2C8N4ZqgCFaLu%40paquier.xyz
Reviewed-by: Pavel Borisov
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3c5db1d6b implemented the pg_wal_replay_wait() stored procedure. Due to
the patch development history, the implementation resided in
src/backend/commands/waitlsn.c (src/include/commands/waitlsn.h for headers).
014f9f34d moved pg_wal_replay_wait() itself to
src/backend/access/transam/xlogfuncs.c near to the WAL-manipulation functions.
But most of the implementation stayed in place.
The code in src/backend/commands/waitlsn.c has nothing to do with commands,
but is related to WAL. So, this commit moves this code into
src/backend/access/transam/xlogwait.c (src/include/access/xlogwait.h for
headers).
Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/18c0fa64-0475-415e-a1bd-665d922c5201%40eisentraut.org
Reviewed-by: Pavel Borisov
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
appropriate typentry to invalidate. Unfortunately, using the syscache here
is impossible, because this callback could be called outside a transaction
and this makes impossible catalog lookups. This is why present commit
introduces RelIdToTypeIdCacheHash to map relation OID to its composite type
OID.
We are keeping RelIdToTypeIdCacheHash entry while corresponding type cache
entry have something to clean. Therefore, RelIdToTypeIdCacheHash shouldn't
get bloat in the case of temporary tables flood.
There are many places in lookup_type_cache() where syscache invalidation,
user interruption, or even error could occur. In order to handle this, we
keep an array of in-progress type cache entries. In the case of
lookup_type_cache() interruption this array is processed to keep
RelIdToTypeIdCacheHash in a consistent state.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov, Jian He, Alexander Lakhin
Reviewed-by: Artur Zakirov
|
|
|
|
|
|
|
| |
Describe the way we handle concurrent invalidation messages.
Discussion: https://postgr.es/m/CAPpHfdsQhwUrnB3of862j9RgHoJM--eRbifvBMvtQxpC57dxCA%40mail.gmail.com
Reviewed-by: Andrei Lepikhov, Artur Zakirov, Pavel Borisov
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, a Query generated through the transform phase would have
unset stmt_location, tracking the starting point of a query string.
Extensions relying on the statement location to extract its relevant
parts in the source text string would fallback to use the whole
statement instead, leading to confusing results like in
pg_stat_statements for queries relying on nested queries, like:
- EXPLAIN, with top-level and nested query using the same query string,
and a query ID coming from the nested query when the non-top-level
entry.
- Multi-statements, with only partial portions of queries being
normalized.
- COPY TO with a query, SELECT or DMLs.
This patch improves things by keeping track of the statement locations
and propagate it to Query during transform, allowing PGSS to only show
the relevant part of the query for nested query. This leads to less
bloat in entries for non-top-level entries, as queries can now be
grouped within the same (toplevel, queryid) duos in pg_stat_statements.
The result gives a stricter one-one mapping between query IDs and its
query strings.
The regression tests introduced in 45e0ba30fc40 produce differences
reflecting the new logic.
Author: Anthonin Bonnefoy
Reviewed-by: Michael Paquier, Jian He
Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
|
|
|
|
|
| |
Previously, an invalid attribute name was caught, but the error
message was unhelpful.
|
|
|
|
|
|
|
|
|
| |
Some buildfarm members complained about an always-true test in the
SOFT_ERROR_OCCURRED macro. Fix by reading the field directly rather
than using the macro.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/2144895.1729653514@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
| |
The proposed OAUTHBEARER SASL mechanism will need to allow larger
messages in the exchange, since tokens are sent directly by the
client. Move this limit into the pg_be_sasl_mech struct so that
it can be changed per-mechanism.
Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAOYmi+nqX_5=Se0W0Ynrr55Fha3CMzwv_R9P3rkpHb=1kG7ZTQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
Relations opened by the executor are only closed once in
ExecCloseRangeTableRelations(), so the word "again" in the comment
for ExecGetRangeTableRelation() is misleading and unnecessary.
Discussion: https://postgr.es/m/CA+HiwqHnw-zR+u060i3jp4ky5UR0CjByRFQz50oZ05de7wUg=Q@mail.gmail.com
Backpatch-through: 12
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable manipulation of attribute statistics. Only superficial
validation is performed, so it's possible to add nonsense, and it's up
to the planner (or other users of statistics) to behave reasonably in
that case.
Bump catalog version.
Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=eErgzn7ECDpwFcptJKOk9SxZEk5Pot4d94eVTZsvj3gw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
| |
These functions will either raise an ERROR or run to normal
completion, so no return value is necessary.
Bump catalog version.
Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=cBF8rnphuTyHFi3KYzB9ByDgx57HwK9Rz2yp7S+Om87w@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, CREATE/ALTER EXTENSION gave basically no useful
context about errors reported while executing script files.
I think the idea was that you could run the same commands
manually to see the error, but that's often quite inconvenient.
Let's improve that.
If we get an error during raw parsing, we won't have a current
statement identified by a RawStmt node, but we should always get
a syntax error position. Show the portion of the script from
the last semicolon-newline before the error position to the first
one after it. There are cases where this might show only a
fragment of a statement, but that should be uncommon, and it
seems better than showing the whole script file.
Without an error cursor, if we have gotten past raw parsing (which
we probably have), we can report just the current SQL statement as
an item of error context.
In any case also report the script file name as error context,
since it might not be entirely obvious which of a series of
update scripts failed. We can also show an approximate script
line number in case whatever we printed of the query isn't
sufficiently identifiable.
The error-context code path is already exercised by some
test_extensions test cases, but add tests for the syntax-error
path.
Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Up to now, the parser's reporting of a statement's stmt_location
included any preceding whitespace or comments. This isn't really
desirable but was done to avoid accounting honestly for nonterminals
that reduce to empty. It causes problems for pg_stat_statements,
which partially compensates by manually stripping whitespace, but
is not bright enough to strip /*-style comments. There will be
more problems with an upcoming patch to improve reporting of errors
in extension scripts, so it's time to do something about this.
The thing we have to do to make it work right is to adjust
YYLLOC_DEFAULT to scan the inputs of each production to find the
first one that has a valid location (i.e., did not reduce to
empty). In theory this adds a little bit of per-reduction overhead,
but in practice it's negligible. I checked by measuring the time
to run raw_parser() on the contents of information_schema.sql, and
there was basically no change.
Having done that, we can rely on any nonterminal that didn't reduce
to completely empty to have a correct starting location, and we don't
need the kluges the stmtmulti production formerly used.
This should have a side benefit of allowing parse error reports to
include an error position in some cases where they formerly failed to
do so, due to trying to report the position of an empty nonterminal.
I did not go looking for an example though. The one previously known
case where that could happen (OptSchemaEltList) no longer needs the
kluge it had; but I rather doubt that that was the only case.
Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... to fix bugs when the referenced table is partitioned.
The catalog representation we chose for foreign keys connecting
partitioned tables (in commit f56f8f8da6af) is inconvenient, in the
sense that a standalone table has a different way to represent the
constraint when referencing a partitioned table, than when the same
table becomes a partition (and vice versa). Because of this, we need to
create additional catalog rows on detach (pg_constraint and pg_trigger),
and remove them on attach. We were doing some of those things, but not
all of them, leading to missing catalog rows in certain cases.
The worst problem seems to be that we are missing action triggers after
detaching a partition, which means that you could update/delete rows
from the referenced partitioned table that still had referencing rows on
that table, the server failing to throw the required errors.
!!!
Note that this means existing databases with FKs that reference
partitioned tables might have rows that break relational integrity, on
tables that were once partitions on the referencing side of the FK.
Another possible problem is that trying to reattach a table
that had been detached would fail indicating that internal triggers
cannot be found, which from the user's point of view is nonsensical.
In branches 15 and above, we fix this by creating a new helper function
addFkConstraint() which is in charge of creating a standalone
pg_constraint row, and repurposing addFkRecurseReferencing() and
addFkRecurseReferenced() so that they're only the recursive routine for
each side of the FK, and they call addFkConstraint() to create
pg_constraint at each partitioning level and add the necessary triggers.
These new routines can be used during partition creation, partition
attach and detach, and foreign key creation. This reduces redundant
code and simplifies the flow.
In branches 14 and 13, we have a much simpler fix that consists on
simply removing the constraint on detach. The reason is that those
branches are missing commit f4566345cf40, which reworked the way this
works in a way that we didn't consider back-patchable at the time.
We opted to leave branch 12 alone, because it's different from branch 13
enough that the fix doesn't apply; and because it is going in EOL mode
very soon, patching it now might be worse since there's no way to undo
the damage if it goes wrong.
Existing databases might need to be repaired.
In the future we might want to rethink the catalog representation to
avoid this problem, but for now the code seems to do what's required to
make the constraints operate correctly.
Co-authored-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Co-authored-by: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Guillaume Lelarge <guillaume@lelarge.info>
Reported-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Reported-by: Thomas Baehler (SBB CFF FFS) <thomas.baehler2@sbb.ch>
Discussion: https://postgr.es/m/20230420144344.40744130@karst
Discussion: https://postgr.es/m/20230705233028.2f554f73@karst
Discussion: https://postgr.es/m/GVAP278MB02787E7134FD691861635A8BC9032@GVAP278MB0278.CHEP278.PROD.OUTLOOK.COM
Discussion: https://postgr.es/m/18541-628a61bc267cd2d3@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the query is rewritten into a NOTIFY command by a DO INSTEAD
rule, we'd get an assertion failure, or in non-assert builds
issue a rather confusing error message. Improve that.
Also fix a longstanding grammar mistake in a nearby error message.
Per bug #18664 from Alexander Lakhin. Back-patch to all supported
branches.
Tender Wang and Tom Lane
Discussion: https://postgr.es/m/18664-ffd0ebc2386598df@postgresql.org
|
|
|
|
|
|
| |
We haven't generated those for a long time.
Discussion: https://www.postgresql.org/message-id/b439edfc-c5e5-43a9-802d-4cb51ec20646@iki.fi
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The finished transaction list can contain XIDs that are older than the
serializable global xmin. It's a short-lived state;
ClearOldPredicateLocks() removes any such transactions from the list,
and it's called whenever the global xmin advances. But if another
backend calls SummarizeOldestCommittedSxact() in that window, it will
call SerialAdd() on an XID that's older than the global xmin, or if
there are no more transactions running, when global xmin is
invalid. That trips the assertion in SerialAdd().
Fixes bug #18658 reported by Andrew Bille. Thanks to Alexander Lakhin
for analysis. Backpatch to all versions.
Discussion: https://www.postgresql.org/message-id/18658-7dab125ec688c70b%40postgresql.org
|
|
|
|
|
| |
Author: Junwang Zhao
Discussion: https://postgr.es/m/CAEG8a3L5YjxXCjx0LhkwHdDGsNgpFGEqH7SqtXRPNP+dwFMVZQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The decision in b6e1157e7 to ignore raw_expr when evaluating a
JsonValueExpr was incorrect. While its value is not ultimately
used (since formatted_expr's value is), failing to initialize it
can lead to problems, for instance, when the expression tree in
raw_expr contains Aggref nodes, which must be initialized to
ensure the parent Agg node works correctly.
Also, optimize eval_const_expressions_mutator()'s handling of
JsonValueExpr a bit. Currently, when formatted_expr cannot be folded
into a constant, we end up processing it twice -- once directly in
eval_const_expressions_mutator() and again recursively via
ece_generic_processing(). This recursive processing is required to
handle raw_expr. To avoid the redundant processing of formatted_expr,
we now process raw_expr directly in eval_const_expressions_mutator().
Finally, update the comment of JsonValueExpr to describe the roles of
raw_expr and formatted_expr more clearly.
Bug: #18657
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Diagnosed-by: Fabio R. Sluzala <fabio3rs@gmail.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18657-1b90ccce2b16bdb8@postgresql.org
Backpatch-through: 16
|
|
|
|
|
|
|
|
|
| |
While the default value for relpages is 0, if a partitioned table with
at least one child has been analyzed, then the partititoned table will
have a relpages value of -1.
Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=fajh1Lpcyr_XsMmq-9Z=SGk-u+_Zeac7Pt0RAN3uiVCg@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make nbtree backwards scans optimistically access the next page to be
read to the left by following a prevPage block number that's now stashed
in currPos when the leaf page is first read. This approach matches the
one taken during forward scans, which follow a symmetric nextPage block
number from currPos. We stash both a prevPage and a nextPage, since the
scan direction might change (when fetching from a scrollable cursor).
Backwards scans will no longer need to lock the same page twice, except
in rare cases where the scan detects a concurrent page split (or page
deletion). Testing has shown this optimization to be particularly
effective during parallel index-only backwards scans: ~12% reductions in
query execution time are quite possible.
We're much better off being optimistic; concurrent left sibling page
splits are rare in general. It's possible that we'll need to lock more
pages than the pessimistic approach would have, but only when there are
_multiple_ concurrent splits of the left sibling page we now start at.
If there's just a single concurrent left sibling page split, the new
approach to scanning backwards will at least break even relative to the
old one (we'll acquire the same number of leaf page locks as before).
The optimization from this commit has long been contemplated by comments
added by commit 2ed5b87f96, which changed the rules for locking/pinning
during nbtree index scans. The approach that that commit introduced to
leaf level link traversal when scanning forwards is now more or less
applied all the time, regardless of the direction we're scanning in.
Following uniform conventions around sibling link traversal is simpler.
The only real remaining difference between our forward and backwards
handling is that our backwards handling must still detect and recover
from any concurrent left sibling splits (and concurrent page deletions),
as documented in the nbtree README. That is structured as a single,
isolated extra step that takes place in _bt_readnextpage.
Also use this opportunity to further simplify the functions that deal
with reading pages and traversing sibling links on the leaf level, and
to document their preconditions and postconditions (with respect to
things like buffer locks, buffer pins, and seizing the parallel scan).
This enhancement completely supersedes the one recently added by commit
3f44959f.
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAEze2WgpBGRgTTxTWVPXc9+PB6fc1a7t+VyGXHzfnrFXcQVxnA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkBTuFv7W2+84jJT8mWZLXVL0GHq2hMUTn6c9Vw=eYrCw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
The previous code (from commit 5d2e1cc117b) did not detect end of
string correctly, so it would fail to error out if fewer than the
expected number of fields were present, which could then later lead to
a crash when NULL string pointers are accessed.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reported-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c3944222a@eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 3bf3ab8c56 initially introduced support for unlogged
materialized views, but this was later disallowed by commit 3223b25ff7.
Additionally, commit d25f519107 added more code for handling
unlogged materialized views. This commit cleans up all unused
code related to them.
If unlogged materialized views had been supported in any official
release, psql would need to retain code to handle them for compatibility
with older servers. However, since they were never included in
an official release, this code is no longer necessary.
Author: Pixian Shi
Reviewed-by: Yugo Nagata, Fujii Masao
Discussion: https://postgr.es/m/CAAccyYKRZ=OvAvgowiSH+OELbStLP=p2Ht=R3CgT=OaNSH5DAA@mail.gmail.com
|
|
|
|
|
| |
Reviewed-by: Corey Huinker
Discussion: https://postgr.es/m/901ab7cf01957f92ea8b30b6feeb0eacfb7505fc.camel@j-davis.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After repartitioning the inner side of a hash join that would have
exceeded the allowed size, we check if all the tuples from a parent
partition moved to one child partition. That is evidence that it
contains duplicate keys and later attempts to repartition will also
fail, so we should give up trying to limit memory (for lack of a better
fallback strategy).
A thinko prevented the check from working correctly in partition 0 (the
one that is partially loaded into memory already). After
repartitioning, we should check for extreme skew if the *parent*
partition's space_exhausted flag was set, not the child partition's.
The consequence was repeated futile repartitioning until per-partition
data exceeded various limits including "ERROR: invalid DSA memory alloc
request size 1811939328", OS allocation failure, or temporary disk space
errors. (We could also do something about some of those symptoms, but
that's material for separate patches.)
This problem only became likely when PostgreSQL 16 introduced support
for Parallel Hash Right/Full Join, allowing NULL keys into the hash
table. Repartitioning always leaves NULL in partition 0, no matter how
many times you do it, because the hash value is all zero bits. That's
unlikely for other hashed values, but they might still have caused
wasted extra effort before giving up.
Back-patch to all supported releases.
Reported-by: Craig Milhiser <craig@milhiser.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/CA%2BwnhO1OfgXbmXgC4fv_uu%3DOxcDQuHvfoQ4k0DFeB0Qqd-X-rQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
The result is already of the correct type, so these casts don't do
anything.
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/637eeea8-5663-460b-a114-39572c0f6c6e%40eisentraut.org
|
|
|
|
|
|
|
|
|
|
| |
Instead of using Node *, we can use an incomplete struct. That way,
everything has the correct type and fewer casts are required. This
technique is already used elsewhere in node type definitions.
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/637eeea8-5663-460b-a114-39572c0f6c6e%40eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
adf97c156 made it so ExprStates could support hashing and changed Hash
Join to use that instead of manually extracting Datums from tuples and
hashing them one column at a time.
When hashing multiple columns or expressions, the code added in that
commit stored the intermediate hash value in the ExprState's resvalue
field. That was a mistake as steps may be injected into the ExprState
between each hashing step that look at or overwrite the stored
intermediate hash value. EEOP_PARAM_SET is an example of such a step.
Here we fix this by adding a new dedicated field for storing
intermediate hash values and adjust the code so that all apart from the
final hashing step store their result in the intermediate field.
In passing, rename a variable so that it's more aligned to the
surrounding code and also so a few lines stay within the 80 char margin.
Reported-by: Andres Freund
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/CAApHDvqo9eenEFXND5zZ9JxO_k4eTA4jKMGxSyjdTrsmYvnmZw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds missing checks for COPY FORCE_NOT_NULL and FORCE_NULL
when applied to all columns via "*". These options now correctly
require CSV mode and are disallowed in COPY TO, making their behavior
consistent with FORCE_QUOTE.
Some regression tests are added to verify the correct behavior for the
all-columns case, including FORCE_QUOTE, which was not tested.
Backpatch down to 17, where support for the all-column grammar with
FORCE_NOT_NULL and FORCE_NULL has been added.
Author: Joel Jacobson
Reviewed-by: Zhang Mingli
Discussion: https://postgr.es/m/65030d1d-5f90-4fa4-92eb-f5f50389858e@app.fastmail.com
Backpatch-through: 17
|
|
|
|
| |
Oversight in commit 79fa7b3b.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 2dc1deaea turns out to have been still a brick shy of a load,
because CALL statements executing within a plpgsql exception block
could still pass the wrong snapshot to stable functions within the
CALL's argument list. That happened because standard_ProcessUtility
forces isAtomicContext to true if IsTransactionBlock is true, which
it always will be inside a subtransaction. Then ExecuteCallStmt
would think it does not need to push a new snapshot --- but
_SPI_execute_plan didn't do so either, since it thought it was in
nonatomic mode.
The best fix for this seems to be for _SPI_execute_plan to operate
in atomic execution mode if IsSubTransaction() is true, even when the
SPI context as a whole is non-atomic. This makes _SPI_execute_plan
have the same rules about when non-atomic execution is allowed as
_SPI_commit/_SPI_rollback have about when COMMIT/ROLLBACK are allowed,
which seems appropriately symmetric. (If anyone ever tries to allow
COMMIT/ROLLBACK inside a subtransaction, this would all need to be
rethought ... but I'm unconvinced that such a thing could be logically
consistent at all.)
For further consistency, also check IsSubTransaction() in
SPI_inside_nonatomic_context. That does not matter for its
one present-day caller StartTransaction, which can't be reached
inside a subtransaction. But if any other callers ever arise,
they'd presumably want this definition.
Per bug #18656 from Alexander Alehin. Back-patch to all
supported branches, like previous fixes in this area.
Discussion: https://postgr.es/m/18656-cade1780866ef66c@postgresql.org
|
|
|
|
|
| |
Reported-by: Alexander Korotkov
Discussion: https://postgr.es/m/CAPpHfduAiGSsvUc614Z-JOnyQffcMeJncWMF2HnUL8wFy4fuWA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit a4ccc1cef introduced the Generation Context and modified the
logical decoding process to use a Generation Context with a fixed
block size of 8MB for storing tuple data decoded during logical
decoding (i.e., rb->tup_context). Several reports have indicated that
the logical decoding process can be terminated due to
out-of-memory (OOM) situations caused by excessive memory usage in
rb->tup_context.
This issue can occur when decoding a workload involving several
concurrent transactions, including a long-running transaction that
modifies tuples. By design, the Generation Context does not free a
memory block until all chunks within that block are
released. Consequently, if tuples modified by the long-running
transaction are stored across multiple memory blocks, these blocks
remain allocated until the long-running transaction completes, leading
to substantial memory fragmentation. The memory usage during logical
decoding, tracked by rb->size, does not account for memory
fragmentation, resulting in potentially much higher memory consumption
than the value of the logical_decoding_work_mem parameter.
Various improvement strategies were discussed in the relevant
thread. This change reduces the block size of the Generation Context
used in rb->tup_context from 8MB to 8kB. This modification
significantly decreases the likelihood of substantial memory
fragmentation occurring and is relatively straightforward to
backport. Performance testing across multiple platforms has confirmed
that this change will not introduce any performance degradation that
would impact actual operation.
Backport to all supported branches.
Reported-by: Alex Richman, Michael Guissine, Avi Weinberg
Reviewed-by: Amit Kapila, Fujii Masao, David Rowley
Tested-by: Hayato Kuroda, Shlok Kyal
Discussion: https://postgr.es/m/CAD21AoBTY1LATZUmvSXEssvq07qDZufV4AF-OHh9VD2pC0VY2A%40mail.gmail.com
Backpatch-through: 12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 5bf748b8 taught nbtree ScalarArrayOp index scans to decide when
and how to start the next primitive index scan based on physical index
characteristics. This included rules for deciding whether to start a
new primitive index scan (or whether to move onto the right sibling leaf
page instead) that specifically consider truncated lower-order columns
(-inf columns) from leaf page high keys.
These omitted columns were treated as satisfying the scan's required
scan keys, though only for scan keys marked required in the current scan
direction (forward). Scan keys that didn't get this behavior (those
marked required in the backwards direction only) usually didn't give the
scan reasonable cause to reposition itself to a later leaf page (via
another descent of the index in _bt_first), but _bt_advance_array_keys
would nevertheless always give up by forcing another call to _bt_first.
_bt_advance_array_keys was unwilling to allow the scan to continue onto
the next leaf page, to reconsider whether we really should start another
primitive scan based on the details of the sibling page's tuples. This
didn't match its behavior with similar cases involving keys required in
the current scan direction (forward), which seems unprincipled. It led
to an excessive number of primitive scans/index descents for queries
with a higher-order = array scan key (with dense, contiguous values)
mixed with a lower-order required > or >= scan key.
Bring > and >= strategy scan keys in line with other required scan key
types: treat truncated -inf scan keys as having satisfied scan keys
required in either scan direction (forwards and backwards alike) during
array advancement. That way affected scans can continue to the right
sibling leaf page. Advancement must now schedule an explicit recheck of
the right sibling page's high key in cases involving > or >= scan keys.
The recheck gives the scan a way to back out and start another primitive
index scan (we can't just rely on _bt_checkkeys with > or >= scan keys).
This work can be considered a stand alone optimization on top of the
work from commit 5bf748b8. But it was written in preparation for an
upcoming patch that will add skip scan to nbtree. In practice scans
that use "skip arrays" will tend to be much more sensitive to any
implementation deficiencies in this area.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAH2-Wz=9A_UtM7HzUThSkQ+BcrQsQZuNhWOvQWK06PRkEp=SKQ@mail.gmail.com
|
|
|
|
|
|
|
|
| |
An oversight of 3a8a1f3254b.
Reported-by: Tender Wang <tndrwang@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Backpatch-through: 16
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two near-identical copies of clause_sides_match_join() existed in
joinpath.c and analyzejoins.c. Deduplicate this by moving the function
into restrictinfo.h.
It isn't quite clear that keeping the inline property of this function
is worthwhile, but this commit is just an exercise in code
deduplication. More effort would be required to determine if the inline
property is worth keeping.
Author: James Hunter <james.hunter.pg@gmail.com>
Discussion: https://postgr.es/m/CAJVSvF7Nm_9kgMLOch4c-5fbh3MYg%3D9BdnDx3Dv7Fcb64zr64Q%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This module provides SQL functions that allow to inspect logical
decoding components.
It currently allows to inspect the contents of serialized logical
snapshots of a running database cluster, which is useful for debugging
or educational purposes.
Author: Bertrand Drouvot
Reviewed-by: Amit Kapila, Shveta Malik, Peter Smith, Peter Eisentraut
Reviewed-by: David G. Johnston
Discussion: https://postgr.es/m/ZscuZ92uGh3wm4tW%40ip-10-97-1-34.eu-west-3.compute.internal
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit moves the definitions of the SnapBuild and SnapBuildOnDisk
structs, related to logical snapshots, to the snapshot_internal.h
file. This change allows external tools, such as
pg_logicalinspect (with an upcoming patch), to access and utilize the
contents of logical snapshots.
Author: Bertrand Drouvot
Reviewed-by: Amit Kapila, Shveta Malik, Peter Smith
Discussion: https://postgr.es/m/ZscuZ92uGh3wm4tW%40ip-10-97-1-34.eu-west-3.compute.internal
|
|
|
|
|
|
|
|
| |
Move implementation of pg_locale_t code for libc collations into
pg_locale_libc.c. Other locale-related code, such as
pg_perm_setlocale(), remains in pg_locale.c for now.
Discussion: https://postgr.es/m/flat/2830211e1b6e6a2e26d845780b03e125281ea17b.camel@j-davis.com
|
|
|
|
| |
Discussion: https://postgr.es/m/flat/2830211e1b6e6a2e26d845780b03e125281ea17b.camel@j-davis.com
|
|
|
|
|
|
|
|
|
| |
Commit d746021de1 introduced construct_array_builtin() for built-in
data types, but forgot some replacements linked to FLOAT8OID.
Author: Bertrand Drouvot
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/CAD21AoCERkwmttY44dqUw%3Dm_9QCctu7W%2Bp6B7w_VqxRJA1Qq_Q%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The MergeJoin struct was tracking "mergeStrategies", which were an
array of btree strategy numbers, purely for the purpose of comparing
it later against btree strategies to determine if the scan direction
was forward or reverse. Change that. Instead, track
"mergeReversals", an array of bool, to indicate the same without an
unfortunate assumption that a strategy number refers specifically to a
btree strategy.
Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Functions make_pathkey_from_sortop() and transformWindowDefinitions(),
which receive a SortGroupClause, were determining the sort order
(ascending vs. descending) by comparing that structure's operator
strategy to BTLessStrategyNumber, but could just as easily have gotten
it from the SortGroupClause object, if it had such a field, so add
one. This reduces the number of places that hardcode the assumption
that the strategy refers specifically to a btree strategy, rather than
some other index AM's operators.
Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
|