| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the inner allocation call returns NULL, we should restore the
previous state and return NULL. Previously this code pfree'd
the old chunk anyway, which is surely wrong.
Also, make it call MemoryContextAllocationFailure rather than
summarily returning NULL. The fact that we got control back from the
inner call proves that MCXT_ALLOC_NO_OOM was passed, so this change
is just cosmetic, but someday it might be less so.
This is just a latent bug at present: AFAICT no in-core callers use
this function at all, let alone call it with MCXT_ALLOC_NO_OOM.
Still, it's the kind of bug that might bite back-patched code pretty
hard someday, so let's back-patch to v17 where the bug was introduced
(by commit 743112a2e).
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
Backpatch-through: 17
|
|
|
|
|
|
|
|
|
| |
Due to concerns raised about the approach, and memory leaks found
in sensitive contexts the functionality is reverted. This reverts
commits 45e7e8ca9, f8c115a6c, d2a1ed172, 55ef7abf8 and 042a66291
for v18 with an intent to revisit this patch for v19.
Discussion: https://postgr.es/m/594293.1747708165@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As pointed out by Tom Lane, the patch introduced fragile and invasive
design around plan invalidation handling when locking of prunable
partitions was deferred from plancache.c to the executor. In
particular, it violated assumptions about CachedPlan immutability and
altered executor APIs in ways that are difficult to justify given the
added complexity and overhead.
This also removes the firstResultRels field added to PlannedStmt in
commit 28317de72, which was intended to support deferred locking of
certain ModifyTable result relations.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/605328.1747710381@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
| |
This must be "return MemoryContextAllocationFailure(context, size, flags)"
instead. The effect of this oversight is that if we got a malloc
failure right here, the code would act as though MCXT_ALLOC_NO_OOM
had been specified, whether it was or not. That would likely lead
to a null-pointer-dereference crash at the unsuspecting call site.
Noted while messing with a patch to improve our Valgrind leak
detection support. Back-patch to v17 where this code came in.
|
|
|
|
|
|
|
|
|
| |
We try to avoid using strncpy() due to the ease of which it can
be misused. Convert this callsite to use strlcpy() instead to
match similar codepaths in this file.
Suggested-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/2a796830-de2d-4030-b480-d673f6cc5d94@eisentraut.org
|
|
|
|
|
|
|
|
|
| |
This fixes comment and docs typos as well as a small documentation
change to make it clearer. Found via post-commit review.
Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAH2L28vt16C9xTuK+K7QZvtA3kCNWXOEiT=gEekUw3Xxp9LVQw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
When copying the string strncpy won't add nul termination since
the string length is equal to the length specified. Explicitly
set a nul terminator after copying to properly terminate. Found
via post-commit review.
Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAH2L28vt16C9xTuK+K7QZvtA3kCNWXOEiT=gEekUw3Xxp9LVQw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The case of "node == parent" might seem impossible, since we just
allocated the new node. But it's possible if parent is a dangling
reference to a recently-deleted context. In fact, given aset.c's
habit of recycling contexts, it's actually rather likely if that's so.
If we'd had this assertion before, it would have simplified debugging
a recently-identified walsender issue.
Reported-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAO6_XqoJA7-_G6t7Uqe5nWF3nj+QBGn4F6Ptp=rUGDr0zo+KvA@mail.gmail.com
|
|
|
|
|
|
|
| |
These are all new to v18
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvrMcr8XD107H3NV=WHgyBcu=sx5+7=WArr-n_cWUqdFXQ@mail.gmail.com
|
|
|
|
|
|
|
|
| |
The large majority of these have been introduced by recent commits done
in the v18 development cycle.
Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/9a7763ab-5252-429d-a943-b28941e0e28b@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both pg_get_process_memory_contexts() and pg_backend_memory_contexts
have 1-based levels, whereas pg_log_backend_memory_contexts() was using
0-based levels. Align these.
This results in slightly saner behavior from MemoryContextStatsDetail()
in regards to the max_level. Previously it would stop at 1 level before
the maximum requested level rather than at that level.
Reported-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Author: David Rowley <drowleyml@gmail.com
Reviewed-by: Melih Mutlu <m.melihmutlu@gmail.com>
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Discussion: https://postgr.es/m/395ea5d4fe190480efa95bf533485c70@oss.nttdata.com
|
|
|
|
|
|
|
|
|
|
| |
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in a few places. These
inconsistencies were all introduced during Postgres 18 development.
This commit was written with help from clang-tidy, by mechanically
applying the same rules as similar clean-up commits (the earliest such
commit was commit 035ce1fe).
|
|
|
|
|
|
|
|
|
|
|
|
| |
The global variable backing the DSA area for Memory Context stats
reporting had a too generic name, rename to be more descriptive.
Independently reported by Peter and Laurenz.
Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Peter Eisentraut <peter@eisentraut.org>
Reported-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/d51172bd4e7f4b07a18a0288ca1b1c28a71a5f6a.camel@cybertec.at
Discussion: https://postgr.es/m/25095db5-b595-4b85-9100-d358907c25b5@eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a function for retrieving memory context statistics
and information from backends as well as auxiliary processes.
The intended usecase is cluster debugging when under memory
pressure or unanticipated memory usage characteristics.
When calling the function it sends a signal to the specified
process to submit statistics regarding its memory contexts
into dynamic shared memory. Each memory context is returned
in detail, followed by a cumulative total in case the number
of contexts exceed the max allocated amount of shared memory.
Each process is limited to use at most 1Mb memory for this.
A summary can also be explicitly requested by the user, this
will return the TopMemoryContext and a cumulative total of
all lower contexts.
In order to not block on busy processes the caller specifies
the number of seconds during which to retry before timing out.
In the case where no statistics are published within the set
timeout, the last known statistics are returned, or NULL if
no previously published statistics exist. This allows dash-
board type queries to continually publish even if the target
process is temporarily congested. Context records contain a
timestamp to indicate when they were submitted.
Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Discussion: https://postgr.es/m/CAH2L28v8mc9HDt8QoSJ8TRmKau_8FM_HKS41NeO9-6ZAkuZKXw@mail.gmail.com
|
|
|
|
|
|
|
| |
Continuation of work started in commit 15a79c73, after initial trial.
Author: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/b936d2fb-590d-49c3-a615-92c3a88c6c19%40eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to support a "noreturn" decoration on more compilers besides
just GCC-compatible ones, but for that we need to move the decoration
in front of the function declaration instead of either behind it or
wherever, which is the current style afforded by GCC-style attributes.
Also rename the macro to "pg_noreturn" to be similar to the C11
standard "noreturn".
pg_noreturn is now supported on all compilers that support C11 (using
_Noreturn), as well as GCC-compatible ones (using __attribute__, as
before), as well as MSVC (using __declspec). (When PostgreSQL
requires C11, the latter two variants can be dropped.)
Now, all supported compilers effectively support pg_noreturn, so the
extra code for !HAVE_PG_ATTRIBUTE_NORETURN can be dropped.
This also fixes a possible problem if third-party code includes
stdnoreturn.h, because then the current definition of
#define pg_attribute_noreturn() __attribute__((noreturn))
would cause an error.
Note that the C standard does not support a noreturn attribute on
function pointer types. So we have to drop these here. There are
only two instances at this time, so it's not a big loss. In one case,
we can make up for it by adding the pg_noreturn to a wrapper function
and adding a pg_unreachable(), in the other case, the latter was
already done before.
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
palloc() includes an assertion checking that an alloc() implementation
never returns NULL for all MemoryContextMethods.
This commit adds a similar assertion in palloc0(). In palloc_extend(),
a different assertion is added, checking that MCXT_ALLOC_NO_OOM is set
when an alloc() routine returns NULL. These additions can be useful to
catch errors when implementing a new set of MemoryContextMethods
routines.
Author: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/507e8eba-2035-4a12-a777-98199a66beb8@proxel.se
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before executing a cached generic plan, AcquireExecutorLocks() in
plancache.c locks all relations in a plan's range table to ensure the
plan is safe for execution. However, this locks runtime-prunable
relations that will later be pruned during "initial" runtime pruning,
introducing unnecessary overhead.
This commit defers locking for such relations to executor startup and
ensures that if the CachedPlan is invalidated due to concurrent DDL
during this window, replanning is triggered. Deferring these locks
avoids unnecessary locking overhead for pruned partitions, resulting
in significant speedup, particularly when many partitions are pruned
during initial runtime pruning.
* Changes to locking when executing generic plans:
AcquireExecutorLocks() now locks only unprunable relations, that is,
those found in PlannedStmt.unprunableRelids (introduced in commit
cbc127917e), to avoid locking runtime-prunable partitions
unnecessarily. The remaining locks are taken by
ExecDoInitialPruning(), which acquires them only for partitions that
survive pruning.
This deferral does not affect the locks required for permission
checking in InitPlan(), which takes place before initial pruning.
ExecCheckPermissions() now includes an Assert to verify that all
relations undergoing permission checks, none of which can be in the
set of runtime-prunable relations, are properly locked.
* Plan invalidation handling:
Deferring locks introduces a window where prunable relations may be
altered by concurrent DDL, invalidating the plan. A new function,
ExecutorStartCachedPlan(), wraps ExecutorStart() to detect and handle
invalidation caused by deferred locking. If invalidation occurs,
ExecutorStartCachedPlan() updates CachedPlan using the new
UpdateCachedPlan() function and retries execution with the updated
plan. To ensure all code paths that may be affected by this handle
invalidation properly, all callers of ExecutorStart that may execute a
PlannedStmt from a CachedPlan have been updated to use
ExecutorStartCachedPlan() instead.
UpdateCachedPlan() replaces stale plans in CachedPlan.stmt_list. A new
CachedPlan.stmt_context, created as a child of CachedPlan.context,
allows freeing old PlannedStmts while preserving the CachedPlan
structure and its statement list. This ensures that loops over
statements in upstream callers of ExecutorStartCachedPlan() remain
intact.
ExecutorStart() and ExecutorStart_hook implementations now return a
boolean value indicating whether plan initialization succeeded with a
valid PlanState tree in QueryDesc.planstate, or false otherwise, in
which case QueryDesc.planstate is NULL. Hook implementations are
required to call standard_ExecutorStart() at the beginning, and if it
returns false, they should do the same without proceeding.
* Testing:
To verify these changes, the delay_execution module tests scenarios
where cached plans become invalid due to changes in prunable relations
after deferred locks.
* Note to extension authors:
ExecutorStart_hook implementations must verify plan validity after
calling standard_ExecutorStart(), as explained earlier. For example:
if (prev_ExecutorStart)
plan_valid = prev_ExecutorStart(queryDesc, eflags);
else
plan_valid = standard_ExecutorStart(queryDesc, eflags);
if (!plan_valid)
return false;
<extension-code>
return true;
Extensions accessing child relations, especially prunable partitions,
via ExecGetRangeTableRelation() must now ensure their RT indexes are
present in es_unpruned_relids (introduced in commit cbc127917e), or
they will encounter an error. This is a strict requirement after this
change, as only relations in that set are locked.
The idea of deferring some locks to executor startup, allowing locks
for prunable partitions to be skipped, was first proposed by Tom Lane.
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Reviewed-by: David Rowley <dgrowleyml@gmail.com> (earlier versions)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier versions)
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
|
|
|
|
| |
Backpatch-through: 13
|
|
|
|
|
|
|
|
| |
Many of them just seem to have been copied around for no real reason.
Their presence causes (small) risks of hiding actual type mismatches
or silently discarding qualifiers
Discussion: https://www.postgresql.org/message-id/flat/461ea37c-8b58-43b4-9736-52884e862820@eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pg_cursor() supposed that any Portal it finds in the hash table must
have sourceText set up, but there's an edge case where that is not so.
A newly-created Portal has sourceText = NULL, and that doesn't change
until PortalDefineQuery is called. In SPI_cursor_open_internal,
we perform GetCachedPlan between CreatePortal and PortalDefineQuery,
and it's possible for user-defined code to execute during that
planning and cause a fetch from the pg_cursors view, resulting in a
null-pointer-dereference crash. (It looks like the same could happen
in exec_bind_message, but I've not tried to provoke a failure there.)
I considered trying to fix this by setting sourceText sooner, but
there may be instances of this same calling pattern in extensions,
and we couldn't be sure they'd get the memo promptly. It seems
better to redefine pg_cursor as not showing Portals that have
not yet had PortalDefineQuery called on them, which we can do by
just skipping them if sourceText is still NULL.
(Before a1c692358, pg_cursor would instead return a row with NULL
in the statement column. We could revert to that behavior but it
doesn't really seem like a better definition, especially since our
documentation doesn't suggest that the column could be NULL.)
Per report from PetSerAl. Back-patch to all supported branches.
Discussion: https://postgr.es/m/CAKygsHTBXLXjwV43kpZa+Cs+XTiaeeJiZdL4cPBm9f4MTdw7wg@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This only affects MEMORY_CONTEXT_CHECKING builds.
This fixes an off-by-one issue in GenerationRealloc() where the
fast-path code which tries to reuse the existing allocation if the
existing chunk is >= the new requested size. The code there thought it
was always ok to use the existing chunk, but when oldsize == size there
isn't enough space to store the sentinel byte. If both sizes matched
exactly set_sentinel() would overwrite the first byte beyond the chunk
and then subsequent GenerationRealloc() calls could then fail the
Assert(chunk->requested_size < oldsize) check which is trying to ensure
the chunk is large enough to store the sentinel.
The same issue does not exist in aset.c as the sentinel checking code
only adds a sentinel byte if there's enough space in the chunk.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/49275921-7b39-41af-5eb8-97b50ce3312e@gmail.com
Backpatch-through: 16, where the problem was introduced by 0e480385e
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Up to now, committing a transaction has caused CurrentMemoryContext to
get set to TopMemoryContext. Most callers did not pay any particular
heed to this, which is problematic because TopMemoryContext is a
long-lived context that never gets reset. If the caller assumes it
can leak memory because it's running in a limited-lifespan context,
that behavior translates into a session-lifespan memory leak.
The first-reported instance of this involved ProcessIncomingNotify,
which is called from the main processing loop that normally runs in
MessageContext. That outer-loop code assumes that whatever it
allocates will be cleaned up when we're done processing the current
client message --- but if we service a notify interrupt, then whatever
gets allocated before the next switch to MessageContext will be
permanently leaked in TopMemoryContext. sinval catchup interrupts
have a similar problem, and I strongly suspect that some places in
logical replication do too.
To fix this in a generic way, let's redefine the behavior as
"CommitTransactionCommand restores the memory context that was current
at entry to StartTransactionCommand". This clearly fixes the issue
for the notify and sinval cases, and it seems to match the mental
model that's in use in the logical replication code, to the extent
that anybody thought about it there at all.
For consistency, likewise make subtransaction exit restore the context
that was current at subtransaction start (rather than always selecting
the CurTransactionContext of the parent transaction level). This case
has less risk of resulting in a permanent leak than the outer-level
behavior has, but it would not meet the principle of least surprise
for some CommitTransactionCommand calls to restore the previous
context while others don't.
While we're here, also change xact.c so that we reset
TopTransactionContext at transaction exit and then re-use it in later
transactions, rather than dropping and recreating it in each cycle.
This probably doesn't save a lot given the context recycling mechanism
in aset.c, but it should save a little bit. Per suggestion from David
Rowley. (Parenthetically, the text in src/backend/utils/mmgr/README
implies that this is how I'd planned to implement it as far back as
commit 1aebc3618 --- but the code actually added in that commit just
drops and recreates it each time.)
This commit also cleans up a few places outside xact.c that were
needlessly making TopMemoryContext current, and thus risking more
leaks of the same kind. I don't think any of them represent
significant leak risks today, but let's deal with them while the
issue is top-of-mind.
Per bug #18512 from wizardbrony. Commit to HEAD only, as this change
seems to have some risk of breaking things for some callers. We'll
apply a narrower fix for the known-broken cases in the back branches.
Discussion: https://postgr.es/m/3478884.1718656625@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes various typos, duplicated words, and tiny bits of whitespace
mainly in code comments but also in docs.
Author: Daniel Gustafsson <daniel@yesql.se>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Alexander Lakhin <exclusion@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/3F577953-A29E-4722-98AD-2DA9EFF2CBB8@yesql.se
|
|
|
|
|
|
|
|
|
|
|
|
| |
Oversight in 29f6a959c.
In passing, since we now have 4 memory context types to choose from,
provide a brief overview of the specialities of each memory context
type.
Reported-by: Amul Sul
Author: Amul Sul, David Rowley
Discussion: https://postgr.es/m/CAAJ_b94U2s9nHh--DEK=sPEZUQ+x7vQJ7529fF8UAH97QJ9NXg@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BumpContext relies on using the head block from its 'blocks' field to
use as the current block to allocate new chunks to. When we receive an
allocation request larger than allocChunkLimit, we place these chunks on
a new dedicated block and, until now, we pushed the block onto the
*head* of the 'blocks' list.
This behavior caused the previous bump block to no longer be available
for new normal-sized (non-large) allocations and would result in blocks
only being partially filled if a large allocation request arrived before
the block became full.
Here adjust the code to push these dedicated blocks onto the *tail* of
the blocks list so that the head block remains intact and available to
be used by normal allocation request sizes until it becomes full.
In passing, make the elog(ERROR) calls for the unsupported callbacks
consistent. Likewise for the header comments for those functions.
Discussion: https://postgr.es/m/CAApHDvp9___r-ayJj0nZ6GD3MeCGwGZ0_6ZptWpwj+zqHtmwCw@mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvqerXpzUnuDQfUEi3DZA+9=Ud9WSt3ruxN5b6PcOosx2g@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
The bump allocator was recently added in 29f6a959c. Our other
allocators have a similar macro to this, but seemingly the version of
the macro for those allocators is only used in places where the chunk
header is decoded. Since the bump allocator has no chunk header, none
of those functions exist for bump therefore macro is unused. Remove it.
Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/5f724fb2-96e1-4f36-b65b-47b337ad432e@eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The macro was missing a MAXALIGN around the sizeof(BumpContext) which
would cause problems detecting the keeper block on 32-bit systems that
have a MAXALIGN value of 8.
Thank you to Andres Freund, Tomas Vondra and Tom Lane for investigating
and testing.
Reported-by: Melanie Plageman, Tomas Vondra
Discussion: https://postgr.es/m/CAAKRu_Y6dZjiJEZghgNZp0Gjar1JVq-CH7XGDqExDVHnPgDjuw@mail.gmail.com
Discussion: https://postgr.es/m/a4a10b89-6ba8-4abd-b449-019aafff04fc@enterprisedb.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This introduces a bump MemoryContext type. The bump context is best
suited for short-lived memory contexts which require only allocations
of memory and never a pfree or repalloc, which are unsupported.
Memory palloc'd into a bump context has no chunk header. This makes
bump a useful context type when lots of small allocations need to be
done without any need to pfree those allocations. Allocation sizes are
rounded up to the next MAXALIGN boundary, so with this and no chunk
header, allocations are very compact indeed.
Allocations are also very fast as bump does not check any freelists to
try and make use of previously free'd chunks. It just checks if there
is enough room on the current block, and if so it bumps the freeptr
beyond this chunk and returns the value that the freeptr was previously
pointing to. Simple and fast. A new block is malloc'd when there's not
enough space in the current block.
Code using the bump allocator must take care never to call any functions
which could try to call realloc() (or variants), pfree(),
GetMemoryChunkContext() or GetMemoryChunkSpace() on a bump allocated
chunk. Due to lack of chunk headers, these operations are unsupported.
To increase the chances of catching such issues, when compiled with
MEMORY_CONTEXT_CHECKING, bump allocated chunks are given a header and
any attempt to perform an unsupported operation will result in an ERROR.
Without MEMORY_CONTEXT_CHECKING, code attempting an unsupported
operation could result in a segfault.
A follow-on commit will implement the first user of bump.
Author: David Rowley
Reviewed-by: Nathan Bossart
Reviewed-by: Matthias van de Meent
Reviewed-by: Tomas Vondra
Reviewed-by: John Naylor
Discussion: https://postgr.es/m/CAApHDvqGSpCU95TmM=Bp=6xjL_nLys4zdZOpfNyWBk97Xrdj2w@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reserve 4 bits for MemoryContextMethodID rather than 3. 3 bits did
technically allow a maximum of 8 memory context types, however, we've
opted to reserve some bit patterns which left us with only 4 slots, all
of which were used.
Here we add another bit which frees up 8 slots for future memory context
types.
In passing, adjust the enum names in MemoryContextMethodID to make it
more clear which ones can be used and which ones are reserved.
Author: Matthias van de Meent, David Rowley
Discussion: https://postgr.es/m/CAApHDvqGSpCU95TmM=Bp=6xjL_nLys4zdZOpfNyWBk97Xrdj2w@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the DSA segment size always started with 1MB and grew up
to DSA_MAX_SEGMENT_SIZE. It was inconvenient in certain scenarios,
such as when the caller desired a soft constraint on the total DSA
segment size, limiting it to less than 1MB.
This commit introduces the capability to specify the initial and
maximum DSA segment sizes when creating a DSA area, providing more
flexibility and control over memory usage.
Reviewed-by: John Naylor, Tomas Vondra
Discussion: https://postgr.es/m/CAD21AoAYGGC1ePjVX0H%2Bpp9rH%3D9vuPK19nNOiu12NprdV5TVJA%40mail.gmail.com
|
|
|
|
|
|
|
|
| |
Similar to commit 7e735035f20.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4-WhpCFMbXCjtJ%2BFzmjfPrp7Hw1pk4p%2BZpU95Kh3ofZ1A%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
You might run out of stack space with recursion, which is not nice in
functions that might be used e.g. at cleanup after transaction
abort. MemoryContext contains pointer to parent and siblings, so we
can traverse a tree of contexts iteratively, without using
stack. Refactor the functions to do that.
MemoryContextStats() still recurses, but it now has a limit to how
deep it recurses. Once the limit is reached, it prints just a summary
of the rest of the hierarchy, similar to how it summarizes contexts
with lots of children. That seems good anyway, because a context dump
with hundreds of nested contexts isn't very readable.
Report by Egor Chindyaskin and Alexander Lakhin.
Discussion: https://postgr.es/m/1672760457.940462079%40f306.i.mail.ru
Author: Heikki Linnakangas
Reviewed-by: Robert Haas, Andres Freund, Alexander Korotkov, Tom Lane
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements a radix tree data structure based on the design in
"The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases"
by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013. The main
technique that makes it adaptive is using several different node types,
each with a different capacity of elements, and a different algorithm
for accessing them. The nodes start small and grow/shrink as needed.
The main advantage over hash tables is efficient sorted iteration and
better memory locality when successive keys are lexicographically
close together. The implementation currently assumes 64-bit integer
keys, and traversing the tree is in general slower than a linear
probing hash table, so this is not a general-purpose associative array.
The paper describes two other techniques not implemented here,
namely "path compression" and "lazy expansion". These can further
reduce memory usage and speed up traversal, but the former would add
significant complexity and the latter requires storing the full key
with the value. We do trivially compress the path when leading bytes
of the key are zeros, however.
For value storage, we use "combined pointer/value slots", as
recommended in the paper. Values of size equal or smaller than the the
platform's pointer type are stored in the array of child pointers in
the last level node, while larger values are each stored in a separate
allocation. This is for now fixed at compile time, but it would be
fairly trivial to allow determining at runtime how variable-length
values are stored.
One innovation in our implementation compared to the ART paper is
decoupling the notion of node "size class" from "kind". The size
classes within a given node kind have the same underlying type, but
a variable capacity for children, so we can introduce additional node
sizes with little additional code.
To enable different use cases to specialize for different value types
and for shared/local memory, we use macro-templatized code generation
in the same manner as simplehash.h and sort_template.h.
Future commits will use this infrastructure for storing TIDs.
Patch by Masahiko Sawada and John Naylor, but a substantial amount of
credit is due to Andres Freund, whose proof-of-concept was a valuable
source of coding idioms and awareness of performance pitfalls, and
who reviewed earlier versions.
Discussion: https://postgr.es/m/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
as determined by include-what-you-use (IWYU)
While IWYU also suggests to *add* a bunch of #include's (which is its
main purpose), this patch does not do that. In some cases, a more
specific #include replaces another less specific one.
Some manual adjustments of the automatic result:
- IWYU currently doesn't know about includes that provide global
variable declarations (like -Wmissing-variable-declarations), so
those includes are being kept manually.
- All includes for port(ability) headers are being kept for now, to
play it safe.
- No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the
patch from exploding in size.
Note that this patch touches just *.c files, so nothing declared in
header files changes in hidden ways.
As a small example, in src/backend/access/transam/rmgr.c, some IWYU
pragma annotations are added to handle a special case there.
Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a similar effort to 413c18401, separate out the hot and cold paths in
GenerationAlloc() and SlabAlloc() to avoid having to setup the stack frame
for the hot path.
This additionally adjusts how we use the GenerationContext's freeblock.
Freeblock, when set, is now always empty and we only switch to using it
when the current allocation request finds the current block does not have
enough space and the freeblock is large enough to accomodate the
allocation.
This commit also adjusts GenerationFree() so that if we pfree the final
allocation in the current generation block, we now mark that block as
empty and keep it as the current block. Previously we free'd that block
and set the current block to NULL. Doing that meant we needed a special
case in GenerationAlloc to check if GenerationContext.block was NULL.
So this both reduces free/malloc calls and reduces the work done in
GenerationAlloc().
In passing, improve some comments in aset.c
Discussion: https://postgr.es/m/CAApHDvpHVSJqqb4B4OZLixr=CotKq-eKkbwZqvZVo_biYvUvQA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
dsa_dump would print a large negative number instead of zero for
segment bin 0. Fix by explicitly checking for underflow and add
special case for bin 0. Backpatch to all supported versions.
Author: Ian Ilyasov <ianilyasov@outlook.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/GV1P251MB1004E0D09D117D3CECF9256ECD502@GV1P251MB1004.EURP251.PROD.OUTLOOK.COM
Backpatch-through: v12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allocating from a free list or from a block which contains enough space
already, we deem to be common code paths and want to optimize for those.
Having to allocate a new block, either a normal block or a dedicated one
for a large allocation, we deem to be less common, therefore we class
that as "cold". Both cold paths require a malloc so are going to be
slower as a result of that regardless.
The main motivation here is to remove the calls to malloc() in the hot
path and because of this, the compiler is now free to not bother setting
up the stack frame in AllocSetAlloc(), thus making the hot path much
cheaper.
Author: Andres Freund
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/20210719195950.gavgs6ujzmjfaiig@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Many modern compilers are able to optimize function calls to functions
where the parameters of the called function match a leading subset of
the calling function's parameters. If there are no instructions in the
calling function after the function is called, then the compiler is free
to avoid any stack frame setup and implement the function call as a
"jmp" rather than a "call". This is called sibling call optimization.
Here we adjust the memory allocation functions in mcxt.c to allow this
optimization. This requires moving some responsibility into the memory
context implementations themselves. It's now the responsibility of the
MemoryContext to check for malloc failures. This is good as it both
allows the sibling call optimization, but also because most small and
medium allocations won't call malloc and just allocate memory to an
existing block. That can't fail, so checking for NULLs in that case
isn't required.
Also, traditionally it's been the responsibility of palloc and the other
allocation functions in mcxt.c to check for invalid allocation size
requests. Here we also move the responsibility of checking that into the
MemoryContext. This isn't to allow the sibling call optimization, but
more because most of our allocators handle large allocations separately
and we can just add the size check when doing large allocations. We no
longer check this for non-large allocations at all.
To make checking the allocation request sizes and ERROR handling easier,
add some helper functions to mcxt.c for the allocators to use.
Author: Andres Freund
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/20210719195950.gavgs6ujzmjfaiig@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a new "Memory:" line under the "Planning:" group (which
currently only has "Buffers:") when the MEMORY option is specified.
In order to make the reporting reasonably accurate, we create a separate
memory context for planner activities, to be used only when this option
is given. The total amount of memory allocated by that context is
reported as "allocated"; we subtract memory in the context's freelists
from that and report that result as "used". We use
MemoryContextStatsInternal() to obtain the quantities.
The code structure to show buffer usage during planning was not in
amazing shape, so I (Álvaro) modified the patch a bit to clean that up
in passing.
Author: Ashutosh Bapat
Reviewed-by: David Rowley, Andrey Lepikhov, Jian He, Andy Fan
Discussion: https://www.postgresql.org/message-id/CAExHW5sZA=5LJ_ZPpRO-w09ck8z9p7eaYAqq3Ks9GDfhrxeWBw@mail.gmail.com
|
|
|
|
|
| |
The keeper block was introduced in commit 1b0d9aa4f7, but it forgot
to update this comment.
|
|
|
|
|
|
|
|
| |
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz
Backpatch-through: 12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Remove MemoryContextAllocZeroAligned(). It was supposed to be a
faster version of MemoryContextAllocZero(), but modern compilers turn
the MemSetLoop() into a call to memset() anyway, making it more or
less identical to MemoryContextAllocZero(). That was the only user of
MemSetTest, MemSetLoop, so remove those too, as well as palloc0fast().
- Convert newNode() to a static inline function. When this was
originally originally written, it was written as a macro because
testing showed that gcc didn't inline the size check as we
intended. Modern compiler versions do, and now that it just calls
palloc0() there is no size-check to inline anyway.
One nice effect is that the palloc0() takes one less argument than
MemoryContextAllocZeroAligned(), which saves a few instructions in the
callers of newNode().
Reviewed-by: Peter Eisentraut, Tom Lane, John Naylor, Thomas Munro
Discussion: https://www.postgresql.org/message-id/b51f1fa7-7e6a-4ecc-936d-90a8a1659e7c@iki.fi
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The comments in dsa.c suggested that areas were owned by resource
owners, but it was not in fact tracked explicitly. The DSM attachments
held by the dsa were owned by resource owners, but not the area
itself. That led to confusion if you used one resource owner to
attach or create the area, but then switched to a different resource
owner before allocating or even just accessing the allocations in the
area with dsa_get_address(). The additional DSM segments associated
with the area would get owned by a different resource owner than the
initial segment. To fix, add an explicit 'resowner' field to
dsa_area. It replaces the 'mapping_pinned' flag; resowner == NULL now
indicates that the mapping is pinned.
This is arguably a bug fix, but I'm not backpatching because it
doesn't seem to be a live bug in the back branches. In 'master', it is
a bug because commit b8bff07daa made ResourceOwners more strict so
that you are no longer allowed to remember new resources in a
ResourceOwner after you have started to release it. Merely accessing a
dsa pointer might need to attach a new DSM segment, and before this
commit it was temporarily remembered in the current owner for a very
brief period even if the DSA was pinned. And that could happen in
AtEOXact_PgStat(), which is called after the owner is already released.
Reported-by: Alexander Lakhin
Reviewed-by: Alexander Lakhin, Thomas Munro, Andres Freund
Discussion: https://www.postgresql.org/message-id/11b70743-c5f3-3910-8e5b-dd6c115ff829%40gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here we reduce the block size fields in AllocSetContext, GenerationContext
and SlabContext from Size down to uint32. Ever since c6e0fe1f2, blocks
for non-dedicated palloc chunks can no longer be larger than 1GB, so
there's no need to store the various block size fields as 64-bit values.
32 bits are enough to store 2^30.
Here we also further reduce the memory context struct sizes by getting rid
of the 'keeper' field which stores a pointer to the context's keeper
block. All the context types which have this field always allocate the
keeper block in the same allocation as the memory context itself, so the
keeper block always comes right at the end of the context struct. Add
some macros to calculate that address rather than storing it in the
context.
Overall, in AllocSetContext and GenerationContext, this saves 20 bytes on
64-bit builds which for ALLOCSET_SMALL_SIZES can sometimes mean the
difference between having to allocate a 2nd block and storing all the
required allocations on the keeper block alone. Such contexts are used
in relcache to store cache entries for indexes, of which there can be
a large number in a single backend.
Author: Melih Mutlu
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAGPVpCSOW3uJ1QJmsMR9_oE3X7fG_z4q0AoU4R_w+2RzvroPFg@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's OK to be lazy about re-binning memory segments when allocating,
because that can only leave segments in a bin that's too high. We'll
search higher bins if necessary while allocating next time, and
also eventually re-bin, so no memory can become unreachable that way.
However, when freeing memory, the largest contiguous range of free pages
might go up, so we should re-bin eagerly to make sure we don't leave the
segment in a bin that is too low for get_best_segment() to find.
The re-binning code is moved into a function of its own, so it can be
called whenever free pages are returned to the segment's free page map.
Back-patch to all supported releases.
Author: Dongming Liu <ldming101@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier version)
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CAL1p7e8LzB2LSeAXo2pXCW4%2BRya9s0sJ3G_ReKOU%3DAjSUWjHWQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Run pgindent, pgperltidy, and reformat-dat-files.
This set of diffs is a bit larger than typical. We've updated to
pg_bsd_indent 2.1.2, which properly indents variable declarations that
have multi-line initialization expressions (the continuation lines are
now indented one tab stop). We've also updated to perltidy version
20230309 and changed some of its settings, which reduces its desire to
add whitespace to lines to make assignments etc. line up. Going
forward, that should make for fewer random-seeming changes to existing
code.
Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql
|
|
|
|
|
| |
Author: Alexander Lakhin
Discussion: https://postgr.es/m/699beab4-a6ca-92c9-f152-f559caf6dc25@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes many spelling mistakes in comments, but a few references to
invalid parameter names, function names and option names too in comments
and also some in string constants
Also, fix an #undef that was undefining the incorrect definition
Author: Alexander Lakhin
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/d5f68d19-c0fc-91a9-118d-7c6a5a3f5fad@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this commit we only ever protected MemoryChunk's requested_size
field with Valgrind NOACCESS. This means that if the hdrmask field is
ever accessed accidentally then we're not going to get any warnings from
Valgrind about it. Valgrind would have warned us about the problem fixed
in 92957ed98 had we already been doing this.
Per suggestion from Tom Lane
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/1650235.1672694719@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAApHDvr=FZNGbj252Z6M9BSFKoq6BMxgkQ2yEAGUYoo7RquqZg@mail.gmail.com
|