| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 61b313e4 made VACUUM pass down a heaprel to index AM bulkdelete
and vacuumcleanup routines. Although this was primarily intended as
preparation for logical decoding on standbys, it also made it easy to
correct an old deficiency in how we determine how to cleanup SP-GiST
redirect and placeholder tuples.
Pass the heaprel to GlobalVisTestFor() during cleanup of redirect and
placeholder tuples, rather than pessimistically passing NULL.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/02392033-f030-a3c8-c7d0-5c27eb529fec@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 61b313e4 made nbtree consistently pass down a heaprel to low
level routines like _bt_getbuf(). Although this was primarily intended
as preparation for logical decoding on standbys, it also made it easy to
correct an old deficiency in how nbtree VACUUM determines whether or not
it's now safe to recycle deleted pages.
Pass the heaprel to GlobalVisTestFor() in nbtree routines that deal with
recycle safety. nbtree now makes less pessimistic assumptions about
recycle safety within non-catalog relations. This enhancement
complements the recycling enhancement added by commit 9dd963ae25.
nbtree remains just as pessimistic as ever when it comes to recycle
safety within indexes on catalog relations. There is no fundamental
reason why we need to treat catalog relations differently, though. The
behavioral inconsistency is a consequence of the way that nbtree uses
nextXID values to implement what Lanin and Shasha call "the drain
technique". Note in particular that it has nothing to do with whether
or not index tuples might still be required for an older MVCC snapshot.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkaiDxCje0yPuH=3Uh2p1V_2pFGY==xfbZoZu7Ax_NB8g@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
Commit 61b313e4 added a heaprel struct member to IndexVacuumInfo, but
placed it last. Move the heaprel struct member next to the index struct
member to improve the code's readability.
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznG=TV6S9d3VA=y0vBHbXwnLs9_LLdiML=aNJuHeriwxg@mail.gmail.com
|
|
|
|
| |
Discussion: https://postgr.es/m/20230323003003.plgaxjqahjgkuxrk%40awork3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Traditionally, vacuum always makes use of a buffer access strategy 32
buffers in size. This means that running vacuums tend not to cause too
many shared buffers to become dirty, however, this can cause vacuums to
run much more slowly than they otherwise could as WAL flushes will occur
more frequently due to having to flush WAL out to the LSN of the dirty
page before that page can be written to disk.
When we are performing failsafe VACUUMs (as added in 1e55e7d17), we really
want to make the vacuum work go as quickly as possible, so here we disable
the buffer access strategy when entering failsafe mode while vacuuming a
relation.
Per idea and analyis from Andres Freund.
In passing, also include some changes I had intended for 32fbe0239.
Author: Melanie Plageman
Reviewed-by: Justin Pryzby, David Rowley
Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb%40awork3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit only implements one prerequisite part for allowing logical
decoding. The commit message contains an explanation of the overall design,
which later commits will refer back to.
Overall design:
1. We want to enable logical decoding on standbys, but replay of WAL
from the primary might remove data that is needed by logical decoding,
causing error(s) on the standby. To prevent those errors, a new replication
conflict scenario needs to be addressed (as much as hot standby does).
2. Our chosen strategy for dealing with this type of replication slot
is to invalidate logical slots for which needed data has been removed.
3. To do this we need the latestRemovedXid for each change, just as we
do for physical replication conflicts, but we also need to know
whether any particular change was to data that logical replication
might access. That way, during WAL replay, we know when there is a risk of
conflict and, if so, if there is a conflict.
4. We can't rely on the standby's relcache entries for this purpose in
any way, because the startup process can't access catalog contents.
5. Therefore every WAL record that potentially removes data from the
index or heap must carry a flag indicating whether or not it is one
that might be accessed during logical decoding.
Why do we need this for logical decoding on standby?
First, let's forget about logical decoding on standby and recall that
on a primary database, any catalog rows that may be needed by a logical
decoding replication slot are not removed.
This is done thanks to the catalog_xmin associated with the logical
replication slot.
But, with logical decoding on standby, in the following cases:
- hot_standby_feedback is off
- hot_standby_feedback is on but there is no a physical slot between
the primary and the standby. Then, hot_standby_feedback will work,
but only while the connection is alive (for example a node restart
would break it)
Then, the primary may delete system catalog rows that could be needed
by the logical decoding on the standby (as it does not know about the
catalog_xmin on the standby).
So, it’s mandatory to identify those rows and invalidate the slots
that may need them if any. Identifying those rows is the purpose of
this commit.
Implementation:
When a WAL replay on standby indicates that a catalog table tuple is
to be deleted by an xid that is greater than a logical slot's
catalog_xmin, then that means the slot's catalog_xmin conflicts with
the xid, and we need to handle the conflict. While subsequent commits
will do the actual conflict handling, this commit adds a new field
isCatalogRel in such WAL records (and a new bit set in the
xl_heap_visible flags field), that is true for catalog tables, so as to
arrange for conflict handling.
The affected WAL records are the ones that already contain the
snapshotConflictHorizon field, namely:
- gistxlogDelete
- gistxlogPageReuse
- xl_hash_vacuum_one_page
- xl_heap_prune
- xl_heap_freeze_page
- xl_heap_visible
- xl_btree_reuse_page
- xl_btree_delete
- spgxlogVacuumRedirect
Due to this new field being added, xl_hash_vacuum_one_page and
gistxlogDelete do now contain the offsets to be deleted as a
FLEXIBLE_ARRAY_MEMBER. This is needed to ensure correct alignment.
It's not needed on the others struct where isCatalogRel has
been added.
This commit just introduces the WAL format changes mentioned above. Handling
the actual conflicts will follow in future commits.
Bumps XLOG_PAGE_MAGIC as the several WAL records are changed.
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Author: Andres Freund <andres@anarazel.de> (in an older version)
Author: Amit Khandekar <amitdkhan.pg@gmail.com> (in an older version)
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is done in preparation for logical decoding on standby, which needs to
include whether visibility affecting WAL records are about a (user) catalog
table. Which is only known for the table, not the indexes.
It's also nice to be able to pass the heap relation to GlobalVisTestFor() in
vacuumRedirectAndPlaceholder().
Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/21b700c3-eecf-2e05-a699-f8c78dd31ec7@gmail.com
|
|
|
|
|
|
|
|
|
| |
If visibilitymap_set() is called with flags containing a higher bit than
VISIBILITYMAP_ALL_FROZEN, the state of neighboring pages is affected. While
there was an assertion that *some* valid bits were set, it did not check
that *only* valid bits were. Change that.
Discussion: https://postgr.es/m/20230331043300.gux3s5wzrapqi4oe@awork3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PageInit() while holding the extension lock is unnecessary after 0d1fe9f74e3
started to use RBM_ZERO_AND_LOCK - nobody can look at the new page before we
release the page lock. PageInit() zeroes the page, which isn't that cheap, so
deferring it until after the extension lock is released seems like a good idea.
Doing visibilitymap_pin() while holding the extension lock, introduced in
7db0cd2145f2, looks like an accident. Due to the restrictions on
HEAP_INSERT_FROZEN it's unlikely to be a performance issue, but it still seems
better to move it out. We also are doing the visibilitymap_pin() while
holding the buffer lock, which will be fixed in a separate commit.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: http://postgr.es/m/419312fd-9255-078c-c3e3-f0525f911d7f@iki.fi
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In instr_time.h it is stated that:
* When summing multiple measurements, it's recommended to leave the
* running sum in instr_time form (ie, use INSTR_TIME_ADD or
* INSTR_TIME_ACCUM_DIFF) and convert to a result format only at the end.
The reason for that is that converting to microseconds is not cheap, and can
loose precision. Therefore this commit changes 'PendingWalStats' to use
'instr_time' instead of 'PgStat_Counter' while accumulating 'wal_write_time'
and 'wal_sync_time'.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/1feedb83-7aa9-cb4b-5086-598349d3f555@gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gistBuildCallback tried to fetch the size of an index tuple that
might have already been freed by gistProcessEmptyingQueue.
While this seems to usually be harmless in production builds,
in principle it could result in a SIGSEGV, or more likely a bogus
value for indtuplesSize leading to poor page-split decisions later
in the build.
The memory management here is confusing and could stand to be
refactored, but for the moment it seems to be enough to fetch
the tuple size sooner. AFAICT the indtuples[Size] totals aren't
used in between these places; even if they were, the updated
values shouldn't be any worse to use. So just move the
incrementing of the totals up.
It's not very clear why our valgrind-using buildfarm animals
haven't noticed this problem, because the relevant code path
does seem to be exercised according to the code coverage report.
I think the reason that we didn't fix this bug after the first
report is that I'd wanted to try to understand that better.
However, now that it's been re-discovered let's just be pragmatic
and fix it already.
Original report by Alexander Lakhin (bug #16329),
later rediscovered by Egor Chindyaskin (bug #17874).
Patch by Alexander Lakhin (commentary by Pavel Borisov and me).
Back-patch to all supported branches.
Discussion: https://postgr.es/m/16329-7a6aa9b6fa1118a1@postgresql.org
Discussion: https://postgr.es/m/17874-63ca6c7ce42d2103@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function has been a no-op for over a decade. Even if bufmgr
regains a need to be called during commit, it seems unlikely that
the most appropriate call points would be precisely here, so it's not
doing us much good as a placeholder either. Now, removing it probably
doesn't save any noticeable number of cycles --- but the main call is
inside the commit critical section, and the less work done there the
better.
Matthias van de Meent
Discussion: https://postgr.es/m/CAEze2Wi1=tLKbxZnXzcD+8fYKyKqBtivVakLQC_mYBsP4Y8qVA@mail.gmail.com
|
|
|
|
|
|
|
| |
In ancient times, these belonged to arguments or fields that were
actually of type long, but now they are not anymore, so this "L"
decoration is just confusing. (Some other 0L and other "L" constants
remain, where they are actually associated with a long type.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change the columns attndims, attstattarget, and attinhcount from int32
to int16, and reorder a bit. This saves some space (currently 4
bytes) in pg_attribute and tuple descriptors, which translates into
small performance benefits and/or room for new columns in pg_attribute
needed by future features.
attndims and attinhcount are never realistically used with values
larger than int16. Just to be sure, add some overflow checks.
attstattarget is currently limited explicitly to 10000.
For consistency, pg_constraint.coninhcount is also changed like
attinhcount.
Discussion: https://www.postgresql.org/message-id/flat/d07ffc2b-e0e8-77f7-38fb-be921dff71af%40enterprisedb.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When extracting an attr from a cached tuple in the syscache with
SysCacheGetAttr the isnull parameter must be checked in case the
attr cannot be NULL. For cases when this is known beforehand, a
wrapper is introduced which perform the errorhandling internally
on behalf of the caller, invoking an elog in case of a NULL attr.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/AD76405E-DB45-46B6-941F-17B1EB3A9076@yesql.se
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add pgstat counter to track row updates that result in the successor
version going to a new heap page, leaving behind an original version
whose t_ctid points to the new version. The current count is shown by
the n_tup_newpage_upd column of each of the pg_stat_*_tables views.
The new n_tup_newpage_upd column complements the existing n_tup_hot_upd
and n_tup_upd columns. Tables that have high n_tup_newpage_upd values
(relative to n_tup_upd) are good candidates for tuning heap fillfactor.
Corey Huinker, with small tweaks by me.
Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CADkLM=ded21M9iZ36hHm-vj2rE2d=zcKpUQMds__Xm2pxLfHKA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, in read committed transaction isolation mode (default), we have the
following sequence of actions when tuple_update()/tuple_delete() finds
the tuple updated by concurrent transaction.
1. Attempt to update/delete tuple with tuple_update()/tuple_delete(), which
returns TM_Updated.
2. Lock tuple with tuple_lock().
3. Re-evaluate plan qual (recheck if we still need to update/delete and
calculate the new tuple for update).
4. Second attempt to update/delete tuple with tuple_update()/tuple_delete().
This attempt should be successful, since the tuple was previously locked.
This patch eliminates step 2 by taking the lock during first
tuple_update()/tuple_delete() call. Heap table access method saves some
efforts by checking the updated tuple once instead of twice. Future
undo-based table access methods, which will start from the latest row version,
can immediately place a lock there.
The code in nodeModifyTable.c is simplified by removing the nested switch/case.
Discussion: https://postgr.es/m/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg%40mail.gmail.com
Reviewed-by: Aleksander Alekseev, Pavel Borisov, Vignesh C, Mason Sharp
Reviewed-by: Andres Freund, Chris Travers
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When determining whether an index update may be skipped by using HOT, we
can ignore attributes indexed by block summarizing indexes without
references to individual tuples that need to be cleaned up.
A new type TU_UpdateIndexes provides a signal to the executor to
determine which indexes to update - no indexes, all indexes, or only the
summarizing indexes.
This also removes rd_indexattr list, and replaces it with rd_attrsvalid
flag. The list was not used anywhere, and a simple flag is sufficient.
This was originally committed as 5753d4ee32, but then got reverted by
e3fcca0d0d because of correctness issues.
Original patch by Josef Simanek, various fixes and improvements by Tomas
Vondra and me.
Authors: Matthias van de Meent, Josef Simanek, Tomas Vondra
Reviewed-by: Tomas Vondra, Alvaro Herrera
Discussion: https://postgr.es/m/05ebcb44-f383-86e3-4f31-0a97a55634cf@enterprisedb.com
Discussion: https://postgr.es/m/CAFp7QwpMRGcDAQumN7onN9HjrJ3u4X3ZRXdGFT0K5G2JWvnbWg%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When calculating distance in brin_minmax_multi_distance_inet(), the
netmask was applied incorrectly. This results in (seemingly) incorrect
ordering of values, triggering an assert.
For builds without asserts this is mostly harmless - we may merge other
ranges, possibly resulting in slightly less efficient index. But it's
still correct and the greedy algorithm doesn't guarantee optimality
anyway.
Backpatch to 14, where minmax-multi indexes were introduced.
Reported by Dmitry Dolgov, investigation and fix by me.
Reported-by: Dmitry Dolgov
Backpatch-through: 14
Discussion: https://postgr.es/m/17774-c6f3e36dd4471e67@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following changes are made to pg_write_zeros(), the API able to
write series of zeros using vectored I/O:
- Add of an "offset" parameter, to write the size from this position
(the 'p' of "pwrite" seems to mean position, though POSIX does not
outline ythat directly), hence the name of the routine is incorrect if
it is not able to handle offsets.
- Avoid memset() of "zbuffer" on every call.
- Avoid initialization of the whole IOV array if not needed.
- Group the trailing write() call with the main write() call,
simplifying the function logic.
Author: Andres Freund
Reviewed-by: Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/20230215005525.mrrlmqrxzjzhaipl@awork3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Open long-lived data and WAL file descriptors with O_CLOEXEC. This flag
was introduced by SUSv4 (POSIX.1-2008), and by now all of our target
Unix systems have it. Our open() implementation for Windows already had
that behavior, so provide a dummy O_CLOEXEC flag on that platform.
For now, callers of open() and the "thin" wrappers in fd.c that deal in
raw descriptors need to pass in O_CLOEXEC explicitly if desired. This
commit does that for WAL files, and automatically for everything
accessed via VFDs including SMgrRelation and BufFile. (With more
discussion we might decide to turn it on automatically for the thin
open()-wrappers too to avoid risk of missing places that need it, but
these are typically used for short-lived descriptors where we don't
expect to fork/exec, and it's remotely possible that extensions could be
using these APIs and passing descriptors to subprograms deliberately, so
that hasn't been done here.)
Do the same for sockets and the postmaster pipe with FD_CLOEXEC. (Later
commits might use modern interfaces to remove these extra fcntl() calls
and more where possible, but we'll need them as a fallback for a couple
of systems, so do it that way in this initial commit.)
With this change, subprograms executed for archiving, copying etc will
no longer have access to the server's descriptors, other than the ones
that we decide to pass down.
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKGKb6FsAdQWcRL35KJsftv%2B9zXqQbzwkfRf1i0J2e57%2BhQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function has been semi-deprecated ever since we invented
bms_next_member(). Its habit of scribbling on the input bitmapset
isn't great, plus for sufficiently large bitmapsets it would take
O(N^2) time to complete a loop. Now we have the additional problem
that reducing the input to empty while leaving it still accessible
would violate a planned invariant. So let's just get rid of it,
after updating the few extant callers to use bms_next_member().
Patch by me; thanks to Nathan Bossart and Richard Guo for review.
Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
| |
When you have some invalid WAL, you often get a message like "wanted
24, got 0". This is a bit incorrect, since it really wanted *at
least* 24, not exactly 24. This updates the messages to that effect,
and also adds that detail to one message where it was available but
not printed.
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Jeevan Ladhe <jeevanladhe.os@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/726d782b-5e45-0c3e-d775-6686afe9aa83%40enterprisedb.com
|
|
|
|
|
|
|
|
| |
Change data buffer to void *, from char *, and add const where
appropriate. This makes it match the File API (see also
2d4f1ba6cfc2f0a977f1c30bda9848041343e248) and stdio.
Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com
|
|
|
|
|
|
|
|
| |
Now that the provider-independent API pg_strnxfrm() is available, we
no longer need the special cases for ICU in hashfunc.c and varchar.c.
Reviewed-by: Peter Eisentraut, Peter Geoghegan
Discussion: https://postgr.es/m/a581136455c940d7bd0ff482d3a2bd51af25a94f.camel%40j-davis.com
|
|
|
|
|
|
|
|
| |
Avoids the need of callers to test for NULL, and also avoids the need
to access the pg_locale_t structure directly.
Reviewed-by: Peter Eisentraut, Peter Geoghegan
Discussion: https://postgr.es/m/a581136455c940d7bd0ff482d3a2bd51af25a94f.camel%40j-davis.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Offers a generally better separation of responsibilities for collation
code. Also, a step towards multi-lib ICU, which should be based on a
clean separation of the routines required for collation providers.
Callers with NUL-terminated strings should call pg_strcoll() or
pg_strxfrm(); callers with strings and their length should call the
variants pg_strncoll() or pg_strnxfrm().
Reviewed-by: Peter Eisentraut, Peter Geoghegan
Discussion: https://postgr.es/m/a581136455c940d7bd0ff482d3a2bd51af25a94f.camel%40j-davis.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When evaluating clauses on multiple scan keys of a multi-column BRIN
index, we can stop processing as soon as we find a scan key eliminating
the range, and the range should not be added to tbe bitmap.
That's how it worked before 14, but since a681e3c107a the code treated
the range as matching if it matched at least the last scan key.
Backpatch to 14, where this code was introduced.
Backpatch-through: 14
Discussion: https://postgr.es/m/ebc18613-125e-60df-7520-fcbe0f9274fc%40enterprisedb.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
force_parallel_mode is meant to be used to allow us to exercise the
parallel query infrastructure to ensure that it's working as we expect.
It seems some users think this GUC is for forcing the query planner into
picking a parallel plan regardless of the costs. A quick look at the
documentation would have made them realize that they were wrong, but the
GUC is likely too conveniently named which, evidently, seems to often
result in users expecting that it forces the planner into usefully
parallelizing queries.
Here we rename the GUC to something which casual users are less likely to
mistakenly think is what they need to make their query run more quickly.
For now, the old name can still be used. We'll revisit if the old name
mapping can be removed once the buildfarm configs are all updated.
Reviewed-by: John Naylor
Discussion: https://postgr.es/m/CAApHDvrsOi92_uA7PEaHZMH-S4Xv+MGhQWA+GrP8b1kjpS1HjQ@mail.gmail.com
|
|
|
|
|
|
|
|
| |
The affected functions are: bsearch, memcmp, memcpy, memset, memmove,
qsort, repalloc
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/fd9adf5d-b1aa-e82f-e4c7-263c30145807%40enterprisedb.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here we further simplify the code in heapgettup() and
heapgettup_pagemode() to make better use of the helper functions added in
the previous recent refactors in this area.
In passing, remove an unneeded cast added in 8ca6d49f6.
Author: Melanie Plageman
Reviewed-by: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAAKRu_YSOnhKsDyFcqJsKtBSrd32DP-jjXmv7hL0BPD-z0TGXQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In standby mode, we don't actually report progress of recovery,
but up until now, startup_progress_timeout_handler() nevertheless
got called every log_startup_progress_interval seconds. That's
an unnecessary expense, so avoid it.
Report by Thomas Munro. Patch by Bharath Rupireddy, reviewed by
Simon Riggs, Thomas Munro, and me. Back-patch to v15, where
the problem was introduced.
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKCHSffAj8zZJKJvNX7ygnQFxVD6wm1d-2j3fVw%2BMafPQ%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
Some of these appear to be leftovers from when hash_search() took a
char * argument (changed in 5999e78fc45dcb91784b64b6e9ae43f4e4f68ca2).
Since after this there is some more horizontal space available, do
some light reformatting where suitable.
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/fd9adf5d-b1aa-e82f-e4c7-263c30145807%40enterprisedb.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commits 24c35ec and 57169ad. PreRestoreCommand() and
PostRestoreCommand() need to be put closer to the system() call calling
a restore_command, as they enable in_restore_command for the startup
process which would in turn trigger an immediate proc_exit() in the
SIGTERM handler. Perhaps we could get rid of this behavior entirely,
but 24c35ec has made the window where the flag is enabled much larger
than it was, and any Postgres-like actions (palloc, etc.) taken by code
paths while the flag is enabled could lead to more severe issues in the
shutdown processing.
Note that curculio has showed that there are much more problems in this
area, unrelated to this change, actually, hence the issues related to
that had better be addressed first. Keeping the code of HEAD in line
with the stable branches should make that a bit easier.
Per discussion with Andres Freund and Nathan Bossart.
Discussion: https://postgr.es/m/Y979NR3U5VnWrTwB@paquier.xyz
|
|
|
|
|
|
|
|
|
|
| |
The code to get the next block number was exactly the same between these
two functions, so let's just put it into a helper function and call that
from both locations.
Author: Melanie Plageman
Reviewed-by: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
Here we add heapgettup_start_page() and heapgettup_continue_page() to
simplify the code in the heapgettup() function.
Author: Melanie Plageman
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backward and forward scans share much of the same page acquisition code.
Here we consolidate that code to reduce some duplication.
Additionally, add a new rs_coffset field to HeapScanDescData to track the
offset of the current tuple. The new field fits nicely into the padding
between a bool and BlockNumber field and saves having to look at the last
returned tuple to figure out which offset we should be looking at for the
current tuple.
Author: Melanie Plageman
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the 90s we needed to deal with computers that still had the
pre-standard signal masking APIs. That hasn't been relevant for a very
long time on Unix systems, and c94ae9d8 got rid of a remaining
dependency in our Windows porting code. PG_SETMASK didn't expose
save/restore functionality, so we'd already started using sigprocmask()
directly in places, creating the visual distraction of having two ways
to spell it. It's not part of the API that extensions are expected to
be using (but if they are, the change will be trivial). It seems like a
good time to drop the old macro and just call the standard POSIX
function.
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BKfQgrhHP2DLTohX1WwubaCBHmTzGnAEDPZ-Gug-Xskg%40mail.gmail.com
|
|
|
|
|
|
|
|
|
|
| |
Here we adjust heapgettup() and heapgettup_pagemode() to move the code
that fetches the first block number to scan out into a helper function.
This removes some code duplication.
Author: Melanie Plageman
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here remove some dead code from heapgettup() and heapgettup_pagemode()
which was trying to support NoMovementScanDirection scans. This code can
never be reached as standard_ExecutorRun() never calls ExecutePlan with
NoMovementScanDirection.
Additionally, plans which were scanning an unordered index would use
NoMovementScanDirection rather than ForwardScanDirection. There was no
real need for this, so here we adjust this so we use ForwardScanDirection
for unordered index scans. A comment in pathnodes.h claimed that
NoMovementScanDirection was used for PathKey reasons, but if that was
true, it no longer is, per code in build_index_paths().
This does change the non-text format of the EXPLAIN output so that
unordered index scans now have a "Forward" scan direction rather than
"NoMovement". The text format of EXPLAIN has not changed.
Author: Melanie Plageman
Reviewed-by: Tom Lane, David Rowley
Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_T8QEqZqOL02rpi2bw@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
This reverts commit 4d417992613949af35530b4e8e83670c4e67e1b2. Broad
concerns about regressions caused by eager freezing strategy have been
raised. Whether or not these concerns can be worked through in any time
frame is far from certain.
Discussion: https://postgr.es/m/20230126004347.gepcmyenk2csxrri@awork3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Eager freezing strategy avoids large build-ups of all-visible pages. It
makes VACUUM trigger page-level freezing whenever doing so will enable
the page to become all-frozen in the visibility map. This is useful for
tables that experience continual growth, particularly strict append-only
tables such as pgbench's history table. Eager freezing significantly
improves performance stability by spreading out the cost of freezing
over time, rather than doing most freezing during aggressive VACUUMs.
It complements the insert autovacuum mechanism added by commit b07642db.
VACUUM determines its freezing strategy based on the value of the new
vacuum_freeze_strategy_threshold GUC (or reloption) with logged tables.
Tables that exceed the size threshold use the eager freezing strategy.
Unlogged tables and temp tables always use eager freezing strategy,
since the added cost is negligible there. Non-permanent relations won't
incur any extra overhead in WAL written (for the obvious reason), nor in
pages dirtied (since any extra freezing will only take place on pages
whose PD_ALL_VISIBLE bit needed to be set either way).
VACUUM uses lazy freezing strategy for logged tables that fall under the
GUC size threshold. Page-level freezing triggers based on the criteria
established in commit 1de58df4, which added basic page-level freezing.
Eager freezing is strictly more aggressive than lazy freezing. Settings
like vacuum_freeze_min_age still get applied in just the same way in
every VACUUM, independent of the strategy in use. The only mechanical
difference between eager and lazy freezing strategies is that only the
former applies its own additional criteria to trigger freezing pages.
Note that even lazy freezing strategy will trigger freezing whenever a
page happens to have required that an FPI be written during pruning,
provided that the page will thereby become all-frozen in the visibility
map afterwards (due to the FPI optimization from commit 1de58df4).
The vacuum_freeze_strategy_threshold default setting is 4GB. This is a
relatively low setting that prioritizes performance stability. It will
be reviewed at the end of the Postgres 16 beta period.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Jeff Davis <pgsql@j-davis.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzkFok_6EAHuK39GaW4FjEFQsY=3J0AAd6FXk93u-Xq3Fg@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update some remaining comments in multixact.c that still described SLRU
truncation as happening in the checkpointer, rather than during VACUUM.
Follow-up to commit 5212d447.
Shi yu, with tweaks by me.
Author: Shi yu <shiy.fnst@fujitsu.com>
Discussion: https://postgr.es/m/OSZPR01MB631066BF246F8F74E83222FCFDC69@OSZPR01MB6310.jpnprd01.prod.outlook.com
|
|
|
|
|
|
|
|
|
| |
7fcbf6a and 2ff6555 changed the function signature of XLogPageRead()
but did not update the comment.
XLogReaderRoutine contains up to date information about the API, so no
need to repeat all that at XLogPageRead(), but fix the mentions of the
no longer existing function arguments.
|
|
|
|
|
|
|
|
|
|
|
| |
An upcoming patch by Melanie Plageman does some refactoring work in this
area. Run pgindent on that file now before making any changes so that
it's easier to maintain/evolve each of the individual patches doing the
refactor work. Additionally, add a few new required typedefs to the list
to make it easier to do future pgindent runs on this file during the
refactor work.
Discussion: https://postgr.es/m/CAAKRu_YSOnhKsDyFcqJsKtBSrd32DP-jjXmv7hL0BPD-z0TGXQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
| |
These are all not necessary from a correctness POV. However, in the near
future instr_time will be simplified to an int64, at which point gcc would
otherwise start to warn about the changed places.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20230116023639.rn36vf6ajqmfciua@awork3.anarazel.de
|
|
|
|
|
|
| |
Sirisha Chamarthi and Kyotaro Horiguchi
Discussion: https://postgr.es/m/CAKrAKeXt-=bgm=d+EDmcC9kWoikp8kbVb3LH0K3K+AGGsykpHQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When ending recovery based on recovery_target_xid matching with
recovery_target_inclusive = off, we printed an incorrect timestamp
(always 2000-01-01) in the "recovery stopping before ... transaction"
log message. This is a consequence of sloppy refactoring in
c945af80c: the code to fetch recordXtime out of the commit/abort
record used to be executed unconditionally, but it was changed
to get called only in the RECOVERY_TARGET_TIME case. We need only
flip the order of operations to restore the intended behavior.
Per report from Torsten Förtsch. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/CAKkG4_kUevPqbmyOfLajx7opAQk6Cvwkvx0HRcFjSPfRPTXanA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
Part of a series to remove SHM_QUEUE. ilist.h style lists are more widely used
and have an easier to use interface.
As PROC_QUEUE is now unused, remove it.
Reviewed-by: Thomas Munro <thomas.munro@gmail.com> (in an older version)
Discussion: https://postgr.es/m/20221120055930.t6kl3tyivzhlrzu2@awork3.anarazel.de
Discussion: https://postgr.es/m/20200211042229.msv23badgqljrdg2@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Presently, restore_command uses a different code path than
archive_cleanup_command and recovery_end_command. These code paths
are similar and can be easily combined, as long as it is possible to
identify if a command should:
- Issue a FATAL on signal.
- Exit immediately on SIGTERM.
While on it, this removes src/common/archive.c and its associated
header. Since the introduction of c96de2c, BuildRestoreCommand() has
become a simple wrapper of replace_percent_placeholders() able to call
make_native_path(). This simplifies shell_restore.c as long as
RestoreArchivedFile() includes a call to make_native_path().
Author: Nathan Bossart
Reviewed-by: Andres Freund, Michael Paquier
Discussion: https://postgr.es/m/20221227192449.GA3672473@nathanxps13
|