author | Peter Geoghegan <pg@bowt.ie> | 2019-12-22 19:57:35 -0800 |
---|---|---|
committer | Peter Geoghegan <pg@bowt.ie> | 2019-12-22 19:57:35 -0800 |
commit | fe97c61c8777858cc1a271e657a7d812e100ef00 (patch) | |
tree | 07650ce203947221246cd0343ca708a3b2d88c25 /src | |
parent | b265aa1f39b672d263e37bdb715516d32128d0c4 (diff) | |
Update nbtree LP_DEAD item deletion comments.

Comments about the consequences of clearing the BTP_HAS_GARBAGE page
flag bit that apply only to VACUUM were added to code that deals with
opportunistic deletion of LP_DEAD items by commit a760893d. The same
comment block was added to both _bt_delitems_vacuum() and
_bt_delitems_delete(). Correct _bt_delitems_delete()'s copy of the
comment block.

_bt_delitems_delete() reliably deletes items that were found by caller
to have their LP_DEAD bit set. There is no question about whether or
not unsetting the BTP_HAS_GARBAGE bit can miss some LP_DEAD items that
were set recently.

Also tweak a related section of the nbtree README.

Diffstat (limited to 'src')
-rw-r--r-- | src/backend/access/nbtree/README | 8 |
-rw-r--r-- | src/backend/access/nbtree/nbtpage.c | 11 |
2 files changed, 6 insertions, 13 deletions
diff --git a/src/backend/access/nbtree/README b/src/backend/access/nbtree/README
index 334ef76e89f..c60a4d0d9e9 100644
--- a/src/backend/access/nbtree/README
+++ b/src/backend/access/nbtree/README
@@ -559,15 +559,15 @@ writer cannot observe the incomplete split flag before the first writer
 finishes the split. If we let concurrent writers on the primary observe an
 incomplete split flag on the same page, each writer would attempt to
 complete the unfinished split, corrupting the parent page. (Similarly,
-replay of page deletion records does not hold a write lock on the leaf
-page throughout; only the primary needs to blocks out concurrent writers
-that insert on to the page being deleted.)
+replay of page deletion records does not hold a write lock on the target
+leaf page throughout; only the primary needs to block out concurrent
+writers that insert on to the page being deleted.)
 
 During recovery all index scans start with ignore_killed_tuples = false
 and we never set kill_prior_tuple. We do this because the oldest xmin
 on the standby server can be older than the oldest xmin on the master
 server, which means tuples can be marked LP_DEAD even when they are
-still visible on the standby. We don't WAL log tuple LP_DEAD bits, but
+still visible on the standby. We don't WAL log tuple LP_DEAD bits, but
 they can still appear in the standby because of full page writes. So
 we must always ignore them in standby, and that means it's not worth
 setting them either. (When LP_DEAD-marked tuples are eventually deleted
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 404bad7da28..7bfae3c90ff 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -1074,15 +1074,8 @@ _bt_delitems_delete(Relation rel, Buffer buf,
 
 	/*
 	 * Unlike _bt_delitems_vacuum, we *must not* clear the vacuum cycle ID,
-	 * because this is not called by VACUUM.
-	 */
-
-	/*
-	 * Mark the page as not containing any LP_DEAD items. This is not
-	 * certainly true (there might be some that have recently been marked, but
-	 * weren't included in our target-item list), but it will almost always be
-	 * true and it doesn't seem worth an additional page scan to check it.
-	 * Remember that BTP_HAS_GARBAGE is only a hint anyway.
+	 * because this is not called by VACUUM. Just clear the BTP_HAS_GARBAGE
+	 * page flag, since we deleted all items with their LP_DEAD bit set.
 	 */
 	opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 	opaque->btpo_flags &= ~BTP_HAS_GARBAGE;
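
The reasoning in the commit message (once every item with its dead bit set has been removed, clearing the page-level garbage hint is accurate) can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ToyPage`, `TOY_ITEM_DEAD`, `TOY_HAS_GARBAGE`, and the `toy_*` functions are hypothetical stand-ins invented for illustration, not PostgreSQL's real page layout or API. For brevity the sketch also folds the caller's search for dead items and the deletion itself into one function, whereas the commit message describes the caller finding the items first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins; these are NOT PostgreSQL's real definitions. */
#define TOY_ITEM_DEAD    0x01u   /* plays the role of an item's LP_DEAD bit */
#define TOY_HAS_GARBAGE  0x40u   /* plays the role of BTP_HAS_GARBAGE */
#define TOY_MAX_ITEMS    8

typedef struct ToyPage
{
    uint16_t flags;                      /* page-level hint flags */
    size_t   nitems;                     /* number of used slots */
    uint8_t  item_flags[TOY_MAX_ITEMS];  /* per-item flag bits */
    int      item_keys[TOY_MAX_ITEMS];   /* per-item payload */
} ToyPage;

/* Mark one item dead and set the page-level hint that garbage exists. */
static void
toy_mark_dead(ToyPage *page, size_t idx)
{
    assert(idx < page->nitems);
    page->item_flags[idx] |= TOY_ITEM_DEAD;
    page->flags |= TOY_HAS_GARBAGE;
}

/* Count the items that still have their dead bit set. */
static size_t
toy_count_dead(const ToyPage *page)
{
    size_t ndead = 0;

    for (size_t i = 0; i < page->nitems; i++)
    {
        if (page->item_flags[i] & TOY_ITEM_DEAD)
            ndead++;
    }
    return ndead;
}

/*
 * Remove every item whose dead bit is set, compacting the arrays in place,
 * then clear the page-level hint.  Returns the number of items removed.
 */
static size_t
toy_delete_dead_items(ToyPage *page)
{
    size_t kept = 0;
    size_t removed;

    for (size_t i = 0; i < page->nitems; i++)
    {
        if (page->item_flags[i] & TOY_ITEM_DEAD)
            continue;           /* drop this item */
        page->item_flags[kept] = page->item_flags[i];
        page->item_keys[kept] = page->item_keys[i];
        kept++;
    }
    removed = page->nitems - kept;
    page->nitems = kept;

    /* No dead items remain, so clearing the hint is accurate here. */
    page->flags &= (uint16_t) ~TOY_HAS_GARBAGE;
    return removed;
}
```

A caller would mark items with `toy_mark_dead()`, call `toy_delete_dead_items()`, and could then confirm that `toy_count_dead()` returns zero while the `TOY_HAS_GARBAGE` bit is unset.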