diff options
author | Alvaro Herrera <alvherre@alvh.no-ip.org> | 2018-06-26 16:38:34 -0400 |
---|---|---|
committer | Alvaro Herrera <alvherre@alvh.no-ip.org> | 2018-06-26 16:48:10 -0400 |
commit | f49a80c481f74fa81407dce8e51dea6956cb64f8 (patch) | |
tree | 1820c0acefa4744043ea48c3876c3029a1e26aaf /contrib | |
parent | 4d54543efa5eb074ead4d0fadb2af4161c943044 (diff) | |
download | postgresql-f49a80c481f74fa81407dce8e51dea6956cb64f8.tar.gz postgresql-f49a80c481f74fa81407dce8e51dea6956cb64f8.zip |
Fix "base" snapshot handling in logical decoding
Two closely related bugs are fixed. First, xmin of logical slots was
advanced too early. During xl_running_xacts processing, xmin of the
slot was set to the oldest running xid in the record, but that's wrong:
actually, snapshots which will be used for not-yet-replayed transactions
might consider older txns as running too, so we need to keep xmin back
for them. The problem wasn't noticed earlier because DDL which allows
to delete tuple (set xmax) while some another not-yet-committed
transaction looks at it is pretty rare, if not unique: e.g. all forms of
ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock
conflicting with any inserts. The included test case (test_decoding's
oldest_xmin) uses ALTER of a composite type, which doesn't have such
interlocking.
To deal with this, we must be able to quickly retrieve oldest xmin
(oldest running xid among all assigned snapshots) from ReorderBuffer. To
fix, add another list of ReorderBufferTXNs to the reorderbuffer, where
transactions are sorted by base-snapshot-LSN. This is slightly
different from the existing (sorted by first-LSN) list, because a
transaction can have an earlier LSN but a later Xmin, if its first
record does not obtain an xmin (eg. xl_xact_assignment). Note this new
list doesn't fully replace the existing txn list: we still need that one
to prevent WAL recycling.
The second issue concerns SnapBuilder snapshots and subtransactions.
SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a
transaction that is known to be a subtxn, which is good in the common
case that the top-level transaction already has one (no point in doing
so), but a bug otherwise. To fix, arrange to transfer the snapshot from
the subtxn to its top-level txn as soon as the kinship gets known.
test_decoding's snapshot_transfer verifies this.
Also, fix a minor memory leak: refcount of toplevel's old base snapshot
was not decremented when the snapshot is transferred from child.
Liberally sprinkle code comments, and rewrite a few existing ones. This
part is my (Álvaro's) contribution to this commit, as I had to write all
those comments in order to understand the existing code and Arseny's
patch.
Reported-by: Arseny Sher <a.sher@postgrespro.ru>
Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
Diffstat (limited to 'contrib')
-rw-r--r-- | contrib/test_decoding/Makefile | 3 | ||||
-rw-r--r-- | contrib/test_decoding/expected/oldest_xmin.out | 27 | ||||
-rw-r--r-- | contrib/test_decoding/expected/snapshot_transfer.out | 49 | ||||
-rw-r--r-- | contrib/test_decoding/specs/oldest_xmin.spec | 37 | ||||
-rw-r--r-- | contrib/test_decoding/specs/snapshot_transfer.spec | 42 |
5 files changed, 157 insertions, 1 deletions
diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile index 1d601d8144c..afcab930f7a 100644 --- a/contrib/test_decoding/Makefile +++ b/contrib/test_decoding/Makefile @@ -50,7 +50,8 @@ regresscheck-install-force: | submake-regress submake-test_decoding temp-install $(pg_regress_installcheck) \ $(REGRESSCHECKS) -ISOLATIONCHECKS=mxact delayed_startup ondisk_startup concurrent_ddl_dml +ISOLATIONCHECKS=mxact delayed_startup ondisk_startup concurrent_ddl_dml \ + oldest_xmin snapshot_transfer isolationcheck: | submake-isolation submake-test_decoding temp-install $(pg_isolation_regress_check) \ diff --git a/contrib/test_decoding/expected/oldest_xmin.out b/contrib/test_decoding/expected/oldest_xmin.out new file mode 100644 index 00000000000..d09342c4bec --- /dev/null +++ b/contrib/test_decoding/expected/oldest_xmin.out @@ -0,0 +1,27 @@ +Parsed test spec with 2 sessions + +starting permutation: s0_begin s0_getxid s1_begin s1_insert s0_alter s0_commit s0_checkpoint s0_get_changes s1_commit s0_vacuum s0_get_changes +step s0_begin: BEGIN; +step s0_getxid: SELECT txid_current() IS NULL; +?column? + +f +step s1_begin: BEGIN; +step s1_insert: INSERT INTO harvest VALUES ((1, 2, 3)); +step s0_alter: ALTER TYPE basket DROP ATTRIBUTE mangos; +step s0_commit: COMMIT; +step s0_checkpoint: CHECKPOINT; +step s0_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); +data + +step s1_commit: COMMIT; +step s0_vacuum: VACUUM FULL; +step s0_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); +data + +BEGIN +table public.harvest: INSERT: fruits[basket]:'(1,2,3)' +COMMIT +?column? + +stop diff --git a/contrib/test_decoding/expected/snapshot_transfer.out b/contrib/test_decoding/expected/snapshot_transfer.out new file mode 100644 index 00000000000..87bed03f766 --- /dev/null +++ b/contrib/test_decoding/expected/snapshot_transfer.out @@ -0,0 +1,49 @@ +Parsed test spec with 2 sessions + +starting permutation: s0_begin s0_begin_sub0 s0_log_assignment s0_sub_get_base_snap s1_produce_new_snap s0_insert s0_end_sub0 s0_commit s0_get_changes +step s0_begin: BEGIN; +step s0_begin_sub0: SAVEPOINT s0; +step s0_log_assignment: SELECT txid_current() IS NULL; +?column? + +f +step s0_sub_get_base_snap: INSERT INTO dummy VALUES (0); +step s1_produce_new_snap: ALTER TABLE harvest ADD COLUMN mangos int; +step s0_insert: INSERT INTO harvest VALUES (1, 2, 3); +step s0_end_sub0: RELEASE SAVEPOINT s0; +step s0_commit: COMMIT; +step s0_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); +data + +BEGIN +table public.dummy: INSERT: i[integer]:0 +table public.harvest: INSERT: apples[integer]:1 pears[integer]:2 mangos[integer]:3 +COMMIT +?column? + +stop + +starting permutation: s0_begin s0_begin_sub0 s0_log_assignment s0_begin_sub1 s0_sub_get_base_snap s1_produce_new_snap s0_insert s0_end_sub1 s0_end_sub0 s0_commit s0_get_changes +step s0_begin: BEGIN; +step s0_begin_sub0: SAVEPOINT s0; +step s0_log_assignment: SELECT txid_current() IS NULL; +?column? + +f +step s0_begin_sub1: SAVEPOINT s1; +step s0_sub_get_base_snap: INSERT INTO dummy VALUES (0); +step s1_produce_new_snap: ALTER TABLE harvest ADD COLUMN mangos int; +step s0_insert: INSERT INTO harvest VALUES (1, 2, 3); +step s0_end_sub1: RELEASE SAVEPOINT s1; +step s0_end_sub0: RELEASE SAVEPOINT s0; +step s0_commit: COMMIT; +step s0_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); +data + +BEGIN +table public.dummy: INSERT: i[integer]:0 +table public.harvest: INSERT: apples[integer]:1 pears[integer]:2 mangos[integer]:3 +COMMIT +?column? + +stop diff --git a/contrib/test_decoding/specs/oldest_xmin.spec b/contrib/test_decoding/specs/oldest_xmin.spec new file mode 100644 index 00000000000..4f8af70aa26 --- /dev/null +++ b/contrib/test_decoding/specs/oldest_xmin.spec @@ -0,0 +1,37 @@ +# Test advancement of the slot's oldest xmin + +setup +{ + SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); -- must be first write in xact + DROP TYPE IF EXISTS basket; + CREATE TYPE basket AS (apples integer, pears integer, mangos integer); + DROP TABLE IF EXISTS harvest; + CREATE TABLE harvest(fruits basket); +} + +teardown +{ + DROP TABLE IF EXISTS harvest; + DROP TYPE IF EXISTS basket; + SELECT 'stop' FROM pg_drop_replication_slot('isolation_slot'); +} + +session "s0" +step "s0_begin" { BEGIN; } +step "s0_getxid" { SELECT txid_current() IS NULL; } +step "s0_alter" { ALTER TYPE basket DROP ATTRIBUTE mangos; } +step "s0_commit" { COMMIT; } +step "s0_checkpoint" { CHECKPOINT; } +step "s0_vacuum" { VACUUM FULL; } +step "s0_get_changes" { SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); } + +session "s1" +step "s1_begin" { BEGIN; } +step "s1_insert" { INSERT INTO harvest VALUES ((1, 2, 3)); } +step "s1_commit" { COMMIT; } + +# Checkpoint with following get_changes forces to advance xmin. ALTER of a +# composite type is a rare form of DDL which allows T1 to see the tuple which +# will be removed (xmax set) before T1 commits. That is, interlocking doesn't +# forbid modifying catalog after someone read it (and didn't commit yet). +permutation "s0_begin" "s0_getxid" "s1_begin" "s1_insert" "s0_alter" "s0_commit" "s0_checkpoint" "s0_get_changes" "s1_commit" "s0_vacuum" "s0_get_changes" diff --git a/contrib/test_decoding/specs/snapshot_transfer.spec b/contrib/test_decoding/specs/snapshot_transfer.spec new file mode 100644 index 00000000000..47db7fd90ae --- /dev/null +++ b/contrib/test_decoding/specs/snapshot_transfer.spec @@ -0,0 +1,42 @@ +# Test snapshot transfer from subxact to top-level and receival of later snaps. + +setup +{ + SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); -- must be first write in xact + DROP TABLE IF EXISTS dummy; + CREATE TABLE dummy(i int); + DROP TABLE IF EXISTS harvest; + CREATE TABLE harvest(apples int, pears int); +} + +teardown +{ + DROP TABLE IF EXISTS harvest; + DROP TABLE IF EXISTS dummy; + SELECT 'stop' FROM pg_drop_replication_slot('isolation_slot'); +} + +session "s0" +step "s0_begin" { BEGIN; } +step "s0_begin_sub0" { SAVEPOINT s0; } +step "s0_log_assignment" { SELECT txid_current() IS NULL; } +step "s0_begin_sub1" { SAVEPOINT s1; } +step "s0_sub_get_base_snap" { INSERT INTO dummy VALUES (0); } +step "s0_insert" { INSERT INTO harvest VALUES (1, 2, 3); } +step "s0_end_sub0" { RELEASE SAVEPOINT s0; } +step "s0_end_sub1" { RELEASE SAVEPOINT s1; } +step "s0_insert2" { INSERT INTO harvest VALUES (1, 2, 3, 4); } +step "s0_commit" { COMMIT; } +step "s0_get_changes" { SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1'); } + +session "s1" +step "s1_produce_new_snap" { ALTER TABLE harvest ADD COLUMN mangos int; } + +# start top-level without base snap, get base snap in subxact, then create new +# snap and make sure it is queued. +permutation "s0_begin" "s0_begin_sub0" "s0_log_assignment" "s0_sub_get_base_snap" "s1_produce_new_snap" "s0_insert" "s0_end_sub0" "s0_commit" "s0_get_changes" + +# In previous test, we firstly associated subxact with xact and only then got +# base snap; now nest one more subxact to get snap first and only then (at +# commit) associate it with toplevel. +permutation "s0_begin" "s0_begin_sub0" "s0_log_assignment" "s0_begin_sub1" "s0_sub_get_base_snap" "s1_produce_new_snap" "s0_insert" "s0_end_sub1" "s0_end_sub0" "s0_commit" "s0_get_changes" |