author | Alvaro Herrera <alvherre@alvh.no-ip.org> | 2021-03-25 18:00:28 -0300 |
---|---|---|
committer | Alvaro Herrera <alvherre@alvh.no-ip.org> | 2021-03-25 18:00:28 -0300 |
commit | 71f4c8c6f74ba021e55d35b1128d22fb8c6e1629 (patch) | |
tree | c53d5e70ef2c8ec1723c9fb62fc8174ba6381e29 /src/backend/executor/execPartition.c | |
parent | 650d623530c884c087c565f1d3b8cd76f8fe2b95 (diff) | |
ALTER TABLE ... DETACH PARTITION ... CONCURRENTLY
Allow a partition to be detached from its partitioned table without
blocking concurrent queries, by running in two transactions and only
requiring ShareUpdateExclusive on the partitioned table.
Because it runs in two transactions, it cannot be used in a transaction
block. This is the main reason to use dedicated syntax: so that users
can choose to use the original mode if they need it. In addition, it
doesn't work when a default partition exists, because an exclusive lock
would still need to be obtained on it in order to change its partition
constraint.
In case the second transaction is cancelled or a crash occurs, there's
ALTER TABLE .. DETACH PARTITION .. FINALIZE, which executes the final
steps.
The main trick to make this work is the addition of column
pg_inherits.inhdetachpending, which is initially false and can only be
set to true in the first part of this command. Once that is committed,
concurrent transactions that use a PartitionDirectory will include or
ignore partitions so marked: in the optimizer they are ignored if the row
is marked committed for the snapshot; in the executor they are always
included. As a result, and because of the way PartitionDirectory caches
partition descriptors, queries that were planned before the detach will
see the rows in the detached partition and queries that are planned after
the detach won't.
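The include-or-ignore rule described above can be pictured with a small toy model. This is only a sketch under stated assumptions: `ToyPartition` and `toy_build_partdesc` are hypothetical names invented for illustration and are not PostgreSQL structures or functions, and the snapshot check ("if the row is marked committed for the snapshot") is left out, so the flag is treated as already visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one pg_inherits row; not a PostgreSQL struct. */
typedef struct ToyPartition
{
	const char *name;
	bool		detach_pending;	/* models pg_inherits.inhdetachpending */
} ToyPartition;

/*
 * Fill "out" with the partitions a toy partition descriptor would list.
 *
 * include_detached = true keeps every partition (the executor-side rule);
 * include_detached = false skips partitions marked detach-pending (the
 * optimizer-side rule, assuming the marking is visible to the snapshot).
 *
 * "out" must have room for nparts entries.  Returns the number written.
 */
size_t
toy_build_partdesc(const ToyPartition *parts, size_t nparts,
				   bool include_detached, const ToyPartition **out)
{
	size_t		n = 0;

	assert(parts != NULL && out != NULL);

	for (size_t i = 0; i < nparts; i++)
	{
		if (parts[i].detach_pending && !include_detached)
			continue;			/* planner view: pretend it is already gone */
		out[n++] = &parts[i];
	}
	return n;
}
```

Calling the function twice over the same array, once with `include_detached` set to false and once with true, gives the two views the message describes: a query planned after the first transaction commits no longer lists the partition, while an executor running an older plan still does.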
A CHECK constraint is created that duplicates the partition constraint.
This is probably not strictly necessary, and some users will prefer to
remove it afterwards, but if the partition is re-attached to a
partitioned table, the constraint needn't be rechecked.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200803234854.GA24158@alvherre.pgsql
Diffstat (limited to 'src/backend/executor/execPartition.c')
-rw-r--r-- | src/backend/executor/execPartition.c | 29 |
1 files changed, 23 insertions, 6 deletions
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index b8da4c5967d..619aaffae43 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -569,6 +569,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, EState *estate,
 					  int partidx)
 {
 	ModifyTable *node = (ModifyTable *) mtstate->ps.plan;
+	Oid			partOid = dispatch->partdesc->oids[partidx];
 	Relation	partrel;
 	int			firstVarno = mtstate->resultRelInfo[0].ri_RangeTableIndex;
 	Relation	firstResultRel = mtstate->resultRelInfo[0].ri_RelationDesc;
@@ -579,7 +580,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate, EState *estate,
 
 	oldcxt = MemoryContextSwitchTo(proute->memcxt);
 
-	partrel = table_open(dispatch->partdesc->oids[partidx], RowExclusiveLock);
+	partrel = table_open(partOid, RowExclusiveLock);
 
 	leaf_part_rri = makeNode(ResultRelInfo);
 	InitResultRelInfo(leaf_part_rri,
@@ -1065,9 +1066,21 @@ ExecInitPartitionDispatchInfo(EState *estate,
 	int			dispatchidx;
 	MemoryContext oldcxt;
 
+	/*
+	 * For data modification, it is better that executor does not include
+	 * partitions being detached, except in snapshot-isolation mode. This
+	 * means that a read-committed transaction immediately gets a "no
+	 * partition for tuple" error when a tuple is inserted into a partition
+	 * that's being detached concurrently, but a transaction in repeatable-
+	 * read mode can still use the partition. Note that because partition
+	 * detach uses ShareLock on the partition (which conflicts with DML),
+	 * we're certain that the detach won't be able to complete until any
+	 * inserting transaction is done.
+	 */
 	if (estate->es_partition_directory == NULL)
 		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt);
+			CreatePartitionDirectory(estate->es_query_cxt,
+									 IsolationUsesXactSnapshot());
 
 	oldcxt = MemoryContextSwitchTo(proute->memcxt);
 
@@ -1645,9 +1658,10 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 	ListCell   *lc;
 	int			i;
 
+	/* Executor must always include detached partitions */
 	if (estate->es_partition_directory == NULL)
 		estate->es_partition_directory =
-			CreatePartitionDirectory(estate->es_query_cxt);
+			CreatePartitionDirectory(estate->es_query_cxt, true);
 
 	n_part_hierarchies = list_length(partitionpruneinfo->prune_infos);
 	Assert(n_part_hierarchies > 0);
@@ -1713,9 +1727,12 @@ ExecCreatePartitionPruneState(PlanState *planstate,
 											partrel);
 
 			/*
-			 * Initialize the subplan_map and subpart_map. Since detaching a
-			 * partition requires AccessExclusiveLock, no partitions can have
-			 * disappeared, nor can the bounds for any partition have changed.
+			 * Initialize the subplan_map and subpart_map.
+			 *
+			 * Because we request detached partitions to be included, and
+			 * detaching waits for old transactions, it is safe to assume that
+			 * no partitions have disappeared since this query was planned.
+			 *
 			 * However, new partitions may have been added.
 			 */
 			Assert(partdesc->nparts >= pinfo->nparts);
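The two `CreatePartitionDirectory` call sites in this diff pass different values for the new second argument. The sketch below restates that choice as a standalone function. It is an illustration only: `ToyIsoLevel`, `ToyCaller` and `toy_include_detached` are hypothetical names, and the sketch assumes that `IsolationUsesXactSnapshot()` is true exactly for repeatable read and serializable, which matches the comment in the diff but is not shown by it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical isolation levels, ordered from weakest to strongest. */
typedef enum ToyIsoLevel
{
	TOY_READ_UNCOMMITTED,
	TOY_READ_COMMITTED,
	TOY_REPEATABLE_READ,
	TOY_SERIALIZABLE
} ToyIsoLevel;

/* Which executor code path is asking for a partition directory. */
typedef enum ToyCaller
{
	TOY_TUPLE_ROUTING,			/* models ExecInitPartitionDispatchInfo */
	TOY_RUNTIME_PRUNING			/* models ExecCreatePartitionPruneState */
} ToyCaller;

/*
 * Decide whether partitions that are being detached should be listed.
 *
 * Run-time pruning always includes them, so the plan's partition list
 * cannot shrink underneath it.  Tuple routing includes them only when the
 * transaction uses a transaction-wide snapshot (assumed here to mean
 * repeatable read or serializable).
 */
bool
toy_include_detached(ToyCaller caller, ToyIsoLevel level)
{
	assert(level <= TOY_SERIALIZABLE);

	if (caller == TOY_RUNTIME_PRUNING)
		return true;

	return level >= TOY_REPEATABLE_READ;
}
```

Under this model, a read-committed insert routed to a partition whose detach is pending does not find it, which corresponds to the "no partition for tuple" error mentioned in the comment, while a repeatable-read transaction still can.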