author      Alvaro Herrera <alvherre@alvh.no-ip.org>    2024-08-12 18:17:56 -0400
committer   Alvaro Herrera <alvherre@alvh.no-ip.org>    2024-08-12 18:17:56 -0400
commit      c899c6839f5de596a316da7fb94e4f917a242b04 (patch)
tree        27e1cf8a80e26293962aa822f3149b326f80893f /src
parent      a459ac504cc62421c08c9ee1ddc3e6f9be61f384 (diff)
Fix creation of partition descriptor during concurrent detach+drop
If a partition undergoes DETACH CONCURRENTLY immediately followed by
DROP, this could cause a problem for a concurrent transaction
recomputing the partition descriptor when running a prepared statement,
because it tries to dereference a pointer to a tuple that's not found
in a catalog scan.  The existing retry logic added in commit
dbca3469ebf8 is sufficient to cope with the overall problem, provided
we don't try to dereference a non-existent heap tuple.

Arguably, the code in RelationBuildPartitionDesc() has been wrong all
along, since no check was added in commit 898e5e3290a7 against
receiving a NULL tuple from the catalog scan; that bug has only become
user-visible with DETACH CONCURRENTLY, which was added in branch 14.
Therefore, even though there's no known mechanism to cause a crash
because of this, backpatch the addition of such a check to all
supported branches.  In branches prior to 14, this would cause the code
to fail with a "missing relpartbound for relation XYZ" error instead of
crashing; that's okay, because there are no reports of such behavior
anyway.

Author: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/18559-b48286d2eacd9a4e@postgresql.org
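To illustrate the pattern the fix relies on, here is a minimal standalone C sketch, not backend code: every name in it (FakeTuple, lookup_partition, get_boundspec) is hypothetical, and the lookup function merely mimics systable_getnext() finding no tuple after a concurrent DETACH plus DROP. The point is the shape of the fix shown in the diff below: never dereference the result of the catalog scan without checking it first, and fall back to the existing one-shot retry when no bound is found.

#include <stdio.h>
#include <stdbool.h>

typedef struct FakeTuple
{
	int			relid;
	const char *relpartbound;	/* NULL stands in for a null datum */
} FakeTuple;

/*
 * Stand-in for the pg_class index scan: returns NULL when the partition
 * has been dropped concurrently, the way the real scan can return an
 * invalid tuple in the scenario described above.
 */
static FakeTuple *
lookup_partition(int relid, bool dropped)
{
	static FakeTuple tup = {42, "FOR VALUES FROM (1) TO (10)"};

	if (dropped || relid != tup.relid)
		return NULL;
	return &tup;
}

static const char *
get_boundspec(int relid, bool dropped)
{
	const char *boundspec = NULL;
	int			retries = 1;

retry:
	{
		FakeTuple  *tuple = lookup_partition(relid, dropped);

		/* The essence of the fix: check the tuple before dereferencing it. */
		if (tuple != NULL && tuple->relpartbound != NULL)
			boundspec = tuple->relpartbound;
	}

	/*
	 * No bound found: the partition was detached or dropped concurrently.
	 * Retry once; if it is still missing, treat it as gone for good.
	 */
	if (boundspec == NULL && retries-- > 0)
		goto retry;

	return boundspec;
}

int
main(void)
{
	printf("live partition: %s\n", get_boundspec(42, false));
	printf("dropped partition: %s\n",
		   get_boundspec(42, true) ? "found" : "no tuple, skipped safely");
	return 0;
}

In the real RelationBuildPartitionDesc(), the retry re-reads the list of partitions, so (as the new comment in the patch puts it) the dropped partition simply is not seen the next time through, rather than failing again as this toy version does.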
Diffstat (limited to 'src')
-rw-r--r--  src/backend/partitioning/partdesc.c  |  30 ++++++++++++++++++++++--------
1 file changed, 22 insertions(+), 8 deletions(-)
diff --git a/src/backend/partitioning/partdesc.c b/src/backend/partitioning/partdesc.c
index c661a303bf1..b4e0ed0e710 100644
--- a/src/backend/partitioning/partdesc.c
+++ b/src/backend/partitioning/partdesc.c
@@ -209,6 +209,10 @@ retry:
* shared queue. We solve this problem by reading pg_class directly
* for the desired tuple.
*
+ * If the partition recently detached is also dropped, we get no tuple
+ * from the scan. In that case, we also retry, and next time through
+ * here, we don't see that partition anymore.
+ *
* The other problem is that DETACH CONCURRENTLY is in the process of
* removing a partition, which happens in two steps: first it marks it
* as "detach pending", commits, then unsets relpartbound. If
@@ -223,8 +227,6 @@ retry:
Relation pg_class;
SysScanDesc scan;
ScanKeyData key[1];
- Datum datum;
- bool isnull;
pg_class = table_open(RelationRelationId, AccessShareLock);
ScanKeyInit(&key[0],
@@ -233,17 +235,29 @@ retry:
ObjectIdGetDatum(inhrelid));
scan = systable_beginscan(pg_class, ClassOidIndexId, true,
NULL, 1, key);
+
+ /*
+ * We could get one tuple from the scan (the normal case), or zero
+ * tuples if the table has been dropped meanwhile.
+ */
tuple = systable_getnext(scan);
- datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
- RelationGetDescr(pg_class), &isnull);
- if (!isnull)
- boundspec = stringToNode(TextDatumGetCString(datum));
+ if (HeapTupleIsValid(tuple))
+ {
+ Datum datum;
+ bool isnull;
+
+ datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
+ RelationGetDescr(pg_class), &isnull);
+ if (!isnull)
+ boundspec = stringToNode(TextDatumGetCString(datum));
+ }
systable_endscan(scan);
table_close(pg_class, AccessShareLock);
/*
- * If we still don't get a relpartbound value, then it must be
- * because of DETACH CONCURRENTLY. Restart from the top, as
+ * If we still don't get a relpartbound value (either because
+ * boundspec is null or because there was no tuple), then it must
+ * be because of DETACH CONCURRENTLY. Restart from the top, as
* explained above. We only do this once, for two reasons: first,
* only one DETACH CONCURRENTLY session could affect us at a time,
* since each of them would have to wait for the snapshot under