diff options
author | Tom Lane <tgl@sss.pgh.pa.us> | 2021-03-31 11:52:34 -0400 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2021-03-31 11:52:37 -0400 |
commit | 86dc90056dfdbd9d1b891718d2e5614e3e432f35 (patch) | |
tree | 8d281c58f67e90961688fd311673fbdb2f8c35c7 /doc/src | |
parent | 055fee7eb4dcc78e58672aef146334275e1cc40d (diff) | |
download | postgresql-86dc90056dfdbd9d1b891718d2e5614e3e432f35.tar.gz postgresql-86dc90056dfdbd9d1b891718d2e5614e3e432f35.zip |
Rework planning and execution of UPDATE and DELETE.
This patch makes two closely related sets of changes:
1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns. The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree. The disadvantage of course
is an extra fetch of each tuple to be updated. However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost. At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.
2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that. To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information. This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.
Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy. Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD. (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)
There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.
FDW authors should note several API changes:
* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query. Call
add_row_identity_var() instead of manipulating the parse tree directly.
You might want to reconsider exactly what you're adding, too.
* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable. See postgres_fdw for sample code.
* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/ddl.sgml | 3 | ||||
-rw-r--r-- | doc/src/sgml/fdwhandler.sgml | 65 | ||||
-rw-r--r-- | doc/src/sgml/perform.sgml | 47 | ||||
-rw-r--r-- | doc/src/sgml/postgres-fdw.sgml | 2 |
4 files changed, 63 insertions, 54 deletions
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index f073fbafd34..4a506bd6f00 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -4823,8 +4823,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01'; well, provided that typical queries allow the query planner to prune all but a small number of partitions. Planning times become longer and memory consumption becomes higher when more partitions remain after the planner - performs partition pruning. This is particularly true for the - <command>UPDATE</command> and <command>DELETE</command> commands. Another + performs partition pruning. Another reason to be concerned about having a large number of partitions is that the server's memory consumption may grow significantly over time, especially if many sessions touch large numbers of partitions. diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml index 635c9ec559a..0f2397df497 100644 --- a/doc/src/sgml/fdwhandler.sgml +++ b/doc/src/sgml/fdwhandler.sgml @@ -424,7 +424,8 @@ GetForeignUpperPaths(PlannerInfo *root, <para> <programlisting> void -AddForeignUpdateTargets(Query *parsetree, +AddForeignUpdateTargets(PlannerInfo *root, + Index rtindex, RangeTblEntry *target_rte, Relation target_relation); </programlisting> @@ -440,27 +441,31 @@ AddForeignUpdateTargets(Query *parsetree, </para> <para> - To do that, add <structname>TargetEntry</structname> items to - <literal>parsetree->targetList</literal>, containing expressions for the - extra values to be fetched. Each such entry must be marked - <structfield>resjunk</structfield> = <literal>true</literal>, and must have a distinct - <structfield>resname</structfield> that will identify it at execution time. - Avoid using names matching <literal>ctid<replaceable>N</replaceable></literal>, - <literal>wholerow</literal>, or - <literal>wholerow<replaceable>N</replaceable></literal>, as the core system can - generate junk columns of these names. - If the extra expressions are more complex than simple Vars, they - must be run through <function>eval_const_expressions</function> - before adding them to the target list. - </para> - - <para> - Although this function is called during planning, the - information provided is a bit different from that available to other - planning routines. - <literal>parsetree</literal> is the parse tree for the <command>UPDATE</command> or - <command>DELETE</command> command, while <literal>target_rte</literal> and - <literal>target_relation</literal> describe the target foreign table. + To do that, construct a <structname>Var</structname> representing + an extra value you need, and pass it + to <function>add_row_identity_var</function>, along with a name for + the junk column. (You can do this more than once if several columns + are needed.) You must choose a distinct junk column name for each + different <structname>Var</structname> you need, except + that <structname>Var</structname>s that are identical except for + the <structfield>varno</structfield> field can and should share a + column name. + The core system uses the junk column names + <literal>tableoid</literal> for a + table's <structfield>tableoid</structfield> column, + <literal>ctid</literal> + or <literal>ctid<replaceable>N</replaceable></literal> + for <structfield>ctid</structfield>, + <literal>wholerow</literal> + for a whole-row <structname>Var</structname> marked with + <structfield>vartype</structfield> = <type>RECORD</type>, + and <literal>wholerow<replaceable>N</replaceable></literal> + for a whole-row <structname>Var</structname> with + <structfield>vartype</structfield> equal to the table's declared rowtype. + Re-use these names when you can (the planner will combine duplicate + requests for identical junk columns). If you need another kind of + junk column besides these, it might be wise to choose a name prefixed + with your extension name, to avoid conflicts against other FDWs. </para> <para> @@ -495,8 +500,8 @@ PlanForeignModify(PlannerInfo *root, <literal>resultRelation</literal> identifies the target foreign table by its range table index. <literal>subplan_index</literal> identifies which target of the <structname>ModifyTable</structname> plan node this is, counting from zero; - use this if you want to index into <literal>plan->plans</literal> or other - substructure of the <literal>plan</literal> node. + use this if you want to index into per-target-relation substructures of the + <literal>plan</literal> node. </para> <para> @@ -703,10 +708,14 @@ ExecForeignUpdate(EState *estate, <literal>slot</literal> contains the new data for the tuple; it will match the row-type definition of the foreign table. <literal>planSlot</literal> contains the tuple that was generated by the - <structname>ModifyTable</structname> plan node's subplan; it differs from - <literal>slot</literal> in possibly containing additional <quote>junk</quote> - columns. In particular, any junk columns that were requested by - <function>AddForeignUpdateTargets</function> will be available from this slot. + <structname>ModifyTable</structname> plan node's subplan. Unlike + <literal>slot</literal>, this tuple contains only the new values for + columns changed by the query, so do not rely on attribute numbers of the + foreign table to index into <literal>planSlot</literal>. + Also, <literal>planSlot</literal> typically contains + additional <quote>junk</quote> columns. In particular, any junk columns + that were requested by <function>AddForeignUpdateTargets</function> will + be available from this slot. </para> <para> diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml index aca1fe86a7c..70cb5a62e15 100644 --- a/doc/src/sgml/perform.sgml +++ b/doc/src/sgml/perform.sgml @@ -773,13 +773,14 @@ EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100; QUERY PLAN -------------------------------------------------------------------&zwsp;------------------------------------------------------------- - Update on tenk1 (cost=5.07..229.46 rows=0 width=0) (actual time=14.628..14.628 rows=0 loops=1) - -> Bitmap Heap Scan on tenk1 (cost=5.07..229.46 rows=101 width=250) (actual time=0.101..0.439 rows=100 loops=1) + Update on tenk1 (cost=5.08..230.08 rows=0 width=0) (actual time=3.791..3.792 rows=0 loops=1) + -> Bitmap Heap Scan on tenk1 (cost=5.08..230.08 rows=102 width=10) (actual time=0.069..0.513 rows=100 loops=1) Recheck Cond: (unique1 < 100) - -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0) (actual time=0.043..0.043 rows=100 loops=1) + Heap Blocks: exact=90 + -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.05 rows=102 width=0) (actual time=0.036..0.037 rows=300 loops=1) Index Cond: (unique1 < 100) - Planning time: 0.079 ms - Execution time: 14.727 ms + Planning Time: 0.113 ms + Execution Time: 3.850 ms ROLLBACK; </screen> @@ -807,30 +808,30 @@ ROLLBACK; <screen> EXPLAIN UPDATE parent SET f2 = f2 + 1 WHERE f1 = 101; - QUERY PLAN --------------------------------------------------------------------&zwsp;---------------- - Update on parent (cost=0.00..24.53 rows=0 width=0) - Update on parent - Update on child1 - Update on child2 - Update on child3 - -> Seq Scan on parent (cost=0.00..0.00 rows=1 width=14) - Filter: (f1 = 101) - -> Index Scan using child1_f1_key on child1 (cost=0.15..8.17 rows=1 width=14) - Index Cond: (f1 = 101) - -> Index Scan using child2_f1_key on child2 (cost=0.15..8.17 rows=1 width=14) - Index Cond: (f1 = 101) - -> Index Scan using child3_f1_key on child3 (cost=0.15..8.17 rows=1 width=14) - Index Cond: (f1 = 101) + QUERY PLAN +-------------------------------------------------------------------&zwsp;----------------------------------- + Update on parent (cost=0.00..24.59 rows=0 width=0) + Update on parent parent_1 + Update on child1 parent_2 + Update on child2 parent_3 + Update on child3 parent_4 + -> Result (cost=0.00..24.59 rows=4 width=14) + -> Append (cost=0.00..24.54 rows=4 width=14) + -> Seq Scan on parent parent_1 (cost=0.00..0.00 rows=1 width=14) + Filter: (f1 = 101) + -> Index Scan using child1_pkey on child1 parent_2 (cost=0.15..8.17 rows=1 width=14) + Index Cond: (f1 = 101) + -> Index Scan using child2_pkey on child2 parent_3 (cost=0.15..8.17 rows=1 width=14) + Index Cond: (f1 = 101) + -> Index Scan using child3_pkey on child3 parent_4 (cost=0.15..8.17 rows=1 width=14) + Index Cond: (f1 = 101) </screen> In this example the Update node needs to consider three child tables as well as the originally-mentioned parent table. So there are four input scanning subplans, one per table. For clarity, the Update node is annotated to show the specific target tables that will be updated, in the - same order as the corresponding subplans. (These annotations are new as - of <productname>PostgreSQL</productname> 9.5; in prior versions the reader had to - intuit the target tables by inspecting the subplans.) + same order as the corresponding subplans. </para> <para> diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml index a1b426c50ba..c21e9be209b 100644 --- a/doc/src/sgml/postgres-fdw.sgml +++ b/doc/src/sgml/postgres-fdw.sgml @@ -78,7 +78,7 @@ invoked by <command>UPDATE</command> statements executed on partitioned tables, but it currently does not handle the case where a remote partition chosen to insert a moved row into is also an <command>UPDATE</command> - target partition that will be updated later. + target partition that will be updated elsewhere in the same command. </para> <para> |