diff options
author | Tom Lane <tgl@sss.pgh.pa.us> | 2016-06-21 18:38:20 -0400 |
---|---|---|
committer | Tom Lane <tgl@sss.pgh.pa.us> | 2016-06-21 18:38:20 -0400 |
commit | 8b9d323cb9810109e3e5aab1ead427cbbb7aa77e (patch) | |
tree | 522428b6759a078fba63be994ae392fbf79c34e9 /src/backend/optimizer/plan/createplan.c | |
parent | 936b62ddf247c26e8cc4fca34bd8a4c2e65c09fd (diff) | |
download | postgresql-8b9d323cb9810109e3e5aab1ead427cbbb7aa77e.tar.gz postgresql-8b9d323cb9810109e3e5aab1ead427cbbb7aa77e.zip |
Refactor planning of projection steps that don't need a Result plan node.
The original upper-planner-pathification design (commit 3fc6e2d7f5b652b4)
assumed that we could always determine during Path formation whether or not
we would need a Result plan node to perform projection of a targetlist.
That turns out not to work very well, though, because createplan.c still
has some responsibilities for choosing the specific target list associated
with sorting/grouping nodes (in particular it might choose to add resjunk
columns for sorting). We might not ever refactor that --- doing so would
push more work into Path formation, which isn't attractive --- and we
certainly won't do so for 9.6. So, while create_projection_path and
apply_projection_to_path can tell for sure what will happen if the subpath
is projection-capable, they can't tell for sure when it isn't. This is at
least a latent bug in apply_projection_to_path, which might think it can
apply a target to a non-projecting node when the node will end up computing
something different.
Also, I'd tied the creation of a ProjectionPath node to whether or not a
Result is needed, but it turns out that we sometimes need a ProjectionPath
node anyway to avoid modifying a possibly-shared subpath node. Callers had
to use create_projection_path for such cases, and we added code to them
that knew about the potential omission of a Result node and attempted to
adjust the cost estimates for that. That was uncertainly correct and
definitely ugly/unmaintainable.
To fix, have create_projection_path explicitly check whether a Result
is needed and adjust its cost estimate accordingly, though it creates
a ProjectionPath in either case. apply_projection_to_path is now mostly
just an optimized version that can avoid creating an extra Path node when
the input is known to not be shared with any other live path. (There
is one case that create_projection_path doesn't handle, which is pushing
parallel-safe expressions below a Gather node. We could make it do that
by duplicating the GatherPath, but there seems no need as yet.)
create_projection_plan still has to recheck the tlist-match condition,
which means that if the matching situation does get changed by createplan.c
then we'll have made a slightly incorrect cost estimate. But there seems
no help for that in the near term, and I doubt it occurs often enough,
let alone would change planning decisions often enough, to be worth
stressing about.
I added a "dummypp" field to ProjectionPath to track whether
create_projection_path thinks a Result is needed. This is not really
necessary as-committed because create_projection_plan doesn't look at the
flag; but it seems like a good idea to remember what we thought when
forming the cost estimate, if only for debugging purposes.
In passing, get rid of the target_parallel parameter added to
apply_projection_to_path by commit 54f5c5150. I don't think that's a good
idea because it involves callers in what should be an internal decision,
and opens us up to missing optimization opportunities if callers think they
don't need to provide a valid flag, as most don't. For the moment, this
just costs us an extra has_parallel_hazard call when planning a Gather.
If that starts to look expensive, I think a better solution would be to
teach PathTarget to carry/cache knowledge of parallel-safety of its
contents.
Diffstat (limited to 'src/backend/optimizer/plan/createplan.c')
-rw-r--r-- | src/backend/optimizer/plan/createplan.c | 35 |
1 files changed, 21 insertions, 14 deletions
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index ab8df76a6ed..b2db6e8d035 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -1409,8 +1409,9 @@ create_gather_plan(PlannerInfo *root, GatherPath *best_path) /* * create_projection_plan * - * Create a Result node to do a projection step and (recursively) plans - * for its subpaths. + * Create a plan tree to do a projection step and (recursively) plans + * for its subpaths. We may need a Result node for the projection, + * but sometimes we can just let the subplan do the work. */ static Plan * create_projection_plan(PlannerInfo *root, ProjectionPath *best_path) @@ -1425,31 +1426,37 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path) tlist = build_path_tlist(root, &best_path->path); /* - * We might not really need a Result node here. There are several ways - * that this can happen. For example, MergeAppend doesn't project, so we - * would have thought that we needed a projection to attach resjunk sort - * columns to its output ... but create_merge_append_plan might have added - * those same resjunk sort columns to both MergeAppend and its children. - * Alternatively, apply_projection_to_path might have created a projection - * path as the subpath of a Gather node even though the subpath was - * projection-capable. So, if the subpath is capable of projection or the - * desired tlist is the same expression-wise as the subplan's, just jam it - * in there. We'll have charged for a Result that doesn't actually appear - * in the plan, but that's better than having a Result we don't need. + * We might not really need a Result node here, either because the subplan + * can project or because it's returning the right list of expressions + * anyway. Usually create_projection_path will have detected that and set + * dummypp if we don't need a Result; but its decision can't be final, + * because some createplan.c routines change the tlists of their nodes. + * (An example is that create_merge_append_plan might add resjunk sort + * columns to a MergeAppend.) So we have to recheck here. If we do + * arrive at a different answer than create_projection_path did, we'll + * have made slightly wrong cost estimates; but label the plan with the + * cost estimates we actually used, not "corrected" ones. (XXX this could + * be cleaned up if we moved more of the sortcolumn setup logic into Path + * creation, but that would add expense to creating Paths we might end up + * not using.) */ if (is_projection_capable_path(best_path->subpath) || tlist_same_exprs(tlist, subplan->targetlist)) { + /* Don't need a separate Result, just assign tlist to subplan */ plan = subplan; plan->targetlist = tlist; - /* Adjust cost to match what we thought during planning */ + /* Label plan with the estimated costs we actually used */ plan->startup_cost = best_path->path.startup_cost; plan->total_cost = best_path->path.total_cost; + plan->plan_rows = best_path->path.rows; + plan->plan_width = best_path->path.pathtarget->width; /* ... but be careful not to munge subplan's parallel-aware flag */ } else { + /* We need a Result node */ plan = (Plan *) make_result(tlist, NULL, subplan); copy_generic_path_info(plan, (Path *) best_path); |