aboutsummaryrefslogtreecommitdiff
path: root/src/backend/optimizer/plan/planner.c
Commit message (Collapse)AuthorAge
...
* JIT tuple deforming in LLVM JIT provider.Andres Freund2018-03-26
| | | | | | | | | | | | | | | | | | | | | Performing JIT compilation for deforming gains performance benefits over unJITed deforming from compile-time knowledge of the tuple descriptor. Fixed column widths, NOT NULLness, etc can be taken advantage of. Right now the JITed deforming is only used when deforming tuples as part of expression evaluation (and obviously only if the descriptor is known). It's likely to be beneficial in other cases, too. By default tuple deforming is JITed whenever an expression is JIT compiled. There's a separate boolean GUC controlling it, but that's expected to be primarily useful for development and benchmarking. Docs will follow in a later commit containing docs for the whole JIT feature. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
* Add expression compilation support to LLVM JIT provider.Andres Freund2018-03-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In addition to the interpretation of expressions (which back evaluation of WHERE clauses, target list projection, aggregates transition values etc) support compiling expressions to native code, using the infrastructure added in earlier commits. To avoid duplicating a lot of code, only support emitting code for cases that are likely to be performance critical. For expression steps that aren't deemed that, use the existing interpreter. The generated code isn't great - some architectural changes are required to address that. But this already yields a significant speedup for some analytics queries, particularly with WHERE clauses filtering a lot, or computing multiple aggregates. Author: Andres Freund Tested-By: Thomas Munro Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de Disable JITing for VALUES() nodes. VALUES() nodes are only ever executed once. This is primarily helpful for debugging, when forcing JITing even for cheap queries. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
* Basic planner and executor integration for JIT.Andres Freund2018-03-22
| | | | | | | | | | | | | | | | | | | | | This adds simple cost based plan time decision about whether JIT should be performed. jit_above_cost, jit_optimize_above_cost are compared with the total cost of a plan, and if the cost is above them JIT is performed / optimization is performed respectively. For that PlannedStmt and EState have a jitFlags (es_jit_flags) field that stores information about what JIT operations should be performed. EState now also has a new es_jit field, which can store a JitContext. When there are no errors the context is released in standard_ExecutorEnd(). It is likely that the default values for jit_[optimize_]above_cost will need to be adapted further, but in my test these values seem to work reasonably. Author: Andres Freund, with feedback by Peter Eisentraut Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
* Implement partition-wise grouping/aggregation.Robert Haas2018-03-22
| | | | | | | | | | | | | | | | | | | | | If the partition keys of input relation are part of the GROUP BY clause, all the rows belonging to a given group come from a single partition. This allows aggregation/grouping over a partitioned relation to be broken down * into aggregation/grouping on each partition. This should be no worse, and often better, than the normal approach. If the GROUP BY clause does not contain all the partition keys, we can still perform partial aggregation for each partition and then finalize aggregation after appending the partial results. This is less certain to be a win, but it's still useful. Jeevan Chalke, Ashutosh Bapat, Robert Haas. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, and Rafia Sabih. Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
* Repair crash with unsortable grouping sets.Andrew Gierth2018-03-21
| | | | | | | | | | | If there were multiple grouping sets, none of them empty, all of which were unsortable, then an oversight in consider_groupingsets_paths led to a null pointer dereference. Fix, and add a regression test for this case. Per report from Dang Minh Huong, though I didn't use their patch. Backpatch to 10.x where hashed grouping sets were added.
* Don't pass the grouping target around unnecessarily.Robert Haas2018-03-20
| | | | | | | | | | | Since commit 4f15e5d09de276fb77326be5567dd9796008ca2e made grouped_rel set reltarget, a variety of other functions can just get it from grouped_rel instead of having to pass it around explicitly. Simplify accordingly. Patch by me, reviewed by Ashutosh Bapat. Discussion: http://postgr.es/m/CA+TgmoZ+ZJTVad-=vEq393N99KTooxv9k7M+z73qnTAqkb49BQ@mail.gmail.com
* Determine grouping strategies in create_grouping_paths.Robert Haas2018-03-20
| | | | | | | | | | Partition-wise aggregate will call create_ordinary_grouping_paths multiple times and we don't want to redo this work every time; have the caller do it instead and pass the details down. Patch by me, reviewed by Ashutosh Bapat. Discussion: http://postgr.es/m/CA+TgmoY7VYYn9a7YHj1nJL6zj6BkHmt4K-un9LRmXkyqRZyynA@mail.gmail.com
* Defer creation of partially-grouped relation until it's needed.Robert Haas2018-03-20
| | | | | | | | | | | | | | | | | | | | | | | | | This avoids unnecessarily creating a RelOptInfo for which we have no actual need. This idea is from Ashutosh Bapat, who wrote a very different patch to accomplish a similar goal. It will be more important if and when we get partition-wise aggregate, since then there could be many partially grouped relations all of which could potentially be unnecessary. In passing, this sets the grouping relation's reltarget, which wasn't done previously but makes things simpler for this refactoring. Along the way, adjust things so that add_paths_to_partial_grouping_rel, now renamed create_partial_grouping_paths, does not perform the Gather or Gather Merge steps to generate non-partial paths from partial paths; have the caller do it instead. This is again for the convenience of partition-wise aggregate, which wants to inject additional partial paths are created and before we decide which ones to Gather/Gather Merge. This might seem like a separate change, but it's actually pretty closely entangled; I couldn't really see much value in separating it and having to change some things twice. Patch by me, reviewed by Ashutosh Bapat. Discussion: http://postgr.es/m/CA+TgmoZ+ZJTVad-=vEq393N99KTooxv9k7M+z73qnTAqkb49BQ@mail.gmail.com
* Split create_grouping_paths into degenerate and non-degenerate cases.Robert Haas2018-03-15
| | | | | | | | | | | | | | | There's no functional change here, or at least I hope there isn't, just code rearrangement. The rearrangement is motivated by partition-wise aggregate, which doesn't need to consider the degenerate case but wants to reuse the logic for the ordinary case. Based loosely on a patch from Ashutosh Bapat and Jeevan Chalke, but I whacked it around pretty heavily. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, Rafia Sabih, and me. Discussion: http://postgr.es/m/CAFjFpRewpqCmVkwvq6qrRjmbMDpN0CZvRRzjd8UvncczA3Oz1Q@mail.gmail.com
* Pass additional arguments to a couple of grouping-related functions.Robert Haas2018-03-15
| | | | | | | | | | | | | | | | | | | get_number_of_groups() and make_partial_grouping_target() currently fish information directly out of the PlannerInfo; in the former case, the target list, and in the latter case, the HAVING qual. This works fine if there's only one grouping relation, but if the pending patch for partition-wise aggregate gets committed, we'll have multiple grouping relations and must therefore use appropriately translated versions of these values for each one. To make that simpler, pass the values to be used as arguments. Jeevan Chalke. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, Rafia Sabih, and me. Discussion: http://postgr.es/m/CAM2+6=UqFnFUypOvLdm5TgC+2M=-E0Q7_LOh0VDFFzmk2BBPzQ@mail.gmail.com Discussion: http://postgr.es/m/CAM2+6=W+L=C4yBqMrgrfTfNtbtmr4T53-hZhwbA2kvbZ9VMrrw@mail.gmail.com
* Fix function-header comments in planner.cStephen Frost2018-03-14
| | | | | | | | | | | | | | In b5635948ab1, a couple of function header comments weren't changed, or weren't changed correctly, to reflect the arguments being passed into the functions. Specifically, get_number_of_groups() had the wrong argument name in the commit and create_grouping_paths() wasn't updated even though the arguments had been changed. The issue with create_grouping_paths() was noticed by Ashutosh Bapat, while I discovered the issue with get_number_of_groups() by looking to see if there were any similar issues from that commit. Discussion: https://postgr.es/m/CAFjFpRcbp4702jcp387PExt3fNCt62QJN8++DQGwBhsW6wRHWA@mail.gmail.com
* Let Parallel Append over simple UNION ALL have partial subpaths.Robert Haas2018-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | A simple UNION ALL gets flattened into an appendrel of subquery RTEs, but up until now it's been impossible for the appendrel to use the partial paths for the subqueries, so we can implement the appendrel as a Parallel Append but only one with non-partial paths as children. There are three separate obstacles to removing that limitation. First, when planning a subquery, propagate any partial paths to the final_rel so that they are potentially visible to outer query levels (but not if they have initPlans attached, because that wouldn't be safe). Second, after planning a subquery, propagate any partial paths for the final_rel to the subquery RTE in the outer query level in the same way we do for non-partial paths. Third, teach finalize_plan() to account for the possibility that the fake parameter we use for rescan signalling when the plan contains a Gather (Merge) node may be propagated from an outer query level. Patch by me, reviewed and tested by Amit Khandekar, Rajkumar Raghuwanshi, and Ashutosh Bapat. Test cases based on examples by Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CA+Tgmoa6L9A1nNCk3aTDVZLZ4KkHDn1+tm7mFyFvP+uQPS7bAg@mail.gmail.com
* Fix improper uses of canonicalize_qual().Tom Lane2018-03-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the things canonicalize_qual() does is to remove constant-NULL subexpressions of top-level AND/OR clauses. It does that on the assumption that what it's given is a top-level WHERE clause, so that NULL can be treated like FALSE. Although this is documented down inside a subroutine of canonicalize_qual(), it wasn't mentioned in the documentation of that function itself, and some callers hadn't gotten that memo. Notably, commit d007a9505 caused get_relation_constraints() to apply canonicalize_qual() to CHECK constraints. That allowed constraint exclusion to misoptimize situations in which a CHECK constraint had a provably-NULL subclause, as seen in the regression test case added here, in which a child table that should be scanned is not. (Although this thinko is ancient, the test case doesn't fail before 9.2, for reasons I've not bothered to track down in detail. There may be related cases that do fail before that.) More recently, commit f0e44751d added an independent bug by applying canonicalize_qual() to index expressions, which is even sillier since those might not even be boolean. If they are, though, I think this could lead to making incorrect index entries for affected index expressions in v10. I haven't attempted to prove that though. To fix, add an "is_check" parameter to canonicalize_qual() to specify whether it should assume WHERE or CHECK semantics, and make it perform NULL-elimination accordingly. Adjust the callers to apply the right semantics, or remove the call entirely in cases where it's not known that the expression has one or the other semantics. I also removed the call in some cases involving partition expressions, where it should be a no-op because such expressions should be canonical already ... and was a no-op, independently of whether it could in principle have done something, because it was being handed the qual in implicit-AND format which isn't what it expects. In HEAD, add an Assert to catch that type of mistake in future. This represents an API break for external callers of canonicalize_qual(). While that's intentional in HEAD to make such callers think about which case applies to them, it seems like something we probably wouldn't be thanked for in released branches. Hence, in released branches, the extra parameter is added to a new function canonicalize_qual_ext(), and canonicalize_qual() is a wrapper that retains its old behavior. Patch by me with suggestions from Dean Rasheed. Back-patch to all supported branches. Discussion: https://postgr.es/m/24475.1520635069@sss.pgh.pa.us
* Correctly assess parallel-safety of tlists when SRFs are used.Robert Haas2018-03-08
| | | | | | | | | | | | | | | | Since commit 69f4b9c85f168ae006929eec44fc44d569e846b9, the existing code was no longer assessing the parallel-safety of the real tlist for each upper rel, but rather the first of possibly several tlists created by split_pathtarget_at_srfs(). Repair. Even though this is clearly wrong, it's not clear that it has any user-visible consequences at the moment, so no back-patch for now. If we discover later that it does have user-visible consequences, we might need to back-patch this to v10. Patch by me, per a report from Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CA+Tgmoaob_Strkg4Dcx=VyxnyXtrmkV=ofj=pX7gH9hSre-g0Q@mail.gmail.com
* Minor cleanup of code related to partially_grouped_rel.Robert Haas2018-02-27
| | | | | | Jeevan Chalke Discussion: http://postgr.es/m/CAM2+6=X9kxQoL2ZqZ00E6asBt9z+rfyWbOmhXJ0+8fPAyMZ9Jg@mail.gmail.com
* Fix logic error in add_paths_to_partial_grouping_rel.Robert Haas2018-02-27
| | | | | | | | | | | | | Commit 3bf05e096b9f8375e640c5d7996aa57efd7f240c sometimes uses the cheapest_partial_path variable in this function to mean the cheapest one from the input rel and at other times the cheapest one from the partially grouped rel, but it never resets it, so we can end up with bad plans, leading to "ERROR: Aggref found in non-Agg plan node". Jeevan Chalke, per a report from Andreas Joseph Krogh and a separate off-list report from Rajkumar Raghuwanshi Discussion: http://postgr.es/m/CAM2+6=X9kxQoL2ZqZ00E6asBt9z+rfyWbOmhXJ0+8fPAyMZ9Jg@mail.gmail.com
* Add a new upper planner relation for partially-aggregated results.Robert Haas2018-02-26
| | | | | | | | | | | | | | | | | | | | | | | Up until now, we've abused grouped_rel->partial_pathlist as a place to store partial paths that have been partially aggregate, but that's really not correct, because a partial path for a relation is supposed to be one which produces the correct results with the addition of only a Gather or Gather Merge node, and these paths also require a Finalize Aggregate step. Instead, add a new partially_group_rel which can hold either partial paths (which need to be gathered and then have aggregation finalized) or non-partial paths (which only need to have aggregation finalized). This allows us to reuse generate_gather_paths for partially_grouped_rel instead of writing new code, so that this patch actually basically no net new code while making things cleaner, simplifying things for pending patches for partition-wise aggregate. Robert Haas and Jeevan Chalke. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, Rafia Sabih, and me. Discussion: http://postgr.es/m/CA+TgmobrzFYS3+U8a_BCy3-hOvh5UyJbC18rEcYehxhpw5=ETA@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoZyQEjdBNuoG9-wC5GQ5GrO4544Myo13dVptvx+uLg9uQ@mail.gmail.com
* Fix parallel index builds for dynamic_shared_memory_type=none.Robert Haas2018-02-12
| | | | | | | | | | | The previous code failed to realize that this setting effectively disables parallelism, and would crash if it decided to attempt parallelism anyway. Instead, treat it as a disabling condition. Kyotaro Horiguchi, who also reported the issue. Reviewed by Michael Paquier and Peter Geoghegan. Discussion: http://postgr.es/m/20180209.170635.256350357.horiguchi.kyotaro@lab.ntt.co.jp
* Support parallel btree index builds.Robert Haas2018-02-02
| | | | | | | | | | | | | | | | | | | | | | To make this work, tuplesort.c and logtape.c must also support parallelism, so this patch adds that infrastructure and then applies it to the particular case of parallel btree index builds. Testing to date shows that this can often be 2-3x faster than a serial index build. The model for deciding how many workers to use is fairly primitive at present, but it's better than not having the feature. We can refine it as we get more experience. Peter Geoghegan with some help from Rushabh Lathia. While Heikki Linnakangas is not an author of this patch, he wrote other patches without which this feature would not have been possible, and therefore the release notes should possibly credit him as an author of this feature. Reviewed by Claudio Freire, Heikki Linnakangas, Thomas Munro, Tels, Amit Kapila, me. Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com
* Factor some code out of create_grouping_paths.Robert Haas2018-01-26
| | | | | | | | | | | | | | This is preparatory refactoring to prepare the way for partition-wise aggregate, which will reuse the new subroutines for child grouping rels. It also does not seem like a bad idea on general principle, as the function was getting pretty long. Jeevan Chalke. The larger patch series of which this patch is a part was reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, Ashutosh Bapat, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, and me. Some cosmetic changes by me. Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
* Allow UPDATE to move rows between partitions.Robert Haas2018-01-19
| | | | | | | | | | | | | | | | | When an UPDATE causes a row to no longer match the partition constraint, try to move it to a different partition where it does match the partition constraint. In essence, the UPDATE is split into a DELETE from the old partition and an INSERT into the new one. This can lead to surprising behavior in concurrency scenarios because EvalPlanQual rechecks won't work as they normally did; the known problems are documented. (There is a pending patch to improve the situation further, but it needs more review.) Amit Khandekar, reviewed and tested by Amit Langote, David Rowley, Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro Herrera, Amit Kapila, and me. A few final revisions by me. Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* Re-fix wrong costing of Sort under Gather Merge.Robert Haas2017-12-19
| | | | | | | | | | | | | | | | Commit dc02c7bca4dccf7de278cdc6b3325a829e75b252 changed this call to create_sort_path() to take -1 rather than limit_tuples because, at that time, there was no way for a Sort beneath a Gather Merge to become a top-N sort. Later, commit 3452dc5240da43e833118484e1e9b4894d04431c provided a way for a Sort beneath a Gather Merge to become a top-N sort, but failed to revert the previous commit in the process. Do that. Report and analysis by Jeff Janes; patch by Thomas Munro; review by Amit Kapila and by me. Discussion: http://postgr.es/m/CAEepm=1BWtC34vUroA0Uqjw02MaqdUrW+d6WD85_k8SLyPiKHQ@mail.gmail.com
* Support Parallel Append plan nodes.Robert Haas2017-12-05
| | | | | | | | | | | | | | | | | | | | When we create an Append node, we can spread out the workers over the subplans instead of piling on to each subplan one at a time, which should typically be a bit more efficient, both because the startup cost of any plan executed entirely by one worker is paid only once and also because of reduced contention. We can also construct Append plans using a mix of partial and non-partial subplans, which may allow for parallelism in places that otherwise couldn't support it. Unfortunately, this patch doesn't handle the important case of parallelizing UNION ALL by running each branch in a separate worker; the executor infrastructure is added here, but more planner work is needed. Amit Khandekar, Robert Haas, Amul Sul, reviewed and tested by Ashutosh Bapat, Amit Langote, Rafia Sabih, Amit Kapila, and Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com
* Fix creation of resjunk tlist entries for inherited mixed UPDATE/DELETE.Tom Lane2017-11-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rewriteTargetListUD's processing is dependent on the relkind of the query's target table. That was fine at the time it was made to act that way, even for queries on inheritance trees, because all tables in an inheritance tree would necessarily be plain tables. However, the 9.5 feature addition allowing some members of an inheritance tree to be foreign tables broke the assumption that rewriteTargetListUD's output tlist could be applied to all child tables with nothing more than column-number mapping. This led to visible failures if foreign child tables had row-level triggers, and would also break in cases where child tables belonged to FDWs that used methods other than CTID for row identification. To fix, delay running rewriteTargetListUD until after the planner has expanded inheritance, so that it is applied separately to the (already mapped) tlist for each child table. We can conveniently call it from preprocess_targetlist. Refactor associated code slightly to avoid the need to heap_open the target relation multiple times during preprocess_targetlist. (The APIs remain a bit ugly, particularly around the point of which steps scribble on parse->targetList and which don't. But avoiding such scribbling would require a change in FDW callback APIs, which is more pain than it's worth.) Also fix ExecModifyTable to ensure that "tupleid" is reset to NULL when we transition from rows providing a CTID to rows that don't. (That's really an independent bug, but it manifests in much the same cases.) Add a regression test checking one manifestation of this problem, which was that row-level triggers on a foreign child table did not work right. Back-patch to 9.5 where the problem was introduced. Etsuro Fujita, reviewed by Ildus Kurbangaliev and Ashutosh Bapat Discussion: https://postgr.es/m/20170514150525.0346ba72@postgrespro.ru
* Pass InitPlan values to workers via Gather (Merge).Robert Haas2017-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | If a PARAM_EXEC parameter is used below a Gather (Merge) but the InitPlan that computes it is attached to or above the Gather (Merge), force the value to be computed before starting parallelism and pass it down to all workers. This allows us to use parallelism in cases where it previously would have had to be rejected as unsafe. We do - in this case - lose the optimization that the value is only computed if it's actually used. An alternative strategy would be to have the first worker that needs the value compute it, but one downside of that approach is that we'd then need to select a parallel-safe path to compute the parameter value; it couldn't for example contain a Gather (Merge) node. At some point in the future, we might want to consider both approaches. Independent of that consideration, there is a great deal more work that could be done to make more kinds of PARAM_EXEC parameters parallel-safe. This infrastructure could be used to allow a Gather (Merge) on the inner side of a nested loop (although that's not a very appealing plan) and cases where the InitPlan is attached below the Gather (Merge) could be addressed as well using various techniques. But this is a good start. Amit Kapila, reviewed and revised by me. Reviewing and testing from Kuntal Ghosh, Haribabu Kommi, and Tushar Ahuja. Discussion: http://postgr.es/m/CAA4eK1LV0Y1AUV4cUCdC+sYOx0Z0-8NAJ2Pd9=UKsbQ5Sr7+JQ@mail.gmail.com
* Add parallel_leader_participation GUC.Robert Haas2017-11-15
| | | | | | | | | | | Sometimes, for testing, it's useful to have the leader do nothing but read tuples from workers; and it's possible that could work out better even in production. Thomas Munro, reviewed by Amit Kapila and by me. A few final tweaks by me. Discussion: http://postgr.es/m/CAEepm=2U++Lp3bNTv2Bv_kkr5NE2pOyHhxU=G0YTa4ZhSYhHiw@mail.gmail.com
* Push target list evaluation through Gather Merge.Robert Haas2017-11-13
| | | | | | | | | We already do this for Gather, but it got overlooked for Gather Merge. Amit Kapila, with review and minor revisions by Rushabh Lathia and by me. Discussion: http://postgr.es/m/CAA4eK1KUC5Uyu7qaifxrjpHxbSeoQh3yzwN3bThnJsmJcZ-qtA@mail.gmail.com
* Track in the plan the types associated with PARAM_EXEC parameters.Robert Haas2017-11-13
| | | | | | | | | | | | Up until now, we only tracked the number of parameters, which was sufficient to allocate an array of Datums of the appropriate size, but not sufficient to, for example, know how to serialize a Datum stored in one of those slots. An upcoming patch wants to do that, so add this tracking to make it possible. Patch by me, reviewed by Tom Lane and Amit Kapila. Discussion: http://postgr.es/m/CA+TgmoYqpxDKn8koHdW8BEKk8FMUL0=e8m2Qe=M+r0UBjr3tuQ@mail.gmail.com
* Change TRUE/FALSE to true/falsePeter Eisentraut2017-11-08
| | | | | | | | | | | | | | The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* In the planner, delete joinaliasvars lists after we're done with them.Tom Lane2017-10-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although joinaliasvars lists coming out of the parser are quite simple, those lists can contain arbitrarily complex expressions after subquery pullup. We do not perform expression preprocessing on them, meaning that expressions in those lists will not meet the expectations of later phases of the planner (for example, that they do not contain SubLinks). This had been thought pretty harmless, since we don't intentionally touch those lists in later phases --- but Andreas Seltenreich found a case in which adjust_appendrel_attrs() could recurse into a joinaliasvars list and then die on its assertion that it never sees a SubLink. We considered a couple of localized fixes to prevent that specific case from looking at the joinaliasvars lists, but really this seems like a generic hazard for all expression processing in the planner. Therefore, probably the best answer is to delete the joinaliasvars lists from the parsetree at the end of expression preprocessing, so that there are no reachable expressions that haven't been through preprocessing. The case Andreas found seems to be harmless in non-Assert builds, and so far there are no field reports suggesting that there are user-visible effects in other cases. I considered back-patching this anyway, but it turns out that Andreas' test doesn't fail at all in 9.4-9.6, because in those versions adjust_appendrel_attrs contains code (added in commit 842faa714 and removed again in commit 215b43cdc) to process SubLinks rather than complain about them. Barring discovery of another path by which unprocessed joinaliasvars lists can cause trouble, the most prudent compromise seems to be to patch this into v10 but not further. Patch by me, with thanks to Amit Langote for initial investigation and review. Discussion: https://postgr.es/m/87r2tvt9f1.fsf@ansel.ydns.eu
* Basic partition-wise join functionality.Robert Haas2017-10-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of joining two partitioned tables in their entirety we can, if it is an equi-join on the partition keys, join the matching partitions individually. This involves teaching the planner about "other join" rels, which are related to regular join rels in the same way that other member rels are related to baserels. This can use significantly more CPU time and memory than regular join planning, because there may now be a set of "other" rels not only for every base relation but also for every join relation. In most practical cases, this probably shouldn't be a problem, because (1) it's probably unusual to join many tables each with many partitions using the partition keys for all joins and (2) if you do that scenario then you probably have a big enough machine to handle the increased memory cost of planning and (3) the resulting plan is highly likely to be better, so what you spend in planning you'll make up on the execution side. All the same, for now, turn this feature off by default. Currently, we can only perform joins between two tables whose partitioning schemes are absolutely identical. It would be nice to cope with other scenarios, such as extra partitions on one side or the other with no match on the other side, but that will have to wait for a future patch. Ashutosh Bapat, reviewed and tested by Rajkumar Raghuwanshi, Amit Langote, Rafia Sabih, Thomas Munro, Dilip Kumar, Antonin Houska, Amit Khandekar, and by me. A few final adjustments by me. Discussion: http://postgr.es/m/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@mail.gmail.com Discussion: http://postgr.es/m/CAFjFpRcitjfrULr5jfuKWRPsGUX0LQ0k8-yG0Qw2+1LBGNpMdw@mail.gmail.com
* Allow DML commands that create tables to use parallel query.Robert Haas2017-10-05
| | | | | | | | | | | Haribabu Kommi, reviewed by Dilip Kumar and Rafia Sabih. Various cosmetic changes by me to explain why this appears to be safe but allowing inserts in parallel mode in general wouldn't be. Also, I removed the REFRESH MATERIALIZED VIEW case from Haribabu's patch, since I'm not convinced that case is OK, and hacked on the documentation somewhat. Discussion: http://postgr.es/m/CAJrrPGdo5bak6qnPWe8Kpi8g_jfQEs-G4SYmG9y+OFaw2-dPvA@mail.gmail.com
* Expand partitioned table RTEs level by level, without flattening.Robert Haas2017-09-14
| | | | | | | | | | | | | | | | | | | | | | | | | Flattening the partitioning hierarchy at this stage makes various desirable optimizations difficult. The original use case for this patch was partition-wise join, which wants to match up the partitions in one partitioning hierarchy with those in another such hierarchy. However, it now seems that it will also be useful in making partition pruning work using the PartitionDesc rather than constraint exclusion, because with a flattened expansion, we have no easy way to figure out which PartitionDescs apply to which leaf tables in a multi-level partition hierarchy. As it turns out, we end up creating both rte->inh and !rte->inh RTEs for each intermediate partitioned table, just as we previously did for the root table. This seems unnecessary since the partitioned tables have no storage and are not scanned. We might want to go back and rejigger things so that no partitioned tables (including the parent) need !rte->inh RTEs, but that seems to require some adjustments not related to the core purpose of this patch. Ashutosh Bapat, reviewed by me and by Amit Langote. Some final adjustments by me. Discussion: http://postgr.es/m/CAFjFpRd=1venqLL7oGU=C1dEkuvk2DJgvF+7uKbnPHaum1mvHQ@mail.gmail.com
* Set partitioned_rels appropriately when UNION ALL is used.Robert Haas2017-09-14
| | | | | | | | | | In most cases, this omission won't matter, because the appropriate locks will have been acquired during parse/plan or by AcquireExecutorLocks. But it's a bug all the same. Report by Ashutosh Bapat. Patch by me, reviewed by Amit Langote. Discussion: http://postgr.es/m/CAFjFpRdHb_ZnoDTuBXqrudWXh3H1ibLkr6nHsCFT96fSK4DXtA@mail.gmail.com
* Use lfirst_node() and linitial_node() where appropriate in planner.c.Tom Lane2017-09-05
| | | | | | | | | There's no particular reason to target this module for the first wholesale application of these macros; but we gotta start somewhere. Ashutosh Bapat and Jeevan Chalke Discussion: https://postgr.es/m/CAFjFpRcNr3r=u0ni=7A4GD9NnHQVq+dkFafzqo2rS6zy=dt1eg@mail.gmail.com
* Force rescanning of parallel-aware scan nodes below a Gather[Merge].Tom Lane2017-08-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ExecReScan machinery contains various optimizations for postponing or skipping rescans of plan subtrees; for example a HashAgg node may conclude that it can re-use the table it built before, instead of re-reading its input subtree. But that is wrong if the input contains a parallel-aware table scan node, since the portion of the table scanned by the leader process is likely to vary from one rescan to the next. This explains the timing-dependent buildfarm failures we saw after commit a2b70c89c. The established mechanism for showing that a plan node's output is potentially variable is to mark it as depending on some runtime Param. Hence, to fix this, invent a dummy Param (one that has a PARAM_EXEC parameter number, but carries no actual value) associated with each Gather or GatherMerge node, mark parallel-aware nodes below that node as dependent on that Param, and arrange for ExecReScanGather[Merge] to flag that Param as changed whenever the Gather[Merge] node is rescanned. This solution breaks an undocumented assumption made by the parallel executor logic, namely that all rescans of nodes below a Gather[Merge] will happen synchronously during the ReScan of the top node itself. But that's fundamentally contrary to the design of the ExecReScan code, and so was doomed to fail someday anyway (even if you want to argue that the bug being fixed here wasn't a failure of that assumption). A follow-on patch will address that issue. In the meantime, the worst that's expected to happen is that given very bad timing luck, the leader might have to do all the work during a rescan, because workers think they have nothing to do, if they are able to start up before the eventual ReScan of the leader's parallel-aware table scan node has reset the shared scan state. Although this problem exists in 9.6, there does not seem to be any way for it to manifest there. Without GatherMerge, it seems that a plan tree that has a rescan-short-circuiting node below Gather will always also have one above it that will short-circuit in the same cases, preventing the Gather from being rescanned. Hence we won't take the risk of back-patching this change into 9.6. But v10 needs it. Discussion: https://postgr.es/m/CAA4eK1JkByysFJNh9M349u_nNjqETuEnY_y1VUc_kJiU0bxtaQ@mail.gmail.com
* Attempt to clarify comments related to force_parallel_mode.Robert Haas2017-08-17
| | | | | | Per discussion with Tom Lane. Discussion: http://postgr.es/m/28589.1502902172@sss.pgh.pa.us
* Teach adjust_appendrel_attrs(_multilevel) to do multiple translations.Robert Haas2017-08-15
| | | | | | | | | | | | | | | | Currently, child relations are always base relations, so when we translate parent relids to child relids, we only need to translate a singler relid. However, the proposed partition-wise join feature will create child joins, which will mean we need to translate a set of parent relids to the corresponding child relids. This is preliminary refactoring to make that possible. Ashutosh Bapat. Review and testing of the larger patch set of which this is a part by Amit Langote, Rajkumar Raghuwanshi, Rafia Sabih, Thomas Munro, Dilip Kumar, and me. Some adjustments, mostly cosmetic, by me. Discussion: http://postgr.es/m/CA+TgmobQK80vtXjAsPZWWXd7c8u13G86gmuLupN+uUJjA+i4nA@mail.gmail.com
* Phase 3 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Phase 2 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Post-PG 10 beta1 pgindent runBruce Momjian2017-05-17
| | | | perltidy run not included.
* Fire per-statement triggers on partitioned tables.Robert Haas2017-05-01
| | | | | | | | | | | Even though no actual tuples are ever inserted into a partitioned table (the actual tuples are in the partitions, not the partitioned table itself), we still need to have a ResultRelInfo for the partitioned table, or per-statement triggers won't get fired. Amit Langote, per a report from Rajkumar Raghuwanshi. Reviewed by me. Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com
* Mark finished Plan nodes with parallel_safe flags.Tom Lane2017-04-12
| | | | | | | | | | | | | | | | | | | | | | | We'd managed to avoid doing this so far, but it seems pretty obvious that it would be forced on us some day, and this is much the cleanest way of approaching the open problem that parallel-unsafe subplans are being transmitted to parallel workers. Anyway there's no space cost due to alignment considerations, and the time cost is pretty minimal since we're just copying the flag from the corresponding Path node. (At least in most cases ... some of the klugier spots in createplan.c have to work a bit harder.) In principle we could perhaps get rid of SubPlan.parallel_safe, but I thought it better to keep that in case there are reasons to consider a SubPlan unsafe even when its child plan is parallel-safe. This patch doesn't actually do anything with the new flags, but I thought I'd commit it separately anyway. Note: although this touches outfuncs/readfuncs, there's no need for a catversion bump because Plan trees aren't stored on disk. Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
* Improve castNode notation by introducing list-extraction-specific variants.Tom Lane2017-04-10
| | | | | | | | | | | | | | | | | This extends the castNode() notation introduced by commit 5bcab1114 to provide, in one step, extraction of a list cell's pointer and coercion to a concrete node type. For example, "lfirst_node(Foo, lc)" is the same as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode that have appeared so far include a list extraction call, so this is pretty widely useful, and it saves a few more keystrokes compared to the old way. As with the previous patch, back-patch the addition of these macros to pg_list.h, so that the notation will be available when back-patching. Patch by me, after an idea of Andrew Gierth's. Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
* Abstract logic to allow for multiple kinds of child rels.Robert Haas2017-04-03
| | | | | | | | | | | | | | | | | | | | | | Currently, the only type of child relation is an "other member rel", which is the child of a baserel, but in the future joins and even upper relations may have child rels. To facilitate that, introduce macros that test to test for particular RelOptKind values, and use them in various places where they help to clarify the sense of a test. (For example, a test may allow RELOPT_OTHER_MEMBER_REL either because it intends to allow child rels, or because it intends to allow simple rels.) Also, remove find_childrel_top_parent, which will not work for a child rel that is not a baserel. Instead, add a new RelOptInfo member top_parent_relids to track the same kind of information in a more generic manner. Ashutosh Bapat, slightly tweaked by me. Review and testing of the patch set from which this was taken by Rajkumar Raghuwanshi and Rafia Sabih. Discussion: http://postgr.es/m/CA+TgmoagTnF2yqR3PT2rv=om=wJiZ4-A+ATwdnriTGku1CLYxA@mail.gmail.com
* Try and silence spurious Coverity warning.Andrew Gierth2017-04-03
| | | | | | gset_data (aka gd) in planner.c is always non-null if and only if parse->groupingSets is non-null, but Coverity doesn't know that and complains. Feed it an assertion to see if that keeps it happy.
* Cast result of copyObject() to correct typePeter Eisentraut2017-03-28
| | | | | | | | | | | | | | copyObject() is declared to return void *, which allows easily assigning the result independent of the input, but it loses all type checking. If the compiler supports typeof or something similar, cast the result to the input type. This creates a greater amount of type safety. In some cases, where the result is assigned to a generic type such as Node * or Expr *, new casts are now necessary, but in general casts are now unnecessary in the normal case and indicate that something unusual is happening. Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
* Support hashed aggregation with grouping sets.Andrew Gierth2017-03-27
| | | | | | | | | | | | | | | | | | This extends the Aggregate node with two new features: HashAggregate can now run multiple hashtables concurrently, and a new strategy MixedAggregate populates hashtables while doing sorted grouping. The planner will now attempt to save as many sorts as possible when planning grouping sets queries, while not exceeding work_mem for the estimated combined sizes of all hashtables used. No SQL-level changes are required. There should be no user-visible impact other than the new EXPLAIN output and possible changes to result ordering when ORDER BY was not used (which affected a few regression tests). The enable_hashagg option is respected. Author: Andrew Gierth Reviewers: Mark Dilger, Andres Freund Discussion: https://postgr.es/m/87vatszyhj.fsf@news-spur.riddles.org.uk
* Faster expression evaluation and targetlist projection.Andres Freund2017-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This replaces the old, recursive tree-walk based evaluation, with non-recursive, opcode dispatch based, expression evaluation. Projection is now implemented as part of expression evaluation. This both leads to significant performance improvements, and makes future just-in-time compilation of expressions easier. The speed gains primarily come from: - non-recursive implementation reduces stack usage / overhead - simple sub-expressions are implemented with a single jump, without function calls - sharing some state between different sub-expressions - reduced amount of indirect/hard to predict memory accesses by laying out operation metadata sequentially; including the avoidance of nearly all of the previously used linked lists - more code has been moved to expression initialization, avoiding constant re-checks at evaluation time Future just-in-time compilation (JIT) has become easier, as demonstrated by released patches intended to be merged in a later release, for primarily two reasons: Firstly, due to a stricter split between expression initialization and evaluation, less code has to be handled by the JIT. Secondly, due to the non-recursive nature of the generated "instructions", less performance-critical code-paths can easily be shared between interpreted and compiled evaluation. The new framework allows for significant future optimizations. E.g.: - basic infrastructure for to later reduce the per executor-startup overhead of expression evaluation, by caching state in prepared statements. That'd be helpful in OLTPish scenarios where initialization overhead is measurable. - optimizing the generated "code". A number of proposals for potential work has already been made. - optimizing the interpreter. Similarly a number of proposals have been made here too. The move of logic into the expression initialization step leads to some backward-incompatible changes: - Function permission checks are now done during expression initialization, whereas previously they were done during execution. In edge cases this can lead to errors being raised that previously wouldn't have been, e.g. a NULL array being coerced to a different array type previously didn't perform checks. - The set of domain constraints to be checked, is now evaluated once during expression initialization, previously it was re-built every time a domain check was evaluated. For normal queries this doesn't change much, but e.g. for plpgsql functions, which caches ExprStates, the old set could stick around longer. The behavior around might still change. Author: Andres Freund, with significant changes by Tom Lane, changes by Heikki Linnakangas Reviewed-By: Tom Lane, Heikki Linnakangas Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de