aboutsummaryrefslogtreecommitdiff
path: root/src/backend/optimizer/plan/planner.c
Commit message (Collapse)AuthorAge
...
* Speed up planning when partitions can be pruned at plan time.Tom Lane2019-03-30
| | | | | | | | | | | | | | | | | | | | | | Previously, the planner created RangeTblEntry and RelOptInfo structs for every partition of a partitioned table, even though many of them might later be deemed uninteresting thanks to partition pruning logic. This incurred significant overhead when there are many partitions. Arrange to postpone creation of these data structures until after we've processed the query enough to identify restriction quals for the partitioned table, and then apply partition pruning before not after creation of each partition's data structures. In this way we need not open the partition relations at all for partitions that the planner has no real interest in. For queries that can be proven at plan time to access only a small number of partitions, this patch improves the practical maximum number of partitions from under 100 to perhaps a few thousand. Amit Langote, reviewed at various times by Dilip Kumar, Jesper Pedersen, Yoshikazu Imai, and David Rowley Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp
* Avoid crash in partitionwise join planning under GEQO.Tom Lane2019-03-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While trying to plan a partitionwise join, we may be faced with cases where one or both input partitions for a particular segment of the join have been pruned away. In HEAD and v11, this is problematic because earlier processing didn't bother to make a pruned RelOptInfo fully valid. With an upcoming patch to make partition pruning more efficient, this'll be even more problematic because said RelOptInfo won't exist at all. The existing code attempts to deal with this by retroactively making the RelOptInfo fully valid, but that causes crashes under GEQO because join planning is done in a short-lived memory context. In v11 we could probably have fixed this by switching to the planner's main context while fixing up the RelOptInfo, but that idea doesn't scale well to the upcoming patch. It would be better not to mess with the base-relation data structures during join planning, anyway --- that's just a recipe for order-of-operations bugs. In many cases, though, we don't actually need the child RelOptInfo, because if the input is certainly empty then the join segment's result is certainly empty, so we can skip making a join plan altogether. (The existing code ultimately arrives at the same conclusion, but only after doing a lot more work.) This approach works except when the pruned-away partition is on the nullable side of a LEFT, ANTI, or FULL join, and the other side isn't pruned. But in those cases the existing code leaves a lot to be desired anyway --- the correct output is just the result of the unpruned side of the join, but we were emitting a useless outer join against a dummy Result. Pending somebody writing code to handle that more nicely, let's just abandon the partitionwise-join optimization in such cases. When the modified code skips making a join plan, it doesn't make a join RelOptInfo either; this requires some upper-level code to cope with nulls in part_rels[] arrays. We would have had to have that anyway after the upcoming patch. Back-patch to v11 since the crash is demonstrable there. Discussion: https://postgr.es/m/8305.1553884377@sss.pgh.pa.us
* Avoid passing query tlist around separately from root->processed_tlist.Tom Lane2019-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the dim past, the planner kept the fully-processed version of the query targetlist (the result of preprocess_targetlist) in grouping_planner's local variable "tlist", and only grudgingly passed it to individual other routines as needed. Later we discovered a need to still have it available after grouping_planner finishes, and invented the root->processed_tlist field for that purpose, but it wasn't used internally to grouping_planner; the tlist was still being passed around separately in the same places as before. Now comes a proposed patch to allow appendrel expansion to add entries to the processed tlist, well after preprocess_targetlist has finished its work. To avoid having to pass around the tlist explicitly, it's proposed to allow appendrel expansion to modify root->processed_tlist. That makes aliasing the tlist with assorted parameters and local variables really scary. It would accidentally work as long as the tlist is initially nonempty, because then the List header won't move around, but it's not exactly hard to think of ways for that to break. Aliased values are poor programming practice anyway. Hence, get rid of local variables and parameters that can be identified with root->processed_tlist, in favor of just using that field directly. And adjust comments to match. (Some of the new comments speak as though it's already possible for appendrel expansion to modify the tlist; that's not true yet, but will happen in a later patch.) Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp
* Enable parallel query with SERIALIZABLE isolation.Thomas Munro2019-03-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, the SERIALIZABLE isolation level prevented parallel query from being used. Allow the two features to be used together by sharing the leader's SERIALIZABLEXACT with parallel workers. An extra per-SERIALIZABLEXACT LWLock is introduced to make it safe to share, and new logic is introduced to coordinate the early release of the SERIALIZABLEXACT required for the SXACT_FLAG_RO_SAFE optimization, as follows: The first backend to observe the SXACT_FLAG_RO_SAFE flag (set by some other transaction) will 'partially release' the SERIALIZABLEXACT, meaning that the conflicts and locks it holds are released, but the SERIALIZABLEXACT itself will remain active because other backends might still have a pointer to it. Whenever any backend notices the SXACT_FLAG_RO_SAFE flag, it clears its own MySerializableXact variable and frees local resources so that it can skip SSI checks for the rest of the transaction. In the special case of the leader process, it transfers the SERIALIZABLEXACT to a new variable SavedSerializableXact, so that it can be completely released at the end of the transaction after all workers have exited. Remove the serializable_okay flag added to CreateParallelContext() by commit 9da0cc35, because it's now redundant. Author: Thomas Munro Reviewed-by: Haribabu Kommi, Robert Haas, Masahiko Sawada, Kevin Grittner Discussion: https://postgr.es/m/CAEepm=0gXGYhtrVDWOTHS8SQQy_=S9xo+8oCxGLWZAOoeJ=yzQ@mail.gmail.com
* Fix testing of parallel-safety of scan/join target.Etsuro Fujita2019-03-12
| | | | | | | | | | | In commit 960df2a971 ("Correctly assess parallel-safety of tlists when SRFs are used."), the testing of scan/join target was done incorrectly, which caused a plan-quality problem. Backpatch through to v11 where the aforementioned commit went in, since this is a regression from v10. Author: Etsuro Fujita Reviewed-by: Robert Haas and Tom Lane Discussion: https://postgr.es/m/5C75303E.8020303@lab.ntt.co.jp
* Fix handling of targetlist SRFs when scan/join relation is known empty.Tom Lane2019-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we introduced separate ProjectSetPath nodes for application of set-returning functions in v10, we inadvertently broke some cases where we're supposed to recognize that the result of a subquery is known to be empty (contain zero rows). That's because IS_DUMMY_REL was just looking for a childless AppendPath without allowing for a ProjectSetPath being possibly stuck on top. In itself, this didn't do anything much worse than produce slightly worse plans for some corner cases. Then in v11, commit 11cf92f6e rearranged things to allow the scan/join targetlist to be applied directly to partial paths before they get gathered. But it inserted a short-circuit path for dummy relations that was a little too short: it failed to insert a ProjectSetPath node at all for a targetlist containing set-returning functions, resulting in bogus "set-valued function called in context that cannot accept a set" errors, as reported in bug #15669 from Madelaine Thibaut. The best way to fix this mess seems to be to reimplement IS_DUMMY_REL so that it drills down through any ProjectSetPath nodes that might be there (and it seems like we'd better allow for ProjectionPath as well). While we're at it, make it look at rel->pathlist not cheapest_total_path, so that it gives the right answer independently of whether set_cheapest has been done lately. That dependency looks pretty shaky in the context of code like apply_scanjoin_target_to_paths, and even if it's not broken today it'd certainly bite us at some point. (Nastily, unsafe use of the old coding would almost always work; the hazard comes down to possibly looking through a dangling pointer, and only once in a blue moon would you find something there that resulted in the wrong answer.) It now looks like it was a mistake for IS_DUMMY_REL to be a macro: if there are any extensions using it, they'll continue to use the old inadequate logic until they're recompiled, after which they'll fail to load into server versions predating this fix. Hopefully there are few such extensions. Having fixed IS_DUMMY_REL, the special path for dummy rels in apply_scanjoin_target_to_paths is unnecessary as well as being wrong, so we can just drop it. Also change a few places that were testing for partitioned-ness of a planner relation but not using IS_PARTITIONED_REL for the purpose; that seems unsafe as well as inconsistent, plus it required an ugly hack in apply_scanjoin_target_to_paths. In passing, save a few cycles in apply_scanjoin_target_to_paths by skipping processing of pre-existing paths for partitioned rels, and do some cosmetic cleanup and comment adjustment in that function. I renamed IS_DUMMY_PATH to IS_DUMMY_APPEND with the intention of breaking any code that might be using it, since in almost every case that would be wrong; IS_DUMMY_REL is what to be using instead. In HEAD, also make set_dummy_rel_pathlist static (since it's no longer used from outside allpaths.c), and delete is_dummy_plan, since it's no longer used anywhere. Back-patch as appropriate into v11 and v10. Tom Lane and Julien Rouhaud Discussion: https://postgr.es/m/15669-02fb3296cca26203@postgresql.org
* Allow ATTACH PARTITION with only ShareUpdateExclusiveLock.Robert Haas2019-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We still require AccessExclusiveLock on the partition itself, because otherwise an insert that violates the newly-imposed partition constraint could be in progress at the same time that we're changing that constraint; only the lock level on the parent relation is weakened. To make this safe, we have to cope with (at least) three separate problems. First, relevant DDL might commit while we're in the process of building a PartitionDesc. If so, find_inheritance_children() might see a new partition while the RELOID system cache still has the old partition bound cached, and even before invalidation messages have been queued. To fix that, if we see that the pg_class tuple seems to be missing or to have a null relpartbound, refetch the value directly from the table. We can't get the wrong value, because DETACH PARTITION still requires AccessExclusiveLock throughout; if we ever want to change that, this will need more thought. In testing, I found it quite difficult to hit even the null-relpartbound case; the race condition is extremely tight, but the theoretical risk is there. Second, successive calls to RelationGetPartitionDesc might not return the same answer. The query planner will get confused if lookup up the PartitionDesc for a particular relation does not return a consistent answer for the entire duration of query planning. Likewise, query execution will get confused if the same relation seems to have a different PartitionDesc at different times. Invent a new PartitionDirectory concept and use it to ensure consistency. This ensures that a single invocation of either the planner or the executor sees the same view of the PartitionDesc from beginning to end, but it does not guarantee that the planner and the executor see the same view. Since this allows pointers to old PartitionDesc entries to survive even after a relcache rebuild, also postpone removing the old PartitionDesc entry until we're certain no one is using it. For the most part, it seems to be OK for the planner and executor to have different views of the PartitionDesc, because the executor will just ignore any concurrently added partitions which were unknown at plan time; those partitions won't be part of the inheritance expansion, but invalidation messages will trigger replanning at some point. Normally, this happens by the time the very next command is executed, but if the next command acquires no locks and executes a prepared query, it can manage not to notice until a new transaction is started. We might want to tighten that up, but it's material for a separate patch. There would still be a small window where a query that started just after an ATTACH PARTITION command committed might fail to notice its results -- but only if the command starts before the commit has been acknowledged to the user. All in all, the warts here around serializability seem small enough to be worth accepting for the considerable advantage of being able to add partitions without a full table lock. Although in general the consequences of new partitions showing up between planning and execution are limited to the query not noticing the new partitions, run-time partition pruning will get confused in that case, so that's the third problem that this patch fixes. Run-time partition pruning assumes that indexes into the PartitionDesc are stable between planning and execution. So, add code so that if new partitions are added between plan time and execution time, the indexes stored in the subplan_map[] and subpart_map[] arrays within the plan's PartitionedRelPruneInfo get adjusted accordingly. There does not seem to be a simple way to generalize this scheme to cope with partitions that are removed, mostly because they could then get added back again with different bounds, but it works OK for added partitions. This code does not try to ensure that every backend participating in a parallel query sees the same view of the PartitionDesc. That currently doesn't matter, because we never pass PartitionDesc indexes between backends. Each backend will ignore the concurrently added partitions which it notices, and it doesn't matter if different backends are ignoring different sets of concurrently added partitions. If in the future that matters, for example because we allow writes in parallel query and want all participants to do tuple routing to the same set of partitions, the PartitionDirectory concept could be improved to share PartitionDescs across backends. There is a draft patch to serialize and restore PartitionDescs on the thread where this patch was discussed, which may be a useful place to start. Patch by me. Thanks to Alvaro Herrera, David Rowley, Simon Riggs, Amit Langote, and Michael Paquier for discussion, and to Alvaro Herrera for some review. Discussion: http://postgr.es/m/CA+Tgmobt2upbSocvvDej3yzokd7AkiT+PvgFH+a9-5VV1oJNSQ@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoZE0r9-cyA-aY6f8WFEROaDLLL7Vf81kZ8MtFCkxpeQSw@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoY13KQZF-=HNTrt9UYWYx3_oYOQpu9ioNT49jGgiDpUEA@mail.gmail.com
* Fix plan created for inherited UPDATE/DELETE with all tables excluded.Tom Lane2019-02-22
| | | | | | | | | | | | | | | | | | | | | | In the case where inheritance_planner() finds that every table has been excluded by constraints, it thought it could get away with making a plan consisting of just a dummy Result node. While certainly there's no updating or deleting to be done, this had two user-visible problems: the plan did not report the correct set of output columns when a RETURNING clause was present, and if there were any statement-level triggers that should be fired, it didn't fire them. Hence, rather than only generating the dummy Result, we need to stick a valid ModifyTable node on top, which requires a tad more effort here. It's been broken this way for as long as inheritance_planner() has known about deleting excluded subplans at all (cf commit 635d42e9c), so back-patch to all supported branches. Amit Langote and Tom Lane, per a report from Petr Fedorov. Discussion: https://postgr.es/m/5da6f0f0-1364-1876-6978-907678f89a3e@phystech.edu
* Move estimate_hashagg_tablesize to selfuncs.c, and widen result to double.Tom Lane2019-02-21
| | | | | | | | | | | | | | | | | It seems to make more sense for this to be in selfuncs.c, since it's largely a statistical-estimation thing, and it's related to other functions like estimate_hash_bucket_stats that are there. While at it, change the result type from Size to double. Perhaps at one point it was impossible for the result to overflow an integer, but I've got no confidence in that proposition anymore. Nothing's actually done with the result except to compare it to a work_mem-based limit, so as long as we don't get an overflow on the way to that comparison, things should be fine even with very large dNumGroups. Code movement proposed by Antonin Houska, type change by me Discussion: https://postgr.es/m/25767.1549359615@localhost
* Save PathTargets for distinct/ordered relations in root->upper_targets[].Etsuro Fujita2019-02-18
| | | | | | | | | | For the convenience of extensions, we previously only saved PathTargets for grouped, window, and final relations in root->upper_targets[] in grouping_planner(). To improve the convenience, save PathTargets for distinct and ordered relations as well. Author: Antonin Houska, with an additional change by me Discussion: https://postgr.es/m/10994.1549559088@localhost
* Allow user control of CTE materialization, and change the default behavior.Tom Lane2019-02-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Historically we've always materialized the full output of a CTE query, treating WITH as an optimization fence (so that, for example, restrictions from the outer query cannot be pushed into it). This is appropriate when the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE query is non-recursive and side-effect-free, there's no hazard of changing the query results by pushing restrictions down. Another argument for materialization is that it can avoid duplicate computation of an expensive WITH query --- but that only applies if the WITH query is called more than once in the outer query. Even then it could still be a net loss, if each call has restrictions that would allow just a small part of the WITH query to be computed. Hence, let's change the behavior for WITH queries that are non-recursive and side-effect-free. By default, we will inline them into the outer query (removing the optimization fence) if they are called just once. If they are called more than once, we will keep the old behavior by default, but the user can override this and force inlining by specifying NOT MATERIALIZED. Lastly, the user can force the old behavior by specifying MATERIALIZED; this would mainly be useful when the query had deliberately been employing WITH as an optimization fence to prevent a poor choice of plan. Andreas Karlsson, Andrew Gierth, David Fetter Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk
* Refactor the representation of indexable clauses in IndexPaths.Tom Lane2019-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In place of three separate but interrelated lists (indexclauses, indexquals, and indexqualcols), an IndexPath now has one list "indexclauses" of IndexClause nodes. This holds basically the same information as before, but in a more useful format: in particular, there is now a clear connection between an indexclause (an original restriction clause from WHERE or JOIN/ON) and the indexquals (directly usable index conditions) derived from it. We also change the ground rules a bit by mandating that clause commutation, if needed, be done up-front so that what is stored in the indexquals list is always directly usable as an index condition. This gets rid of repeated re-determination of which side of the clause is the indexkey during costing and plan generation, as well as repeated lookups of the commutator operator. To minimize the added up-front cost, the typical case of commuting a plain OpExpr is handled by a new special-purpose function commute_restrictinfo(). For RowCompareExprs, generating the new clause properly commuted to begin with is not really any more complex than before, it's just different --- and we can save doing that work twice, as the pretty-klugy original implementation did. Tracking the connection between original and derived clauses lets us also track explicitly whether the derived clauses are an exact or lossy translation of the original. This provides a cheap solution to getting rid of unnecessary rechecks of boolean index clauses, which previously seemed like it'd be more expensive than it was worth. Another pleasant (IMO) side-effect is that EXPLAIN now always shows index clauses with the indexkey on the left; this seems less confusing. This commit leaves expand_indexqual_conditions() and some related functions in a slightly messy state. I didn't bother to change them any more than minimally necessary to work with the new data structure, because all that code is going to be refactored out of existence in a follow-on patch. Discussion: https://postgr.es/m/22182.1549124950@sss.pgh.pa.us
* Refactor planner's header files.Tom Lane2019-01-29
| | | | | | | | | | | | | | | | | | | | | | | | Create a new header optimizer/optimizer.h, which exposes just the planner functions that can be used "at arm's length", without need to access Paths or the other planner-internal data structures defined in nodes/relation.h. This is intended to provide the whole planner API seen by most of the rest of the system; although FDWs still need to use additional stuff, and more thought is also needed about just what selfuncs.c should rely on. The main point of doing this now is to limit the amount of new #include baggage that will be needed by "planner support functions", which I expect to introduce later, and which will be in relevant datatype modules rather than anywhere near the planner. This commit just moves relevant declarations into optimizer.h from other header files (a couple of which go away because everything got moved), and adjusts #include lists to match. There's further cleanup that could be done if we want to decide that some stuff being exposed by optimizer.h doesn't belong in the planner at all, but I'll leave that for another day. Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
* Make some small planner API cleanups.Tom Lane2019-01-29
| | | | | | | | | | | | | | | | | | | | | | Move a few very simple node-creation and node-type-testing functions from the planner's clauses.c to nodes/makefuncs and nodes/nodeFuncs. There's nothing planner-specific about them, as evidenced by the number of other places that were using them. While at it, rename and_clause() etc to is_andclause() etc, to clarify that they are node-type-testing functions not node-creation functions. And use "static inline" implementations for the shortest ones. Also, modify flatten_join_alias_vars() and some subsidiary functions to take a Query not a PlannerInfo to define the join structure that Vars should be translated according to. They were only using the "parse" field of the PlannerInfo anyway, so this just requires removing one level of indirection. The advantage is that now parse_agg.c can use flatten_join_alias_vars() without the horrid kluge of creating an incomplete PlannerInfo, which will allow that file to be decoupled from relation.h in a subsequent patch. Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
* In the planner, replace an empty FROM clause with a dummy RTE.Tom Lane2019-01-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fact that "SELECT expression" has no base relations has long been a thorn in the side of the planner. It makes it hard to flatten a sub-query that looks like that, or is a trivial VALUES() item, because the planner generally uses relid sets to identify sub-relations, and such a sub-query would have an empty relid set if we flattened it. prepjointree.c contains some baroque logic that works around this in certain special cases --- but there is a much better answer. We can replace an empty FROM clause with a dummy RTE that acts like a table of one row and no columns, and then there are no such corner cases to worry about. Instead we need some logic to get rid of useless dummy RTEs, but that's simpler and covers more cases than what was there before. For really trivial cases, where the query is just "SELECT expression" and nothing else, there's a hazard that adding the extra RTE makes for a noticeable slowdown; even though it's not much processing, there's not that much for the planner to do overall. However testing says that the penalty is very small, close to the noise level. In more complex queries, this is able to find optimizations that we could not find before. The new RTE type is called RTE_RESULT, since the "scan" plan type it gives rise to is a Result node (the same plan we produced for a "SELECT expression" query before). To avoid confusion, rename the old ResultPath path type to GroupResultPath, reflecting that it's only used in degenerate grouping cases where we know the query produces just one grouped row. (It wouldn't work to unify the two cases, because there are different rules about where the associated quals live during query_planner.) Note: although this touches readfuncs.c, I don't think a catversion bump is required, because the added case can't occur in stored rules, only plans. Patch by me, reviewed by David Rowley and Mark Dilger Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
* Replace uses of heap_open et al with the corresponding table_* function.Andres Freund2019-01-21
| | | | | Author: Andres Freund Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de
* Replace heapam.h includes with {table, relation}.h where applicable.Andres Freund2019-01-21
| | | | | | | | | A lot of files only included heapam.h for relation_open, heap_open etc - replace the heapam.h include in those files with the narrower header. Author: Andres Freund Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de
* Don't include genam.h from execnodes.h and relscan.h anymore.Andres Freund2019-01-14
| | | | | | | | | | | | | | | | | | | This is the genam.h equivalent of 4c850ecec649c (which removed heapam.h from a lot of other headers). There's still a few header includes of genam.h, but not from central headers anymore. As a few headers are not indirectly included anymore, execnodes.h and relscan.h need a few additional includes. Some of the depended on types were replacable by using the underlying structs, but e.g. for Snapshot in execnodes.h that'd have gotten more invasive than reasonable in this commit. Like the aforementioned commit 4c850ecec649c, this requires adding new genam.h includes to a number of backend files, which likely is also required in a few external projects. Author: Andres Freund Discussion: https://postgr.es/m/20190114000701.y4ttcb74jpskkcfb@alap3.anarazel.de
* Don't include heapam.h from others headers.Andres Freund2019-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | heapam.h previously was included in a number of widely used headers (e.g. execnodes.h, indirectly in executor.h, ...). That's problematic on its own, as heapam.h contains a lot of low-level details that don't need to be exposed that widely, but becomes more problematic with the upcoming introduction of pluggable table storage - it seems inappropriate for heapam.h to be included that widely afterwards. heapam.h was largely only included in other headers to get the HeapScanDesc typedef (which was defined in heapam.h, even though HeapScanDescData is defined in relscan.h). The better solution here seems to be to just use the underlying struct (forward declared where necessary). Similar for BulkInsertState. Another problem was that LockTupleMode was used in executor.h - parts of the file tried to cope without heapam.h, but due to the fact that it indirectly included it, several subsequent violations of that goal were not not noticed. We could just reuse the approach of declaring parameters as int, but it seems nicer to move LockTupleMode to lockoptions.h - that's not a perfect location, but also doesn't seem bad. As a number of files relied on implicitly included heapam.h, a significant number of files grew an explicit include. It's quite probably that a few external projects will need to do the same. Author: Andres Freund Reviewed-By: Alvaro Herrera Discussion: https://postgr.es/m/20190114000701.y4ttcb74jpskkcfb@alap3.anarazel.de
* Avoid sharing PARAM_EXEC slots between different levels of NestLoop.Tom Lane2019-01-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now, createplan.c attempted to share PARAM_EXEC slots for NestLoopParams across different plan levels, if the same underlying Var was being fed down to different righthand-side subplan trees by different NestLoops. This was, I think, more of an artifact of using subselect.c's PlannerParamItem infrastructure than an explicit design goal, but anyway that was the end result. This works well enough as long as the plan tree is executing synchronously, but the feature whereby Gather can execute the parallelized subplan locally breaks it. An upper NestLoop node might execute for a row retrieved from a parallel worker, and assign a value for a PARAM_EXEC slot from that row, while the leader's copy of the parallelized subplan is suspended with a different active value of the row the Var comes from. When control eventually returns to the leader's subplan, it gets the wrong answers if the same PARAM_EXEC slot is being used within the subplan, as reported in bug #15577 from Bartosz Polnik. This is pretty reminiscent of the problem fixed in commit 46c508fbc, and the proper fix seems to be the same: don't try to share PARAM_EXEC slots across different levels of controlling NestLoop nodes. This requires decoupling NestLoopParam handling from PlannerParamItem handling, although the logic remains somewhat similar. To avoid bizarre division of labor between subselect.c and createplan.c, I decided to move all the param-slot-assignment logic for both cases out of those files and put it into a new file paramassign.c. Hopefully it's a bit better documented now, too. A regression test case for this might be nice, but we don't know a test case that triggers the problem with a suitably small amount of data. Back-patch to 9.6 where we added Gather nodes. It's conceivable that related problems exist in older branches; but without some evidence for that, I'll leave the older branches alone. Discussion: https://postgr.es/m/15577-ca61ab18904af852@postgresql.org
* Move inheritance expansion code into its own fileAlvaro Herrera2019-01-10
| | | | | | | | | | | | | | | | | This commit moves expand_inherited_tables and underlings from optimizer/prep/prepunionc.c to optimizer/utils/inherit.c. Also, all of the AppendRelInfo-based expression manipulation routines are moved to optimizer/utils/appendinfo.c. No functional code changes. One exception is the introduction of make_append_rel_info, but that's still just moving around code. Also, stop including <limits.h> in prepunion.c, which no longer needs it since 3fc6e2d7f5b6. I (Álvaro) noticed this because Amit was copying that to inherit.c, which likewise doesn't need it. Author: Amit Langote Discussion: https://postgr.es/m/3be67028-a00a-502c-199a-da00eec8fb6e@lab.ntt.co.jp
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Drop no-op CoerceToDomain nodes from expressions at planning time.Tom Lane2018-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a domain has no constraints, then CoerceToDomain doesn't really do anything and can be simplified to a RelabelType. This not only eliminates cycles at execution, but allows the planner to optimize better (for instance, match the coerced expression to an index on the underlying column). However, we do have to support invalidating the plan later if a constraint gets added to the domain. That's comparable to the case of a change to a SQL function that had been inlined into a plan, so all the necessary logic already exists for plans depending on functions. We need only duplicate or share that logic for domains. ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval messages for the domain's pg_type entry, since those operations don't update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row, so no code change is needed for them.) Testing this revealed what's really a pre-existing bug in plpgsql: it caches the SQL-expression-tree expansion of type coercions and had no provision for invalidating entries in that cache. Up to now that was only a problem if such an expression had inlined a SQL function that got changed, which is unlikely though not impossible. But failing to track changes of domain constraints breaks an existing regression test case and would likely cause practical problems too. We could fix that locally in plpgsql, but what seems like a better idea is to build some generic infrastructure in plancache.c to store standalone expressions and track invalidation events for them. (It's tempting to wonder whether plpgsql's "simple expression" stuff could use this code with lower overhead than its current use of the heavyweight plancache APIs. But I've left that idea for later.) Other stuff fixed in passing: * Allow estimate_expression_value() to drop CoerceToDomain unconditionally, effectively assuming that the coercion will succeed. This will improve planner selectivity estimates for cases involving estimatable expressions that are coerced to domains. We could have done this independently of everything else here, but there wasn't previously any need for eval_const_expressions_mutator to know about CoerceToDomain at all. * Use a dlist for plancache.c's list of cached plans, rather than a manually threaded singly-linked list. That eliminates a potential performance problem in DropCachedPlan. * Fix a couple of inconsistencies in typecmds.c about whether operations on domains drop RowExclusiveLock on pg_type. Our common practice is that DDL operations do drop catalog locks, so standardize on that choice. Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
* Remove some unnecessary fields from Plan trees.Tom Lane2018-10-07
| | | | | | | | | | | | | | | | | | | | In the wake of commit f2343653f, we no longer need some fields that were used before to control executor lock acquisitions: * PlannedStmt.nonleafResultRelations can go away entirely. * partitioned_rels can go away from Append, MergeAppend, and ModifyTable. However, ModifyTable still needs to know the RT index of the partition root table if any, which was formerly kept in the first entry of that list. Add a new field "rootRelation" to remember that. rootRelation is partly redundant with nominalRelation, in that if it's set it will have the same value as nominalRelation. However, the latter field has a different purpose so it seems best to keep them distinct. Amit Langote, reviewed by David Rowley and Jesper Pedersen, and whacked around a bit more by me Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
* Create an RTE field to record the query's lock mode for each relation.Tom Lane2018-09-30
| | | | | | | | | | | | | | | | | | | | | | | | Add RangeTblEntry.rellockmode, which records the appropriate lock mode for each RTE_RELATION rangetable entry (either AccessShareLock, RowShareLock, or RowExclusiveLock depending on the RTE's role in the query). This patch creates the field and makes all creators of RTE nodes fill it in reasonably, but for the moment nothing much is done with it. The plan is to replace assorted post-parser logic that re-determines the right lockmode to use with simple uses of rte->rellockmode. For now, just add Asserts in each of those places that the rellockmode matches what they are computing today. (In some cases the match isn't perfect, so the Asserts are weaker than you might expect; but this seems OK, as per discussion.) This passes check-world for me, but it seems worth pushing in this state to see if the buildfarm finds any problems in cases I failed to test. catversion bump due to change of stored rules. Amit Langote, reviewed by David Rowley and Jesper Pedersen, and whacked around a bit more by me Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp
* Order active window clauses for greater reuse of Sort nodes.Andrew Gierth2018-09-14
| | | | | | | | | | By sorting the active window list lexicographically by the sort clause list but putting longer clauses before shorter prefixes, we generate more chances to elide Sort nodes when building the path. Author: Daniel Gustafsson (with some editorialization by me) Reviewed-by: Alexander Kuzmenkov, Masahiko Sawada, Tom Lane Discussion: https://postgr.es/m/124A7F69-84CD-435B-BA0E-2695BE21E5C2%40yesql.se
* Don't allow LIMIT/OFFSET clause within sub-selects to be pushed to workers.Amit Kapila2018-09-14
| | | | | | | | | | | | | | Allowing sub-select containing LIMIT/OFFSET in workers can lead to inconsistent results at the top-level as there is no guarantee that the row order will be fully deterministic. The fix is to prohibit pushing LIMIT/OFFSET within sub-selects to workers. Reported-by: Andrew Fletcher Bug: 15324 Author: Amit Kapila Reviewed-by: Dilip Kumar Backpatch-through: 9.6 Discussion: https://postgr.es/m/153417684333.10284.11356259990921828616@wrigleys.postgresql.org
* Fix wrong order of operations in inheritance_planner.Tom Lane2018-08-11
| | | | | | | | | | | | | | | | When considering a partitioning parent rel, we should stop processing that subroot as soon as we've done adjust_appendrel_attrs and any securityQuals updates. The rest of this is unnecessary, and indeed adding duplicate subquery RTEs to the subroot is *wrong*. As the code stood, the children of that partition ended up with two sets of copied subquery RTEs, confusing matters greatly. Even more hilarity ensued if all of the children got excluded by constraint exclusion, so that the extra RTEs didn't make it back into the parent rtable. Per fuzz testing by Andreas Seltenreich. Back-patch to v11 where this got broken (by commit 0a480502b, it looks like). Discussion: https://postgr.es/m/87va8g7vq0.fsf@ansel.ydns.eu
* Fix run-time partition pruning for appends with multiple source rels.Tom Lane2018-08-01
| | | | | | | | | | | | | | | | | | | | | | The previous coding here supposed that if run-time partitioning applied to a particular Append/MergeAppend plan, then all child plans of that node must be members of a single partitioning hierarchy. This is totally wrong, since an Append could be formed from a UNION ALL: we could have multiple hierarchies sharing the same Append, or child plans that aren't part of any hierarchy. To fix, restructure the related plan-time and execution-time data structures so that we can have a separate list or array for each partitioning hierarchy. Also track subplans that are not part of any hierarchy, and make sure they don't get pruned. Per reports from Phil Florent and others. Back-patch to v11, since the bug originated there. David Rowley, with a lot of cosmetic adjustments by me; thanks also to Amit Langote for review. Discussion: https://postgr.es/m/HE1PR03MB17068BB27404C90B5B788BCABA7B0@HE1PR03MB1706.eurprd03.prod.outlook.com
* Fix misc typos, mostly in comments.Heikki Linnakangas2018-07-18
| | | | | | | | A collection of typos I happened to spot while reading code, as well as grepping for common mistakes. Backpatch to all supported versions, as applicable, to avoid conflicts when backporting other commits in the future.
* Fix bugs with degenerate window ORDER BY clauses in GROUPS/RANGE mode.Tom Lane2018-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nodeWindowAgg.c failed to cope with the possibility that no ordering columns are defined in the window frame for GROUPS mode or RANGE OFFSET mode, leading to assertion failures or odd errors, as reported by Masahiko Sawada and Lukas Eder. In RANGE OFFSET mode, an ordering column is really required, so add an Assert about that. In GROUPS mode, the code would work, except that the node initialization code wasn't in sync with the execution code about when to set up tuplestore read pointers and spare slots. Fix the latter for consistency's sake (even though I think the changes described below make the out-of-sync cases unreachable for now). Per SQL spec, a single ordering column is required for RANGE OFFSET mode, and at least one ordering column is required for GROUPS mode. The parser enforced the former but not the latter; add a check for that. We were able to reach the no-ordering-column cases even with fully spec compliant queries, though, because the planner would drop partitioning and ordering columns from the generated plan if they were redundant with earlier columns according to the redundant-pathkey logic, for instance "PARTITION BY x ORDER BY y" in the presence of a "WHERE x=y" qual. While in principle that's an optimization that could save some pointless comparisons at runtime, it seems unlikely to be meaningful in the real world. I think this behavior was not so much an intentional optimization as a side-effect of an ancient decision to construct the plan node's ordering-column info by reverse-engineering the PathKeys of the input path. If we give up redundant-column removal then it takes very little code to generate the plan node info directly from the WindowClause, ensuring that we have the expected number of ordering columns in all cases. (If anyone does complain about this, the planner could perhaps be taught to remove redundant columns only when it's safe to do so, ie *not* in RANGE OFFSET mode. But I doubt anyone ever will.) With these changes, the WindowAggPath.winpathkeys field is not used for anything anymore, so remove it. The test cases added here are not actually very interesting given the removal of the redundant-column-removal logic, but they would represent important corner cases if anyone ever tries to put that back. Tom Lane and Masahiko Sawada. Back-patch to v11 where RANGE OFFSET and GROUPS modes were added. Discussion: https://postgr.es/m/CAD21AoDrWqycq-w_+Bx1cjc+YUhZ11XTj9rfxNiNDojjBx8Fjw@mail.gmail.com Discussion: https://postgr.es/m/153086788677.17476.8002640580496698831@wrigleys.postgresql.org
* Remove dynamic_shared_memory_type=nonePeter Eisentraut2018-07-10
| | | | | | | | | PostgreSQL nowadays offers some kind of dynamic shared memory feature on all supported platforms. Having the choice of "none" prevents us from relying on DSM in core features. So this patch removes the choice of "none". Author: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
* Allow direct lookups of AppendRelInfo by child relidAlvaro Herrera2018-06-26
| | | | | | | | | | | | | | | | | | | | | find_appinfos_by_relids had quite a large overhead when the number of items in the append_rel_list was high, as it had to trawl through the append_rel_list looking for AppendRelInfos belonging to the given childrelids. Since there can only be a single AppendRelInfo for each child rel, it seems much better to store an array in PlannerInfo which indexes these by child relid, making the function O(1) rather than O(N). This function was only called once inside the planner, so just replace that call with a lookup to the new array. find_childrel_appendrelinfo is now unused and thus removed. This fixes a planner performance regression new to v11 reported by Thomas Reiss. Author: David Rowley Reported-by: Thomas Reiss Reviewed-by: Ashutosh Bapat Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/94dd7a4b-5e50-0712-911d-2278e055c622@dalibo.com
* Avoid generating bogus paths with partitionwise aggregate.Robert Haas2018-06-22
| | | | | | | | | | | Previously, if some or all partitions had no partially aggregated path, we would still try to generate a partially aggregated path for the parent, leading to assertion failures or wrong answers. Report by Rajkumar Raghuwanshi. Patch by Jeevan Chalke, reviewed by Ashutosh Bapat. A few changes by me. Discussion: http://postgr.es/m/CAKcux6=q4+Mw8gOOX16ef6ZMFp9Cve7KWFstUsrDa4GiFaXGUQ@mail.gmail.com
* Post-feature-freeze pgindent run.Tom Lane2018-04-26
| | | | Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us
* Add GUC enable_partition_pruningAlvaro Herrera2018-04-23
| | | | | | | | | | | | | | | | | | | | | | This controls both plan-time and execution-time new-style partition pruning. While finer-grain control is possible (maybe using an enum GUC instead of boolean), there doesn't seem to be much need for that. This new parameter controls partition pruning for all queries: trivially, SELECT queries that affect partitioned tables are naturally under its control since they are using the new technology. However, while UPDATE/DELETE queries do not use the new code, we make the new GUC control their behavior also (stealing control from constraint_exclusion), because it is more natural, and it leads to a more natural transition to the future in which those queries will also use the new pruning code. Constraint exclusion still controls pruning for regular inheritance situations (those not involving partitioned tables). Author: David Rowley Review: Amit Langote, Ashutosh Bapat, Justin Pryzby, David G. Johnston Discussion: https://postgr.es/m/CAKJS1f_0HwsxJG9m+nzU+CizxSdGtfe6iF_ykPYBiYft302DCw@mail.gmail.com
* Revert MERGE patchSimon Riggs2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commits d204ef63776b8a00ca220adec23979091564e465, 83454e3c2b28141c0db01c7d2027e01040df5249 and a few more commits thereafter (complete list at the end) related to MERGE feature. While the feature was fully functional, with sufficient test coverage and necessary documentation, it was felt that some parts of the executor and parse-analyzer can use a different design and it wasn't possible to do that in the available time. So it was decided to revert the patch for PG11 and retry again in the future. Thanks again to all reviewers and bug reporters. List of commits reverted, in reverse chronological order: f1464c5380 Improve parse representation for MERGE ddb4158579 MERGE syntax diagram correction 530e69e59b Allow cpluspluscheck to pass by renaming variable 01b88b4df5 MERGE minor errata 3af7b2b0d4 MERGE fix variable warning in non-assert builds a5d86181ec MERGE INSERT allows only one VALUES clause 4b2d44031f MERGE post-commit review 4923550c20 Tab completion for MERGE aa3faa3c7a WITH support in MERGE 83454e3c2b New files for MERGE d204ef6377 MERGE SQL Command following SQL:2016 Author: Pavan Deolasee Reviewed-by: Michael Paquier
* Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.Tom Lane2018-04-08
| | | | | | | | | | | | | Traditionally, include/catalog/pg_foo.h contains extern declarations for functions in backend/catalog/pg_foo.c, in addition to its function as the authoritative definition of the pg_foo catalog's rowtype. In some cases, we'd been forced to split out those extern declarations into separate pg_foo_fn.h headers so that the catalog definitions could be #include'd by frontend code. That problem is gone as of commit 9c0a0de4c, so let's undo the splits to make things less confusing. Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us
* Support partition pruning at execution timeAlvaro Herrera2018-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Existing partition pruning is only able to work at plan time, for query quals that appear in the parsed query. This is good but limiting, as there can be parameters that appear later that can be usefully used to further prune partitions. This commit adds support for pruning subnodes of Append which cannot possibly contain any matching tuples, during execution, by evaluating Params to determine the minimum set of subnodes that can possibly match. We support more than just simple Params in WHERE clauses. Support additionally includes: 1. Parameterized Nested Loop Joins: The parameter from the outer side of the join can be used to determine the minimum set of inner side partitions to scan. 2. Initplans: Once an initplan has been executed we can then determine which partitions match the value from the initplan. Partition pruning is performed in two ways. When Params external to the plan are found to match the partition key we attempt to prune away unneeded Append subplans during the initialization of the executor. This allows us to bypass the initialization of non-matching subplans meaning they won't appear in the EXPLAIN or EXPLAIN ANALYZE output. For parameters whose value is only known during the actual execution then the pruning of these subplans must wait. Subplans which are eliminated during this stage of pruning are still visible in the EXPLAIN output. In order to determine if pruning has actually taken place, the EXPLAIN ANALYZE must be viewed. If a certain Append subplan was never executed due to the elimination of the partition then the execution timing area will state "(never executed)". Whereas, if, for example in the case of parameterized nested loops, the number of loops stated in the EXPLAIN ANALYZE output for certain subplans may appear lower than others due to the subplan having been scanned fewer times. This is due to the list of matching subnodes having to be evaluated whenever a parameter which was found to match the partition key changes. This commit required some additional infrastructure that permits the building of a data structure which is able to perform the translation of the matching partition IDs, as returned by get_matching_partitions, into the list index of a subpaths list, as exist in node types such as Append, MergeAppend and ModifyTable. This allows us to translate a list of clauses into a Bitmapset of all the subpath indexes which must be included to satisfy the clause list. Author: David Rowley, based on an earlier effort by Beena Emerson Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi, Jesper Pedersen Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com
* Faster partition pruningAlvaro Herrera2018-04-06
| | | | | | | | | | | | | | | | | | | | | Add a new module backend/partitioning/partprune.c, implementing a more sophisticated algorithm for partition pruning. The new module uses each partition's "boundinfo" for pruning instead of constraint exclusion, based on an idea proposed by Robert Haas of a "pruning program": a list of steps generated from the query quals which are run iteratively to obtain a list of partitions that must be scanned in order to satisfy those quals. At present, this targets planner-time partition pruning, but there exist further patches to apply partition pruning at execution time as well. This commit also moves some definitions from include/catalog/partition.h to a new file include/partitioning/partbounds.h, in an attempt to rationalize partitioning related code. Authors: Amit Langote, David Rowley, Dilip Kumar Reviewers: Robert Haas, Kyotaro Horiguchi, Ashutosh Bapat, Jesper Pedersen. Discussion: https://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp
* MERGE SQL Command following SQL:2016Simon Riggs2018-04-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows a task that would other require multiple PL statements. e.g. MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular and partitioned tables, including column and row security enforcement, as well as support for row, statement and transition triggers. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used statically from PL/pgSQL. MERGE does not yet support inheritance, write rules, RETURNING clauses, updatable views or foreign tables. MERGE follows SQL Standard per the most recent SQL:2016. Includes full tests and documentation, including full isolation tests to demonstrate the concurrent behavior. This version written from scratch in 2017 by Simon Riggs, using docs and tests originally written in 2009. Later work from Pavan Deolasee has been both complex and deep, leaving the lead author credit now in his hands. Extensive discussion of concurrency from Peter Geoghegan, with thanks for the time and effort contributed. Various issues reported via sqlsmith by Andreas Seltenreich Authors: Pavan Deolasee, Simon Riggs Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
* Revert "Modified files for MERGE"Simon Riggs2018-04-02
| | | | This reverts commit 354f13855e6381d288dfaa52bcd4f2cb0fd4a5eb.
* Modified files for MERGESimon Riggs2018-04-02
|
* postgres_fdw: Push down partition-wise aggregation.Robert Haas2018-04-02
| | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 7012b132d07c2b4ea15b0b3cb1ea9f3278801d98, postgres_fdw has been able to push down the toplevel aggregation operation to the remote server. Commit e2f1eb0ee30d144628ab523432320f174a2c8966 made it possible to break down the toplevel aggregation into one aggregate per partition. This commit lets postgres_fdw push down aggregation in that case just as it does at the top level. In order to make this work, this commit adds an additional argument to the GetForeignUpperPaths FDW API. A matching argument is added to the signature for create_upper_paths_hook. Third-party code using either of these will need to be updated. Also adjust create_foreignscan_plan() so that it picks up the correct set of relids in this case. Jeevan Chalke, reviewed by Ashutosh Bapat and by me and with some adjustments by me. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, and Rafia Sabih. Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com Discussion: http://postgr.es/m/CAM2+6=XPWujjmj5zUaBTGDoB38CemwcPmjkRy0qOcsQj_V+2sQ@mail.gmail.com
* Fix a boatload of typos in C comments.Tom Lane2018-04-01
| | | | | | Justin Pryzby Discussion: https://postgr.es/m/20180331105640.GK28454@telsasoft.com
* Don't call IS_DUMMY_REL() when cheapest_total_path might be junk.Robert Haas2018-03-30
| | | | | | | | | | | Unlike the previous coding, this might result in a Gather per Append subplan when the target list is parallel-restricted, but such a plan is probably worth considering in that case, since a single Gather on top of the entire Append is impossible. Per Andres Freund and the buildfarm. Discussion: http://postgr.es/m/20180330050351.bmxx4cdtz67czjda@alap3.anarazel.de
* Remove 'target' from GroupPathExtraData.Robert Haas2018-03-29
| | | | | | | | It's not needed. Jeevan Chalke Discussion: http://postgr.es/m/CAM2+6=XPWujjmj5zUaBTGDoB38CemwcPmjkRy0qOcsQj_V+2sQ@mail.gmail.com
* Rewrite the code that applies scan/join targets to paths.Robert Haas2018-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If the toplevel scan/join target list is parallel-safe, postpone generating Gather (or Gather Merge) paths until after the toplevel has been adjusted to return it. This (correctly) makes queries with expensive functions in the target list more likely to choose a parallel plan, since the cost of the plan now reflects the fact that the evaluation will happen in the workers rather than the leader. The original complaint about this problem was from Jeff Janes. If the toplevel scan/join relation is partitioned, recursively apply the changes to all partitions. This sometimes allows us to get rid of Result nodes, because Append is not projection-capable but its children may be. It also cleans up what appears to be incorrect SRF handling from commit e2f1eb0ee30d144628ab523432320f174a2c8966: the old code had no knowledge of SRFs for child scan/join rels. Because we now use create_projection_path() in some cases where we formerly used apply_projection_to_path(), this changes the ordering of columns in some queries generated by postgres_fdw. Update regression outputs accordingly. Patch by me, reviewed by Amit Kapila and by Ashutosh Bapat. Other fixes for this problem (substantially different from this version) were reviewed by Dilip Kumar, Amit Khandekar, and Marina Polyakova. Discussion: http://postgr.es/m/CAMkU=1ycXNipvhWuweUVpKuyu6SpNjF=yHWu4c4US5JgVGxtZQ@mail.gmail.com
* Postpone generate_gather_paths for topmost scan/join rel.Robert Haas2018-03-29
| | | | | | | | | | | | | | | Don't call generate_gather_paths for the topmost scan/join relation when it is initially populated with paths. Instead, do the work in grouping_planner. By itself, this gains nothing; in fact it loses slightly because we end up calling set_cheapest() for the topmost scan/join rel twice rather than once. However, it paves the way for a future commit which will postpone generate_gather_paths for the topmost scan/join relation even further, allowing more accurate costing of parallel paths. Amit Kapila and Robert Haas. Earlier versions of this patch (which different substantially) were reviewed by Dilip Kumar, Amit Khandekar, Marina Polyakova, and Ashutosh Bapat.
* Add inlining support to LLVM JIT provider.Andres Freund2018-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This provides infrastructure to allow JITed code to inline code implemented in C. This e.g. can be postgres internal functions or extension code. This already speeds up long running queries, by allowing the LLVM optimizer to optimize across function boundaries. The optimization potential currently doesn't reach its full potential because LLVM cannot optimize the FunctionCallInfoData argument fully away, because it's allocated on the heap rather than the stack. Fixing that is beyond what's realistic for v11. To be able to do that, use CLANG to convert C code to LLVM bitcode, and have LLVM build a summary for it. That bitcode can then be used to to inline functions at runtime. For that the bitcode needs to be installed. Postgres bitcode goes into $pkglibdir/bitcode/postgres, extensions go into equivalent directories. PGXS has been modified so that happens automatically if postgres has been compiled with LLVM support. Currently this isn't the fastest inline implementation, modules are reloaded from disk during inlining. That's to work around an apparent LLVM bug, triggering an apparently spurious error in LLVM assertion enabled builds. Once that is resolved we can remove the superfluous read from disk. Docs will follow in a later commit containing docs for the whole JIT feature. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de