path: root/src/backend/executor
* Transaction control in PL procedures (Peter Eisentraut, 2018-01-22)

  In each of the supplied procedural languages (PL/pgSQL, PL/Perl, PL/Python, PL/Tcl), add language-specific commit and rollback functions/commands to control transactions in procedures in that language. Add similar underlying functions to SPI. Some additional cleanup so that transaction commit or abort doesn't blow away data structures still used by the procedure call. Add execution context tracking to CALL and DO statements so that transaction control commands can only be issued in top-level procedure and block calls, not function calls or other procedure or block calls.

  - SPI

    Add a new function SPI_connect_ext() that is like SPI_connect() but allows passing option flags. The only option flag right now is SPI_OPT_NONATOMIC. A nonatomic SPI connection can execute transaction control commands, otherwise it's not allowed. This is meant to be passed down from CALL and DO statements which themselves know in which context they are called.

    A nonatomic SPI connection uses different memory management. A normal SPI connection allocates its memory in TopTransactionContext. For nonatomic connections we use PortalContext instead. As the comment in SPI_connect_ext() (previously SPI_connect()) indicates, one could potentially use PortalContext in all cases, but it seems safest to leave the existing uses alone, because this stuff is complicated enough already.

    SPI also gets new functions SPI_start_transaction(), SPI_commit(), and SPI_rollback(), which can be used by PLs to implement their transaction control logic.

  - portalmem.c

    Some adjustments were made in the code that cleans up portals at transaction abort. The portal code could already handle a command *committing* a transaction and continuing (e.g., VACUUM), but it was not quite prepared for a command *aborting* a transaction and continuing.

    In AtAbort_Portals(), remove the code that marks an active portal as failed. As the comment there already predicted, this doesn't work if the running command wants to keep running after transaction abort. And it's actually not necessary, because pquery.c is careful to run all portal code in a PG_TRY block and explicitly runs MarkPortalFailed() if there is an exception. So the code in AtAbort_Portals() is never used anyway.

    In AtAbort_Portals() and AtCleanup_Portals(), we need to be careful not to clean up active portals too much. This mirrors similar code in PreCommit_Portals().

  - PL/Perl

    Gets new functions spi_commit() and spi_rollback()

  - PL/pgSQL

    Gets new commands COMMIT and ROLLBACK. Update the PL/SQL porting example in the documentation to reflect that transactions are now possible in procedures.

  - PL/Python

    Gets new functions plpy.commit and plpy.rollback.

  - PL/Tcl

    Gets new commands commit and rollback.

  Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
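  A hedged sketch of how a PL's CALL handler might use the new SPI entry points described above; the wrapper function and the statement it runs are illustrative, not code from the commit:

      #include "postgres.h"
      #include "executor/spi.h"

      static void
      my_pl_exec(const char *sql, bool nonatomic)   /* hypothetical helper */
      {
          /* CALL and DO pass down whether we run in a nonatomic context. */
          SPI_connect_ext(nonatomic ? SPI_OPT_NONATOMIC : 0);

          SPI_execute(sql, false, 0);

          if (nonatomic)
          {
              /* Only a nonatomic connection may control transactions. */
              SPI_commit();
              SPI_start_transaction();   /* begin a fresh transaction */
          }

          SPI_finish();
      }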
* Allow UPDATE to move rows between partitions. (Robert Haas, 2018-01-19)

  When an UPDATE causes a row to no longer match the partition constraint, try to move it to a different partition where it does match the partition constraint. In essence, the UPDATE is split into a DELETE from the old partition and an INSERT into the new one. This can lead to surprising behavior in concurrency scenarios, because EvalPlanQual rechecks won't work as they otherwise would; the known problems are documented. (There is a pending patch to improve the situation further, but it needs more review.)

  Amit Khandekar, reviewed and tested by Amit Langote, David Rowley, Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro Herrera, Amit Kapila, and me. A few final revisions by me.

  Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
* Replace AclObjectKind with ObjectType (Peter Eisentraut, 2018-01-19)

  AclObjectKind was basically just another enumeration for object types, and we already have a preferred one for that. It's only used in aclcheck_error. By using ObjectType instead, we can also give some more precise error messages, for example "index" instead of "relation".

  Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
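  A hedged sketch of what a call site looks like after the change; the ownership check wrapping it is illustrative, not taken from the commit:

      #include "postgres.h"
      #include "miscadmin.h"        /* GetUserId() */
      #include "utils/acl.h"        /* aclcheck_error(), ownercheck helpers */
      #include "utils/lsyscache.h"  /* get_rel_name() */

      static void
      check_index_ownership(Oid indexOid)
      {
          /* Formerly aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS, ...);
           * with ObjectType the message can say "index", not just "relation". */
          if (!pg_class_ownercheck(indexOid, GetUserId()))
              aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_INDEX,
                             get_rel_name(indexOid));
      }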
* Remove useless lookup of root partitioned rel in ExecInitModifyTable(). (Tom Lane, 2018-01-17)

  node->partitioned_rels is only set in UPDATE/DELETE cases, but ExecInitModifyTable only uses its "rel" variable in INSERT cases, so the extra logic to find the root rel is just a waste of complexity and cycles.

  Etsuro Fujita, reviewed by Amit Langote

  Discussion: https://postgr.es/m/93cf9816-2f7d-0f67-8ed2-4a4e497a6ab8@lab.ntt.co.jp
* Revert "Move portal pinning from PL/pgSQL to SPI"Peter Eisentraut2018-01-10
| | | | | | | This reverts commit b3617cdfbba1b5381e9d1c6bc0839500e8eb7273. This broke returning unnamed cursors from PL/pgSQL functions. Apparently, there are no test cases for this.
* Move portal pinning from PL/pgSQL to SPI (Peter Eisentraut, 2018-01-10)

  PL/pgSQL "pins" internally generated (unnamed) portals so that user code cannot close them by guessing their names. This logic is also useful in other languages and really for any code. So move that logic into SPI.

  An unnamed portal obtained through SPI_cursor_open() and related functions is now automatically pinned, and SPI_cursor_close() automatically unpins a portal that is pinned.

  In the core distribution, this affects PL/Perl and PL/Python, preventing users from manually closing cursors created by spi_query and plpy.cursor, respectively. (PL/Tcl does not currently offer any cursor functionality.)

  Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
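  A minimal sketch of the SPI-side behavior described above, assuming it is wired into SPI_cursor_open()'s internals and SPI_cursor_close(); the helper names and control flow are invented for illustration:

      #include "postgres.h"
      #include "utils/portal.h"

      static void
      spi_portal_opened(Portal portal, bool internally_generated)
      {
          /* Pin internally generated portals so user code cannot close
           * them by guessing their names. */
          if (internally_generated)
              PinPortal(portal);
      }

      static void
      spi_portal_close(Portal portal)
      {
          /* Unpin first, so a pinned portal can still be dropped
           * through the SPI path itself. */
          if (portal->portalPinned)
              UnpinPortal(portal);
          PortalDrop(portal, false);
      }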
* Expression evaluation based aggregate transition invocation. (Andres Freund, 2018-01-09)

  Previously aggregate transition and combination functions were invoked by special case code in nodeAgg.c, evaluating input and filters separately using the expression evaluation machinery. That turns out to not be great for performance for several reasons:

  - repeated expression evaluations have some cost
  - the transition function invocations are poorly predicted, as commonly there are multiple aggregates in a query, resulting in the same call-stack invoking different functions
  - filter and input computation had to be done separately
  - the special case code made it hard to implement JITing of the whole transition function invocation

  Address this by building one large expression that computes input, evaluates filters, and invokes transition functions.

  This leads to moderate speedups in queries bottlenecked by aggregate computations, and enables large speedups for similar cases once JITing is done.

  There's potential for further improvement:

  - It'd be nice if we could simplify the somewhat expensive aggstate->all_pergroups lookups.
  - right now there's still an advance_transition_function invocation in nodeAgg.c, leading to some code duplication.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
* Remove PortalGetQueryDesc() (Peter Eisentraut, 2018-01-09)

  After having gotten rid of PortalGetHeapMemory(), there seems little reason to keep one Portal access macro around that offers no actual abstraction and isn't consistently used anyway.

  Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
  Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
* Update portal-related memory context names and API (Peter Eisentraut, 2018-01-09)

  Rename PortalMemory to TopPortalContext, to avoid confusion with PortalContext and align naming with similar top-level memory contexts.

  Rename PortalData's "heap" field to portalContext. The "heap" naming seems quite antiquated and confusing. Also get rid of the PortalGetHeapMemory() macro and access the field directly, which we do for other portal fields, so this abstraction doesn't buy anything.

  Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
  Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
* Factor error generation out of ExecPartitionCheck. (Robert Haas, 2018-01-05)

  At present, we always raise an ERROR if the partition constraint is violated, but a pending patch for UPDATE tuple routing will consider instead moving the tuple to the correct partition. Refactor to make that simpler.

  Amit Khandekar, reviewed by Amit Langote, David Rowley, and me.

  Discussion: http://postgr.es/m/CAJ3gD9cue54GbEzfV-61nyGpijvjZgCcghvLsB0_nL8Nm8HzCA@mail.gmail.com
* Simplify and encapsulate tuple routing support code. (Robert Haas, 2018-01-04)

  Instead of having ExecSetupPartitionTupleRouting return multiple out parameters, have it return a pointer to a structure containing all of those different things. Also, provide and use a cleanup function, ExecCleanupTupleRouting, instead of cleaning up all of the resources allocated by ExecSetupPartitionTupleRouting individually.

  Amit Khandekar, reviewed by Amit Langote, David Rowley, and me

  Discussion: http://postgr.es/m/CAJ3gD9fWfxgKC+PfJZF3hkgAcNOy-LpfPxVYitDEXKHjeieWQQ@mail.gmail.com
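  A hedged before/after sketch of the calling convention this describes; the struct name follows the message, but the exact parameter lists are assumptions and may not match the commit precisely:

      #include "postgres.h"
      #include "executor/execPartition.h"
      #include "nodes/execnodes.h"
      #include "utils/rel.h"

      static void
      route_tuples(ModifyTableState *mtstate, Relation rel)
      {
          /* Before: ExecSetupPartitionTupleRouting filled in several out
           * parameters.  After: one state struct, plus a cleanup call
           * that releases everything the setup allocated. */
          PartitionTupleRouting *proute =
              ExecSetupPartitionTupleRouting(mtstate, rel);

          /* ... find each tuple's leaf partition and insert ... */

          ExecCleanupTupleRouting(proute);
      }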
* Code review for Parallel Append. (Robert Haas, 2018-01-04)

  - Remove unnecessary #include mistakenly added in execnodes.h.
  - Fix mistake in comment in choose_next_subplan_for_leader.
  - Adjust row estimates in cost_append for a possibly-different parallel divisor.
  - Clamp row estimates in cost_append after operations that may not produce integers.

  Amit Kapila, with cosmetic adjustments by me.

  Discussion: http://postgr.es/m/CAA4eK1+qcbeai3coPpRW=GFCzFeLUsuY4T-AKHqMjxpEGZBPQg@mail.gmail.com
* Fix some minor errors in new PHJ code. (Tom Lane, 2018-01-03)

  Correct ExecParallelHashTuplePrealloc's estimate of whether the space_allowed limit is exceeded. Be more consistent about tuples that are exactly HASH_CHUNK_THRESHOLD in size (they're "small", not "large"). Neither of these things explains the current buildfarm unhappiness, but they're still bugs.

  Thomas Munro, per gripe by me

  Discussion: https://postgr.es/m/CAEepm=34PDuR69kfYVhmZPgMdy8pSA-MYbpesEN1SR+2oj3Y+w@mail.gmail.com
* Update copyright for 2018 (Bruce Momjian, 2018-01-02)

  Backpatch-through: certain files through 9.3
* Simplify representation of aggregate transition values a bit. (Andres Freund, 2018-01-02)

  Previously aggregate transition values for hash and other forms of aggregation (i.e. sort and no group by) were represented differently. Hash based aggregation used a grouping set indexed array pointing to an array of transition values, whereas other forms of aggregation used one flattened array with the index being computed out of grouping set and transition offsets.

  That made upcoming changes hard, so represent both as a grouping set indexed array of per-group data.

  As a nice side-effect this also makes aggregation slightly faster, because computing offsets with `transno + (setno * numTrans)` turns out not to be that cheap (too big for x86 lea for example).

  Author: Andres Freund
  Discussion: https://postgr.es/m/20171128003121.nmxbm2ounxzb6n2t@alap3.anarazel.de
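  A standalone toy (not PostgreSQL code) contrasting the two layouts the message describes; all names are invented:

      #include <stdio.h>

      typedef struct TransValue { int v; } TransValue;

      /* Old style: one flattened array; every access pays an offset
       * computation of the form transno + (setno * numTrans). */
      static TransValue *
      flat_lookup(TransValue *all, int setno, int transno, int numTrans)
      {
          return &all[transno + (setno * numTrans)];
      }

      /* New style: an array indexed by grouping set, each element
       * pointing at that set's per-group transition values. */
      static TransValue *
      perset_lookup(TransValue **pergroups, int setno, int transno)
      {
          return &pergroups[setno][transno];
      }

      int main(void)
      {
          TransValue flat[4] = {{0}, {1}, {2}, {3}};
          TransValue *sets[2] = {&flat[0], &flat[2]};

          /* Both lookups reach the same element; both print 3. */
          printf("%d %d\n", flat_lookup(flat, 1, 1, 2)->v,
                 perset_lookup(sets, 1, 1)->v);
          return 0;
      }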
* Ensure proper alignment of tuples in HashMemoryChunkData buffers. (Tom Lane, 2018-01-02)

  The previous coding relied (without any documentation) on the data[] member of HashMemoryChunkData being at a MAXALIGN'ed offset. If it was not, the tuples would not be maxaligned either, leading to failures on alignment-picky machines. While there seems to be no live bug on any platform we support, this is clearly pretty fragile: any addition to or rearrangement of the fields in HashMemoryChunkData could break it. Let's remove the hazard by getting rid of the data[] member and instead using pointer arithmetic with an explicitly maxalign'ed offset.

  Discussion: https://postgr.es/m/14483.1514938129@sss.pgh.pa.us
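  A minimal standalone sketch of the technique, with invented names; the commit's actual macro and struct layout may differ:

      #include <stdio.h>

      /* Round LEN up to a multiple of the strictest alignment, mirroring
       * postgres' MAXALIGN macro (alignment value assumed here). */
      #define MY_MAXIMUM_ALIGNOF  8
      #define MY_MAXALIGN(LEN) \
          (((size_t) (LEN) + (MY_MAXIMUM_ALIGNOF - 1)) & \
           ~((size_t) (MY_MAXIMUM_ALIGNOF - 1)))

      typedef struct Chunk
      {
          struct Chunk *next;
          size_t        maxlen;
          size_t        used;
          /* no data[] member: tuples start at an explicitly aligned offset */
      } Chunk;

      /* Tuple space begins at a maxaligned offset past the header, so
       * tuples stay aligned however the header fields are rearranged. */
      #define CHUNK_HEADER_SIZE  MY_MAXALIGN(sizeof(Chunk))
      #define CHUNK_DATA(c)      (((char *) (c)) + CHUNK_HEADER_SIZE)

      int main(void)
      {
          Chunk c;
          char *data = CHUNK_DATA(&c);

          printf("header %zu, data offset %zu\n",
                 sizeof(Chunk), (size_t) (data - (char *) &c));
          return 0;
      }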
* Fix EXPLAIN ANALYZE output for Parallel Hash. (Andres Freund, 2018-01-01)

  In a race case, EXPLAIN ANALYZE could fail to display correct nbatch and size information. Refactor so that participants report only on batches they worked on rather than trying to report on all of them, and teach explain.c to consider the HashInstrumentation object from all participants instead of picking the first one it can find. This should fix an occasional build farm failure in the "join" regression test.

  Author: Thomas Munro
  Reviewed-By: Andres Freund
  Discussion: https://postgr.es/m/30219.1514428346%40sss.pgh.pa.us
* Perform slot validity checks in a separate pass over expression. (Andres Freund, 2017-12-29)

  This reduces code duplication a bit, but the primary benefit is that it makes JITing expression evaluation easier. When doing so we can't, as previously done in the interpreted case, really change the opcode without recompiling. Nor do we want to just carry around unnecessary branches to avoid re-checking over and over.

  As a minor side-effect this makes ExecEvalStepOp() O(log(N)) rather than O(N).

  Author: Andres Freund
  Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
* Rely on executor utils to build targetlist for DML RETURNING. (Andres Freund, 2017-12-29)

  This is useful because it gets rid of the sole direct user of ExecAssignResultType(). A future commit will likely make use of that and combine creating the targetlist with the initialization of the result slot. But it seems like good code hygiene anyway.

  Author: Andres Freund
  Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
* Fix rare assertion failure in parallel hash join. (Andres Freund, 2017-12-28)

  When a backend runs out of inner tuples to hash, it should detach from grow_batch_barrier only after it has flushed all batches to disk and merged counters, not before. Otherwise a concurrent backend in ExecParallelHashIncreaseNumBatches() could stop waiting for this backend and try to read tuples before they have been written. This commit reorders those operations and should fix the assertion failures seen occasionally on the build farm since commit 1804284042e659e7d16904e7bbb0ad546394b6a3.

  Author: Thomas Munro
  Discussion: https://postgr.es/m/E1eRwXy-0004IK-TO%40gemulon.postgresql.org
* Fix assert with side effects in the new PHJ code. (Andres Freund, 2017-12-24)

  Instead of testing a condition, the assert actually assigned the value it was supposed to test, so it always passed and changed state as a side effect.

  Per coverity.
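  A generic illustration of this bug class (invented code, not the commit's):

      #include <assert.h>

      void
      check_batch(int nbatch)
      {
          /* Buggy: "=" assigns 4 and the assertion trivially passes;
           * worse, in builds that compile assertions out, the assignment
           * vanishes too, so behavior differs between build types. */
          assert(nbatch = 4);

          /* Intended: compare, don't assign. */
          assert(nbatch == 4);
      }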
* Rearrange execution of PARAM_EXTERN Params for plpgsql's benefit. (Tom Lane, 2017-12-21)

  This patch does three interrelated things:

  * Create a new expression execution step type EEOP_PARAM_CALLBACK and add the infrastructure needed for add-on modules to generate that. As discussed, the best control mechanism for that seems to be to add another hook function to ParamListInfo, which will be called by ExecInitExpr if it's supplied and a PARAM_EXTERN Param is found. For stand-alone expressions, we add a new entry point to allow the ParamListInfo to be specified directly, since it can't be retrieved from the parent plan node's EState.

  * Redesign the API for the ParamListInfo paramFetch hook so that the ParamExternData array can be entirely virtual. This also lets us get rid of ParamListInfo.paramMask, instead leaving it to the paramFetch hook to decide which param IDs should be accessible or not. plpgsql_param_fetch was already doing the identical masking check, so having callers do it too seemed redundant. While I was at it, I added a "speculative" flag to paramFetch that the planner can specify as TRUE to avoid unwanted failures. This solves an ancient problem for plpgsql that it couldn't provide values of non-DTYPE_VAR variables to the planner for fear of triggering premature "record not assigned yet" or "field not found" errors during planning.

  * Rework plpgsql to get rid of the need for "unshared" parameter lists, by dint of turning the single ParamListInfo per estate into a nearly read-only data structure that doesn't instantiate any per-variable data. Instead, the paramFetch hook controls access to per-variable data and can make the right decisions on the fly, replacing the cases that we used to need multiple ParamListInfos for. This might perhaps have been a performance loss on its own, but by using a paramCompile hook we can bypass plpgsql_param_fetch entirely during normal query execution. (It's now only called when, eg, we copy the ParamListInfo into a cursor portal. copyParamList() or SerializeParamList() effectively instantiate the virtual parameter array as a simple physical array without a paramFetch hook, which is what we want in those cases.)

  This allows reverting most of commit 6c82d8d1f, though I kept the cosmetic code-consolidation aspects of that (eg the assign_simple_var function).

  Performance testing shows this to be at worst a break-even change, and it can provide wins ranging up to 20% in test cases involving accesses to fields of "record" variables. The fact that values of such variables can now be exposed to the planner might produce wins in some situations, too, but I've not pursued that angle.

  In passing, remove the "parent" pointer from the arguments to ExecInitExprRec and related functions, instead storing that pointer in a transient field in ExprState. The ParamListInfo pointer for a stand-alone expression is handled the same way; we'd otherwise have had to add yet another recursively-passed-down argument in expression compilation.

  Discussion: https://postgr.es/m/32589.1513706441@sss.pgh.pa.us
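  A hedged sketch of the redesigned fetch hook as described above; the exact signature and the body are assumptions reconstructed from the message, not copied from the commit:

      #include "postgres.h"
      #include "catalog/pg_type.h"   /* INT4OID */
      #include "nodes/params.h"

      /* With a virtual parameter array, the hook materializes one
       * parameter on demand into caller-supplied workspace; when
       * "speculative" is true (planner probing), it can decline quietly
       * instead of raising errors like "record not assigned yet". */
      static ParamExternData *
      my_param_fetch(ParamListInfo params, int paramid,
                     bool speculative, ParamExternData *workspace)
      {
          if (paramid <= 0 || paramid > params->numParams)
          {
              if (speculative)
                  return NULL;    /* planner probe: decline quietly */
              elog(ERROR, "no value found for parameter %d", paramid);
          }

          /* illustrative: fetch from language-private state */
          workspace->value = Int32GetDatum(42);
          workspace->isnull = false;
          workspace->pflags = PARAM_FLAG_CONST;
          workspace->ptype = INT4OID;
          return workspace;
      }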
* Add parallel-aware hash joins. (Andres Freund, 2017-12-21)

  Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel Hash Join with Parallel Hash.

  While hash joins could already appear in parallel queries, they were previously always parallel-oblivious and had a partial subplan only on the outer side, meaning that the work of the inner subplan was duplicated in every worker.

  After this commit, the planner will consider using a partial subplan on the inner side too, using the Parallel Hash node to divide the work over the available CPU cores and combine its results in shared memory. If the join needs to be split into multiple batches in order to respect work_mem, then workers process different batches as much as possible and then work together on the remaining batches.

  The advantages of a parallel-aware hash join over a parallel-oblivious hash join used in a parallel query are that it:

  * avoids wasting memory on duplicated hash tables
  * avoids wasting disk space on duplicated batch files
  * divides the work of building the hash table over the CPUs

  One disadvantage is that there is some communication between the participating CPUs which might outweigh the benefits of parallelism in the case of small hash tables. This is avoided by the planner's existing reluctance to supply partial plans for small scans, but it may be necessary to estimate synchronization costs in future if that situation changes. Another is that outer batch 0 must be written to disk if multiple batches are required.

  A potential future advantage of parallel-aware hash joins is that right and full outer joins could be supported, since there is a single set of matched bits for each hashtable, but that is not yet implemented.

  A new GUC enable_parallel_hash is defined to control the feature, defaulting to on.

  Author: Thomas Munro
  Reviewed-By: Andres Freund, Robert Haas
  Tested-By: Rafia Sabih, Prabhat Sahu
  Discussion:
      https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
      https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com
* When passing query strings to workers, pass the terminating \0. (Robert Haas, 2017-12-20)

  Otherwise, when the query string is read, we might read trailing garbage beyond the end, unless there happens to be a \0 there by good luck.

  Report and patch by Thomas Munro. Reviewed by Rafia Sabih.

  Discussion: http://postgr.es/m/CAEepm=2SJs7X+_vx8QoDu8d1SMEOxtLhxxLNzZun_BvNkuNhrw@mail.gmail.com
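  A generic illustration of the fix pattern (invented names, not the commit's code): include the NUL terminator in both the allocation and the copy, so readers can treat the shared buffer as a C string.

      #include <string.h>

      /* stand-in for the shared-memory allocator used to pass state */
      extern char *shm_alloc(size_t size);

      char *
      publish_query_string(const char *query_string)
      {
          size_t len = strlen(query_string) + 1;   /* + 1 keeps the '\0' */
          char  *dest = shm_alloc(len);

          memcpy(dest, query_string, len);         /* copies the '\0' too */
          return dest;
      }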
* Try again to fix accumulation of parallel worker instrumentation. (Robert Haas, 2017-12-19)

  When a Gather or Gather Merge node is started and stopped multiple times, accumulate instrumentation data only once, at the end, instead of after each execution, to avoid recording inflated totals. Commit 778e78ae9fa51e58f41cbdc72b293291d02d8984, the previous attempt at a fix, instead reset the state after every execution, which worked for the general instrumentation data but had problems for the additional instrumentation specific to Sort and Hash nodes.

  Report by hubert depesz lubaczewski. Analysis and fix by Amit Kapila, following a design proposal from Thomas Munro, with a comment tweak by me.

  Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com
* Fix crashes on plans with multiple Gather (Merge) nodes. (Robert Haas, 2017-12-18)

  es_query_dsa turns out to be broken by design, because it supposes that there is only one DSA for the whole query, whereas there is actually one per Gather (Merge) node. For now, work around that problem by setting and clearing the pointer around the sections of code that might need it. It's probably a better idea to get rid of es_query_dsa altogether in favor of having each node keep track individually of which DSA is relevant, but that seems like more than we would want to back-patch.

  Thomas Munro, reviewed and tested by Andreas Seltenreich, Amit Kapila, and by me.

  Discussion: http://postgr.es/m/CAEepm=1U6as=brnVvMNixEV2tpi8NuyQoTmO8Qef0-VV+=7MDA@mail.gmail.com
* Allow executor nodes to change their ExecProcNode function. (Andres Freund, 2017-12-13)

  In order for executor nodes to be able to change their ExecProcNode function after ExecInitNode() has finished, provide ExecSetExecProcNode(). This allows any wrapper functions that only execProcnode.c knows about to be reinstalled.

  The motivation for wanting to change ExecProcNode after ExecInitNode() has finished is that it is not known until later whether parallel query is available, so if a parallel variant is to be installed then ExecInitNode() is too soon to decide.

  Author: Thomas Munro
  Reviewed-By: Andres Freund
  Discussion: https://postgr.es/m/CAEepm=09rr65VN+cAV5FgyM_z=D77Xy8Fuc9CDDDYbq3pQUezg@mail.gmail.com
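  A hedged usage sketch; ExecSetExecProcNode() is the function the commit adds, while the parallel variant and the surrounding function are invented for illustration:

      #include "postgres.h"
      #include "executor/executor.h"

      /* hypothetical parallel-aware variant of some node's ExecProcNode */
      extern TupleTableSlot *ExecMyNodeParallel(PlanState *pstate);

      static void
      maybe_install_parallel_variant(PlanState *ps, bool use_parallel)
      {
          /* Only after ExecInitNode() do we learn whether parallel query
           * is actually available; if so, swap in the parallel variant.
           * Going through ExecSetExecProcNode() lets execProcnode.c
           * re-install any wrappers that only it knows about. */
          if (use_parallel)
              ExecSetExecProcNode(ps, ExecMyNodeParallel);
      }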
* Revert "Fix accumulation of parallel worker instrumentation."Robert Haas2017-12-13
| | | | | | | This reverts commit 2c09a5c12a66087218c7f8cba269cd3de51b9b82. Per further discussion, that doesn't seem to be the best possible fix. Discussion: http://postgr.es/m/CAA4eK1LW2aFKzY3=vwvc=t-juzPPVWP2uT1bpx_MeyEqnM+p8g@mail.gmail.com
* Fix comment (Peter Eisentraut, 2017-12-11)

  Reported-by: Noah Misch <noah@leadboat.com>
* Fix corner-case coredump in _SPI_error_callback(). (Tom Lane, 2017-12-11)

  I noticed that _SPI_execute_plan initially sets spierrcontext.arg = NULL, and only fills it in some time later. If an error were to happen in between, _SPI_error_callback would try to dereference the null pointer. This is unlikely --- there's not much between those points except push-snapshot calls --- but it's clearly not impossible. Tweak the callback to do nothing if the pointer isn't set yet.

  It's been like this for awhile, so back-patch to all supported branches.
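  A minimal sketch of the defensive pattern described, with the callback body simplified and not copied from the commit:

      #include "postgres.h"

      static void
      my_error_callback(void *arg)
      {
          const char *query = (const char *) arg;

          /* arg may not have been filled in yet if the error fired early */
          if (query == NULL)
              return;

          errcontext("SQL statement \"%s\"", query);
      }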
* Fix Parallel Append crash. (Robert Haas, 2017-12-06)

  Reported by Tom Lane and the buildfarm.

  Amul Sul and Amit Khandekar

  Discussion: http://postgr.es/m/17868.1512519318@sss.pgh.pa.us
  Discussion: http://postgr.es/m/CAJ3gD9cJQ4d-XhmZ6BqM9rMM2KDBfpkdgOAb4+psz56uBuMQ_A@mail.gmail.com
* Support Parallel Append plan nodes. (Robert Haas, 2017-12-05)

  When we create an Append node, we can spread out the workers over the subplans instead of piling on to each subplan one at a time, which should typically be a bit more efficient, both because the startup cost of any plan executed entirely by one worker is paid only once and also because of reduced contention. We can also construct Append plans using a mix of partial and non-partial subplans, which may allow for parallelism in places that otherwise couldn't support it.

  Unfortunately, this patch doesn't handle the important case of parallelizing UNION ALL by running each branch in a separate worker; the executor infrastructure is added here, but more planner work is needed.

  Amit Khandekar, Robert Haas, Amul Sul, reviewed and tested by Ashutosh Bapat, Amit Langote, Rafia Sabih, Amit Kapila, and Rajkumar Raghuwanshi.

  Discussion: http://postgr.es/m/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com
* Fix accumulation of parallel worker instrumentation. (Robert Haas, 2017-12-05)

  When a Gather or Gather Merge node is started and stopped multiple times, the old code wouldn't reset the shared state between executions, potentially resulting in dramatically inflated instrumentation data for nodes beneath it. (The per-worker instrumentation ended up OK, I think, but the overall totals were inflated.)

  Report by hubert depesz lubaczewski. Analysis and fix by Amit Kapila, reviewed and tweaked a bit by me.

  Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com
* Fix EXPLAIN ANALYZE of hash join when the leader doesn't participate. (Andres Freund, 2017-12-05)

  If a hash join appears in a parallel query, there may be no hash table available for explain.c to inspect even though a hash table may have been built in other processes. This could happen either because parallel_leader_participation was set to off or because the leader happened to hit the end of the outer relation immediately (even though the complete relation is not empty) and decided not to build the hash table.

  Commit bf11e7ee introduced a way for workers to exchange instrumentation via the DSM segment for Sort nodes even though they are not parallel-aware. This commit does the same for Hash nodes, so that explain.c has a way to find instrumentation data from an arbitrary participant that actually built the hash table.

  Author: Thomas Munro
  Reviewed-By: Andres Freund
  Discussion: https://postgr.es/m/CAEepm%3D3DUQC2-z252N55eOcZBer6DPdM%3DFzrxH9dZc5vYLsjaA%40mail.gmail.com
* Remove memory leak protection from Gather and Gather Merge nodes. (Robert Haas, 2017-12-04)

  Before commit 6b65a7fe62e129d5c2b85cd74d6a91d8f7564608, tqueue.c could perform tuple remapping and thus leak memory, which is why commit af33039317ddc4a0e38a02e2255c2bf453115fd2 made TupleQueueReaderNext run in a short-lived context. Now, however, tqueue.c has been reduced to a shadow of its former self, and there shouldn't be any chance of leaks any more. Accordingly, remove some tuple copying and memory context manipulation to speed up processing.

  Patch by me, reviewed by Amit Kapila. Some testing by Rafia Sabih.

  Discussion: http://postgr.es/m/CAA4eK1LSDydwrNjmYSNkfJ3ZivGSWH9SVswh6QpNzsMdj_oOQA@mail.gmail.com
* Re-allow INSERT .. ON CONFLICT DO NOTHING on partitioned tables. (Robert Haas, 2017-12-01)

  Commit 8355a011a0124bdf7ccbada206a967d427039553 was reverted in f05230752d53c4aa74cffa9b699983bbb6bcb118, but this attempt is hopefully better-considered: we now pass the correct value to ExecOpenIndices, which should avoid the crash that we hit before.

  Amit Langote, reviewed by Simon Riggs and by me. Some final editing by me.

  Discussion: http://postgr.es/m/7ff1e8ec-dc39-96b1-7f47-ff5965dceeac@lab.ntt.co.jp
* Fix uninitialized memory reference. (Robert Haas, 2017-12-01)

  Without this, when partdesc->nparts == 0, we end up calling ExecBuildSlotPartitionKeyDescription without initializing values and isnull.

  Reported by Coverity via Michael Paquier. Patch by Michael Paquier, reviewed and revised by Amit Langote.

  Discussion: http://postgr.es/m/CAB7nPqQ3mwkdMoPY-ocgTpPnjd8TKOadMxdTtMLvEzF8480Zfg@mail.gmail.com
* SQL procedures (Peter Eisentraut, 2017-11-30)

  This adds a new object type "procedure" that is similar to a function but does not have a return type and is invoked by the new CALL statement instead of SELECT or similar. This implementation is aligned with the SQL standard and compatible with or similar to other SQL implementations.

  This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well as ALTER/DROP ROUTINE that can refer to either a function or a procedure (or an aggregate function, as an extension to SQL). There is also support for procedures in various utility commands such as COMMENT and GRANT, as well as support in pg_dump and psql. Support for defining procedures is available in all the languages supplied by the core distribution.

  While this commit is mainly syntax sugar around existing functionality, future features will rely on having procedures as a separate object type.

  Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
* Update typedefs.list and re-run pgindent (Robert Haas, 2017-11-29)

  Discussion: http://postgr.es/m/CA+TgmoaA9=1RWKtBWpDaj+sF3Stgc8sHgf5z=KGtbjwPLQVDMA@mail.gmail.com
* Fix wrong function name in comment. (Robert Haas, 2017-11-28)

  Rushabh Lathia

  Discussion: http://postgr.es/m/CAGPqQf2z5g+7YmGZSZgKoiFsaUB+63Rzmz8-5PQHuS6hd14FEg@mail.gmail.com
* Teach bitmap heap scan to cope with absence of a DSA. (Robert Haas, 2017-11-28)

  If we have a plan that uses parallelism but are unable to execute it using parallelism, for example due to a lack of available DSM segments, then the EState's es_query_dsa will be NULL. Parallel bitmap heap scan needs to fall back to a non-parallel scan in such cases.

  Patch by me, reviewed by Dilip Kumar

  Discussion: http://postgr.es/m/CAEepm=0kADK5inNf_KuemjX=HQ=PuTP0DykM--fO5jS5ePVFEA@mail.gmail.com
* Fix creation of resjunk tlist entries for inherited mixed UPDATE/DELETE. (Tom Lane, 2017-11-27)

  rewriteTargetListUD's processing is dependent on the relkind of the query's target table. That was fine at the time it was made to act that way, even for queries on inheritance trees, because all tables in an inheritance tree would necessarily be plain tables. However, the 9.5 feature addition allowing some members of an inheritance tree to be foreign tables broke the assumption that rewriteTargetListUD's output tlist could be applied to all child tables with nothing more than column-number mapping. This led to visible failures if foreign child tables had row-level triggers, and would also break in cases where child tables belonged to FDWs that used methods other than CTID for row identification.

  To fix, delay running rewriteTargetListUD until after the planner has expanded inheritance, so that it is applied separately to the (already mapped) tlist for each child table. We can conveniently call it from preprocess_targetlist. Refactor associated code slightly to avoid the need to heap_open the target relation multiple times during preprocess_targetlist. (The APIs remain a bit ugly, particularly around the point of which steps scribble on parse->targetList and which don't. But avoiding such scribbling would require a change in FDW callback APIs, which is more pain than it's worth.)

  Also fix ExecModifyTable to ensure that "tupleid" is reset to NULL when we transition from rows providing a CTID to rows that don't. (That's really an independent bug, but it manifests in much the same cases.)

  Add a regression test checking one manifestation of this problem, which was that row-level triggers on a foreign child table did not work right.

  Back-patch to 9.5 where the problem was introduced.

  Etsuro Fujita, reviewed by Ildus Kurbangaliev and Ashutosh Bapat

  Discussion: https://postgr.es/m/20170514150525.0346ba72@postgrespro.ru
* Repair failure with SubPlans in multi-row VALUES lists. (Tom Lane, 2017-11-25)

  When nodeValuesscan.c was written, it was impossible to have a SubPlan in VALUES --- any sub-SELECT there would have to be uncorrelated and thereby would produce an InitPlan instead. We therefore took a shortcut in the logic that throws away a ValuesScan's per-row expression evaluation data structures. This was broken by the introduction of LATERAL however; a sub-SELECT containing a lateral reference produces a correlated SubPlan.

  The cleanest fix for this would be to give up the optimization of discarding the expression eval state. But that still seems pretty unappetizing for long VALUES lists. It seems to work to just prevent the subexpressions from hooking into the ValuesScan node's subPlan list, so let's do that and see how well it works. (If this breaks, due to additional connections between the subexpressions and the outer query structures, we might consider compromises like throwing away data only for VALUES rows not containing SubPlans.)

  Per bug #14924 from Christian Duta. Back-patch to 9.3 where LATERAL was introduced.

  Discussion: https://postgr.es/m/20171124120836.1463.5310@wrigleys.postgresql.org
* Avoid projecting tuples unnecessarily in Gather and Gather Merge. (Robert Haas, 2017-11-25)

  It's most often the case that the target list for the Gather (Merge) node matches the target list supplied by the underlying plan node; when this is so, we can avoid the overhead of projecting. This depends on commit f455e1125e2588d4cd4fc663c6a10da4e003a3b5 for proper functioning.

  Idea by Andres Freund. Patch by me. Review by Amit Kapila.

  Discussion: http://postgr.es/m/CA+TgmoZ0ZL=cesZFq8c9NnfK6bqy-wwUd3_74iYGodYrSoQ7Fw@mail.gmail.com
* Fix handling of NULLs returned by aggregate combine functions. (Andres Freund, 2017-11-23)

  When strict aggregate combine functions, used in multi-stage/parallel aggregation, returned NULL, we didn't check for that, invoking the combine function with NULL the next round, despite it being strict. The equivalent code invoking normal transition functions has a check for that situation, which did not get copied in a7de3dc5c346. Fix the bug by adding the equivalent check.

  Based on a quick look I could not find any strict combine functions in core actually returning NULL, and it doesn't seem very likely external users have done so. So this isn't likely to have caused issues in practice.

  Add tests verifying the handling of transition / combine functions that return NULL.

  Reported-By: Andres Freund
  Author: Andres Freund
  Discussion: https://postgr.es/m/20171121033642.7xvmjqrl4jdaaat3@alap3.anarazel.de
  Backpatch: 9.6, where parallel aggregation was introduced
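  A simplified paraphrase of the missing check; the state struct and surrounding nodeAgg.c logic are heavily abridged, so this is a sketch of the idea rather than the commit's code:

      #include "postgres.h"
      #include "fmgr.h"

      typedef struct PerGroupState
      {
          Datum   transValue;
          bool    transValueIsNull;
      } PerGroupState;

      static void
      combine_one(FunctionCallInfo fcinfo, PerGroupState *pergroupstate)
      {
          /* The caller's strict path consults transValueIsNull before
           * invoking, so a strict combine function never sees NULL. */
          Datum newVal = FunctionCallInvoke(fcinfo);

          /* The fix: notice when the combine function itself returned
           * NULL, so that strictness check fires on the next round. */
          pergroupstate->transValueIsNull = fcinfo->isnull;
          if (!fcinfo->isnull)
              pergroupstate->transValue = newVal;
      }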
* Tweak use of ExecContextForcesOids by Gather (Merge). (Robert Haas, 2017-11-20)

  Specifically, pass the outer plan's PlanState instead of our own PlanState. At present, ExecContextForcesOids doesn't actually care which PlanState we pass; it just looks through to the underlying EState to find the result relation or top-level eflags. However, in the future it might care. If that happens, and if our goal is to get a tuple descriptor that matches that of the outer plan, then I think what we care about is whether the outer plan's context forces OIDs, rather than whether our own context forces OIDs, just as we use the outer node's target list rather than our own.

  Patch by me, reviewed by Amit Kapila.

  Discussion: http://postgr.es/m/CA+TgmoZ0ZL=cesZFq8c9NnfK6bqy-wwUd3_74iYGodYrSoQ7Fw@mail.gmail.com
* Pass eflags down to parallel workers. (Robert Haas, 2017-11-20)

  Currently, there are no known consequences of this oversight, so no back-patch. Several of the EXEC_FLAG_* constants aren't usable in parallel mode anyway, and potential problems related to the presence or absence of OIDs (see EXEC_FLAG_WITH_OIDS, EXEC_FLAG_WITHOUT_OIDS) seem at present to be masked by the unconditional projection step performed by Gather and Gather Merge. In general, however, it seems important that all participants agree on the values of these flags, which modify executor behavior globally, and a pending patch to skip projection in Gather (Merge) would be outright broken in certain cases without this fix.

  Patch by me, based on investigation of a test case provided by Amit Kapila. This patch was also reviewed by Amit Kapila.

  Discussion: http://postgr.es/m/CA+TgmoZ0ZL=cesZFq8c9NnfK6bqy-wwUd3_74iYGodYrSoQ7Fw@mail.gmail.com
* Provide DSM segment to ExecXXXInitializeWorker functions. (Andres Freund, 2017-11-16)

  Previously, executor nodes running in parallel worker processes didn't have access to the dsm_segment object used for parallel execution. In order to support resource management based on DSM segment lifetime, they need that. So create a ParallelWorkerContext object to hold it and pass it to all InitializeWorker functions.

  Author: Thomas Munro
  Reviewed-By: Andres Freund
  Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
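  A hedged sketch of the context object this describes; the field list is an assumption based on the message (a handle to the DSM segment plus the shared-memory table of contents that workers already used):

      #include "postgres.h"
      #include "storage/dsm.h"
      #include "storage/shm_toc.h"

      typedef struct ParallelWorkerContext
      {
          dsm_segment *seg;   /* lets nodes tie resources to segment lifetime */
          shm_toc     *toc;   /* lookup of per-node shared state */
      } ParallelWorkerContext;

      /* Worker-side initialization functions then receive the context
       * instead of a bare toc; signature illustrative only:
       *   void ExecSomeNodeInitializeWorker(SomeNodeState *node,
       *                                     ParallelWorkerContext *pwcxt);
       */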
* Pass InitPlan values to workers via Gather (Merge). (Robert Haas, 2017-11-16)

  If a PARAM_EXEC parameter is used below a Gather (Merge) but the InitPlan that computes it is attached to or above the Gather (Merge), force the value to be computed before starting parallelism and pass it down to all workers. This allows us to use parallelism in cases where it previously would have had to be rejected as unsafe. We do - in this case - lose the optimization that the value is only computed if it's actually used. An alternative strategy would be to have the first worker that needs the value compute it, but one downside of that approach is that we'd then need to select a parallel-safe path to compute the parameter value; it couldn't for example contain a Gather (Merge) node. At some point in the future, we might want to consider both approaches.

  Independent of that consideration, there is a great deal more work that could be done to make more kinds of PARAM_EXEC parameters parallel-safe. This infrastructure could be used to allow a Gather (Merge) on the inner side of a nested loop (although that's not a very appealing plan) and cases where the InitPlan is attached below the Gather (Merge) could be addressed as well using various techniques. But this is a good start.

  Amit Kapila, reviewed and revised by me. Reviewing and testing from Kuntal Ghosh, Haribabu Kommi, and Tushar Ahuja.

  Discussion: http://postgr.es/m/CAA4eK1LV0Y1AUV4cUCdC+sYOx0Z0-8NAJ2Pd9=UKsbQ5Sr7+JQ@mail.gmail.com
* Centralize executor-related partitioning code. (Robert Haas, 2017-11-15)

  Some code is moved from partition.c, which has grown very quickly lately; splitting the executor parts out might help to keep it from getting totally out of control. Other code is moved from execMain.c. All is moved to a new file execPartition.c. get_partition_for_tuple now has a new interface that more clearly separates executor concerns from generic concerns.

  Amit Langote. A slight comment tweak by me.

  Discussion: http://postgr.es/m/1f0985f8-3b61-8bc4-4350-baa6d804cb6d@lab.ntt.co.jp