path: root/src/backend/executor/execMain.c
Commit message (Author, Age)
...
* Fix partitioning crashes during error reporting. (Robert Haas, 2017-07-24)
  In various places where we reverse-map a tuple before calling ExecBuildSlotValueDescription, we neglected to ensure that the slot descriptor matched the tuple stored in it.
  Amit Langote and Amit Khandekar, reviewed by Etsuro Fujita
  Discussion: http://postgr.es/m/CAJ3gD9cqpP=WvJj=dv1ONkPWjy8ZuUaOM4_x86i3uQPas=0_jg@mail.gmail.com
* Reverse-convert row types in ExecWithCheckOptions. (Robert Haas, 2017-07-17)
  Just as we already do in ExecConstraints, and for the same reason: to improve the quality of error messages.
  Etsuro Fujita, reviewed by Amit Langote
  Discussion: http://postgr.es/m/56e0baa8-e458-2bbb-7936-367f7d832e43@lab.ntt.co.jp
* Use a real RT index when setting up partition tuple routing. (Robert Haas, 2017-07-17)
  Before, we always used a dummy value of 1, but that's not right when the partitioned table being modified is inside of a WITH clause rather than part of the main query.
  Amit Langote, reported and reviewed by Etsuro Fujita, with a comment change by me.
  Discussion: http://postgr.es/m/ee12f648-8907-77b5-afc0-2980bcb0aa37@lab.ntt.co.jp
* Fix transition tables for partition/inheritance. (Andrew Gierth, 2017-06-28)
  We disallow row-level triggers with transition tables on child tables. Transition tables for triggers on the parent table contain only those columns present in the parent. (We can't mix tuple formats in a single transition table.)
  Patch by Thomas Munro
  Discussion: https://postgr.es/m/CA%2BTgmoZzTBBAsEUh4MazAN7ga%3D8SsMC-Knp-6cetts9yNZUCcg%40mail.gmail.com
* Phase 3 of pgindent updates. (Tom Lane, 2017-06-21)
  Don't move parenthesized lines to the left, even if that means they flow past the right margin.
  By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit.
  This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren.
  This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects.
  Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
  Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Phase 2 of pgindent updates. (Tom Lane, 2017-06-21)
  Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule.
  Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after.
  Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent.
  This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects.
  Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
  Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Fix typo in comment. (Heikki Linnakangas, 2017-06-21)
  Etsuro Fujita
* Re-run pgindent. (Tom Lane, 2017-06-13)
  This is just to have a clean base state for testing of Piotr Stefaniak's latest version of FreeBSD indent. I fixed up a couple of places where pgindent would have changed format not-nicely. perltidy not included.
  Discussion: https://postgr.es/m/VI1PR03MB119959F4B65F000CA7CD9F6BF2CC0@VI1PR03MB1199.eurprd03.prod.outlook.com
* Prevent BEFORE triggers from violating partitioning constraints. (Robert Haas, 2017-06-07)
  Since tuple-routing implicitly checks the partitioning constraints at least for the levels of the partitioning hierarchy it traverses, there's normally no need to revalidate the partitioning constraint after performing tuple routing. However, if there's a BEFORE trigger on the target partition, it could modify the tuple, causing the partitioning constraint to be violated. Catch that case.
  Also, instead of checking the root table's partition constraint after tuple-routing, check it beforehand. Otherwise, the rules for when the partitioning constraint gets checked get too complicated, because you sometimes have to check part of the constraint but not all of it. This effectively reverts commit 39162b2030fb0a35a6bb28dc636b5a71b8df8d1c in favor of a different approach altogether.
  Report by me. Initial debugging by Jeevan Ladhe. Patch by Amit Langote, reviewed by me.
  Discussion: http://postgr.es/m/CA+Tgmoa9DTgeVOqopieV8d1QRpddmP65aCdxyjdYDoEO5pS5KA@mail.gmail.com
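To make the BEFORE-trigger case concrete, here is a minimal Python sketch of the idea: route first, run the trigger, then recheck the target partition's constraint only when a trigger ran. It is a toy model, not the executor's C code; the bounds, `route`, `satisfies_partition_constraint`, and `insert` are hypothetical names invented for this illustration.

```python
from bisect import bisect_right

# Hypothetical toy model: three integer range partitions.
# Partition i accepts LOWER_BOUNDS[i] <= key < next lower bound (or UPPER_LIMIT).
LOWER_BOUNDS = [0, 100, 200]
UPPER_LIMIT = 300


def route(key):
    """Pick the partition for key; routing itself validates the range."""
    if not (LOWER_BOUNDS[0] <= key < UPPER_LIMIT):
        raise ValueError(f"no partition found for key {key}")
    return bisect_right(LOWER_BOUNDS, key) - 1


def satisfies_partition_constraint(part, key):
    """Check the implicit constraint of one partition."""
    if part + 1 < len(LOWER_BOUNDS):
        upper = LOWER_BOUNDS[part + 1]
    else:
        upper = UPPER_LIMIT
    return LOWER_BOUNDS[part] <= key < upper


def insert(row, before_trigger=None):
    part = route(row["key"])
    if before_trigger is not None:
        # The trigger may rewrite the row after routing has already happened,
        # so the constraint has to be rechecked in this case only.
        row = before_trigger(dict(row))
        if not satisfies_partition_constraint(part, row["key"]):
            raise ValueError("new row violates partition constraint")
    return part, row
```

Without a trigger the recheck is skipped, which mirrors the "normally no need to revalidate" remark in the message.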
* Post-PG 10 beta1 pgindent run (Bruce Momjian, 2017-05-17)
  perltidy run not included.
* Fix relcache leak when row triggers on partitions are fired by COPY. (Robert Haas, 2017-05-16)
  Thomas Munro, reviewed by Amit Langote
  Discussion: http://postgr.es/m/CAEepm=15Jss-yhFApuKzxcoCuFnb8TR8iQiWMjG=CLYPx48QLw@mail.gmail.com
* Fix typos in comments. (Robert Haas, 2017-05-09)
  Etsuro Fujita
  Discussion: http://postgr.es/m/968d99bf-0fa8-085b-f0a1-a379f8d661ff@lab.ntt.co.jp
* Fire per-statement triggers on partitioned tables. (Robert Haas, 2017-05-01)
  Even though no actual tuples are ever inserted into a partitioned table (the actual tuples are in the partitions, not the partitioned table itself), we still need to have a ResultRelInfo for the partitioned table, or per-statement triggers won't get fired.
  Amit Langote, per a report from Rajkumar Raghuwanshi. Reviewed by me.
  Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com
* Fix reporting of violations in ExecConstraints, again. (Robert Haas, 2017-04-10)
  We decided in f1b4c771ea74f42447dccaed42ffcdcccf3aa694 to pass the original slot to ExecConstraints(), but that breaks when there are BEFORE ROW triggers involved. So we need to reverse-map the tuples back to the original descriptor instead, as Amit originally proposed.
  Amit Langote, reviewed by Ashutosh Bapat. One overlooked comment fixed by me.
  Discussion: http://postgr.es/m/b3a17254-6849-e542-2353-bde4e880b6a4@lab.ntt.co.jp
* Add infrastructure to support EphemeralNamedRelation references. (Kevin Grittner, 2017-03-31)
  A QueryEnvironment concept is added, which allows new types of objects to be passed into queries from parsing on through execution. At this point, the only thing implemented is a collection of EphemeralNamedRelation objects -- relations which can be referenced by name in queries, but do not exist in the catalogs. The only type of ENR implemented is NamedTuplestore, but provision is made to add more types fairly easily.
  An ENR can carry its own TupleDesc or reference a relation in the catalogs by relid.
  Although these features can be used without SPI, convenience functions are added to SPI so that ENRs can easily be used by code run through SPI.
  The initial use of all this is going to be transition tables in AFTER triggers, but that will be added to each PL as a separate commit.
  An incidental effect of this patch is to produce a more informative error message if an attempt is made to modify the contents of a CTE from a referencing DML statement. No tests previously covered that possibility, so one is added.
  Kevin Grittner and Thomas Munro
  Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro with valuable comments and suggestions from many others
* Faster expression evaluation and targetlist projection. (Andres Freund, 2017-03-25)
  This replaces the old, recursive tree-walk based evaluation, with non-recursive, opcode dispatch based, expression evaluation. Projection is now implemented as part of expression evaluation.
  This both leads to significant performance improvements, and makes future just-in-time compilation of expressions easier.
  The speed gains primarily come from:
  - non-recursive implementation reduces stack usage / overhead
  - simple sub-expressions are implemented with a single jump, without function calls
  - sharing some state between different sub-expressions
  - reduced amount of indirect/hard to predict memory accesses by laying out operation metadata sequentially; including the avoidance of nearly all of the previously used linked lists
  - more code has been moved to expression initialization, avoiding constant re-checks at evaluation time
  Future just-in-time compilation (JIT) has become easier, as demonstrated by released patches intended to be merged in a later release, for primarily two reasons: Firstly, due to a stricter split between expression initialization and evaluation, less code has to be handled by the JIT. Secondly, due to the non-recursive nature of the generated "instructions", less performance-critical code-paths can easily be shared between interpreted and compiled evaluation.
  The new framework allows for significant future optimizations. E.g.:
  - basic infrastructure to later reduce the per executor-startup overhead of expression evaluation, by caching state in prepared statements. That'd be helpful in OLTPish scenarios where initialization overhead is measurable.
  - optimizing the generated "code". A number of proposals for potential work has already been made.
  - optimizing the interpreter. Similarly a number of proposals have been made here too.
  The move of logic into the expression initialization step leads to some backward-incompatible changes:
  - Function permission checks are now done during expression initialization, whereas previously they were done during execution. In edge cases this can lead to errors being raised that previously wouldn't have been, e.g. a NULL array being coerced to a different array type previously didn't perform checks.
  - The set of domain constraints to be checked is now evaluated once during expression initialization; previously it was re-built every time a domain check was evaluated. For normal queries this doesn't change much, but e.g. for plpgsql functions, which cache ExprStates, the old set could stick around longer. The behavior around this might still change.
  Author: Andres Freund, with significant changes by Tom Lane, changes by Heikki Linnakangas
  Reviewed-By: Tom Lane, Heikki Linnakangas
  Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
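The contrast between a recursive tree walk and a flat, precompiled step list can be sketched in a few lines of Python. This is only an illustration of the general technique: the node shapes, the step tuples, and the explicit value stack are assumptions of this sketch, not the layout PostgreSQL uses in its C implementation.

```python
import operator

# Hypothetical expression tree shapes:
#   ("const", value) | ("var", column_name) | ("op", function, left, right)


def eval_recursive(node, row):
    """Old style: walk the tree on every evaluation."""
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "var":
        return row[node[1]]
    _, fn, left, right = node
    return fn(eval_recursive(left, row), eval_recursive(right, row))


def compile_steps(node, steps=None):
    """Initialization: flatten the tree once, in post-order, into a step list."""
    if steps is None:
        steps = []
    kind = node[0]
    if kind in ("const", "var"):
        steps.append((kind, node[1]))
    else:
        _, fn, left, right = node
        compile_steps(left, steps)
        compile_steps(right, steps)
        steps.append(("op", fn))
    return steps


def eval_steps(steps, row):
    """Evaluation: a flat loop dispatching on the step kind, no recursion."""
    stack = []
    for kind, arg in steps:
        if kind == "const":
            stack.append(arg)
        elif kind == "var":
            stack.append(row[arg])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(arg(left, right))
    return stack.pop()


# a + b * 2
expr = ("op", operator.add, ("var", "a"),
        ("op", operator.mul, ("var", "b"), ("const", 2)))
steps = compile_steps(expr)
```

The recursion happens once in `compile_steps`; every later call to `eval_steps` reuses the same sequential list, which is the split between initialization and evaluation that the message describes.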
* Allow for parallel execution whenever ExecutorRun() is done only once. (Robert Haas, 2017-03-23)
  Previously, it was unsafe to execute a plan in parallel if ExecutorRun() might be called with a non-zero row count. However, it's quite easy to fix things up so that we can support that case, provided that it is known that we will never call ExecutorRun() a second time for the same QueryDesc. Add infrastructure to signal this, and cross-checks to make sure that a caller who claims this is true doesn't later renege.
  While that pattern never happens with queries received directly from a client -- there's no way to know whether multiple Execute messages will be sent unless the first one requests all the rows -- it's pretty common for queries originating from procedural languages, which often limit the result to a single tuple or to a user-specified number of tuples.
  This commit doesn't actually enable parallelism in any additional cases, because currently none of the places that would be able to benefit from this infrastructure pass CURSOR_OPT_PARALLEL_OK in the first place, but it makes it much more palatable to pass CURSOR_OPT_PARALLEL_OK in places where we currently don't, because it eliminates some cases where we'd end up having to run the parallel plan serially.
  Patch by me, based on some ideas from Rafia Sabih and corrected by Rafia Sabih based on feedback from Dilip Kumar and myself.
  Discussion: http://postgr.es/m/CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com
* Don't scan partitioned tables. (Robert Haas, 2017-03-21)
  Partitioned tables do not contain any data; only their unpartitioned descendents need to be scanned. However, the partitioned tables still need to be locked, even though they're not scanned. To make that work, Append and MergeAppend relations now need to carry a list of (unscanned) partitioned relations that must be locked, and InitPlan must lock all partitioned result relations.
  Aside from the obvious advantage of avoiding some work at execution time, this has two other advantages. First, it may improve the planner's decision-making in some cases since the empty relation might throw things off. Second, it paves the way to getting rid of the storage for partitioned tables altogether.
  Amit Langote, reviewed by me.
  Discussion: http://postgr.es/m/6837c359-45c4-8044-34d1-736756335a15@lab.ntt.co.jp
* Improve error reporting for tuple-routing failures. (Robert Haas, 2017-03-03)
  Currently, the whole row is shown without column names. Instead, adopt a style similar to _bt_check_unique() in ExecFindPartition() and show the failing key: (key1, ...) = (val1, ...).
  Amit Langote, per a complaint from Simon Riggs. Reviewed by me; I also adjusted the grammar in one of the comments.
  Discussion: http://postgr.es/m/9f9dc7ae-14f0-4a25-5485-964d9bfc19bd@lab.ntt.co.jp
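The "(key1, ...) = (val1, ...)" style is easy to picture with a small Python helper. This is a sketch of the output shape only; `format_key_detail` is a hypothetical name, and rendering a missing value as `null` is an assumption of this example rather than something the message states.

```python
def format_key_detail(key_names, key_values):
    """Render only the partition key columns, with names, for an error detail."""
    names = ", ".join(key_names)
    values = ", ".join("null" if v is None else str(v) for v in key_values)
    return f"({names}) = ({values})"
```

For a two-column key, `format_key_detail(["a", "b"], [1, None])` yields `(a, b) = (1, null)`, which names the columns and omits the rest of the row.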
* Pass the source text for a parallel query to the workers. (Robert Haas, 2017-02-22)
  With this change, you can see the query that a parallel worker is executing in pg_stat_activity, and if the worker crashes you can see what query it was executing when it crashed.
  Rafia Sabih, reviewed by Kuntal Ghosh and Amit Kapila and slightly revised by me.
* Fix typos in comments. (Heikki Linnakangas, 2017-02-06)
  Backpatch to all supported versions, where applicable, to make backpatching of future fixes go more smoothly.
  Josh Soref
  Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com
* Remove redundant comment. (Robert Haas, 2017-02-03)
  Rafia Sabih
* Set ecxt_scantuple correctly for tuple routing. (Robert Haas, 2017-01-24)
  In 2ac3ef7a01df859c62d0a02333b646d65eaec5ff, we changed things so that it's possible for a different TupleTableSlot to be used for partitioned tables at successively lower levels. If we do end up changing the slot from the original, we must update ecxt_scantuple to point to the new one for the partition key of the tuple to be computed correctly.
  Reported by Rajkumar Raghuwanshi. Patch by Amit Langote.
  Discussion: http://postgr.es/m/CAKcux6%3Dm1qyqB2k6cjniuMMrYXb75O-MB4qGQMu8zg-iGGLjDw%40mail.gmail.com
* Reindent table partitioning code. (Robert Haas, 2017-01-24)
  We've accumulated quite a bit of stuff with which pgindent is not quite happy in this code; clean it up to provide a less-annoying base for future pgindent runs.
* Logical replication (Peter Eisentraut, 2017-01-20)
  - Add PUBLICATION catalogs and DDL
  - Add SUBSCRIPTION catalog and DDL
  - Define logical replication protocol and output plugin
  - Add logical replication workers
  From: Petr Jelinek <petr@2ndquadrant.com>
  Reviewed-by: Steve Singer <steve@ssinger.info>
  Reviewed-by: Andres Freund <andres@anarazel.de>
  Reviewed-by: Erik Rijkers <er@xs4all.nl>
  Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
* Fix RETURNING to work correctly with partition tuple routing. (Robert Haas, 2017-01-19)
  In ExecInsert(), do not switch back to the root partitioned table ResultRelInfo until after we finish ExecProcessReturning(), so that RETURNING projection is done using the partition's descriptor. For the projection to work correctly, we must initialize the same for each leaf partition during ModifyTableState initialization.
  Amit Langote
* Fix failure to enforce partitioning constraint for internal partitions. (Robert Haas, 2017-01-19)
  When a tuple is inserted into a partitioning root, no partition constraints need to be enforced; when it is inserted into a leaf, the parent's partitioning quals need to be enforced. The previous coding got both of those cases right. When a tuple is inserted into an intermediate level of the partitioning hierarchy (i.e. a table which is both a partition itself and in turn partitioned), it must enforce the partitioning qual inherited from its parent. That case got overlooked; repair.
  Amit Langote
* Remove unnecessary arguments from partitioning functions. (Robert Haas, 2017-01-04)
  RelationGetPartitionQual() and generate_partition_qual() are always called with recurse = true, so we don't need an argument for that.
  Extracted by me from a larger patch by Amit Langote.
* Fix reporting of constraint violations for table partitioning. (Robert Haas, 2017-01-04)
  After a tuple is routed to a partition, it has been converted from the root table's row type to the partition's row type. ExecConstraints needs to report the failure using the original tuple and the parent's tuple descriptor rather than the ones for the selected partition.
  Amit Langote
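The row-type conversion mentioned here can be illustrated with a toy column-mapping example in Python. It assumes, for the sake of the sketch, that parent and partition have the same column names in a different physical order; `build_conversion_map` and `convert_tuple` are hypothetical helpers, not PostgreSQL functions.

```python
def build_conversion_map(from_cols, to_cols):
    """For each target column, the index of the same-named source column."""
    position = {name: i for i, name in enumerate(from_cols)}
    return [position[name] for name in to_cols]


def convert_tuple(values, conv_map):
    """Reorder a row according to a precomputed conversion map."""
    return tuple(values[i] for i in conv_map)


root_cols = ["a", "b", "c"]
part_cols = ["c", "a", "b"]  # same columns, different physical order

to_part = build_conversion_map(root_cols, part_cols)
to_root = build_conversion_map(part_cols, root_cols)

root_row = (1, "x", 3.5)
part_row = convert_tuple(root_row, to_part)      # what the partition stores
reported = convert_tuple(part_row, to_root)      # what an error should show
```

Reporting `part_row` against `root_cols` would label the values with the wrong column names, which is why the error path wants the row in the parent's layout.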
* Move partition_tuple_slot out of EState. (Robert Haas, 2017-01-04)
  Commit 2ac3ef7a01df859c62d0a02333b646d65eaec5ff added a TupleTableSlot for partition tuple slot to EState (es_partition_tuple_slot) but it's more logical to have it as part of ModifyTableState (mt_partition_tuple_slot) and CopyState (partition_tuple_slot).
  Discussion: http://postgr.es/m/1bd459d9-4c0c-197a-346e-e5e59e217d97@lab.ntt.co.jp
  Amit Langote, per a gripe from me
* Update copyright via script for 2017 (Bruce Momjian, 2017-01-03)
* Refactor partition tuple routing code to reduce duplication. (Robert Haas, 2016-12-21)
  Amit Langote
* Fix bogus comment. (Robert Haas, 2016-12-08)
  Commit 4212cb73262bbdd164727beffa4c4744b4ead92d rendered a comment in execMain.c incorrect. Per complaint from Tom Lane, repair.
  Patch from Amit Kapila, per wording suggested by Tom Lane and me.
* Implement table partitioning. (Robert Haas, 2016-12-07)
  Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries.
  Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression.
  Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible.
  Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
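The difference between single-column list partitioning and multi-column range partitioning can be sketched as two lookup strategies. The Python below is a toy model under stated assumptions: the partition names, bounds, and the functions `route_list` and `route_range` are invented for illustration and do not reflect how the server stores or searches partition bounds.

```python
from bisect import bisect_right

# List partitioning: one key column, each listed value maps to a partition.
LIST_PARTITIONS = {"eu": "sales_eu", "us": "sales_us"}

# Range partitioning on two columns (year, month). Each entry is an inclusive
# lower bound; the next entry (or RANGE_UPPER_LIMIT) is the exclusive upper bound.
RANGE_LOWER_BOUNDS = [(2016, 1), (2016, 7), (2017, 1)]
RANGE_PARTITIONS = ["m_2016h1", "m_2016h2", "m_2017h1"]
RANGE_UPPER_LIMIT = (2017, 7)


def route_list(region):
    """Exact-match lookup on the single list key."""
    if region not in LIST_PARTITIONS:
        raise ValueError(f"no partition found for key ({region})")
    return LIST_PARTITIONS[region]


def route_range(key):
    """Binary search over bounds; tuples compare column by column."""
    if not (RANGE_LOWER_BOUNDS[0] <= key < RANGE_UPPER_LIMIT):
        raise ValueError(f"no partition found for key {key}")
    return RANGE_PARTITIONS[bisect_right(RANGE_LOWER_BOUNDS, key) - 1]
```

Because each partition's implicit constraint is just its bound pair, finding the partition and checking its constraint are the same comparison, which is what makes automatic routing cheap in this model.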
* Fix interaction of parallel query with prepared statements. (Robert Haas, 2016-12-06)
  Previously, a prepared statement created via a Parse message could get a parallel plan, but one created with a PREPARE statement could not. This state of affairs was due to confusion on my (rhaas) part: I erroneously believed that a CREATE TABLE .. AS EXECUTE statement could only be performed with a prepared statement by PREPARE, but in fact one created by a Prepare message works just as well. Therefore, it makes no sense to allow parallel query in one case but not the other.
  To fix, allow parallel query with all prepared statements, but run the parallel plan serially (i.e. without workers) in the case of CREATE TABLE .. AS EXECUTE. Also, document this.
  Amit Kapila and Tobias Bussman, plus an extra sentence of documentation by me.
* pgindent run for 9.6 (Robert Haas, 2016-06-09)
* Stop the executor if no more tuples can be sent from worker to leader. (Robert Haas, 2016-06-06)
  If a Gather node has read as many tuples as it needs (for example, due to Limit) it may detach the queue connecting it to the worker before reading all of the worker's tuples. Rather than let the worker continue to generate and send all of the results, have it stop after sending the next tuple.
  More could be done here to stop the worker even quicker, but this is about as well as we can hope to do for 9.6.
  This is in response to a problem report from Andreas Seltenreich. Commit 44339b892a04e94bbb472235882dc6f7023bdc65 should actually be sufficient to fix that example even without this change, but it seems better to do this, too, since we might otherwise waste quite a large amount of effort in one or more workers.
  Discussion: CAA4eK1KOKGqmz9bGu+Z42qhRwMbm4R5rfnqsLCNqFs9j14jzEA@mail.gmail.com
  Amit Kapila
* Directly modify foreign tables. (Robert Haas, 2016-03-18)
  postgres_fdw can now send an UPDATE or DELETE statement directly to the foreign server in simple cases, rather than sending a SELECT FOR UPDATE statement and then updating or deleting rows one-by-one.
  Etsuro Fujita, reviewed by Rushabh Lathia, Shigeru Hanada, Kyotaro Horiguchi, Albe Laurenz, Thom Brown, and me.
* Widen query numbers-of-tuples-processed counters to uint64. (Tom Lane, 2016-03-12)
  This patch widens SPI_processed, EState's es_processed field, PortalData's portalPos field, FuncCallContext's call_cntr and max_calls fields, ExecutorRun's count argument, PortalRunFetch's result, and the max number of rows in a SPITupleTable to uint64, and deals with (I hope) all the ensuing fallout. Some of these values were declared uint32 before, and others "long".
  I also removed PortalData's posOverflow field, since that logic seems pretty useless given that portalPos is now always 64 bits.
  The user-visible results are that command tags for SELECT etc will correctly report tuple counts larger than 4G, as will plpgsql's GET DIAGNOSTICS ... ROW_COUNT command. Queries processing more tuples than that are still not exactly the norm, but they're becoming more common.
  Most values associated with FETCH/MOVE distances, such as PortalRun's count argument and the count argument of most SPI functions that have one, remain declared as "long". It's not clear whether it would be worth promoting those to int64; but it would definitely be a large dollop of additional API churn on top of this, and it would only help 32-bit platforms which seem relatively less likely to see any benefit.
  Andreas Scherbaum, reviewed by Christian Ullrich, additional hacking by me
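The "larger than 4G" limit is plain modular arithmetic, shown here in Python with explicit masks since Python integers do not overflow on their own. The helper `add_wrapping` and the sample row count are assumptions made up for this illustration.

```python
UINT32_MAX = 2**32 - 1
UINT64_MAX = 2**64 - 1


def add_wrapping(counter, n, max_value):
    """Add n to an unsigned counter that wraps at max_value + 1."""
    return (counter + n) & max_value


rows = 5_000_000_000  # more than 2**32 = 4_294_967_296 rows

as_uint32 = add_wrapping(0, rows, UINT32_MAX)  # 705_032_704: silently wrong
as_uint64 = add_wrapping(0, rows, UINT64_MAX)  # 5_000_000_000: correct
```

A 32-bit counter reports 705,032,704 rows for this query, because 5,000,000,000 minus 4,294,967,296 is all that survives the wrap; the 64-bit counter reports the true count.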
* Update copyright for 2016 (Bruce Momjian, 2016-01-02)
  Backpatch certain files through 9.1
* Message style improvements (Peter Eisentraut, 2015-10-28)
  Message style, plurals, quoting, spelling, consistency with similar messages
* Rewrite interaction of parallel mode with parallel executor support. (Robert Haas, 2015-10-16)
  In the previous coding, before returning from ExecutorRun, we'd shut down all parallel workers. This was dead wrong if ExecutorRun was called with a non-zero tuple count; it had the effect of truncating the query output. To fix, give ExecutePlan control over whether to enter parallel mode, and have it refuse to do so if the tuple count is non-zero. Rewrite the Gather logic so that it can cope with being called outside parallel mode.
  Commit 7aea8e4f2daa4b39ca9d1309a0c4aadb0f7ed81b is largely to blame for this problem, though this patch modifies some subsequently-committed code which relied on the guarantees it purported to make.
* Add a Gather executor node. (Robert Haas, 2015-09-30)
  A Gather executor node runs any number of copies of a plan in an equal number of workers and merges all of the results into a single tuple stream. It can also run the plan itself, if the workers are unavailable or haven't started up yet. It is intended to work with the Partial Seq Scan node which will be added in future commits.
  It could also be used to implement parallel query of a different sort by itself, without help from Partial Seq Scan, if the single_copy mode is used. In that mode, a worker executes the plan, and the parallel leader does not, merely collecting the worker's results. So, a Gather node could be inserted into a plan to split the execution of that plan across two processes. Nested Gather nodes aren't currently supported, but we might want to add support for that in the future.
  There's nothing in the planner to actually generate Gather nodes yet, so it's not quite time to break out the champagne. But we're getting close.
  Amit Kapila. Some design suggestions were provided by me, and I also reviewed the patch. Single-copy mode, documentation, and other minor changes also by me.
* Determine whether it's safe to attempt a parallel plan for a query. (Robert Haas, 2015-09-16)
  Commit 924bcf4f16d54c55310b28f77686608684734f42 introduced a framework for parallel computation in PostgreSQL that makes most but not all built-in functions safe to execute in parallel mode. In order to have parallel query, we'll need to be able to determine whether that query contains functions (either built-in or user-defined) that cannot be safely executed in parallel mode. This requires those functions to be labeled, so this patch introduces an infrastructure for that. Some functions currently labeled as safe may need to be revised depending on how pending issues related to heavyweight locking under parallelism are resolved.
  Parallel plans can't be used except for the case where the query will run to completion. If portal execution were suspended, the parallel mode restrictions would need to remain in effect during that time, but that might make other queries fail. Therefore, this patch introduces a framework that enables consideration of parallel plans only when it is known that the plan will be run to completion. This probably needs some refinement; for example, at bind time, we do not know whether a query run via the extended protocol will be executed to completion or run with a limited fetch count. Having the client indicate its intentions at bind time would constitute a wire protocol break. Some contexts in which parallel mode would be safe are not adjusted by this patch; the default is not to try parallel plans except from call sites that have been updated to say that such plans are OK.
  This commit doesn't introduce any parallel paths or plans; it just provides a way to determine whether they could potentially be used. I'm committing it on the theory that the remaining parallel sequential scan patches will also get committed to this release, hopefully in the not-too-distant future.
  Robert Haas and Amit Kapila. Reviewed (in earlier versions) by Noah Misch.
* RLS refactoring (Stephen Frost, 2015-09-15)
  This refactors rewrite/rowsecurity.c to simplify the handling of the default deny case (reducing the number of places where we check for and add the default deny policy from three to one) by splitting up the retrieval of the policies from the application of them.
  This also allowed us to do away with the policy_id field. A policy_name field was added for WithCheckOption policies and is used in error reporting, when available.
  Patch by Dean Rasheed, with various mostly cosmetic changes by me.
  Back-patch to 9.5 where RLS was introduced to avoid unnecessary differences, since we're still in alpha, per discussion with Robert.
* Plug RLS related information leak in pg_stats view. (Joe Conway, 2015-07-28)
  The pg_stats view is supposed to be restricted to only show rows about tables the user can read. However, it sometimes can leak information which could not otherwise be seen when row level security is enabled. Fix that by not showing pg_stats rows to users that would be subject to RLS on the table the row is related to. This is done by creating/using the newly introduced SQL visible function, row_security_active().
  Along the way, clean up three call sites of check_enable_rls(). The second argument of that function should only be specified as other than InvalidOid when we are checking as a different user than the current one, as in when querying through a view. These sites were passing GetUserId() instead of InvalidOid, which can cause the function to return incorrect results if the current user has the BYPASSRLS privilege and row_security has been set to OFF.
  Additionally fix a bug causing RI Trigger error messages to unintentionally leak information when RLS is enabled, and other minor cleanup and improvements. Also add WITH (security_barrier) to the definition of pg_stats.
  Bumped CATVERSION due to new SQL functions and pg_stats view definition.
  Back-patch to 9.5 where RLS was introduced. Reported by Yaroslav. Patch by Joe Conway and Dean Rasheed with review and input by Michael Paquier and Stephen Frost.
* pgindent run for 9.5 (Bruce Momjian, 2015-05-23)
* Fix postgres_fdw to return the right ctid value in EvalPlanQual cases. (Tom Lane, 2015-05-13)
  If a postgres_fdw foreign table is a non-locked source relation in an UPDATE, DELETE, or SELECT FOR UPDATE/SHARE, and the query selects its ctid column, the wrong value would be returned if an EvalPlanQual recheck occurred. This happened because the foreign table's result row was copied via the ROW_MARK_COPY code path, and EvalPlanQualFetchRowMarks just unconditionally set the reconstructed tuple's t_self to "invalid".
  To fix that, we can have EvalPlanQualFetchRowMarks copy the composite datum's t_ctid field, and be sure to initialize that along with t_self when postgres_fdw constructs a tuple to return.
  If we just did that much then EvalPlanQualFetchRowMarks would start returning "(0,0)" as ctid for all other ROW_MARK_COPY cases, which perhaps does not matter much, but then again maybe it might. The cause of that is that heap_form_tuple, which is the ultimate source of all composite datums, simply leaves t_ctid as zeroes in newly constructed tuples. That seems like a bad idea on general principles: a field that's really not been initialized shouldn't appear to have a valid value. So let's eat the trivial additional overhead of doing "ItemPointerSetInvalid(&(td->t_ctid))" in heap_form_tuple.
  This closes out our handling of Etsuro Fujita's report that tableoid and ctid weren't correctly set in postgres_fdw EvalPlanQual cases. Along the way we did a great deal of work to improve FDWs' ability to control row locking behavior; which was not wasted effort by any means, but it didn't end up being a fix for this problem because that feature would be too expensive for postgres_fdw to use all the time.
  Although the fix for the tableoid misbehavior was back-patched, I'm hesitant to do so here; it seems far less likely that people would care about remote ctid than tableoid, and even such a minor behavioral change as this in heap_form_tuple is perhaps best not back-patched. So commit to HEAD only, at least for the moment.
  Etsuro Fujita, with some adjustments by me
* Add support for doing late row locking in FDWs. (Tom Lane, 2015-05-12)
  Previously, FDWs could only do "early row locking", that is, lock a row as soon as it's fetched, even though local restriction/join conditions might discard the row later. This patch adds callbacks that allow FDWs to do late locking in the same way that it's done for regular tables.

  To make use of this feature, an FDW must support the "ctid" column as a unique row identifier. Currently, since ctid has to be of type TID, the feature is of limited use, though in principle it could be used by postgres_fdw. We may eventually allow FDWs to specify another data type for ctid, which would make it possible for more FDWs to use this feature.

  This commit does not modify postgres_fdw to use late locking. We've tested some prototype code for that, but it's not in committable shape, and besides it's quite unclear whether it actually makes sense to do late locking against a remote server. The extra round trips required are likely to outweigh any benefit from improved concurrency.

  Etsuro Fujita, reviewed by Ashutosh Bapat, and hacked up a lot by me
* Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. (Andres Freund, 2015-05-08)
  The newly added ON CONFLICT clause allows specifying an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using an inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias.

  This feature is often referred to as upsert.

  This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted.

  To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table.

  Bumps catversion as stored rules change.

  Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
  Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
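
  The speculative insertion control flow described in the ON CONFLICT commit above (pre-check, optimistic insert, delete and retry on a concurrent conflict) can be sketched as a toy in-memory model. This is only an illustration of the loop, not the executor code: every name here (ToyTable, toy_upsert, race_armed, and so on) is hypothetical, and the "concurrent session" is simulated by a single row that appears right after the speculative insert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAPACITY 16

/* All types and functions below are hypothetical, for illustration only. */
typedef enum { TOY_DO_NOTHING, TOY_DO_UPDATE } ToyAction;
typedef enum { TOY_INSERTED, TOY_DID_NOTHING, TOY_UPDATED } ToyOutcome;

typedef struct { int key; int value; } ToyRow;

typedef struct {
    ToyRow rows[TOY_CAPACITY];
    size_t nrows;
    bool   race_armed;  /* if set, a "concurrent" row shows up once */
    ToyRow race_row;    /* the row the simulated other session inserts */
} ToyTable;

/* Pre-check: look for an existing row with the same unique key. */
ToyRow *toy_find(ToyTable *t, int key)
{
    for (size_t i = 0; i < t->nrows; i++)
        if (t->rows[i].key == key)
            return &t->rows[i];
    return NULL;
}

/* Conflict check: does any row other than `slot` share its key? */
bool toy_conflicts(const ToyTable *t, size_t slot)
{
    for (size_t i = 0; i < t->nrows; i++)
        if (i != slot && t->rows[i].key == t->rows[slot].key)
            return true;
    return false;
}

ToyOutcome toy_upsert(ToyTable *t, ToyRow proposed, ToyAction action)
{
    for (;;) {
        /* 1. Pre-check for an existing tuple. */
        ToyRow *existing = toy_find(t, proposed.key);
        if (existing != NULL) {
            if (action == TOY_DO_NOTHING)
                return TOY_DID_NOTHING;
            existing->value = proposed.value;   /* DO UPDATE */
            return TOY_UPDATED;
        }

        /* 2. Speculatively insert the proposed tuple. */
        assert(t->nrows < TOY_CAPACITY);
        size_t slot = t->nrows++;
        t->rows[slot] = proposed;

        /* Simulated concurrent insert by another session. */
        if (t->race_armed) {
            assert(t->nrows < TOY_CAPACITY);
            t->rows[t->nrows++] = t->race_row;
            t->race_armed = false;
        }

        /* 3. No conflict: the tuple is deemed inserted. */
        if (!toy_conflicts(t, slot))
            return TOY_INSERTED;

        /* 4. Conflict: delete the speculative tuple and retry. */
        t->rows[slot] = t->rows[t->nrows - 1];
        t->nrows--;
    }
}
```

  On the retry, the pre-check finds the row left by the simulated concurrent session, so the DO NOTHING or DO UPDATE branch is taken instead of a second insert attempt.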