aboutsummaryrefslogtreecommitdiff
path: root/src/backend/parser/parse_expr.c
Commit message (Collapse)AuthorAge
* Allow an alias to be attached to a JOIN ... USINGPeter Eisentraut2021-03-31
| | | | | | | | | | | | | | | This allows something like SELECT ... FROM t1 JOIN t2 USING (a, b, c) AS x where x has the columns a, b, c and unlike a regular alias it does not hide the range variables of the tables being joined t1 and t2. Per SQL:2016 feature F404 "Range variable for common column names". Reviewed-by: Vik Fearing <vik.fearing@2ndquadrant.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/454638cf-d563-ab76-a585-2564428062af@2ndquadrant.com
* Extended statistics on expressionsTomas Vondra2021-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to 79f6a942bd. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
* Remove no-longer-used RTE argument of markVarForSelectPriv().Tom Lane2021-02-11
| | | | | | In the wake of c028faf2a, this is no longer needed. I left it out of that patch since the API change would be undesirable in a released branch; but there's no reason not to do it in HEAD.
* SEARCH and CYCLE clausesPeter Eisentraut2021-02-01
| | | | | | | | | | | | This adds the SQL standard feature that adds the SEARCH and CYCLE clauses to recursive queries to be able to do produce breadth- or depth-first search orders and detect cycles. These clauses can be rewritten into queries using existing syntax, and that is what this patch does in the rewriter. Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5be2@2ndquadrant.com
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* Support subscripting of arbitrary types, not only arrays.Tom Lane2020-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch generalizes the subscripting infrastructure so that any data type can be subscripted, if it provides a handler function to define what that means. Traditional variable-length (varlena) arrays all use array_subscript_handler(), while the existing fixed-length types that support subscripting use raw_array_subscript_handler(). It's expected that other types that want to use subscripting notation will define their own handlers. (This patch provides no such new features, though; it only lays the foundation for them.) To do this, move the parser's semantic processing of subscripts (including coercion to whatever data type is required) into a method callback supplied by the handler. On the execution side, replace the ExecEvalSubscriptingRef* layer of functions with direct calls to callback-supplied execution routines. (Thus, essentially no new run-time overhead should be caused by this patch. Indeed, there is room to remove some overhead by supplying specialized execution routines. This patch does a little bit in that line, but more could be done.) Additional work is required here and there to remove formerly hard-wired assumptions about the result type, collation, etc of a SubscriptingRef expression node; and to remove assumptions that the subscript values must be integers. One useful side-effect of this is that we now have a less squishy mechanism for identifying whether a data type is a "true" array: instead of wiring in weird rules about typlen, we can look to see if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER. For this to be bulletproof, we have to forbid user-defined types from using that handler directly; but there seems no good reason for them to do so. This patch also removes assumptions that the number of subscripts is limited to MAXDIM (6), or indeed has any hard-wired limit. That limit still applies to types handled by array_subscript_handler or raw_array_subscript_handler, but to discourage other dependencies on this constant, I've moved it from c.h to utils/array.h. Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov, Peter Eisentraut, Pavel Stehule Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
* Remove operator_precedence_warning.Tom Lane2020-12-08
| | | | | | | | | | | This GUC was always intended as a temporary solution to help with finding 9.4-to-9.5 migration issues. Now that all pre-9.5 branches are out of support, and 9.5 will be too before v14 is released, it seems like it's okay to drop it. Doing so allows removal of several hundred lines of poorly-tested code in parse_expr.c, which have been a fertile source of bugs when people did use this. Discussion: https://postgr.es/m/2234320.1607117945@sss.pgh.pa.us
* Remove undocumented IS [NOT] OF syntax.Tom Lane2020-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This feature was added a long time ago, in 7c1e67bd5 and eb121ba2c, but never documented in any user-facing way. (Documentation added in 6126d3e70 was commented out almost immediately, in 8272fc3f7.) That's because, while this syntax is defined by SQL:99, our implementation is only vaguely related to the standard's semantics. The standard appears to intend a run-time not parse-time test, and it definitely intends that the test should understand subtype relationships. No one has stepped up to fix that in the intervening years, but people keep coming across the code and asking why it's not documented. Let's just get rid of it: if anyone ever wants to make it work per spec, they can easily recover whatever parts of this code are still of value from our git history. If there's anyone out there who's actually using this despite its undocumented status, they can switch to using pg_typeof() instead, eg. "pg_typeof(something) = 'mytype'::regtype". That gives essentially the same semantics as what our IS OF code did. (We didn't have that function last time this was discussed, or we would have ripped out IS OF then.) Discussion: https://postgr.es/m/CAKFQuwZ2pTc-DSkOiTfjauqLYkNREeNZvWmeg12Q-_69D+sYZA@mail.gmail.com Discussion: https://postgr.es/m/BAY20-F23E9F2B4DAB3E4E88D3623F99B0@phx.gbl Discussion: https://postgr.es/m/3E7CF81D.1000203@joeconway.com
* Optimize a few list_delete_ptr callsDavid Rowley2020-10-22
| | | | | | | | | | | | | | | | | | | | | There is a handful of places where we called list_delete_ptr() to remove some element from a List. In many of these places we know, or with very little additional effort know the index of the ListCell that we need to remove. Here we change all of those places to instead either use one of; list_delete_nth_cell(), foreach_delete_current() or list_delete_last(). Each of these saves from having to iterate over the list to search for the element to remove by its pointer value. There are some small performance gains to be had by doing this, but in the general case, none of these lists are likely to be very large, so the lookup was probably never that expensive anyway. However, some of the calls are in fairly hot code paths, e.g process_equivalence(). So any small gains there are useful. Author: Zhijie Hou and David Rowley Discussion: https://postgr.es/m/b3517353ec7c4f87aa560678fbb1034b@G08CNEXMBPEKD05.g08.fujitsu.local
* Remove support for postfix (right-unary) operators.Tom Lane2020-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | This feature has been a thorn in our sides for a long time, causing many grammatical ambiguity problems. It doesn't seem worth the pain to continue to support it, so remove it. There are some follow-on improvements we can make in the grammar, but this commit only removes the bare minimum number of productions, plus assorted backend support code. Note that pg_dump and psql continue to have full support, since they may be used against older servers. However, pg_dump warns about postfix operators. There is also a check in pg_upgrade. Documentation-wise, I (tgl) largely removed the "left unary" terminology in favor of saying "prefix operator", which is a more standard and IMO less confusing term. I included a catversion bump, although no initial catalog data changes here, to mark the boundary at which oprkind = 'r' stopped being valid in pg_operator. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
* Run pgindent with new pg_bsd_indent version 2.1.1.Tom Lane2020-05-16
| | | | | | | | | | | Thomas Munro fixed a longstanding annoyance in pg_bsd_indent, that it would misformat lines containing IsA() macros on the assumption that the IsA() call should be treated like a cast. This improves some other cases involving field/variable names that match typedefs, too. The only places that get worse are a couple of uses of the OpenSSL macro STACK_OF(); we'll gladly take that trade-off. Discussion: https://postgr.es/m/20200114221814.GA19630@alvherre.pgsql
* Make parser rely more heavily on the ParseNamespaceItem data structure.Tom Lane2020-01-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When I added the ParseNamespaceItem data structure (in commit 5ebaaa494), it wasn't very tightly integrated into the parser's APIs. In the wake of adding p_rtindex to that struct (commit b541e9acc), there is a good reason to make more use of it: by passing around ParseNamespaceItem pointers instead of bare RTE pointers, we can get rid of various messy methods for passing back or deducing the rangetable index of an RTE during parsing. Hence, refactor the addRangeTableEntryXXX functions to build and return a ParseNamespaceItem struct, not just the RTE proper; and replace addRTEtoQuery with addNSItemToQuery, which is passed a ParseNamespaceItem rather than building one internally. Also, add per-column data (a ParseNamespaceColumn array) to each ParseNamespaceItem. These arrays are built during addRangeTableEntryXXX, where we have column type data at hand so that it's nearly free to fill the data structure. Later, when we need to build Vars referencing RTEs, we can use the ParseNamespaceColumn info to avoid the rather expensive operations done in get_rte_attribute_type() or expandRTE(). get_rte_attribute_type() is indeed dead code now, so I've removed it. This makes for a useful improvement in parse analysis speed, around 20% in one moderately-complex test query. The ParseNamespaceColumn structs also include Var identity information (varno/varattno). That info isn't actually being used in this patch, except that p_varno == 0 is a handy test for a dropped column. A follow-on patch will make more use of it. Discussion: https://postgr.es/m/2461.1577764221@sss.pgh.pa.us
* Update copyrights for 2020Bruce Momjian2020-01-01
| | | | Backpatch-through: update all files in master, backpatch legal files through 9.4
* Refactor parser's generation of Var nodes.Tom Lane2019-12-26
| | | | | | | | | | | | | | | | | | | | | Instead of passing around a pointer to the RangeTblEntry that provides the desired column, pass a pointer to the associated ParseNamespaceItem. The RTE is trivially reachable from the nsitem, and having the ParseNamespaceItem allows access to additional information. As proof of concept for that, add the rangetable index to ParseNamespaceItem, and use that to get rid of RTERangeTablePosn searches. (I have in mind to teach the parser to generate some different representation for Vars that are nullable by outer joins, and keeping the necessary information in ParseNamespaceItems seems like a reasonable approach to that. But whether that ever happens or not, this seems like good cleanup.) Also refactor the code around scanRTEForColumn so that the "fuzzy match" stuff does not leak out of parse_relation.c. Discussion: https://postgr.es/m/26144.1576858373@sss.pgh.pa.us
* Make the order of the header file includes consistent in backend modules.Amit Kapila2019-11-12
| | | | | | | | | | | Similar to commits 7e735035f2 and dddf4cdc33, this commit makes the order of header file inclusion consistent for backend modules. In the passing, removed a couple of duplicate inclusions. Author: Vignesh C Reviewed-by: Kuntal Ghosh and Amit Kapila Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
* Represent Lists as expansible arrays, not chains of cons-cells.Tom Lane2019-07-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Fix backwards test in operator_precedence_warning logic.Tom Lane2019-04-10
| | | | | | | | | | Warnings about unary minus might have been wrong. It's a bit surprising that nobody noticed yet ... probably the precedence-warning feature hasn't really been used much in the field. Rikard Falkeborn Discussion: https://postgr.es/m/CADRDgG6fzA8A2oeygUw4=o7ywo4kvz26NxCSgpq22nMD73Bx4Q@mail.gmail.com
* Generated columnsPeter Eisentraut2019-03-30
| | | | | | | | | | | | | | This is an SQL-standard feature that allows creating columns that are computed from expressions rather than assigned, similar to a view or materialized view but on a column basis. This implements one kind of generated column: stored (computed on write). Another kind, virtual (computed on read), is planned for the future, and some room is left for it. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/b151f851-4019-bdb1-699e-ebab07d2f40a@2ndquadrant.com
* Improve error handling of column references in expression transformationMichael Paquier2019-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Column references are not allowed in default expressions and partition bound expressions, and are restricted as such once the transformation of their expressions is done. However, trying to use more complex column references can lead to confusing error messages. For example, trying to use a two-field column reference name for default expressions and partition bounds leads to "missing FROM-clause entry for table", which makes no sense in their respective context. In order to make the errors generated more useful, this commit adds more verbose messages when transforming column references depending on the context. This has a little consequence though: for example an expression using an aggregate with a column reference as argument would cause an error to be generated for the column reference, while the aggregate was the problem reported before this commit because column references get transformed first. The confusion exists for default expressions for a long time, and the problem is new as of v12 for partition bounds. Still per the lack of complaints on the matter no backpatch is done. The patch has been written by Amit Langote and me, and Tom Lane has provided the improvement of the documentation for default expressions on the CREATE TABLE page. Author: Amit Langote, Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/20190326020853.GM2558@paquier.xyz
* Renaming for new subscripting mechanismAlvaro Herrera2019-02-01
| | | | | | | | | | | | Over at patch https://commitfest.postgresql.org/21/1062/ Dmitry wants to introduce a more generic subscription mechanism, which allows subscripting not only arrays but also other object types such as JSONB. That functionality is introduced in a largish invasive patch, out of which this internal renaming patch was extracted. Author: Dmitry Dolgov Reviewed-by: Tom Lane, Arthur Zakirov Discussion: https://postgr.es/m/CA+q6zcUK4EqPAu7XRRO5CCjMwhz5zvg+rfWuLzVoxp_5sKS6=w@mail.gmail.com
* Refactor planner's header files.Tom Lane2019-01-29
| | | | | | | | | | | | | | | | | | | | | | | | Create a new header optimizer/optimizer.h, which exposes just the planner functions that can be used "at arm's length", without need to access Paths or the other planner-internal data structures defined in nodes/relation.h. This is intended to provide the whole planner API seen by most of the rest of the system; although FDWs still need to use additional stuff, and more thought is also needed about just what selfuncs.c should rely on. The main point of doing this now is to limit the amount of new #include baggage that will be needed by "planner support functions", which I expect to introduce later, and which will be in relevant datatype modules rather than anywhere near the planner. This commit just moves relevant declarations into optimizer.h from other header files (a couple of which go away because everything got moved), and adjusts #include lists to match. There's further cleanup that could be done if we want to decide that some stuff being exposed by optimizer.h doesn't belong in the planner at all, but I'll leave that for another day. Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
* Allow generalized expression syntax for partition boundsPeter Eisentraut2019-01-25
| | | | | | | | | | | | | | Previously, only literals were allowed. This change allows general expressions, including functions calls, which are evaluated at the time the DDL command is executed. Besides offering some more functionality, it simplifies the parser structures and removes some inconsistencies in how the literals were handled. Author: Kyotaro Horiguchi, Tom Lane, Amit Langote Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/9f88b5e0-6da2-5227-20d0-0d7012beaa1c@lab.ntt.co.jp/
* Allow COPY FROM to filter data using WHERE conditionsTomas Vondra2019-01-20
| | | | | | | | | | | | | Extends the COPY FROM command with a WHERE condition, which allows doing various types of filtering while importing the data (random sampling, condition on a data column, etc.). Until now such filtering required either preprocessing of the input data, or importing all data and then filtering in the database. COPY FROM ... WHERE is an easy-to-use and low-overhead alternative for most simple cases. Author: Surafel Temesgen Reviewed-by: Tomas Vondra, Masahiko Sawada, Lim Myungkyu Discussion: https://www.postgresql.org/message-id/flat/CALAY4q_DdpWDuB5-Zyi-oTtO2uSk8pmy+dupiRe3AvAc++1imA@mail.gmail.com
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Revert MERGE patchSimon Riggs2018-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commits d204ef63776b8a00ca220adec23979091564e465, 83454e3c2b28141c0db01c7d2027e01040df5249 and a few more commits thereafter (complete list at the end) related to MERGE feature. While the feature was fully functional, with sufficient test coverage and necessary documentation, it was felt that some parts of the executor and parse-analyzer can use a different design and it wasn't possible to do that in the available time. So it was decided to revert the patch for PG11 and retry again in the future. Thanks again to all reviewers and bug reporters. List of commits reverted, in reverse chronological order: f1464c5380 Improve parse representation for MERGE ddb4158579 MERGE syntax diagram correction 530e69e59b Allow cpluspluscheck to pass by renaming variable 01b88b4df5 MERGE minor errata 3af7b2b0d4 MERGE fix variable warning in non-assert builds a5d86181ec MERGE INSERT allows only one VALUES clause 4b2d44031f MERGE post-commit review 4923550c20 Tab completion for MERGE aa3faa3c7a WITH support in MERGE 83454e3c2b New files for MERGE d204ef6377 MERGE SQL Command following SQL:2016 Author: Pavan Deolasee Reviewed-by: Michael Paquier
* MERGE SQL Command following SQL:2016Simon Riggs2018-04-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows a task that would other require multiple PL statements. e.g. MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular and partitioned tables, including column and row security enforcement, as well as support for row, statement and transition triggers. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used statically from PL/pgSQL. MERGE does not yet support inheritance, write rules, RETURNING clauses, updatable views or foreign tables. MERGE follows SQL Standard per the most recent SQL:2016. Includes full tests and documentation, including full isolation tests to demonstrate the concurrent behavior. This version written from scratch in 2017 by Simon Riggs, using docs and tests originally written in 2009. Later work from Pavan Deolasee has been both complex and deep, leaving the lead author credit now in his hands. Extensive discussion of concurrency from Peter Geoghegan, with thanks for the time and effort contributed. Various issues reported via sqlsmith by Andreas Seltenreich Authors: Pavan Deolasee, Simon Riggs Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
* Revert "Modified files for MERGE"Simon Riggs2018-04-02
| | | | This reverts commit 354f13855e6381d288dfaa52bcd4f2cb0fd4a5eb.
* Modified files for MERGESimon Riggs2018-04-02
|
* Fix oversight in CALL argument handling, and do some minor cleanup.Tom Lane2018-02-10
| | | | | | | | | | | | | | | | | | | CALL statements cannot support sub-SELECTs in the arguments of the called procedure, since they just use ExecEvalExpr to evaluate such arguments. Teach transformSubLink() to reject the case, as it already does for other contexts in which subqueries are not supported. In passing, s/EXPR_KIND_CALL/EXPR_KIND_CALL_ARGUMENT/ to make that enum symbol line up more closely with the phrasing of the error messages it is associated with. And fix someone's weak grasp of English grammar in the preceding EXPR_KIND_PARTITION_EXPRESSION addition. Also update an incorrect comment in resolve_unique_index_expr (possibly it was correct when written, but nowadays transformExpr definitely does reject SRFs here). Per report from Pavel Stehule --- but this resolves only one of the bugs he mentions. Discussion: https://postgr.es/m/CAFj8pRDxOwPPzpA8i+AQeDQFj7bhVw-dR2==rfWZ3zMGkm568Q@mail.gmail.com
* Support all SQL:2011 options for window frame clauses.Tom Lane2018-02-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
* Update copyright for 2018Bruce Momjian2018-01-02
| | | | Backpatch-through: certain files through 9.3
* SQL proceduresPeter Eisentraut2017-11-30
| | | | | | | | | | | | | | | | | | | | | This adds a new object type "procedure" that is similar to a function but does not have a return type and is invoked by the new CALL statement instead of SELECT or similar. This implementation is aligned with the SQL standard and compatible with or similar to other SQL implementations. This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well as ALTER/DROP ROUTINE that can refer to either a function or a procedure (or an aggregate function, as an extension to SQL). There is also support for procedures in various utility commands such as COMMENT and GRANT, as well as support in pg_dump and psql. Support for defining procedures is available in all the languages supplied by the core distribution. While this commit is mainly syntax sugar around existing functionality, future features will rely on having procedures as a separate object type. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
* Add some const decorations to prototypesPeter Eisentraut2017-11-10
| | | | Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
* Reduce excessive dereferencing of function pointersPeter Eisentraut2017-09-07
| | | | | | | | | | | | It is equivalent in ANSI C to write (*funcptr) () and funcptr(). These two styles have been applied inconsistently. After discussion, we'll use the more verbose style for plain function pointer variables, to make it clear that it's a variable, and the shorter style when the function pointer is in a struct (s.func() or s->func()), because then it's clear that it's not a plain function name, and otherwise the excessive punctuation makes some of those invocations hard to read. Discussion: https://www.postgresql.org/message-id/f52c16db-14ed-757d-4b48-7ef360b1631d@2ndquadrant.com
* Phase 3 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Phase 2 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Initial pgindent run with pg_bsd_indent version 2.0.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname *varname", as well as some similar cases for const- or volatile-qualified pointers. * Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Disallow set-returning functions inside CASE or COALESCE.Tom Lane2017-06-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we reimplemented SRFs in commit 69f4b9c85, our initial choice was to allow the behavior to vary from historical practice in cases where a SRF call appeared within a conditional-execution construct (currently, only CASE or COALESCE). But that was controversial to begin with, and subsequent discussion has resulted in a consensus that it's better to throw an error instead of executing the query differently from before, so long as we can provide a reasonably clear error message and a way to rewrite the query. Hence, add a parser mechanism to allow detection of such cases during parse analysis. The mechanism just requires storing, in the ParseState, a pointer to the set-returning FuncExpr or OpExpr most recently emitted by parse analysis. Then the parsing functions for CASE and COALESCE can detect the presence of a SRF in their arguments by noting whether this pointer changes while analyzing their arguments. Furthermore, if it does, it provides a suitable error cursor location for the complaint. (This means that if there's more than one SRF in the arguments, the error will point at the last one to be analyzed not the first. While connoisseurs of parsing behavior might find that odd, it's unlikely the average user would ever notice.) While at it, we can also provide more specific error messages than before about some pre-existing restrictions, such as no-SRFs-within-aggregates. Also, reject at parse time cases where a NULLIF or IS DISTINCT FROM construct would need to return a set. We've never supported that, but the restriction is depended on in more subtle ways now, so it seems wise to detect it at the start. Also, provide some documentation about how to rewrite a SRF-within-CASE query using a custom wrapper SRF. It turns out that the information_schema.user_mapping_options view contained an instance of exactly the behavior we're now forbidding; but rewriting it makes it more clear and safer too. initdb forced because of user_mapping_options change. Patch by me, with error message suggestions from Alvaro Herrera and Andres Freund, pursuant to a complaint from Regina Obe. Discussion: https://postgr.es/m/000001d2d5de$d8d66170$8a832450$@pcorp.us
* Post-PG 10 beta1 pgindent runBruce Momjian2017-05-17
| | | | perltidy run not included.
* Improve castNode notation by introducing list-extraction-specific variants.Tom Lane2017-04-10
| | | | | | | | | | | | | | | | | This extends the castNode() notation introduced by commit 5bcab1114 to provide, in one step, extraction of a list cell's pointer and coercion to a concrete node type. For example, "lfirst_node(Foo, lc)" is the same as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode that have appeared so far include a list extraction call, so this is pretty widely useful, and it saves a few more keystrokes compared to the old way. As with the previous patch, back-patch the addition of these macros to pg_list.h, so that the notation will be available when back-patching. Patch by me, after an idea of Andrew Gierth's. Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
* Cast result of copyObject() to correct typePeter Eisentraut2017-03-28
| | | | | | | | | | | | | | copyObject() is declared to return void *, which allows easily assigning the result independent of the input, but it loses all type checking. If the compiler supports typeof or something similar, cast the result to the input type. This creates a greater amount of type safety. In some cases, where the result is assigned to a generic type such as Node * or Expr *, new casts are now necessary, but in general casts are now unnecessary in the normal case and indicate that something unusual is happening. Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
* Make more use of castNode()Peter Eisentraut2017-02-21
|
* Use castNode() in a bunch of statement-list-related code.Tom Lane2017-01-26
| | | | | | | | | | | | | When I wrote commit ab1f0c822, I really missed the castNode() macro that Peter E. had proposed shortly before. This back-fills the uses I would have put it to. It's probably not all that significant, but there are more assertions here than there were before, and conceivably they will help catch any bugs associated with those representation changes. I left behind a number of usages like "(Query *) copyObject(query_var)". Those could have been converted as well, but Peter has proposed another notational improvement that would handle copyObject cases automatically, so I let that be for now.
* Change unknown-type literals to type text in SELECT and RETURNING lists.Tom Lane2017-01-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we left such literals alone if the query or subquery had no properties forcing a type decision to be made (such as an ORDER BY or DISTINCT clause using that output column). This meant that "unknown" could be an exposed output column type, which has never been a great idea because it could result in strange failures later on. For example, an outer query that tried to do any operations on an unknown-type subquery output would generally fail with some weird error like "failed to find conversion function from unknown to text" or "could not determine which collation to use for string comparison". Also, if the case occurred in a CREATE VIEW's query then the view would have an unknown-type column, causing similar failures in queries trying to use the view. To fix, at the tail end of parse analysis of a query, forcibly convert any remaining "unknown" literals in its SELECT or RETURNING list to type text. However, provide a switch to suppress that, and use it in the cases of SELECT inside a set operation or INSERT command. In those cases we already had type resolution rules that make use of context information from outside the subquery proper, and we don't want to change that behavior. Also, change creation of an unknown-type column in a relation from a warning to a hard error. The error should be unreachable now in CREATE VIEW or CREATE MATVIEW, but it's still possible to explicitly say "unknown" in CREATE TABLE or CREATE (composite) TYPE. We want to forbid that because it's nothing but a foot-gun. This change creates a pg_upgrade failure case: a matview that contains an unknown-type column can't be pg_upgraded, because reparsing the matview's defining query will now decide that the column is of type text, which doesn't match the cstring-like storage that the old materialized column would actually have. Add a checking pass to detect that. While at it, we can detect tables or composite types that would fail, essentially for free. Those would fail safely anyway later on, but we might as well fail earlier. This patch is by me, but it owes something to previous investigations by Rahila Syed. Also thanks to Ashutosh Bapat and Michael Paquier for review. Discussion: https://postgr.es/m/CAH2L28uwwbL9HUM-WR=hromW1Cvamkn7O-g8fPY2m=_7muJ0oA@mail.gmail.com
* Fix check_srf_call_placement() to handle VALUES cases correctly.Tom Lane2017-01-16
| | | | | | | | | | | | | | | | | | | | | INSERT ... VALUES with a single VALUES row is implemented quite differently from the general VALUES case. A user-visible implication of that is that we accept SRFs in the single-row case, but not in the multi-row case. That's a historical artifact no doubt, but in view of the lack of field complaints, I'm not excited about fixing it right now. However, check_srf_call_placement() needs to know about this, first because it should throw an error in the unsupported case, and second because it should set p_hasTargetSRFs in the single-row case (because we treat that like a SELECT tlist). That's an oversight in commit a4c35ea1c. To fix, split EXPR_KIND_VALUES into two values. So far as I can see, this is the only place where we need to distinguish the two cases at present; but there might be more later. Patch by me, per report from Andres Freund. Discussion: https://postgr.es/m/20170116081548.zg63zltblwimpfgp@alap3.anarazel.de
* Change representation of statement lists, and add statement location info.Tom Lane2017-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes several changes that improve the consistency of representation of lists of statements. It's always been the case that the output of parse analysis is a list of Query nodes, whatever the types of the individual statements in the list. This patch brings similar consistency to the outputs of raw parsing and planning steps: * The output of raw parsing is now always a list of RawStmt nodes; the statement-type-dependent nodes are one level down from that. * The output of pg_plan_queries() is now always a list of PlannedStmt nodes, even for utility statements. In the case of a utility statement, "planning" just consists of wrapping a CMD_UTILITY PlannedStmt around the utility node. This list representation is now used in Portal and CachedPlan plan lists, replacing the former convention of intermixing PlannedStmts with bare utility-statement nodes. Now, every list of statements has a consistent head-node type depending on how far along it is in processing. This allows changing many places that formerly used generic "Node *" pointers to use a more specific pointer type, thus reducing the number of IsA() tests and casts needed, as well as improving code clarity. Also, the post-parse-analysis representation of DECLARE CURSOR is changed so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained SELECT remains a child of the DeclareCursorStmt rather than getting flipped around to be the other way. It's now true for both Query and PlannedStmt that utilityStmt is non-null if and only if commandType is CMD_UTILITY. That allows simplifying a lot of places that were testing both fields. (I think some of those were just defensive programming, but in many places, it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.) Because PlannedStmt carries a canSetTag field, we're also able to get rid of some ad-hoc rules about how to reconstruct canSetTag for a bare utility statement; specifically, the assumption that a utility is canSetTag if and only if it's the only one in its list. While I see no near-term need for relaxing that restriction, it's nice to get rid of the ad-hocery. The API of ProcessUtility() is changed so that what it's passed is the wrapper PlannedStmt not just the bare utility statement. This will affect all users of ProcessUtility_hook, but the changes are pretty trivial; see the affected contrib modules for examples of the minimum change needed. (Most compilers should give pointer-type-mismatch warnings for uncorrected code.) There's also a change in the API of ExplainOneQuery_hook, to pass through cursorOptions instead of expecting hook functions to know what to pick. This is needed because of the DECLARE CURSOR changes, but really should have been done in 9.6; it's unlikely that any extant hook functions know about using CURSOR_OPT_PARALLEL_OK. Finally, teach gram.y to save statement boundary locations in RawStmt nodes, and pass those through to Query and PlannedStmt nodes. This allows more intelligent handling of cases where a source query string contains multiple statements. This patch doesn't actually do anything with the information, but a follow-on patch will. (Passing this information through cleanly is the true motivation for these changes; while I think this is all good cleanup, it's unlikely we'd have bothered without this end goal.) catversion bump because addition of location fields to struct Query affects stored rules. This patch is by me, but it owes a good deal to Fabien Coelho who did a lot of preliminary work on the problem, and also reviewed the patch. Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
* Get rid of ParseState.p_value_substitute; use a columnref hook instead.Tom Lane2017-01-07
| | | | | | | | I noticed that p_value_substitute, which is a single-purpose kluge I added in 2002 (commit b0422b215), could be replaced by having domainAddConstraint install a parser hook that looks for the name "value". The parser hook code only dates back to 2009, so it's not surprising that we had to kluge this in 2002, but we can do it more cleanly now.
* Update copyright via script for 2017Bruce Momjian2017-01-03
|
* Implement table partitioning.Robert Haas2016-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.