aboutsummaryrefslogtreecommitdiff
path: root/src/backend/nodes
Commit message (Collapse)AuthorAge
* gen_node_support.pl: improve error message for unclosed struct.Tom Lane2025-04-22
| | | | | | | | | | | | This error message was 'runaway "struct_name"', which isn't all that clear; I think 'could not find closing brace for "struct_name"' is better. Also, provide the location of the struct start using the script's usual '$file:$lineno' style. Bug: #18901 Reported-by: Clemens Ruck <clemens.ruck@t-online.de> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18901-424272abe01357e6@postgresql.org
* Fix typos and grammar in the codeMichael Paquier2025-04-19
| | | | | | | | The large majority of these have been introduced by recent commits done in the v18 development cycle. Author: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/9a7763ab-5252-429d-a943-b28941e0e28b@gmail.com
* Harmonize function parameter names for Postgres 18.Peter Geoghegan2025-04-12
| | | | | | | | | | Make sure that function declarations use names that exactly match the corresponding names from function definitions in a few places. These inconsistencies were all introduced during Postgres 18 development. This commit was written with help from clang-tidy, by mechanically applying the same rules as similar clean-up commits (the earliest such commit was commit 035ce1fe).
* Speedup child EquivalenceMember lookup in plannerDavid Rowley2025-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When planning queries to partitioned tables, we clone all EquivalenceMembers belonging to the partitioned table into em_is_child EquivalenceMembers for each non-pruned partition. For partitioned tables with large numbers of partitions, this meant the ec_members list could become large and code searching that list would become slow. Effectively, the more partitions which were present, the more searches needed to be performed for operations such as find_ec_member_matching_expr() during create_plan() and the more partitions present, the longer these searches would take, i.e., a quadratic slowdown. To fix this, here we adjust how we store EquivalenceMembers for em_is_child members. Instead of storing these directly in ec_members, these are now stored in a new array of Lists in the EquivalenceClass, which is indexed by the relid. When we want to find EquivalenceMembers belonging to a certain child relation, we can narrow the search to the array element for that relation. To make EquivalenceMember lookup easier and to reduce the amount of code change, this commit provides a pair of functions to allow iteration over the EquivalenceMembers of an EC which also handles finding the child members, if required. Callers that never need to look at child members can remain using the foreach loop over ec_members, which will now often be faster due to only parent-level members being stored there. The actual performance increases here are highly dependent on the number of partitions and the query being planned. Performance increases can be visible with as few as 8 partitions, but the speedup is marginal for such low numbers of partitions. The speedups become much more visible with a few dozen to hundreds of partitions. With some tested queries using 56 partitions, the planner was around 3x faster than before. For use cases with thousands of partitions, these are likely to become significantly faster. Some testing has shown planner speedups of 60x or more with 8192 partitions. Author: Yuya Watari <watari.yuya@gmail.com> Co-authored-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru> Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru> Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Tested-by: Thom Brown <thom@linux.com> Tested-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com> Discussion: https://postgr.es/m/CAJ2pMkZNCgoUKSE%2B_5LthD%2BKbXKvq6h2hQN8Esxpxd%2Bcxmgomg%40mail.gmail.com
* Make derived clause lookup in EquivalenceClass more efficientAmit Langote2025-04-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Derived clauses are stored in ec_derives, a List of RestrictInfos. These clauses are later looked up by matching the left and right EquivalenceMembers along with the clause's parent EC. This linear search becomes expensive in queries with many joins or partitions, where ec_derives may contain thousands of entries. In particular, create_join_clause() can spend significant time scanning this list. To improve performance, introduce a hash table (ec_derives_hash) that is built when the list reaches 32 entries -- the same threshold used for join_rel_hash. The original list is retained alongside the hash table to support EC merging and serialization (_outEquivalenceClass()). Each clause is stored in the hash table using a canonicalized key: the EquivalenceMember with the lower memory address is placed in the key before the one with the higher memory address. This avoids storing or searching for both permutations of the same clause. For clauses involving a constant EM, the key places NULL in the first slot and the non-constant EM in the second. The hash table is initialized using list_length(ec_derives_list) as the size hint. simplehash internally adjusts this to the next power of two after dividing by the fillfactor, so this typically results in at least 64 buckets near the threshold -- avoiding immediate resizing while adapting to the actual number of entries. The lookup logic for derived clauses is now centralized in ec_search_derived_clause_for_ems(), which consults the hash table when available and falls back to the list otherwise. The new ec_clear_derived_clauses() always frees ec_derives_list, even though some of the original code paths that cleared the old ec_derives field did not. This ensures consistent cleanup and avoids leaking memory when large lists are discarded. An assertion originally placed in find_derived_clause_for_ec_member() is moved into ec_search_derived_clause_for_ems() so that it is enforced consistently, regardless of whether the hash table or list is used for lookup. This design incorporates suggestions by David Rowley, who proposed both the key canonicalization and the initial sizing approach to balance memory usage and CPU efficiency. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> Tested-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Tested-by: Amit Langote <amitlangote09@gmail.com> Tested-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAExHW5vZiQtWU6moszLP5iZ8gLX_ZAUbgEX0DxGLx9PGWCtqUg@mail.gmail.com
* Remove the query_id_squash_values GUCÁlvaro Herrera2025-03-27
| | | | | | | | | | | | | Commit 62d712ecfd94 introduced the capability to calculate the same queryId for queries with different lengths of constants in a list for an IN clause. This behavior was originally enabled with a GUC query_id_squash_values. After a discussion about the value of such a GUC, it was decided to back out of the use of a GUC and make the squashing behavior the only available option. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z-LZyygkkNyA8-kR@msg.df7cb.de Discussion: https://postgr.es/m/CA+q6zcVTK-3C-8NWV1oY2NZrvtnMCDqnyYYyk1T7WMUG65MeOQ@mail.gmail.com
* Optimize Query jumbleDavid Rowley2025-03-27
| | | | | | | | | | | f31aad9b0 adjusted query jumbling so it no longer ignores NULL nodes during the jumble. This added some overhead. Here we tune a few things to make jumbling faster again. This makes jumbling perform similar or even slightly faster than prior to that change. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAApHDvreP04nhTKuYsPw0F-YN+4nr4f=L72SPeFb81jfv+2c7w@mail.gmail.com
* Fix query jumbling to account for NULL nodesDavid Rowley2025-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously NULL nodes were ignored. This could cause issues where the computed query ID could match for queries where fields that are next to each other in their Node struct where one field was NULL and the other non-NULL. For example, the Query struct had distinctClause and sortClause next to each other. If someone wrote; SELECT DISTINCT c1 FROM t; and then; SELECT c1 FROM t ORDER BY c1; these would produce the same query ID since, in the first query, we ignored the NULL sortClause and appended the jumble bytes for the distictClause. In the latter query, since we did nothing for the NULL distinctClause then jumble the non-NULL sortClause, and since the node representation stored is the same in both cases, the query IDs were identical. Here we fix this by always accounting for NULL nodes by recording that we saw a NULL in the jumble buffer. This fixes the issue as the order that the NULL is recorded isn't the same in the above two queries. Author: Bykov Ivan <i.bykov@modernsys.ru> Author: Michael Paquier <michael@paquier.xyz> Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/aafce7966e234372b2ba876c0193f1e9%40localhost.localdomain
* Use relation name instead of OID in query jumbling for RangeTblEntryMichael Paquier2025-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | custom_query_jumble (introduced in 5ac462e2b7ac as a node field attribute) is now assigned to the expanded reference name "eref" of RangeTblEntry, adding in the query jumble computation the non-qualified aliased relation name, without the list of column names. The relation OID is removed from the query jumbling. The effects of this change can be seen in the tests added by 3430215fe35f, where pg_stat_statements (PGSS) entries are now grouped using the relation name, ignoring the relation search_path may point at. For example, these two relations are different, but are now grouped in a single PGSS entry as they are assigned the same query ID: CREATE TABLE foo1.tab (a int); CREATE TABLE foo2.tab (b int); SET search_path = 'foo1'; SELECT count(*) FROM tab; SET search_path = 'foo2'; SELECT count(*) FROM tab; SELECT count(*) FROM foo1.tab; SELECT count(*) FROM foo2.tab; SELECT query, calls FROM pg_stat_statements WHERE query ~ 'FROM tab'; query | calls --------------------------+------- SELECT count(*) FROM tab | 4 (1 row) It is still possible to use an alias in the FROM clause to split these. This behavior is useful for relations re-created with the same name, where queries based on such relations would be grouped in the same PGSS entry. For permanent schemas, it should not really matter in practice. The main benefit is for workloads that use a lot of temporary relations, which are usually re-created with the same name continuously. These can be a heavy source of bloat in PGSS depending on the workload. Such entries can now be grouped together, improving the user experience. The original idea from Christoph Berg used catalog lookups to find temporary relations, something that the query jumble has never done, and it could cause some performance regressions. The idea to use RangeTblEntry.eref and the relation name, applying the same rules for all relations, temporary and not temporary, has been proposed by Tom Lane. The documentation additions have been suggested by Sami Imseih. Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
* Fix two issues with custom_query_jumble in gen_node_support.plMichael Paquier2025-03-26
| | | | | | | | | | | | | | A node field marked with custom_query_jumble and query_jumble_ignore would generate some code of a custom routine. The script is changed so as custom_query_jumble behaves like the other options in this case, query_jumble_ignore taking priority, with no code generated. A comment related to the code generated for node types was misplaced. Thinkos introduced in 5ac462e2b7ac. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1324036.1742945060@sss.pgh.pa.us
* Add support for custom_query_jumble as a node field attributeMichael Paquier2025-03-25
| | | | | | | | | | | | | | | | | | | | | | | This option gives the possibility for query jumble to define a custom routine for the field of a Node, extending support for custom_query_jumble as a node field attribute. When dealing with complex node structures, this can be simpler than having to enforce a custom function across a full node. Custom functions need to be defined in queryjumblefuncs.c, named as _jumble${node}_${field}(), and use in input the JumbleState, the node and its field. The field is not really required if we have the Node, but it makes custom implementations somewhat easier to think about. The code generated by gen_node_support.pl uses a macro called JUMBLE_CUSTOM(), hiding the internals of the logic inside queryjumblefuncs.c. This will be used by an upcoming patch manipulating adding a custom routine into a field of RangeTblEntry, but this facility can become useful in more cases. Reviewed-by: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/Z9y43-dRvb4EtxQ0@paquier.xyz
* Introduce squashing of constant lists in query jumblingÁlvaro Herrera2025-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pg_stat_statements produces multiple entries for queries like SELECT something FROM table WHERE col IN (1, 2, 3, ...) depending on the number of parameters, because every element of ArrayExpr is individually jumbled. Most of the time that's undesirable, especially if the list becomes too large. Fix this by introducing a new GUC query_id_squash_values which modifies the node jumbling code to only consider the first and last element of a list of constants, rather than each list element individually. This affects both the query_id generated by query jumbling, as well as pg_stat_statements query normalization so that it suppresses printing of the individual elements of such a list. The default value is off, meaning the previous behavior is maintained. Author: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Sergey Dudoladov (mysterious, off-list) Reviewed-by: David Geier <geidav.pg@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Sutou Kouhei <kou@clear-code.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Marcos Pegoraro <marcos@f10.com.br> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Tested-by: Yasuo Honda <yasuo.honda@gmail.com> Tested-by: Sergei Kornilov <sk@zsrv.org> Tested-by: Maciek Sakrejda <m.sakrejda@gmail.com> Tested-by: Chengxi Sun <sunchengxi@highgo.com> Tested-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Discussion: https://postgr.es/m/CA+q6zcWtUbT_Sxj0V6HY6EZ89uv5wuG5aefpe_9n0Jr3VwntFg@mail.gmail.com
* Separate TBM[Shared|Private]Iterator and TBMIterateResultMelanie Plageman2025-03-15
| | | | | | | | | | | | | | Remove the TBMIterateResult member from the TBMPrivateIterator and TBMSharedIterator and make tbm_[shared|private_]iterate() take a TBMIterateResult as a parameter. This allows tidbitmap API users to manage multiple TBMIterateResults per scan. This is required for bitmap heap scan to use the read stream API, with which there may be multiple I/Os in flight at once, each one with a TBMIterateResult. Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/d4bb26c9-fe07-439e-ac53-c0e244387e01%40vondra.me
* Build whole-row Vars the same way during parsing and planning.Tom Lane2025-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | makeWholeRowVar() has different rules for constructing a whole-row Var depending on the kind of RTE it's representing. This turns out to be problematic because the rewriter and planner can convert view RTEs and set-returning-function RTEs into subquery RTEs; so a whole-row Var made during planning might look different from one made by the parser. In isolation this doesn't cause any problem, but if a query contains Vars made both ways for the same varno, there are cross-checks in the executor that will complain. This manifests for UPDATE, DELETE, and MERGE queries that use whole-row table references. To fix, we need makeWholeRowVar() to produce the same result from an inlined RTE as it would have for the original. For an inlined view, we can use RangeTblEntry.relid to detect that this had been a view RTE. For inlined SRFs, make a data structure definition change akin to commit 47bb9db75, and say that we won't clear RangeTblEntry.functions until the end of planning. That allows makeWholeRowVar() to repeat what it would have done with the unmodified RTE. Reported-by: Duncan Sands <duncan.sands@deepbluecap.com> Reported-by: Dean Rasheed <dean.a.rasheed@gmail.com> Diagnosed-by: Tender Wang <tndrwang@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/3518c50a-ab18-482f-b916-a37263622501@deepbluecap.com Backpatch-through: 13
* Delay extraction of TIDBitmap per page offsetsMelanie Plageman2025-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pages from the bitmap created by the TIDBitmap API can be exact or lossy. The TIDBitmap API extracts the tuple offsets from exact pages into an array for the convenience of the caller. This was done in tbm_private|shared_iterate() right after advancing the iterator. However, as long as tbm_private|shared_iterate() set a reference to the PagetableEntry in the TBMIterateResult, the offset extraction can be done later. Waiting to extract the tuple offsets has a few benefits. For the shared iterator case, it allows us to extract the offsets after dropping the shared iterator state lock, reducing time spent holding a contended lock. Separating the iteration step and extracting the offsets later also allows us to avoid extracting the offsets for prefetched blocks. Those offsets were never used, so the overhead of extracting and storing them was wasted. The real motivation for this change, however, is that future commits will make bitmap heap scan use the read stream API. This requires a TBMIterateResult per issued block. By removing the array of tuple offsets from the TBMIterateResult and only extracting the offsets when they are used, we reduce the memory required for per buffer data substantially. Suggested-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGLHbKP3jwJ6_%2BhnGi37Pw3BD5j2amjV3oSk7j-KyCnY7Q%40mail.gmail.com
* Add lossy indicator to TBMIterateResultMelanie Plageman2025-02-24
| | | | | | | | TBMIterateResult->ntuples is -1 when the page in the bitmap is lossy. Add an explicit lossy indicator so that we can move ntuples out of the TBMIterateResult in a future commit. Discussion: https://postgr.es/m/CA%2BhUKGLHbKP3jwJ6_%2BhnGi37Pw3BD5j2amjV3oSk7j-KyCnY7Q%40mail.gmail.com
* Remove various unnecessary (char *) castsPeter Eisentraut2025-02-20
| | | | | | | | Remove a number of (char *) casts that are unnecessary. Or in some cases, rewrite the code to make the purpose of the cast clearer. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
* Move CompareType to separate header filePeter Eisentraut2025-02-02
| | | | | | | | | We'll want to make use of it in more places, and we'd prefer to not have to include all of primnodes.h everywhere. Author: Mark Dilger <mark.dilger@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
* Get rid of our dependency on type "long" for memory size calculations.Tom Lane2025-01-31
| | | | | | | | | | | | | | | | | | | | | | | | | | Consistently use "Size" (or size_t, or in some places int64 or double) as the type for variables holding memory allocation sizes. In most places variables' data types were fine already, but we had an ancient habit of computing bytes from kilobytes-units GUCs with code like "work_mem * 1024L". That risks overflow on Win64 where they did not make "long" as wide as "size_t". We worked around that by restricting such GUCs' ranges, so you couldn't set work_mem et al higher than 2GB on Win64. This patch removes that restriction, after replacing such calculations with "work_mem * (Size) 1024" or variants of that. It should be noted that this patch was constructed by searching outwards from the GUCs that have MAX_KILOBYTES as upper limit. So I can't positively guarantee there are no other places doing memory-size arithmetic in int or long variables. I do however feel pretty confident that increasing MAX_KILOBYTES on Win64 is safe now. Also, nothing in our code should be dealing in multiple-gigabyte allocations without authorization from a relevant GUC, so it seems pretty likely that this search caught everything that could be at risk of overflow. Author: Vladlen Popolitov <v.popolitov@postgrespro.ru> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1a01f0-66ec2d80-3b-68487680@27595217
* Add OLD/NEW support to RETURNING in DML queries.Dean Rasheed2025-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows the RETURNING list of INSERT/UPDATE/DELETE/MERGE queries to explicitly return old and new values by using the special aliases "old" and "new", which are automatically added to the query (if not already defined) while parsing its RETURNING list, allowing things like: RETURNING old.colname, new.colname, ... RETURNING old.*, new.* Additionally, a new syntax is supported, allowing the names "old" and "new" to be changed to user-supplied alias names, e.g.: RETURNING WITH (OLD AS o, NEW AS n) o.colname, n.colname, ... This is useful when the names "old" and "new" are already defined, such as inside trigger functions, allowing backwards compatibility to be maintained -- the interpretation of any existing queries that happen to already refer to relations called "old" or "new", or use those as aliases for other relations, is not changed. For an INSERT, old values will generally be NULL, and for a DELETE, new values will generally be NULL, but that may change for an INSERT with an ON CONFLICT ... DO UPDATE clause, or if a query rewrite rule changes the command type. Therefore, we put no restrictions on the use of old and new in any DML queries. Dean Rasheed, reviewed by Jian He and Jeff Davis. Discussion: https://postgr.es/m/CAEZATCWx0J0-v=Qjc6gXzR=KtsdvAE7Ow=D=mu50AgOe+pvisQ@mail.gmail.com
* Add support for NOT ENFORCED in CHECK constraintsPeter Eisentraut2025-01-11
| | | | | | | | | | | | | | | | | | | This adds support for the NOT ENFORCED/ENFORCED flag for constraints, with support for check constraints. The plan is to eventually support this for foreign key constraints, where it is typically more useful. Note that CHECK constraints do not currently support ALTER operations, so changing the enforceability of an existing constraint isn't possible without dropping and recreating it. This could be added later. Author: Amul Sul <amul.sul@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: jian he <jian.universality@gmail.com> Tested-by: Triveni N <triveni.n@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
* Update copyright for 2025Bruce Momjian2025-01-01
| | | | Backpatch-through: 13
* Fix bitmap table scan crash on iterator releaseMelanie Plageman2024-12-19
| | | | | | | | | | | 1a0da347a7ac98db replaced Bitmap Table Scan's individual private and shared iterators with a unified iterator. It neglected, however, to check if the iterator had already been cleaned up before doing so on rescan. Add this check both on rescan and end scan to be safe. Reported-by: Richard Guo Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs48nrhcLY1kcd-u9oD%2B6yiS631F_8Fx8ZGsO-BYDwH%2Bbyw%40mail.gmail.com
* Add common interface for TBMIteratorsMelanie Plageman2024-12-18
| | | | | | | | | | | | | Add and use TBMPrivateIterator, which replaces the current TBMIterator for serial use cases, and repurpose TBMIterator to be a unified interface for both the serial ("private") and parallel ("shared") TID Bitmap iterator interfaces. This encapsulation simplifies call sites for callers supporting both parallel and serial TID Bitmap access. TBMIterator is not yet used in this commit. Author: Melanie Plageman Reviewed-by: Tomas Vondra, Heikki Linnakangas Discussion: https://postgr.es/m/063e4eb4-32d9-439e-a0b1-75565a9835a8%40iki.fi
* Remove useless casts to (void *)Peter Eisentraut2024-11-28
| | | | | | | | Many of them just seem to have been copied around for no real reason. Their presence causes (small) risks of hiding actual type mismatches or silently discarding qualifiers Discussion: https://www.postgresql.org/message-id/flat/461ea37c-8b58-43b4-9736-52884e862820@eisentraut.org
* Add pg_constraint rows for not-null constraintsÁlvaro Herrera2024-11-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We now create contype='n' pg_constraint rows for not-null constraints on user tables. Only one such constraint is allowed for a column. We propagate these constraints to other tables during operations such as adding inheritance relationships, creating and attaching partitions and creating tables LIKE other tables. These related constraints mostly follow the well-known rules of conislocal and coninhcount that we have for CHECK constraints, with some adaptations: for example, as opposed to CHECK constraints, we don't match not-null ones by name when descending a hierarchy to alter or remove it, instead matching by the name of the column that they apply to. This means we don't require the constraint names to be identical across a hierarchy. The inheritance status of these constraints can be controlled: now we can be sure that if a parent table has one, then all children will have it as well. They can optionally be marked NO INHERIT, and then children are free not to have one. (There's currently no support for altering a NO INHERIT constraint into inheriting down the hierarchy, but that's a desirable future feature.) This also opens the door for having these constraints be marked NOT VALID, as well as allowing UNIQUE+NOT NULL to be used for functional dependency determination, as envisioned by commit e49ae8d3bc58. It's likely possible to allow DEFERRABLE constraints as followup work, as well. psql shows these constraints in \d+, though we may want to reconsider if this turns out to be too noisy. Earlier versions of this patch hid constraints that were on the same columns of the primary key, but I'm not sure that that's very useful. If clutter is a problem, we might be better off inventing a new \d++ command and not showing the constraints in \d+. For now, we omit these constraints on system catalog columns, because they're unlikely to achieve anything. The main difference to the previous attempt at this (b0e96f311985) is that we now require that such a constraint always exists when a primary key is in the column; we didn't require this previously which had a number of unpalatable consequences. With this requirement, the code is easier to reason about. For example: - We no longer have "throwaway constraints" during pg_dump. We needed those for the case where a table had a PK without a not-null underneath, to prevent a slow scan of the data during restore of the PK creation, which was particularly problematic for pg_upgrade. - We no longer have to cope with attnotnull being set spuriously in case a primary key is dropped indirectly (e.g., via DROP COLUMN). Some bits of code in this patch were authored by Jian He. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Bernd Helmle <mailings@oopsware.de> Reviewed-by: 何建 (jian he) <jian.universality@gmail.com> Reviewed-by: 王刚 (Tender Wang) <tndrwang@gmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/202408310358.sdhumtyuy2ht@alvherre.pgsql
* Add a parse location field to struct FunctionParameter.Tom Lane2024-10-31
| | | | | | | | | | | | | | | | This allows an error cursor to be supplied for a bunch of bad-function-definition errors that previously lacked one, or that cheated a bit by pointing at the contained type name when the error isn't really about that. Bump catversion from an abundance of caution --- I don't think this node type can actually appear in stored views/rules, but better safe than sorry. Jian He and Tom Lane (extracted from a larger patch by Jian, with some additional work by me) Discussion: https://postgr.es/m/CACJufxEmONE3P2En=jopZy1m=cCCUs65M4+1o52MW5og9oaUPA@mail.gmail.com
* Improve parser's reporting of statement start locations.Tom Lane2024-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now, the parser's reporting of a statement's stmt_location included any preceding whitespace or comments. This isn't really desirable but was done to avoid accounting honestly for nonterminals that reduce to empty. It causes problems for pg_stat_statements, which partially compensates by manually stripping whitespace, but is not bright enough to strip /*-style comments. There will be more problems with an upcoming patch to improve reporting of errors in extension scripts, so it's time to do something about this. The thing we have to do to make it work right is to adjust YYLLOC_DEFAULT to scan the inputs of each production to find the first one that has a valid location (i.e., did not reduce to empty). In theory this adds a little bit of per-reduction overhead, but in practice it's negligible. I checked by measuring the time to run raw_parser() on the contents of information_schema.sql, and there was basically no change. Having done that, we can rely on any nonterminal that didn't reduce to completely empty to have a correct starting location, and we don't need the kluges the stmtmulti production formerly used. This should have a side benefit of allowing parse error reports to include an error position in some cases where they formerly failed to do so, due to trying to report the position of an empty nonterminal. I did not go looking for an example though. The one previously known case where that could happen (OptSchemaEltList) no longer needs the kluge it had; but I rather doubt that that was the only case. Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
* Fix unnecessary casts of copyObject() resultPeter Eisentraut2024-10-17
| | | | | | | | | The result is already of the correct type, so these casts don't do anything. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/637eeea8-5663-460b-a114-39572c0f6c6e%40eisentraut.org
* Show values of SET statements as constants in pg_stat_statementsMichael Paquier2024-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a continuation of work like 11c34b342bd7, done to reduce the bloat of pg_stat_statements by applying more normalization to query entries. This commit is able to detect and normalize values in VariableSetStmt, resulting in: SET conf_param = $1 Compared to other parse nodes, VariableSetStmt is embedded in much more places in the parser, impacting many query patterns in pg_stat_statements. A custom jumble function is used, with an extra field in the node to decide if arguments should be included in the jumbling or not, a location field being not enough for this purpose. This approach allows for a finer tuning. Clauses relying on one or more keywords are not normalized, for example: * DEFAULT * FROM CURRENT * List of keywords. SET SESSION CHARACTERISTICS AS TRANSACTION, where it is critical to differentiate different sets of options, is a good example of why normalization should not happen. Some queries use VariableSetStmt for some subclauses with SET, that also have their values normalized: - ALTER DATABASE - ALTER ROLE - ALTER SYSTEM - CREATE/ALTER FUNCTION ba90eac7a995 has added test coverage for most of the existing SET patterns. The expected output of these tests shows the difference this commit creates. Normalization could be perhaps applied to more portions of the grammar but what is done here is conservative, and good enough as a starting point. Author: Greg Sabino Mullane, Michael Paquier Discussion: https://postgr.es/m/36e5bffe-e989-194f-85c8-06e7bc88e6f7@amazon.com Discussion: https://postgr.es/m/B44FA29D-EBD0-4DD9-ABC2-16F1CB087074@amazon.com Discussion: https://postgr.es/m/CAKAnmmJtJY2jzQN91=2QAD2eAJAA-Per61eyO48-TyxEg-q0Rg@mail.gmail.com
* Add temporal PRIMARY KEY and UNIQUE constraintsPeter Eisentraut2024-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints. These are backed by GiST indexes instead of B-tree indexes, since they are essentially exclusion constraints with = for the scalar parts of the key and && for the temporal part. (previously committed as 46a0cd4cefb, reverted by 46a0cd4cefb; the new part is this:) Because 'empty' && 'empty' is false, the temporal PK/UQ constraint allowed duplicates, which is confusing to users and breaks internal expectations. For instance, when GROUP BY checks functional dependencies on the PK, it allows selecting other columns from the table, but in the presence of duplicate keys you could get the value from any of their rows. So we need to forbid empties. This all means that at the moment we can only support ranges and multiranges for temporal PK/UQs, unlike the original patch (above). Documentation and tests for this are added. But this could conceivably be extended by introducing some more general support for the notion of "empty" for other types. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
* Introduce an RTE for the grouping stepRichard Guo2024-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there are subqueries in the grouping expressions, each of these subqueries in the targetlist and HAVING clause is expanded into distinct SubPlan nodes. As a result, only one of these SubPlan nodes would be converted to reference to the grouping key column output by the Agg node; others would have to get evaluated afresh. This is not efficient, and with grouping sets this can cause wrong results issues in cases where they should go to NULL because they are from the wrong grouping set. Furthermore, during re-evaluation, these SubPlan nodes might use nulled column values from grouping sets, which is not correct. This issue is not limited to subqueries. For other types of expressions that are part of grouping items, if they are transformed into another form during preprocessing, they may fail to match lower target items. This can also lead to wrong results with grouping sets. To fix this issue, we introduce a new kind of RTE representing the output of the grouping step, with columns that are the Vars or expressions being grouped on. In the parser, we replace the grouping expressions in the targetlist and HAVING clause with Vars referencing this new RTE, so that the output of the parser directly expresses the semantic requirement that the grouping expressions be gotten from the grouping output rather than computed some other way. In the planner, we first preprocess all the columns of this new RTE and then replace any Vars in the targetlist and HAVING clause that reference this new RTE with the underlying grouping expressions, so that we will have only one instance of a SubPlan node for each subquery contained in the grouping expressions. Bump catversion because this changes the querytree produced by the parser. Thanks to Tom Lane for the idea to invent a new kind of RTE. Per reports from Geoff Winkless, Tobias Wendorff, Richard Guo from various threads. Author: Richard Guo Reviewed-by: Ashutosh Bapat, Sutou Kouhei Discussion: https://postgr.es/m/CAMbWs4_dp7e7oTwaiZeBX8+P1rXw4ThkZxh1QG81rhu9Z47VsQ@mail.gmail.com
* Convert node test compile-time settings into run-time parametersPeter Eisentraut2024-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This converts COPY_PARSE_PLAN_TREES WRITE_READ_PARSE_PLAN_TREES RAW_EXPRESSION_COVERAGE_TEST into run-time parameters debug_copy_parse_plan_trees debug_write_read_parse_plan_trees debug_raw_expression_coverage_test They can be activated for tests using PG_TEST_INITDB_EXTRA_OPTS. The compile-time symbols are kept for build farm compatibility, but they now just determine the default value of the run-time settings. Furthermore, support for these settings is not compiled in at all unless assertions are enabled, or the new symbol DEBUG_NODE_TESTS_ENABLED is defined at compile time, or any of the legacy compile-time setting symbols are defined. So there is no run-time overhead in production builds. (This is similar to the handling of DISCARD_CACHES_ENABLED.) Discussion: https://www.postgresql.org/message-id/flat/30747bd8-f51e-4e0c-a310-a6e2c37ec8aa%40eisentraut.org
* SQL/JSON: Always coerce JsonExpr result at runtimeAmit Langote2024-06-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of looking up casts at parse time for converting the result of JsonPath* query functions to the specified or the default RETURNING type, always perform the conversion at runtime using either the target type's input function or the function json_populate_type(). There are two motivations for this change: 1. json_populate_type() coerces to types with typmod such that any string values that exceed length limit cause an error instead of silent truncation, which is necessary to be standard-conforming. 2. It was possible to end up with a cast expression that doesn't support soft handling of errors causing bugs in the of handling ON ERROR clause. JsonExpr.coercion_expr which would store the cast expression is no longer necessary, so remove. Bump catversion because stored rules change because of the above removal. Reported-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: Discussion: https://postgr.es/m/202405271326.5a5rprki64aw%40alvherre.pgsql
* Fix query pullup issue with WindowClause runConditionDavid Rowley2024-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 94985c210 added code to detect when WindowFuncs were monotonic and allowed additional quals to be "pushed down" into the subquery to be used as WindowClause runConditions in order to short-circuit execution in nodeWindowAgg.c. The Node representation of runConditions wasn't well selected and because we do qual pushdown before planning the subquery, the planning of the subquery could perform subquery pull-up of nested subqueries. For WindowFuncs with args, the arguments could be changed after pushing the qual down to the subquery. This was made more difficult by the fact that the code duplicated the WindowFunc inside an OpExpr to include in the WindowClauses runCondition field. This could result in duplication of subqueries and a pull-up of such a subquery could result in another initplan parameter being issued for the 2nd version of the subplan. This could result in errors such as: ERROR: WindowFunc not found in subplan target lists To fix this, we change the node representation of these run conditions and instead of storing an OpExpr containing the WindowFunc in a list inside WindowClause, we now store a new node type named WindowFuncRunCondition within a new field in the WindowFunc. These get transformed into OpExprs later in planning once subquery pull-up has been performed. This problem did exist in v15 and v16, but that was fixed by 9d36b883b and e5d20bbd. Cat version bump due to new node type and modifying WindowFunc struct. Bug: #18305 Reported-by: Zuming Jiang Discussion: https://postgr.es/m/18305-33c49b4c830b37b3%40postgresql.org
* Fix an assortment of typosDavid Rowley2024-05-04
| | | | | Author: Alexander Lakhin Discussion: https://postgr.es/m/ae9f2fcb-4b24-5bb0-4240-efbbbd944ca1@gmail.com
* revert: Transform OR clauses to ANY expressionAlexander Korotkov2024-04-10
| | | | | | | This commit reverts 72bd38cc99 due to implementation and design issues. Reported-by: Tom Lane Discussion: https://postgr.es/m/3604469.1712628736%40sss.pgh.pa.us
* JSON_TABLE: Add support for NESTED paths and columnsAmit Langote2024-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A NESTED path allows to extract data from nested levels of JSON objects given by the parent path expression, which are projected as columns specified using a nested COLUMNS clause, just like the parent COLUMNS clause. Rows comprised from a NESTED columns are "joined" to the row comprised from the parent columns. If a particular NESTED path evaluates to 0 rows, then the nested COLUMNS will emit NULLs, making it an OUTER join. NESTED columns themselves may include NESTED paths to allow extracting data from arbitrary nesting levels, which are likewise joined against the rows at the parent level. Multiple NESTED paths at a given level are called "sibling" paths and their rows are combined by UNIONing them, that is, after being joined against the parent row as described above. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
* Transform OR clauses to ANY expressionAlexander Korotkov2024-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | Replace (expr op C1) OR (expr op C2) ... with expr op ANY(ARRAY[C1, C2, ...]) on the preliminary stage of optimization when we are still working with the expression tree. Here Cn is a n-th constant expression, 'expr' is non-constant expression, 'op' is an operator which returns boolean result and has a commuter (for the case of reverse order of constant and non-constant parts of the expression, like 'Cn op expr'). Sometimes it can lead to not optimal plan. This is why there is a or_to_any_transform_limit GUC. It specifies a threshold value of length of arguments in an OR expression that triggers the OR-to-ANY transformation. Generally, more groupable OR arguments mean that transformation will be more likely to win than to lose. Discussion: https://postgr.es/m/567ED6CA.2040504%40sigaev.ru Author: Alena Rybakina <lena.ribackina@yandex.ru> Author: Andrey Lepikhov <a.lepikhov@postgrespro.ru> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Ranier Vilela <ranier.vf@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com>
* Introduce a bump memory allocatorDavid Rowley2024-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces a bump MemoryContext type. The bump context is best suited for short-lived memory contexts which require only allocations of memory and never a pfree or repalloc, which are unsupported. Memory palloc'd into a bump context has no chunk header. This makes bump a useful context type when lots of small allocations need to be done without any need to pfree those allocations. Allocation sizes are rounded up to the next MAXALIGN boundary, so with this and no chunk header, allocations are very compact indeed. Allocations are also very fast as bump does not check any freelists to try and make use of previously free'd chunks. It just checks if there is enough room on the current block, and if so it bumps the freeptr beyond this chunk and returns the value that the freeptr was previously pointing to. Simple and fast. A new block is malloc'd when there's not enough space in the current block. Code using the bump allocator must take care never to call any functions which could try to call realloc() (or variants), pfree(), GetMemoryChunkContext() or GetMemoryChunkSpace() on a bump allocated chunk. Due to lack of chunk headers, these operations are unsupported. To increase the chances of catching such issues, when compiled with MEMORY_CONTEXT_CHECKING, bump allocated chunks are given a header and any attempt to perform an unsupported operation will result in an ERROR. Without MEMORY_CONTEXT_CHECKING, code attempting an unsupported operation could result in a segfault. A follow-on commit will implement the first user of bump. Author: David Rowley Reviewed-by: Nathan Bossart Reviewed-by: Matthias van de Meent Reviewed-by: Tomas Vondra Reviewed-by: John Naylor Discussion: https://postgr.es/m/CAApHDvqGSpCU95TmM=Bp=6xjL_nLys4zdZOpfNyWBk97Xrdj2w@mail.gmail.com
* Add basic JSON_TABLE() functionalityAmit Langote2024-04-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | JSON_TABLE() allows JSON data to be converted into a relational view and thus used, for example, in a FROM clause, like other tabular data. Data to show in the view is selected from a source JSON object using a JSON path expression to get a sequence of JSON objects that's called a "row pattern", which becomes the source to compute the SQL/JSON values that populate the view's output columns. Column values themselves are computed using JSON path expressions applied to each of the JSON objects comprising the "row pattern", for which the SQL/JSON query functions added in 6185c9737cf4 are used. To implement JSON_TABLE() as a table function, this augments the TableFunc and TableFuncScanState nodes that are currently used to support XMLTABLE() with some JSON_TABLE()-specific fields. Note that the JSON_TABLE() spec includes NESTED COLUMNS and PLAN clauses, which are required to provide more flexibility to extract data out of nested JSON objects, but they are not implemented here to keep this commit of manageable size. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
* Add support for MERGE ... WHEN NOT MATCHED BY SOURCE.Dean Rasheed2024-03-30
| | | | | | | | | | | | | | | | | | | This allows MERGE commands to include WHEN NOT MATCHED BY SOURCE actions, which operate on rows that exist in the target relation, but not in the data source. These actions can execute UPDATE, DELETE, or DO NOTHING sub-commands. This is in contrast to already-supported WHEN NOT MATCHED actions, which operate on rows that exist in the data source, but not in the target relation. To make this distinction clearer, such actions may now be written as WHEN NOT MATCHED BY TARGET. Writing WHEN NOT MATCHED without specifying BY SOURCE or BY TARGET is equivalent to writing WHEN NOT MATCHED BY TARGET. Dean Rasheed, reviewed by Alvaro Herrera, Ted Yu and Vik Fearing. Discussion: https://postgr.es/m/CAEZATCWqnKGc57Y_JanUBHQXNKcXd7r=0R4NEZUVwP+syRkWbA@mail.gmail.com
* Do not output actual value of location fields in node serialization by defaultPeter Eisentraut2024-03-22
| | | | | | | | | | | | | | | | | | | This changes nodeToString() to not output the actual value of location fields in nodes, but instead it writes -1. This mirrors the fact that stringToNode() also does not read location field values but always stores -1. For most uses of nodeToString(), which is to store nodes in catalog fields, this is more useful. We don't store original query texts in catalogs, so any lingering query location values are not meaningful. For debugging purposes, there is a new nodeToStringWithLocations(), which mirrors the existing stringToNodeWithLocations(). This is used for WRITE_READ_PARSE_PLAN_TREES and nodes/print.c functions, which covers all the debugging uses. Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEze2WgrCiR3JZmWyB0YTc8HV7ewRdx13j0CqD6mVkYAW+SFGQ@mail.gmail.com
* Make RangeTblEntry dump order consistentPeter Eisentraut2024-03-22
| | | | | | | | | | | Put the fields alias and eref earlier in the struct, so that it matches the order in _outRangeTblEntry()/_readRangeTblEntry(). This helps if we ever want to fully automate out/read of RangeTblEntry. Also, it makes dumps in the debugger easier to read in the same way. Internally, this makes no difference. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://www.postgresql.org/message-id/flat/4b27fc50-8cd6-46f5-ab20-88dbaadca645@eisentraut.org
* Remove custom _jumbleRangeTblEntry()Peter Eisentraut2024-03-22
| | | | | | | | | | | | | | | | | | This is part of an effort to reduce the number of special cases in the automatically generated node support functions. This patch removes _jumbleRangeTblEntry() and instead adds per-field query_jumble_ignore annotations to match the behavior of the previous custom code. The pg_stat_statements test suite has some coverage of this. It gets rid of the switch on rtekind; this should be technically correct, since we do the equal and copy functions like this also. The list of fields to jumble has been checked and is considered correct as of 8b29a119fd. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://www.postgresql.org/message-id/flat/4b27fc50-8cd6-46f5-ab20-88dbaadca645@eisentraut.org
* Add SQL/JSON query functionsAmit Langote2024-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
* gen_node_support.pl: Mark location fields as type alias ParseLocPeter Eisentraut2024-03-19
| | | | | | | | | | Instead of the rather ugly type=int + name ~= location$, we now have a marker type for offset pointers or sizes that are only relevant when a query text is included, which decreases the complexity required in gen_node_support.pl for handling these values. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEze2WgrCiR3JZmWyB0YTc8HV7ewRdx13j0CqD6mVkYAW+SFGQ@mail.gmail.com
* Add RETURNING support to MERGE.Dean Rasheed2024-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | This allows a RETURNING clause to be appended to a MERGE query, to return values based on each row inserted, updated, or deleted. As with plain INSERT, UPDATE, and DELETE commands, the returned values are based on the new contents of the target table for INSERT and UPDATE actions, and on its old contents for DELETE actions. Values from the source relation may also be returned. As with INSERT/UPDATE/DELETE, the output of MERGE ... RETURNING may be used as the source relation for other operations such as WITH queries and COPY commands. Additionally, a special function merge_action() is provided, which returns 'INSERT', 'UPDATE', or 'DELETE', depending on the action executed for each row. The merge_action() function can be used anywhere in the RETURNING list, including in arbitrary expressions and subqueries, but it is an error to use it anywhere outside of a MERGE query's RETURNING list. Dean Rasheed, reviewed by Isaac Morland, Vik Fearing, Alvaro Herrera, Gurjeet Singh, Jian He, Jeff Davis, Merlin Moncure, Peter Eisentraut, and Wolfgang Walther. Discussion: http://postgr.es/m/CAEZATCWePEGQR5LBn-vD6SfeLZafzEm2Qy_L_Oky2=qw2w3Pzg@mail.gmail.com
* Fix description and grouping of RangeTblEntry.inhPeter Eisentraut2024-03-07
| | | | | | | | | | | | | | | | The inh field of RangeTblEntry was doubly confusingly documented. Some parts of the code insisted that it was only valid for RTE_RELATION entries, other parts said the field was valid for all entries. Neither was quite correct. More correctly, the field is valid for RTE_RELATION entries but is also used in the planner for RTE_SUBQUERY entries. So it makes more sense to group it with other fields that are primarily for RTE_RELATION but borrowed by RTE_SUBQUERY. (The exact position was chosen so that it is next to relkind for better struct packing, and next to relid, since relid and inh are sort of the input fields and the others are filled in later.) Also add documentation for the planner's use at the struct definition. Discussion: https://www.postgresql.org/message-id/6c1fbccc-85c8-40d3-b08b-4f47f2093711@eisentraut.org
* Move some bitmap logic out of bitmapset.cJohn Naylor2024-03-06
| | | | | | | | | Move the logic for selecting appropriate pg_bitutils.h functions based on word size to bitmapset.h for wider visibility. Reviewed (in a previous version) by Tom Lane Discussion: https://postgr.es/m/CAFBsxsFW2JjTo58jtDB%2B3sZhxMx3t-3evew8%3DAcr%2BGGhC%2BkFaA%40mail.gmail.com