aboutsummaryrefslogtreecommitdiff
path: root/contrib/postgres_fdw/deparse.c
Commit message (Collapse)AuthorAge
* Convert PathKey to use CompareTypePeter Eisentraut2025-04-04
| | | | | | | | | | | | | | | Change the PathKey struct to use CompareType to record the sort direction instead of hardcoding btree strategy numbers. The CompareType is then converted to the index-type-specific strategy when the plan is created. This reduces the number of places btree strategy numbers are hardcoded, and it's a self-contained subset of a larger effort to allow non-btree indexes to behave like btrees. Author: Mark Dilger <mark.dilger@enterprisedb.com> Co-authored-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
* Update copyright for 2025Bruce Momjian2025-01-01
| | | | Backpatch-through: 13
* Remove unused #include's from contrib, pl, test .c filesPeter Eisentraut2024-10-28
| | | | | | | | | as determined by IWYU Similar to commit dbbca2cf299, but for contrib, pl, and src/test/. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
* Fixup various StringInfo function usagesDavid Rowley2024-04-10
| | | | | | | | | | | | | | | This adjusts various appendStringInfo* function calls to use a more appropriate and efficient function with the same behavior. For example, use appendStringInfoChar() when appending a single character rather than appendStringInfo() and appendStringInfoString() when no formatting is required rather than using appendStringInfo(). All adjustments made here are in code that's new to v17, so it makes sense to fix these now rather than wait a few years and make backpatching harder. Discussion: https://postgr.es/m/CAApHDvojY2UvMiO+9_55ArTj10P1LBNJyyoGB+C65BLDNT0GsQ@mail.gmail.com Reviewed-by: Nathan Bossart, Tom Lane
* Make the order of the header file includes consistentPeter Eisentraut2024-03-13
| | | | | | | | Similar to commit 7e735035f20. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4-WhpCFMbXCjtJ%2BFzmjfPrp7Hw1pk4p%2BZpU95Kh3ofZ1A%40mail.gmail.com
* Fix deparsing of Consts in postgres_fdw ORDER BYDavid Rowley2024-03-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For UNION ALL queries where a union child query contained a foreign table, if the targetlist of that query contained a constant, and the top-level query performed an ORDER BY which contained the column for the constant value, then postgres_fdw would find the EquivalenceMember with the Const and then try to produce an ORDER BY containing that Const. This caused problems with INT typed Consts as these could appear to be requests to order by an ordinal column position rather than the constant value. This could lead to either an error such as: ERROR: ORDER BY position <int const> is not in select list or worse, if the constant value is a valid column, then we could just sort by the wrong column altogether. Here we fix this issue by just not including these Consts in the ORDER BY clause. In passing, add a new section for testing ORDER BY in the postgres_fdw tests and move two existing tests which were misplaced in the WHERE clause testing section into it. Reported-by: Michał Kłeczek Reviewed-by: Ashutosh Bapat, Richard Guo Bug: #18381 Discussion: https://postgr.es/m/0714C8B8-8D82-4ABB-9F8D-A0C3657E7B6E%40kleczek.org Discussion: https://postgr.es/m/18381-137456acd168bf93%40postgresql.org Backpatch-through: 12, oldest supported version
* Make dblink interruptible, via new libpqsrv APIs.Noah Misch2024-01-08
| | | | | | | | | | | | | | | This replaces dblink's blocking libpq calls, allowing cancellation and allowing DROP DATABASE (of a database not involved in the query). Apart from explicit dblink_cancel_query() calls, dblink still doesn't cancel the remote side. The replacement for the blocking calls consists of new, general-purpose query execution wrappers in the libpqsrv facility. Out-of-tree extensions should adopt these. Use them in postgres_fdw, replacing a local implementation from which the libpqsrv implementation derives. This is a bug fix for dblink. Code inspection identified the bug at least thirteen years ago, but user complaints have not appeared. Hence, no back-patch for now. Discussion: https://postgr.es/m/20231122012945.74@rfd.leadboat.com
* Update copyright for 2024Bruce Momjian2024-01-03
| | | | | | | | Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
* Add support for deparsing semi-joins to contrib/postgres_fdwAlexander Korotkov2023-12-05
| | | | | | | | | | | | | | | | | | | SEMI-JOIN is deparsed as the EXISTS subquery. It references outer and inner relations, so it should be evaluated as the condition in the upper-level WHERE clause. The signatures of deparseFromExprForRel() and deparseRangeTblRef() are revised so that they can add conditions to the upper level. PgFdwRelationInfo now has a hidden_subquery_rels field, referencing the relids used in the inner parts of semi-join. They can't be referred to from upper relations and should be used internally for equivalence member searches. The planner can create semi-join, which refers to inner rel vars in its target list. However, we deparse semi-join as an exists() subquery. So we skip the case when the target list references to inner rel of semi-join. Author: Alexander Pyhalov Reviewed-by: Ashutosh Bapat, Ian Lawrence Barwick, Yuuki Fujii, Tomas Vondra Discussion: https://postgr.es/m/c9e2a757cf3ac2333714eaf83a9cc184@postgrespro.ru
* Add trailing commas to enum definitionsPeter Eisentraut2023-10-26
| | | | | | | | | | | | | | | | | | | | Since C99, there can be a trailing comma after the last value in an enum definition. A lot of new code has been introducing this style on the fly. Some new patches are now taking an inconsistent approach to this. Some add the last comma on the fly if they add a new last value, some are trying to preserve the existing style in each place, some are even dropping the last comma if there was one. We could nudge this all in a consistent direction if we just add the trailing commas everywhere once. I omitted a few places where there was a fixed "last" value that will always stay last. I also skipped the header files of libpq and ecpg, in case people want to use those with older compilers. There were also a small number of cases where the enum type wasn't used anywhere (but the enum values were), which ended up confusing pgindent a bit, so I left those alone. Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org
* Fix various typos in code and testsMichael Paquier2023-02-09
| | | | | | | | Most of these are recent, and the documentation portions are new as of v16 so there is no need for a backpatch. Author: Justin Pryzby Discussion: https://postgr.es/m/20230208155644.GM1653@telsasoft.com
* Make Vars be outer-join-aware.Tom Lane2023-01-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traditionally we used the same Var struct to represent the value of a table column everywhere in parse and plan trees. This choice predates our support for SQL outer joins, and it's really a pretty bad idea with outer joins, because the Var's value can depend on where it is in the tree: it might go to NULL above an outer join. So expression nodes that are equal() per equalfuncs.c might not represent the same value, which is a huge correctness hazard for the planner. To improve this, decorate Var nodes with a bitmapset showing which outer joins (identified by RTE indexes) may have nulled them at the point in the parse tree where the Var appears. This allows us to trust that equal() Vars represent the same value. A certain amount of klugery is still needed to cope with cases where we re-order two outer joins, but it's possible to make it work without sacrificing that core principle. PlaceHolderVars receive similar decoration for the same reason. In the planner, we include these outer join bitmapsets into the relids that an expression is considered to depend on, and in consequence also add outer-join relids to the relids of join RelOptInfos. This allows us to correctly perceive whether an expression can be calculated above or below a particular outer join. This change affects FDWs that want to plan foreign joins. They *must* follow suit when labeling foreign joins in order to match with the core planner, but for many purposes (if postgres_fdw is any guide) they'd prefer to consider only base relations within the join. To support both requirements, redefine ForeignScan.fs_relids as base+OJ relids, and add a new field fs_base_relids that's set up by the core planner. Large though it is, this commit just does the minimum necessary to install the new mechanisms and get check-world passing again. Follow-up patches will perform some cleanup. (The README additions and comments mention some stuff that will appear in the follow-up.) Patch by me; thanks to Richard Guo for review. Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us
* Remove redundant grouping and DISTINCT columns.Tom Lane2023-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid explicitly grouping by columns that we know are redundant for sorting, for example we need group by only one of x and y in SELECT ... WHERE x = y GROUP BY x, y This comes up more often than you might think, as shown by the changes in the regression tests. It's nearly free to detect too, since we are just piggybacking on the existing logic that detects redundant pathkeys. (In some of the existing plans that change, it's visible that a sort step preceding the grouping step already didn't bother to sort by the redundant column, making the old plan a bit silly-looking.) To do this, build processed_groupClause and processed_distinctClause lists that omit any provably-redundant sort items, and consult those not the originals where relevant. This means that within the planner, one should usually consult root->processed_groupClause or root->processed_distinctClause if one wants to know which columns are to be grouped on; but to check whether grouping or distinct-ing is happening at all, check non-NIL-ness of parse->groupClause or parse->distinctClause. This is comparable to longstanding rules about handling the HAVING clause, so I don't think it'll be a huge maintenance problem. nodeAgg.c also needs minor mods, because it's now possible to generate AGG_PLAIN and AGG_SORTED Agg nodes with zero grouping columns. Patch by me; thanks to Richard Guo and David Rowley for review. Discussion: https://postgr.es/m/185315.1672179489@sss.pgh.pa.us
* Check relkind before using TABLESAMPLE in postgres_fdwTomas Vondra2023-01-07
| | | | | | | | | | | | | | | | | | | Check the remote relkind before trying to use TABLESAMPLE to acquire sample from the remote relation. Even if the remote server version has TABLESAMPLE support, the foreign table may point to incompatible relkind (e.g. a view or a sequence). If the relkind does not support TABLESAMPLE, error out if TABLESAMPLE was requested specifically (as system/bernoulli), or fallback to random just like we do for old server versions. We currently end up disabling sampling for such relkind values anyway, due to reltuples being -1 or 1, but that seems rather accidental, and might get broken by improving reltuples estimates, etc. So better to make the check explicit. Reported-by: Tom Lane Discussion: https://postgr.es/m/951485.1672461744%40sss.pgh.pa.us
* Update copyright for 2023Bruce Momjian2023-01-02
| | | | Backpatch-through: 11
* Sample postgres_fdw tables remotely during ANALYZETomas Vondra2022-12-30
| | | | | | | | | | | | | | | | | When collecting ANALYZE sample on foreign tables, postgres_fdw fetched all rows and performed the sampling locally. For large tables this means transferring and immediately discarding large amounts of data. This commit allows the sampling to be performed on the remote server, transferring only the much smaller sample. The sampling is performed using the built-in TABLESAMPLE methods (system, bernoulli) or random() function, depending on the remote server version. Remote sampling can be enabled by analyze_sampling on the foreign server and/or foreign table, with supported values 'off', 'auto', 'system', 'bernoulli' and 'random'. The default value is 'auto' which uses either 'bernoulli' (TABLESAMPLE method) or 'random' (for remote servers without TABLESAMPLE support).
* Fix final warnings produced by -Wshadow=compatible-localDavid Rowley2022-10-07
| | | | | | | | I thought I had these in d8df67bb1, but per report from Andres Freund, I missed some. Reviewed-by: Andres Freund Discussion: https://postgr.es/m/20221005214052.c4tkudawyp5wxt3c@awork3.anarazel.de
* Harmonize parameter names in contrib code.Peter Geoghegan2022-09-22
| | | | | | | | | | | Make sure that function declarations use names that exactly match the corresponding names from function definitions in contrib code. Like other recent commits that cleaned up function parameter names, this commit was written with help from clang-tidy. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com
* postgres_fdw: be more wary about shippability of reg* constants.Tom Lane2022-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | Don't consider a constant of regconfig or other reg* types to be shippable unless it refers to a built-in object, or an object in an extension that's been marked shippable. Without this restriction, we're too likely to send a constant that will fail to parse on the remote server. For the regconfig type only, consider OIDs up to 16383 to be "built in", rather than the normal cutoff of 9999. Otherwise the initdb-created text search configurations will be considered unshippable, which is unlikely to make anyone happy. It's possible that this new restriction will de-optimize queries that were working satisfactorily before. Users can restore any lost performance by making sure that objects that can be expected to exist on the remote side are in shippable extensions. However, that's not a change that people are likely to be happy about having to make after a minor-release update. Between that consideration and the lack of field complaints, let's just change this in HEAD. Noted while fixing bug #17483, although this is not precisely the problem that that report complained about. Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
* Fix postgres_fdw to check shippability of sort clauses properly.Tom Lane2022-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | postgres_fdw would push ORDER BY clauses to the remote side without verifying that the sort operator is safe to ship. Moreover, it failed to print a suitable USING clause if the sort operator isn't default for the sort expression's type. The net result of this is that the remote sort might not have anywhere near the semantics we expect, which'd be disastrous for locally-performed merge joins in particular. We addressed similar issues in the context of ORDER BY within an aggregate function call in commit 7012b132d, but failed to notice that query-level ORDER BY was broken. Thus, much of the necessary logic already existed, but it requires refactoring to be usable in both cases. Back-patch to all supported branches. In HEAD only, remove the core code's copy of find_em_expr_for_rel, which is no longer used and really should never have been pushed into equivclass.c in the first place. Ronan Dunklau, per report from David Rowley; reviews by David Rowley, Ranier Vilela, and myself Discussion: https://postgr.es/m/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com
* Update copyright for 2022Bruce Momjian2022-01-07
| | | | Backpatch-through: 10
* postgres_fdw: suppress casts on constants in limited cases.Tom Lane2021-11-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When deparsing an expression of the form "remote_var OP constant", we'd normally apply a cast to the constant to make sure that the remote parser thinks it's of the same type we do. However, doing so is often not necessary, and it causes problems if the user has intentionally declared the local column as being of a different type than the remote column. A plausible use-case for that is using text to represent a type that's an enum on the remote side. A comparison on such a column will get shipped as "var = 'foo'::text", which blows up on the remote side because there's no enum = text operator. But if we simply leave off the explicit cast, the comparison will do exactly what the user wants. It's possible to do this without major risk of semantic problems, by relying on the longstanding parser heuristic that "if one operand of an operator is of type unknown, while the other one has a known type, assume that the unknown operand is also of that type". Hence, this patch leaves off the cast only if (a) the operator inputs have the same type locally; (b) the constant will print as a string literal or NULL, both of which are initially taken as type unknown; and (c) the non-Const input is a plain foreign Var. Rule (c) guarantees that the remote parser will know the type of the non-Const input; moreover, it means that if this cast-omission does cause any semantic surprises, that can only happen in cases where the local column has a different type than the remote column. That wasn't guaranteed to work anyway, and this patch should represent a net usability gain for such cases. One point that I (tgl) remain slightly uncomfortable with is that we will ignore an implicit RelabelType when deciding if the non-Const input is a plain Var. That makes it a little squishy to argue that the remote should resolve the Const as being of the same type as its Var, because then our Const is not the same type as our Var. However, if we don't do that, then this hack won't work as desired if the user chooses to use varchar rather than text to represent some remote column. That seems useful, so do it like this for now. We might have to give up the RelabelType-ignoring bit if any problems surface. Dian Fay, with review and kibitzing by me Discussion: https://postgr.es/m/C9LU294V7K4F.34LRRDU449O45@lamia
* postgres_fdw: Fix issues with generated columns in foreign tables.Etsuro Fujita2021-08-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | postgres_fdw imported generated columns from the remote tables as plain columns, and caused failures like "ERROR: cannot insert a non-DEFAULT value into column "foo"" when inserting into the foreign tables, as it tried to insert values into the generated columns. To fix, we do the following under the assumption that generated columns in a postgres_fdw foreign table are defined so that they represent generated columns in the underlying remote table: * Send DEFAULT for the generated columns to the foreign server on insert or update, not generated column values computed on the local server. * Add to postgresImportForeignSchema() an option "import_generated" to include column generated expressions in the definitions of foreign tables imported from a foreign server. The option is true by default. The assumption seems reasonable, because that would make a query of the postgres_fdw foreign table return values for the generated columns that are consistent with the generated expression. While here, fix another issue in postgresImportForeignSchema(): it tried to include column generated expressions as column default expressions in the foreign table definitions when the import_default option was enabled. Per bug #16631 from Daniel Cherniy. Back-patch to v12 where generated columns were added. Discussion: https://postgr.es/m/16631-e929fe9db0ffc7cf%40postgresql.org
* In postgres_fdw, allow CASE expressions to be pushed to the remote server.Tom Lane2021-07-30
| | | | | | | | | | | This is simple enough except for the need to check whether CaseTestExpr nodes have a collation that is not derived from a remote Var. For that, examine the CASE's "arg" expression and then pass that info down into the recursive examination of the WHEN expressions. Alexander Pyhalov, reviewed by Gilles Darold and myself Discussion: https://postgr.es/m/fda09032e90d85d9b726a41e03f9097f@postgrespro.ru
* Initial pgindent and pgperltidy run for v14.Tom Lane2021-05-12
| | | | | | | | Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.
* Don't pass "ONLY" options specified in TRUNCATE to foreign data wrapper.Fujii Masao2021-04-27
| | | | | | | | | | | | | | | | | | | | | | Commit 8ff1c94649 allowed TRUNCATE command to truncate foreign tables. Previously the information about "ONLY" options specified in TRUNCATE command were passed to the foreign data wrapper. Then postgres_fdw constructed the TRUNCATE command to issue the remote server and included "ONLY" options in it based on the passed information. On the other hand, "ONLY" options specified in SELECT, UPDATE or DELETE have no effect when accessing or modifying the remote table, i.e., are not passed to the foreign data wrapper. So it's inconsistent to make only TRUNCATE command pass the "ONLY" options to the foreign data wrapper. Therefore this commit changes the TRUNCATE command so that it doesn't pass the "ONLY" options to the foreign data wrapper, for the consistency with other statements. Also this commit changes postgres_fdw so that it always doesn't include "ONLY" options in the TRUNCATE command that it constructs. Author: Fujii Masao Reviewed-by: Bharath Rupireddy, Kyotaro Horiguchi, Justin Pryzby, Zhihong Yu Discussion: https://postgr.es/m/551ed8c1-f531-818b-664a-2cecdab99cd8@oss.nttdata.com
* Allow TRUNCATE command to truncate foreign tables.Fujii Masao2021-04-08
| | | | | | | | | | | | | | | | | | | | | | | This commit introduces new foreign data wrapper API for TRUNCATE. It extends TRUNCATE command so that it accepts foreign tables as the targets to truncate and invokes that API. Also it extends postgres_fdw so that it can issue TRUNCATE command to foreign servers, by adding new routine for that TRUNCATE API. The information about options specified in TRUNCATE command, e.g., ONLY, CACADE, etc is passed to FDW via API. The list of foreign tables to truncate is also passed to FDW. FDW truncates the foreign data sources that the passed foreign tables specify, based on those information. For example, postgres_fdw constructs TRUNCATE command using them and issues it to the foreign server. For performance, TRUNCATE command invokes the FDW routine for TRUNCATE once per foreign server that foreign tables to truncate belong to. Author: Kazutaka Onishi, Kohei KaiGai, slightly modified by Fujii Masao Reviewed-by: Bharath Rupireddy, Michael Paquier, Zhihong Yu, Alvaro Herrera, Stephen Frost, Ashutosh Bapat, Amit Langote, Daniel Gustafsson, Ibrar Ahmed, Fujii Masao Discussion: https://postgr.es/m/CAOP8fzb_gkReLput7OvOK+8NHgw-RKqNv59vem7=524krQTcWA@mail.gmail.com Discussion: https://postgr.es/m/CAJuF6cMWDDqU-vn_knZgma+2GMaout68YUgn1uyDnexRhqqM5Q@mail.gmail.com
* Rework planning and execution of UPDATE and DELETE.Tom Lane2021-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes two closely related sets of changes: 1. For UPDATE, the subplan of the ModifyTable node now only delivers the new values of the changed columns (i.e., the expressions computed in the query's SET clause) plus row identity information such as CTID. ModifyTable must re-fetch the original tuple to merge in the old values of any unchanged columns. The core advantage of this is that the changed columns are uniform across all tables of an inherited or partitioned target relation, whereas the other columns might not be. A secondary advantage, when the UPDATE involves joins, is that less data needs to pass through the plan tree. The disadvantage of course is an extra fetch of each tuple to be updated. However, that seems to be very nearly free in context; even worst-case tests don't show it to add more than a couple percent to the total query cost. At some point it might be interesting to combine the re-fetch with the tuple access that ModifyTable must do anyway to mark the old tuple dead; but that would require a good deal of refactoring and it seems it wouldn't buy all that much, so this patch doesn't attempt it. 2. For inherited UPDATE/DELETE, instead of generating a separate subplan for each target relation, we now generate a single subplan that is just exactly like a SELECT's plan, then stick ModifyTable on top of that. To let ModifyTable know which target relation a given incoming row refers to, a tableoid junk column is added to the row identity information. This gets rid of the horrid hack that was inheritance_planner(), eliminating O(N^2) planning cost and memory consumption in cases where there were many unprunable target relations. Point 2 of course requires point 1, so that there is a uniform definition of the non-junk columns to be returned by the subplan. We can't insist on uniform definition of the row identity junk columns however, if we want to keep the ability to have both plain and foreign tables in a partitioning hierarchy. Since it wouldn't scale very far to have every child table have its own row identity column, this patch includes provisions to merge similar row identity columns into one column of the subplan result. In particular, we can merge the whole-row Vars typically used as row identity by FDWs into one column by pretending they are type RECORD. (It's still okay for the actual composite Datums to be labeled with the table's rowtype OID, though.) There is more that can be done to file down residual inefficiencies in this patch, but it seems to be committable now. FDW authors should note several API changes: * The argument list for AddForeignUpdateTargets() has changed, and so has the method it must use for adding junk columns to the query. Call add_row_identity_var() instead of manipulating the parse tree directly. You might want to reconsider exactly what you're adding, too. * PlanDirectModify() must now work a little harder to find the ForeignScan plan node; if the foreign table is part of a partitioning hierarchy then the ForeignScan might not be the direct child of ModifyTable. See postgres_fdw for sample code. * To check whether a relation is a target relation, it's no longer sufficient to compare its relid to root->parse->resultRelation. Instead, check it against all_result_relids or leaf_result_relids, as appropriate. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
* Implement support for bulk inserts in postgres_fdwTomas Vondra2021-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Extends the FDW API to allow batching inserts into foreign tables. That is usually much more efficient than inserting individual rows, due to high latency for each round-trip to the foreign server. It was possible to implement something similar in the regular FDW API, but it was inconvenient and there were issues with reporting the number of actually inserted rows etc. This extends the FDW API with two new functions: * GetForeignModifyBatchSize - allows the FDW picking optimal batch size * ExecForeignBatchInsert - inserts a batch of rows at once Currently, only INSERT queries support batching. Support for DELETE and UPDATE may be added in the future. This also implements batching for postgres_fdw. The batch size may be specified using "batch_size" option both at the server and table level. The initial patch version was written by me, but it was rewritten and improved in many ways by Takayuki Tsunakawa. Author: Takayuki Tsunakawa Reviewed-by: Tomas Vondra, Amit Langote Discussion: https://postgr.es/m/20200628151002.7x5laxwpgvkyiu3q@development
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* Support subscripting of arbitrary types, not only arrays.Tom Lane2020-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch generalizes the subscripting infrastructure so that any data type can be subscripted, if it provides a handler function to define what that means. Traditional variable-length (varlena) arrays all use array_subscript_handler(), while the existing fixed-length types that support subscripting use raw_array_subscript_handler(). It's expected that other types that want to use subscripting notation will define their own handlers. (This patch provides no such new features, though; it only lays the foundation for them.) To do this, move the parser's semantic processing of subscripts (including coercion to whatever data type is required) into a method callback supplied by the handler. On the execution side, replace the ExecEvalSubscriptingRef* layer of functions with direct calls to callback-supplied execution routines. (Thus, essentially no new run-time overhead should be caused by this patch. Indeed, there is room to remove some overhead by supplying specialized execution routines. This patch does a little bit in that line, but more could be done.) Additional work is required here and there to remove formerly hard-wired assumptions about the result type, collation, etc of a SubscriptingRef expression node; and to remove assumptions that the subscript values must be integers. One useful side-effect of this is that we now have a less squishy mechanism for identifying whether a data type is a "true" array: instead of wiring in weird rules about typlen, we can look to see if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER. For this to be bulletproof, we have to forbid user-defined types from using that handler directly; but there seems no good reason for them to do so. This patch also removes assumptions that the number of subscripts is limited to MAXDIM (6), or indeed has any hard-wired limit. That limit still applies to types handled by array_subscript_handler or raw_array_subscript_handler, but to discourage other dependencies on this constant, I've moved it from c.h to utils/array.h. Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov, Peter Eisentraut, Pavel Stehule Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
* Remove support for postfix (right-unary) operators.Tom Lane2020-09-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | This feature has been a thorn in our sides for a long time, causing many grammatical ambiguity problems. It doesn't seem worth the pain to continue to support it, so remove it. There are some follow-on improvements we can make in the grammar, but this commit only removes the bare minimum number of productions, plus assorted backend support code. Note that pg_dump and psql continue to have full support, since they may be used against older servers. However, pg_dump warns about postfix operators. There is also a check in pg_upgrade. Documentation-wise, I (tgl) largely removed the "left unary" terminology in favor of saying "prefix operator", which is a more standard and IMO less confusing term. I included a catversion bump, although no initial catalog data changes here, to mark the boundary at which oprkind = 'r' stopped being valid in pg_operator. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
* In postgres_fdw, don't try to ship MULTIEXPR updates to remote server.Tom Lane2020-01-26
| | | | | | | | | | | | | | | | | | | | | | | | In a statement like "UPDATE remote_tab SET (x,y) = (SELECT ...)", we'd conclude that the statement could be directly executed remotely, because the sub-SELECT is in a resjunk tlist item that's not examined for shippability. Currently that ends up crashing if the sub-SELECT contains any remote Vars. Prevent the crash by deeming MULTIEXEC Params to be unshippable. This is a bit of a brute-force solution, since if the sub-SELECT *doesn't* contain any remote Vars, the current execution technology would work; but that's not a terribly common use-case for this syntax, I think. In any case, we generally don't try to ship sub-SELECTs, so it won't surprise anybody that this doesn't end up as a remote direct update. I'd be inclined to see if that general limitation can be fixed before worrying about this case further. Per report from Lukáš Sobotka. Back-patch to 9.6. 9.5 had MULTIEXPR, but we didn't try to perform remote direct updates then, so the case didn't arise anyway. Discussion: https://postgr.es/m/CAJif3k+iA_ekBB5Zw2hDBaE1wtiQa4LH4_JUXrrMGwTrH0J01Q@mail.gmail.com
* Update copyrights for 2020Bruce Momjian2020-01-01
| | | | Backpatch-through: update all files in master, backpatch legal files through 9.4
* Make the order of the header file includes consistent in contrib modules.Amit Kapila2019-10-24
| | | | | | | | | | | | | | The basic rule we follow here is to always first include 'postgres.h' or 'postgres_fe.h' whichever is applicable, then system header includes and then Postgres header includes.  In this, we also follow that all the Postgres header includes are in order based on their ASCII value.  We generally follow these rules, but the code has deviated in many places. This commit makes it consistent just for contrib modules. The later commits will enforce similar rules in other parts of code. Author: Vignesh C Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
* Rationalize use of list_concat + list_copy combinations.Tom Lane2019-08-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In the wake of commit 1cff1b95a, the result of list_concat no longer shares the ListCells of the second input. Therefore, we can replace "list_concat(x, list_copy(y))" with just "list_concat(x, y)". To improve call sites that were list_copy'ing the first argument, or both arguments, invent "list_concat_copy()" which produces a new list sharing no ListCells with either input. (This is a bit faster than "list_concat(list_copy(x), y)" because it makes the result list the right size to start with.) In call sites that were not list_copy'ing the second argument, the new semantics mean that we are usually leaking the second List's storage, since typically there is no remaining pointer to it. We considered inventing another list_copy variant that would list_free the second input, but concluded that for most call sites it isn't worth worrying about, given the relative compactness of the new List representation. (Note that in cases where such leakage would happen, the old code already leaked the second List's header; so we're only discussing the size of the leak not whether there is one. I did adjust two or three places that had been troubling to free that header so that they manually free the whole second List.) Patch by me; thanks to David Rowley for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
* Use appendBinaryStringInfo in more places where the length is knownDavid Rowley2019-07-23
| | | | | | | | | | When we already know the length that we're going to append, then it makes sense to use appendBinaryStringInfo instead of appendStringInfoString so that the append can be performed with a simple memcpy() using a known length rather than having to first perform a strlen() call to obtain the length. Discussion: https://postgr.es/m/CAKJS1f8+FRAM1s5+mAa3isajeEoAaicJ=4e0WzrH3tAusbbiMQ@mail.gmail.com
* Represent Lists as expansible arrays, not chains of cons-cells.Tom Lane2019-07-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit d0b4399d8) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext(). The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell * pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically *doesn't* require palloc or pfree, so except in long lists it's probably still faster than before.) Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
* Use appendStringInfoString and appendPQExpBufferStr where possibleDavid Rowley2019-07-04
| | | | | | | | | | This changes various places where appendPQExpBuffer was used in places where it was possible to use appendPQExpBufferStr, and likewise for appendStringInfo and appendStringInfoString. This is really just a stylistic improvement, but there are also small performance gains to be had from doing this. Discussion: http://postgr.es/m/CAKJS1f9P=M-3ULmPvr8iCno8yvfDViHibJjpriHU8+SXUgeZ=w@mail.gmail.com
* Phase 2 pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | | Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Avoid postgres_fdw crash for a targetlist entry that's just a Param.Tom Lane2019-04-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | foreign_grouping_ok() is willing to put fairly arbitrary expressions into the targetlist of a remote SELECT that's doing grouping or aggregation on the remote side, including expressions that have no foreign component to them at all. This is possibly a bit dubious from an efficiency standpoint; but it rises to the level of a crash-causing bug if the expression is just a Param or non-foreign Var. In that case, the expression will necessarily also appear in the fdw_exprs list of values we need to send to the remote server, and then setrefs.c's set_foreignscan_references will mistakenly replace the fdw_exprs entry with a Var referencing the targetlist result. The root cause of this problem is bad design in commit e7cb7ee14: it put logic into set_foreignscan_references that IMV is postgres_fdw-specific, and yet this bug shows that it isn't postgres_fdw-specific enough. The transformation being done on fdw_exprs assumes that fdw_exprs is to be evaluated with the fdw_scan_tlist as input, which is not how postgres_fdw uses it; yet it could be the right thing for some other FDW. (In the bigger picture, setrefs.c has no business assuming this for the other expression fields of a ForeignScan either.) The right fix therefore would be to expand the FDW API so that the FDW could inform setrefs.c how it intends to evaluate these various expressions. We can't change that in the back branches though, and we also can't just summarily change setrefs.c's behavior there, or we're likely to break external FDWs. As a stopgap, therefore, hack up postgres_fdw so that it won't attempt to send targetlist entries that look exactly like the fdw_exprs entries they'd produce. In most cases this actually produces a superior plan, IMO, with less data needing to be transmitted and returned; so we probably ought to think harder about whether we should ship tlist expressions at all when they don't contain any foreign Vars or Aggs. But that's an optimization not a bug fix so I left it for later. One case where this produces an inferior plan is where the expression in question is actually a GROUP BY expression: then the restriction prevents us from using remote grouping. It might be possible to work around that (since that would reduce to group-by-a-constant on the remote side); but it seems like a pretty unlikely corner case, so I'm not sure it's worth expending effort solely to improve that. In any case the right long-term answer is to fix the API as sketched above, and then revert this hack. Per bug #15781 from Sean Johnston. Back-patch to v10 where the problem was introduced. Discussion: https://postgr.es/m/15781-2601b1002bad087c@postgresql.org
* postgres_fdw: Perform the (FINAL, NULL) upperrel operations remotely.Etsuro Fujita2019-04-02
| | | | | | | | | | | | | | | | The upper-planner pathification allows FDWs to arrange to push down different types of upper-stage operations to the remote side. This commit teaches postgres_fdw to do it for the (FINAL, NULL) upperrel, which is responsible for doing LockRows, LIMIT, and/or ModifyTable. This provides the ability for postgres_fdw to handle SELECT commands so that it 1) skips the LockRows step (if any) (note that this is safe since it performs early locking) and 2) pushes down the LIMIT and/or OFFSET restrictions (if any) to the remote side. This doesn't handle the INSERT/UPDATE/DELETE cases. Author: Etsuro Fujita Reviewed-By: Antonin Houska and Jeff Janes Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk
* postgres_fdw: Perform the (ORDERED, NULL) upperrel operations remotely.Etsuro Fujita2019-04-02
| | | | | | | | | | | | | | The upper-planner pathification allows FDWs to arrange to push down different types of upper-stage operations to the remote side. This commit teaches postgres_fdw to do it for the (ORDERED, NULL) upperrel, which is responsible for evaluating the query's ORDER BY ordering. Since postgres_fdw is already able to evaluate that ordering remotely for foreign baserels and foreign joinrels (see commit aa09cd242f et al.), this adds support for that for foreign grouping relations. Author: Etsuro Fujita Reviewed-By: Antonin Houska and Jeff Janes Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk
* Renaming for new subscripting mechanismAlvaro Herrera2019-02-01
| | | | | | | | | | | | Over at patch https://commitfest.postgresql.org/21/1062/ Dmitry wants to introduce a more generic subscription mechanism, which allows subscripting not only arrays but also other object types such as JSONB. That functionality is introduced in a largish invasive patch, out of which this internal renaming patch was extracted. Author: Dmitry Dolgov Reviewed-by: Tom Lane, Arthur Zakirov Discussion: https://postgr.es/m/CA+q6zcUK4EqPAu7XRRO5CCjMwhz5zvg+rfWuLzVoxp_5sKS6=w@mail.gmail.com
* Refactor planner's header files.Tom Lane2019-01-29
| | | | | | | | | | | | | | | | | | | | | | | | Create a new header optimizer/optimizer.h, which exposes just the planner functions that can be used "at arm's length", without need to access Paths or the other planner-internal data structures defined in nodes/relation.h. This is intended to provide the whole planner API seen by most of the rest of the system; although FDWs still need to use additional stuff, and more thought is also needed about just what selfuncs.c should rely on. The main point of doing this now is to limit the amount of new #include baggage that will be needed by "planner support functions", which I expect to introduce later, and which will be in relevant datatype modules rather than anywhere near the planner. This commit just moves relevant declarations into optimizer.h from other header files (a couple of which go away because everything got moved), and adjusts #include lists to match. There's further cleanup that could be done if we want to decide that some stuff being exposed by optimizer.h doesn't belong in the planner at all, but I'll leave that for another day. Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
* Replace uses of heap_open et al with the corresponding table_* function.Andres Freund2019-01-21
| | | | | Author: Andres Freund Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de
* Replace heapam.h includes with {table, relation}.h where applicable.Andres Freund2019-01-21
| | | | | | | | | A lot of files only included heapam.h for relation_open, heap_open etc - replace the heapam.h include in those files with the narrower header. Author: Andres Freund Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Remove WITH OIDS support, change oid catalog column visibility.Andres Freund2018-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously tables declared WITH OIDS, including a significant fraction of the catalog tables, stored the oid column not as a normal column, but as part of the tuple header. This special column was not shown by default, which was somewhat odd, as it's often (consider e.g. pg_class.oid) one of the more important parts of a row. Neither pg_dump nor COPY included the contents of the oid column by default. The fact that the oid column was not an ordinary column necessitated a significant amount of special case code to support oid columns. That already was painful for the existing, but upcoming work aiming to make table storage pluggable, would have required expanding and duplicating that "specialness" significantly. WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0). Remove it. Removing includes: - CREATE TABLE and ALTER TABLE syntax for declaring the table to be WITH OIDS has been removed (WITH (oids[ = true]) will error out) - pg_dump does not support dumping tables declared WITH OIDS and will issue a warning when dumping one (and ignore the oid column). - restoring an pg_dump archive with pg_restore will warn when restoring a table with oid contents (and ignore the oid column) - COPY will refuse to load binary dump that includes oids. - pg_upgrade will error out when encountering tables declared WITH OIDS, they have to be altered to remove the oid column first. - Functionality to access the oid of the last inserted row (like plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed. The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false) for CREATE TABLE) is still supported. While that requires a bit of support code, it seems unnecessary to break applications / dumps that do not use oids, and are explicit about not using them. The biggest user of WITH OID columns was postgres' catalog. This commit changes all 'magic' oid columns to be columns that are normally declared and stored. To reduce unnecessary query breakage all the newly added columns are still named 'oid', even if a table's column naming scheme would indicate 'reloid' or such. This obviously requires adapting a lot code, mostly replacing oid access via HeapTupleGetOid() with access to the underlying Form_pg_*->oid column. The bootstrap process now assigns oids for all oid columns in genbki.pl that do not have an explicit value (starting at the largest oid previously used), only oids assigned later by oids will be above FirstBootstrapObjectId. As the oid column now is a normal column the special bootstrap syntax for oids has been removed. Oids are not automatically assigned during insertion anymore, all backend code explicitly assigns oids with GetNewOidWithIndex(). For the rare case that insertions into the catalog via SQL are called for the new pg_nextoid() function can be used (which only works on catalog tables). The fact that oid columns on system tables are now normal columns means that they will be included in the set of columns expanded by * (i.e. SELECT * FROM pg_class will now include the table's oid, previously it did not). It'd not technically be hard to hide oid column by default, but that'd mean confusing behavior would either have to be carried forward forever, or it'd cause breakage down the line. While it's not unlikely that further adjustments are needed, the scope/invasiveness of the patch makes it worthwhile to get merge this now. It's painful to maintain externally, too complicated to commit after the code code freeze, and a dependency of a number of other patches. Catversion bump, for obvious reasons. Author: Andres Freund, with contributions by John Naylor Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
* Fix WITH CHECK OPTION on views referencing postgres_fdw tables.Jeff Davis2018-07-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a view references a foreign table, and the foreign table has a BEFORE INSERT trigger, then it's possible for a tuple inserted or updated through the view to be changed such that it violates the view's WITH CHECK OPTION constraint. Before this commit, postgres_fdw handled this case inconsistently. A RETURNING clause on the INSERT or UPDATE statement targeting the view would cause the finally-inserted tuple to be read back, and the WITH CHECK OPTION violation would throw an error. But without a RETURNING clause, postgres_fdw would not read the final tuple back, and WITH CHECK OPTION would not throw an error for the violation (or may throw an error when there is no real violation). AFTER ROW triggers on the foreign table had a similar effect as a RETURNING clause on the INSERT or UPDATE statement. To fix, this commit retrieves the attributes needed to enforce the WITH CHECK OPTION constraint along with the attributes needed for the RETURNING clause (if any) from the remote side. Thus, the WITH CHECK OPTION constraint is always evaluated against the final tuple after any triggers on the remote side. This fix may be considered inconsistent with CHECK constraints declared on foreign tables, which are not enforced locally at all (because the constraint is on a remote object). The discussion concluded that this difference is reasonable, because the WITH CHECK OPTION is a constraint on the local view (not any remote object); therefore it only makes sense to enforce its WITH CHECK OPTION constraint locally. Author: Etsuro Fujita Reviewed-by: Arthur Zakirov, Stephen Frost Discussion: https://www.postgresql.org/message-id/7eb58fab-fd3b-781b-ac33-f7cfec96021f%40lab.ntt.co.jp