path: root/src/backend/executor/functions.c
* Don't use a tuplestore if we don't have to for SQL-language functions. (Tom Lane, 2025-05-02)

  We only need a tuplestore if we're actually going to accumulate
  multiple result tuples. Obviously then we don't need one for
  non-set-returning functions; but even an SRF doesn't need one if we
  decide to use "lazyEval" (one row at a time) mode. In these cases,
  it's sufficient to use the junkfilter's result slot to hold the single
  row that's due to be returned. We just need to "materialize" that slot
  to ensure it holds onto the data past shutdown of the sub-executor.

  The original intent of this patch was partially to save a few cycles
  (by not putting tuples into a tuplestore only to pull them back out
  immediately), but mostly to ensure that we don't use a tuplestore in
  non-set-returning functions. That's because I had concerns about
  whether a tuplestore is safe to keep across queries, which was
  possible for functions invoked via long-lived FmgrInfos such as those
  kept in the typcache. There are no cases where SRFs are called that
  way, so getting rid of the tuplestore in non-SRFs should make things
  safer.

  However, it emerges that running fmgr_sql in a short-lived context (as
  595d1efed made it do) makes the existing coding unsafe anyway: we can
  end up with a long-lived TupleTableSlot holding a freeable reference
  to a short-lived tuple, resulting in a double-free crash. Not trying
  to pull tuples out of the tuplestore using that slot dodges the
  problem, so I'm going to commit this now rather than invent a band-aid
  solution for v18.

  Reported-by: Alexander Lakhin <exclusion@gmail.com>
  Author: Tom Lane <tgl@sss.pgh.pa.us>
  Discussion: https://postgr.es/m/2443532.1744919968@sss.pgh.pa.us
  Discussion: https://postgr.es/m/9f975803-1a1c-4f21-b987-f572e110e860@gmail.com
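  As a rough illustration of the dispatch described above (a sketch
  only; the fcache/es/slot names are assumptions, not the committed
  code):

      /* Accumulate into a tuplestore only when we really must. */
      if (fcache->returnsSet && !es->lazyEval)
          tuplestore_puttupleslot(fcache->tstore, slot);
      else
          ExecMaterializeSlot(slot);  /* detach the single row from the
                                       * sub-executor before shutdown */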
* Cache typlens of a SQL function's input arguments. (Tom Lane, 2025-04-17)

  This gets rid of repetitive get_typlen calls in postquel_sub_params,
  which show up as costing a few percent of the runtime in simple test
  cases (more with more parameters).

  In combination with the preceding patches, this gets us most of the
  way back down to the amount of per-call overhead that functions.c had
  before commit 0dca5d68d. There are some more things that could be
  done, but this seems like an okay place to stop for v18.
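  A minimal sketch of the caching pattern (field and variable names are
  assumptions, not the committed code):

      /* Look up each argument's typlen once, in the long-lived cache. */
      if (fcache->argtyplens == NULL)
      {
          fcache->argtyplens = (int16 *)
              MemoryContextAlloc(fcache->fcontext, nargs * sizeof(int16));
          for (int i = 0; i < nargs; i++)
              fcache->argtyplens[i] = get_typlen(argtypes[i]);
      }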
* Make SQLFunctionCache long-lived again. (Tom Lane, 2025-04-17)

  At this point, the only data structures we allocate directly in
  fcontext are the SQLFunctionCache struct itself, the ParamListInfo
  struct, and the execution_state array, all of which are small and
  perfectly capable of being re-used across executions of the same
  FmgrInfo. Hence, let's give them the same lifespan as the FmgrInfo.

  This step gets rid of the separate SQLFunctionLink struct and makes
  fn_extra point to SQLFunctionCache again. We also get rid of the
  separate fcontext memory context and allocate these items directly in
  fn_mcxt. For notational simplicity, SQLFunctionCache still has an
  fcontext field, but it's just a copy of fn_mcxt.

  The motivation for this is to allow these structures to live as long
  as the FmgrInfo and be re-used across calls, restoring the original
  design without its propensity for memory leaks. This gets rid of some
  per-call overhead that we added in 0dca5d68d.

  We also make an effort to re-use the JunkFilter and result slot. Those
  might need to change if the function definition changes, so we
  compromise by rebuilding them if the cached plan changes.

  This also moves the tuplestore into fn_mcxt so that it can be re-used
  across calls, again undoing a change made in 0dca5d68d.
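  The restored arrangement is the usual fn_extra caching pattern; a
  sketch (simplified relative to what fmgr_sql actually does):

      SQLFunctionCache *fcache;

      fcache = (SQLFunctionCache *) fcinfo->flinfo->fn_extra;
      if (fcache == NULL)
      {
          /* First call through this FmgrInfo: build the long-lived cache. */
          fcache = (SQLFunctionCache *)
              MemoryContextAllocZero(fcinfo->flinfo->fn_mcxt,
                                     sizeof(SQLFunctionCache));
          fcinfo->flinfo->fn_extra = fcache;
      }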
* Split some storage out to separate subcontexts of fcontext. (Tom Lane, 2025-04-17)

  Put the JunkFilter and its result slot (and thence also some
  subsidiary data such as the result tupledesc) into a separate
  subcontext "jfcontext". This doesn't accomplish a lot at this point,
  because we make a new JunkFilter each time through the SQL function.
  However, the plan is to make the fcontext long-lived, and that raises
  the possibility that we'll need a new JunkFilter because the plan for
  the result-generating query changes. A separate context makes it easy
  to free the obsoleted data when that happens.

  Also, instead of always running the sub-executor in fcontext, make a
  separate context for it if we're doing lazy eval of an SRF, and
  otherwise just run it inside CurrentMemoryContext.
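  The subcontext trick, as a sketch (context name and reset timing are
  assumptions for illustration):

      /* Child context whose contents can be discarded wholesale. */
      fcache->jfcontext = AllocSetContextCreate(fcache->fcontext,
                                                "SQL function junkfilter",
                                                ALLOCSET_SMALL_SIZES);

      /* ... later, when the cached plan changes: */
      MemoryContextReset(fcache->jfcontext);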
* Make functions.c mostly run in a short-lived memory context. (Tom Lane, 2025-04-17)

  Previously, much of this code ran with CurrentMemoryContext set to be
  the function's fcontext, so that we tended to leak a lot of stuff
  there. Commit 0dca5d68d dealt with that by releasing the fcontext at
  the completion of each SQL function call, but we'd like to go back to
  the previous approach of allowing the fcontext to be query-lifespan.
  To control the leakage problem, rearrange the code so that we mostly
  run in the memory context that fmgr_sql is called in (which we expect
  to be short-lived). Notably, this means that parsing/planning is all
  done in the short-lived context and doesn't leak cruft into fcontext.

  This patch also fixes the allocation of execution_state records so
  that we don't leak them across executions. I set that up with a
  re-usable array that contains at least as many execution_state structs
  as we need for the current querytree. The chain structure is still
  there, but it's not really doing much for us, and maybe somebody will
  be motivated to get rid of it. I'm not though.

  This incidentally also moves the call of BlessTupleDesc to be with the
  code that creates the JunkFilter. That doesn't make much difference
  now, but a later patch will reduce the number of times the JunkFilter
  gets made, and we needn't bless the results any more often than that.

  We still leak a fair amount in fcontext, particularly when executing
  utility statements, but that's material for a separate patch step; the
  point here is only to get rid of unintentional allocations in
  fcontext.
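  The discipline this establishes looks roughly like the following
  sketch (names assumed; only allocations that must survive the call
  switch into fcontext):

      MemoryContext oldcontext;

      /* parse/plan happens in the caller's short-lived context ... */

      /* ... and only durable state is moved over explicitly: */
      oldcontext = MemoryContextSwitchTo(fcache->fcontext);
      fcache->func_state = build_execution_states(fcache); /* hypothetical helper */
      MemoryContextSwitchTo(oldcontext);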
* Minor performance improvement for SQL-language functions. (Tom Lane, 2025-04-17)

  Late in the development of commit 0dca5d68d, I added a step to copy
  the result tlist we extract from the cached final query, because I was
  afraid that that might not last as long as the JunkFilter that we're
  passing it off to. However, that turns out to cost a noticeable number
  of cycles, and it's really quite unnecessary because the JunkFilter
  will not examine that tlist after it's been created.
  (ExecFindJunkAttribute would use it, but we don't use that function on
  this JunkFilter.) Hence, remove the copy step. For safety, reset the
  might-become-dangling jf_targetList pointer to NIL.

  In passing, remove DR_sqlfunction.cxt, which we don't use anymore;
  it's confusing because it's not entirely clear which context it ought
  to point at.
* functions.c: copy trees from source_list before parse analysis etc. (Tom Lane, 2025-04-04)

  This is yet another bit of fallout from the fact that backend/parser
  (like other code) feels free to scribble on the parse tree it's
  handed. In this case that resulted in modifying the
  relatively-short-lived copy in the cached function's source_list. That
  would be fine since we only need each source_list tree once ... except
  that if the parser fails after making some changes, the function cache
  entry remains as-is and will still be there if the user tries to
  execute the function again. Then we have problems because we're
  feeding a non-pristine tree to the parser.

  The most expedient fix is a quick copyObject(). I considered other
  answers like somehow marking the cache entry invalid temporarily, but
  that would add complexity and I'm not sure it's worth it. In typical
  scenarios we'd only do this once per function query per session.

  Reported-by: Alexander Lakhin <exclusion@gmail.com>
  Author: Tom Lane <tgl@sss.pgh.pa.us>
  Discussion: https://postgr.es/m/6d442183-102c-498a-81d1-eeeb086cdc5a@gmail.com
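  The shape of the fix, as a sketch (assuming the list holds raw
  statements; the variable names are invented for illustration):

      /* Parse a throwaway copy so the cached tree stays pristine even
       * if parse analysis errors out partway through. */
      RawStmt    *parsetree = copyObject(lfirst_node(RawStmt, lc));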
* Fix oversight in commit 0dca5d68d. (Tom Lane, 2025-04-03)

  As coded, fmgr_sql() would get an assertion failure for a SQL function
  that has an empty body and is declared to return some type other than
  VOID. Typically you'd never get that far because fmgr_sql_validator()
  would reject such a definition (I suspect that's how come I managed to
  miss the bug). But if check_function_bodies is off or the function is
  polymorphic, the validation check wouldn't get made.

  Reported-by: Alexander Lakhin <exclusion@gmail.com>
  Author: Tom Lane <tgl@sss.pgh.pa.us>
  Discussion: https://postgr.es/m/0fde377a-3870-4d18-946a-ce008ee5bb88@gmail.com
* Change SQL-language functions to use the plan cache. (Tom Lane, 2025-04-02)

  In the historical implementation of SQL functions (if they don't get
  inlined), we built plans for all the contained queries at first call
  within an outer query, and then re-used those plans for the duration
  of the outer query, and then forgot everything. This was not ideal,
  not least because the plans could not be customized to specific values
  of the function's parameters. Our plancache infrastructure seems
  mature enough to be used here. That will solve both the problem with
  not being able to build custom plans and the problem with not being
  able to share work across successive outer queries.

  Aside from those performance concerns, this change fixes a
  longstanding bugaboo with SQL functions: you could not write DDL that
  would affect later statements in the same function. That's mostly
  still true with new-style SQL functions, since the results of parse
  analysis are baked into the stored query trees (and protected by
  dependency records). But for old-style SQL functions, it will now work
  much as it does with PL/pgSQL functions, because we delay parse
  analysis and planning of each query until we're ready to run it. Some
  edge cases that require replanning are now handled better too; see for
  example the new rowsecurity test, where we now detect an RLS context
  change that was previously missed.

  One other edge-case change that might be worthy of a release note is
  that we now insist that a SQL function's result be generated by the
  physically-last query within it. Previously, if the last original
  query was deleted by a DO INSTEAD NOTHING rule, we'd be willing to
  take the result from the preceding query instead. This behavior was
  undocumented except in source-code comments, and it seems hard to
  believe that anyone's relying on it.

  Along the way to this feature, we needed a few infrastructure changes:

  * The plancache can now take either a raw parse tree or an
    analyzed-but-not-rewritten Query as the starting point for a
    CachedPlanSource. If given a Query, it is caller's responsibility
    that nothing will happen to invalidate that form of the query. We
    use this for new-style SQL functions, where what's in pg_proc is
    serialized Query(s) and we trust the dependency mechanism to
    disallow DDL that would break those.

  * The plancache now offers a way to invoke a post-rewrite callback to
    examine/modify the rewritten parse tree when it is rebuilding the
    parse trees after a cache invalidation. We need this because SQL
    functions sometimes adjust the parse tree to make its output exactly
    match the declared result type; if the plan gets rebuilt, that has
    to be re-done.

  * There is a new backend module utils/cache/funccache.c that abstracts
    the idea of caching data about a specific function usage (a
    particular function and set of input data types). The code in it is
    moved almost verbatim from PL/pgSQL, which has done that for a long
    time. We use that logic now for SQL-language functions too, and
    maybe other PLs will have use for it in the future.

  Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru>
  Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
  Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
  Discussion: https://postgr.es/m/8216639.NyiUUSuA9g@aivenlaptop
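  Per-query execution against the plancache then follows the usual
  pattern; a simplified sketch (error handling and the invalidation
  retry loop omitted; plansource/paramLI are assumed names):

      CachedPlan *cplan;

      /* Get a plan, custom or generic as the plancache sees fit. */
      cplan = GetCachedPlan(plansource, paramLI, CurrentResourceOwner, NULL);

      /* ... run the PlannedStmts in cplan->stmt_list ... */

      ReleaseCachedPlan(cplan, CurrentResourceOwner);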
* Don't lock partitions pruned by initial pruning (Amit Langote, 2025-02-20)

  Before executing a cached generic plan, AcquireExecutorLocks() in
  plancache.c locks all relations in a plan's range table to ensure the
  plan is safe for execution. However, this locks runtime-prunable
  relations that will later be pruned during "initial" runtime pruning,
  introducing unnecessary overhead.

  This commit defers locking for such relations to executor startup and
  ensures that if the CachedPlan is invalidated due to concurrent DDL
  during this window, replanning is triggered. Deferring these locks
  avoids unnecessary locking overhead for pruned partitions, resulting
  in significant speedup, particularly when many partitions are pruned
  during initial runtime pruning.

  * Changes to locking when executing generic plans:

    AcquireExecutorLocks() now locks only unprunable relations, that is,
    those found in PlannedStmt.unprunableRelids (introduced in commit
    cbc127917e), to avoid locking runtime-prunable partitions
    unnecessarily. The remaining locks are taken by
    ExecDoInitialPruning(), which acquires them only for partitions that
    survive pruning.

    This deferral does not affect the locks required for permission
    checking in InitPlan(), which takes place before initial pruning.
    ExecCheckPermissions() now includes an Assert to verify that all
    relations undergoing permission checks, none of which can be in the
    set of runtime-prunable relations, are properly locked.

  * Plan invalidation handling:

    Deferring locks introduces a window where prunable relations may be
    altered by concurrent DDL, invalidating the plan. A new function,
    ExecutorStartCachedPlan(), wraps ExecutorStart() to detect and
    handle invalidation caused by deferred locking. If invalidation
    occurs, ExecutorStartCachedPlan() updates CachedPlan using the new
    UpdateCachedPlan() function and retries execution with the updated
    plan. To ensure that all code paths affected by this handle
    invalidation properly, all callers of ExecutorStart that may execute
    a PlannedStmt from a CachedPlan have been updated to use
    ExecutorStartCachedPlan() instead.

    UpdateCachedPlan() replaces stale plans in CachedPlan.stmt_list. A
    new CachedPlan.stmt_context, created as a child of
    CachedPlan.context, allows freeing old PlannedStmts while preserving
    the CachedPlan structure and its statement list. This ensures that
    loops over statements in upstream callers of
    ExecutorStartCachedPlan() remain intact.

    ExecutorStart() and ExecutorStart_hook implementations now return a
    boolean value indicating whether plan initialization succeeded with
    a valid PlanState tree in QueryDesc.planstate, or false otherwise,
    in which case QueryDesc.planstate is NULL. Hook implementations are
    required to call standard_ExecutorStart() at the beginning, and if
    it returns false, they should do the same without proceeding.

  * Testing:

    To verify these changes, the delay_execution module tests scenarios
    where cached plans become invalid due to changes in prunable
    relations after deferred locks.

  * Note to extension authors:

    ExecutorStart_hook implementations must verify plan validity after
    calling standard_ExecutorStart(), as explained earlier. For example:

        if (prev_ExecutorStart)
            plan_valid = prev_ExecutorStart(queryDesc, eflags);
        else
            plan_valid = standard_ExecutorStart(queryDesc, eflags);

        if (!plan_valid)
            return false;

        <extension-code>

        return true;

    Extensions accessing child relations, especially prunable
    partitions, via ExecGetRangeTableRelation() must now ensure their RT
    indexes are present in es_unpruned_relids (introduced in commit
    cbc127917e), or they will encounter an error. This is a strict
    requirement after this change, as only relations in that set are
    locked.

  The idea of deferring some locks to executor startup, allowing locks
  for prunable partitions to be skipped, was first proposed by Tom Lane.

  Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
  Reviewed-by: David Rowley <dgrowleyml@gmail.com> (earlier versions)
  Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier versions)
  Reviewed-by: Tomas Vondra <tomas@vondra.me>
  Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
  Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
* Update copyright for 2025 (Bruce Momjian, 2025-01-01)

  Backpatch-through: 13
* Introduce CompactAttribute array in TupleDesc, take 2 (David Rowley, 2024-12-20)

  The new compact_attrs array stores a few select fields from
  FormData_pg_attribute in a more compact way, using only 16 bytes per
  column instead of the 104 bytes that FormData_pg_attribute uses. Using
  CompactAttribute allows performance-critical operations such as tuple
  deformation to be performed without looking at the
  FormData_pg_attribute element in TupleDesc which means fewer cacheline
  accesses.

  For some workloads, tuple deformation can be the most CPU intensive
  part of processing the query. Some testing with 16 columns on a table
  where the first column is variable length showed around a 10% increase
  in transactions per second for an OLAP type query performing
  aggregation on the 16th column. However, in certain cases, the
  increases were much higher, up to ~25% on one AMD Zen4 machine.

  This also makes pg_attribute.attcacheoff redundant. A follow-on commit
  will remove it, thus shrinking the FormData_pg_attribute struct by 4
  bytes.

  Author: David Rowley
  Reviewed-by: Andres Freund, Victor Yegorov
  Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com
* Simplify executor's determination of whether to use parallelism. (Tom Lane, 2024-12-09)

  Our parallel-mode code only works when we are executing a query in
  full, so ExecutePlan must disable parallel mode when it is asked to do
  partial execution. The previous logic for this involved passing down a
  flag (variously named execute_once or run_once) from callers of
  ExecutorRun or PortalRun. This is overcomplicated, and unsurprisingly
  some of the callers didn't get it right, since it requires keeping
  state that not all of them have handy; not to mention that the
  requirements for it were undocumented. That led to assertion failures
  in some corner cases. The only state we really need for this is the
  existing QueryDesc.already_executed flag, so let's just put all the
  responsibility in ExecutePlan. (It could have been done in ExecutorRun
  too, leading to a slightly shorter patch -- but if there's ever more
  than one caller of ExecutePlan, it seems better to have this logic in
  the subroutine than the callers.)

  This makes those ExecutorRun/PortalRun parameters unnecessary. In
  master it seems okay to just remove them, returning the API for those
  functions to what it was before parallelism. Such an API break is
  clearly not okay in stable branches, but for them we can just leave
  the parameters in place after documenting that they do nothing.

  Per report from Yugo Nagata, who also reviewed and tested this patch.
  Back-patch to all supported branches.

  Discussion: https://postgr.es/m/20241206062549.710dc01cf91224809dd6c0e1@sraoss.co.jp
* Remove useless casts to (void *) (Peter Eisentraut, 2024-11-28)

  Many of them just seem to have been copied around for no real reason.
  Their presence causes (small) risks of hiding actual type mismatches
  or silently discarding qualifiers.

  Discussion: https://www.postgresql.org/message-id/flat/461ea37c-8b58-43b4-9736-52884e862820@eisentraut.org
* Ensure cached plans are correctly marked as dependent on role. (Nathan Bossart, 2024-11-11)

  If a CTE, subquery, sublink, security invoker view, or coercion
  projection references a table with row-level security policies, we
  neglected to mark the plan as potentially dependent on which role is
  executing it. This could lead to later executions in the same session
  returning or hiding rows that should have been hidden or returned
  instead.

  Reported-by: Wolfgang Walther
  Reviewed-by: Noah Misch
  Security: CVE-2024-10976
  Backpatch-through: 12
* Propagate query IDs of utility statements in functions (Michael Paquier, 2024-07-19)

  For utility statements defined within a function, the query tree is
  copied to a PlannedStmt as utility commands do not require planning.
  However, the query ID was missing from the information passed down.
  This prevented plugins that rely on the query ID, like
  pg_stat_statements, from tracking utility statements within function
  calls. Tests are added to check this behavior, depending on
  pg_stat_statements.track.

  This is an old bug. However, since v16 (3db72ebcbe20) query IDs for
  utilities are computed from their parsed trees rather than the query
  string, leading to less bloat with utilities, so backpatch down only
  to that version.

  Author: Anthonin Bonnefoy
  Discussion: https://postgr.es/m/CAO6_XqrGp-uwBqi3vBPLuRULKkddjC7R5QZCgsFren=8E+m2Sg@mail.gmail.com
  Backpatch-through: 16
* Add RETURNING support to MERGE. (Dean Rasheed, 2024-03-17)

  This allows a RETURNING clause to be appended to a MERGE query, to
  return values based on each row inserted, updated, or deleted. As with
  plain INSERT, UPDATE, and DELETE commands, the returned values are
  based on the new contents of the target table for INSERT and UPDATE
  actions, and on its old contents for DELETE actions. Values from the
  source relation may also be returned.

  As with INSERT/UPDATE/DELETE, the output of MERGE ... RETURNING may be
  used as the source relation for other operations such as WITH queries
  and COPY commands.

  Additionally, a special function merge_action() is provided, which
  returns 'INSERT', 'UPDATE', or 'DELETE', depending on the action
  executed for each row. The merge_action() function can be used
  anywhere in the RETURNING list, including in arbitrary expressions and
  subqueries, but it is an error to use it anywhere outside of a MERGE
  query's RETURNING list.

  Dean Rasheed, reviewed by Isaac Morland, Vik Fearing, Alvaro Herrera,
  Gurjeet Singh, Jian He, Jeff Davis, Merlin Moncure, Peter Eisentraut,
  and Wolfgang Walther.

  Discussion: http://postgr.es/m/CAEZATCWePEGQR5LBn-vD6SfeLZafzEm2Qy_L_Oky2=qw2w3Pzg@mail.gmail.com
* Fix confusion about the return rowtype of SQL-language procedures. (Tom Lane, 2024-03-12)

  There is a very ancient hack in check_sql_fn_retval that allows a
  single SELECT targetlist entry of composite type to be taken as
  supplying all the output columns of a function returning composite.
  (This is grotty and fundamentally ambiguous, but it's really hard to
  do nested composite-returning functions without it.)

  As far as I know, that doesn't cause any problems in ordinary
  functions. It's disastrous for procedures however. All procedures that
  have any output parameters are labeled with prorettype RECORD, and the
  CALL code expects it will get back a record with one column per output
  parameter, regardless of whether any of those parameters is composite.
  Doing something else leads to an assertion failure or core dump.

  This is simple enough to fix: we just need to not apply that rule when
  considering procedures. However, that requires adding another argument
  to check_sql_fn_retval, which at least in principle might be getting
  called by external callers. Therefore, in the back branches convert
  check_sql_fn_retval into an ABI-preserving wrapper around a new
  function check_sql_fn_retval_ext.

  Per report from Yahor Yuzefovich. This has been broken since we
  implemented procedures, so back-patch to all supported branches.

  Discussion: https://postgr.es/m/CABz5gWHSjj2df6uG0NRiDhZ_Uz=Y8t0FJP-_SVSsRsnrQT76Gg@mail.gmail.com
* Redefine backend ID to be an index into the proc array (Heikki Linnakangas, 2024-03-03)

  Previously, backend ID was an index into the ProcState array, in the
  shared cache invalidation manager (sinvaladt.c). The entry in the
  ProcState array was reserved at backend startup by scanning the array
  for a free entry, and that was also when the backend got its backend
  ID. Things become slightly simpler if we redefine backend ID to be the
  index into the PGPROC array, and directly use it also as an index to
  the ProcState array. This uses a little more memory, as we reserve a
  few extra slots in the ProcState array for aux processes that don't
  need them, but the simplicity is worth it.

  Aux processes now also have a backend ID. This simplifies the
  reservation of BackendStatusArray and ProcSignal slots.

  You can now convert a backend ID into an index into the PGPROC array
  simply by subtracting 1. We still use 0-based "pgprocnos" in various
  places, for indexes into the PGPROC array, but the only difference now
  is that backend IDs start at 1 while pgprocnos start at 0. (The next
  commit will get rid of the term "backend ID" altogether and make
  everything 0-based.)

  There is still a 'backendId' field in PGPROC, now part of 'vxid' which
  encapsulates the backend ID and local transaction ID together. It's
  needed for prepared xacts. For regular backends, the backendId is
  always equal to pgprocno + 1, but for prepared xact PGPROC entries,
  it's the ID of the original backend that processed the transaction.

  Reviewed-by: Andres Freund, Reid Thompson
  Discussion: https://www.postgresql.org/message-id/8171f1aa-496f-46a6-afc3-c46fe7a9b407@iki.fi
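  The new correspondence, as a sketch (using the one-based backend IDs
  and the zero-based pgprocno field as defined at this point in time):

      /* backend ID -> PGPROC */
      PGPROC     *proc = &ProcGlobal->allProcs[backendId - 1];

      /* PGPROC -> backend ID */
      BackendId   id = proc->pgprocno + 1;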
* Update copyright for 2024 (Bruce Momjian, 2024-01-03)

  Reported-by: Michael Paquier
  Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz
  Backpatch-through: 12
* Add trailing commas to enum definitions (Peter Eisentraut, 2023-10-26)

  Since C99, there can be a trailing comma after the last value in an
  enum definition. A lot of new code has been introducing this style on
  the fly. Some new patches are now taking an inconsistent approach to
  this. Some add the last comma on the fly if they add a new last value,
  some are trying to preserve the existing style in each place, some are
  even dropping the last comma if there was one. We could nudge this all
  in a consistent direction if we just add the trailing commas
  everywhere once.

  I omitted a few places where there was a fixed "last" value that will
  always stay last. I also skipped the header files of libpq and ecpg,
  in case people want to use those with older compilers. There were also
  a small number of cases where the enum type wasn't used anywhere (but
  the enum values were), which ended up confusing pgindent a bit, so I
  left those alone.

  Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org
* Add SysCacheGetAttrNotNull for guaranteed not-null attrs (Daniel Gustafsson, 2023-03-25)

  When extracting an attr from a cached tuple in the syscache with
  SysCacheGetAttr, the isnull parameter must be checked even when the
  attr cannot be NULL. For cases when this is known beforehand, a
  wrapper is introduced which performs the error handling internally on
  behalf of the caller, invoking an elog in case of a NULL attr.

  Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
  Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
  Reviewed-by: David Rowley <dgrowleyml@gmail.com>
  Discussion: https://postgr.es/m/AD76405E-DB45-46B6-941F-17B1EB3A9076@yesql.se
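  Usage then needs no isnull out-parameter that callers could forget to
  check; a sketch:

      /* elogs internally if the attribute is unexpectedly NULL */
      Datum prosrc = SysCacheGetAttrNotNull(PROCOID, procTup,
                                            Anum_pg_proc_prosrc);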
* Update copyright for 2023 (Bruce Momjian, 2023-01-02)

  Backpatch-through: 11
* Prevent clobbering of utility statements in SQL function caches. (Tom Lane, 2022-11-29)

  This is an oversight in commit 7c337b6b5: I apparently didn't think
  about the possibility of a SQL function being executed multiple times
  within a query. In that case, functions.c's primitive caching
  mechanism allows the same utility parse tree to be presented for
  execution more than once. We have to tell ProcessUtility to make a
  working copy of the parse tree, or bad things happen.

  Normally I'd add a regression test, but I think the reported crasher
  is dependent on some rather random implementation choices that are
  nowhere near functions.c, so its usefulness as a long-lived test feels
  questionable. In any case, this fix is clearly correct given the
  design choices of 7c337b6b5.

  Per bug #17702 from Xin Wen. Thanks to Daniel Gustafsson for analysis.
  Back-patch to v14 where the faulty commit came in (before that, the
  responsibility for copying scribble-able utility parse trees lay
  elsewhere).

  Discussion: https://postgr.es/m/17702-ad24fdcdd1e9047a@postgresql.org
* Fix handling of R/W expanded datums that are passed to SQL functions. (Tom Lane, 2022-08-10)

  fmgr_sql must make expanded-datum arguments read-only, because it's
  possible that the function body will pass the argument to more than
  one callee function. If one of those functions takes the datum's R/W
  property as license to scribble on it, then later callees will see an
  unexpected value, leading to wrong answers.

  From a performance standpoint, it'd be nice to skip this in the common
  case that the argument value is passed to only one callee. However,
  detecting that seems fairly hard, and certainly not something that I
  care to attempt in a back-patched bug fix.

  Per report from Adam Mackler. This has been broken since we invented
  expanded datums, so back-patch to all supported branches.

  Discussion: https://postgr.es/m/WScDU5qfoZ7PB2gXwNqwGGgDPmWzz08VdydcPFLhOwUKZcdWbblbo-0Lku-qhuEiZoXJ82jpiQU4hOjOcrevYEDeoAvz6nR0IU4IHhXnaCA=@mackler.email
  Discussion: https://postgr.es/m/187436.1660143060@sss.pgh.pa.us
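  The shape of the fix, as a sketch (the typlen lookup here is
  illustrative; argtypes is an assumed name):

      /* Flatten any R/W expanded-object pointers before the arguments
       * can be handed to multiple callees. */
      for (int i = 0; i < fcinfo->nargs; i++)
          fcinfo->args[i].value =
              MakeExpandedObjectReadOnly(fcinfo->args[i].value,
                                         fcinfo->args[i].isnull,
                                         get_typlen(argtypes[i]));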
* Remove useless assertions (Peter Eisentraut, 2022-07-13)

  We don't need Assert(IsA(foo, String)) right before running
  strVal(foo), since strVal() already does the assertion internally (via
  castNode()).
* Parse/analyze function renaming (Peter Eisentraut, 2022-03-04)

  There are three parallel ways to call parse/analyze: with fixed
  parameters, with variable parameters, and by supplying your own parser
  callback. Some of the involved functions were confusingly named and
  made this API structure more confusing. This patch renames some
  functions to make this clearer:

      parse_analyze() -> parse_analyze_fixedparams()
      pg_analyze_and_rewrite() -> pg_analyze_and_rewrite_fixedparams()

  (Otherwise one might think this variant doesn't accept parameters, but
  in fact all three ways accept parameters.)

      pg_analyze_and_rewrite_params() -> pg_analyze_and_rewrite_withcb()

  (Before, and also when considering pg_analyze_and_rewrite(), one might
  think this is the only way to pass parameters. Moreover, the parser
  callback doesn't necessarily need to parse only parameters, it's just
  one of the things it could do.)

      parse_fixed_parameters() -> setup_parse_fixed_parameters()
      parse_variable_parameters() -> setup_parse_variable_parameters()

  (These functions don't actually do any parsing, they just set up
  callbacks to use during parsing later.)

  This patch also adds some const decorations to the fixed-parameters
  API, so the distinction from the variable-parameters API is more
  clear.

  Reviewed-by: Nathan Bossart <bossartn@amazon.com>
  Discussion: https://www.postgresql.org/message-id/flat/c67ce276-52b4-0239-dc0e-39875bf81840@enterprisedb.com
* Update copyright for 2022 (Bruce Momjian, 2022-01-07)

  Backpatch-through: 10
* Centralize the logic for protective copying of utility statements. (Tom Lane, 2021-06-18)

  In the "simple Query" code path, it's fine for parse analysis or
  execution of a utility statement to scribble on the statement's node
  tree, since that'll just be thrown away afterwards. However it's not
  fine if the node tree is in the plan cache, as then it'd be corrupted
  for subsequent executions. Up to now we've dealt with that by having
  individual utility-statement functions apply copyObject() if they were
  going to modify the tree. But that's prone to errors of omission. Bug
  #17053 from Charles Samborski shows that CREATE/ALTER DOMAIN didn't
  get this memo, and can crash if executed repeatedly from plan cache.

  In the back branches, we'll just apply a narrow band-aid for that, but
  in HEAD it seems prudent to have a more principled fix that will close
  off the possibility of other similar bugs in future. Hence, let's
  hoist the responsibility for doing copyObject up into ProcessUtility
  from its children, thus ensuring that it happens for all utility
  statement types.

  Also, modify ProcessUtility's API so that its callers can tell it
  whether a copy step is necessary. It turns out that in all cases, the
  immediate caller knows whether the node tree is transient, so this
  doesn't involve a huge amount of code thrashing. In this way, while we
  lose a little bit in the execute-from-cache code path due to sometimes
  copying node trees that wouldn't be mutated anyway, we gain something
  in the simple-Query code path by not copying throwaway node trees.
  Statements that are complex enough to be expensive to copy are almost
  certainly ones that would have to be copied anyway, so the loss in the
  cache code path shouldn't be much.

  (Note that this whole problem applies only to utility statements.
  Optimizable statements don't have the issue because we long ago made
  the executor treat Plan trees as read-only. Perhaps someday we will
  make utility statement execution act likewise, but I'm not holding my
  breath.)

  Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us
  Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org
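  With the API change, a caller that owns a cached (non-throwaway) tree
  signals that via the new flag; a sketch (the source-text field name is
  an assumption):

      ProcessUtility(pstmt,
                     fcache->src,            /* query source text (assumed) */
                     true,                   /* readOnlyTree: do not scribble */
                     PROCESS_UTILITY_QUERY,
                     paramLI,
                     NULL,                   /* QueryEnvironment */
                     dest,
                     NULL);                  /* QueryCompletion */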
* Use the correct article for abbreviations (David Rowley, 2021-06-11)

  We've accumulated quite a mix of instances of "an SQL" and "a SQL" in
  the documents. It would be good to be a bit more consistent with
  these. The most recent version of the SQL standard I looked at seems
  to prefer "an SQL". That seems like a good lead to follow, so here we
  change all instances of "a SQL" to become "an SQL". Most instances
  correctly use "an SQL" already, so it also makes sense to use the
  dominant variation in order to minimise churn.

  Additionally, there were some other abbreviations that needed to be
  adjusted. FSM, SSPI, SRF and a few others. Also fix some pronounceable
  abbreviations to use "a" instead of "an". For example, "a SASL"
  instead of "an SASL".

  Here I've only adjusted the documents and error messages. Many others
  still exist in source code comments. Translator hint comments seem to
  be the biggest culprit. It currently does not seem worth the churn to
  change these.

  Discussion: https://postgr.es/m/CAApHDvpML27UqFXnrYO1MJddsKVMQoiZisPvsAGhKE_tsKXquw%40mail.gmail.com
* Reconsider the handling of procedure OUT parameters. (Tom Lane, 2021-06-10)

  Commit 2453ea142 redefined pg_proc.proargtypes to include the types of
  OUT parameters, for procedures only. While that had some advantages
  for implementing the SQL-spec behavior of DROP PROCEDURE, it was
  pretty disastrous from a number of other perspectives. Notably, since
  the primary key of pg_proc is name + proargtypes, this made it
  possible to have multiple procedures with identical names + input
  arguments and differing output argument types. That would make it
  impossible to call any one of the procedures by writing just NULL (or
  "?", or any other data-type-free notation) for the output argument(s).
  The change also seems likely to cause grave confusion for client
  applications that examine pg_proc and expect the traditional
  definition of proargtypes.

  Hence, revert the definition of proargtypes to what it was, and undo a
  number of complications that had been added to support that.

  To support the SQL-spec behavior of DROP PROCEDURE, when there are no
  argmode markers in the command's parameter list, we perform the lookup
  both ways (that is, matching against both proargtypes and
  proallargtypes), succeeding if we get just one unique match. In
  principle this could result in ambiguous-function failures that would
  not happen when using only one of the two rules. However, overloading
  of procedure names is thought to be a pretty rare usage, so this
  shouldn't cause many problems in practice. Postgres-specific code such
  as pg_dump can defend against any possibility of such failures by
  being careful to specify argmodes for all procedure arguments.

  This also fixes a few other bugs in the area of CALL statements with
  named parameters, and improves the documentation a little.

  catversion bump forced because the representation of procedures with
  OUT arguments changes.

  Discussion: https://postgr.es/m/3742981.1621533210@sss.pgh.pa.us
* Undo decision to allow pg_proc.prosrc to be NULL. (Tom Lane, 2021-04-15)

  Commit e717a9a18 changed the longstanding rule that prosrc is NOT NULL
  because when a SQL-language function is written in SQL-standard style,
  we don't currently have anything useful to put there. This seems a
  poor decision though, as it could easily have negative impacts on
  external PLs (opening them to crashes they didn't use to have, for
  instance). SQL-function-related code can just as easily test "is
  prosqlbody not null" as "is prosrc null", so there's no real gain
  there either. Hence, revert the NOT NULL marking removal and adjust
  related logic.

  For now, we just put an empty string into prosrc for SQL-standard
  functions. Maybe we'll have a better idea later, although the history
  of things like pg_attrdef.adsrc suggests that it's not easy to
  maintain a string equivalent of a node tree.

  This also adds an assertion that queryDesc->sourceText != NULL to
  standard_ExecutorStart. We'd been silently relying on that for awhile,
  so let's make it less silent.

  Also fix some overlooked documentation and test cases.

  Discussion: https://postgr.es/m/2197698.1617984583@sss.pgh.pa.us
* SQL-standard function body (Peter Eisentraut, 2021-04-07)

  This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
  statements for language SQL with a function body that conforms to the
  SQL standard and is portable to other implementations.

  Instead of the PostgreSQL-specific AS $$ string literal $$ syntax,
  this allows writing out the SQL statements making up the body
  unquoted, either as a single statement:

      CREATE FUNCTION add(a integer, b integer) RETURNS integer
          LANGUAGE SQL
          RETURN a + b;

  or as a block

      CREATE PROCEDURE insert_data(a integer, b integer)
      LANGUAGE SQL
      BEGIN ATOMIC
          INSERT INTO tbl VALUES (a);
          INSERT INTO tbl VALUES (b);
      END;

  The function body is parsed at function definition time and stored as
  expression nodes in a new pg_proc column prosqlbody. So at run time,
  no further parsing is required. However, this form does not support
  polymorphic arguments, because there is no more parse analysis done at
  call time.

  Dependencies between the function and the objects it uses are fully
  tracked.

  A new RETURN statement is introduced. This can only be used inside
  function bodies. Internally, it is treated much like a SELECT
  statement.

  psql needs some new intelligence to keep track of function body
  boundaries so that it doesn't send off statements when it sees
  semicolons that are inside a function body.

  Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec>
  Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
  Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
* Update copyright for 2021 (Bruce Momjian, 2021-01-02)

  Backpatch-through: 9.5
* Multirange datatypes (Alexander Korotkov, 2020-12-20)

  Multiranges are basically sorted arrays of non-overlapping ranges with
  set-theoretic operations defined over them. Since v14, each range type
  automatically gets a corresponding multirange datatype.

  There are both manual and automatic mechanisms for naming multirange
  types. One can specify a multirange type name using the
  multirange_type_name attribute in CREATE TYPE. Otherwise, a multirange
  type name is generated automatically. If the range type name contains
  "range" then we change that to "multirange". Otherwise, we add
  "_multirange" to the end.

  The implementation of multiranges comes with a space-efficient
  internal representation format, which avoids extra padding and
  duplicated storage of oids. Altogether this format allows fetching a
  particular range by its index in O(n).

  Statistics gathering and selectivity estimation are implemented for
  multiranges. For this purpose, a stored multirange is approximated as
  a union range without gaps. This field will likely need improvements
  in the future.

  Catversion is bumped.

  Discussion: https://postgr.es/m/CALNJ-vSUpQ_Y%3DjXvTxt1VYFztaBSsWVXeF1y6gTYQ4bOiWDLgQ%40mail.gmail.com
  Discussion: https://postgr.es/m/a0b8026459d1e6167933be2104a6174e7d40d0ab.camel%40j-davis.com#fe7218c83b08068bfffb0c5293eceda0
  Author: Paul Jungwirth, revised by me
  Reviewed-by: David Fetter, Corey Huinker, Jeff Davis, Pavel Stehule
  Reviewed-by: Alvaro Herrera, Tom Lane, Isaac Morland, David G. Johnston
  Reviewed-by: Zhihong Yu, Alexander Korotkov
* Fix list-munging bug that broke SQL function result coercions. (Tom Lane, 2020-10-19)

  Since commit 913bbd88d, check_sql_fn_retval() can either insert type
  coercion steps in-line in the Query that produces the SQL function's
  results, or generate a new top-level Query to perform the coercions,
  if modifying the Query's output in-place wouldn't be safe. However, it
  appears that the latter case has never actually worked, because the
  code tried to inject the new Query back into the query list it was
  passed ... which is not the list that will be used for later
  processing when we execute the SQL function "normally" (without
  inlining it). So we ended up with no coercion happening at run-time,
  leading to wrong results or crashes depending on the datatypes
  involved.

  While the regression tests look like they cover this area well enough,
  through a huge bit of bad luck all the test cases that exercise the
  separate-Query path were checking either inline-able cases (which
  accidentally didn't have the bug) or cases that are no-ops at runtime
  (e.g., varchar to text), so that the failure to perform the coercion
  wasn't obvious. The fact that the cases that don't work weren't
  allowed at all before v13 probably contributed to not noticing the
  problem sooner, too.

  To fix, get rid of the separate "flat" list of Query nodes and instead
  pass the real two-level list that is going to be used later. I chose
  to make the same change in check_sql_fn_statements(), although that
  has no actual bug, just so that we don't need that data structure at
  all.

  This is an API change, as evidenced by the adjustments needed to
  callers outside functions.c. That's a bit scary to be doing in a
  released branch, but so far as I can tell from a quick search, there
  are no outside callers of these functions (and they are sufficiently
  specific to our semantics for SQL-language functions that it's not
  apparent why any extension would need to call them). In any case, v13
  already changed the API of check_sql_fn_retval() compared to prior
  branches.

  Per report from pinker. Back-patch to v13 where this code came in.

  Discussion: https://postgr.es/m/1603050466566-0.post@n3.nabble.com
* Support for OUT parameters in procedures (Peter Eisentraut, 2020-10-05)

  Unlike for functions, OUT parameters for procedures are part of the
  signature. Therefore, they have to be listed in pg_proc.proargtypes as
  well as mentioned in ALTER PROCEDURE and DROP PROCEDURE.

  Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
  Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
  Discussion: https://www.postgresql.org/message-id/flat/2b8490fe-51af-e671-c504-47359dc453c5@2ndquadrant.com
* Account for collation when coercing the output of a SQL function. (Tom Lane, 2020-04-14)

  Commit 913bbd88d overlooked that the result of coerce_to_target_type
  might need collation fixups.

  Per report from Andreas Joseph Krogh.

  Discussion: https://postgr.es/m/VisenaEmail.72.37d08ec2b8cb8fb5.17179940cd3@tc7-visena
* Allow the planner-related functions and hook to accept the query string. (Fujii Masao, 2020-03-30)

  This commit adds a query_string argument to the planner-related
  functions and hook, allowing the query string to be passed down to
  them.

  Currently there is no user of the query string so passed. But the
  upcoming patch for the planning counters will add a planner hook
  function to pg_stat_statements, and that function will need the query
  string, so this change is necessary for that patch. This change is
  also useful for other extensions that want to use the query string in
  their planner hook function.

  Author: Pascal Legrand, Julien Rouhaud
  Reviewed-by: Yoshikazu Imai, Tom Lane, Fujii Masao
  Discussion: https://postgr.es/m/CAOBaU_bU1m3_XF5qKYtSj1ua4dxd=FWDyh2SH4rSJAUUfsGmAQ@mail.gmail.com
  Discussion: https://postgr.es/m/1583789487074-0.post@n3.nabble.com
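  A planner hook therefore now has this shape (a sketch of a
  hypothetical extension hook; my_planner and prev_planner_hook are
  invented names):

      static PlannedStmt *
      my_planner(Query *parse, const char *query_string,
                 int cursorOptions, ParamListInfo boundParams)
      {
          /* query_string is now available to the hook */
          if (prev_planner_hook)
              return prev_planner_hook(parse, query_string,
                                       cursorOptions, boundParams);
          return standard_planner(parse, query_string,
                                  cursorOptions, boundParams);
      }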
* Remove bogus assertion about polymorphic SQL function result. (Tom Lane, 2020-03-17)

  It is possible to reach check_sql_fn_retval() with an unresolved
  polymorphic rettype, resulting in an assertion failure as demonstrated
  by one of the added test cases. However, the code following that
  throws what seems an acceptable error message, so just remove the
  Assert and adjust commentary.

  While here, I thought it'd be a good idea to provide some parallel
  tests of SQL-function and PL/pgSQL-function polymorphism behavior.
  Some of these cases are perhaps duplicative of tests elsewhere, but we
  hadn't any organized coverage of the topic AFAICS.

  Although that assertion's been wrong all along, it won't have any
  effect in production builds, so I'm not bothering to back-patch.

  Discussion: https://postgr.es/m/21569.1584314271@sss.pgh.pa.us
* Represent command completion tags as structs (Alvaro Herrera, 2020-03-02)

  The backend was using strings to represent command tags and doing
  string comparisons in multiple places, but that's slow and unhelpful.
  Create a new command list with a supporting structure to use instead;
  this is stored in a tag-list-file that can be tailored to specific
  purposes with a caller-definable C macro, similar to what we do for
  WAL resource managers. The first such uses are a new CommandTag enum
  and a CommandTagBehavior struct.

  Replace numerous occurrences of char *completionTag with a
  QueryCompletion struct so that the code no longer stores information
  about completed queries in a cstring. Only at the last moment, in
  EndCommand(), does this get converted to a string.

  EventTriggerCacheItem no longer holds an array of palloc'd tag strings
  in sorted order, but rather just a Bitmapset over the CommandTags.

  Author: Mark Dilger, with unsolicited help from Álvaro Herrera
  Reviewed-by: John Naylor, Tom Lane
  Discussion: https://postgr.es/m/981A9DB4-3F0C-4DA5-88AD-CB9CFF4D6CAD@enterprisedb.com
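  Setting completion info thus becomes a struct assignment rather than
  string formatting; a sketch:

      QueryCompletion qc;

      /* converted to a display string only at the last moment, in
       * EndCommand() */
      SetQueryCompletion(&qc, CMDTAG_INSERT, nprocessed);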
* Fix problems with "read only query" checks, and refactor the code. (Robert Haas, 2020-01-16)

  Previously, check_xact_readonly() was responsible for determining
  which types of queries could not be run in a read-only transaction,
  standard_ProcessUtility() was responsible for prohibiting things which
  were allowed in read only transactions but not in recovery, and
  utility commands were basically prohibited in bulk in parallel mode by
  calls to CommandIsReadOnly() in functions.c and spi.c. This situation
  was confusing and error-prone. Accordingly, move all the checks to a
  new function ClassifyUtilityCommandAsReadOnly(), which determines the
  degree to which a given statement is read only.

  In the old code, check_xact_readonly() inadvertently failed to handle
  several statement types that actually should have been prohibited,
  specifically T_CreatePolicyStmt, T_AlterPolicyStmt, T_CreateAmStmt,
  T_CreateStatsStmt, T_AlterStatsStmt, and T_AlterCollationStmt. As a
  result, these statements were erroneously allowed in read only
  transactions, parallel queries, and standby operation. Generally, they
  would fail anyway due to some lower-level error check, but we
  shouldn't rely on that. In the new code structure, future omissions of
  this type should cause ClassifyUtilityCommandAsReadOnly() to complain
  about an unrecognized node type.

  As a fringe benefit, this means we can allow certain types of utility
  commands in parallel mode, where it's safe to do so. This allows ALTER
  SYSTEM, CALL, DO, CHECKPOINT, COPY FROM, EXPLAIN, and SHOW. It might
  be possible to allow additional commands with more work and thought.

  Along the way, document the thinking process behind the current set of
  checks, as per discussion especially with Peter Eisentraut. There is
  some interest in revising some of these rules, but that seems like a
  job for another patch.

  Patch by me, reviewed by Tom Lane, Stephen Frost, and Peter
  Eisentraut.

  Discussion: http://postgr.es/m/CA+TgmoZ_rLqJt5sYkvh+JpQnfX0Y+B2R+qfi820xNih6x-FQOQ@mail.gmail.com
* Improve the handling of result type coercions in SQL functions. (Tom Lane, 2020-01-08)

  Use the parser's standard type coercion machinery to convert the
  output column(s) of a SQL function's final SELECT or RETURNING to the
  type(s) they should have according to the function's declared result
  type. We'll allow any case where an assignment-level coercion is
  available. Previously, we failed unless the required coercion was a
  binary-compatible one (and the documentation ignored this, falsely
  claiming that the types must match exactly).

  Notably, the coercion now accounts for typmods, so that cases where a
  SQL function is declared to return a composite type whose columns are
  typmod-constrained now behave as one would expect. Arguably this
  aspect is a bug fix, but the overall behavioral change here seems too
  large to consider back-patching.

  A nice side-effect is that functions can now be inlined in a few cases
  where we previously failed to do so because of type mismatches.

  Discussion: https://postgr.es/m/18929.1574895430@sss.pgh.pa.us
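  The coercion uses the parser's standard machinery at assignment level;
  a sketch of the kind of call involved (tle/rettype/rettypmod are
  assumed names, not the committed code):

      tle->expr = (Expr *)
          coerce_to_target_type(NULL,     /* no ParseState at hand */
                                (Node *) tle->expr,
                                exprType((Node *) tle->expr),
                                rettype, rettypmod,
                                COERCION_ASSIGNMENT,
                                COERCE_IMPLICIT_CAST,
                                -1);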
* Update copyrights for 2020 (Bruce Momjian, 2020-01-01)

  Backpatch-through: update all files in master, backpatch legal files
  through 9.4
* Rationalize use of list_concat + list_copy combinations. (Tom Lane, 2019-08-12)

  In the wake of commit 1cff1b95a, the result of list_concat no longer
  shares the ListCells of the second input. Therefore, we can replace
  "list_concat(x, list_copy(y))" with just "list_concat(x, y)".

  To improve call sites that were list_copy'ing the first argument, or
  both arguments, invent "list_concat_copy()" which produces a new list
  sharing no ListCells with either input. (This is a bit faster than
  "list_concat(list_copy(x), y)" because it makes the result list the
  right size to start with.)

  In call sites that were not list_copy'ing the second argument, the new
  semantics mean that we are usually leaking the second List's storage,
  since typically there is no remaining pointer to it. We considered
  inventing another list_copy variant that would list_free the second
  input, but concluded that for most call sites it isn't worth worrying
  about, given the relative compactness of the new List representation.
  (Note that in cases where such leakage would happen, the old code
  already leaked the second List's header; so we're only discussing the
  size of the leak not whether there is one. I did adjust two or three
  places that had been troubling to free that header so that they
  manually free the whole second List.)

  Patch by me; thanks to David Rowley for review.

  Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
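  The resulting idioms, side by side (a sketch):

      /* extends list1 in place; list2's cells are copied, list2 itself
       * is unchanged */
      list1 = list_concat(list1, list2);

      /* builds a fresh list; neither input is modified */
      newlist = list_concat_copy(list1, list2);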
* Represent Lists as expansible arrays, not chains of cons-cells. (Tom Lane, 2019-07-15)

  Originally, Postgres Lists were a more or less exact reimplementation
  of Lisp lists, which consist of chains of separately-allocated cons
  cells, each having a value and a next-cell link. We'd hacked that once
  before (commit d0b4399d8) to add a separate List header, but the data
  was still in cons cells. That makes some operations -- notably
  list_nth() -- O(N), and it's bulky because of the next-cell pointers
  and per-cell palloc overhead, and it's very cache-unfriendly if the
  cons cells end up scattered around rather than being adjacent.

  In this rewrite, we still have List headers, but the data is in a
  resizable array of values, with no next-cell links. Now we need at
  most two palloc's per List, and often only one, since we can allocate
  some values in the same palloc call as the List header. (Of course,
  extending an existing List may require repalloc's to enlarge the
  array. But this involves just O(log N) allocations not O(N).)

  Of course this is not without downsides. The key difficulty is that
  addition or deletion of a list entry may now cause other entries to
  move, which it did not before.

  For example, that breaks foreach() and sister macros, which
  historically used a pointer to the current cons-cell as loop state. We
  can repair those macros transparently by making their actual loop
  state be an integer list index; the exposed "ListCell *" pointer is no
  longer state carried across loop iterations, but is just a derived
  value. (In practice, modern compilers can optimize things back to
  having just one loop state value, at least for simple cases with
  inline loop bodies.) In principle, this is a semantics change for
  cases where the loop body inserts or deletes list entries ahead of the
  current loop index; but I found no such cases in the Postgres code.

  The change is not at all transparent for code that doesn't use
  foreach() but chases lists "by hand" using lnext(). The largest share
  of such code in the backend is in loops that were maintaining "prev"
  and "next" variables in addition to the current-cell pointer, in order
  to delete list cells efficiently using list_delete_cell(). However, we
  no longer need a previous-cell pointer to delete a list cell
  efficiently. Keeping a next-cell pointer doesn't work, as explained
  above, but we can improve matters by changing such code to use a
  regular foreach() loop and then using the new macro
  foreach_delete_current() to delete the current cell. (This macro knows
  how to update the associated foreach loop's state so that no cells
  will be missed in the traversal.)

  There remains a nontrivial risk of code assuming that a ListCell *
  pointer will remain good over an operation that could now move the
  list contents. To help catch such errors, list.c can be compiled with
  a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list
  contents whenever that could possibly happen. This makes list
  operations significantly more expensive so it's not normally turned on
  (though it is on by default if USE_VALGRIND is on).

  There are two notable API differences from the previous code:

  * lnext() now requires the List's header pointer in addition to the
    current cell's address.

  * list_delete_cell() no longer requires a previous-cell argument.

  These changes are somewhat unfortunate, but on the other hand code
  using either function needs inspection to see if it is assuming
  anything it shouldn't, so it's not all bad.

  Programmers should be aware of these significant performance changes:

  * list_nth() and related functions are now O(1); so there's no major
    access-speed difference between a list and an array.

  * Inserting or deleting a list element now takes time proportional to
    the distance to the end of the list, due to moving the array
    elements. (However, it typically *doesn't* require palloc or pfree,
    so except in long lists it's probably still faster than before.)
    Notably, lcons() used to be about the same cost as lappend(), but
    that's no longer true if the list is long. Code that uses lcons()
    and list_delete_first() to maintain a stack might usefully be
    rewritten to push and pop at the end of the list rather than the
    beginning.

  * There are now list_insert_nth...() and list_delete_nth...()
    functions that add or remove a list cell identified by index. These
    have the data-movement penalty explained above, but there's no
    search penalty.

  * list_concat() and variants now copy the second list's data into
    storage belonging to the first list, so there is no longer any
    sharing of cells between the input lists. The second argument is now
    declared "const List *" to reflect that it isn't changed.

  This patch just does the minimum needed to get the new implementation
  in place and fix bugs exposed by the regression tests. As suggested by
  the foregoing, there's a fair amount of followup work remaining to do.

  Also, the ENABLE_LIST_COMPAT macros are finally removed in this
  commit. Code using those should have been gone a dozen years ago.

  Patch by me; thanks to David Rowley, Jesper Pedersen, and others for
  review.

  Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
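  The delete-while-iterating idiom, in the new style (a sketch; the
  predicate is hypothetical):

      ListCell   *lc;

      foreach(lc, mylist)
      {
          Node       *item = lfirst(lc);

          if (unwanted(item))     /* hypothetical predicate */
              mylist = foreach_delete_current(mylist, lc);
      }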
* Phase 2 pgindent run for v12. (Tom Lane, 2019-05-22)

  Switch to 2.1 version of pg_bsd_indent. This formats multiline
  function declarations "correctly", that is with additional lines of
  parameter declarations indented to match where the first line's left
  parenthesis is.

  Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Initial pgindent run for v12. (Tom Lane, 2019-05-22)

  This is still using the 2.0 version of pg_bsd_indent. I thought it
  would be good to commit this separately, so as to document the
  differences between 2.0 and 2.1 behavior.

  Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
* Refactor ParamListInfo initialization (Peter Eisentraut, 2019-03-14)

  There were six copies of identical nontrivial code. Put it into a
  function.
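  The shared logic is now a single constructor; a usage sketch
  (nargs/argvalue/argtype are assumed names):

      ParamListInfo paramLI = makeParamList(nargs);

      paramLI->params[0].value = argvalue;
      paramLI->params[0].isnull = false;
      paramLI->params[0].pflags = PARAM_FLAG_CONST;
      paramLI->params[0].ptype = argtype;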
* Change function call information to be variable length. (Andres Freund, 2019-01-26)

  Before this change, FunctionCallInfoData (the struct in which
  arguments etc. for V1 function calls are stored) always had space for
  FUNC_MAX_ARGS (100) arguments, storing datums and their nullness in
  two arrays. For nearly every function call 100 arguments is far more
  than needed, therefore wasting memory. Arg and argnull being two
  separate arrays also guarantees that to access a single argument, two
  cachelines have to be touched.

  Change the layout so there's a single variable-length array with pairs
  of value / isnull. That drastically reduces memory consumption for
  most function calls (on x86-64 a two argument function now uses 64
  bytes, previously 936 bytes), and makes it very likely that argument
  value and its nullness are on the same cacheline.

  Arguments are stored in a new NullableDatum struct, which, due to
  padding, needs more memory per argument than before. But as usually
  far fewer arguments are stored, and individual arguments are cheaper
  to access, that's still a clear win. It's likely that there's other
  places where conversion to NullableDatum arrays would make sense, e.g.
  TupleTableSlots, but that's for another commit.

  Because the function call information is now variable-length,
  allocations have to take the number of arguments into account. For
  heap allocations that can be done with SizeForFunctionCallInfoData();
  for on-stack allocations there's a new LOCAL_FCINFO(name, nargs) macro
  that helps to allocate an appropriately sized and aligned variable.
  Some places with stack allocation of function call information don't
  know the number of arguments at compile time, and currently variably
  sized stack allocations aren't allowed in postgres. Therefore allow
  for FUNC_MAX_ARGS space in these cases. They're not that common, so
  for now that seems acceptable.

  Because of the need to allocate FunctionCallInfo of the appropriate
  size, older extensions may need to update their code. To avoid subtle
  breakages, the FunctionCallInfoData struct has been renamed to
  FunctionCallInfoBaseData. Most code only references FunctionCallInfo,
  so that shouldn't cause much collateral damage.

  This change is also a prerequisite for more efficient expression JIT
  compilation (by allocating the function call information on the stack,
  allowing LLVM to optimize it away); previously the size of the call
  information caused problems inside LLVM's optimizer.

  Author: Andres Freund
  Reviewed-By: Tom Lane
  Discussion: https://postgr.es/m/20180605172952.x34m5uz6ju6enaem@alap3.anarazel.de
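  Stack-allocating call info for a known argument count now looks like
  this (a sketch; flinfo, collation, d0, and d1 are assumed to be in
  scope):

      LOCAL_FCINFO(fcinfo, 2);
      Datum       result;

      InitFunctionCallInfoData(*fcinfo, flinfo, 2, collation, NULL, NULL);
      fcinfo->args[0].value = d0;
      fcinfo->args[0].isnull = false;
      fcinfo->args[1].value = d1;
      fcinfo->args[1].isnull = false;
      result = FunctionCallInvoke(fcinfo);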