path: root/src/backend

* Improve some error wording with multirange type parsing  (Michael Paquier, 2021-05-31)

  Braces were referred to in some error messages as only "brackets" (not
  curly brackets or curly braces), which can be confusing as other types
  of brackets could be used. While on it, add one test to check for the
  case of junk characters detected after a right brace.

  Author: Kyotaro Horiguchi
  Discussion: https://postgr.es/m/20210514.153153.1814935914483287479.horikyota.ntt@gmail.com

* Fix race condition when sharing tuple descriptors.  (Thomas Munro, 2021-05-29)

  Parallel query processes that called BlessTupleDesc() for identical
  tuple descriptors at the same moment could crash. There was code to
  handle that rare case, but it dereferenced a bogus DSA pointer. Repair.

  Back-patch to 11, where commit cc5f8136 added support for sharing
  tuple descriptors in parallel queries.

  Reported-by: Eric Thinnes <e.thinnes@gmx.de>
  Discussion: https://postgr.es/m/99aaa2eb-e194-bf07-c29a-1a76b4f2bcf9%40gmx.de

* Fix VACUUM VERBOSE's LP_DEAD item pages output.  (Peter Geoghegan, 2021-05-27)

  Oversight in commit 5100010e.

* Reduce the range of OIDs reserved for genbki.pl.  (Tom Lane, 2021-05-27)

  Commit ab596105b increased FirstBootstrapObjectId from 12000 to 13000,
  but we've had some push-back about that. It's worrisome to reduce the
  daylight between there and FirstNormalObjectId, because the number of
  OIDs consumed during initdb for collation objects is hard to predict.

  We can improve the situation by abandoning the assumption that these
  OIDs must be globally unique. It should be sufficient for them to be
  unique per-catalog. (Any code that's unhappy about that is broken
  anyway, since no more than per-catalog uniqueness can be guaranteed
  once the OID counter wraps around.) With that change, the largest OID
  assigned during genbki.pl (starting from a base of 10000) is a bit
  under 11000. This allows reverting FirstBootstrapObjectId to 12000
  with reasonable confidence that that will be sufficient for many years
  to come.

  We are not, at this time, abandoning the expectation that
  hand-assigned OIDs (below 10000) are globally unique. Someday that'll
  likely be necessary, but the need seems years away still.

  This is late for v14, but it seems worth doing it now so that
  downstream software doesn't have to deal with the consequences of a
  change in FirstBootstrapObjectId. In any case, we already bought into
  forcing an initdb for beta2, so another catversion bump won't hurt.

  Discussion: https://postgr.es/m/1665197.1622065382@sss.pgh.pa.us

* Rethink definition of pg_attribute.attcompression.  (Tom Lane, 2021-05-27)

  Redefine '\0' (InvalidCompressionMethod) as meaning "if we need to
  compress, use the current setting of default_toast_compression". This
  allows '\0' to be a suitable default choice regardless of datatype,
  greatly simplifying code paths that initialize tupledescs and the
  like. It seems like a more user-friendly approach as well, because now
  the default compression choice doesn't migrate into table definitions,
  meaning that changing default_toast_compression is usually sufficient
  to flip an installation's behavior; one needn't tediously issue
  per-column ALTER SET COMPRESSION commands.

  Along the way, fix a few minor bugs and documentation issues with the
  per-column-compression feature. Adopt more robust APIs for
  SetIndexStorageProperties and GetAttributeCompression.

  Bump catversion because typical contents of attcompression will now be
  different. We could get away without doing that, but it seems better
  to ensure v14 installations all agree on this. (We already forced
  initdb for beta2, anyway.)

  Discussion: https://postgr.es/m/626613.1621787110@sss.pgh.pa.us

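  A minimal sketch of the new rule, for illustration (the helper name
  and main() harness are assumptions, not the actual backend code):

      #include <stdio.h>

      #define InvalidCompressionMethod '\0'

      /* stand-in for the default_toast_compression GUC:
       * 'p' = pglz, 'l' = lz4 */
      static char default_toast_compression = 'p';

      /* '\0' now means "use the current default when we compress" */
      static char
      resolve_compression(char attcompression)
      {
          if (attcompression != InvalidCompressionMethod)
              return attcompression;          /* explicit per-column choice */
          return default_toast_compression;   /* follow the GUC */
      }

      int
      main(void)
      {
          printf("unset column compresses with '%c'\n",
                 resolve_compression(InvalidCompressionMethod));
          return 0;
      }
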
* Replace run-time error check with assertion  (Peter Eisentraut, 2021-05-27)

  The error message was checking that the structures returned from the
  parser matched expectations. That's something we usually use
  assertions for, not a full user-facing error message. So replace that
  with an assertion (hidden inside lfirst_node()).

  Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
  Discussion: https://www.postgresql.org/message-id/flat/452e9df8-ec89-e01b-b64a-8cc6ce830458%40enterprisedb.com

* doc: Fix description of some GUCs in docs and postgresql.conf.sample  (Michael Paquier, 2021-05-27)

  The following parameters had imprecise or incorrect descriptions
  (PGC_POSTMASTER or PGC_SIGHUP):
  - autovacuum_work_mem (docs, as of 9.6~)
  - huge_page_size (docs, as of 14~)
  - max_logical_replication_workers (docs, as of 10~)
  - max_sync_workers_per_subscription (docs, as of 10~)
  - min_dynamic_shared_memory (docs, as of 14~)
  - recovery_init_sync_method (postgresql.conf.sample, as of 14~)
  - remove_temp_files_after_crash (docs, as of 14~)
  - restart_after_crash (docs, as of 9.6~)
  - ssl_min_protocol_version (docs, as of 12~)
  - ssl_max_protocol_version (docs, as of 12~)

  This commit adjusts the description of all these parameters to be more
  consistent with the practice used for the others.

  Reviewed-by: Justin Pryzby
  Discussion: https://postgr.es/m/YK2ltuLpe+FbRXzA@paquier.xyz
  Backpatch-through: 9.6

* Fix assertion during streaming of multi-insert toast changes.  (Amit Kapila, 2021-05-27)

  While decoding the multi-insert WAL we can't clean the toast until we
  get the last insert of that WAL record. Now if we stream the changes
  before we get the last change, the memory for toast chunks won't be
  released and we expect the txn to have streamed all changes after
  streaming. This restriction is mainly to ensure the correctness of
  streamed transactions, and it doesn't seem worth lifting such a
  restriction just to allow this case because anyway we will stream the
  transaction once such an insert is complete.

  Previously we were using two different flags (one for toast tuples and
  another for speculative inserts) to indicate partial changes. Now
  instead we replaced both of them with a single flag to indicate
  partial changes.

  Reported-by: Pavan Deolasee
  Author: Dilip Kumar
  Reviewed-by: Pavan Deolasee, Amit Kapila
  Discussion: https://postgr.es/m/CABOikdN-_858zojYN-2tNcHiVTw-nhxPwoQS4quExeweQfG1Ug@mail.gmail.com

* Fix typo in heapam.c  (Michael Paquier, 2021-05-26)

  Author: Hou Zhijie
  Discussion: https://postgr.es/m/OS0PR01MB571612191738540B27A8DE5894249@OS0PR01MB5716.jpnprd01.prod.outlook.com

* Fix use of uninitialized variable in inline_function().  (Tom Lane, 2021-05-25)

  Commit e717a9a18 introduced a code path that bypassed the call of
  get_expr_result_type, which is not good because we need its rettupdesc
  result to pass to check_sql_fn_retval. We'd failed to notice right
  away because the code path in which check_sql_fn_retval uses that
  argument is fairly hard to reach in this context. It's not impossible
  though, and in any case inline_function would have no business
  assuming that check_sql_fn_retval doesn't need that value.

  To fix, move get_expr_result_type out of the if-block, which in turn
  requires moving the construction of the dummy FuncExpr out of it.

  Per report from Ranier Vilela. (I'm bemused by the lack of any
  compiler complaints...)

  Discussion: https://postgr.es/m/CAEudQAqBqQpQ3HruWAGU_7WaMJ7tntpk0T8k_dVtNB46DqdBgw@mail.gmail.com

* postgresql.conf.sample: Make vertical spacing consistent  (Peter Eisentraut, 2021-05-25)

* Fix memory leak when de-toasting compressed values in VACUUM FULL/CLUSTER  (Michael Paquier, 2021-05-25)

  VACUUM FULL and CLUSTER can be used to enforce the use of the existing
  compression method of a toastable column if a value currently stored
  is compressed with a method that does not match the column's defined
  method. The code in charge of decompressing and recompressing toast
  values at rewrite left around the detoasted values, causing an
  accumulation of memory allocated in TopTransactionContext.

  When processing large relations, this could cause the system to run
  out of memory. The detoasted values are not needed once their tuple is
  rewritten, and this commit ensures that the necessary cleanup happens.

  Issue introduced by bbe0a81d. The comments of the area are reordered a
  bit while on it.

  Reported-by: Andres Freund
  Analyzed-by: Andres Freund
  Author: Michael Paquier
  Reviewed-by: Dilip Kumar
  Discussion: https://postgr.es/m/20210521211929.pcehg6f23icwstdb@alap3.anarazel.de

* Improve docs and error messages for parallel vacuum.  (Amit Kapila, 2021-05-25)

  The error messages, docs, and one of the options were using 'parallel
  degree' to indicate the parallelism used by the vacuum command. We
  normally use 'parallel workers' in other places, so change it for
  parallel vacuum accordingly.

  Author: Bharath Rupireddy
  Reviewed-by: Dilip Kumar, Amit Kapila
  Backpatch-through: 13
  Discussion: https://postgr.es/m/CALj2ACWz=PYrrFXVsEKb9J1aiX4raA+UBe02hdRp_zqDkrWUiw@mail.gmail.com

* Disallow SSL renegotiation  (Michael Paquier, 2021-05-25)

  SSL renegotiation is already disabled as of 48d23c72; however, this
  does not prevent the server from complying with a client willing to
  use renegotiation. In the last couple of years, renegotiation had its
  set of security issues and flaws (like the recent CVE-2021-3449), and
  it could be possible to crash the backend with a client attempting
  renegotiation.

  This commit takes one extra step by disabling renegotiation in the
  backend in the same way as SSL compression (f9264d15) or tickets
  (97d3a0b0). OpenSSL 1.1.0h has added an option named
  SSL_OP_NO_RENEGOTIATION able to achieve that. In older versions there
  is an option called SSL3_FLAGS_NO_RENEGOTIATE_CIPHERS that was
  undocumented, and could be set within the SSL object created when the
  TLS connection opens, but I have decided not to use it, as it feels
  trickier to rely on, and it is not official. Note that this option is
  not usable in OpenSSL < 1.1.0h as the internal contents of the *SSL
  object are hidden to applications.

  SSL renegotiation concerns protocols up to TLSv1.2.

  Per original report from Robert Haas, with a patch based on a
  suggestion by Andres Freund.

  Author: Michael Paquier
  Reviewed-by: Daniel Gustafsson
  Discussion: https://postgr.es/m/YKZBXx7RhU74FlTE@paquier.xyz
  Backpatch-through: 9.6

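  For reference, roughly what disabling renegotiation with the OpenSSL
  option named above looks like (a hedged sketch of server-side setup,
  not the exact be-secure-openssl.c code):

      #include <openssl/ssl.h>

      static void
      disable_renegotiation(SSL_CTX *context)
      {
      #ifdef SSL_OP_NO_RENEGOTIATION
          /* Refuse client-initiated renegotiation (TLSv1.2 and below);
           * the macro only exists in OpenSSL >= 1.1.0h. */
          SSL_CTX_set_options(context, SSL_OP_NO_RENEGOTIATION);
      #endif
      }
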
* Fix setrefs.c code for Result Cache nodes  (David Rowley, 2021-05-25)

  Result Cache, added in 9eacee2e6, neglected to properly adjust the
  plan references in setrefs.c. This could lead to the following error
  during EXPLAIN:

  ERROR: cannot decompile join alias var in plan tree

  Fix that.

  Bug: 17030
  Reported-by: Hans Buschmann
  Discussion: https://postgr.es/m/17030-5844aecae42fe223@postgresql.org

* Consider triggering VACUUM failsafe during scan.  (Peter Geoghegan, 2021-05-24)

  The wraparound failsafe mechanism added by commit 1e55e7d1 handled the
  one-pass strategy case (i.e. the "table has no indexes" case) by
  adding a dedicated failsafe check. This made up for the fact that the
  usual one-pass checks inside lazy_vacuum_all_indexes() cannot ever be
  reached during a one-pass strategy VACUUM.

  This approach failed to account for two-pass VACUUMs that opt out of
  index vacuuming up-front. The INDEX_CLEANUP off case is the only case
  that works like that.

  Fix this by performing a failsafe check every 4GB during the first
  scan of the heap, regardless of the details of the VACUUM. This
  eliminates the special case, and will make the failsafe trigger more
  reliably.

  Author: Peter Geoghegan <pg@bowt.ie>
  Reported-By: Andres Freund <andres@anarazel.de>
  Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
  Discussion: https://postgr.es/m/20210424002921.pb3t7h6frupdqnkp@alap3.anarazel.de

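  A sketch of the cadence described above (the constant and helper names
  are assumptions; the real logic lives in vacuumlazy.c):

      #include <stdint.h>
      #include <stdbool.h>

      #define BLCKSZ 8192
      /* one failsafe re-check per ~4GB of heap pages scanned */
      #define FAILSAFE_EVERY_PAGES \
          ((uint32_t) (((uint64_t) 4 * 1024 * 1024 * 1024) / BLCKSZ))

      /* called for each block during the first scan of the heap */
      static bool
      time_for_failsafe_check(uint32_t blocks_scanned)
      {
          return blocks_scanned % FAILSAFE_EVERY_PAGES == 0;
      }
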
* Add missing NULL check when building Result Cache paths  (David Rowley, 2021-05-24)

  Code added in 9e215378d to disable building of Result Cache paths when
  not all join conditions are part of the parameterization of a unique
  join failed to first check if the inner path's param_info was set
  before checking the param_info's ppi_clauses.

  Add a check for NULL values here and just bail on trying to build the
  path if param_info is NULL. lateral_vars are not considered when
  deciding if the join is unique, so we're not missing out on doing the
  optimization when there are lateral_vars and no param_info.

  Reported-by: Coverity, via Tom Lane
  Discussion: https://postgr.es/m/457998.1621779290@sss.pgh.pa.us

* Re-order pg_attribute columns to eliminate some padding space.  (Tom Lane, 2021-05-23)

  Now that attcompression is just a char, there's a lot of wasted
  padding space after it. Move it into the group of char-wide columns to
  save a net of 4 bytes per pg_attribute entry. While we're at it, swap
  the order of attstorage and attalign to make for a more logical
  grouping of these columns.

  Also re-order actions in related code to match the new field ordering.

  This patch also fixes one outright bug: equalTupleDescs() failed to
  compare attcompression. That could, for example, cause relcache reload
  to fail to adopt a new value following a change.

  Michael Paquier and Tom Lane, per a gripe from Andres Freund.

  Discussion: https://postgr.es/m/20210517204803.iyk5wwvwgtjcmc5w@alap3.anarazel.de

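  A toy illustration of how member order changes struct size through
  padding (these are made-up structs, not the real pg_attribute layout):

      #include <stdio.h>
      #include <stdint.h>

      struct padded {
          int32_t attlen;
          char    attcompression;  /* 3 padding bytes follow... */
          int32_t attndims;        /* ...to keep this int32 aligned */
          char    attstorage;
          char    attalign;        /* plus 2 bytes of tail padding */
      };

      struct packed {
          int32_t attlen;
          int32_t attndims;
          char    attcompression;  /* char-wide fields grouped: less padding */
          char    attstorage;
          char    attalign;
      };

      int
      main(void)
      {
          /* prints 16 vs 12 on typical ABIs: a net 4-byte saving */
          printf("padded: %zu bytes, packed: %zu bytes\n",
                 sizeof(struct padded), sizeof(struct packed));
          return 0;
      }
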
* Be more verbose when the postmaster unexpectedly quits.  (Tom Lane, 2021-05-23)

  Emit a LOG message when the postmaster stops because of a failure in
  the startup process. There already is a similar message if we exit for
  that reason during PM_STARTUP phase, so it seems inconsistent that
  there was none if the startup process fails later on.

  Also emit a LOG message when the postmaster stops after a crash
  because restart_after_crash is disabled. This seems potentially
  helpful in case DBAs (or developers) forget that that's set. Also, it
  was the only remaining place where the postmaster would do an abnormal
  exit without any comment as to why.

  In passing, remove an unreachable call of ExitPostmaster(0).

  Discussion: https://postgr.es/m/194914.1621641288@sss.pgh.pa.us

* Fix access to no-longer-open relcache entry in logical-rep worker.  (Tom Lane, 2021-05-22)

  If we redirected a replicated tuple operation into a partition child
  table, and then tried to fire AFTER triggers for that event, the
  relation cache entry for the child table was already closed. This has
  no visible ill effects as long as the entry is still there and still
  valid, but an unluckily-timed cache flush could result in a crash or
  other misbehavior.

  To fix, postpone the ExecCleanupTupleRouting call (which is what
  closes the child table) until after we've fired triggers. This
  requires a bit of refactoring so that the cleanup function can have
  access to the necessary state.

  In HEAD, I took the opportunity to simplify some of worker.c's
  function APIs based on use of the new ApplyExecutionData struct.
  However, it doesn't seem safe/practical to back-patch that aspect, at
  least not without a lot of analysis of possible interactions with
  a04daa97a.

  In passing, add an Assert to afterTriggerInvokeEvents to catch such
  cases. This seems worthwhile because we've grown a number of fairly
  unstructured ways of calling AfterTriggerEndQuery.

  Back-patch to v13, where worker.c grew the ability to deal with
  partitioned target tables.

  Discussion: https://postgr.es/m/3382681.1621381328@sss.pgh.pa.us

* Fix planner's use of Result Cache with unique joins  (David Rowley, 2021-05-22)

  When the planner considered using a Result Cache node to cache results
  from the inner side of a Nested Loop Join, it failed to consider that
  the inner path's parameterization may not be the entire join
  condition. If the join was marked as inner_unique then we may
  accidentally put the cache in singlerow mode. This meant that entries
  would be marked as complete after caching the first row. That was
  wrong, as if only part of the join condition was parameterized then
  the uniqueness of the unique join was not guaranteed at the Result
  Cache's level. The uniqueness is only guaranteed after Nested Loop
  applies the join filter. If subsequent rows were found, this would
  lead to:

  ERROR: cache entry already complete

  This could have been fixed by only putting the cache in singlerow mode
  if the entire join condition was parameterized. However, Nested Loop
  will only read its inner side so far as the first matching row when
  the join is unique, so that might mean we never get an opportunity to
  mark cache entries as complete. Since non-complete cache entries are
  useless for subsequent lookups, we just don't bother considering a
  Result Cache path in this case.

  In passing, remove the XXX comment that claimed the above ERROR might
  be better suited to be an Assert. After there being an actual case
  which triggered it, it seems better to keep it an ERROR.

  Reported-by: David Christensen
  Discussion: https://postgr.es/m/CAOxo6X+dy-V58iEPFgst8ahPKEU+38NZzUuc+a7wDBZd4TrHMQ@mail.gmail.com

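  A rough sketch of the planner guard described above (the function and
  parameter names are illustrative assumptions, not the actual
  joinpath.c code):

      #include <stdbool.h>

      static bool
      consider_resultcache_path(bool inner_unique,
                                bool all_join_quals_parameterized)
      {
          /*
           * For a unique join, Nested Loop stops at the first match, so a
           * partially-parameterized cache entry could never be marked
           * complete -- and incomplete entries are useless for lookups.
           * Don't build the path at all in that case.
           */
          if (inner_unique && !all_join_quals_parameterized)
              return false;
          return true;
      }
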
* Disallow whole-row variables in GENERATED expressions.  (Tom Lane, 2021-05-21)

  This was previously allowed, but I think that was just an oversight.
  It's a clear violation of the rule that a generated column cannot
  depend on itself or other generated columns. Moreover, because the
  code was relying on the assumption that no such cross-references
  exist, it was pretty easy to crash ALTER TABLE and perhaps other
  places. Even if you managed not to crash, you got quite unstable,
  implementation-dependent results.

  Per report from Vitaly Ustinov. Back-patch to v12 where GENERATED
  came in.

  Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com

* Fix usage of "tableoid" in GENERATED expressions.Tom Lane2021-05-21
| | | | | | | | | | | | | | | We consider this supported (though I've got my doubts that it's a good idea, because tableoid is not immutable). However, several code paths failed to fill the field in soon enough, causing such a GENERATED expression to see zero or the wrong value. This occurred when ALTER TABLE adds a new GENERATED column to a table with existing rows, and during regular INSERT or UPDATE on a foreign table with GENERATED columns. Noted during investigation of a report from Vitaly Ustinov. Back-patch to v12 where GENERATED came in. Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com
* Restore the portal-level snapshot after procedure COMMIT/ROLLBACK.  (Tom Lane, 2021-05-21)

  COMMIT/ROLLBACK necessarily destroys all snapshots within the session.
  The original implementation of intra-procedure transactions just
  cavalierly did that, ignoring the fact that this left us executing in
  a rather different environment than normal. In particular, it turns
  out that handling of toasted datums depends rather critically on there
  being an outer ActiveSnapshot: otherwise, when SPI or the core
  executor pop whatever snapshot they used and return, it's unsafe to
  dereference any toasted datums that may appear in the query result.
  It's possible to demonstrate "no known snapshots" and "missing chunk
  number N for toast value" errors as a result of this oversight.

  Historically this outer snapshot has been held by the Portal code, and
  that seems like a good plan to preserve. So add infrastructure to
  pquery.c to allow re-establishing the Portal-owned snapshot if it's
  not there anymore, and add enough bookkeeping support that we can tell
  whether it is or not.

  We can't, however, just re-establish the Portal snapshot as part of
  COMMIT/ROLLBACK. As in normal transaction start, acquiring the first
  snapshot should wait until after SET and LOCK commands. Hence, teach
  spi.c about doing this at the right time. (Note that this patch
  doesn't fix the problem for any PLs that try to run intra-procedure
  transactions without using SPI to execute SQL commands.)

  This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD,
  rename that to allow_nonatomic.

  replication/logical/worker.c also needs some fixes, because it wasn't
  careful to hold a snapshot open around AFTER trigger execution. That
  code doesn't use a Portal, which I suspect someday we're gonna have to
  fix. But for now, just rearrange the order of operations. This
  includes back-patching the recent addition of finish_estate() to
  centralize the cleanup logic there.

  This also back-patches commit 2ecfeda3e into v13, to improve the test
  coverage for worker.c (it was that test that exposed that worker.c's
  snapshot management is wrong).

  Per bug #15990 from Andreas Wicht. Back-patch to v11 where
  intra-procedure COMMIT was added.

  Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org

* Fix deadlock for multiple replicating truncates of the same table.  (Amit Kapila, 2021-05-21)

  While applying the truncate change, the logical apply worker acquired
  only RowExclusiveLock on the relation being truncated. This allowed
  two apply workers to truncate the same relation at the same time,
  which led to a deadlock: one of the workers, after updating the
  pg_class tuple, tries to acquire SHARE lock on the relation and starts
  to wait for the second worker, which has acquired RowExclusiveLock on
  the relation. And when the second worker tries to update the pg_class
  tuple, it starts to wait for the first worker, which leads to a
  deadlock.

  Fix it by acquiring AccessExclusiveLock on the relation before
  applying the truncate change, as we do for a normal truncate
  operation.

  Author: Peter Smith, test case by Haiying Tang
  Reviewed-by: Dilip Kumar, Amit Kapila
  Backpatch-through: 11
  Discussion: https://postgr.es/m/CAHut+PsNm43p0jM+idTvWwiGZPcP0hGrHMPK9TOAkc+a4UpUqw@mail.gmail.com

* Avoid detoasting failure after COMMIT inside a plpgsql FOR loop.  (Tom Lane, 2021-05-20)

  exec_for_query() normally tries to prefetch a few rows at a time from
  the query being iterated over, so as to reduce executor entry/exit
  overhead. Unfortunately this is unsafe if we have COMMIT or ROLLBACK
  within the loop, because there might be TOAST references in the data
  that we prefetched but haven't yet examined. Immediately after the
  COMMIT/ROLLBACK, we have no snapshots in the session, meaning that
  VACUUM is at liberty to remove recently-deleted TOAST rows.

  This was originally reported as a case triggering the "no known
  snapshots" error in init_toast_snapshot(), but even if you miss
  hitting that, you can get "missing toast chunk", as illustrated by the
  added isolation test case.

  To fix, just disable prefetching in non-atomic contexts. Maybe there
  will be performance complaints prompting us to work harder later, but
  it's not clear at the moment that this really costs much, and I doubt
  we'd want to back-patch any complicated fix.

  In passing, adjust that error message in init_toast_snapshot() to be a
  little clearer about the likely cause of the problem.

  Patch by me, based on earlier investigation by Konstantin Knizhnik.

  Per bug #15990 from Andreas Wicht. Back-patch to v11 where
  intra-procedure COMMIT was added.

  Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org

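  A minimal sketch of the fix's idea (the helper name and the batch size
  of 10 are illustrative assumptions, not the exact pl_exec.c code):

      #include <stdbool.h>

      /*
       * Prefetched tuples can carry TOAST pointers; after an
       * intra-procedure COMMIT/ROLLBACK the session holds no snapshot,
       * so VACUUM may remove the TOAST rows out from under us.  Only
       * batch rows when that can't happen.
       */
      static int
      choose_prefetch_count(bool atomic_context)
      {
          return atomic_context ? 10 : 1;
      }
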
* Make standby promotion reset the recovery pause state to 'not paused'.  (Fujii Masao, 2021-05-19)

  If a promotion is triggered while recovery is paused, the paused state
  ends and promotion continues. But previously in that case
  pg_get_wal_replay_pause_state() wrongly returned 'paused' while a
  promotion was ongoing.

  This commit changes a standby promotion so that it marks the recovery
  pause state as 'not paused' when the promotion is triggered, fixing
  the issue.

  Author: Fujii Masao
  Reviewed-by: Dilip Kumar, Kyotaro Horiguchi
  Discussion: https://postgr.es/m/f706876c-4894-0ba5-6f4d-79803eeea21b@oss.nttdata.com

* Fix issues in pg_stat_wal.  (Fujii Masao, 2021-05-19)

  1) Previously there were both pgstat_send_wal() and
  pgstat_report_wal() in order to send WAL activity to the stats
  collector, with the former being used by the wal writer and the latter
  by most other processes. They were a bit redundant, so this commit
  merges them into pgstat_send_wal() to simplify the code.

  2) Previously WAL global statistics counters were calculated and then
  compared with a zero-filled buffer in order to determine whether any
  WAL activity had happened since the last submission. These
  calculations and comparisons were not cheap, and were regularly
  exercised even in read-only workloads. This commit fixes the issue by
  checking some WAL activity counters directly to determine whether
  there are WAL activity stats to send.

  3) Previously pgstat_report_stat() did not check if there are WAL
  activity stats to send as part of the "Don't expend a clock check if
  nothing to do" check at the top. It's probably rare to have pending
  WAL stats without also passing one of the other conditions, but for
  safety this commit changes pgstat_report_stat() so that it also checks
  some WAL activity counters at the top.

  This commit also adds comments about the design of WAL stats.

  Reported-by: Andres Freund
  Author: Masahiro Ikeda
  Reviewed-by: Kyotaro Horiguchi, Atsushi Torikoshi, Andres Freund, Fujii Masao
  Discussion: https://postgr.es/m/20210324232224.vrfiij2rxxwqqjjb@alap3.anarazel.de

* Fix typo and outdated information in README.barrier  (David Rowley, 2021-05-18)

  README.barrier didn't seem to get the memo when atomics were added.
  Fix that.

  Author: Tatsuo Ishii, David Rowley
  Discussion: https://postgr.es/m/20210516.211133.2159010194908437625.t-ishii%40sraoss.co.jp
  Backpatch-through: 9.6, oldest supported release

* Translation updates  (Peter Eisentraut, 2021-05-17)

  Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
  Source-Git-Hash: 9bbd9c3714d0c76daaa806588b1fbf744aa60496

* Unbreak EXEC_BACKEND build  (Alvaro Herrera, 2021-05-15)

  Per buildfarm.

* Allow compute_query_id to be set to 'auto' and make it default  (Alvaro Herrera, 2021-05-15)

  Allowing only on/off meant that either all existing configuration
  guides would become obsolete if we disabled it by default, or we would
  have to accept a performance loss in the default config if we enabled
  it by default. By allowing 'auto' as a middle ground, the performance
  cost is only paid by those who enable pg_stat_statements and similar
  modules.

  I only edited the release notes to comment-out a paragraph that is now
  factually wrong; further edits are probably needed to describe the
  related change in more detail.

  Author: Julien Rouhaud <rjuju123@gmail.com>
  Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
  Discussion: https://postgr.es/m/20210513002623.eugftm4nk2lvvks3@nol

* Be more careful about barriers when releasing BackgroundWorkerSlots.  (Tom Lane, 2021-05-15)

  ForgetBackgroundWorker lacked any memory barrier at all, while
  BackgroundWorkerStateChange had one but unaccountably did additional
  manipulation of the slot after the barrier. AFAICS, the rule must be
  that the barrier is immediately before setting or clearing
  slot->in_use.

  It looks like back in 9.6 when ForgetBackgroundWorker was first
  written, there might have been some case for not needing a barrier
  there, but I'm not very convinced of that --- the fact that the load
  of bgw_notify_pid is in the caller doesn't seem to guarantee no memory
  ordering problem. So patch 9.6 too.

  It's likely that this doesn't fix any observable bug on Intel
  hardware, but machines with weaker memory ordering rules could have
  problems here.

  Discussion: https://postgr.es/m/4046084.1620244003@sss.pgh.pa.us

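  An illustrative sketch of the barrier rule stated above, using C11
  atomics rather than PostgreSQL's pg_write_barrier() purely so the
  example is self-contained (the struct and names are stand-ins):

      #include <stdatomic.h>
      #include <stdbool.h>

      typedef struct WorkerSlotSketch
      {
          atomic_bool in_use;
          int         notify_pid;     /* stand-in for bgw_notify_pid */
      } WorkerSlotSketch;

      static void
      forget_worker(WorkerSlotSketch *slot)
      {
          slot->notify_pid = 0;

          /*
           * Release ordering makes the notify_pid store visible to anyone
           * who observes in_use == false; morally this is a write barrier
           * placed immediately before clearing slot->in_use.
           */
          atomic_store_explicit(&slot->in_use, false, memory_order_release);
      }
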
* Harden nbtree deduplication posting split code.  (Peter Geoghegan, 2021-05-14)

  Add a defensive "can't happen" error to code that handles nbtree
  posting list splits (promote an existing assertion). This avoids a
  segfault in the event of an insertion of a newitem that is somehow
  identical to an existing non-pivot tuple in the index. An nbtree index
  should never have two index tuples with identical TIDs.

  This scenario is not particularly unlikely in the event of any kind of
  corruption that leaves the index in an inconsistent state relative to
  the heap relation that is indexed. There are two known reports of
  preventable hard crashes. Doing nothing seems unacceptable given the
  general expectation that nbtree will cope reasonably well with corrupt
  data.

  Discussion: https://postgr.es/m/CAH2-Wz=Jr_d-dOYEEmwz0-ifojVNWho01eAqewfQXgKfoe114w@mail.gmail.com
  Backpatch: 13-, where nbtree deduplication was introduced.

* Prevent infinite insertion loops in spgdoinsert().  (Tom Lane, 2021-05-14)

  Formerly we just relied on operator classes that assert longValuesOK
  to eventually shorten the leaf value enough to fit on an index page.
  That fails since the introduction of INCLUDE-column support (commit
  09c1c6ab4), because the INCLUDE columns might alone take up more than
  a page, meaning no amount of leaf-datum compaction will get the job
  done. At least with spgtextproc.c, that leads to an infinite loop,
  since spgtextproc.c won't throw an error for not being able to shorten
  the leaf datum anymore.

  To fix without breaking cases that would otherwise work, add logic to
  spgdoinsert() to verify that the leaf tuple size is decreasing after
  each "choose" step. Some opclasses might not decrease the size on
  every single cycle, and in any case, alignment roundoff of the tuple
  size could obscure small gains. Therefore, allow up to 10 cycles
  without additional savings before throwing an error. (Perhaps this
  number will need adjustment, but it seems quite generous right now.)

  As long as we've developed this logic, let's back-patch it. The back
  branches don't have INCLUDE columns to worry about, but this seems
  like a good defense against possible bugs in operator classes. We
  already know that an infinite loop here is pretty unpleasant, so
  having a defense seems to outweigh the risk of breaking things. (Note
  that spgtextproc.c is actually the only known opclass with
  longValuesOK support, so that this is all moot for known non-core
  opclasses anyway.)

  Per report from Dilip Kumar.

  Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com

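  A minimal sketch of the progress check described above (function and
  variable names are illustrative, not the exact spgdoinsert() code):

      #include <stddef.h>
      #include <stdbool.h>

      #define MAX_CYCLES_WITHOUT_SHRINK 10

      /*
       * Returns false once we've gone too many "choose" cycles without
       * the leaf tuple getting smaller; the caller would then raise an
       * error instead of looping forever.
       */
      static bool
      leaf_size_progressing(size_t new_size, size_t *best_size, int *stalled)
      {
          if (new_size < *best_size)
          {
              *best_size = new_size;
              *stalled = 0;           /* real progress: reset the counter */
              return true;
          }
          /* alignment roundoff can hide small gains, so allow some slack */
          return ++*stalled <= MAX_CYCLES_WITHOUT_SHRINK;
      }
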
* Fix query-cancel handling in spgdoinsert().  (Tom Lane, 2021-05-14)

  Knowing that a buggy opclass could cause an infinite insertion loop,
  spgdoinsert() intended to allow its loop to be interrupted by query
  cancel. However, that never actually worked, because in iterations
  after the first, we'd be holding buffer lock(s) which would cause
  InterruptHoldoffCount to be positive, preventing servicing of the
  interrupt.

  To fix, check if an interrupt is pending, and if so fall out of the
  insertion loop and service the interrupt after we've released the
  buffers. If it was indeed a query cancel, that's the end of the
  matter. If it was a non-canceling interrupt reason, make use of the
  existing provision to retry the whole insertion. (This isn't as
  wasteful as it might seem, since any upper-level index tuples we
  already created should be usable in the next attempt.)

  While there's no known instance of such a bug in existing release
  branches, it still seems like a good idea to back-patch this to all
  supported branches, since the behavior is fairly nasty if a loop does
  happen --- not only is it uncancelable, but it will quickly consume
  memory to the point of an OOM failure. In any case, this code is
  certainly not working as intended.

  Per report from Dilip Kumar.

  Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com

* Refactor CHECK_FOR_INTERRUPTS() to add flexibility.  (Tom Lane, 2021-05-14)

  Split up CHECK_FOR_INTERRUPTS() to provide an additional macro
  INTERRUPTS_PENDING_CONDITION(), which just tests whether an interrupt
  is pending without attempting to service it. This is useful in
  situations where the caller knows that interrupts are blocked, and
  would like to find out if it's worth the trouble to unblock them.

  Also add INTERRUPTS_CAN_BE_PROCESSED(), which indicates whether
  CHECK_FOR_INTERRUPTS() can be relied on to clear the pending
  interrupt.

  This commit doesn't actually add any uses of the new macros, but a
  follow-on bug fix will do so. Back-patch to all supported branches to
  provide infrastructure for that fix.

  Alvaro Herrera and Tom Lane

  Discussion: https://postgr.es/m/20210513155351.GA7848@alvherre.pgsql

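  A simplified sketch of the macro split described above (the globals
  and stub ProcessInterrupts() are stand-ins for the real miscadmin.h
  machinery, which checks additional conditions):

      #include <stdbool.h>

      static volatile bool InterruptPending = false;
      static int  InterruptHoldoffCount = 0;
      static int  CritSectionCount = 0;

      static void
      ProcessInterrupts(void)
      {
          InterruptPending = false;   /* stub: the real one may ereport() */
      }

      /* test whether an interrupt is pending, without servicing it */
      #define INTERRUPTS_PENDING_CONDITION() (InterruptPending)

      /* would CHECK_FOR_INTERRUPTS() actually be able to clear it here? */
      #define INTERRUPTS_CAN_BE_PROCESSED() \
          (InterruptHoldoffCount == 0 && CritSectionCount == 0)

      #define CHECK_FOR_INTERRUPTS() \
          do { \
              if (INTERRUPTS_PENDING_CONDITION()) \
                  ProcessInterrupts(); \
          } while (0)
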
* Convert misleading while loop into an if condition  (David Rowley, 2021-05-14)

  This seems to be leftover from ea15e1867 and from when we used to
  evaluate SRFs at each node. Since there is an unconditional "return"
  at the end of the loop body, only one iteration is ever possible, so
  we can just change this into an if condition.

  There is no actual bug being fixed here so no back-patch. It seems
  fine to just fix this anomaly in master only.

  Author: Greg Nancarrow
  Discussion: https://postgr.es/m/CAJcOf-d7T1q0az-D8evWXnsuBZjigT04WkV5hCAOEJQZRWy28w@mail.gmail.com

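  A toy illustration of the cleanup (made-up stub functions, not the
  actual code in question):

      #include <stdbool.h>

      static bool cond(void) { return true; }
      static int  do_something(void) { return 42; }

      /* before: a "loop" whose body always returns runs at most once */
      static int
      misleading_while(void)
      {
          while (cond())
              return do_something();
          return 0;
      }

      /* after: same behavior, honest control flow */
      static int
      clear_if(void)
      {
          if (cond())
              return do_something();
          return 0;
      }
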
* Fix autovacuum log output heap truncation issue.  (Peter Geoghegan, 2021-05-13)

  The percentage of blocks from the table value reported by autovacuum
  log output (following commit 5100010ee4d) should never exceed 100%
  because it describes the state of the table back when lazy_vacuum()
  was called. The value could nevertheless exceed 100% in the event of
  heap relation truncation. We failed to compensate for how truncation
  affects rel_pages.

  Fix the faulty accounting by using the original rel_pages value
  instead of the current/final rel_pages value.

  Reported-By: Andres Freund <andres@anarazel.de>
  Discussion: https://postgr.es/m/20210423204306.5osfpkt2ggaedyvy@alap3.anarazel.de

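  The arithmetic behind the fix, as a self-contained illustration (the
  numbers and variable names are made up for the example):

      #include <stdio.h>
      #include <stdint.h>

      int
      main(void)
      {
          uint32_t orig_rel_pages  = 1000;  /* rel_pages when lazy_vacuum() ran */
          uint32_t scanned_pages   = 900;
          uint32_t final_rel_pages = 600;   /* after heap truncation */

          /* buggy: 900/600 = 150%, an impossible figure */
          printf("wrong: %.2f%%\n", 100.0 * scanned_pages / final_rel_pages);
          /* fixed: 900/1000 = 90%, can never exceed 100% */
          printf("right: %.2f%%\n", 100.0 * scanned_pages / orig_rel_pages);
          return 0;
      }
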
* Rename the logical replication global "wrconn"  (Alvaro Herrera, 2021-05-12)

  The worker.c global wrconn is only meant to be used by logical
  apply/tablesync workers, but there are other variables with the same
  name. To reduce future confusion rename the global from "wrconn" to
  "LogRepWorkerWalRcvConn".

  While this is just cosmetic, it seems better to backpatch it all the
  way back to 10 where this code appeared, to avoid future backpatching
  issues.

  Author: Peter Smith <smithpb2250@gmail.com>
  Discussion: https://postgr.es/m/CAHut+Pu7Jv9L2BOEx_Z0UtJxfDevQSAUW2mJqWU+CtmDrEZVAg@mail.gmail.com

* Double-space commands in system_constraints.sql/system_functions.sql.  (Tom Lane, 2021-05-12)

  Previously, any error reported by the backend while reading
  system_constraints.sql would report the entire file, not just the
  particular command it was working on. (Ask me how I know.) Likewise,
  there were chunks of system_functions.sql that would be read as one
  command, which would be annoying if anything failed there.

  The issue for system_constraints.sql is an oversight in commit
  dfb75e478. I didn't try to trace down where the poor formatting in
  system_functions.sql started, but it's certainly contrary to the
  advice at the head of that file.

* Initial pgindent and pgperltidy run for v14.  (Tom Lane, 2021-05-12)

  Also "make reformat-dat-files".

  The only change worthy of note is that pgindent messed up the
  formatting of launcher.c's struct LogicalRepWorkerId, which led me to
  notice that that struct wasn't used at all anymore, so I just took it
  out.

* Simplify one use of ScanKey in pg_subscription.c  (Michael Paquier, 2021-05-12)

  The section of the code in charge of returning all the relations
  associated with a subscription needs only one ScanKey, but allocated
  two of them. This code was introduced as a copy-paste from a different
  area in the same file by 7c4f524, making the result confusing to
  follow.

  Author: Peter Smith
  Reviewed-by: Tom Lane, Julien Rouhaud, Bharath Rupireddy
  Discussion: https://postgr.es/m/CAHut+PsLKe+rN3FjchoJsd76rx2aMsFTB7CTFxRgUP05p=kcpQ@mail.gmail.com

* Refactor some error messages for easier translation  (Peter Eisentraut, 2021-05-12)

* Fix EXPLAIN ANALYZE for async-capable nodes.  (Etsuro Fujita, 2021-05-12)

  EXPLAIN ANALYZE for an async-capable ForeignScan node associated with
  postgres_fdw is done just by using instrumentation for ExecProcNode()
  called from the node's callbacks, causing the following problems:

  1) If the remote table to scan is empty, the node is incorrectly
  considered as "never executed" by the command even if the node is
  executed, as ExecProcNode() isn't called from the node's callbacks at
  all in that case.

  2) The command fails to collect timings for things other than
  ExecProcNode() done in the node, such as creating a cursor for the
  node's remote query.

  To fix these problems, add instrumentation for async-capable nodes,
  and modify postgres_fdw accordingly.

  My oversight in commit 27e1f1456.

  While at it, update a comment for the AsyncRequest struct in
  execnodes.h and the documentation for the ForeignAsyncRequest API in
  fdwhandler.sgml to match the code in ExecAsyncAppendResponse() in
  nodeAppend.c, and fix typos in comments in nodeAppend.c.

  Per report from Andrey Lepikhov, though I didn't use his patch.

  Reviewed-by: Andrey Lepikhov
  Discussion: https://postgr.es/m/2eb662bb-105d-fc20-7412-2f027cc3ca72%40postgrespro.ru

* Change data type of counters in BufferUsage and WalUsage from long to int64.  (Fujii Masao, 2021-05-12)

  Previously long was used as the data type for some counters in
  BufferUsage and WalUsage. But long is only four bytes wide on some
  platforms (e.g., on Windows), and it's entirely possible to wrap a
  four-byte counter. For example, emitting more than four billion WAL
  records in one transaction isn't actually particularly rare.

  To avoid overflows of those counters, this commit changes their data
  type from long to int64.

  Suggested-by: Andres Freund
  Author: Masahiro Ikeda
  Reviewed-by: Fujii Masao
  Discussion: https://postgr.es/m/20201221211650.k7b53tcnadrciqjo@alap3.anarazel.de
  Discussion: https://postgr.es/m/af0964ac-7080-1984-dc23-513754987716@oss.nttdata.com

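  A self-contained demonstration of why long is hazardous here: on LLP64
  platforms (64-bit Windows) long is 4 bytes, while int64_t is 8 bytes
  everywhere (the counter value below is made up for the example):

      #include <stdio.h>
      #include <stdint.h>

      int
      main(void)
      {
          printf("sizeof(long)    = %zu\n", sizeof(long));    /* 4 on Win64 */
          printf("sizeof(int64_t) = %zu\n", sizeof(int64_t)); /* always 8 */

          /* five billion WAL records fit in int64 but wrap a 4-byte long */
          int64_t wal_records = 5000000000LL;
          printf("overflows a 32-bit counter: %s\n",
                 wal_records > INT32_MAX ? "yes" : "no");
          return 0;
      }
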
* Tweak generation of Gen_dummy_probes.pl  (Andrew Dunstan, 2021-05-11)

  Use a static prolog file instead of generating the prolog from the
  existing perl script. Also, support generation of the file in a vpath
  build.

  Discussion: https://postgr.es/m/700620.1620662868@sss.pgh.pa.us

* Fix typo  (Peter Eisentraut, 2021-05-11)

* Fix mishandling of resjunk columns in ON CONFLICT ... UPDATE tlists.  (Tom Lane, 2021-05-10)

  It's unusual to have any resjunk columns in an ON CONFLICT ... UPDATE
  list, but it can happen when MULTIEXPR_SUBLINK SubPlans are present.
  If it happens, the ON CONFLICT UPDATE code path would end up storing
  tuples that include the values of the extra resjunk columns. That's
  fairly harmless in the short run, but if new columns are added to the
  table then the values would become accessible, possibly leading to
  malfunctions if they don't match the datatypes of the new columns.

  This had escaped notice through a confluence of missing sanity checks,
  including

  * There's no cross-check that a tuple presented to heap_insert or
  heap_update matches the table rowtype. While it's difficult to check
  that fully at reasonable cost, we can easily add assertions that there
  aren't too many columns.

  * The output-column-assignment cases in execExprInterp.c lacked any
  sanity checks on the output column numbers, which seems like an
  oversight considering there are plenty of assertion checks on input
  column numbers. Add assertions there too.

  * We failed to apply nodeModifyTable's ExecCheckPlanOutput() to the ON
  CONFLICT UPDATE tlist. That wouldn't have caught this specific error,
  since that function is chartered to ignore resjunk columns; but it
  sure seems like a bad omission now that we've seen this bug.

  In HEAD, the right way to fix this is to make the processing of ON
  CONFLICT UPDATE tlists work the same as regular UPDATE tlists now do,
  that is don't add "SET x = x" entries, and use
  ExecBuildUpdateProjection to evaluate the tlist and combine it with
  old values of the not-set columns. This adds a little complication to
  ExecBuildUpdateProjection, but allows removal of a comparable amount
  of now-dead code from the planner.

  In the back branches, the most expedient solution seems to be to (a)
  use an output slot for the ON CONFLICT UPDATE projection that actually
  matches the target table, and then (b) invent a variant of
  ExecBuildProjectionInfo that can be told to not store values resulting
  from resjunk columns, so it doesn't try to store into nonexistent
  columns of the output slot. (We can't simply ignore the resjunk
  columns altogether; they have to be evaluated for MULTIEXPR_SUBLINK to
  work.) This works back to v10. In 9.6, projections work much
  differently and we can't cheaply give them such an option. The 9.6
  version of this patch works by inserting a JunkFilter when it's
  necessary to get rid of resjunk columns.

  In addition, v11 and up have the reverse problem when trying to
  perform ON CONFLICT UPDATE on a partitioned table. Through a further
  oversight, adjust_partition_tlist() discarded resjunk columns when
  re-ordering the ON CONFLICT UPDATE tlist to match a partition. This
  accidentally prevented the storing-bogus-tuples problem, but at the
  cost that MULTIEXPR_SUBLINK cases didn't work, typically crashing if
  more than one row has to be updated. Fix by preserving resjunk columns
  in that routine. (I failed to resist the temptation to add more
  assertions there too, and to do some minor code beautification.)

  Per report from Andres Freund. Back-patch to all supported branches.

  Security: CVE-2021-32028

* Prevent integer overflows in array subscripting calculations.  (Tom Lane, 2021-05-10)

  While we were (mostly) careful about ensuring that the dimensions of
  arrays aren't large enough to cause integer overflow, the lower bound
  values were generally not checked. This allows situations where
  lower_bound + dimension overflows an integer. It seems that that's
  harmless so far as array reading is concerned, except that array
  elements with subscripts notionally exceeding INT_MAX are
  inaccessible. However, it confuses various array-assignment logic,
  resulting in a potential for memory stomps.

  Fix by adding checks that array lower bounds aren't large enough to
  cause lower_bound + dimension to overflow. (Note: this results in
  disallowing cases where the last subscript position would be exactly
  INT_MAX. In principle we could probably allow that, but there's a lot
  of code that computes lower_bound + dimension and would need
  adjustment. It seems doubtful that it's worth the trouble/risk to
  allow it.)

  Somewhat independently of that, array_set_element() was careless about
  possible overflow when checking the subscript of a fixed-length array,
  creating a different route to memory stomps. Fix that too.

  Security: CVE-2021-32027

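  A sketch of the overflow guard's idea (the function name is an
  assumption, not the exact arrayutils.c code):

      #include <limits.h>
      #include <stdbool.h>

      /*
       * Reject lower bounds where lower_bound + dimension would pass
       * INT_MAX, i.e. where the last subscript (lb + dim - 1) would
       * reach INT_MAX or the sum would overflow outright.
       */
      static bool
      array_bounds_ok(int lower_bound, int dimension)
      {
          if (dimension < 0)
              return false;
          /* compute in 64-bit so the test itself can't overflow */
          return (long long) lower_bound + dimension <= (long long) INT_MAX;
      }
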