aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
* Stabilize output of new regression test.Tom Lane2022-08-08
| | | | | | | | | | Per buildfarm, the output order of \dx+ isn't consistent across locales. Apply NO_LOCALE to force C locale. There might be a more localized way, but I'm not seeing it offhand, and anyway there is nothing in this test module that particularly cares about locales. Security: CVE-2022-2625
* In extensions, don't replace objects not belonging to the extension.Tom Lane2022-08-08
| | | | | | | | | | | | | | | | | | | | | | | Previously, if an extension script did CREATE OR REPLACE and there was an existing object not belonging to the extension, it would overwrite the object and adopt it into the extension. This is problematic, first because the overwrite is probably unintentional, and second because we didn't change the object's ownership. Thus a hostile user could create an object in advance of an expected CREATE EXTENSION command, and would then have ownership rights on an extension object, which could be modified for trojan-horse-type attacks. Hence, forbid CREATE OR REPLACE of an existing object unless it already belongs to the extension. (Note that we've always forbidden replacing an object that belongs to some other extension; only the behavior for previously-free-standing objects changes here.) For the same reason, also fail CREATE IF NOT EXISTS when there is an existing object that doesn't belong to the extension. Our thanks to Sven Klemm for reporting this problem. Security: CVE-2022-2625
* Translation updatesAlvaro Herrera2022-08-08
| | | | | Source-Git-URL: ssh://git@git.postgresql.org/pgtranslation/messages.git Source-Git-Hash: 8ee19d25e0753a690bea62ddcbbfaf2e0d093c1d
* Remove unportable use of timezone in recent testAlvaro Herrera2022-08-07
| | | | | | Per buildfarm member snapper Discussion: https://postgr.es/m/129951.1659812518@sss.pgh.pa.us
* Improve recently-added test reliabilityAlvaro Herrera2022-08-06
| | | | | | | | | | | | | | Commit 59be1c942a47 already tried to make src/test/recovery/t/033_replay_tsp_drops more reliable, but it wasn't enough. Try to improve on that by making this use of a replication slot to be more like others. Also, don't drop the slot. Make a few other stylistic changes while at it. It's still quite slow, which is another thing that we need to fix in this script. Backpatch to all supported branches. Discussion: https://postgr.es/m/349302.1659191875@sss.pgh.pa.us
* Partially undo commit 94da73281.Tom Lane2022-08-05
| | | | | | | | On closer inspection, mcv.c isn't as broken for ScalarArrayOpExpr as I thought. The Var-on-right issue is real enough, but actually it does cope fine with a NULL array constant --- I was misled by an XXX comment suggesting it didn't. Undo that part of the code change, and replace the XXX comment with something less misleading.
* Fix non-bulletproof ScalarArrayOpExpr code for extended statistics.Tom Lane2022-08-05
| | | | | | | | | | | | | | | | statext_is_compatible_clause_internal() checked that the arguments of a ScalarArrayOpExpr are one Var and one Const, but it would allow cases where the Const was on the left. Subsequent uses of the clause are not expecting that and would suffer assertion failures or core dumps. mcv.c also had not bothered to cope with the case of a NULL array constant, which seems really unacceptably sloppy of somebody. (Although our tools failed us there too, since AFAIK neither Coverity nor any compiler warned of the obvious use-of-uninitialized-variable condition.) It seems best to handle that by having statext_is_compatible_clause_internal() reject it. Noted while fixing bug #17570. Back-patch to v13 where the extended stats code grew some awareness of ScalarArrayOpExpr.
* BRIN: mask BRIN_EVACUATE_PAGE for WAL consistency checkingAlvaro Herrera2022-08-05
| | | | | | | | | | | | | | | | That bit is unlogged and therefore it's wrong to consider it in WAL page comparison. Add a test that tickles the case, as branch testing technology allows. This has been a problem ever since wal consistency checking was introduced (commit a507b86900f6 for pg10), so backpatch to all supported branches. Author: 王海洋 (Haiyang Wang) <wanghaiyang.001@bytedance.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/CACciXAD2UvLMOhc4jX9VvOKt7DtYLr3OYRBhvOZ-jRxtzc_7Jg@mail.gmail.com Discussion: https://postgr.es/m/CACciXADOfErX9Bx0nzE_SkdfXr6Bbpo5R=v_B6MUTEYW4ya+cg@mail.gmail.com
* Add HINT for restartpoint race with KeepFileRestoredFromArchive().Noah Misch2022-08-05
| | | | | | | | | | The five commits ending at cc2c7d65fc27e877c9f407587b0b92d46cd6dd16 closed this race condition for v15+. For v14 through v10, add a HINT to discourage studying the cosmetic problem. Reviewed by Kyotaro Horiguchi and David Steele. Discussion: https://postgr.es/m/20220731061747.GA3692882@rfd.leadboat.com
* regress: fix test instabilityAlvaro Herrera2022-08-05
| | | | | | | Having additional triggers in a test table made the ORDER BY clauses in old queries underspecified. Add another column there for stability. Per sporadic buildfarm pink.
* Fix ENABLE/DISABLE TRIGGER to handle recursion correctlyAlvaro Herrera2022-08-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Using ATSimpleRecursion() in ATPrepCmd() to do so as bbb927b4db9b did is not correct, because ATPrepCmd() can't distinguish between triggers that may be cloned and those that may not, so would wrongly try to recurse for the latter category of triggers. So this commit restores the code in EnableDisableTrigger() that 86f575948c77 had added to do the recursion, which would do it only for triggers that may be cloned, that is, row-level triggers. This also changes tablecmds.c such that ATExecCmd() is able to pass the value of ONLY flag down to EnableDisableTrigger() using its new 'recurse' parameter. This also fixes what seems like an oversight of 86f575948c77 that the recursion to partition triggers would only occur if EnableDisableTrigger() had actually changed the trigger. It is more apt to recurse to inspect partition triggers even if the parent's trigger didn't need to be changed: only then can we be certain that all descendants share the same state afterwards. Backpatch all the way back to 11, like bbb927b4db9b. Care is taken not to break ABI compatibility (and that no catversion bump is needed.) Co-authored-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru> Discussion: https://postgr.es/m/CA+HiwqG-cZT3XzGAnEgZQLoQbyfJApVwOTQaCaas1mhpf+4V5A@mail.gmail.com
* Add CHECK_FOR_INTERRUPTS in ExecInsert's speculative insertion loop.Tom Lane2022-08-04
| | | | | | | | | | | | | Ordinarily the functions called in this loop ought to have plenty of CFIs themselves; but we've now seen a case where no such CFI is reached, making the loop uninterruptible. Even though that's from a recently-introduced bug, it seems prudent to install a CFI at the loop level in all branches. Per discussion of bug #17558 from Andrew Kesper (an actual fix for that bug will follow). Discussion: https://postgr.es/m/17558-3f6599ffcf52fd4a@postgresql.org
* Add proper regression test for the recent SRFs-in-pathkeys problem.Tom Lane2022-08-04
| | | | | | | | | | | Remove the test case added by commit fac1b470a, which never actually worked to expose the problem it claimed to test. Replace it with a case that does expose the problem, and also covers the SRF-not- at-the-top deficiency repaired in 1aa8dad41. Richard Guo, with some editorialization by me Discussion: https://postgr.es/m/17564-c7472c2f90ef2da3@postgresql.org
* Fix incorrect tests for SRFs in relation_can_be_sorted_early().Tom Lane2022-08-03
| | | | | | | | | | | | | | | | | | | | | | | | | Commit fac1b470a thought we could check for set-returning functions by testing only the top-level node in an expression tree. This is wrong in itself, and to make matters worse it encouraged others to make the same mistake, by exporting tlist.c's special-purpose IS_SRF_CALL() as a widely-visible macro. I can't find any evidence that anyone's taken the bait, but it was only a matter of time. Use expression_returns_set() instead, and stuff the IS_SRF_CALL() genie back in its bottle, this time with a warning label. I also added a couple of cross-reference comments. After a fair amount of fooling around, I've despaired of making a robust test case that exposes the bug reliably, so no test case here. (Note that the test case added by fac1b470a is itself broken, in that it doesn't notice if you remove the code change. The repro given by the bug submitter currently doesn't fail either in v15 or HEAD, though I suspect that may indicate an unrelated bug.) Per bug #17564 from Martijn van Oosterhout. Back-patch to v13, as the faulty patch was. Discussion: https://postgr.es/m/17564-c7472c2f90ef2da3@postgresql.org
* Reduce test runtime of src/test/modules/snapshot_too_old.Tom Lane2022-08-03
| | | | | | | | | | | | | | | | | | | The sto_using_cursor and sto_using_select tests were coded to exercise every permutation of their test steps, but AFAICS there is no value in exercising more than one. This matters because each permutation costs about six seconds, thanks to the "pg_sleep(6)". Perhaps we could reduce that, but the useless permutations seem worth getting rid of in any case. (Note that sto_using_hash_index got it right already.) While here, clean up some other sloppiness such as an unused table. This doesn't make too much difference in interactive testing, since the wasted time is typically masked by parallelization with other tests. However, the buildfarm runs this as a serial step, which means we can expect to shave ~40 seconds from every buildfarm run. That makes it worth back-patching. Discussion: https://postgr.es/m/2515192.1659454702@sss.pgh.pa.us
* Check maximum number of columns in function RTEs, too.Tom Lane2022-08-01
| | | | | | | | | | | | | | | I thought commit fd96d14d9 had plugged all the holes of this sort, but no, function RTEs could produce oversize tuples too, either via long coldeflists or just from multiple functions in one RTE. (I'm pretty sure the other variants of base RTEs aren't a problem, because they ultimately refer to either a table or a sub-SELECT, whose widths are enforced elsewhere. But we explicitly allow join RTEs to be overwidth, as long as you don't try to form their tuple result.) Per further discussion of bug #17561. As before, patch all branches. Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org
* Fix error reporting after ioctl() call with pg_upgrade --cloneMichael Paquier2022-08-01
| | | | | | | | | | | | errno was not reported correctly after attempting to clone a file, leading to incorrect error reports. While scanning through the code, I have not noticed any similar mistakes. Error introduced in 3a769d8. Author: Justin Pryzby Discussion: https://postgr.es/m/20220731134135.GY15006@telsasoft.com Backpatch-through: 12
* Fix new recovery test for log_error_verbosity=verbose caseAndrew Dunstan2022-07-29
| | | | | | | | | The new test is from commit 9e4f914b5e. With this setting messages have SQL error numbers included, so that needs to be provided for in the pattern looked for. Backpatch to all live branches like the original.
* In transformRowExpr(), check for too many columns in the row.Tom Lane2022-07-29
| | | | | | | | | | | | | | | | | | | | A RowExpr with more than MaxTupleAttributeNumber columns would fail at execution anyway, since we cannot form a tuple datum with more than that many columns. While heap_form_tuple() has a check for too many columns, it emerges that there are some intermediate bits of code that don't check and can be driven to failure with sufficiently many columns. Checking this at parse time seems like the most appropriate place to install a defense, since we already check SELECT list length there. While at it, make the SELECT-list-length error use the same errcode (TOO_MANY_COLUMNS) as heap_form_tuple does, rather than the generic PROGRAM_LIMIT_EXCEEDED. Per bug #17561 from Egor Chindyaskin. The given test case crashes in all supported branches (and probably a lot further back), so patch all. Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org
* Fix test instabilityAlvaro Herrera2022-07-29
| | | | | | | | | | On FreeBSD, the new test fails due to a WAL file being removed before the standby has had the chance to copy it. Fix by adding a replication slot to prevent the removal until after the standby has connected. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reported-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAEze2Wj5nau_qpjbwihvmXLfkAWOZ5TKdbnqOc6nKSiRJEoPyQ@mail.gmail.com
* Fix replay of create database records on standbyAlvaro Herrera2022-07-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Crash recovery on standby may encounter missing directories when replaying database-creation WAL records. Prior to this patch, the standby would fail to recover in such a case; however, the directories could be legitimately missing. Consider the following sequence of commands: CREATE DATABASE DROP DATABASE DROP TABLESPACE If, after replaying the last WAL record and removing the tablespace directory, the standby crashes and has to replay the create database record again, crash recovery must be able to continue. A fix for this problem was already attempted in 49d9cfc68bf4, but it was reverted because of design issues. This new version is based on Robert Haas' proposal: any missing tablespaces are created during recovery before reaching consistency. Tablespaces are created as real directories, and should be deleted by later replay. CheckRecoveryConsistency ensures they have disappeared. The problems detected by this new code are reported as PANIC, except when allow_in_place_tablespaces is set to ON, in which case they are WARNING. Apart from making tests possible, this gives users an escape hatch in case things don't go as planned. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Asim R Praveen <apraveen@pivotal.io> Author: Paul Guo <paulguo@gmail.com> Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com> (older versions) Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> (older versions) Reviewed-by: Michaël Paquier <michael@paquier.xyz> Diagnosed-by: Paul Guo <paulguo@gmail.com> Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
* Allow "in place" tablespaces.Alvaro Herrera2022-07-27
| | | | | | | | | | | | | | | | | | | | | | This is a backpatch to branches 10-14 of the following commits: 7170f2159fb2 Allow "in place" tablespaces. c6f2f01611d4 Fix pg_basebackup with in-place tablespaces. f6f0db4d6240 Fix pg_tablespace_location() with in-place tablespaces 7a7cd84893e0 doc: Remove mention to in-place tablespaces for pg_tablespace_location() 5344723755bd Remove unnecessary Windows-specific basebackup code. In-place tablespaces were introduced as a testing helper mechanism, but they are going to be used for a bugfix in WAL replay to be backpatched to all stable branches. I (Álvaro) had to adjust some code to account for lack of get_dirent_type() in branches prior to 14. Author: Thomas Munro <thomas.munro@gmail.com> Author: Michaël Paquier <michael@paquier.xyz> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20220722081858.omhn2in5zt3g4nek@alvherre.pgsql
* Force immediate commit after CREATE DATABASE etc in extended protocol.Tom Lane2022-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a few commands that "can't run in a transaction block", meaning that if they complete their processing but then we fail to COMMIT, we'll be left with inconsistent on-disk state. However, the existing defenses for this are only watertight for simple query protocol. In extended protocol, we didn't commit until receiving a Sync message. Since the client is allowed to issue another command instead of Sync, we're in trouble if that command fails or is an explicit ROLLBACK. In any case, sitting in an inconsistent state while waiting for a client message that might not come seems pretty risky. This case wasn't reachable via libpq before we introduced pipeline mode, but it's always been an intended aspect of extended query protocol, and likely there are other clients that could reach it before. To fix, set a flag in PreventInTransactionBlock that tells exec_execute_message to force an immediate commit. This seems to be the approach that does least damage to existing working cases while still preventing the undesirable outcomes. While here, add some documentation to protocol.sgml that explicitly says how to use pipelining. That's latent in the existing docs if you know what to look for, but it's better to spell it out; and it provides a place to document this new behavior. Per bug #17434 from Yugo Nagata. It's been wrong for ages, so back-patch to all supported branches. Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org
* Fix ruleutils issues with dropped cols in functions-returning-composite.Tom Lane2022-07-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to lack of concern for the case in the dependency code, it's possible to drop a column of a composite type even though stored queries have references to the dropped column via functions-in-FROM that return the composite type. There are "soft" references, namely FROM-clause aliases for such columns, and "hard" references, that is actual Vars referring to them. The right fix for hard references is to add dependencies preventing the drop; something we've known for many years and not done (and this commit still doesn't address it). A "soft" reference shouldn't prevent a drop though. We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but nobody had noticed that the current behavior can result in dump/reload failures, because ruleutils.c can print more column aliases than the underlying composite type now has. So we need to rejigger the column-alias-handling code to treat such columns as dropped and not print aliases for them. Rather than writing new code for this, I used expandRTE() which already knows how to figure out which function result columns are dropped. I'd initially thought maybe we could use expandRTE() in all cases, but that fails for EXPLAIN's purposes, because the planner strips a lot of RTE infrastructure that expandRTE() needs. So this patch just uses it for unplanned function RTEs and otherwise does things the old way. If there is a hard reference (Var), then removing the column alias causes us to fail to print the Var, since there's no longer a name to print. Failing seems less desirable than printing a made-up name, so I made it print "?dropped?column?" instead. Per report from Timo Stolz. Back-patch to all supported branches. Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de
* Fix assertion failure and segmentation fault in backup code.Fujii Masao2022-07-20
| | | | | | | | | | | | | | | | | | | | | | When a non-exclusive backup is canceled, do_pg_abort_backup() is called and resets some variables set by pg_backup_start (pg_start_backup in v14 or before). But previously it forgot to reset the session state indicating whether a non-exclusive backup is in progress or not in this session. This issue could cause an assertion failure when the session running BASE_BACKUP is terminated after it executed pg_backup_start and pg_backup_stop (pg_stop_backup in v14 or before). Also it could cause a segmentation fault when pg_backup_stop is called after BASE_BACKUP in the same session is canceled. This commit fixes the issue by making do_pg_abort_backup reset that session state. Back-patch to all supported branches. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
* Prevent BASE_BACKUP in the middle of another backup in the same session.Fujii Masao2022-07-20
| | | | | | | | | | | | | | | | | | | | | | | | Multiple non-exclusive backups are able to be run conrrently in different sessions. But, in the same session, only one non-exclusive backup can be run at the same moment. If pg_backup_start (pg_start_backup in v14 or before) is called in the middle of another non-exclusive backup in the same session, an error is thrown. However, previously, in logical replication walsender mode, even if that walsender session had already called pg_backup_start and started a non-exclusive backup, it could execute BASE_BACKUP command and start another non-exclusive backup. Which caused subsequent pg_backup_stop to throw an error because BASE_BACKUP unexpectedly reset the session state marked by pg_backup_start. This commit prevents BASE_BACKUP command in the middle of another non-exclusive backup in the same session. Back-patch to all supported branches. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
* Re-add SPICleanup for ABI compatibility in stable branchPeter Eisentraut2022-07-18
| | | | | | | | This fixes an ABI break introduced by cfc86f987349372dbbfc0391f9f519c0a7b27b84. Author: Markus Wanner <markus.wanner@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/defd749a-8410-841d-1126-21398686d63d@enterprisedb.com
* Fix omissions in support for the "regcollation" type.Tom Lane2022-07-17
| | | | | | | | | | | | | The patch that added regcollation doesn't seem to have been too thorough about supporting it everywhere that other reg* types are supported. Fix that. (The find_expr_references omission is moderately serious, since it could result in missing expression dependencies. The others are less exciting.) Noted while fixing bug #17483. Back-patch to v13 where regcollation was added. Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
* Make dsm_impl_posix_resize more future-proof.Thomas Munro2022-07-16
| | | | | | | | | | | | | | | | | | | | Commit 4518c798 blocks signals for a short region of code, but it assumed that whatever called it had the signal mask set to UnBlockSig on entry. That may be true today (or may even not be, in extensions in the wild), but it would be better not to make that assumption. We should save-and-restore the caller's signal mask. The PG_SETMASK() portability macro couldn't be used for that, which is why it wasn't done before. But... considering that commit a65e0864 established back in 9.6 that supported POSIX systems have sigprocmask(), and that this is POSIX-only code, there is no reason not to use standard sigprocmask() directly to achieve that. Back-patch to all supported releases, like 4518c798 and 80845b7c. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA%2BhUKGKx6Biq7_UuV0kn9DW%2B8QWcpJC1qwhizdtD9tN-fn0H0g%40mail.gmail.com
* Don't clobber postmaster sigmask in dsm_impl_resize.Thomas Munro2022-07-15
| | | | | | | | | | | | Commit 4518c798 intended to block signals in regular backends that allocate DSM segments, but dsm_impl_resize() is also reached by dsm_postmaster_startup(). It's not OK to clobber the postmaster's signal mask, so only manipulate the signal mask when under the postmaster. Back-patch to all releases, like 4518c798. Discussion: https://postgr.es/m/CA%2BhUKGKNpK%3D2OMeea_AZwpLg7Bm4%3DgYWk7eDjZ5F6YbozfOf8w%40mail.gmail.com
* Block signals while allocating DSM memory.Thomas Munro2022-07-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Linux, we call posix_fallocate() on shm_open()'d memory to avoid later potential SIGBUS (see commit 899bd785). Based on field reports of systems stuck in an EINTR retry loop there, there, we made it possible to break out of that loop via slightly odd coding where the CHECK_FOR_INTERRUPTS() call was somewhat removed from the loop (see commit 422952ee). On further reflection, that was not a great choice for at least two reasons: 1. If interrupts were held, the CHECK_FOR_INTERRUPTS() would do nothing and the EINTR error would be surfaced to the user. 2. If EINTR was reported but neither QueryCancelPending nor ProcDiePending was set, then we'd dutifully retry, but with a bit more understanding of how posix_fallocate() works, it's now clear that you can get into a loop that never terminates. posix_fallocate() is not a function that can do some of the job and tell you about progress if it's interrupted, it has to undo what it's done so far and report EINTR, and if signals keep arriving faster than it can complete (cf recovery conflict signals), you're stuck. Therefore, for now, we'll simply block most signals to guarantee progress. SIGQUIT is not blocked (see InitPostmasterChild()), because its expected handler doesn't return, and unblockable signals like SIGCONT are not expected to arrive at a high rate. For good measure, we'll include the ftruncate() call in the blocked region, and add a retry loop. Back-patch to all supported releases. Reported-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Nicola Contu <nicola.contu@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220701154105.jjfutmngoedgiad3%40alvherre.pgsql
* Fix lock assertions in dshash.c.Thomas Munro2022-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | dshash.c previously maintained flags to be able to assert that you didn't hold any partition lock. These flags could get out of sync with reality in error scenarios. Get rid of all that, and make assertions about the locks themselves instead. Since LWLockHeldByMe() loops internally, we don't want to put that inside another loop over all partition locks. Introduce a new debugging-only interface LWLockAnyHeldByMe() to avoid that. This problem was noted by Tom and Andres while reviewing changes to support the new shared memory stats system, and later showed up in reality while working on commit 389869af. Back-patch to 11, where dshash.c arrived. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220311012712.botrpsikaufzteyt@alap3.anarazel.de Discussion: https://postgr.es/m/CA%2BhUKGJ31Wce6HJ7xnVTKWjFUWQZPBngxfJVx4q0E98pDr3kAw%40mail.gmail.com
* Fix \watch's interaction with libedit on ^C.Thomas Munro2022-07-10
| | | | | | | | | | | | | | When you hit ^C, the terminal driver in Unix-like systems echoes "^C" as well as sending an interrupt signal (depending on stty settings). At least libedit (but maybe also libreadline) is then confused about the current cursor location, and corrupts the display if you try to scroll back. Fix, by moving to a new line before the next prompt is displayed. Back-patch to all supported released. Author: Pavel Stehule <pavel.stehule@gmail.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3278793.1626198638%40sss.pgh.pa.us
* Fix alias matching in transformLockingClause().Dean Rasheed2022-07-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | When locking a specific named relation for a FOR [KEY] UPDATE/SHARE clause, transformLockingClause() finds the relation to lock by scanning the rangetable for an RTE with a matching eref->aliasname. However, it failed to account for the visibility rules of a join RTE. If a join RTE doesn't have a user-supplied alias, it will have a generated eref->aliasname of "unnamed_join" that is not visible as a relation name in the parse namespace. Such an RTE needs to be skipped, otherwise it might be found in preference to a regular base relation with a user-supplied alias of "unnamed_join", preventing it from being locked. In addition, if a join RTE doesn't have a user-supplied alias, but does have a join_using_alias, then the RTE needs to be matched using that alias rather than the generated eref->aliasname, otherwise a misleading "relation not found" error will be reported rather than a "join cannot be locked" error. Backpatch all the way, except for the second part which only goes back to 14, where JOIN USING aliases were added. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCUY_KOBnqxbTSPf=7fz9HWPnZ5Xgb9SwYzZ8rFXe7nb=w@mail.gmail.com
* Remove %error-verbose directive from jsonpath parserAndrew Dunstan2022-07-03
| | | | | | | | | | | None of the other bison parsers contains this directive, and it gives rise to some unfortunate and impenetrable messages, so just remove it. Backpatch to release 12, where it was introduced. Per gripe from Erik Rijkers Discussion: https://postgr.es/m/ba069ce2-a98f-dc70-dc17-2ccf2a9bf7c7@xs4all.nl
* Fix previous commit's ecpg_clocale for ppc Darwin.Noah Misch2022-07-02
| | | | | | | | | | Per buildfarm member prairiedog, this platform rejects uninitialized global variables in shared libraries. Back-patch to v10, like the addition of the variable. Reviewed by Tom Lane. Discussion: https://postgr.es/m/20220703030619.GB2378460@rfd.leadboat.com
* ecpglib: call newlocale() once per process.Noah Misch2022-07-02
| | | | | | | | | | | | | | | | | | | | | | | | | | ecpglib has been calling it once per SQL query and once per EXEC SQL GET DESCRIPTOR. Instead, if newlocale() has not succeeded before, call it while establishing a connection. This mitigates three problems: - If newlocale() failed in EXEC SQL GET DESCRIPTOR, the command silently proceeded without the intended locale change. - On AIX, each newlocale()+freelocale() cycle leaked memory. - newlocale() CPU usage may have been nontrivial. Fail the connection attempt if newlocale() fails. Rearrange ecpg_do_prologue() to validate the connection before its uselocale(). The sort of program that may regress is one running in an environment where newlocale() fails. If that program establishes connections without running SQL statements, it will stop working in response to this change. I'm betting against the importance of such an ECPG use case. Most SQL execution (any using ECPGdo()) has long required newlocale() success, so there's little a connection could do without newlocale(). Back-patch to v10 (all supported versions). Reviewed by Tom Lane. Reported by Guillaume Lelarge. Discussion: https://postgr.es/m/20220101074055.GA54621@rfd.leadboat.com
* Harden dsm_impl.c against unexpected EEXIST.Thomas Munro2022-07-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we trusted the OS not to report EEXIST unless we'd passed in IPC_CREAT | IPC_EXCL or O_CREAT | O_EXCL, as appropriate. Solaris's shm_open() can in fact do that, causing us to crash because we didn't ereport and then we blithely assumed the mapping was successful. Let's treat EEXIST just like any other error, unless we're actually trying to create a new segment. This applies to shm_open(), where this behavior has been seen, and also to the equivalent operations for our sysv and mmap modes just on principle. Based on the underlying reason for the error, namely contention on a lock file managed by Solaris librt for each distinct name, this problem is only likely to happen on 15 and later, because the new shared memory stats system produces shm_open() calls for the same path from potentially large numbers of backends concurrently during authentication. Earlier releases only shared memory segments between a small number of parallel workers under one Gather node. You could probably hit it if you tried hard enough though, and we should have been more defensive in the first place. Therefore, back-patch to all supported releases. Per build farm animal margay. This isn't the end of the story, though, it just changes random crashes into random "File exists" errors; more work needed for a green build farm. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com
* Fix visibility check when XID is committed in CLOG but not in procarray.Heikki Linnakangas2022-06-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TransactionIdIsInProgress had a fast path to return 'false' if the single-item CLOG cache said that the transaction was known to be committed. However, that was wrong, because a transaction is first marked as committed in the CLOG but doesn't become visible to others until it has removed its XID from the proc array. That could lead to an error: ERROR: t_xmin is uncommitted in tuple to be updated or for an UPDATE to go ahead without blocking, before the previous UPDATE on the same row was made visible. The window is usually very short, but synchronous replication makes it much wider, because the wait for synchronous replica happens in that window. Another thing that makes it hard to hit is that it's hard to get such a commit-in-progress transaction into the single item CLOG cache. Normally, if you call TransactionIdIsInProgress on such a transaction, it determines that the XID is in progress without checking the CLOG and without populating the cache. One way to prime the cache is to explicitly call pg_xact_status() on the XID. Another way is to use a lot of subtransactions, so that the subxid cache in the proc array is overflown, making TransactionIdIsInProgress rely on pg_subtrans and CLOG checks. This has been broken ever since it was introduced in 2008, but the race condition is very hard to hit, especially without synchronous replication. There were a couple of reports of the error starting from summer 2021, but no one was able to find the root cause then. TransactionIdIsKnownCompleted() is now unused. In 'master', remove it, but I left it in place in backbranches in case it's used by extensions. Also change pg_xact_status() to check TransactionIdIsInProgress(). Previously, it only checked the CLOG, and returned "committed" before the transaction was actually made visible to other queries. Note that this also means that you cannot use pg_xact_status() to reproduce the bug anymore, even if the code wasn't fixed. Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with the pg_xact_status() change added by me. Author: Simon Riggs Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru
* Fix PostgreSQL::Test aliasing for Perl v5.10.1.Noah Misch2022-06-25
| | | | | | | | | This Perl segfaults if a declaration of the to-be-aliased package precedes the aliasing itself. Per buildfarm members lapwing and wrasse. Like commit 20911775de4ab7ac3ecc68bd714cb3ed0fd68b6a, back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220625171533.GA2012493@rfd.leadboat.com
* CREATE INDEX: use the original userid for more ACL checks.Noah Misch2022-06-25
| | | | | | | | | | | | | Commit a117cebd638dd02e5c2e791c25e43745f233111b used the original userid for ACL checks located directly in DefineIndex(), but it still adopted the table owner userid for more ACL checks than intended. That broke dump/reload of indexes that refer to an operator class, collation, or exclusion operator in a schema other than "public" or "pg_catalog". Back-patch to v10 (all supported versions), like the earlier commit. Nathan Bossart and Noah Misch Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de
* For PostgreSQL::Test compatibility, alias entire package symbol tables.Noah Misch2022-06-25
| | | | | | | | | | | | Remove the need to edit back-branch-specific code sites when back-patching the addition of a PostgreSQL::Test::Utils symbol. Replace per-symbol, incomplete alias lists. Give old and new package names the same EXPORT and EXPORT_OK semantics. Back-patch to v10 (all supported versions). Reviewed by Andrew Dunstan. Discussion: https://postgr.es/m/20220622072144.GD4167527@rfd.leadboat.com
* Fix memory leak due to LogicalRepRelMapEntry.attrmap.Amit Kapila2022-06-23
| | | | | | | | | | | | | | When rebuilding the relation mapping on subscribers, we were not releasing the attribute mapping's memory which was no longer required. The attribute mapping used in logical tuple conversion was refactored in PG13 (by commit e1551f96e6) but we forgot to update the related code that frees the attribute map. Author: Hou Zhijie Reviewed-by: Amit Langote, Amit Kapila, Shi yu Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
* Fix SPI's handling of errors during transaction commit.Tom Lane2022-06-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SPI_commit previously left it up to the caller to recover from any error occurring during commit. Since that's complicated and requires use of low-level xact.c facilities, it's not too surprising that no caller got it right. Let's move the responsibility for cleanup into spi.c. Doing that requires redefining SPI_commit as starting a new transaction, so that it becomes equivalent to SPI_commit_and_chain except that you get default transaction characteristics instead of preserving the prior transaction's characteristics. We can make this pretty transparent API-wise by redefining SPI_start_transaction() as a no-op. Callers that expect to do something in between might be surprised, but available evidence is that no callers do so. Having made that API redefinition, we can fix this mess by having SPI_commit[_and_chain] trap errors and start a new, clean transaction before re-throwing the error. Likewise for SPI_rollback[_and_chain]. Some cleanup is also needed in AtEOXact_SPI, which was nowhere near smart enough to deal with SPI contexts nested inside a committing context. While plperl and pltcl need no changes beyond removing their now-useless SPI_start_transaction() calls, plpython needs some more work because it hadn't gotten the memo about catching commit/rollback errors in the first place. Such an error resulted in longjmp'ing out of the Python interpreter, which leaks Python stack entries at present and is reported to crash Python 3.11 altogether. Add the missing logic to catch such errors and convert them into Python exceptions. This is a back-patch of commit 2e517818f. That's now aged long enough to reduce the concerns about whether it will break something, and we do need to ensure that supported branches will work with Python 3.11. Peter Eisentraut and Tom Lane Discussion: https://postgr.es/m/3375ffd8-d71c-2565-e348-a597d6e739e3@enterprisedb.com Discussion: https://postgr.es/m/17416-ed8fe5d7213d6c25@postgresql.org
* Fix stale values in partition map entries on subscribers.Amit Kapila2022-06-21
| | | | | | | | | | | | | | | | | | We build the partition map entries on subscribers while applying the changes for update/delete on partitions. The component relation in each entry is closed after its use so we need to update it on successive use of cache entries. This problem was there since the original commit f1ac27bfda that introduced this code but we didn't notice it till the recent commit 26b3455afa started to use the component relation of partition map cache entry. Reported-by: Tom Lane, as per buildfarm Author: Amit Langote, Hou Zhijie Reviewed-by: Amit Kapila, Shi Yu Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
* Fix partition table's REPLICA IDENTITY checking on the subscriber.Amit Kapila2022-06-21
| | | | | | | | | | | | | | | | | | | | In logical replication, we will check if the target table on the subscriber is updatable by comparing the replica identity of the table on the publisher with the table on the subscriber. When the target table is a partitioned table, we only check its replica identity but not for the partition tables. This leads to assertion failure while applying changes for update/delete as we expect those to succeed only when the corresponding partition table has a primary key or has a replica identity defined. Fix it by checking the replica identity of the partition table while applying changes. Reported-by: Shi Yu Author: Shi Yu, Hou Zhijie Reviewed-by: Amit Langote, Amit Kapila Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
* Fix data inconsistency between publisher and subscriber.Amit Kapila2022-06-16
| | | | | | | | | | | | | | | | We were not updating the partition map cache in the subscriber even when the corresponding remote rel is changed. Due to this data was getting incorrectly replicated for partition tables after the publisher has changed the table schema. Fix it by resetting the required entries in the partition map cache after receiving a new relation mapping from the publisher. Reported-by: Shi Yu Author: Shi Yu, Hou Zhijie Reviewed-by: Amit Langote, Amit Kapila Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
* Fix cache look-up failures while applying changes in logical replication.Amit Kapila2022-06-15
| | | | | | | | | | | | | | | | | | While building a new attrmap which maps partition attribute numbers to remoterel's, we incorrectly update the map for dropped column attributes. Later, it caused cache look-up failure when we tried to use the map to fetch the information about attributes. This also fixes the partition map cache invalidation which was using the wrong type cast to fetch the entry. We were using stale partition map entry after invalidation which leads to the assertion or cache look-up failure. Reported-by: Shi Yu Author: Hou Zhijie, Shi Yu Reviewed-by: Amit Langote, Amit Kapila Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
* Avoid ecpglib core dump with out-of-order operations.Tom Lane2022-06-14
| | | | | | | | | | | | | | | | | | | If an application executed operations like EXEC SQL PREPARE without having first established a database connection, it could get a core dump instead of the expected clean failure. This occurred because we did "pthread_getspecific(actual_connection_key)" without ever having initialized the TSD key actual_connection_key. The results of that are probably platform-specific, but at least on Linux it often leads to a crash. To fix, add calls to ecpg_pthreads_init() in the code paths that might use actual_connection_key uninitialized. It's harmless (and hopefully inexpensive) to do that more than once. Per bug #17514 from Okano Naoki. The problem's ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/17514-edd4fad547c5692c@postgresql.org
* Revert "Fix psql's single transaction mode on client-side errors with -c/-f ↵Tom Lane2022-06-10
| | | | | | | | | | | | | | switches". This reverts commits a04ccf6df et al. in the back branches only. There was some disagreement already over whether to back-patch 157f8739a, on the grounds that it is the sort of behavioral change that we don't like to back-patch. Furthermore, it now looks like the logic needs some more work, which we don't have time for before the upcoming 14.4 release. Revert for now, and perhaps reconsider later. Discussion: https://postgr.es/m/17504-76b68018e130415e@postgresql.org