aboutsummaryrefslogtreecommitdiff
path: root/contrib
Commit message (Collapse)AuthorAge
* Adjust sepgsql expected output for 681d9e462 et al.Tom Lane2023-05-08
| | | | Security: CVE-2023-2454
* Replace last PushOverrideSearchPath() call with set_config_option().Noah Misch2023-05-08
| | | | | | | | | | | | | | | | | | | | | The two methods don't cooperate, so set_config_option("search_path", ...) has been ineffective under non-empty overrideStack. This defect enabled an attacker having database-level CREATE privilege to execute arbitrary code as the bootstrap superuser. While that particular attack requires v13+ for the trusted extension attribute, other attacks are feasible in all supported versions. Standardize on the combination of NewGUCNestLevel() and set_config_option("search_path", ...). It is newer than PushOverrideSearchPath(), more-prevalent, and has no known disadvantages. The "override" mechanism remains for now, for compatibility with out-of-tree code. Users should update such code, which likely suffers from the same sort of vulnerability closed here. Back-patch to v11 (all supported versions). Alexander Lakhin. Reported by Alexander Lakhin. Security: CVE-2023-2454
* In hstore_plpython, avoid crashing when return value isn't a mapping.Tom Lane2023-04-27
| | | | | | | | | | | | | | | | | | Python 3 changed the behavior of PyMapping_Check(), breaking the test in plpython_to_hstore() that verifies whether a function result to be transformed is acceptable. A backwards-compatible fix is to first verify that the object doesn't pass PySequence_Check(). Perhaps accidentally, our other uses of PyMapping_Check() already follow uses of PySequence_Check(), so that no other bugs were created by this change. Per bug #17908 from Alexander Lakhin. Back-patch to all supported branches. Dmitry Dolgov and Tom Lane Discussion: https://postgr.es/m/17908-3f19a125d56a11d6@postgresql.org
* Fix buffer refcount leak with FDW bulk insertsMichael Paquier2023-04-25
| | | | | | | | | | | | | | | | | | | | | The leak would show up when using batch inserts with foreign tables included in a partition tree, as the slots used in the batch were not reset once processed. In order to fix this problem, some ExecClearTuple() are added to clean up the slots used once a batch is filled and processed, mapping with the number of slots currently in use as tracked by the counter ri_NumSlots. This buffer refcount leak has been introduced in b676ac4 with the addition of the executor facility to improve bulk inserts for FDWs, so backpatch down to 14. Alexander has provided the patch (slightly modified by me). The test for postgres_fdw comes from me, based on the test case that the author has sent in the report. Author: Alexander Pyhalov Discussion: https://postgr.es/m/b035780a740efd38dc30790c76927255@postgrespro.ru Backpatch-through: 14
* Validate ltree siglen GiST option to be int-alignedAlexander Korotkov2023-04-23
| | | | | | | | | | | | Unaligned siglen could lead to an unaligned access to subsequent key fields. Backpatch to 13, where opclass options were introduced. Reported-by: Alexander Lakhin Bug: 17847 Discussion: https://postgr.es/m/17847-171232970bea406b%40postgresql.org Reviewed-by: Tom Lane, Pavel Borisov, Alexander Lakhin Backpatch-through: 13
* amcheck: In verify_heapam, allows tuples with xmin 0.Robert Haas2023-03-28
| | | | | | | | | | | Commit e88754a1965c0f40a723e6e46d670cacda9e19bd caused that case to be reported as corruption, but Peter Geoghegan pointed out that it can legitimately happen in the case of a speculative insertion that aborts, so we'd better not flag it as corruption after all. Back-patch to v14, like the commit that introduced the issue. Discussion: http://postgr.es/m/CAH2-WzmEabzcPTxSY-NXKH6Qt3FkAPYHGQSe2PtvGgj17ZQkCw@mail.gmail.com
* amcheck: Fix verify_heapam for tuples where xmin or xmax is 0.Robert Haas2023-03-24
| | | | | | | | | | | | | | | | | | | | | | | In such cases, get_xid_status() doesn't set its output parameter (the third argument), so we shouldn't fall through to code which will test the value of that parameter. There are five existing calls to get_xid_status(), three of which seem to already handle this case properly. This commit tries to fix the other two. If we're checking xmin and find that it is invalid (i.e. 0) just report that as corruption, similar to what's already done in the three cases that seem correct. If we're checking xmax and find that's invalid, that's fine: it just means that the tuple hasn't been updated or deleted. Thanks to Andres Freund and valgrind for finding this problem, and also to Andres for having a look at the patch. This bug seems to go all the way back to where verify_heapam was first introduced, but wasn't detected until recently, possibly because of the new test cases added for update chain verification. Back-patch to v14, where this code showed up. Discussion: http://postgr.es/m/CA+TgmoZAYzQZqyUparXy_ks3OEOfLD9-bEXt8N-2tS1qghX9gQ@mail.gmail.com
* amcheck: Fix FullTransactionIdFromXidAndCtx() for xids before epoch 0Andres Freund2023-03-11
| | | | | | | | | | | | | | | | | | | 64bit xids can't represent xids before epoch 0 (see also be504a3e974). When FullTransactionIdFromXidAndCtx() was passed such an xid, it'd create a 64bit xid far into the future. Noticed while adding assertions in the course of investigating be504a3e974, as amcheck's test create such xids. To fix the issue, just return FirstNormalFullTransactionId in this case. A freshly initdb'd cluster already has a newer horizon. The most minimal version of this would make the messages for some detected corruptions differently inaccurate. To make those cases accurate, switch FullTransactionIdFromXidAndCtx() to use the 32bit modulo difference between xid and nextxid to compute the 64bit xid, yielding sensible "in the future" / "in the past" answers. Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com> Discussion: https://postgr.es/m/20230108002923.cyoser3ttmt63bfn@awork3.anarazel.de Backpatch: 14-, where heapam verification was introduced
* amcheck: Fix ordering bug in update_cached_xid_range()Andres Freund2023-03-11
| | | | | | | | | | | | | | | | The initialization order in update_cached_xid_range() was wrong, calling FullTransactionIdFromXidAndCtx() before setting ->next_xid. FullTransactionIdFromXidAndCtx() uses ->next_xid. In most situations this will not cause visible issues, because the next call to update_cached_xid_range() will use a less wrong ->next_xid. It's rare that xids advance fast enough for this to be a problem. Found while adding more asserts to the 64bit xid infrastructure. Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com> Discussion: https://postgr.es/m/20230108002923.cyoser3ttmt63bfn@awork3.anarazel.de Backpatch: 14-, where heapam verification was introduced
* Fix misbehavior in contrib/pg_trgm with an unsatisfiable regex.Tom Lane2023-03-11
| | | | | | | | | | | | | | | | | If the regex compiler can see that a regex is unsatisfiable (for example, '$foo') then it may emit an NFA having no arcs. pg_trgm's packGraph function did the wrong thing in this case; it would access off the end of a work array, and with bad luck could produce a corrupted output data structure causing more problems later. This could end with wrong answers or crashes in queries using a pg_trgm GIN or GiST index with such a regex. Fix by not trying to de-duplicate if there aren't at least 2 arcs. Per bug #17830 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/17830-57ff5f89bdb02b09@postgresql.org
* pageinspect: Fix crash with gist_page_items()Michael Paquier2023-03-02
| | | | | | | | | | | | | | | | | | Attempting to use this function with a raw page not coming from a GiST index would cause a crash, as it was missing the same sanity checks as gist_page_items_bytea(). This slightly refactors the code so as all the basic validation checks for GiST pages are done in a single routine, in the same fashion as the pageinspect functions for hash and BRIN. This fixes an issue similar to 076f4d9. A test is added to stress for this case. While on it, I have added a similar test for brin_page_items() with a combination make of a valid GiST index and a raw btree page. This one was already protected, but it was not tested. Reported-by: Egor Chindyaskin Author: Dmitry Koval Discussion: https://postgr.es/m/17815-fc4a2d3b74705703@postgresql.org Backpatch-through: 14
* Harden postgres_fdw tests against unexpected cache flushes.Tom Lane2023-02-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | postgres_fdw will close its remote session if an sinval cache reset occurs, since it's possible that that means some FDW parameters changed. We had two tests that were trying to ensure that the session remains alive by setting debug_discard_caches = 0; but that's not sufficient. Even though the tests seem stable enough in the buildfarm, they flap a lot under CI. In the first test, which is checking the ability to recover from a lost connection, we can stabilize the results by just not caring whether pg_terminate_backend() finds a victim backend. If a reset did happen, there won't be a session to terminate anymore, but the test can proceed anyway. (Arguably, we are then not testing the unintentional-disconnect case, but as long as that scenario is exercised in most runs I think it's fine; testing the reset-driven case is of value too.) In the second test, which is trying to verify the application_name displayed in pg_stat_activity by a remote session, we had a race condition in that the remote session might go away before we can fetch its pg_stat_activity entry. We can close that race and make the test more certainly test what it intends to by arranging things so that the remote session itself fetches its pg_stat_activity entry (based on PID rather than a somewhat-circular assumption about the application name). Both tests now demonstrably pass under debug_discard_caches = 1, so we can remove that hack. Back-patch into relevant back branches. Discussion: https://postgr.es/m/20230226194340.u44bkfgyz64c67i6@awork3.anarazel.de
* Fix calculation of which GENERATED columns need to be updated.Tom Lane2023-01-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We were identifying the updatable generated columns of inheritance children by transposing the calculation made for their parent. However, there's nothing that says a traditional-inheritance child can't have generated columns that aren't there in its parent, or that have different dependencies than are in the parent's expression. (At present it seems that we don't enforce that for partitioning either, which is likely wrong to some degree or other; but the case clearly needs to be handled with traditional inheritance.) Hence, drop the very-klugy-anyway "extraUpdatedCols" RTE field in favor of identifying which generated columns depend on updated columns during executor startup. In HEAD we can remove extraUpdatedCols altogether; in back branches, it's still there but always empty. Another difference between the HEAD and back-branch versions of this patch is that in HEAD we can add the new bitmap field to ResultRelInfo, but that would cause an ABI break in back branches. Like 4b3e37993, add a List field at the end of struct EState instead. Back-patch to v13. The bogus calculation is also being made in v12, but it doesn't have the same visible effect because we don't use it to decide which generated columns to recalculate; as a consequence of which the patch doesn't apply easily. I think that there might still be a demonstrable bug associated with trigger firing conditions, but that's such a weird corner-case usage that I'm content to leave it unfixed in v12. Amit Langote and Tom Lane Discussion: https://postgr.es/m/CA+HiwqFshLKNvQUd1DgwJ-7tsTp=dwv7KZqXC4j2wYBV1aCDUA@mail.gmail.com Discussion: https://postgr.es/m/2793383.1672944799@sss.pgh.pa.us
* Fix contrib/seg to be more wary of long input numbers.Tom Lane2022-12-21
| | | | | | | | | | | | | | | | | | | | | seg stores the number of significant digits in an input number in a "char" field. If char is signed, and the input is more than 127 digits long, the count can read out as negative causing seg_out() to print garbage (or, if you're really unlucky, even crash). To fix, clamp the digit count to be not more than FLT_DIG. (In theory this loses some information about what the original input was, but it doesn't seem like useful information; it would not survive dump/restore in any case.) Also, in case there are stored values of the seg type containing bad data, add a clamp in seg_out's restore() subroutine. Per bug #17725 from Robins Tharakan. It's been like this forever, so back-patch to all supported branches. Discussion: https://postgr.es/m/17725-0a09313b67fbe86e@postgresql.org
* Fix handling of pending inserts in nodeModifyTable.c.Etsuro Fujita2022-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b663a4136, which allowed FDWs to INSERT rows in bulk, added to nodeModifyTable.c code to flush pending inserts to the foreign-table result relation(s) before completing processing of the ModifyTable node, but the code failed to take into account the case where the INSERT query has modifying CTEs, leading to incorrect results. Also, that commit failed to flush pending inserts before firing BEFORE ROW triggers so that rows are visible to such triggers. In that commit we scanned through EState's es_tuple_routing_result_relations or es_opened_result_relations list to find the foreign-table result relations to which pending inserts are flushed, but that would be inefficient in some cases. So to fix, 1) add a List member to EState to record the insert-pending result relations, and 2) modify nodeModifyTable.c so that it adds the foreign-table result relation to the list in ExecInsert() if appropriate, and flushes pending inserts properly using the list where needed. While here, fix a copy-and-pasteo in a comment in ExecBatchInsert(), which was added by that commit. Back-patch to v14 where that commit appeared. Discussion: https://postgr.es/m/CAPmGK16qutyCmyJJzgQOhfBq%3DNoGDqTB6O0QBZTihrbqre%2BoxA%40mail.gmail.com
* Revert "Prevent instability in contrib/pageinspect's regression test."Tom Lane2022-11-21
| | | | | | | | | | | | This reverts commit 5cda142bb9d2bd7e7ed1c22ae89afe58abfa8d7b (in v14 only). It turns out that that fails under force_parallel_mode = regress, because pageinspect's disk-access functions are marked parallel safe, which they are not if you try to use them on a temp table. The cost of fixing that pre-v15 seems to exceed the value of making this test case fully stable, so we will just leave things as-is in v14.
* Prevent instability in contrib/pageinspect's regression test.Tom Lane2022-11-21
| | | | | | | | | | | | | | | | | | pageinspect has occasionally failed on slow buildfarm members, with symptoms indicating that the expected effects of VACUUM FREEZE didn't happen. This is presumably because a background transaction such as auto-analyze was holding back global xmin. We can work around that by using a temp table in the test. Since commit a7212be8b, that will use an up-to-date cutoff xmin regardless of other processes. And pageinspect itself shouldn't really care whether the table is temp. Back-patch to v14. There would be no point in older branches without back-patching a7212be8b, which seems like more trouble than the problem is worth. Discussion: https://postgr.es/m/2892135.1668976646@sss.pgh.pa.us
* Replace RelationOpenSmgr() with RelationGetSmgr().Tom Lane2022-11-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a back-patch of the v15-era commit f10f0ae42 into older supported branches. The idea is to design out bugs in which an ill-timed relcache flush clears rel->rd_smgr partway through some code sequence that wasn't expecting that. We had another report today of a corner case that reliably crashes v14 under debug_discard_caches (nee CLOBBER_CACHE_ALWAYS), and therefore would crash once in a blue moon in the field. We're unlikely to get rid of all such code paths unless we adopt the more rigorous coding rules instituted by f10f0ae42. Therefore, even though this is a bit invasive, it's time to back-patch. Some comfort can be taken in the fact that f10f0ae42 has been in v15 for 16 months without problems. I left the RelationOpenSmgr macro present in the back branches, even though no core code should use it anymore, in order to not break third-party extensions in minor releases. Such extensions might opt to start using RelationGetSmgr instead, to reduce their code differential between v15 and earlier branches. This carries a hazard of failing to compile against headers from existing minor releases. However, once compiled the extension should work fine even with such releases, because RelationGetSmgr is a "static inline" function so it creates no link-time dependency. So depending on distribution practices, that might be an OK tradeoff. Per report from Spyridon Dimitrios Agathos. Original patch by Amul Sul. Discussion: https://postgr.es/m/CAFM5RaqdgyusQvmWkyPYaWMwoK5gigdtW-7HcgHgOeAw7mqJ_Q@mail.gmail.com Discussion: https://postgr.es/m/CANiYTQsU7yMFpQYnv=BrcRVqK_3U3mtAzAsJCaqtzsDHfsUbdQ@mail.gmail.com
* pg_stat_statements: fetch stmt location/length before it disappears.Tom Lane2022-11-01
| | | | | | | | | | | | | | | | | | | | When executing a utility statement, we must fetch everything we need out of the PlannedStmt data structure before calling standard_ProcessUtility. In certain cases (possibly only ROLLBACK in extended query protocol), that data structure will get freed during command execution. The situation is probably often harmless in production builds, but in debug builds we intentionally overwrite the freed memory with garbage, leading to picking up garbage values of statement location and length, typically causing an assertion failure later in pg_stat_statements. In non-debug builds, if something did go wrong it would likely lead to storing garbage for the query string. Report and fix by zhaoqigui (with cosmetic adjustments by me). It's an old problem, so back-patch to all supported versions. Discussion: https://postgr.es/m/17663-a344fd0675f92128@postgresql.org Discussion: https://postgr.es/m/1667307420050.56657@hundsun.com
* Fix executing invalidation messages generated by subtransactions during ↵Amit Kapila2022-10-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | decoding. This problem has been introduced by commit 272248a0c1 where we started assigning the subtransactions to the top-level transaction when we mark both the top-level transaction and its subtransactions as containing catalog changes. After we assign subtransactions to the top-level transaction, we were not allowed to execute any invalidations associated with it when we decide to skip the transaction. The reason to assign the subtransactions to the top-level transaction was to avoid the assertion failure in AssertTXNLsnOrder() as they have the same LSN when we sometimes start accumulating transaction changes for partial transactions after the restart. Now that with commit 64ff0fe4e8, we skip this assertion check until we reach the LSN at which we start decoding the contents of the transaction, so, there is no reason for such an assignment anymore. The assignment change was introduced in 15 and prior versions but this bug doesn't exist in branches prior to 14 since we don't add invalidation messages to subtransactions. We decided to backpatch through 11 for consistency but not for 10 since its final release is near. Reported-by: Kuroda Hayato Author: Masahiko Sawada Reviewed-by: Amit Kapila Backpatch-through: 11 Discussion: https://postgr.es/m/TYAPR01MB58660803BCAA7849C8584AA4F57E9%40TYAPR01MB5866.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/a89b46b6-0239-2fd5-71a9-b19b1f7a7145%40enterprisedb.com
* Fix assertion failures while processing NEW_CID record in logical decoding.Amit Kapila2022-10-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | When the logical decoding restarts from NEW_CID, since there is no association between the top transaction and its subtransaction, both are created as top transactions and have the same LSN. This caused the assertion failure in AssertTXNLsnOrder(). This patch skips the assertion check until we reach the LSN at which we start decoding the contents of the transaction, specifically start_decoding_at LSN in SnapBuild. This is okay because we don't guarantee to make the association between top transaction and subtransaction until we try to decode the actual contents of transaction. The ordering of the records prior to the start_decoding_at LSN should have been checked before the restart. The other assertion failure is due to the reason that we forgot to track that we have considered top-level transaction id in the list of catalog changing transactions that were committed when one of its subtransactions is marked as containing catalog change. Reported-by: Tomas Vondra, Osumi Takamichi Author: Masahiko Sawada, Kuroda Hayato Reviewed-by: Amit Kapila, Dilip Kumar, Kuroda Hayato, Kyotaro Horiguchi, Masahiko Sawada Backpatch-through: 10 Discussion: https://postgr.es/m/a89b46b6-0239-2fd5-71a9-b19b1f7a7145%40enterprisedb.com Discussion: https://postgr.es/m/TYCPR01MB83733C6CEAE47D0280814D5AED7A9%40TYCPR01MB8373.jpnprd01.prod.outlook.com
* postgres_fdw: Avoid 'variable not found in subplan target list' error.Etsuro Fujita2022-09-14
| | | | | | | | | | | | | | | | | | | | | | The tlist of the EvalPlanQual outer plan for a ForeignScan node is adjusted to produce a tuple whose descriptor matches the scan tuple slot for the ForeignScan node. But in the case where the outer plan contains an extra Sort node, if the new tlist contained columns required only for evaluating PlaceHolderVars or columns required only for evaluating local conditions, this would cause setrefs.c to fail with the error. The cause of this is that when creating the outer plan by injecting the Sort node into an alternative local join plan that could emit such extra columns as well, we fail to arrange for the outer plan to propagate them up through the Sort node, causing setrefs.c to fail to match up them in the new tlist to what is available from the outer plan. Repair. Per report from Alexander Pyhalov. Richard Guo and Etsuro Fujita, reviewed by Alexander Pyhalov and Tom Lane. Backpatch to all supported versions. Discussion: http://postgr.es/m/cfb17bf6dfdf876467bd5ef533852d18%40postgrespro.ru
* Reject bogus output from uuid_create(3).Tom Lane2022-09-09
| | | | | | | | | | | | | | | | | | | | | | When using the BSD UUID functions, contrib/uuid-ossp expects uuid_create() to produce a version-1 UUID. FreeBSD still does so, but in recent NetBSD releases that function produces a version-4 (random) UUID instead. That's not acceptable for our purposes: if the user wanted v4 she would have asked for v4, not v1. Hence, check the version digit and complain if it's not '1'. Also drop the documentation's claim that the NetBSD implementation is usable. It might be, depending on which OS version you're using, but we're not going to get into that kind of detail. (Maybe someday we should ditch all these external libraries and just write our own UUID code, but today is not that day.) Nazir Bilal Yavuz, with cosmetic adjustments and docs by me. Backpatch to all supported versions. Discussion: https://postgr.es/m/3848059.1661038772@sss.pgh.pa.us Discussion: https://postgr.es/m/17358-89806e7420797025@postgresql.org
* Fix catalog lookup with the wrong snapshot during logical decoding.Amit Kapila2022-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION records to know if the transaction has modified the catalog, and that information is not serialized to snapshot. Therefore, after the restart, if the logical decoding decodes only the commit record of the transaction that has actually modified a catalog, we will miss adding its XID to the snapshot. Thus, we will end up looking at catalogs with the wrong snapshot. To fix this problem, this changes the snapshot builder so that it remembers the last-running-xacts list of the decoded RUNNING_XACTS record after restoring the previously serialized snapshot. Then, we mark the transaction as containing catalog changes if it's in the list of initial running transactions and its commit record has XACT_XINFO_HAS_INVALS. To avoid ABI breakage, we store the array of the initial running transactions in the static variables InitialRunningXacts and NInitialRunningXacts, instead of storing those in SnapBuild or ReorderBuffer. This approach has a false positive; we could end up adding the transaction that didn't change catalog to the snapshot since we cannot distinguish whether the transaction has catalog changes only by checking the COMMIT record. It doesn't have the information on which (sub) transaction has catalog changes, and XACT_XINFO_HAS_INVALS doesn't necessarily indicate that the transaction has catalog change. But that won't be a problem since we use snapshot built during decoding only to read system catalogs. On the master branch, we took a more future-proof approach by writing catalog modifying transactions to the serialized snapshot which avoids the above false positive. But we cannot backpatch it because of a change in the SnapBuild. Reported-by: Mike Oh Author: Masahiko Sawada Reviewed-by: Amit Kapila, Shi yu, Takamichi Osumi, Kyotaro Horiguchi, Bertrand Drouvot, Ahsan Hadi Backpatch-through: 10 Discussion: https://postgr.es/m/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com
* postgres_fdw: Disable batch insertion when there are WCO constraints.Etsuro Fujita2022-08-05
| | | | | | | | | | | | | | | | | | | | | | When inserting a view referencing a foreign table that has WITH CHECK OPTION constraints, in single-insert mode postgres_fdw retrieves the data that was actually inserted on the remote side so that the WITH CHECK OPTION constraints are enforced with the data locally, but in batch-insert mode it cannot currently retrieve the data (except for the row first inserted through the view), resulting in enforcing the WITH CHECK OPTION constraints with the data passed from the core (except for the first-inserted row), which led to incorrect results when inserting into a view referencing a foreign table in which a remote BEFORE ROW INSERT trigger changes the rows inserted through the view so that they violate the view's WITH CHECK OPTION constraint. Also, the query inserting into the view caused an assertion failure in assert-enabled builds. Fix these by disabling batch insertion when inserting into such a view. Back-patch to v14 where batch insertion was added. Discussion: https://postgr.es/m/CAPmGK17LpbTZs4m4a_6THP54UBeK9fHvX8aVVA%2BC6yEZDZwQcg%40mail.gmail.com
* Be more wary about 32-bit integer overflow in pg_stat_statements.Tom Lane2022-08-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've heard a couple of reports of people having trouble with multi-gigabyte-sized query-texts files. It occurred to me that on 32-bit platforms, there could be an issue with integer overflow of calculations associated with the total query text size. Address that with several changes: 1. Limit pg_stat_statements.max to INT_MAX / 2 not INT_MAX. The hashtable code will bound it to that anyway unless "long" is 64 bits. We still need overflow guards on its use, but this helps. 2. Add a check to prevent extending the query-texts file to more than MaxAllocHugeSize. If it got that big, qtext_load_file would certainly fail, so there's not much point in allowing it. Without this, we'd need to consider whether extent, query_offset, and related variables shouldn't be off_t not size_t. 3. Adjust the comparisons in need_gc_qtexts() to be done in 64-bit arithmetic on all platforms. It appears possible that under duress those multiplications could overflow 32 bits, yielding a false conclusion that we need to garbage-collect the texts file, which could lead to repeatedly garbage-collecting after every hash table insertion. Per report from Bruno da Silva. I'm not convinced that these issues fully explain his problem; there may be some other bug that's contributing to the query-texts file becoming so large in the first place. But it did get that big, so #2 is a reasonable defense, and #3 could explain the reported performance difficulties. (See also commit 8bbe4cbd9, which addressed some related bugs. The second Discussion: link is the thread that led up to that.) This issue is old, and is primarily a problem for old platforms, so back-patch. Discussion: https://postgr.es/m/CAB+Nuk93fL1Q9eLOCotvLP07g7RAv4vbdrkm0cVQohDVMpAb9A@mail.gmail.com Discussion: https://postgr.es/m/5601D354.5000703@BlueTreble.com
* postgres_fdw: Fix bug in checking of return value of PQsendQuery().Fujii Masao2022-07-22
| | | | | | | | | | | | | | | | | | When postgres_fdw begins an asynchronous data fetch, it submits FETCH query by using PQsendQuery(). If PQsendQuery() fails and returns 0, postgres_fdw should report an error. But, previously, postgres_fdw reported an error only when the return value is less than 0, though PQsendQuery() never return the values other than 0 and 1. Therefore postgres_fdw could not handle the failure to send FETCH query in an asynchronous data fetch. This commit fixes postgres_fdw so that it reports an error when PQsendQuery() returns 0. Back-patch to v14 where asynchronous execution was supported in postgres_fdw. Author: Fujii Masao Reviewed-by: Japin Li, Tom Lane Discussion: https://postgr.es/m/b187a7cf-d4e3-5a32-4d01-8383677797f3@oss.nttdata.com
* postgres_fdw: set search_path to 'pg_catalog' while deparsing constants.Tom Lane2022-07-17
| | | | | | | | | | | | | The motivation for this is to ensure successful transmission of the values of constants of regconfig and other reg* types. The remote will be reading them with search_path = 'pg_catalog', so schema qualification is necessary when referencing objects in other schemas. Per bug #17483 from Emmanuel Quincerot. Back-patch to all supported versions. (There's some other stuff to do here, but it's less back-patchable.) Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
* CREATE INDEX: use the original userid for more ACL checks.Noah Misch2022-06-25
| | | | | | | | | | | | | Commit a117cebd638dd02e5c2e791c25e43745f233111b used the original userid for ACL checks located directly in DefineIndex(), but it still adopted the table owner userid for more ACL checks than intended. That broke dump/reload of indexes that refer to an operator class, collation, or exclusion operator in a schema other than "public" or "pg_catalog". Back-patch to v10 (all supported versions), like the earlier commit. Nathan Bossart and Noah Misch Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de
* Silence compiler warnings from some older compilers.Tom Lane2022-06-01
| | | | | | | | | | | | Since a117cebd6, some older gcc versions issue "variable may be used uninitialized in this function" complaints for brin_summarize_range. Silence that using the same coding pattern as in bt_index_check_internal; arguably, a117cebd6 had too narrow a view of which compilers might give trouble. Nathan Bossart and Tom Lane. Back-patch as the previous commit was. Discussion: https://postgr.es/m/20220601163537.GA2331988@nathanxps13
* Make relation-enumerating operations be security-restricted operations.Noah Misch2022-05-09
| | | | | | | | | | | | | | | | | | When a feature enumerates relations and runs functions associated with all found relations, the feature's user shall not need to trust every user having permission to create objects. BRIN-specific functionality in autovacuum neglected to account for this, as did pg_amcheck and CLUSTER. An attacker having permission to create non-temp objects in at least one schema could execute arbitrary SQL functions under the identity of the bootstrap superuser. CREATE INDEX (not a relation-enumerating operation) and REINDEX protected themselves too late. This change extends to the non-enumerating amcheck interface. Back-patch to v10 (all supported versions). Sergey Shinderuk, reviewed (in earlier versions) by Alexander Lakhin. Reported by Alexander Lakhin. Security: CVE-2022-1552
* Fix back-patch of "Under has_wal_read_bug, skip .../001_wal.pl."Noah Misch2022-05-07
| | | | | Per buildfarm members tadarida, snapper, and kittiwake. Back-patch to v10 (all supported versions).
* Under has_wal_read_bug, skip contrib/bloom/t/001_wal.pl.Noah Misch2022-05-07
| | | | | | | Per buildfarm members snapper and kittiwake. Back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220116210241.GC756210@rfd.leadboat.com
* Disable asynchronous execution if using gating Result nodes.Etsuro Fujita2022-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | mark_async_capable_plan(), which is called from create_append_plan() to determine whether subplans are async-capable, failed to take into account that the given subplan created from a given subpath might include a gating Result node if the subpath is a SubqueryScanPath or ForeignPath, causing a segmentation fault there when the subplan created from a SubqueryScanPath includes the Result node, or causing ExecAsyncRequest() to throw an error about an unrecognized node type when the subplan created from a ForeignPath includes the Result node, because in the latter case the Result node was unintentionally considered as async-capable, but we don't currently support executing Result nodes asynchronously. Fix by modifying mark_async_capable_plan() to disable asynchronous execution in such cases. Also, adjust code in the ProjectionPath case in mark_async_capable_plan(), for consistency with other cases, and adjust/improve comments there. is_async_capable_path() added in commit 27e1f1456, which was rewritten to mark_async_capable_plan() in a later commit, has the same issue, causing the error at execution mentioned above, so back-patch to v14 where the aforesaid commit went in. Per report from Justin Pryzby. Etsuro Fujita, reviewed by Zhihong Yu and Justin Pryzby. Discussion: https://postgr.es/m/20220408124338.GK24419%40telsasoft.com
* postgres_fdw: Disable batch insert when BEFORE ROW INSERT triggers exist.Etsuro Fujita2022-04-21
| | | | | | | | | | | Previously, we allowed this, but such triggers might query the table to insert into and act differently if the tuples that have already been processed and prepared for insertion are not there, so disable it in such cases. Back-patch to v14 where batch insert was added. Discussion: https://postgr.es/m/CAPmGK16_uPqsmgK0-LpLSUk54_BoK13bPrhxhfjSoSTVz414hA%40mail.gmail.com
* Stabilize streaming tests in test_decoding.Amit Kapila2022-04-20
| | | | | | | | | | | We have some streaming tests that rely on the size of changes which can fail if there are additional changes like invalidation messages by background activity like auto analyze. Avoid such failures by increasing autovacuum_naptime to a reasonably high value (1d). Author: Dilip Kumar Backpatch-through: 14 Discussion: https://postgr.es/m/1958043.1650129119@sss.pgh.pa.us
* pageinspect: Fix handling of all-zero pagesMichael Paquier2022-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Getting from get_raw_page() an all-zero page is considered as a valid case by the buffer manager and it can happen for example when finding a corrupted page with zero_damaged_pages enabled (using zero_damaged_pages to look at corrupted pages happens), or after a crash when a relation file is extended before any WAL for its new data is generated (before a vacuum or autovacuum job comes in to do some cleanup). However, all the functions of pageinspect, as of the index AMs (except hash that has its own idea of new pages), heap, the FSM or the page header have never worked with all-zero pages, causing various crashes when going through the page internals. This commit changes all the pageinspect functions to be compliant with all-zero pages, where the choice is made to return NULL or no rows for SRFs when finding a new page. get_raw_page() still works the same way, returning a batch of zeros in the bytea of the page retrieved. A hard error could be used but NULL, while more invasive, is useful when scanning relation files in full to get a batch of results for a single relation in one query. Tests are added for all the code paths impacted. Reported-by: Daria Lepikhova Author: Michael Paquier Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru Backpatch-through: 10
* Fix postgres_fdw to check shippability of sort clauses properly.Tom Lane2022-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | postgres_fdw would push ORDER BY clauses to the remote side without verifying that the sort operator is safe to ship. Moreover, it failed to print a suitable USING clause if the sort operator isn't default for the sort expression's type. The net result of this is that the remote sort might not have anywhere near the semantics we expect, which'd be disastrous for locally-performed merge joins in particular. We addressed similar issues in the context of ORDER BY within an aggregate function call in commit 7012b132d, but failed to notice that query-level ORDER BY was broken. Thus, much of the necessary logic already existed, but it requires refactoring to be usable in both cases. Back-patch to all supported branches. In HEAD only, remove the core code's copy of find_em_expr_for_rel, which is no longer used and really should never have been pushed into equivclass.c in the first place. Ronan Dunklau, per report from David Rowley; reviews by David Rowley, Ranier Vilela, and myself Discussion: https://postgr.es/m/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com
* pageinspect: Add more sanity checks to prevent out-of-bound readsMichael Paquier2022-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A couple of code paths use the special area on the page passed by the function caller, expecting to find some data in it. However, feeding an incorrect page can lead to out-of-bound reads when trying to access the page special area (like a heap page that has no special area, leading PageGetSpecialPointer() to grab a pointer outside the allocated page). The functions used for hash and btree indexes have some protection already against that, while some other functions using a relation OID as argument would make sure that the access method involved is correct, but functions taking in input a raw page without knowing the relation the page is attached to would run into problems. This commit improves the set of checks used in the code paths of BRIN, btree (including one check if a leaf page is found with a non-zero level), GIN and GiST to verify that the page given in input has a special area size that fits with each access method, which is done though PageGetSpecialSize(), becore calling PageGetSpecialPointer(). The scope of the checks done is limited to work with pages that one would pass after getting a block with get_raw_page(), as it is possible to craft byteas that could bypass existing code paths. Having too many checks would also impact the usability of pageinspect, as the existing code is very useful to look at the content details in a corrupted page, so the focus is really to avoid out-of-bound reads as this is never a good thing even with functions whose execution is limited to superusers. The safest approach could be to rework the functions so as these fetch a block using a relation OID and a block number, but there are also cases where using a raw page is useful. Tests are added to cover all the code paths that needed such checks, and an error message for hash indexes is reworded to fit better with what this commit adds. Reported-By: Alexander Lakhin Author: Julien Rouhaud, Michael Paquier Discussion: https://postgr.es/m/16527-ef7606186f0610a1@postgresql.org Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru Backpatch-through: 10
* Harden TAP tests that intentionally corrupt page checksums.Tom Lane2022-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous method for doing that was to write zeroes into a predetermined set of page locations. However, there's a roughly 1-in-64K chance that the existing checksum will match by chance, and yesterday several buildfarm animals started to reproducibly see that, resulting in test failures because no checksum mismatch was reported. Since the checksum includes the page LSN, test success depends on the length of the installation's WAL history, which is affected by (at least) the initial catalog contents, the set of locales installed on the system, and the length of the pathname of the test directory. Sooner or later we were going to hit a chance match, and today is that day. Harden these tests by specifically inverting the checksum field and leaving all else alone, thereby guaranteeing that the checksum is incorrect. In passing, fix places that were using seek() to set up for syswrite(), a combination that the Perl docs very explicitly warn against. We've probably escaped problems because no regular buffered I/O is done on these filehandles; but if it ever breaks, we wouldn't deserve or get much sympathy. Although we've only seen problems in HEAD, now that we recognize the environmental dependencies it seems like it might be just a matter of time until someone manages to hit this in back-branch testing. Hence, back-patch to v11 where we started doing this kind of test. Discussion: https://postgr.es/m/3192026.1648185780@sss.pgh.pa.us
* Fix default signature length for gist_ltree_opsAlexander Korotkov2022-03-16
| | | | | | | | | | | | | | | | | | | | 911e702077 implemented operator class parameters including the signature length in ltree. Previously, the signature length for gist_ltree_ops was 8. Because of bug 911e702077 the default signature length for gist_ltree_ops became 28 for ltree 1.1 (where options method is NOT provided) and 8 for ltree 1.2 (where options method is provided). This commit changes the default signature length for ltree 1.1 to 8. Existing gist_ltree_ops indexes might be corrupted in various scenarios. Thus, we have to recommend reindexing all the gist_ltree_ops indexes after the upgrade. Reported-by: Victor Yegorov Reviewed-by: Tomas Vondra, Tom Lane, Andres Freund, Nikita Glukhov Reviewed-by: Andrew Dunstan Author: Tomas Vondra, Alexander Korotkov Discussion: https://postgr.es/m/17406-71e02820ae79bb40%40postgresql.org Discussion: https://postgr.es/m/d80e0a55-6c3e-5b26-53e3-3c4f973f737c%40enterprisedb.com
* pageinspect: Fix memory context allocation of page in brin_revmap_data()Michael Paquier2022-03-16
| | | | | | | | | | | | | | This caused the function to fail, as the aligned copy of the raw page given by the function caller was not saved in the correct memory context, which needs to be multi_call_memory_ctx in this case. Issue introduced by 076f4d9. Per buildfarm members sifika, mylodon and longfin. I have reproduced that locally with macos. Discussion: https://postgr.es/m/YjFPOtfCW6yLXUeM@paquier.xyz Backpatch-through: 10
* pageinspect: Fix handling of page sizes and AM typesMichael Paquier2022-03-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit fixes a set of issues related to the use of the SQL functions in this module when the caller is able to pass down raw page data as input argument: - The page size check was fuzzy in a couple of places, sometimes looking after only a sub-range, but what we are looking for is an exact match on BLCKSZ. After considering a few options here, I have settled down to do a generalization of get_page_from_raw(). Most of the SQL functions already used that, and this is not strictly required if not accessing an 8-byte-wide value from a raw page, but this feels safer in the long run for alignment-picky environment, particularly if a code path begins to access such values. This also reduces the number of strings that need to be translated. - The BRIN function brin_page_items() uses a Relation but it did not check the access method of the opened index, potentially leading to crashes. All the other functions in need of a Relation already did that. - Some code paths could fail on elog(), but we should to use ereport() for failures that can be triggered by the user. Tests are added to stress all the cases that are fixed as of this commit, with some junk raw pages (\set VERBOSITY ensures that this works across all page sizes) and unexpected index types when functions open relations. Author: Michael Paquier, Justin Prysby Discussion: https://postgr.es/m/20220218030020.GA1137@telsasoft.com Backpatch-through: 10
* Introduce PG_TEST_TIMEOUT_DEFAULT for TAP suite non-elapsing timeouts.Noah Misch2022-03-04
| | | | | | | | | | | | Slow hosts may avoid load-induced, spurious failures by setting environment variable PG_TEST_TIMEOUT_DEFAULT to some number of seconds greater than 180. Developers may see faster failures by setting that environment variable to some lesser number of seconds. In tests, write $PostgreSQL::Test::Utils::timeout_default wherever the convention has been to write 180. This change raises the default for some briefer timeouts. Back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220218052842.GA3627003@rfd.leadboat.com
* Clean up assorted failures under clang's -fsanitize=undefined checks.Tom Lane2022-03-03
| | | | | | | | | | | | | | | | | | | | | | Most of these are cases where we could call memcpy() or other libc functions with a NULL pointer and a zero count, which is forbidden by POSIX even though every production version of libc allows it. We've fixed such things before in a piecemeal way, but apparently never made an effort to try to get them all. I don't claim that this patch does so either, but it gets every failure I observe in check-world, using clang 12.0.1 on current RHEL8. numeric.c has a different issue that the sanitizer doesn't like: "ln(-1.0)" will compute log10(0) and then try to assign the resulting -Inf to an integer variable. We don't actually use the result in such a case, so there's no live bug. Back-patch to all supported branches, with the idea that we might start running a buildfarm member that tests this case. This includes back-patching c1132aae3 (Check the size in COPY_POINTER_FIELD), which previously silenced some of these issues in copyfuncs.c. Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com
* WAL log unchanged toasted replica identity key attributes.Amit Kapila2022-02-14
| | | | | | | | | | | | | | | Currently, during UPDATE, the unchanged replica identity key attributes are not logged separately because they are getting logged as part of the new tuple. But if they are stored externally then the untoasted values are not getting logged as part of the new tuple and logical replication won't be able to replicate such UPDATEs. So we need to log such attributes as part of the old_key_tuple during UPDATE. Reported-by: Haiying Tang Author: Dilip Kumar and Amit Kapila Reviewed-by: Alvaro Herrera, Haiying Tang, Andres Freund Backpatch-through: 10 Discussion: https://postgr.es/m/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com
* Use Test::Builder::todo_start(), replacing $::TODO.Noah Misch2022-02-09
| | | | | | | | | | Some pre-2017 Test::More versions need perfect $Test::Builder::Level maintenance to find the variable. Buildfarm member snapper reported an overall failure that the file intended to hide via the TODO construct. That trouble was reachable in v11 and v10. For later branches, this serves as defense in depth. Back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220202055556.GB2745933@rfd.leadboat.com
* postgres_fdw: Fix handling of a pending asynchronous request in ↵Etsuro Fujita2022-01-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | postgresReScanForeignScan(). Commit 27e1f1456 failed to process a pending asynchronous request made for a given ForeignScan node in postgresReScanForeignScan() (if any) in cases where we would only reset the next_tuple counter in that function, contradicting the assumption that there should be no pending asynchronous requests that have been made for async-capable subplans for the parent Append node after ReScan. This led to an assert failure in an assert-enabled build. I think this would also lead to mis-rewinding the cursor in that function in the case where we have already fetched one batch for the ForeignScan node and the asynchronous request has been made for the second batch, because even in that case we would just reset the counter when called from that function, so we would fail to execute MOVE BACKWARD ALL. To fix, modify that function to process the asynchronous request before restarting the scan. While at it, add a comment to a function to match other places. Per bug #17344 from Alexander Lakhin. Back-patch to v14 where the aforesaid commit came in. Patch by me. Test case by Alexander Lakhin, adjusted by me. Reviewed and tested by Alexander Lakhin and Dmitry Dolgov. Discussion: https://postgr.es/m/17344-226b78b00de73a7e@postgresql.org
* On sparc64+ext4, suppress test failures from known WAL read failure.Noah Misch2022-01-26
| | | | | | | | | | Buildfarm members kittiwake, tadarida and snapper began to fail frequently when commits 3cd9c3b921977272e6650a5efbeade4203c4bca2 and f47ed79cc8a0cfa154dc7f01faaf59822552363f added tests of concurrency, but the problem was reachable before those commits. Back-patch to v10 (all supported versions). Discussion: https://postgr.es/m/20220116210241.GC756210@rfd.leadboat.com
* postgres_fdw: Fix subabort cleanup of connections used in asynchronous ↵Etsuro Fujita2022-01-21
| | | | | | | | | | | | | | | | | | | | | | | execution. Commit 27e1f1456 resets the per-connection states of connections used to scan foreign tables asynchronously during abort cleanup at main transaction end, but it failed to do so during subabort cleanup at subtransaction end, leading to a segmentation fault when re-executing an asynchronous-foreign-table-scan query in a transaction that was cancelled in a subtransaction of it. Fix by modifying pgfdw_abort_cleanup() to reset the per-connection state of a given connection also when called for subabort cleanup. Also, modify that function to do the reset in both the abort-cleanup and subabort-cleanup cases if necessary, to save cycles, and improve a comment on it a little bit. Back-patch to v14 where the aforesaid commit came in. Reviewed by Alexander Pyhalov Discussion: https://postgr.es/m/CAPmGK14cCV-JA7kNsyt2EUTKvZ4xkr2LNRthi1U1C3cqfGppAw@mail.gmail.com