aboutsummaryrefslogtreecommitdiff
path: root/doc/src
Commit message (Collapse)AuthorAge
...
* Doc: add information about partition lockingDavid Rowley2025-04-02
| | | | | | | | | | The documentation around locking of partitions for the executor startup phase of run-time partition pruning wasn't clear about which partitions were being locked. Fix that. Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAApHDvp738G75HfkKcfXaf3a8s%3D6mmtOLh46tMD0D2hAo1UCzA%40mail.gmail.com Backpatch-through: 13
* Introduce a SQL-callable function array_sort(anyarray).Tom Lane2025-04-01
| | | | | | | | | | | | | | | | Create a function that will sort the elements of an array according to the element type's sort order. If the array has more than one dimension, the sub-arrays of the first dimension are sorted per normal array-comparison rules, leaving their contents alone. In support of this, add pg_type.typarray to the set of fields cached by the typcache. Author: Junwang Zhao <zhjwpku@gmail.com> Co-authored-by: Jian He <jian.universality@gmail.com> Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://postgr.es/m/CAEG8a3J41a4dpw_-F94fF-JPRXYxw-GfsgoGotKcjs9LVfEEvw@mail.gmail.com
* doc: Adjust some notes about pg_upgrade's file transfer modes.Nathan Bossart2025-04-01
| | | | | | | | | --copy-file-range and --swap were not mentioned in a few places that discuss the available file transfer modes. This entire page would likely benefit from an overhaul, but that's v19 material at this point. Oversights in commits d93627bcbe and 626d7236b6.
* aio: Add pg_aios viewAndres Freund2025-04-01
| | | | | | | | | | The new view lists all IO handles that are currently in use and is mainly useful for PG developers, but may also be useful when tuning PG. Bumps catversion. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
* docs: Add acronym and glossary entries for I/O and AIOAndres Freund2025-04-01
| | | | | | | | | These are fairly basic, but better than nothing. While there are several opportunities to link to these entries, this patch does not add any. They will however be referenced by future patches. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20250326183102.92.nmisch@google.com
* doc: Mention clock synchronization recommendation for hot_standby_feedbackPeter Eisentraut2025-03-31
| | | | | | | | | | | hot_standby_feedback mechanics assume that clocks are synchronized, but it was not clear from documentation. Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CAKZiRmwBcALLrDgCyEhHP1enUxtPMjyNM_d1A2Lng3_6Rf4Qfw%40mail.gmail.com
* Enable IO concurrency on all systemsAndres Freund2025-03-30
| | | | | | | | | | | | | | | Previously effective_io_concurrency and maintenance_io_concurrency could not be set above 0 on machines without fadvise support. AIO enables IO concurrency without such support, via io_method=worker. Currently only subsystems using the read stream API will take advantage of this. Other users of maintenance_io_concurrency (like recovery prefetching) which leverage OS advice directly will not benefit from this change. In those cases, maintenance_io_concurrency will have no effect on I/O behavior. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/CAAKRu_atGgZePo=_g6T3cNtfMf0QxpvoUh5OUqa_cnPdhLd=gw@mail.gmail.com
* docs: Reframe track_io_timing related docs as wait timeAndres Freund2025-03-30
| | | | | | | | | | | | | With AIO it does not make sense anymore to track the time for each individual IO, as multiple IOs can be in-flight at the same time. Instead we now track the time spent *waiting* for IOs. This should be reflected in the docs. While, so far, we only do a subset of reads, and no other operations, via AIO, describing the GUC and view columns as measuring IO waits is accurate for synchronous and asynchronous IO. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/5dzyoduxlvfg55oqtjyjehez5uoq6hnwgzor4kkybkfdgkj7ag@rbi4gsmzaczk
* amcheck: Add gin_index_check() to verify GIN indexTomas Vondra2025-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds a new function, validating two kinds of invariants on a GIN index: - parent-child consistency: Paths in a GIN graph have to contain consistent keys. Tuples on parent pages consistently include tuples from child pages; parent tuples do not require any adjustments. - balanced-tree / graph: Each internal page has at least one downlink, and can reference either only leaf pages or only internal pages. The GIN verification is based on work by Grigory Kryachko, reworked by Heikki Linnakangas and with various improvements by Andrey Borodin. Investigation and fixes for multiple bugs by Kirill Reshke. Author: Grigory Kryachko <GSKryachko@gmail.com> Author: Heikki Linnakangas <hlinnaka@iki.fi> Author: Andrey Borodin <amborodin@acm.org> Reviewed-By: José Villanova <jose.arthur@gmail.com> Reviewed-By: Aleksander Alekseev <aleksander@timescale.com> Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-By: Kirill Reshke <reshkekirill@gmail.com> Reviewed-By: Mark Dilger <mark.dilger@enterprisedb.com> Reviewed-By: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
* pg_createsubscriber: Add '--all' option.Amit Kapila2025-03-28
| | | | | | | | | | | | | | | | | | | | | | | | The '--all' option indicates that the tool queries the source server (publisher) for all databases and creates subscriptions on the target server (subscriber) for databases with matching names. Without this user needs to explicitly specify all databases by using -d option for each database. This simplifies converting a physical standby to a logical subscriber, particularly during upgrades. The options '--database', '--publication', '--subscription', and '--replication-slot' cannot be used when '--all' is specified. Author: Shubham Khanna <khannashubham1197@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Discussion: https://postgr.es/m/CAHv8RjKhA=_h5vAbozzJ1Opnv=KXYQHQ-fJyaMfqfRqPpnC2bA@mail.gmail.com
* Remove the query_id_squash_values GUCÁlvaro Herrera2025-03-27
| | | | | | | | | | | | | Commit 62d712ecfd94 introduced the capability to calculate the same queryId for queries with different lengths of constants in a list for an IN clause. This behavior was originally enabled with a GUC query_id_squash_values. After a discussion about the value of such a GUC, it was decided to back out of the use of a GUC and make the squashing behavior the only available option. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z-LZyygkkNyA8-kR@msg.df7cb.de Discussion: https://postgr.es/m/CA+q6zcVTK-3C-8NWV1oY2NZrvtnMCDqnyYYyk1T7WMUG65MeOQ@mail.gmail.com
* Simplify syntax for ALTER TABLE ALTER CONSTRAINT NO INHERITÁlvaro Herrera2025-03-27
| | | | | | | | | | | | | | | Commit d45597f72fe5 introduced the ability to change a not-null constraint from NO INHERIT to INHERIT and vice versa, but we included the SET noise word in the syntax for it. The SET turns out not to be necessary and goes against what the SQL standard says for other ALTER TABLE subcommands, so remove it. This changes the way this command is processed for constraint types other than not-null, so there are some error message changes. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Suraj Kharage <suraj.kharage@enterprisedb.com> Discussion: https://postgr.es/m/202503251602.vsxaehsyaoac@alvherre.pgsql
* doc: Correct description of values used in FSM for indexesMichael Paquier2025-03-27
| | | | | | | | | | | The implementation of FSM for indexes is simpler than heap, where 0 is used to track if a page is in-use and (BLCKSZ - 1) if a page is free. One comment in indexfsm.c and one description in the documentation of pg_freespacemap were incorrect about that. Author: Alex Friedman <alexf01@gmail.com> Discussion: https://postgr.es/m/71eef655-c192-453f-ac45-2772fec2cb04@gmail.com Backpatch-through: 13
* aio: Add io_method=io_uringAndres Freund2025-03-26
| | | | | | | | | | | | | | | | | | | | | | Performing AIO using io_uring can be considerably faster than io_method=worker, particularly when lots of small IOs are issued, as a) the context-switch overhead for worker based AIO becomes more significant b) the number of IO workers can become limiting io_uring, however, is linux specific and requires an additional compile-time dependency (liburing). This implementation is fairly simple and there are substantial optimization opportunities. The description of the existing AIO_IO_COMPLETION wait event is updated to make the difference between it and the new AIO_IO_URING_EXECUTION clearer. Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
* aio: Add liburing dependencyAndres Freund2025-03-26
| | | | | | | | Will be used in a subsequent commit, to implement io_method=io_uring. Kept separate for easier review. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
* doc: Mention possible ephemeral discrepancies in pg_stat_activityMichael Paquier2025-03-27
| | | | | | | | | | | | Ephemeral inconsistencies across multiple attributes of pg_stat_activity can exist as the system is designed to be efficient with a low overhead. This question is raised by users from time to time based on the data read in the view, so let's add a note in the docs about this possibility. Author: Alex Friedman <alexf01@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/8a275154-a654-44b0-ab37-197802f04c7b@gmail.com
* Fix oversights in commit 8d5ceb113e3f7ddb627bd40b26438a9d2fa05512Robert Haas2025-03-26
| | | | | | | | | It added bogus whitespace at the end of a line in the documentation. It should not have done that. The pg_overexplain tests must SET debug_parallel_query = false, not just RESET debug_parallel_query, or we get failures on test machines that make debug_parallel_query = true the defualt.
* pg_overexplain: Additional EXPLAIN options for debugging.Robert Haas2025-03-26
| | | | | | | | | | | | | | | | | | | | There's a fair amount of information in the Plan and PlanState trees that isn't printed by any existing EXPLAIN option. This means that, when working on the planner, it's often necessary to rely on facilities such as debug_print_plan, which produce excessively voluminous output. Hence, use the new EXPLAIN extension facilities to implement EXPLAIN (DEBUG) and EXPLAIN (RANGE_TABLE) as extensions to the core EXPLAIN facility. A great deal more could be done here, and the specific choices about what to print and how are definitely arguable, but this is at least a starting point for discussion and a jumping-off point for possible future improvements. Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviweed-by: Andrei Lepikhov <lepihov@gmail.com> (who didn't like it) Discussion: http://postgr.es/m/CA+TgmoZfvQUBWQ2P8iO30jywhfEAKyNzMZSR+uc2xr9PZBw6eQ@mail.gmail.com
* Introduce PG_MODULE_MAGIC_EXT macro.Tom Lane2025-03-26
| | | | | | | | | | | | | | | | | | This macro allows dynamically loaded shared libraries (modules) to provide a wired-in module name and version, and possibly other compile-time-constant fields in future. This information can be retrieved with the new pg_get_loaded_modules() function. This feature is expected to be particularly useful for modules that do not have any exposed SQL functionality and thus are not associated with a SQL-level extension object. But even for modules that do belong to extensions, being able to verify the actual code version can be useful. Author: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Yurii Rashkovskii <yrashk@omnigres.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/dd4d1b59-d0fe-49d5-b28f-1e463b68fa32@gmail.com
* dblink: SCRAM authentication pass-throughPeter Eisentraut2025-03-26
| | | | | | | | | | | | | | This enables SCRAM authentication for dblink (using dblink_fdw) when connecting to a foreign server without having to store a plain-text password on user mapping options This uses the same approach as it was implemented for postgres_fdw in commit 761c79508e7. (It also contains the equivalent of the subsequent fixes 76563f88cfb and d2028e9bbc1.) Author: Matheus Alcantara <mths.dev@pm.me> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/CAFY6G8ercA1KES%3DE_0__R9QCTR805TTyYr1No8qF8ZxmMg8z2Q%40mail.gmail.com
* Add support for gamma() and lgamma() functions.Dean Rasheed2025-03-26
| | | | | | | | | | | | | | These are useful general-purpose math functions which are included in POSIX and C99, and are commonly included in other math libraries, so expose them as SQL-callable functions. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Stepan Neretin <sncfmgg@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: https://postgr.es/m/CAEZATCXpGyfjXCirFk9au+FvM0y2Ah+2-0WSJx7MO368ysNUPA@mail.gmail.com
* Use relation name instead of OID in query jumbling for RangeTblEntryMichael Paquier2025-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | custom_query_jumble (introduced in 5ac462e2b7ac as a node field attribute) is now assigned to the expanded reference name "eref" of RangeTblEntry, adding in the query jumble computation the non-qualified aliased relation name, without the list of column names. The relation OID is removed from the query jumbling. The effects of this change can be seen in the tests added by 3430215fe35f, where pg_stat_statements (PGSS) entries are now grouped using the relation name, ignoring the relation search_path may point at. For example, these two relations are different, but are now grouped in a single PGSS entry as they are assigned the same query ID: CREATE TABLE foo1.tab (a int); CREATE TABLE foo2.tab (b int); SET search_path = 'foo1'; SELECT count(*) FROM tab; SET search_path = 'foo2'; SELECT count(*) FROM tab; SELECT count(*) FROM foo1.tab; SELECT count(*) FROM foo2.tab; SELECT query, calls FROM pg_stat_statements WHERE query ~ 'FROM tab'; query | calls --------------------------+------- SELECT count(*) FROM tab | 4 (1 row) It is still possible to use an alias in the FROM clause to split these. This behavior is useful for relations re-created with the same name, where queries based on such relations would be grouped in the same PGSS entry. For permanent schemas, it should not really matter in practice. The main benefit is for workloads that use a lot of temporary relations, which are usually re-created with the same name continuously. These can be a heavy source of bloat in PGSS depending on the workload. Such entries can now be grouped together, improving the user experience. The original idea from Christoph Berg used catalog lookups to find temporary relations, something that the query jumble has never done, and it could cause some performance regressions. The idea to use RangeTblEntry.eref and the relation name, applying the same rules for all relations, temporary and not temporary, has been proposed by Tom Lane. The documentation additions have been suggested by Sami Imseih. Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z9iWXKGwkm8RAC93@msg.df7cb.de
* Add pg_dump --with-{schema|data|statistics} options.Jeff Davis2025-03-25
| | | | | | | | | | By adding the positive variants of options, in addition to the negative variants that already exist, users can be explicit about what pg_dump should produce. Discussion: https://postgr.es/m/bd0513e4b1ea2b2f2d06f02720c6579711cb62a6.camel@j-davis.com Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de>
* pg_upgrade: Add --swap for faster file transfer.Nathan Bossart2025-03-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new option instructs pg_upgrade to move the data directories from the old cluster to the new cluster and then to replace the catalog files with those generated for the new cluster. This mode can outperform --link, --clone, --copy, and --copy-file-range, especially on clusters with many relations. However, this mode creates many garbage files in the old cluster, which can prolong the file synchronization step if --sync-method=syncfs is used. To handle that, we recommend using --sync-method=fsync with this mode, and pg_upgrade internally uses "initdb --sync-only --no-sync-data-files" for file synchronization. pg_upgrade will synchronize the catalog files as they are transferred. We assume that the database files transferred from the old cluster were synchronized prior to upgrade. This mode also complicates reverting to the old cluster, so we recommend restoring from backup upon failure during or after file transfer. We did consider teaching pg_upgrade how to generate a revert script for such failures, but we decided against it due to the rarity of failing during file transfer, the complexity of generating the script, and the potential for misusing the script. The new mode is limited to clusters located in the same file system. With some effort, we could probably support upgrades between different file systems, but this mode is unlikely to offer much benefit if we have to copy the files across file system boundaries. It is also limited to upgrades from version 10 or newer. There are a few known obstacles for using swap mode to upgrade from older versions. For example, the visibility map format changed in v9.6, and the sequence tuple format changed in v10. In fact, swap mode omits the --sequence-data option in its uses of pg_dump and instead reuses the old cluster's sequence data files. While teaching swap mode to deal with these kinds of changes is surely possible (and we may have to deal with similar problems in the future, anyway), it doesn't seem worth the effort to support upgrades from long-unsupported versions. Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
* pg_dump: Add --sequence-data.Nathan Bossart2025-03-25
| | | | | | | | | | | | This new option instructs pg_dump to dump sequence data when the --no-data, --schema-only, or --statistics-only option is specified. This was originally considered for commit a7e5457db8, but it was left out at that time because there was no known use-case. A follow-up commit will use this to optimize pg_upgrade's file transfer step. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
* initdb: Add --no-sync-data-files.Nathan Bossart2025-03-25
| | | | | | | | | | | | | | | | | | | | | | This new option instructs initdb to skip synchronizing any files in database directories, the database directories themselves, and the tablespace directories, i.e., everything in the base/ subdirectory and any other tablespace directories. Other files, such as those in pg_wal/ and pg_xact/, will still be synchronized unless --no-sync is also specified. --no-sync-data-files is primarily intended for internal use by tools that separately ensure the skipped files are synchronized to disk. A follow-up commit will use this to help optimize pg_upgrade's file transfer step. The --sync-method=fsync implementation of this option makes use of a new exclude_dir parameter for walkdir(). When not NULL, exclude_dir specifies a directory to skip processing. The --sync-method=syncfs implementation of this option just skips synchronizing the non-default tablespace directories. This means that initdb will still synchronize some or all of the database files, but there's not much we can do about that. Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
* Stats: use schemaname/relname instead of regclass.Jeff Davis2025-03-25
| | | | | | | | | | | | | | | For import and export, use schemaname/relname rather than regclass. This is more natural during export, fits with the other arguments better, and it gives better control over error handling in case we need to downgrade more errors to warnings. Also, use text for the argument types for schemaname, relname, and attname so that casts to "name" are not required. Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CADkLM=ceOSsx_=oe73QQ-BxUFR2Cwqum7-UP_fPe22DBY0NerA@mail.gmail.com
* Minor doc update for commit 99f8f3fbbc.Jeff Davis2025-03-25
| | | | Author: Corey Huinker <corey.huinker@gmail.com>
* psql: Make default \watch interval configurableDaniel Gustafsson2025-03-25
| | | | | | | | | | | | | | | | | | | The default interval for \watch to wait between executing queries, when executed without a specified interval, was hardcoded to two seconds. This adds the new variable WATCH_INTERVAL which is used to set the default interval, making it configurable for the user. This makes \watch the first command which has a user configurable default setting. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Masahiro Ikeda <ikedamsh@oss.nttdata.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/B2FD26B4-8F64-4552-A603-5CC3DF1C7103@yesql.se
* libpq: Deprecate pg_int64.Thomas Munro2025-03-25
| | | | | | | | | | | | | | | | | | | | | | | Previously we used pg_int64 in three function prototypes in libpq. It was added by commit 461ef73f to expose the platform-dependent type used for int64 in the C89 era. As of commit 962da900 it is defined as standard int64_t, and the dust seems to have settled. Let's just use int64_t directly in these three client-facing functions instead of (yet) another name. We've required C99 and thus <stdint.h> since PostgreSQL 12, C89 and C++98 compilers are long gone, and client applications very likely use standard types for their own 64-bit needs. This also cleans up the obscure placement of a new #include <stdint.h> directive in postgres_ext.h, required for the new definition. The typedef was hiding in there for historical reasons, but it doesn't fit postgres_ext.h's own description of its purpose and there is no evidence of client applications including postgres_ext.h directly to see it. Keep a typedef marked deprecated for backward compatibility, but move it into libpq-fe.h where it was used. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/CA%2BhUKGKn_EkNNGMY5RzMcKP%2Ba6urT4JF%3DCPhw_zHtQwjvX6P2g%40mail.gmail.com
* Redefine max_files_per_process to control additionally opened filesAndres Freund2025-03-24
| | | | | | | | | | | | | | | | | | Until now max_files_per_process=N limited each backend to open N files in total (minus a safety factor), even if there were already more files opened in postmaster and inherited by backends. Change max_files_per_process to control how many additional files each process is allowed to open. The main motivation for this is the patch to add io_method=io_uring, which needs to open one file for each backend. Without this patch, even if RLIMIT_NOFILE is high enough, postmaster will fail in set_max_safe_fds() if started with a high max_connections. The cause of the failure is that, until now, set_max_safe_fds() subtracted the already open files from max_files_per_process. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/w6uiicyou7hzq47mbyejubtcyb2rngkkf45fk4q7inue5kfbeo@bbfad3qyubvs Discussion: https://postgr.es/m/CAGECzQQh6VSy3KG4pN1d=h9J=D1rStFCMR+t7yh_Kwj-g87aLQ@mail.gmail.com
* doc: Clarify required options for each action in pg_recvlogical.Fujii Masao2025-03-25
| | | | | | | | | | | | | | | | Each pg_recvlogical action requires specific options. For example, --slot, --dbname, and --file must be specified with the --start action. Previously, the documentation did not clearly outline these requirements. This commit updates the documentation to explicitly state the necessary options for each action. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Vignesh C <vignesh21@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/OSCPR01MB14966930B4357BAE8C9D68A8AF5C72@OSCPR01MB14966.jpnprd01.prod.outlook.com
* postgres_fdw: improve security checksPeter Eisentraut2025-03-24
| | | | | | | | | | | | | | | | | | | | | | | | | SCRAM pass-through should not bypass the FDW security check as it was implemented for postgres_fdw in commit 761c79508e7. This commit improves the security check by adding new SCRAM pass-through checks to ensure that the required SCRAM connection options are not overwritten by the user mapping or foreign server options. This is meant to match the security requirements for a password-using connection. Since libpq has no SCRAM-specific equivalent of PQconnectionUsedPassword(), we enforce this instead by making the use_scram_passthrough option of postgres_fdw imply require_auth=scram-sha-256. This means that if use_scram_passthrough is set, some situations that might otherwise have worked are preempted, for example GSSAPI with delegated credentials. This could be enhanced in the future if there is desire for more flexibility. Reported-by: Jacob Champion <jacob.champion@enterprisedb.com> Author: Matheus Alcantara <mths.dev@pm.me> Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/CAFY6G8ercA1KES%3DE_0__R9QCTR805TTyYr1No8qF8ZxmMg8z2Q%40mail.gmail.com
* Detect and Log multiple_unique_conflicts type conflict.Amit Kapila2025-03-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new conflict type, multiple_unique_conflicts, to handle cases where an incoming row during logical replication violates multiple UNIQUE constraints. Previously, the apply worker detected and reported only the first encountered key conflict (insert_exists/update_exists), causing repeated failures as each constraint violation needs to be handled one by one making the process slow and error-prone. With this patch, the apply worker checks all unique constraints upfront once the first key conflict is detected and reports multiple_unique_conflicts if multiple violations exist. This allows users to resolve all conflicts at once by deleting all conflicting tuples rather than dealing with them individually or skipping the transaction. In the future, this will also allow us to specify different resolution handlers for such a conflict type. Add the stats for this conflict type in pg_stat_subscription_stats. Author: Nisha Moond <nisha.moond412@gmail.com> Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/CABdArM7FW-_dnthGkg2s0fy1HhUB8C3ELA0gZX1kkbs1ZZoV3Q@mail.gmail.com
* Add GUC option to control maximum active replication origins.Masahiko Sawada2025-03-21
| | | | | | | | | | | | | | | | | This commit introduces a new GUC option max_active_replication_origins to control the maximum number of active replication origins. Previously, this was controlled by 'max_replication_slots'. Having a separate GUC option provides better flexibility for setting up subscribers, as they may not require replication slots (for cascading replication) but always require replication origins. Author: Euler Taveira <euler@eulerto.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/b81db436-8262-4575-b7c4-bc0c1551000b@app.fastmail.com
* doc: Remove incorrect description about dropping replication slots.Fujii Masao2025-03-21
| | | | | | | | | | | | | | | | | | | | | | | | pg_drop_replication_slot() can drop replication slots created on a different database than the one where it is executed. This behavior has been in place since PostgreSQL 9.4, when pg_drop_replication_slot() was introduced. However, commit ff539d mistakenly added the following incorrect description in the documentation: For logical slots, this must be called when connected to the same database the slot was created on. This commit removes that incorrect statement. A similar mistake was also present in the documentation for the DROP_REPLICATION_SLOT command, which has now been corrected as well. Back-patch to all supported versions. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/OSCPR01MB14966C6BE304B5BB2E58D4009F5DE2@OSCPR01MB14966.jpnprd01.prod.outlook.com Backpatch-through: 13
* Add vacuum_truncate configuration parameter.Nathan Bossart2025-03-20
| | | | | | | | | | | | | | | | | | | | | | This new parameter works just like the storage parameter of the same name: if set to true (which is the default), autovacuum and VACUUM attempt to truncate any empty pages at the end of the table. It is primarily intended to help users avoid locking issues on hot standbys. The setting can be overridden with the storage parameter or VACUUM's TRUNCATE option. Since there's presently no way to determine whether a Boolean storage parameter is explicitly set or has just picked up the default value, this commit also introduces an isset_offset member to relopt_parse_elt. Suggested-by: Will Storey <will@summercat.com> Author: Nathan Bossart <nathandbossart@gmail.com> Co-authored-by: Gurjeet Singh <gurjeet@singh.im> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/Z2DE4lDX4tHqNGZt%40dev.null
* pg_createsubscriber: Add -R publications option.Amit Kapila2025-03-20
| | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a new '-R'/'--remove' option in the 'pg_createsubscriber' utility to specify the object types to be removed from the subscriber. Currently, we add support to specify 'publications' as an object type. In the future, other object types like failover-slots could be added. This feature allows optionally to remove publications on the subscriber that were replicated from the primary server (before running this tool) during physical replication. Users may want to retain these publications in case they want some pre-existing subscribers to point to the newly created subscriber. Author: Shubham Khanna <khannashubham1197@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAHv8RjL4OvoYafofTb_U_JD5HuyoNowBoGpMfnEbhDSENA74Kg@mail.gmail.com
* extension_control_pathPeter Eisentraut2025-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new GUC extension_control_path specifies a path to look for extension control files. The default value is $system, which looks in the compiled-in location, as before. The path search uses the same code and works in the same way as dynamic_library_path. Some use cases of this are: (1) testing extensions during package builds, (2) installing extensions outside security-restricted containers like Python.app (on macOS), (3) adding extensions to PostgreSQL running in a Kubernetes environment using operators such as CloudNativePG without having to rebuild the base image for each new extension. There is also a tweak in Makefile.global so that it is possible to install extensions using PGXS into an different directory than the default, using 'make install prefix=/else/where'. This previously only worked when specifying the subdirectories, like 'make install datadir=/else/where/share pkglibdir=/else/where/lib', for purely implementation reasons. (Of course, without the path feature, installing elsewhere was rarely useful.) Author: Peter Eisentraut <peter@eisentraut.org> Co-authored-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: David E. Wheeler <david@justatheory.com> Reviewed-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com> Reviewed-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com> Reviewed-by: Niccolò Fei <niccolo.fei@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/E7C7BFFB-8857-48D4-A71F-88B359FADCFD@justatheory.com
* psql: Allow queries terminated by semicolons while in pipeline modeMichael Paquier2025-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the only way to pipe queries in an ongoing pipeline (in a \startpipeline block) is to leverage the meta-commands able to create extended queries such as \bind, \parse or \bind_named. While this is good enough for testing the backend with pipelines, it has been mentioned that it can also be very useful to allow queries terminated by semicolons to be appended to a pipeline. For example, it would be possible to migrate existing psql scripts to use pipelines by just adding a set of \startpipeline and \endpipeline meta-commands, making such scripts more efficient. Doing such a change is proving to be simple in psql: queries terminated by semicolons can be executed through PQsendQueryParams() without any parameters set when the pipeline mode is active, instead of PQsendQuery(), the default, like pgbench. \watch is still forbidden while in a pipeline, as it expects its results to be processed synchronously. The large portion of this commit consists in providing more test coverage, with mixes of extended queries appended in a pipeline by \bind and friends, and queries terminated by semicolons. This improvement has been suggested by Daniel Vérité. Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://postgr.es/m/d67b9c19-d009-4a50-8020-1a0ea92366a1@manitou-mail.org
* oauth: Improve validator docs on interruptibilityThomas Munro2025-03-19
| | | | | | | | Andres pointed out that EINTR handling is inadequate for real-world use cases. Direct module writers to our wait APIs instead. Author: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/p4bd7mn6dxr2zdak74abocyltpfdxif4pxqzixqpxpetjwt34h%40qc6jgfmoddvq
* Increase io_combine_limit range to 1MB.Thomas Munro2025-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | The default of 128kB is unchanged, but the upper limit is changed from 32 blocks to 128 blocks, unless the operating system's IOV_MAX is too low. Some other RDBMSes seem to cap their multi-block buffer pool I/O around this number, and it seems useful to allow experimentation. The concrete change is to our definition of PG_IOV_MAX, which provides the maximum for io_combine_limit and io_max_combine_limit. It also affects a couple of other places that work with arrays of struct iovec or smaller objects on the stack, so we still don't want to use the system IOV_MAX directly without a clamp: it is not under our control and likely to be 1024. 128 seems acceptable for our current usage. For Windows, we can't use real scatter/gather yet, so we continue to define our own IOV_MAX value of 16 and emulate preadv()/pwritev() with loops. Someone would need to research the trade-offs of raising that number. NB if trying to see this working: you might temporarily need to hack BAS_BULKREAD to be bigger, since otherwise the obvious way of "a very big SELECT" is limited by that for now. Suggested-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
* Introduce io_max_combine_limit.Thomas Munro2025-03-19
| | | | | | | | | | | | | | | | | | | | | The existing io_combine_limit can be changed by users. The new io_max_combine_limit is fixed at server startup time, and functions as a silent clamp on the user setting. That in itself is probably quite useful, but the primary motivation is: aio_init.c allocates shared memory for all asynchronous IOs including some per-block data, and we didn't want to waste memory you'd never used by assuming they could be up to PG_IOV_MAX. This commit already halves the size of 'AioHandleIov' and 'AioHandleData'. A follow-up commit can now expand PG_IOV_MAX without affecting that. Since our GUC system doesn't support dependencies or cross-checks between GUCs, the user-settable one now assigns a "raw" value to io_combine_limit_guc, and the lower of io_combine_limit_guc and io_max_combine_limit is maintained in io_combine_limit. Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version) Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com
* Update guidance for running vacuumdb after pg_upgrade.Nathan Bossart2025-03-18
| | | | | | | | | Now that pg_upgrade can carry over most optimizer statistics, we should recommend using vacuumdb's new --missing-stats-only option to only analyze relations that are missing statistics. Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/Z5O1bpcwDrMgyrYy%40nathan
* vacuumdb: Add option for analyzing only relations missing stats.Nathan Bossart2025-03-18
| | | | | | | | | | | | | | | This commit adds a new --missing-stats-only option that can be used with --analyze-only or --analyze-in-stages. When this option is specified, vacuumdb will analyze a relation if it lacks any statistics for a column, expression index, or extended statistics object. This new option is primarily intended for use after pg_upgrade (since it can now retain most optimizer statistics), but it might be useful in other situations, too. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/Z5O1bpcwDrMgyrYy%40nathan
* Doc: manually break lines in wide UUID examples.Tom Lane2025-03-18
| | | | | | | | | | | | | | | Buildfarm member crake has been complaining "WARNING: The contents of fo:inline line 1 exceed the available area in the inline-progression direction by 20500 millipoints. (See position 23808:106)" since ba57dcfdc went in. The other doc-building animals are not showing this warning, and I don't see it on my RHEL8 workstation either, but I was able to reproduce it on a Fedora 41 box. So apparently this is due to a recent-ish change in DocBook's line-breaking heuristics, which caused it to cope less well with the UUIDs in these examples. Put in some zero-width spaces to encourage the PDF toolchain to break these lines in a better place. (Only one of these examples actually needs this today, but I marked up all three to ensure that they get wrapped in a consistent way.)
* Introduce squashing of constant lists in query jumblingÁlvaro Herrera2025-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pg_stat_statements produces multiple entries for queries like SELECT something FROM table WHERE col IN (1, 2, 3, ...) depending on the number of parameters, because every element of ArrayExpr is individually jumbled. Most of the time that's undesirable, especially if the list becomes too large. Fix this by introducing a new GUC query_id_squash_values which modifies the node jumbling code to only consider the first and last element of a list of constants, rather than each list element individually. This affects both the query_id generated by query jumbling, as well as pg_stat_statements query normalization so that it suppresses printing of the individual elements of such a list. The default value is off, meaning the previous behavior is maintained. Author: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Sergey Dudoladov (mysterious, off-list) Reviewed-by: David Geier <geidav.pg@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Sutou Kouhei <kou@clear-code.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Marcos Pegoraro <marcos@f10.com.br> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Tested-by: Yasuo Honda <yasuo.honda@gmail.com> Tested-by: Sergei Kornilov <sk@zsrv.org> Tested-by: Maciek Sakrejda <m.sakrejda@gmail.com> Tested-by: Chengxi Sun <sunchengxi@highgo.com> Tested-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Discussion: https://postgr.es/m/CA+q6zcWtUbT_Sxj0V6HY6EZ89uv5wuG5aefpe_9n0Jr3VwntFg@mail.gmail.com
* aio: Add io_method=workerAndres Freund2025-03-18
| | | | | | | | | | | | | | | | | | | | | | | | The previous commit introduced the infrastructure to start io_workers. This commit actually makes the workers execute IOs. IO workers consume IOs from a shared memory submission queue, run traditional synchronous system calls, and perform the shared completion handling immediately. Client code submits most requests by pushing IOs into the submission queue, and waits (if necessary) using condition variables. Some IOs cannot be performed in another process due to lack of infrastructure for reopening the file, and must processed synchronously by the client code when submitted. For now the default io_method is changed to "worker". We should re-evaluate that around beta1, we might want to be careful and set the default to "sync" for 18. Reviewed-by: Noah Misch <noah@leadboat.com> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
* aio: Infrastructure for io_method=workerAndres Freund2025-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit contains the basic, system-wide, infrastructure for io_method=worker. It does not yet actually execute IO, this commit just provides the infrastructure for running IO workers, kept separate for easier review. The number of IO workers can be adjusted with a PGC_SIGHUP GUC. Eventually we'd like to make the number of workers dynamically scale up/down based on the current "IO load". To allow the number of IO workers to be increased without a restart, we need to reserve PGPROC entries for the workers unconditionally. This has been judged to be worth the cost. If it turns out to be problematic, we can introduce a PGC_POSTMASTER GUC to control the maximum number. As io workers might be needed during shutdown, e.g. for AIO during the shutdown checkpoint, a new PMState phase is added. IO workers are shut down after the shutdown checkpoint has been performed and walsender/archiver have shut down, but before the checkpointer itself shuts down. See also 87a6690cc69. Updates PGSTAT_FILE_FORMAT_ID due to the addition of a new BackendType. Reviewed-by: Noah Misch <noah@leadboat.com> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
* Add X25519 to the default set of curvesDaniel Gustafsson2025-03-18
| | | | | | | | | | | | | | Since many clients default to the X25519 curve in the TLS handshake, the fact that the server by defualt doesn't support it cause an extra roundtrip for each TLS connection. By adding multiple curves, which is supported since 3d1ef3a15c3eb68da, we can reduce the risk of extra roundtrips. Author: Daniel Gustafsson <daniel@yesql.se> Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/20240616234612.6cslu7nqexquvwj7@awork3.anarazel.de