aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
...
* psql: Fix translation markingPeter Eisentraut2022-04-08
| | | | | | | | | | | | | | | Commit 5a2832465fd8984d089e8c44c094e6900d987fcd added addFooterToPublicationDesc() as a wrapper around printTableAddFooter(). The translation marker _() was moved to the body of addFooterToPublicationDesc(), but addFooterToPublicationDesc() was not added to GETTEXT_TRIGGERS, so those strings were lost for translation. To fix, add the translation markers to the call sites of addFooterToPublicationDesc() and remove the translation marker from the body of the function. This seems easiest since there were only two callers and it keeps the API consistent with printTableAddFooter(). While we're here, add some const decorations to the prototype of addFooterToPublicationDesc() for consistency with printTableAddFooter().
* Apply PGDLLIMPORT markings broadly.Robert Haas2022-04-08
| | | | | | | | | | | Up until now, we've had a policy of only marking certain variables in the PostgreSQL header files with PGDLLIMPORT, but now we've decided to mark them all. This means that extensions running on Windows should no longer operate at a disadvantage as compared to extensions running on Linux: if the variable is present in a header file, it should be accessible. Discussion: http://postgr.es/m/CA+TgmoYanc1_FSfimhgiWSqVyP5KKmh5NP2BWNwDhO8Pg2vGYQ@mail.gmail.com
* Helper script to apply PGDLLIMPORT markings.Robert Haas2022-04-08
| | | | | | | | | | This script isn't terribly smart and won't necessarily catch every case, but it catches many of them and is better than a totally manual approach. Patch by me, reviewed by Andrew Dunstan. Discussion: http://postgr.es/m/CA+TgmoYanc1_FSfimhgiWSqVyP5KKmh5NP2BWNwDhO8Pg2vGYQ@mail.gmail.com
* Check XLogRecHasBlockRef() before XLogRecHasBlockImage().Jeff Davis2022-04-08
| | | | Trial fix of buildfarm failures on kestrel and tamandua.
* Fix buildfarm failure from commit 2258e76f90.Jeff Davis2022-04-08
|
* Add contrib/pg_walinspect.Jeff Davis2022-04-08
| | | | | | | | | Provides similar functionality to pg_waldump, but from a SQL interface rather than a separate utility. Author: Bharath Rupireddy Reviewed-by: Greg Stark, Kyotaro Horiguchi, Andres Freund, Ashutosh Sharma, Nitin Jadhav, RKN Sai Krishna Discussion: https://postgr.es/m/CALj2ACUGUYXsEQdKhEdsBzhGEyF3xggvLdD8C0VT72TNEfOiog%40mail.gmail.com
* Remove error message hints mentioning configure optionsPeter Eisentraut2022-04-08
| | | | | | | | | | | | These are usually not useful since users will use packaged distributions and won't be interested in rebuilding their installation from source. Also, we have only used these kinds of hints for some features and in some places, not consistently throughout. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/2552aed7-d0e9-280a-54aa-2dc7073f371d%40enterprisedb.com
* Track I/O timing for temporary file blocks in EXPLAIN (BUFFERS)Michael Paquier2022-04-08
| | | | | | | | | | | | | | | | | | | | Previously, the output of EXPLAIN (BUFFERS) option showed only the I/O timing spent reading and writing shared and local buffers. This commit adds on top of that the I/O timing for temporary buffers in the output of EXPLAIN (for spilled external sorts, hashes, materialization. etc). This can be helpful for users in cases where the I/O related to temporary buffers is the bottleneck. Like its cousin, this information is available only when track_io_timing is enabled. Playing the patch, this is showing an extra overhead of up to 1% even when using gettimeofday() as implementation for interval timings, which is slightly within the usual range noise still that's measurable. Author: Masahiko Sawada Reviewed-by: Georgios Kokolatos, Melanie Plageman, Julien Rouhaud, Ranier Vilela Discussion: https://postgr.es/m/CAD21AoAJgotTeP83p6HiAGDhs_9Fw9pZ2J=_tYTsiO5Ob-V5GQ@mail.gmail.com
* pgstat: Hide instability in stats.spec with -DCATCACHE_FORCE_RELEASE.Andres Freund2022-04-07
| | | | | | | | | | | | | | | | | | | With -DCATCACHE_FORCE_RELEASE a few tests failed. Those were trying to test behavior in the absence of invalidation processing and -DCATCACHE_FORCE_RELEASE obviously adds a lot of invalidation processing. The test already tried to handle debug_discard_caches > 0, by disabling it for individual tests. Instead hide potentially problematic function calls in a wrapper function that catches the does-not-exist error. The error isn't the actually interesting bit, it's whether the stats entry still exist afterwards. I confirmed that the tests still catches leaked function stats if I nuke the protections against that in pgstat_function.c. Per buildfarm animal prion. Discussion: https://postgr.es/m/20220407165709.jgdkrzqlkcwue6ko@alap3.anarazel.de
* pgstat: add/extend tests for resetting various kinds of stats.Andres Freund2022-04-07
| | | | | | | | | | | | - subscriber stats reset path was untested - slot stat sreset path for all slots was untested - pg_stat_database.sessions etc was untested - pg_stat_reset_shared() was untested, for any kind of shared stats - pg_stat_reset() was untested Author: Melanie Plageman <melanieplageman@gmail.com> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* Truncate line pointer array during heap pruning.Peter Geoghegan2022-04-07
| | | | | | | | | | | | | | | | | | | Reclaim space from the line pointer array when heap pruning leaves behind a contiguous group of LP_UNUSED items at the end of the array. This happens during subsequent page defragmentation. Certain kinds of heap line pointer bloat are ameliorated by this new optimization. Follow-up work to commit 3c3b8a4b26, which taught VACUUM to truncate the line pointer array in about the same way during VACUUM's second pass over the heap. We now apply line pointer array truncation during both the first and the second pass over the heap made by VACUUM. We can also perform line pointer array truncation during opportunistic pruning. Matthias van de Meent, with small tweaks by me. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAEze2WjgaQc55Y5f5CQd3L=eS5CZcff2Obxp=O6pto8-f0hC4w@mail.gmail.com Discussion: https://postgr.es/m/CAEze2Wg36%2B4at2eWJNcYNiW2FJmht34x3YeX54ctUSs7kKoNcA%40mail.gmail.com
* Teach planner and executor about monotonic window funcsDavid Rowley2022-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Window functions such as row_number() always return a value higher than the previously returned value for tuples in any given window partition. Traditionally queries such as; SELECT * FROM ( SELECT *, row_number() over (order by c) rn FROM t ) t WHERE rn <= 10; were executed fairly inefficiently. Neither the query planner nor the executor knew that once rn made it to 11 that nothing further would match the outer query's WHERE clause. It would blindly continue until all tuples were exhausted from the subquery. Here we implement means to make the above execute more efficiently. This is done by way of adding a pg_proc.prosupport function to various of the built-in window functions and adding supporting code to allow the support function to inform the planner if the window function is monotonically increasing, monotonically decreasing, both or neither. The planner is then able to make use of that information and possibly allow the executor to short-circuit execution by way of adding a "run condition" to the WindowAgg to allow it to determine if some of its execution work can be skipped. This "run condition" is not like a normal filter. These run conditions are only built using quals comparing values to monotonic window functions. For monotonic increasing functions, quals making use of the btree operators for <, <= and = can be used (assuming the window function column is on the left). You can see here that once such a condition becomes false that a monotonic increasing function could never make it subsequently true again. For monotonically decreasing functions the >, >= and = btree operators for the given type can be used for run conditions. The best-case situation for this is when there is a single WindowAgg node without a PARTITION BY clause. Here when the run condition becomes false the WindowAgg node can simply return NULL. No more tuples will ever match the run condition. It's a little more complex when there is a PARTITION BY clause. In this case, we cannot return NULL as we must still process other partitions. To speed this case up we pull tuples from the outer plan to check if they're from the same partition and simply discard them if they are. When we find a tuple belonging to another partition we start processing as normal again until the run condition becomes false or we run out of tuples to process. When there are multiple WindowAgg nodes to evaluate then this complicates the situation. For intermediate WindowAggs we must ensure we always return all tuples to the calling node. Any filtering done could lead to incorrect results in WindowAgg nodes above. For all intermediate nodes, we can still save some work when the run condition becomes false. We've no need to evaluate the WindowFuncs anymore. Other WindowAgg nodes cannot reference the value of these and these tuples will not appear in the final result anyway. The savings here are small in comparison to what can be saved in the top-level WingowAgg, but still worthwhile. Intermediate WindowAgg nodes never filter out tuples, but here we change WindowAgg so that the top-level WindowAgg filters out tuples that don't match the intermediate WindowAgg node's run condition. Such filters appear in the "Filter" clause in EXPLAIN for the top-level WindowAgg node. Here we add prosupport functions to allow the above to work for; row_number(), rank(), dense_rank(), count(*) and count(expr). It appears technically possible to do the same for min() and max(), however, it seems unlikely to be useful enough, so that's not done here. Bump catversion Author: David Rowley Reviewed-by: Andy Fan, Zhihong Yu Discussion: https://postgr.es/m/CAApHDvqvp3At8++yF8ij06sdcoo1S_b2YoaT9D4Nf+MObzsrLQ@mail.gmail.com
* Extend plsample example to include a trigger handler.Tom Lane2022-04-07
| | | | | | Mark Wong and Konstantina Skovola, reviewed by Chapman Flack Discussion: https://postgr.es/m/Yd8Cz22eHi80XS30@workstation-mark-wong
* Add minimal tests for recovery conflict handling.Andres Freund2022-04-07
| | | | | | | | | | | | | | | | | Previously none of our tests triggered recovery conflicts. The test is primarily motivated by needing tests for recovery conflict stats for shared memory based pgstats. But it's also a decent start for recovery conflict handling in general. The only type of recovery conflict not tested yet are rcovery deadlock conflicts. By configuring log_recovery_conflict_waits the test adds some very minimal testing for that path as well. Author: Melanie Plageman <melanieplageman@gmail.com> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: test stats interactions with physical replication.Andres Freund2022-04-07
| | | | | | | | | | | Tests that standbys: - drop stats for objects when the those records are replayed - persist stats across graceful restarts - discard stats after immediate / crash restarts Author: Melanie Plageman <melanieplageman@gmail.com> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* Revert "Rewrite some RI code to avoid using SPI"Alvaro Herrera2022-04-07
| | | | | | | This reverts commit 99392cdd78b788295e52b9f4942fa11992fd5ba9. We'd rather rewrite ri_triggers.c as a whole rather than piecemeal. Discussion: https://postgr.es/m/E1ncXX2-000mFt-Pe@gemulon.postgresql.org
* psql: add \dconfig command to show server's configuration parameters.Tom Lane2022-04-07
| | | | | | | | | | | | | | | | | | | | | | Plain \dconfig is basically equivalent to SHOW except that you can give it a pattern with wildcards, either to match multiple GUCs or because you don't exactly remember the name you want. \dconfig+ adds type, context, and access-privilege information, mainly because every other kind of object privilege has a psql command to show it, so GUC privileges should too. (A form of this command was in some versions of the patch series leading up to commit a0ffa885e. We pulled it out then because of doubts that the design and code were up to snuff, but I think subsequent work has resolved that.) In passing, fix incorrect completion of GUC names in GRANT/REVOKE ON PARAMETER: a0ffa885e neglected to use the VERBATIM form of COMPLETE_WITH_QUERY, so it misbehaved for custom (qualified) GUC names. Mark Dilger and Tom Lane Discussion: https://postgr.es/m/3118455.1649267333@sss.pgh.pa.us
* pgstat: add tests for handling of restarts, including crashes.Andres Freund2022-04-07
| | | | | | | | | Test that stats are restored during normal restarts, discarded after a crash / immediate restart, and that a corrupted stats file leads to stats being reset. Author: Melanie Plageman <melanieplageman@gmail.com> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* Rewrite some RI code to avoid using SPIAlvaro Herrera2022-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Modify the subroutines called by RI trigger functions that want to check if a given referenced value exists in the referenced relation to simply scan the foreign key constraint's unique index, instead of using SPI to execute SELECT 1 FROM referenced_relation WHERE ref_key = $1 This saves a lot of work, especially when inserting into or updating a referencing relation. This rewrite allows to fix a PK row visibility bug caused by a partition descriptor hack which requires ActiveSnapshot to be set to come up with the correct set of partitions for the RI query running under REPEATABLE READ isolation. We now set that snapshot indepedently of the snapshot to be used by the PK index scan, so the two no longer interfere. The buggy output in src/test/isolation/expected/fk-snapshot.out of the relevant test case added by commit 00cb86e75d6d has been corrected. (The bug still exists in branch 14, however, but this fix is too invasive to backpatch.) Author: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Li Japin <japinli@hotmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CA+HiwqGkfJfYdeq5vHPh6eqPKjSbfpDDY+j-kXYFePQedtSLeg@mail.gmail.com
* Fix test instability introduced in e349c95d3e9 due to async deduplication.Andres Freund2022-04-07
| | | | | | | | The statement emitting notifies tried to make sure page boundaries were crossed, but failed to do so reliably due to deduplication. Reported-By: chap@anastigmatix.net Discussion: https://postgr.es/m/20220407185408.n7dvsgqsb3q6uze7@alap3.anarazel.de
* Add isolation tests for snapshot behavior in ri_triggers.cAlvaro Herrera2022-04-07
| | | | | | | | | | | | They are to check the behavior of RI_FKey_check() and ri_Check_Pk_Match(). A test case whereby RI_FKey_check() queries a partitioned PK table under REPEATABLE READ isolation produces wrong output due to a bug of the partition-descriptor logic and that is noted as such in the comment in the test. A subsequent commit will fix the bug and replace the buggy output by the correct one. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/1627848.1636676261@sss.pgh.pa.us
* Revert "Logical decoding of sequences"Tomas Vondra2022-04-07
| | | | | | | | | | | | | | | | | | | | | This reverts a sequence of commits, implementing features related to logical decoding and replication of sequences: - 0da92dc530c9251735fc70b20cd004d9630a1266 - 80901b32913ffa59bf157a4d88284b2b3a7511d9 - b779d7d8fdae088d70da5ed9fcd8205035676df3 - d5ed9da41d96988d905b49bebb273a9b2d6e2915 - a180c2b34de0989269fdb819bff241a249bf5380 - 75b1521dae1ff1fde17fda2e30e591f2e5d64b6a - 2d2232933b02d9396113662e44dca5f120d6830e - 002c9dd97a0c874fd1693a570383e2dd38cd40d5 - 05843b1aa49df2ecc9b97c693b755bd1b6f856a9 The implementation has issues, mostly due to combining transactional and non-transactional behavior of sequences. It's not clear how this could be fixed, but it'll require reworking significant part of the patch. Discussion: https://postgr.es/m/95345a19-d508-63d1-860a-f5c2f41e8d40@enterprisedb.com
* Fix off-by-one error in pg_waldump, introduced in 5c279a6d350.Jeff Davis2022-04-07
| | | | | | Per report by Bharath Rupireddy. Discussion: https://postgr.es/m/CALj2ACX+PWDK2MYjdu8CB1ot7OUSo6kd5-fkkEgduEsTSZjAEw@mail.gmail.com
* Fix another buildfarm issue from commit 5c279a6d350.Jeff Davis2022-04-07
| | | | Discussion: https://postgr.es/m/3280724.1649340315@sss.pgh.pa.us
* Unlogged sequencesPeter Eisentraut2022-04-07
| | | | | | | | | | | | | | | | | | Add support for unlogged sequences. Unlike for unlogged tables, this is not a performance feature. It allows sequences associated with unlogged tables to be excluded from replication. A new subcommand ALTER SEQUENCE ... SET LOGGED/UNLOGGED is added. An identity/serial sequence now automatically gets and follows the persistence level (logged/unlogged) of its owning table. (The sequences owned by temporary tables were already temporary through the separate mechanism in RangeVarAdjustRelationPersistence().) But you can still change the persistence of an owned sequence separately. Also, pg_dump and pg_upgrade preserve the persistence of existing sequences. Discussion: https://www.postgresql.org/message-id/flat/04e12818-2f98-257c-b926-2845d74ed04f%402ndquadrant.com
* Fix typo in xlogrecovery.c code commentDaniel Gustafsson2022-04-07
| | | | | Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/CALj2ACUoPtnReT=yAQMcWLtcCpk7p83xjeA8tiRX8Q0_sjh8kw@mail.gmail.com
* Include some missing headers.Thomas Munro2022-04-07
| | | | | | Per headerscheck on BF animal crake, and Andres. Discussion: https://postgr.es/m/20220407083630.n62vgwqfy2v6wsrd%40alap3.anarazel.de
* pgstat: add alternate output for stats.spec, for the 2PC disabled case.Andres Freund2022-04-07
| | | | | | | | It might be worth instead splitting the test up to produce a smaller alternative output file. But that's not trivial either, due to the number of steps defined. And more than I want to do tonight. Per buildfarm.
* Try to silence "-Wmissing-braces" complaints in rmgrdesc.c.Andres Freund2022-04-07
| | | | | | Per buildfarm member lapwing. https://postgr.es/m/20220407065640.xljttqcs46k4lyvr@alap3.anarazel.de
* Prefetch data referenced by the WAL, take II.Thomas Munro2022-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new GUC recovery_prefetch. When enabled, look ahead in the WAL and try to initiate asynchronous reading of referenced data blocks that are not yet cached in our buffer pool. For now, this is done with posix_fadvise(), which has several caveats. Since not all OSes have that system call, "try" is provided so that it can be enabled where available. Better mechanisms for asynchronous I/O are possible in later work. Set to "try" for now for test coverage. Default setting to be finalized before release. The GUC wal_decode_buffer_size limits the distance we can look ahead in bytes of decoded data. The existing GUC maintenance_io_concurrency is used to limit the number of concurrent I/Os allowed, based on pessimistic heuristics used to infer that I/Os have begun and completed. We'll also not look more than maintenance_io_concurrency * 4 block references ahead. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (earlier version) Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version) Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> (earlier version) Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (earlier version) Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> (earlier version) Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> (earlier version) Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> (earlier version) Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
* Fix warning introduced in 5c279a6d350.Jeff Davis2022-04-07
| | | | | | | | | Change two macros to be static inline functions instead to keep the data type consistent. This avoids a "comparison is always true" warning that was occurring with -Wtype-limits. In the process, change the names to look less like macros. Discussion: https://postgr.es/m/20220407063505.njnnrmbn4sxqfsts@alap3.anarazel.de
* pgstat: add tests for transaction behaviour, 2PC, function stats.Andres Freund2022-04-07
| | | | | | Author: Andres Freund <andres@anarazel.de> Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: add pg_stat_have_stats() test helper.Andres Freund2022-04-07
| | | | | | | | | | Will be used by tests committed subsequently. Bumps catversion (this time for real, the one in 0f96965c658 got lost when rebasing over 5c279a6d350). Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CAAKRu_aNxL1WegCa45r=VAViCLnpOU7uNC7bTtGw+=QAPyYivw@mail.gmail.com
* pgstat: add pg_stat_force_next_flush(), use it to simplify tests.Andres Freund2022-04-06
| | | | | | | | | | | | | | | In the stats collector days it was hard to write tests for the stats system, because fundamentally delivery of stats messages over UDP was not synchronous (nor guaranteed). Now we easily can force pending stats updates to be flushed synchronously. This moves stats.sql into a parallel group, there isn't a reason for it to run in isolation anymore. And it may shake out some bugs. Bumps catversion. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: fix small bug in pgstat_drop_relation().Andres Freund2022-04-06
| | | | | | | | | | Just after committing 5891c7a8ed8, a test running with debug_discard_caches=1 failed locally... pgstat_drop_relation() neither checked pgstat_should_count_relation() nor called pgstat_prep_relation_pending(). With debug_discard_caches=1 rel->pgstat_info wasn't set up, leading pg_stat_get_xact_tuples_inserted() spuriously still returning > 0 while in the transaction dropping the table.
* pgstat: prevent fix pgstat_reinit_entry() from zeroing out lwlock.Andres Freund2022-04-06
| | | | | | | | | | | Zeroing out an lwlock in a normal build turns out to not trigger any alarms, if nobody can use the lwlock at that moment (as the case here). But with --disable-spinlocks --disable-atomics, the sema field needs to be initialized. We probably should make sure that this fails on more common configurations as well... Per buildfarm animal rorqual
* Fix compilation with WAL_DEBUG.Andres Freund2022-04-06
| | | | | | Broke with 5c279a6d350. But looks like it had been half-broken since 70e81861fad, because 'rmid' didn't refer to the current record's rmid anymore, but to rmid from "Initialize resource managers" - a constant.
* Custom WAL Resource Managers.Jeff Davis2022-04-06
| | | | | | | | | | | | | Allow extensions to specify a new custom resource manager (rmgr), which allows specialized WAL. This is meant to be used by a Table Access Method or Index Access Method. Prior to this commit, only Generic WAL was available, which offers support for recovery and physical replication but not logical replication. Reviewed-by: Julien Rouhaud, Bharath Rupireddy, Andres Freund Discussion: https://postgr.es/m/ed1fb2e22d15d3563ae0eb610f7b61bb15999c0a.camel%40j-davis.com
* Add single-item cache when looking at topmost XID of a subtrans XIDMichael Paquier2022-04-07
| | | | | | | | | | | | This change affects SubTransGetTopmostTransaction(), used to find the topmost transaction ID of a given transaction ID. The cache is able to store one value, so as we can save the backend from unnecessary lookups at pg_subtrans/ on repetitive calls of this routine. There is a similar practice in transam.c, for example. Author: Simon Riggs Reviewed-by: Andrey Borodin, Julien Rouhaud Discussion: https://postgr.es/m/CANbhV-G8Co=yq4v4BkW7MJDqVt68K_8A48nAZ_+8UQS7LrwLEQ@mail.gmail.com
* pgstat: move pgstat.c to utils/activity.Andres Freund2022-04-06
| | | | | | | | Now that pgstat is not related to postmaster anymore, src/backend/postmaster is not a well fitting directory. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: rename STATS_COLLECTOR GUC group to STATS_CUMULATIVE.Andres Freund2022-04-06
| | | | | | Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: remove stats_temp_directory.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | With stats now being stored in shared memory, the GUC isn't needed anymore. However, the pg_stat_tmp directory and PG_STAT_TMP_DIR define are kept, as pg_stat_statements (and some out-of-core extensions) store data in it. Docs will be updated in a subsequent commit, together with the other pending docs updates due to shared memory stats. Author: Andres Freund <andres@anarazel.de> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220330233550.eiwsbearu6xhuqwe@alap3.anarazel.de Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: store statistics in shared memory.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously the statistics collector received statistics updates via UDP and shared statistics data by writing them out to temporary files regularly. These files can reach tens of megabytes and are written out up to twice a second. This has repeatedly prevented us from adding additional useful statistics. Now statistics are stored in shared memory. Statistics for variable-numbered objects are stored in a dshash hashtable (backed by dynamic shared memory). Fixed-numbered stats are stored in plain shared memory. The header for pgstat.c contains an overview of the architecture. The stats collector is not needed anymore, remove it. By utilizing the transactional statistics drop infrastructure introduced in a prior commit statistics entries cannot "leak" anymore. Previously leaked statistics were dropped by pgstat_vacuum_stat(), called from [auto-]vacuum. On systems with many small relations pgstat_vacuum_stat() could be quite expensive. Now that replicas drop statistics entries for dropped objects, it is not necessary anymore to reset stats when starting from a cleanly shut down replica. Subsequent commits will perform some further code cleanup, adapt docs and add tests. Bumps PGSTAT_FILE_FORMAT_ID. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Andres Freund <andres@anarazel.de> Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Justin Pryzby <pryzby@telsasoft.com> Reviewed-By: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-By: Tomas Vondra <tomas.vondra@2ndquadrant.com> (in a much earlier version) Reviewed-By: Arthur Zakirov <a.zakirov@postgrespro.ru> (in a much earlier version) Reviewed-By: Antonin Houska <ah@cybertec.at> (in a much earlier version) Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de Discussion: https://postgr.es/m/20220308205351.2xcn6k4x5yivcxyd@alap3.anarazel.de Discussion: https://postgr.es/m/20210319235115.y3wz7hpnnrshdyv6@alap3.anarazel.de
* pgstat: normalize function naming.Andres Freund2022-04-06
| | | | | | Most of pgstat uses pgstat_<verb>_<subject>() or just <verb>_<subject>(). But not all (some introduced fairly recently by me). Rename ones that aren't intentionally following a different scheme (e.g. AtEOXact_*).
* Reorder subskiplsn in pg_subscription to avoid alignment issues.Amit Kapila2022-04-07
| | | | | | | | | | | | | | | | | | | | The column 'subskiplsn' uses TYPALIGN_DOUBLE (which has 4 bytes alignment on AIX) for storage. But the C Struct (Form_pg_subscription) has 8-byte alignment for this field, so retrieving it from storage causes an unaligned read. To fix this, we rearranged the 'subskiplsn' column in the catalog so that it naturally comes at an 8-byte boundary. We have fixed a similar problem in commit f3b421da5f. This patch adds a test to avoid a similar mistake in the future. Reported-by: Noah Misch Diagnosed-by: Noah Misch, Masahiko Sawada, Amit Kapila Author: Masahiko Sawada Reviewed-by: Noah Misch, Amit Kapila Discussion: https://postgr.es/m/20220401074423.GC3682158@rfd.leadboat.com https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
* pgstat: revise replication slot API in preparation for shared memory stats.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | | | Previously the pgstat <-> replication slots API was done with on the basis of names. However, the upcoming move to storing stats in shared memory makes it more convenient to use a integer as key. Change the replication slot functions to take the slot rather than the slot name, and expose ReplicationSlotIndex() to compute the index of an replication slot. Special handling will be required for restarts, as the index is not stable across restarts. For now pgstat internally still uses names. Rename pgstat_report_replslot_{create,drop}() to pgstat_{create,drop}_replslot() to match the functions for other kinds of stats. Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
* pgstat: scaffolding for transactional stats creation / drop.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One problematic part of the current statistics collector design is that there is no reliable way of getting rid of statistics entries. Because of that pgstat_vacuum_stat() (called by [auto-]vacuum) matches all stats for the current database with the catalog contents and tries to drop now-superfluous entries. That's quite expensive. What's worse, it doesn't work on physical replicas, despite physical replicas collection statistics entries. This commit introduces infrastructure to create / drop statistics entries transactionally, together with the underlying catalog objects (functions, relations, subscriptions). pgstat_xact.c maintains a list of stats entries created / dropped transactionally in the current transaction. To ensure the removal of statistics entries is durable dropped statistics entries are included in commit / abort (and prepare) records, which also ensures that stats entries are dropped on standbys. Statistics entries created separately from creating the underlying catalog object (e.g. when stats were previously lost due to an immediate restart) are *not* WAL logged. However that can only happen outside of the transaction creating the catalog object, so it does not lead to "leaked" statistics entries. For this to work, functions creating / dropping functions / relations / subscriptions need to call into pgstat. For subscriptions this was already done when dropping subscriptions, via pgstat_report_subscription_drop() (now renamed to pgstat_drop_subscription()). This commit does not actually drop stats yet, it just provides the infrastructure. It is however a largely independent piece of infrastructure, so committing it separately makes sense. Bumps XLOG_PAGE_MAGIC. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de
* pgstat: prepare APIs used by pgstatfuncs for shared memory stats.Andres Freund2022-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the introduction of PgStat_Kind PgStat_Single_Reset_Type, PgStat_Shared_Reset_Target don't make sense anymore. Replace them with PgStat_Kind. Instead of having dedicated reset functions for different kinds of stats, use two generic helper routines (one to reset all stats of a kind, one to reset one stats entry). A number of reset functions were named pgstat_reset_*_counter(), despite affecting multiple counters. The generic helper routines get rid of pgstat_reset_single_counter(), pgstat_reset_subscription_counter(). Rename pgstat_reset_slru_counter(), pgstat_reset_replslot_counter() to pgstat_reset_slru(), pgstat_reset_replslot() respectively, and have them only deal with a single SLRU/slot. Resetting all SLRUs/slots goes through the generic pgstat_reset_of_kind(). Previously pg_stat_reset_replication_slot() used SearchNamedReplicationSlot() to check if a slot exists. API wise it seems better to move that to pgstat_replslot.c. This is done separately from the - quite large - shared memory statistics patch to make review easier. Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
* pgstat: introduce PgStat_Kind enum.Andres Freund2022-04-06
| | | | | | | | Will be used by following commits to generalize stats infrastructure. Kept separate to allow commits stand reasonably on their own. Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20220404041516.cctrvpadhuriawlq@alap3.anarazel.de
* Add option --config-file to pg_rewindMichael Paquier2022-04-07
| | | | | | | | | | | | | | | | | | | | | This option is useful to do a rewind with the server configuration file (aka postgresql.conf) located outside the data directory, which is something that some Linux distributions and some HA tools like to rely on. As a result, this can simplify the logic around a rewind by avoiding the copy of such files before running pg_rewind. This option affects pg_rewind when it internally starts the target cluster with some "postgres" commands, adding -c config_file=FILE to the command strings generated, when: - retrieving a restore_command using a "postgres -C" command for -c/--restore-target-wal. - forcing crash recovery once to get the cluster into a clean shutdown state. Author: Gunnar "Nick" Bluth Reviewed-by: Michael Banck, Alexander Kukushkin, Michael Paquier, Alexander Alekseev Discussion: https://postgr.es/m/7c59265d-ac50-b0aa-ca1e-65e8bd27642a@pro-open.de