path: root/src
Commit message  (Author, Date)
...
* Hide internal error for pg_collation_actual_version(<bad OID>).  (Thomas Munro, 2021-02-22)

  Instead of an unsightly internal "cache lookup failed" message, just return NULL for bad OIDs, as is the convention for other similar things.

  Reported-by: Justin Pryzby <pryzby@telsasoft.com>
  Reviewed-by: Michael Paquier <michael@paquier.xyz>
  Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com

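  For illustration, a minimal sketch of the new behavior; the OID used here is arbitrary and assumed not to belong to any collation, and "en_US" is assumed to exist:

      SELECT pg_collation_actual_version(1);   -- bad OID: now returns NULL instead of raising an error
      SELECT pg_collation_actual_version(oid)
        FROM pg_collation WHERE collname = 'en_US';   -- a real collation still reports its version
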
* Initialize atomic variable waitStart in PGPROC, at postmaster startup.  (Fujii Masao, 2021-02-22)

  Commit 46d6e5f567 added the atomic variable "waitStart" into the PGPROC struct, to store the time at which the wait for lock acquisition started. Previously this variable was initialized every time each backend started. Instead, this commit makes the postmaster initialize it at startup, to ensure that the variable is initialized before any use of it.

  This commit also moves the code that initializes the "waitStart" variable for a prepared transaction from TwoPhaseGetDummyProc() to MarkAsPreparingGuts(), because MarkAsPreparingGuts() is the more proper place to do that, since it initializes the other PGPROC variables.

  Author: Fujii Masao
  Reviewed-by: Atsushi Torikoshi
  Discussion: https://postgr.es/m/1df88660-6f08-cc6e-b7e2-f85296a2bdab@oss.nttdata.com

* Improve new hash partition bound check error messages  (Peter Eisentraut, 2021-02-22)

  For the error message "every hash partition modulus must be a factor of the next larger modulus", add a detail message that shows the particular numbers and existing partition involved. Also comment the code more.

  Reviewed-by: Amit Langote <amitlangote09@gmail.com>
  Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
  Discussion: https://www.postgresql.org/message-id/flat/bb9d60b4-aadb-607a-1a9d-fdc3434dddcd%40enterprisedb.com

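  A minimal sketch of the rule the improved message explains; table and partition names are invented:

      CREATE TABLE ht (i int) PARTITION BY HASH (i);
      CREATE TABLE ht_p0 PARTITION OF ht FOR VALUES WITH (MODULUS 2, REMAINDER 0);
      CREATE TABLE ht_p1 PARTITION OF ht FOR VALUES WITH (MODULUS 4, REMAINDER 1);  -- OK: 2 is a factor of 4
      CREATE TABLE ht_p2 PARTITION OF ht FOR VALUES WITH (MODULUS 3, REMAINDER 2);  -- fails the factor rule against the existing moduli 2 and 4;
                                                                                    -- the DETAIL now names the numbers and the existing partition
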
* Use pgstat_progress_update_multi_param() where possible  (Michael Paquier, 2021-02-22)

  This commit changes one code path in REINDEX INDEX and one code path in CREATE INDEX CONCURRENTLY to report the progress of each operation using pgstat_progress_update_multi_param() rather than multiple calls to pgstat_progress_update_param(). This has the advantage of making the progress report more consistent for the end user, without impacting the amount of information provided.

  Author: Bharath Rupireddy
  Discussion: https://postgr.es/m/CALj2ACV5zW7GxD8D_tyO==bcj6ZktQchEKWKPBOAGKiLhAQo=w@mail.gmail.com

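  The progress being reported here surfaces in the usual monitoring view; a sketch of watching it from another session (column subset assumed):

      SELECT pid, command, phase, blocks_done, blocks_total
        FROM pg_stat_progress_create_index;
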
* Remove outdated reference to RAID spindles.  (Thomas Munro, 2021-02-22)

  Commit b09ff536 left behind some outdated advice in the long_desc field of the GUC "effective_io_concurrency". Remove it.

  Back-patch to 13.

  Reported-by: Andrew Gierth <andrew@tao11.riddles.org.uk>
  Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
  Discussion: https://postgr.es/m/CA%2BhUKGJyyWqFBxL9gEj-qtjBThGjhAOBE8GBnF8MUJOJ3vrfag%40mail.gmail.com

* Simplify memory management for regex DFAs a little.  (Tom Lane, 2021-02-21)

  Coverity complained that functions in regexec.c might leak DFA storage. It's wrong, but this logic is confusing enough that it's not so surprising Coverity couldn't make sense of it. Rewrite in hopes of making it more legible to humans as well as machines.

* Avoid generating extra subre tree nodes for capturing parentheses.  (Tom Lane, 2021-02-20)

  Previously, each pair of capturing parentheses gave rise to a separate subre tree node, whose only function was to identify that we ought to capture the match details for this particular sub-expression. In most cases we don't really need that, since we can perfectly well put a "capture this" annotation on the child node that does the real matching work. As with the two preceding commits, the main value of this is to avoid generating and optimizing an NFA for a tree node that's not really pulling its weight.

  The chosen data representation only allows one capture annotation per subre node. In the legal-per-spec, but seemingly not very useful, case where there are multiple capturing parens around the exact same bit of the regex (i.e. "((xyz))"), wrap the child node in N-1 capture nodes that act the same as before. We could work harder at that but I'll refrain, pending some evidence that such cases are worth troubling over.

  In passing, improve the comments in regex.h to say what all the different re_info bits mean. Some of them were pretty obvious but others not so much, so reverse-engineer some documentation.

  This is part of a patch series that in total reduces the regex engine's runtime by about a factor of four on a large corpus of real-world regexes.

  Patch by me, reviewed by Joel Jacobson
  Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

* Convert regex engine's subre tree from binary to N-ary style.  (Tom Lane, 2021-02-20)

  Instead of having left and right child links in subre structs, have a single child link plus a sibling link. Multiple children of a tree node are now reached by chasing the sibling chain.

  The beneficiary of this is alternation tree nodes. A regular expression with N (>1) branches is now represented by one alternation node with N children, rather than a tree that includes N alternation nodes as well as N children. While the old representation didn't really cost anything extra at execution time, it was pretty horrid for compilation purposes, because each of the alternation nodes had its own NFA, which we were too stupid not to separately optimize. (To make matters worse, all of those NFAs described the entire alternation pattern, not just the portion of it that one might expect from the tree structure.)

  We continue to require concatenation nodes to have exactly two children. This data structure is now prepared to support more, but the executor's logic would need some careful redesign, and it's not clear that a lot of benefit could be had.

  This is part of a patch series that in total reduces the regex engine's runtime by about a factor of four on a large corpus of real-world regexes.

  Patch by me, reviewed by Joel Jacobson
  Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

* Fix regex engine to suppress useless concatenation sub-REs.  (Tom Lane, 2021-02-20)

  The comment for parsebranch() claims that it avoids generating unnecessary concatenation nodes in the "subre" tree, but it missed some significant cases. Once we've decided that a given atom is "messy" and can't be bundled with the preceding atom(s) of the current regex branch, parseqatom() always generated two new concat nodes, one to concat the messy atom to what follows it in the branch, and an upper node to concatenate the preceding part of the branch to that one. But one or both of these could be unnecessary, if the messy atom is the first, last, or only one in the branch. Improve the code to suppress such useless concat nodes, along with the no-op child nodes representing empty chunks of a branch.

  Reducing the number of subre tree nodes offers significant savings not only at execution but during compilation, because each subre node has its own NFA that has to be separately optimized. (Maybe someday we'll figure out how to share the optimization work across multiple tree nodes, but it doesn't look easy.) Eliminating upper tree nodes is especially useful because they tend to have larger NFAs.

  This is part of a patch series that in total reduces the regex engine's runtime by about a factor of four on a large corpus of real-world regexes.

  Patch by me, reviewed by Joel Jacobson
  Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

* Recognize "match-all" NFAs within the regex engine.  (Tom Lane, 2021-02-20)

  This builds on the previous "rainbow" patch to detect NFAs that will match any string, though possibly with constraints on the string length. This definition is chosen to match constructs such as ".*", ".+", and ".{1,100}". Recognizing such an NFA after the optimization pass is fairly cheap, since we basically just have to verify that all arcs are RAINBOW arcs and count the number of steps to the end state. (Well, there's a bit of complication with pseudo-color arcs for string boundary conditions, but not much.)

  Once we have these markings, the regex executor functions longest(), shortest(), and matchuntil() don't have to expend per-character work to determine whether a given substring satisfies such an NFA; they just need to check its length against the bounds. Since some matching problems require O(N) invocations of these functions, we've reduced the runtime for an N-character string from O(N^2) to O(N). Of course, this is no help for non-matchall sub-patterns, but those usually have constraints that allow us to avoid needing O(N) substring checks in the first place. It's precisely the unconstrained "match-all" cases that cause the most headaches.

  This is part of a patch series that in total reduces the regex engine's runtime by about a factor of four on a large corpus of real-world regexes.

  Patch by me, reviewed by Joel Jacobson
  Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

* Invent "rainbow" arcs within the regex engine.  (Tom Lane, 2021-02-20)

  Some regular expression constructs, most notably the "." match-anything metacharacter, produce a sheaf of parallel NFA arcs covering all possible colors (that is, character equivalence classes). We can make a noticeable improvement in the space and time needed to process large regexes by replacing such cases with a single arc bearing the special color code "RAINBOW". This requires only minor additional complication in places such as pull() and push().

  Callers of pg_reg_getoutarcs() must now be prepared for the possibility of seeing a RAINBOW arc. For the one known user, contrib/pg_trgm, that's a net benefit since it cuts the number of arcs to be dealt with, and the handling isn't any different than for other colors that contain too many characters to be dealt with individually.

  This is part of a patch series that in total reduces the regex engine's runtime by about a factor of four on a large corpus of real-world regexes.

  Patch by me, reviewed by Joel Jacobson
  Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

* Fix inconsistent configure data for --with-ssl  (Michael Paquier, 2021-02-20)

  This inconsistency was showing up after an autoreconf.

  Reported-by: Antonin Houska
  Reviewed-by: Tom Lane
  Discussion: https://postgr.es/m/47255.1613716807@antos

* Fix psql's ON_ERROR_ROLLBACK so that it handles COMMIT AND CHAIN.  (Fujii Masao, 2021-02-19)

  When ON_ERROR_ROLLBACK is enabled, psql releases a temporary savepoint if it's idle in a valid transaction block after executing a query. But psql doesn't do that after RELEASE or ROLLBACK is executed, because the temporary savepoint has already been destroyed in that case.

  This commit changes psql's ON_ERROR_ROLLBACK so that it also skips releasing the temporary savepoint when COMMIT AND CHAIN is executed. The temporary savepoint doesn't need to be released in that case, because COMMIT AND CHAIN also destroys any savepoints defined within the transaction to commit. Otherwise psql tries to release the savepoint that COMMIT AND CHAIN has already destroyed and causes an error "ERROR: savepoint "pg_psql_temporary_savepoint" does not exist".

  Back-patch to v12 where transaction chaining was added.

  Reported-by: Arthur Nascimento
  Author: Arthur Nascimento
  Reviewed-by: Fujii Masao, Vik Fearing
  Discussion: https://postgr.es/m/16867-3475744069228158@postgresql.org

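  A psql sketch of the sequence that previously triggered the spurious error:

      \set ON_ERROR_ROLLBACK on
      BEGIN;
      SELECT 1;
      COMMIT AND CHAIN;   -- before the fix, psql then reported
                          -- ERROR: savepoint "pg_psql_temporary_savepoint" does not exist
      COMMIT;
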
* Fix bug in COMMIT AND CHAIN command.  (Fujii Masao, 2021-02-19)

  This commit fixes the COMMIT AND CHAIN command so that it starts a new transaction immediately even if savepoints are defined within the transaction to commit. Previously COMMIT AND CHAIN did not do so in that case, because commit 280a408b48 forgot to make CommitTransactionCommand() handle transaction chaining when the transaction state was TBLOCK_SUBCOMMIT.

  This commit also adds a regression test for COMMIT AND CHAIN when savepoints are defined.

  Back-patch to v12 where transaction chaining was added.

  Reported-by: Arthur Nascimento
  Author: Fujii Masao
  Reviewed-by: Arthur Nascimento, Vik Fearing
  Discussion: https://postgr.es/m/16867-3475744069228158@postgresql.org

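  A sketch of the case being fixed: with a savepoint in the committed transaction, the chained transaction now starts immediately and keeps the same transaction characteristics:

      BEGIN ISOLATION LEVEL REPEATABLE READ;
      SAVEPOINT s1;
      -- ... some work ...
      COMMIT AND CHAIN;             -- a new transaction starts right away
      SHOW transaction_isolation;   -- still repeatable read, carried over by the chain
      COMMIT;
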
* Update snowball  (Peter Eisentraut, 2021-02-19)

  Update to snowball tag v2.1.0. Major changes are new stemmers for Armenian, Serbian, and Yiddish.

* Add nbtree README section on page recycling.  (Peter Geoghegan, 2021-02-18)

  Consolidate discussion of how VACUUM places pages in the FSM for recycling by adding a new section that comes after discussion of page deletion. This structure reflects the fact that page recycling is explicitly decoupled from page deletion in Lanin & Shasha's paper. Page recycling in nbtree is an implementation of what the paper calls "the drain technique".

  This decoupling is an important concept for nbtree VACUUM. Searchers have to detect and recover from concurrent page deletions, but they will never have to reason about concurrent page recycling. Recycling can almost always be thought of as a low level garbage collection operation that asynchronously frees the physical space that backs a logical tree node. Almost all code need only concern itself with logical tree nodes. (Note that "logical tree node" is not currently a term of art in the nbtree code -- this all works implicitly.)

  This is preparation for an upcoming patch that teaches nbtree VACUUM to remember the details of pages that it deletes on the fly, in local memory. This enables the same VACUUM operation to consider placing its own deleted pages in the FSM later on, when it reaches the end of btvacuumscan().

* Fix another ancient bug in parsing of BRE-mode regular expressions.  (Tom Lane, 2021-02-18)

  While poking at the regex code, I happened to notice that the bug squashed in commit afcc8772e had a sibling: next() failed to return a specific value associated with the '}' token for a "\{m,n\}" quantifier when parsing in basic RE mode. Again, this could result in treating the quantifier as non-greedy, which it never should be in basic mode. For that to happen, the last character before "\}" that sets "nextvalue" would have to set it to zero, or it'd have to have accidentally been zero from the start. The failure can be provoked repeatably with, for example, a bound ending in digit "0".

  Like the previous patch, back-patch all the way.

* Fix "invalid spinlock number: 0" error in pg_stat_wal_receiver.  (Fujii Masao, 2021-02-18)

  Commit 2c8dd05d6c added the atomic variable writtenUpto into walreceiver's shared memory information. It's initialized only when the walreceiver starts up, but it can be read via the pg_stat_wal_receiver view at any time, i.e., even before it's initialized. In a server built with --disable-atomics and --disable-spinlocks, this uninitialized atomic variable read could cause an "invalid spinlock number: 0" error. This commit changes writtenUpto so that it's initialized at postmaster startup, to avoid the uninitialized variable read via pg_stat_wal_receiver and fix the error.

  This commit also moves the read of writtenUpto after the release of the spinlock protecting walreceiver's shared variables. This is necessary to prevent a new spinlock from being taken by the atomic variable read while holding another spinlock, and to shorten the spinlock duration. This change means writtenUpto is no longer guaranteed to be consistent with the other walreceiver shared variables protected by the spinlock. But this is OK because writtenUpto should not be used for data integrity checks.

  Back-patch to v13 where commit 2c8dd05d6c introduced the bug.

  Author: Fujii Masao
  Reviewed-by: Michael Paquier, Thomas Munro, Andres Freund
  Discussion: https://postgr.es/m/7ef8708c-5b6b-edd3-2cf2-7783f1c7c175@oss.nttdata.com

* Add tests for bytea LIKE operator  (Peter Eisentraut, 2021-02-18)

  Add test coverage for the following operations, which were previously not tested at all:

    bytea LIKE bytea (bytealike)
    bytea NOT LIKE bytea (byteanlike)
    ESCAPE clause for the above (like_escape_bytea)

  also

    name NOT ILIKE text (nameicnlike)

  Discussion: https://www.postgresql.org/message-id/flat/4d13563a-2c8d-fd91-20d5-e71b7a4eaa87%40enterprisedb.com

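  Written out as a runnable sketch of the operations now covered:

      SELECT 'abcdef'::bytea LIKE '%cd%'::bytea;                 -- bytealike, true
      SELECT 'abcdef'::bytea NOT LIKE '%xy%'::bytea;             -- byteanlike, true
      SELECT 'a%c'::bytea LIKE 'a#%c'::bytea ESCAPE '#'::bytea;  -- like_escape_bytea, true
      SELECT 'Alice'::name NOT ILIKE 'alice%'::text;             -- nameicnlike, false
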
* Allow specifying CRL directory  (Peter Eisentraut, 2021-02-18)

  Add another method to specify CRLs, a hashed directory method, for both server and client side. This offers a means for the server or libpq to load only the CRLs that are required to verify a certificate. The CRL directory is specified by separate GUC variables or connection options, ssl_crl_dir and sslcrldir, alongside the existing ssl_crl_file and sslcrl, so both methods can be used at the same time.

  Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
  Discussion: https://www.postgresql.org/message-id/flat/20200731.173911.904649928639357911.horikyota.ntt@gmail.com

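  A sketch of how the two settings might be used; the path is a placeholder, and the directory is assumed to contain CRL files named by their hashes (the usual OpenSSL hashed-directory layout):

      -- server side
      ALTER SYSTEM SET ssl_crl_dir = '/etc/postgresql/crl';
      SELECT pg_reload_conf();
      -- client side: the equivalent libpq connection option is sslcrldir, e.g.
      --   "host=db.example.com sslmode=verify-full sslcrldir=/etc/postgresql/crl"
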
* nbtree README: move VACUUM linear scan section.  (Peter Geoghegan, 2021-02-17)

  Discuss VACUUM's linear scan after discussion of tuple deletion by VACUUM, but before discussion of page deletion by VACUUM. This progression is a lot more natural.

  Also tweak the wording a little. It seems unnecessary to talk about how it worked prior to PostgreSQL 8.2.

* Fix tuple routing to initialize batching only for inserts  (Tomas Vondra, 2021-02-18)

  A cross-partition update on a partitioned table is implemented as a delete followed by an insert. With foreign partitions, this was however causing issues, because the FDW and core may disagree on when to enable batching. postgres_fdw was only allowing batching for plain inserts (CMD_INSERT) while core was trying to batch the insert component of the cross-partition update. Fix by restricting core to apply batching only to plain CMD_INSERT queries.

  It's possible to allow batching for cross-partition updates, but that will require more extensive changes, so better to leave that for a separate patch.

  Author: Amit Langote
  Reviewed-by: Tomas Vondra, Takayuki Tsunakawa
  Discussion: https://postgr.es/m/20200628151002.7x5laxwpgvkyiu3q@development

* Make some minor improvements in the regex code.  (Tom Lane, 2021-02-17)

  Push some hopefully-uncontroversial bits extracted from an upcoming patch series, to remove non-relevant clutter from the main patches.

  In compact(), return immediately after setting REG_ASSERT error; continuing the loop would just lead to assertion failure below. (Ask me how I know.)

  In parseqatom(), remove assertion that moresubs() did its job. When moresubs actually did its job, this is redundant with that function's final assert; but when it failed on OOM, this is an assertion crash. We could avoid the crash by adding a NOERR() check before the assertion, but it seems better to subtract code than add it. (Note that there's a NOERR exit a few lines further down, and nothing else between here and there requires moresubs to have succeeded. So we don't really need an extra error exit.) This is a live bug in assert-enabled builds, but given the very low likelihood of OOM in moresub's tiny allocation, I don't think it's worth back-patching.

  On the other hand, it seems worthwhile to add an assertion that our intended v->subs[subno] target is still null by the time we are ready to insert into it, since there's a recursion in between.

  In pg_regexec, ensure we fflush any debug output on the way out, and try to make MDEBUG messages more uniform and helpful. (In particular, ensure that all of them are prefixed with the subre's id number, so one can match up entry and exit reports.)

  Add some test cases in test_regex to improve coverage of lookahead and lookbehind constraints. Adding these now is mainly to establish that this is indeed the existing behavior.

  Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

* Routine usage information schema tables  (Peter Eisentraut, 2021-02-17)

  Several information schema views track dependencies between functions/procedures and objects used by them. These had not been implemented so far because PostgreSQL doesn't track objects used in a function body. However, formally, these also show dependencies used in parameter default expressions, which PostgreSQL does support and track. So for the sake of completeness, we might as well add these. If dependency tracking for function bodies is ever implemented, these views will automatically work correctly.

  Reviewed-by: Erik Rijkers <er@xs4all.nl>
  Discussion: https://www.postgresql.org/message-id/flat/ac80fc74-e387-8950-9a31-2560778fc1e3%40enterprisedb.com

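  A sketch of the kind of dependency these views can report, via a parameter default expression; the object names are invented, and the queried view is assumed to be part of the SQL-standard ROUTINE_*_USAGE family added here:

      CREATE SEQUENCE seq_demo;
      CREATE FUNCTION f_demo(a bigint DEFAULT nextval('seq_demo'))
        RETURNS bigint LANGUAGE sql AS 'SELECT a';
      SELECT * FROM information_schema.routine_sequence_usage;
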
* Use errmsg_internal for debug messages  (Peter Eisentraut, 2021-02-17)

  An inconsistent set of debug-level messages was not using errmsg_internal(), thus uselessly exposing the messages to translation work. Fix those.

* Add psql completion for [ NO ] DEPENDS ON EXTENSION  (Michael Paquier, 2021-02-17)

  ALTER INDEX was able to handle that already. This adds tab completion for all the remaining commands that support this grammar:
  - ALTER FUNCTION
  - ALTER PROCEDURE
  - ALTER ROUTINE
  - ALTER TRIGGER
  - ALTER MATERIALIZED VIEW

  Author: Ian Lawrence Barwick
  Discussion: https://postgr.es/m/CAB8KJ=iypYudXuMOAMOP4BpkaYbXxk=a2cdJppX0e9mJXWtuig@mail.gmail.com

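  The grammar now completed is the same one ALTER INDEX already supported; for example, with placeholder object and extension names:

      ALTER FUNCTION my_func(int) DEPENDS ON EXTENSION my_extension;
      ALTER TRIGGER my_trigger ON my_table NO DEPENDS ON EXTENSION my_extension;
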
* Convert tsginidx.c's GIN indexing logic to fully ternary operation.  (Tom Lane, 2021-02-16)

  Commit 2f2007fbb did this partially, but there were two remaining warts. checkcondition_gin handled some uncertain cases by setting the out-of-band recheck flag, some by returning TS_MAYBE, and some by doing both. Meanwhile, TS_execute arbitrarily converted a TS_MAYBE result to TS_YES. Thus, if checkcondition_gin chose to only return TS_MAYBE, the outcome would be TS_YES with no recheck flag, potentially resulting in wrong query outputs.

  The case where this'd happen is if there were GIN_MAYBE entries in the indexscan results passed to gin_tsquery_[tri]consistent, which so far as I can see would only happen if the tidbitmap used to accumulate indexscan results grew large enough to become lossy.

  I initially thought of fixing this by ensuring we always set the recheck flag as well as returning TS_MAYBE in uncertain cases. But that errs in the other direction, potentially forcing rechecks of rows that provably match the query (since the recheck flag remains set even if TS_execute later finds that the answer must be TS_YES). Instead, let's get rid of the out-of-band recheck flag altogether and rely on returning TS_MAYBE. This requires exporting a version of TS_execute that will actually return the full ternary result of the evaluation ... but we likely should have done that to start with.

  Unfortunately it doesn't seem practical to add a regression test case that covers this: the amount of data needed to cause the GIN bitmap to become lossy results in a longer runtime than I think we want to have in the tests. (I'm wondering about allowing smaller work_mem settings to ameliorate that, but it'd be a matter for a separate patch.)

  Per bug #16865 from Dimitri Nüscheler. Back-patch to v13 where the faulty commit came in.

  Discussion: https://postgr.es/m/16865-4ffdc3e682e6d75b@postgresql.org

* Remove the unnecessary PrepareWrite in pgoutput.  (Amit Kapila, 2021-02-16)

  This issue has existed since the inception of this code (PG 10) but got exposed by the recent commit ce0fdbfe97 where we are using origins in tablesync workers. The problem was that we were sometimes sending the prepare_write ('w') message but then not sending the actual message, and on the subscriber side we always expect a message after a prepare_write message, which led to this bug.

  I refrained from backpatching this because there is no way in the core code to hit this prior to commit ce0fdbfe97 and we haven't received any complaints so far.

  Reported-by: Erik Rijkers
  Author: Amit Kapila and Vignesh C
  Tested-by: Erik Rijkers
  Discussion: https://postgr.es/m/1295168140.139428.1613133237154@webmailclassic.xs4all.nl

* Fix heap_page_prune() parameter order confusion introduced in dc7420c2c92.  (Andres Freund, 2021-02-15)

  Both luckily and unluckily the passed values meant the same for all types. Luckily because that meant my confusion caused no harm, unluckily because otherwise the compiler might have warned...

  In passing, synchronize parameter names between definition and declaration.

  Reported-By: Peter Geoghegan <pg@bowt.ie>
  Author: Andres Freund <andres@anarazel.de>
  Discussion: https://postgr.es/m/CAH2-Wz=L=nBoepQdH9b5Qd0nMvepFT2CnT6sjWvvpOXa=K8HVQ@mail.gmail.com

* Remove backwards compat ugliness in snapbuild.c.  (Andres Freund, 2021-02-15)

  In 955a684e040 we fixed a bug in initial snapshot creation; in the course of that, several members of struct SnapBuild were obsoleted. As SnapBuild is serialized to disk we couldn't change the memory layout. Unfortunately I subsequently forgot about removing the backward compat gunk, but luckily Heikki just reminded me.

  This commit bumps SNAPBUILD_VERSION, therefore breaking existing slots (which is fine in a major release).

  Author: Andres Freund
  Reminded-By: Heikki Linnakangas <hlinnaka@iki.fi>
  Discussion: https://postgr.es/m/c94be044-818f-15e3-1ad3-7a7ae2dfed0a@iki.fi

* Simplify loop logic in nodeIncrementalSort.c.  (Tom Lane, 2021-02-15)

  The inner loop in switchToPresortedPrefixMode() can be implemented as a conventional integer-counter for() loop, removing a couple of redundant boolean state variables. The old logic here was a remnant of earlier development, but as things now stand there's no reason for extra complexity.

  Also, annotate the test case added by 82e0e2930 to explain why it manages to hit the corner case fixed in that commit, and add an EXPLAIN to verify that it's creating an incremental-sort plan.

  Back-patch to v13, like the previous patch.

  James Coleman and Tom Lane
  Discussion: https://postgr.es/m/16846-ae49f51ac379a4cb@postgresql.org

* Make ExecGetInsertedCols() and friends more robust and improve comments.  (Heikki Linnakangas, 2021-02-15)

  If ExecGetInsertedCols(), ExecGetUpdatedCols() or ExecGetExtraUpdatedCols() were called with a ResultRelInfo that's not in the range table and isn't a partition routing target, the functions would dereference a NULL pointer, relinfo->ri_RootResultRelInfo. Such ResultRelInfos are created when firing RI triggers in tables that are not modified directly. None of the current callers of these functions pass such relations, so this isn't a live bug, but let's make them more robust.

  Also update comment in ResultRelInfo; after commit 6214e2b228, ri_RangeTableIndex is zero for ResultRelInfos created for partition tuple routing.

  Noted by Coverity. Backpatch down to v11, like commit 6214e2b228.

  Reviewed-by: Tom Lane, Amit Langote

* Display the time when the process started waiting for the lock, in pg_locks, take 2  (Fujii Masao, 2021-02-15)

  This commit adds a new column "waitstart" to the pg_locks view. This column reports the time when the server process started waiting for the lock, if the lock is not held. This information is useful, for example, for examining how long a wait has lasted by subtracting "waitstart" in pg_locks from the current time, and for identifying the locks that processes have been waiting on for a very long time.

  This feature uses the current time obtained for the deadlock timeout timer as "waitstart" (i.e., the time when this process started waiting for the lock). Since obtaining the current time anew would cause overhead, we reuse the already-obtained time to avoid that overhead.

  Note that "waitstart" is updated without holding the lock table's partition lock, to avoid the overhead of an additional lock acquisition. This can cause "waitstart" in pg_locks to be NULL for a very short period of time after the wait started, even though "granted" is false. This is OK in practice because we can assume that users are likely to look at "waitstart" when waiting for a lock for a long time.

  The first attempt at this patch (commit 3b733fcd04) caused the buildfarm member "rorqual" (built with --disable-atomics --disable-spinlocks) to report a regression test failure. It was reverted by commit 890d2182a2. The cause of this failure was that the atomic variable for "waitstart" in the dummy process entry created at the end of prepare transaction was not initialized. This second attempt fixes that issue.

  Bump catalog version.

  Author: Atsushi Torikoshi
  Reviewed-by: Ian Lawrence Barwick, Robert Haas, Justin Pryzby, Fujii Masao
  Discussion: https://postgr.es/m/a96013dc51cdc56b2a2b84fa8a16a993@oss.nttdata.com

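  A sketch of the kind of query the new column enables:

      SELECT pid, locktype, relation::regclass AS rel,
             now() - waitstart AS waiting_for
        FROM pg_locks
       WHERE NOT granted
       ORDER BY waiting_for DESC;
      -- waitstart can be NULL for an instant right after the wait begins, as noted above
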
* Adjust lazy_scan_heap() accounting comments.  (Peter Geoghegan, 2021-02-14)

  Explain which particular LP_DEAD line pointers get accounted for by the tups_vacuumed variable.

* Default to wal_sync_method=fdatasync on FreeBSD.  (Thomas Munro, 2021-02-15)

  FreeBSD 13 gained O_DSYNC, which would normally cause wal_sync_method to choose open_datasync as its default value. That may not be a good choice for all systems, and performs worse than fdatasync in some scenarios. Let's preserve the existing default behavior for now.

  Like commit 576477e73c4, which did the same for Linux, back-patch to all supported releases.

  Discussion: https://postgr.es/m/CA%2BhUKGLsAMXBQrCxCXoW-JsUYmdOL8ALYvaX%3DCrHqWxm-nWbGA%40mail.gmail.com

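  A sketch of checking the effective default and overriding it if desired:

      SHOW wal_sync_method;                                -- fdatasync on FreeBSD with this change
      ALTER SYSTEM SET wal_sync_method = 'open_datasync';  -- explicit override, if wanted
      SELECT pg_reload_conf();
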
* Use pg_pwrite() in pg_test_fsync.  (Thomas Munro, 2021-02-15)

  For consistency with the PostgreSQL behavior this test program is intended to simulate, use pwrite() instead of lseek() + write().

  Also fix the final "non-sync" test, which was opening and closing the file for every write.

  Discussion: https://postgr.es/m/CA%2BhUKGJjjid2BJsvjMALBTduo1ogdx2SPYaTQL3wAy8y2hc4nw%40mail.gmail.com

* Fix the warnings introduced in commit ce0fdbfe97.  (Amit Kapila, 2021-02-15)

  Author: Amit Kapila
  Reviewed-by: Tom Lane
  Discussion: https://postgr.es/m/1610789.1613170207@sss.pgh.pa.us

* Hold interrupts while running dsm_detach() callbacks.  (Thomas Munro, 2021-02-15)

  While cleaning up after a parallel query or parallel index creation that created temporary files, we could be interrupted by a statement timeout. The error handling path would then fail to clean up the files when it ran dsm_detach() again, because the callback was already popped off the list. Prevent this hazard by holding interrupts while the cleanup code runs.

  Thanks to Heikki Linnakangas for this suggestion, and also to Kyotaro Horiguchi, Masahiko Sawada, Justin Pryzby and Tom Lane for discussion of this and earlier ideas on how to fix the problem.

  Back-patch to all supported releases.

  Reported-by: Justin Pryzby <pryzby@telsasoft.com>
  Discussion: https://postgr.es/m/20191212180506.GR2082@telsasoft.com

* Add result size as argument of pg_cryptohash_final() for overflow checks  (Michael Paquier, 2021-02-15)

  With its current design, a careless use of pg_cryptohash_final() could result in an out-of-bounds write in memory, as the size of the destination buffer to store the result digest is not known to the cryptohash internals, without the caller knowing about that. This commit adds a new argument to pg_cryptohash_final() to allow such sanity checks, and implements such defenses.

  The internals of SCRAM for HMAC could be tightened a bit more, but as everything is based on SCRAM_KEY_LEN with uses particular to this code there is no need to complicate its interface more than necessary, and this comes back to the refactoring of HMAC in core. Except for that, this minimizes the uses of the existing DIGEST_LENGTH variables, relying instead on sizeof() for the result sizes. In ossp-uuid, this also makes the code more defensive, as it already relied on dce_uuid_t being at least the size of an MD5 digest.

  This is in philosophy similar to cfc40d3 for base64.c and aef8948 for hex.c.

  Reported-by: Ranier Vilela
  Author: Michael Paquier, Ranier Vilela
  Reviewed-by: Kyotaro Horiguchi
  Discussion: https://postgr.es/m/CAEudQAoqEGmcff3J4sTSV-R_16Monuz-UpJFbf_dnVH=APr02Q@mail.gmail.com

* Minor fixes to improve regex debugging code.  (Tom Lane, 2021-02-14)

  When REG_DEBUG is defined, ensure that an un-filled "struct cnfa" is all-zeroes, not just that it has nstates == 0. This is mainly so that looking at "struct subre" structs in gdb doesn't distract one with a lot of garbage fields during regex compilation.

  Adjust some places that print debug output to have suitable fflush calls afterwards.

  In passing, correct an erroneous ancient comment: the concatenation subre-s created by parsebranch() have op == '.' not ','.

  Noted while fooling around with some regex performance improvements.

* ReadNewTransactionId() -> ReadNextTransactionId().  (Thomas Munro, 2021-02-15)

  The new name conveys the effect better, is more consistent with the similar functions ReadNextMultiXactId() and ReadNextFullTransactionId(), and matches the name of the variable that it reads.

  Reported-by: Peter Geoghegan <pg@bowt.ie>
  Discussion: https://postgr.es/m/CAH2-WzmVR4SakBXQUdhhPpMf1aYvZCnna5%3DHKa7DAgEmBAg%2B8g%40mail.gmail.com

* README/C-comment: document GiST's NSN value  (Bruce Momjian, 2021-02-13)

* Tweak compiler version cutoff for no_sanitize("alignment") support.  (Tom Lane, 2021-02-12)

  Buildfarm results show that gcc up through 7.x produces annoying warnings for this construct (and, presumably, wouldn't do the right thing anyway). clang seems okay with the cutoff we have, though.

  Discussion: https://postgr.es/m/CAPpHfdsne3%3DT%3DfMNU45PtxdhSL_J2PjLTeS8rwKnJzUR4YNd4w%40mail.gmail.com
  Discussion: https://postgr.es/m/475514.1612745257%40sss.pgh.pa.us

* Avoid divide-by-zero in regex_selectivity() with long fixed prefix.  (Tom Lane, 2021-02-12)

  Given a regex pattern with a very long fixed prefix (approaching 500 characters), the result of pow(FIXED_CHAR_SEL, fixed_prefix_len) can underflow to zero. Typically the preceding selectivity calculation would have underflowed as well, so that we compute 0/0 and get NaN. In released branches this leads to an assertion failure later on. That doesn't happen in HEAD, for reasons I've not explored yet, but it's surely still a bug.

  To fix, just skip the division when the pow() result is zero, so that we'll (most likely) return a zero selectivity estimate. In the edge cases where "sel" didn't yet underflow, perhaps this isn't desirable, but I'm not sure that the case is worth spending a lot of effort on. The results of regex_selectivity_sub() are barely worth the electrons they're written on anyway :-(

  Per report from Alexander Lakhin. Back-patch to all supported versions.

  Discussion: https://postgr.es/m/6de0a0c3-ada9-cd0c-3e4e-2fa9964b41e3@gmail.com

* pg_attribute_no_sanitize_alignment() macro  (Alexander Korotkov, 2021-02-12)

  Modern gcc and clang compilers offer alignment sanitizers, which help to detect pointer misalignment. However, our codebase already contains x86-specific crc32 computation code, which uses unaligned access. Thankfully, those compilers also support an attribute that disables alignment sanitizers at the function level. This commit adds pg_attribute_no_sanitize_alignment(), which wraps this attribute, and applies it to the pg_comp_crc32c_sse42() function.

  Discussion: https://postgr.es/m/CAPpHfdsne3%3DT%3DfMNU45PtxdhSL_J2PjLTeS8rwKnJzUR4YNd4w%40mail.gmail.com
  Discussion: https://postgr.es/m/475514.1612745257%40sss.pgh.pa.us
  Author: Alexander Korotkov, revised by Tom Lane
  Reviewed-by: Tom Lane

* Fix Subscription test added by commit ce0fdbfe97.  (Amit Kapila, 2021-02-12)

  We want to test the variants of ALTER SUBSCRIPTION that are not allowed in a transaction block, but for that we don't need to create a subscription that tries to connect to the publisher. As such, there is no problem with this test, but it is good to allow such tests to run with wal_level = minimal and max_wal_senders = 0, so as to keep them consistent with other tests.

  Reported by buildfarm.

  Author: Amit Kapila
  Reviewed-by: Ajin Cherian
  Discussion: https://postgr.es/m/CAA4eK1Lw0V+e1JPGHDq=+hVACv=14H8sR+2eJ1k3PEgwKmU-jQ@mail.gmail.com

* Allow multiple xacts during table sync in logical replication.  (Amit Kapila, 2021-02-12)

  For the initial table data synchronization in logical replication, we use a single transaction to copy the entire table and then synchronize the position in the stream with the main apply worker.

  There are multiple downsides of this approach: (a) We have to perform the entire copy operation again if there is any error (network breakdown, error in the database operation, etc.) while we synchronize the WAL position between tablesync worker and apply worker; this will be onerous especially for large copies, (b) Using a single transaction in the synchronization-phase (where we can receive WAL from multiple transactions) will have the risk of exceeding the CID limit, (c) The slot will hold the WAL till the entire sync is complete because we never commit till the end.

  This patch solves all the above downsides by allowing multiple transactions during the tablesync phase. The initial copy is done in a single transaction and after that, we commit each transaction as we receive it. To allow recovery after any error or crash, we use a permanent slot and origin to track the progress. The slot and origin will be removed once we finish the synchronization of the table. We also remove the slot and origin of tablesync workers if the user performs DROP SUBSCRIPTION .. or ALTER SUBSCRIPTION .. REFRESH and some of the table syncs are still not finished.

  The commands ALTER SUBSCRIPTION ... REFRESH PUBLICATION and ALTER SUBSCRIPTION ... SET PUBLICATION ... with refresh option as true cannot be executed inside a transaction block because they can now drop the slots, for which we have no provision to rollback.

  This will also open up the path for logical replication of 2PC transactions on the subscriber side. Previously, we couldn't do that because of the requirement of maintaining a single transaction in tablesync workers.

  Bump catalog version due to change of state in the catalog (pg_subscription_rel).

  Author: Peter Smith, Amit Kapila, and Takamichi Osumi
  Reviewed-by: Ajin Cherian, Petr Jelinek, Hou Zhijie and Amit Kapila
  Discussion: https://postgr.es/m/CAA4eK1KHJxaZS-fod-0fey=0tq3=Gkn4ho=8N4-5HWiCfu0H1A@mail.gmail.com

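  A sketch of the user-visible restriction described above; the subscription name is a placeholder:

      BEGIN;
      ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;   -- now fails: not allowed in a transaction block,
                                                      -- since it may create or drop replication slots
      ROLLBACK;
      ALTER SUBSCRIPTION mysub REFRESH PUBLICATION;   -- works outside a transaction block
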
* Remove obsolete IndexBulkDeleteResult stats field.  (Peter Geoghegan, 2021-02-11)

  The pages_removed field is no longer used for anything. It hasn't been possible for an index to physically shrink since old-style VACUUM FULL was removed by commit 0a469c87.

* Remove dead code in ECPGconnect(), and improve documentation.  (Tom Lane, 2021-02-11)

  The stanza in ECPGconnect() that intended to allow specification of a Unix socket directory path in place of a port has never executed since it was committed, nearly two decades ago; the preceding strrchr() already found the last colon so there cannot be another one. The lack of complaints about that is doubtless related to the fact that no user-facing documentation suggested it was possible.

  Rather than try to fix that up, let's just remove the unreachable code, and instead document the way that does work to write a socket directory path, namely specifying it as a "host" option.

  In support of that, make another pass at clarifying the syntax documentation for ECPG connection targets, particularly documenting which things are parsed as identifiers and where to use double quotes. Rearrange some things that seemed poorly ordered, and fix a couple of minor doc errors.

  Kyotaro Horiguchi, per gripe from Shenhao Wang (docs changes mostly by me)

  Discussion: https://postgr.es/m/ae52a416bbbf459c96bab30b3038e06c@G08CNEXMBPEKD06.g08.fujitsu.local

* Simplify jsonfuncs.c code by using strtoint() not strtol().  (Tom Lane, 2021-02-11)

  Explicitly testing for INT_MIN and INT_MAX isn't particularly good style; it's tedious and may draw useless compiler warnings on machines where int and long are the same width. We invented strtoint() precisely for this usage, so use that instead.

  While here, remove gratuitous variations in the way the tests for did-strtoint-succeed were spelled. Also, avoid attempting to negate INT_MIN; that would probably work given that the result is implicitly cast to uint32, but I think it's nominally undefined behavior.

  Per gripe from Ranier Vilela, though this isn't his proposed patch.

  Discussion: https://postgr.es/m/CAEudQAqge3QfzoBRhe59QrB_5g+NmQUj2QpzqZ9Nc7QepXGAEw@mail.gmail.com
