path: root/contrib/test_decoding
Commit message / Author / Age
...
* Fix the logical streaming test. (Amit Kapila, 2020-08-08)

Commit 7259736a6e added the capability to stream changes in ReorderBuffer, along with tests for the streaming mode. It is quite possible that while this test is running, a parallel transaction gets logged by autovacuum. Such a transaction won't perform any insert/update/delete on non-catalog tables, so it shows up as an empty transaction. Fix this by skipping empty transactions during the test.

Per report by buildfarm.
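A minimal sketch of the resulting test idiom, assuming test_decoding's skip-empty-xacts option (the slot name is illustrative):

    -- Fetch streamed changes while ignoring empty transactions, such as
    -- ones logged by a concurrent autovacuum.
    SELECT data
    FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL,
                                     'include-xids', '0',
                                     'skip-empty-xacts', '1',
                                     'stream-changes', '1');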
* Implement streaming mode in ReorderBuffer. (Amit Kapila, 2020-08-08)

Instead of serializing the transaction to disk after reaching the logical_decoding_work_mem limit in memory, we consume the changes we have in memory and invoke the stream API methods added by commit 45fdc9738b. However, if we have an incomplete toast or speculative insert, we still spill to disk because we can't generate and stream the complete tuple; as soon as we get the complete tuple, we stream the transaction, including the serialized changes.

We can do this incremental processing thanks to having assignments (associating subxacts with toplevel xacts) in WAL right away, and thanks to logging the invalidation messages at each command end. These features were added by commits 0bead9af48 and c55040ccd0 respectively.

Now that we can stream in-progress transactions, concurrent aborts may cause failures when the output plugin consults catalogs (both system and user-defined). We handle such failures by returning the ERRCODE_TRANSACTION_ROLLBACK sqlerrcode from system table scan APIs to the backend or walsender decoding a specific uncommitted transaction. On receipt of such a sqlerrcode, the decoding logic aborts the decoding of the current transaction and continues with the decoding of other transactions.

We have a ReorderBufferTXN pointer in each ReorderBufferChange by which we know which xact it belongs to. The output plugin can use this to decide which changes to discard in case of stream_abort_cb (e.g. when a subxact gets discarded).

We also provide a new option via the SQL APIs to fetch the changes being streamed.

Author: Dilip Kumar, Tomas Vondra, Amit Kapila, Nikhil Sontakke
Reviewed-by: Amit Kapila, Kuntal Ghosh, Ajin Cherian
Tested-by: Neha Sharma, Mahendra Singh Thalor and Ajin Cherian
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
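A sketch of the new SQL-level option, assuming test_decoding's stream-changes flag added by this commit (slot name illustrative; exact output strings may differ):

    -- Ask test_decoding to emit changes of large in-progress
    -- transactions in streamed blocks rather than only at commit.
    SELECT data
    FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL,
                                     'stream-changes', '1');
    -- Expected shape of the output: a line opening a streamed block for
    -- the transaction, per-change lines, then a line closing the block.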
* Extend the logical decoding output plugin API with stream methods. (Amit Kapila, 2020-07-28)

This adds seven methods to the output plugin API, adding support for streaming changes of large in-progress transactions:

* stream_start
* stream_stop
* stream_abort
* stream_commit
* stream_change
* stream_message
* stream_truncate

Most of this is a simple extension of the existing methods, with the semantic difference that the transaction (or subtransaction) is incomplete and may be aborted later (which is something the regular API does not really need to deal with).

This also extends the 'test_decoding' plugin, implementing these new stream methods. stream_start/stream_stop are used to demarcate a chunk of changes streamed for a particular toplevel transaction.

This commit simply adds these new APIs; the upcoming patch to allow streaming mode in ReorderBuffer will use them.

Author: Tomas Vondra, Dilip Kumar, Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Neha Sharma and Mahendra Singh Thalor
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
* Replace superuser check with ACLs for replication origin functions (Michael Paquier, 2020-06-14)

This patch removes the hardcoded check for superuser privileges when executing replication origin functions. Instead, execution is revoked from public, meaning that those functions can be executed by a superuser and that access to them can be granted.

Author: Martín Marqués
Reviewed-by: Kyotaro Horiguchi, Michael Paquier, Masahiko Sawada
Discussion: https://postgr.es/m/CAPdiE1xJMZOKQL3dgHMUrPqysZkgwzSMXETfKkHYnBAB7-0VRQ@mail.gmail.com
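A minimal sketch of what this enables, with an illustrative role name:

    -- Previously only superusers could call this; now access is
    -- grantable because the hardcoded check is gone.
    GRANT EXECUTE ON FUNCTION pg_replication_origin_create(text)
      TO regress_origin_user;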
* Propagate ALTER TABLE ... SET STORAGE to indexes (Peter Eisentraut, 2020-05-08)

When creating a new index, the attstorage setting of the table column is copied to regular (non-expression) index columns. But a later ALTER TABLE ... SET STORAGE is not propagated to indexes, thus creating an inconsistent and undumpable state.

Discussion: https://www.postgresql.org/message-id/flat/9765d72b-37c0-06f5-e349-2a580aafd989%402ndquadrant.com
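A sketch of the now-consistent behaviour (table and index names are illustrative):

    CREATE TABLE stor_tab (c text);
    CREATE INDEX stor_idx ON stor_tab (c);
    ALTER TABLE stor_tab ALTER COLUMN c SET STORAGE EXTERNAL;
    -- With this fix, both attributes should report attstorage = 'e'.
    SELECT attrelid::regclass, attname, attstorage
    FROM pg_attribute
    WHERE attrelid IN ('stor_tab'::regclass, 'stor_idx'::regclass)
      AND attnum > 0;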
* Check slot->restart_lsn validity in a few more places (Alvaro Herrera, 2020-04-28)

Lack of these checks could cause visible misbehavior, including assertion failures. This was missed in commit c6550776394e, whereby restart_lsn becomes invalid when the size limit is exceeded.

Also reword some existing error messages, and add errdetail(), so that the reported errors all match in spirit.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200408.093710.447591748588426656.horikyota.ntt@gmail.com
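A sketch of how an invalidated slot can be spotted, assuming restart_lsn reads as NULL once the slot exceeds max_slot_wal_keep_size:

    SELECT slot_name, restart_lsn
    FROM pg_replication_slots
    WHERE restart_lsn IS NULL;   -- slots no longer retaining WAL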
* Introduce xid8-based functions to replace txid_XXX. (Thomas Munro, 2020-04-07)

The txid_XXX family of fmgr functions exposes 64-bit transaction IDs to users as int8. Now that we have an SQL type xid8 for FullTransactionId, define a new set of functions based on it, including pg_current_xact_id() and pg_current_snapshot(). Keep the old functions around too, for now.

It's a bit sneaky to use the same C functions for both, but since the binary representation is identical except for the signedness of the type, since the older functions are the ones using the wrong signedness, and since we'll presumably drop the older ones after a reasonable period of time, it seems reasonable to switch to FullTransactionId internally and share the code for both.

Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Takao Fujii <btfujiitkp@oss.nttdata.com>
Reviewed-by: Yoshikazu Imai <imai.yoshikazu@fujitsu.com>
Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/20190725000636.666m5mad25wfbrri%40alap3.anarazel.de
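The new functions side by side with the ones they supersede:

    SELECT pg_current_xact_id();    -- xid8 counterpart of txid_current()
    SELECT pg_current_snapshot();   -- counterpart of txid_current_snapshot()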
* Remove header noise from test_decoding test (Alvaro Herrera, 2020-03-31)

Use psql's expanded output to avoid a pointless header.

Kyotaro Horiguchi, after an idea of Michael Paquier
Discussion: https://postgr.es/m/20181120050744.GJ4400@paquier.xyz
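A sketch of the idiom, assuming the test wraps a single-row query in expanded mode so psql prints field labels instead of a column-header line:

    \x on
    SELECT data FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);
    \x off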
* Stop demanding that top xact must be seen before subxact in decoding. (Amit Kapila, 2020-02-19)

Manifested as

    ERROR: subtransaction logged without previous top-level txn record

this check forbids legit behaviours like:

- The first xl_xact_assignment record is beyond reading, i.e. earlier than restart_lsn.
- After restart_lsn there is some change of a subxact.
- After that, there is a second xl_xact_assignment (for another subxact) revealing the relationship between the top xact and the first subxact.

Such a transaction won't be streamed anyway because we hadn't seen it in full. Saying for sure whether the xact of some record encountered after the snapshot was deserialized can be streamed or not requires knowing whether it wrote something before the deserialization point; if yes, it hasn't been seen in full and can't be decoded. The snapshot doesn't have such info, so there is no easy way to relax the check.

Reported-by: Hsu, John
Diagnosed-by: Arseny Sher
Author: Arseny Sher, Amit Kapila
Reviewed-by: Amit Kapila, Dilip Kumar
Backpatch-through: 9.5
Discussion: https://postgr.es/m/AB5978B2-1772-4FEE-A245-74C91704ECB0@amazon.com
* Clean up newlines following left parentheses (Alvaro Herrera, 2020-01-30)

We used to strategically place newlines after some function call left parentheses to make pgindent move the argument list a few chars to the left, so that the whole line would fit under 80 chars. However, pgindent no longer does that, so the newlines just made the code vertically longer for no reason. Remove those newlines, and reflow some of those lines for some extra naturality.

Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20200129200401.GA6303@alvherre.pgsql
* Update copyrights for 2020 (Bruce Momjian, 2020-01-01)

Backpatch-through: update all files in master, backpatch legal files through 9.4
* Add logical_decoding_work_mem to limit ReorderBuffer memory usage. (Amit Kapila, 2019-11-19)

Instead of deciding to serialize a transaction merely based on the number of changes in that xact (toplevel or subxact), this makes the decision based on the amount of memory consumed by the changes.

The memory limit is defined by a new logical_decoding_work_mem GUC, so for example we can do

    SET logical_decoding_work_mem = '128kB'

to reduce the memory usage of walsenders, or set a higher value to reduce disk writes. The minimum value is 64kB.

When adding a change to a transaction, we account for the size in two places. Firstly in the ReorderBuffer, which is then used to decide if we have reached the total memory limit. And secondly in the transaction the change belongs to, so that we can pick the largest transaction to evict (and serialize to disk).

We still use max_changes_in_memory when loading changes serialized to disk. The trouble is we can't use the memory limit directly, as there might be multiple subxacts serialized; we need to read all of them but we don't know how many there are (or which subxact to read first).

We do not serialize the ReorderBufferTXN entries, so if there is a transaction with many subxacts, most memory may be in this type of object. Those records are not included in the memory accounting.

We also do not account for INTERNAL_TUPLECID changes, which are kept in a separate list and not evicted from memory. Transactions with many CTID changes may consume significant amounts of memory, but we can't really do much about that.

The current eviction algorithm is very simple - the transaction is picked merely by size, while it might be useful to also consider the age (LSN) of the changes, for example. With the new Generational memory allocator, evicting the oldest changes would make it more likely that the memory actually gets pfreed.

logical_decoding_work_mem can be set in postgresql.conf, in which case it serves as the default for all publishers on that instance.

Author: Tomas Vondra, with changes by Dilip Kumar and Amit Kapila
Reviewed-by: Dilip Kumar and Amit Kapila
Tested-by: Vignesh C
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
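A sketch of the two ways to set it (the 64MB figure is an arbitrary illustrative value):

    SET logical_decoding_work_mem = '128kB';              -- per session
    ALTER SYSTEM SET logical_decoding_work_mem = '64MB';  -- instance default
    SELECT pg_reload_conf();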
* Message style fixes (Peter Eisentraut, 2019-09-23)
* Do more cleanup of isolation tests for test_decoding (Michael Paquier, 2019-08-24)

989d23b caused its tests to break, as the module defined unused steps, turning the buildfarm red.
* Fix inconsistencies and typos in the tree, take 9 (Michael Paquier, 2019-08-05)

This addresses more issues with code comments, variable names and unreferenced variables.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/7ab243e0-116d-3e44-d120-76b3df7abefd@gmail.com
* Use appendStringInfoString and appendPQExpBufferStr where possible (David Rowley, 2019-07-04)

This changes various places to use appendPQExpBufferStr where appendPQExpBuffer was used but not needed, and likewise for appendStringInfo and appendStringInfoString. This is really just a stylistic improvement, but there are also small performance gains to be had from doing this.

Discussion: http://postgr.es/m/CAKJS1f9P=M-3ULmPvr8iCno8yvfDViHibJjpriHU8+SXUgeZ=w@mail.gmail.com
* Fix regression tests to use only global names beginning with "regress_". (Tom Lane, 2019-06-29)

In commit 18555b132 we tentatively established a rule that regression tests should use names containing "regression" for databases, and names starting with "regress_" for all other globally-visible object names, so as to circumscribe the side-effects that "make installcheck" could have on an existing installation. However, no enforcement mechanism was created, so it's unsurprising that some new violations have crept in since then.

In fact, a whole new *category* of violations has crept in, to wit we now also have globally-visible subscription and replication origin names, and "make installcheck" could very easily clobber user-created objects of those types. So it's past time to do something about this.

This commit sanitizes the tests enough that they will pass (i.e. not generate any visible warnings) with the enforcement mechanism I'll add in the next commit. There are some TAP tests that still trigger the warnings, but the warnings do not cause test failure. Since these tests do not actually run against a pre-existing installation, there's no need to worry whether they could conflict with user-created objects.

The problem with rolenames.sql testing special role names like "user" is still there, and is dealt with only very cosmetically in this patch (by hiding the warnings :-(). What we actually need to do to be safe is to take that test script out of "make installcheck" altogether, but that seems like material for a separate patch.

Discussion: https://postgr.es/m/16638.1468620817@sss.pgh.pa.us
* Stop using spelling "nonexistant". (Noah Misch, 2019-06-08)

The documentation used "nonexistent" exclusively, and the source tree used it three times as often as "nonexistant".
* Phase 2 pgindent run for v12. (Tom Lane, 2019-05-22)

Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is.

Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
* Fix potential assertion failure when reindexing a pg_class index. (Andres Freund, 2019-04-29)

When reindexing individual indexes on pg_class it was possible to trigger an assertion failure:

    TRAP: FailedAssertion("!(!ReindexIsProcessingIndex(((index)->rd_id)))

That's because reindex_index() called SetReindexProcessing() - which enables an assertion ensuring no index insertions happen into the index - before calling RelationSetNewRelfilenode(). That is not correct for indexes on pg_class, because RelationSetNewRelfilenode() updates the relevant pg_class row, which needs to update the indexes.

There are two reasons this wasn't noticed earlier. Firstly, the bug doesn't trigger when reindexing all of pg_class, as reindex_relation has code "hiding" all yet-to-be-reindexed indexes. Secondly, the bug only triggers when the update to pg_class doesn't turn out to be a HOT update - otherwise there's no index insertion to trigger the bug. Most of the time there's enough space, making this bug hard to trigger.

To fix, move RelationSetNewRelfilenode() to before the SetReindexProcessing() (and, together with some other code, to outside of the PG_TRY()).

To make sure the error checking intended by SetReindexProcessing() is more robust, modify CatalogIndexInsert() to check ReindexIsProcessingIndex() even when the update is a HOT update.

Also add a few regression tests for REINDEXing of system catalogs.

The last two improvements would have prevented some of the issues fixed in 5c1560606dc4c from being introduced in the first place.

Reported-By: Michael Paquier
Diagnosed-By: Tom Lane and Andres Freund
Author: Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/20190418011430.GA19133@paquier.xyz
Backpatch: 9.4-, the bug is present in all branches
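A sketch of the case the new regression tests cover:

    -- Reindexing a single pg_class index previously risked the
    -- assertion failure; reindexing the whole table did not.
    REINDEX INDEX pg_class_oid_index;
    REINDEX TABLE pg_class;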
* Add facility to copy replication slots (Alvaro Herrera, 2019-04-05)

This allows the user to create duplicates of existing replication slots, either logical or physical, and even to change properties such as whether they are temporary or the output plugin used.

There are multiple uses for this, such as initializing multiple replicas using the slot for one base backup; when doing investigation of logical replication issues; and to select a different output plugin.

Author: Masahiko Sawada
Reviewed-by: Michael Paquier, Andres Freund, Petr Jelinek
Discussion: https://postgr.es/m/CAD21AoAm7XX8y_tOPP6j4Nzzch12FvA1wPqiO690RCk+uYVstg@mail.gmail.com
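A sketch of the new functions (slot names are illustrative):

    -- Duplicate a logical slot; optional arguments let the copy be
    -- temporary or use a different output plugin.
    SELECT pg_copy_logical_replication_slot('orig_slot', 'copy_slot');
    -- Physical slots can be copied as well.
    SELECT pg_copy_physical_replication_slot('phys_slot', 'phys_copy');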
* Relax overly strict assertion (Alvaro Herrera, 2019-02-12)

Ever since its birth, ReorderBufferBuildTupleCidHash() has contained an assertion that a catalog tuple cannot change Cmax after acquiring one. But that's wrong: if a subtransaction executes DDL that affects that catalog tuple, and later aborts and another DDL affects the same tuple, it will change Cmax. Relax the assertion to merely verify that the Cmax remains valid and monotonically increasing, instead.

Add a test that tickles the relevant code.

Diagnosed by, and initial patch submitted by: Arseny Sher
Co-authored-by: Arseny Sher
Discussion: https://postgr.es/m/874l9p8hyw.fsf@ars-thinkpad
* Update copyright for 2019 (Bruce Momjian, 2019-01-02)

Backpatch-through: certain files through 9.4
* Add PGXS options to control TAP and isolation tests, take two (Michael Paquier, 2018-12-03)

The following options are added for extensions:

- TAP_TESTS, to allow an extension to run TAP tests, which are the ones present in t/*.pl. A subset of tests can always be run with the existing PROVE_TESTS for developers.
- ISOLATION, to define a list of isolation tests.
- ISOLATION_OPTS, to pass custom options to isolation_tester.

A couple of custom Makefile rules have accumulated across the tree to cover the lack of facility in PGXS for a couple of releases when using those test suites, which are all now replaced with the new flags, without reducing the test coverage. Note that tests of contrib/bloom/ are not enabled yet, as those are proving unstable in the buildfarm.

Author: Michael Paquier
Reviewed-by: Adam Berlin, Álvaro Herrera, Tom Lane, Nikolay Shaplov, Arthur Zakirov
Discussion: https://postgr.es/m/20180906014849.GG2726@paquier.xyz
* Revert all new recent changes to add PGXS options for TAP and isolation (Michael Paquier, 2018-11-26)

A set of failures in buildfarm machines is proving that this is not quite ready yet because of another set of issues:

- MSVC scripts assume that REGRESS_OPTS can only use top_builddir. Some test suites actually finish by using top_srcdir, like pg_stat_statements, which causes the regression tests to never run.
- Trying to enforce top_builddir does not work either when using VPATH, as this is not recognized properly.
- TAP tests of bloom are unstable on various platforms, causing various failures.
* Fix regression test handling of test_decoding with MSVC (Michael Paquier, 2018-11-26)

The set of scripts in charge of running the regression tests for MSVC currently run under the assumption that only $(top_builddir) can be used in option values defined in REGRESS_OPTS, and those options need to have a specific format as well to be correctly parsed, so fix the Makefile values so that they are correctly set.

Per complaints from buildfarm members dory and whelk, with some extra testing done on my side with MSVC to check this patch.
* Add PGXS options to control TAP and isolation tests (Michael Paquier, 2018-11-26)

The following options are added for extensions:

- TAP_TESTS, to allow an extension to run TAP tests, which are the ones present in t/*.pl. A subset of tests can always be run with the existing PROVE_TESTS for developers.
- ISOLATION, to define a list of isolation tests.
- ISOLATION_OPTS, to pass custom options to isolation_tester.

A couple of custom Makefile targets have accumulated across the tree to cover the lack of facility in PGXS for a couple of releases when using those test suites, which are all now replaced with the new flags, without reducing the test coverage. This also fixes an issue with contrib/bloom/, which had a custom target of its own to trigger its TAP tests outside the main check runs.

Author: Michael Paquier
Reviewed-by: Adam Berlin, Álvaro Herrera, Tom Lane, Nikolay Shaplov, Arthur Zakirov
Discussion: https://postgr.es/m/20180906014849.GG2726@paquier.xyz
* Remove WITH OIDS support, change oid catalog column visibility. (Andres Freund, 2018-11-20)

Previously tables declared WITH OIDS, including a significant fraction of the catalog tables, stored the oid column not as a normal column, but as part of the tuple header. This special column was not shown by default, which was somewhat odd, as it's often (consider e.g. pg_class.oid) one of the more important parts of a row. Neither pg_dump nor COPY included the contents of the oid column by default.

The fact that the oid column was not an ordinary column necessitated a significant amount of special case code to support oid columns. That already was painful for the existing code, but upcoming work aiming to make table storage pluggable would have required expanding and duplicating that "specialness" significantly.

WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0). Remove it. Removing includes:

- CREATE TABLE and ALTER TABLE syntax for declaring the table to be WITH OIDS has been removed (WITH (oids[ = true]) will error out).
- pg_dump does not support dumping tables declared WITH OIDS and will issue a warning when dumping one (and ignore the oid column).
- Restoring a pg_dump archive with pg_restore will warn when restoring a table with oid contents (and ignore the oid column).
- COPY will refuse to load a binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH OIDS; they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.

The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false) for CREATE TABLE) is still supported. While that requires a bit of support code, it seems unnecessary to break applications / dumps that do not use oids, and are explicit about not using them.

The biggest user of WITH OID columns was postgres' catalog. This commit changes all 'magic' oid columns to be columns that are normally declared and stored. To reduce unnecessary query breakage, all the newly added columns are still named 'oid', even if a table's column naming scheme would indicate 'reloid' or such.

This obviously requires adapting a lot of code, mostly replacing oid access via HeapTupleGetOid() with access to the underlying Form_pg_*->oid column. The bootstrap process now assigns oids for all oid columns in genbki.pl that do not have an explicit value (starting at the largest oid previously used); only oids assigned later will be above FirstBootstrapObjectId. As the oid column now is a normal column, the special bootstrap syntax for oids has been removed.

Oids are not automatically assigned during insertion anymore; all backend code explicitly assigns oids with GetNewOidWithIndex(). For the rare case that insertions into the catalog via SQL are called for, the new pg_nextoid() function can be used (which only works on catalog tables).

The fact that oid columns on system tables are now normal columns means that they will be included in the set of columns expanded by * (i.e. SELECT * FROM pg_class will now include the table's oid; previously it did not). It'd not technically be hard to hide the oid column by default, but that'd mean confusing behavior would either have to be carried forward forever, or it'd cause breakage down the line.

While it's not unlikely that further adjustments are needed, the scope/invasiveness of the patch makes it worthwhile to merge this now. It's painful to maintain externally, too complicated to commit after the code freeze, and a dependency of a number of other patches.

Catversion bump, for obvious reasons.

Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
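A sketch of the user-visible change (the table name is illustrative):

    -- oid is now a regular, star-expanded column on catalogs:
    SELECT oid, relname FROM pg_class LIMIT 1;
    -- Declaring a table WITH OIDS now errors out:
    CREATE TABLE legacy_tab (id int) WITH (oids = true);  -- ERROR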
* Fix logical decoding error when system table w/ toast is repeatedly rewritten. (Andres Freund, 2018-10-10)

Repeatedly rewriting a mapped catalog table with VACUUM FULL or CLUSTER could cause logical decoding to fail with:

    ERROR: could not map filenode "%s" to relation OID

To trigger the problem the rewritten catalog had to have live tuples with toasted columns.

The problem was triggered because, during catalog table rewrites, the heap_insert() check that prevents logical decoding information from being emitted for system catalogs failed to treat the new heap's toast table as a system catalog (because the new heap is not recognized as a catalog table via RelationIsLogicallyLogged()). The relmapper, in contrast to the normal catalog contents, does not contain historical information. After a single rewrite of a mapped table the new relation is known to the relmapper, but if the table is rewritten twice before logical decoding occurs, the relfilenode cannot be mapped to a relation anymore, which then leads us to error out. This only happens for toast tables, because the main table contents aren't re-inserted with heap_insert().

The fix is simple: add a new heap_insert() flag that prevents logical decoding information from being emitted, and accept during decoding that there might not be tuple data for toast tables.

Unfortunately that does not fix pre-existing logical decoding errors. Doing so would require not throwing an error when a filenode cannot be mapped to a relation during decoding, and that seems too likely to hide bugs. If it's crucial to fix decoding for an existing slot, temporarily changing the ERROR in ReorderBufferCommit() to a WARNING appears to be the best fix.

Author: Andres Freund
Discussion: https://postgr.es/m/20180914021046.oi7dm4ra3ot2g2kt@alap3.anarazel.de
Backpatch: 9.4-, where logical decoding was introduced
* Force synchronous commit to be enabled for all test_decoding tests. (Andres Freund, 2018-10-10)

Without that, the tests fail when forced to run against a cluster with synchronous_commit = off (as the WAL might not yet be flushed to disk by the time logical decoding gets called, and thus the expected output breaks). Most tests already do that; add it to a few newer tests.

Author: Andres Freund
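The corresponding one-liner at the top of a test script (a minimal sketch):

    -- Ensure WAL is flushed before logical decoding reads it, regardless
    -- of the cluster-wide default.
    SET synchronous_commit = on;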
* Fix logical replication slot initialization (Alvaro Herrera, 2018-08-01)

This was broken in commit 9c7d06d60680, which inadvertently gave the wrong value to fast_forward in one StartupDecodingContext call. Fix by flipping the value. Add a test for the obvious error, namely trying to initialize a replication slot with a nonexistent output plugin.

While at it, move the CreateDecodingContext call earlier, so that any errors are reported before sending the CopyBoth message.

Author: Dave Cramer <davecramer@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CADK3HHLVkeRe1v4P02-5hj55H3_yJg3AEtpXyEY5T3wuzO2jSg@mail.gmail.com
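A sketch of the error case the new test exercises (the plugin name is deliberately bogus):

    -- Should fail cleanly with an error about the missing plugin library.
    SELECT pg_create_logical_replication_slot('regression_slot', 'no_such_plugin');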
* Block replication slot advance for those not yet reserving WAL (Michael Paquier, 2018-07-11)

Such replication slots are physical slots freshly created without WAL being reserved, which is the default behavior; not having been used yet, they retain no WAL. This prevents advancing a slot to a position older than any WAL available, which could falsify calculations for WAL segment recycling.

This also cleans up the code a bit, as ReplicationSlotRelease() would be called on ERROR, and improves error messages.

Reported-by: Kyotaro Horiguchi
Author: Michael Paquier
Reviewed-by: Andres Freund, Álvaro Herrera, Kyotaro Horiguchi
Discussion: https://postgr.es/m/20180626071305.GH31353@paquier.xyz
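A sketch of the now-blocked sequence (the slot name is illustrative):

    -- Created without immediately reserving WAL (the default), so the
    -- slot has no restart_lsn yet...
    SELECT pg_create_physical_replication_slot('phys_slot');
    -- ...and advancing it is now rejected instead of being allowed to
    -- point before any available WAL.
    SELECT pg_replication_slot_advance('phys_slot', pg_current_wal_lsn());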
* Reduce cost of test_decoding's new oldest_xmin test (Alvaro Herrera, 2018-07-05)

Change a whole-database VACUUM into doing just pg_attribute, which is the portion that verifies what we want it to do. The original formulation wastes a lot of CPU time, which leads the test to fail when runtime exceeds the isolationtester timeout when it's super-slow, such as under CLOBBER_CACHE_ALWAYS. Per buildfarm member friarbird.

It turns out that the previous shape of the test doesn't always detect the condition it is supposed to detect (on unpatched reorderbuffer code): the reason is that there is a good chance of encountering an xl_running_xacts record (logged every 15 seconds) before the checkpoint -- and because we advance the xmin when we receive that WAL record, and we *don't* advance the xmin twice consecutively without receiving a client message in between, the xmin is not advanced enough for the tuple to be pruned from pg_attribute by VACUUM. So the test would spuriously pass.

The reason this test deficiency wasn't detected earlier is that HOT pruning removes the tuple anyway, even if vacuum leaves it in place, so the test correctly fails (detecting the coding mistake), but for the wrong reason.

To fix this mess, run the s0_get_changes step twice before vacuum instead of once: this seems to cause the xmin to be advanced reliably, wreaking havoc with more certainty.

Author: Arseny Sher
Discussion: https://postgr.es/m/87h8lkuxoa.fsf@ars-thinkpad
* Fix "base" snapshot handling in logical decodingAlvaro Herrera2018-06-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two closely related bugs are fixed. First, xmin of logical slots was advanced too early. During xl_running_xacts processing, xmin of the slot was set to the oldest running xid in the record, but that's wrong: actually, snapshots which will be used for not-yet-replayed transactions might consider older txns as running too, so we need to keep xmin back for them. The problem wasn't noticed earlier because DDL which allows to delete tuple (set xmax) while some another not-yet-committed transaction looks at it is pretty rare, if not unique: e.g. all forms of ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock conflicting with any inserts. The included test case (test_decoding's oldest_xmin) uses ALTER of a composite type, which doesn't have such interlocking. To deal with this, we must be able to quickly retrieve oldest xmin (oldest running xid among all assigned snapshots) from ReorderBuffer. To fix, add another list of ReorderBufferTXNs to the reorderbuffer, where transactions are sorted by base-snapshot-LSN. This is slightly different from the existing (sorted by first-LSN) list, because a transaction can have an earlier LSN but a later Xmin, if its first record does not obtain an xmin (eg. xl_xact_assignment). Note this new list doesn't fully replace the existing txn list: we still need that one to prevent WAL recycling. The second issue concerns SnapBuilder snapshots and subtransactions. SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a transaction that is known to be a subtxn, which is good in the common case that the top-level transaction already has one (no point in doing so), but a bug otherwise. To fix, arrange to transfer the snapshot from the subtxn to its top-level txn as soon as the kinship gets known. test_decoding's snapshot_transfer verifies this. Also, fix a minor memory leak: refcount of toplevel's old base snapshot was not decremented when the snapshot is transferred from child. Liberally sprinkle code comments, and rewrite a few existing ones. This part is my (Álvaro's) contribution to this commit, as I had to write all those comments in order to understand the existing code and Arseny's patch. Reported-by: Arseny Sher <a.sher@postgrespro.ru> Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
* Post-feature-freeze pgindent run. (Tom Lane, 2018-04-26)

Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us
* Revert MERGE patch (Simon Riggs, 2018-04-12)

This reverts commits d204ef63776b8a00ca220adec23979091564e465, 83454e3c2b28141c0db01c7d2027e01040df5249 and a few more commits thereafter (complete list at the end) related to the MERGE feature.

While the feature was fully functional, with sufficient test coverage and necessary documentation, it was felt that some parts of the executor and parse-analyzer can use a different design and it wasn't possible to do that in the available time. So it was decided to revert the patch for PG11 and retry again in the future.

Thanks again to all reviewers and bug reporters.

List of commits reverted, in reverse chronological order:

f1464c5380 Improve parse representation for MERGE
ddb4158579 MERGE syntax diagram correction
530e69e59b Allow cpluspluscheck to pass by renaming variable
01b88b4df5 MERGE minor errata
3af7b2b0d4 MERGE fix variable warning in non-assert builds
a5d86181ec MERGE INSERT allows only one VALUES clause
4b2d44031f MERGE post-commit review
4923550c20 Tab completion for MERGE
aa3faa3c7a WITH support in MERGE
83454e3c2b New files for MERGE
d204ef6377 MERGE SQL Command following SQL:2016

Author: Pavan Deolasee
Reviewed-by: Michael Paquier
* Logical decoding of TRUNCATE (Peter Eisentraut, 2018-04-07)

Add a new WAL record type for TRUNCATE, which is only used when wal_level >= logical. (For physical replication, TRUNCATE is already replicated via SMGR records.) Add a new callback for logical decoding output plugins to receive TRUNCATE actions.

Author: Simon Riggs <simon@2ndquadrant.com>
Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
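A sketch of the decoded result, assuming wal_level = logical and a pre-existing test_decoding slot (names illustrative; exact output text may differ):

    TRUNCATE TABLE replicated_tab;
    SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);
    -- expected to include a line along the lines of:
    --   table public.replicated_tab: TRUNCATE: (no-flags)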
* MERGE SQL Command following SQL:2016 (Simon Riggs, 2018-04-03)

MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT, UPDATE or DELETE rows, a task that would otherwise require multiple PL statements. e.g.

    MERGE INTO target AS t
    USING source AS s
    ON t.tid = s.sid
    WHEN MATCHED AND t.balance > s.delta THEN
      UPDATE SET balance = t.balance - s.delta
    WHEN MATCHED THEN
      DELETE
    WHEN NOT MATCHED AND s.delta > 0 THEN
      INSERT VALUES (s.sid, s.delta)
    WHEN NOT MATCHED THEN
      DO NOTHING;

MERGE works with regular and partitioned tables, including column and row security enforcement, as well as support for row, statement and transition triggers.

MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE, since there is some overhead. MERGE can be used statically from PL/pgSQL. MERGE does not yet support inheritance, write rules, RETURNING clauses, updatable views or foreign tables. MERGE follows the SQL Standard per the most recent SQL:2016.

Includes full tests and documentation, including full isolation tests to demonstrate the concurrent behavior.

This version written from scratch in 2017 by Simon Riggs, using docs and tests originally written in 2009. Later work from Pavan Deolasee has been both complex and deep, leaving the lead author credit now in his hands. Extensive discussion of concurrency from Peter Geoghegan, with thanks for the time and effort contributed.

Various issues reported via sqlsmith by Andreas Seltenreich

Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
* Revert "Modified files for MERGE"Simon Riggs2018-04-02
| | | | This reverts commit 354f13855e6381d288dfaa52bcd4f2cb0fd4a5eb.
* Modified files for MERGE (Simon Riggs, 2018-04-02)
* Handle heap rewrites even better in logical decoding (Peter Eisentraut, 2018-03-21)

Logical decoding should not publish anything about tables created as part of a heap rewrite during DDL. Those tables don't exist externally, so consumers of logical decoding cannot do anything sensible with that information. In ab28feae2bd3d4629bd73ae3548e671c57d785f0, we worked around this for built-in logical replication, but that was a hack.

This is a more proper fix: we mark such transient heaps using the new field pg_class.relwrite, linking to the original relation OID. By default, we ignore them in logical decoding before they get to the output plugin. Optionally, a plugin can register its interest in getting such changes, if it handles DDL specially, in which case the new field will help it get information about the actual table.

Reviewed-by: Craig Ringer <craig@2ndquadrant.com>
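A sketch of how the new catalog field can be inspected (it is only nonzero while a transient rewrite heap exists):

    -- relwrite links a transient rewrite heap to the table being rewritten.
    SELECT relname, relwrite::regclass
    FROM pg_class
    WHERE relwrite <> 0;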
* test_decoding: Remove unused #include directives. (Robert Haas, 2018-03-07)

Euler Taveira

Discussion: http://postgr.es/m/CAHE3wghBwKoCmK_sRu4xUL7f-q3dVOSwqjnOkaGmvWAqWUKaSQ@mail.gmail.com
* Replace AclObjectKind with ObjectType (Peter Eisentraut, 2018-01-19)

AclObjectKind was basically just another enumeration for object types, and we already have a preferred one for that. It's only used in aclcheck_error. By using ObjectType instead, we can also give some more precise error messages, for example "index" instead of "relation".

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Ability to advance replication slots (Simon Riggs, 2018-01-17)

Ability to advance both physical and logical replication slots using a new user function pg_replication_slot_advance().

For logical advance, that means records are consumed as fast as possible and changes are not given to the output plugin for sending. This makes the second phase (after we have reached SNAPBUILD_FULL_SNAPSHOT) of replication slot creation faster, especially when there are big transactions, as the reorder buffer does not have to deal with data changes and does not have to spill to disk.

Author: Petr Jelinek
Reviewed-by: Simon Riggs
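A sketch of the new function (the slot name is illustrative):

    -- Fast-forward the slot to the current WAL position without
    -- decoding changes for any consumer.
    SELECT pg_replication_slot_advance('myslot', pg_current_wal_lsn());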
* Update copyright for 2018 (Bruce Momjian, 2018-01-02)

Backpatch-through: certain files through 9.3
* Fix crash when logical decoding is invoked from a PL function. (Tom Lane, 2017-10-06)

The logical decoding functions do BeginInternalSubTransaction and RollbackAndReleaseCurrentSubTransaction to clean up after themselves. It turns out that AtEOSubXact_SPI has an unrecognized assumption that we always need to cancel the active SPI operation in the SPI context that surrounds the subtransaction (if there is one). That's true when the RollbackAndReleaseCurrentSubTransaction call is coming from the SPI-using function itself, but not when it's happening inside some unrelated function invoked by a SPI query. In practice the affected callers are the various PLs.

To fix, record the current subtransaction ID when we begin a SPI operation, and clean up only if that ID is the subtransaction being canceled.

Also, remove AtEOSubXact_SPI's assertion that it must have cleaned up the surrounding SPI context's active tuptable. That's proven wrong by the same test case.

Also clarify (or, if you prefer, reinterpret) the calling conventions for _SPI_begin_call and _SPI_end_call. The memory context cleanup in the latter means that these have always had the flavor of a matched resource-management pair, but they weren't documented that way before.

Per report from Ben Chobot. Back-patch to 9.4 where logical decoding came in. In principle, the SPI changes should go all the way back, since the problem dates back to commit 7ec1c5a86. But given the lack of field complaints it seems few people are using internal subtransactions in this way. So I don't feel a need to take any risks in 9.2/9.3.

Discussion: https://postgr.es/m/73FBA179-C68C-4540-9473-71E865408B15@silentmedia.com
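A sketch of the pattern that used to crash: calling a logical decoding function from inside PL/pgSQL (all names illustrative):

    CREATE FUNCTION slot_changes() RETURNS SETOF text
    LANGUAGE plpgsql AS $$
    BEGIN
      -- The slot function opens and rolls back an internal
      -- subtransaction inside the PL's SPI context.
      RETURN QUERY
        SELECT data
        FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);
    END;
    $$;
    SELECT slot_changes();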
* Fix more user-visible elog() calls. (Robert Haas, 2017-10-05)

Michael Paquier discovered that this could be triggered via SQL; give a nicer message instead.

Patch by Michael Paquier, reviewed by Masahiko Sawada.

Discussion: http://postgr.es/m/CAB7nPqQtPg+LKKtzdKN26judHcvPZ0s1gNigzOT4j8CYuuuBYg@mail.gmail.com
* Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n). (Andres Freund, 2017-08-20)

This is a mechanical change in preparation for a later commit that will change the layout of TupleDesc. Introducing a macro to abstract the details of where attributes are stored will allow us to change that in a separate step and revise it in future.

Author: Thomas Munro, editorialized by Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
* Add regression test for wide REPLICA IDENTITY FULL updates. (Andres Freund, 2017-08-05)

This just contains the regression tests added by a fix for a 9.4-specific bug regarding $subject.

Author: Andres Freund
Backpatch: 9.5-
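A sketch of the kind of case exercised, assuming a row wide enough to be interesting (names and sizes illustrative):

    CREATE TABLE wide_tab (id int PRIMARY KEY, payload text);
    ALTER TABLE wide_tab REPLICA IDENTITY FULL;
    INSERT INTO wide_tab VALUES (1, repeat('x', 10000));
    -- With REPLICA IDENTITY FULL, the whole old row is logged on update.
    UPDATE wide_tab SET payload = repeat('y', 10000) WHERE id = 1;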
* Phase 3 of pgindent updates. (Tom Lane, 2017-06-21)

Don't move parenthesized lines to the left, even if that means they flow past the right margin.

By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit.

This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren.

This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us