aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
* Stamp 9.6.8.REL9_6_8Tom Lane2018-02-26
|
* Schema-qualify references in test_ddl_deparse test script.Tom Lane2018-02-26
| | | | | | This omission seems to be what is causing buildfarm failures on crake. Security: CVE-2018-1058
* Document security implications of search_path and the public schema.Noah Misch2018-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ability to create like-named objects in different schemas opens up the potential for users to change the behavior of other users' queries, maliciously or accidentally. When you connect to a PostgreSQL server, you should remove from your search_path any schema for which a user other than yourself or superusers holds the CREATE privilege. If you do not, other users holding CREATE privilege can redefine the behavior of your commands, causing them to perform arbitrary SQL statements under your identity. "SET search_path = ..." and "SELECT pg_catalog.set_config(...)" are not vulnerable to such hijacking, so one can use either as the first command of a session. As special exceptions, the following client applications behave as documented regardless of search_path settings and schema privileges: clusterdb createdb createlang createuser dropdb droplang dropuser ecpg (not programs it generates) initdb oid2name pg_archivecleanup pg_basebackup pg_config pg_controldata pg_ctl pg_dump pg_dumpall pg_isready pg_receivewal pg_recvlogical pg_resetwal pg_restore pg_rewind pg_standby pg_test_fsync pg_test_timing pg_upgrade pg_waldump reindexdb vacuumdb vacuumlo. Not included are core client programs that run user-specified SQL commands, namely psql and pgbench. PostgreSQL encourages non-core client applications to do likewise. Document this in the context of libpq connections, psql connections, dblink connections, ECPG connections, extension packaging, and schema usage patterns. The principal defense for applications is "SELECT pg_catalog.set_config('search_path', '', false)", and the principal defense for databases is "REVOKE CREATE ON SCHEMA public FROM PUBLIC". Either one is sufficient to prevent attack. After a REVOKE, consider auditing the public schema for objects named like pg_catalog objects. Authors of SECURITY DEFINER functions use some of the same defenses, and the CREATE FUNCTION reference page already covered them thoroughly. This is a good opportunity to audit SECURITY DEFINER functions for robust security practice. Back-patch to 9.3 (all supported versions). Reviewed by Michael Paquier and Jonathan S. Katz. Reported by Arseniy Sharoglazov. Security: CVE-2018-1058
* Empty search_path in Autovacuum and non-psql/pgbench clients.Noah Misch2018-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes the client programs behave as documented regardless of the connect-time search_path and regardless of user-created objects. Today, a malicious user with CREATE permission on a search_path schema can take control of certain of these clients' queries and invoke arbitrary SQL functions under the client identity, often a superuser. This is exploitable in the default configuration, where all users have CREATE privilege on schema "public". This changes behavior of user-defined code stored in the database, like pg_index.indexprs and pg_extension_config_dump(). If they reach code bearing unqualified names, "does not exist" or "no schema has been selected to create in" errors might appear. Users may fix such errors by schema-qualifying affected names. After upgrading, consider watching server logs for these errors. The --table arguments of src/bin/scripts clients have been lax; for example, "vacuumdb -Zt pg_am\;CHECKPOINT" performed a checkpoint. That now fails, but for now, "vacuumdb -Zt 'pg_am(amname);CHECKPOINT'" still performs a checkpoint. Back-patch to 9.3 (all supported versions). Reviewed by Tom Lane, though this fix strategy was not his first choice. Reported by Arseniy Sharoglazov. Security: CVE-2018-1058
* Avoid using unsafe search_path settings during dump and restore.Tom Lane2018-02-26
| | | | | | | | | | | | | | | | | | | | | | | | | Historically, pg_dump has "set search_path = foo, pg_catalog" when dumping an object in schema "foo", and has also caused that setting to be used while restoring the object. This is problematic because functions and operators in schema "foo" could capture references meant to refer to pg_catalog entries, both in the queries issued by pg_dump and those issued during the subsequent restore run. That could result in dump/restore misbehavior, or in privilege escalation if a nefarious user installs trojan-horse functions or operators. This patch changes pg_dump so that it does not change the search_path dynamically. The emitted restore script sets the search_path to what was used at dump time, and then leaves it alone thereafter. Created objects are placed in the correct schema, regardless of the active search_path, by dint of schema-qualifying their names in the CREATE commands, as well as in subsequent ALTER and ALTER-like commands. Since this change requires a change in the behavior of pg_restore when processing an archive file made according to this new convention, bump the archive file version number; old versions of pg_restore will therefore refuse to process files made with new versions of pg_dump. Security: CVE-2018-1058
* Translation updatesPeter Eisentraut2018-02-26
| | | | | Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: b97fb93ecf9b04faed73a68db33226f33ae3065e
* Synchronize doc/ copies of src/test/examples/.Noah Misch2018-02-23
| | | | | | | This is mostly cosmetic, but it might fix build failures, on some platform, when copying from the documentation. Back-patch to 9.3 (all supported versions).
* Fix planner failures with overlapping mergejoin clauses in an outer join.Tom Lane2018-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given overlapping or partially redundant join clauses, for example t1 JOIN t2 ON t1.a = t2.x AND t1.b = t2.x the planner's EquivalenceClass machinery will ordinarily refactor the clauses as "t1.a = t1.b AND t1.a = t2.x", so that join processing doesn't see multiple references to the same EquivalenceClass in a list of join equality clauses. However, if the join is outer, it's incorrect to derive a restriction clause on the outer side from the join conditions, so the clause refactoring does not happen and we end up with overlapping join conditions. The code that attempted to deal with such cases had several subtle bugs, which could result in "left and right pathkeys do not match in mergejoin" or "outer pathkeys do not match mergeclauses" planner errors, if the selected join plan type was a mergejoin. (It does not appear that any actually incorrect plan could have been emitted.) The core of the problem really was failure to recognize that the outer and inner relations' pathkeys have different relationships to the mergeclause list. A join's mergeclause list is constructed by reference to the outer pathkeys, so it will always be ordered the same as the outer pathkeys, but this cannot be presumed true for the inner pathkeys. If the inner sides of the mergeclauses contain multiple references to the same EquivalenceClass ({t2.x} in the above example) then a simplistic rendering of the required inner sort order is like "ORDER BY t2.x, t2.x", but the pathkey machinery recognizes that the second sort column is redundant and throws it away. The mergejoin planning code failed to account for that behavior properly. One error was to try to generate cut-down versions of the mergeclause list from cut-down versions of the inner pathkeys in the same way as the initial construction of the mergeclause list from the outer pathkeys was done; this could lead to choosing a mergeclause list that fails to match the outer pathkeys. The other problem was that the pathkey cross-checking code in create_mergejoin_plan treated the inner and outer pathkey lists identically, whereas actually the expectations for them must be different. That led to false "pathkeys do not match" failures in some cases, and in principle could have led to failure to detect bogus plans in other cases, though there is no indication that such bogus plans could be generated. Reported by Alexander Kuzmenkov, who also reviewed this patch. This has been broken for years (back to around 8.3 according to my testing), so back-patch to all supported branches. Discussion: https://postgr.es/m/5dad9160-4632-0e47-e120-8e2082000c01@postgrespro.ru
* Backport: Mark assorted GUC variables as PGDLLIMPORT.Andres Freund2018-02-22
| | | | | | | | | This backpatches 935dee9ad5a8d12f4d3b772a6e6c99d245e5ad44 to the the branches requested by extension authors. Original-Author: Metin Doslu Original-Committer: Robert Haas Author: Brian Cloutier
* Repair pg_upgrade's failure to preserve relfrozenxid for matviews.Tom Lane2018-02-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | This oversight led to data corruption in matviews, manifesting as "could not access status of transaction" before our most recent releases, and "found xmin from before relfrozenxid" errors since then. The proximate cause of the problem seems to have been confusion between the task of preserving dropped-column status and the task of preserving frozenxid status. Those are required for distinct sets of relkinds, and the reasoning was entirely undocumented in the source code. In hopes of forestalling future errors of the same kind, try to improve the commentary in this area. In passing, also improve the remarkably unhelpful comments around pg_upgrade's set_frozenxids(). That's not actually buggy AFAICS, but good luck figuring out what it does from the old comments. Per report from Claudio Freire. It appears that bug #14852 from Alexey Ermakov is an earlier report of the same issue, and there may be other cases that we failed to identify at the time. Patch by me based on analysis by Andres Freund. The bug dates back to the introduction of matviews, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAGTBQpbrY9CdRGGhyBZ9yqY4jWaGC85rUF4X+R7d-aim=mBNsw@mail.gmail.com Discussion: https://postgr.es/m/20171013115320.28049.86457@wrigleys.postgresql.org
* Fix misbehavior of CTE-used-in-a-subplan during EPQ rechecks.Tom Lane2018-02-19
| | | | | | | | | | | | | | | | | | An updating query that reads a CTE within an InitPlan or SubPlan could get incorrect results if it updates rows that are concurrently being modified. This is caused by CteScanNext supposing that nothing inside its recursive ExecProcNode call could change which read pointer is selected in the CTE's shared tuplestore. While that's normally true because of scoping considerations, it can break down if an EPQ plan tree gets built during the call, because EvalPlanQualStart builds execution trees for all subplans whether they're going to be used during the recheck or not. And it seems like a pretty shaky assumption anyway, so let's just reselect our own read pointer here. Per bug #14870 from Andrei Gorita. This has been broken since CTEs were implemented, so back-patch to all supported branches. Discussion: https://postgr.es/m/20171024155358.1471.82377@wrigleys.postgresql.org
* Fix broken logic for reporting PL/Python function names in errcontext.Tom Lane2018-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | plpython_error_callback() reported the name of the function associated with the topmost PL/Python execution context. This was not merely wrong if there were nested PL/Python contexts, but it risked a core dump if the topmost one is an inline code block rather than a named function. That will have proname = NULL, and so we were passing a NULL pointer to snprintf("%s"). It seems that none of the PL/Python-testing machines in the buildfarm will dump core for that, but some platforms do, as reported by Marina Polyakova. Investigation finds that there actually is an existing regression test that used to prove that the behavior was wrong, though apparently no one had noticed that it was printing the wrong function name. It stopped showing the problem in 9.6 when we adjusted psql to not print CONTEXT by default for NOTICE messages. The problem is masked (if your platform avoids the core dump) in error cases, because PL/Python will throw away the originally generated error info in favor of a new traceback produced at the outer level. Repair by using ErrorContextCallback.arg to pass the correct context to the error callback. Add a regression test illustrating correct behavior. Back-patch to all supported branches, since they're all broken this way. Discussion: https://postgr.es/m/156b989dbc6fe7c4d3223cf51da61195@postgrespro.ru
* Stamp 9.6.7.REL9_6_7Tom Lane2018-02-05
|
* Translation updatesPeter Eisentraut2018-02-05
| | | | | Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 8c0c2fe449f6c310f83664b946ba85309a69bdf6
* Ensure that all temp files made during pg_upgrade are non-world-readable.Tom Lane2018-02-05
| | | | | | | | | | | | | | | | | | | | | | | pg_upgrade has always attempted to ensure that the transient dump files it creates are inaccessible except to the owner. However, refactoring in commit 76a7650c4 broke that for the file containing "pg_dumpall -g" output; since then, that file was protected according to the process's default umask. Since that file may contain role passwords (hopefully encrypted, but passwords nonetheless), this is a particularly unfortunate oversight. Prudent users of pg_upgrade on multiuser systems would probably run it under a umask tight enough that the issue is moot, but perhaps some users are depending only on pg_upgrade's umask changes to protect their data. To fix this in a future-proof way, let's just tighten the umask at process start. There are no files pg_upgrade needs to write at a weaker security level; and if there were, transiently relaxing the umask around where they're created would be a safer approach. Report and patch by Tom Lane; the idea for the fix is due to Noah Misch. Back-patch to all supported branches. Security: CVE-2018-1053
* psql documentation fixesPeter Eisentraut2018-01-29
| | | | | | | Update the documentation for \pset to mention columns|linestyle|pager_min_lines. Author: Дилян Палаузов <dpa-postgres@aegee.org>
* Add stack-overflow guards in set-operation planning.Tom Lane2018-01-28
| | | | | | | | | | | | | | | | | | | | | | | create_plan_recurse lacked any stack depth check. This is not per our normal coding rules, but I'd supposed it was safe because earlier planner processing is more complex and presumably should eat more stack. But bug #15033 from Andrew Grossman shows this isn't true, at least not for queries having the form of a many-thousand-way INTERSECT stack. Further testing showed that recurse_set_operations is also capable of being crashed in this way, since it likewise will recurse to the bottom of a parsetree before calling any support functions that might themselves contain any stack checks. However, its stack consumption is only perhaps a third of create_plan_recurse's. It's possible that this particular problem with create_plan_recurse can only manifest in 9.6 and later, since before that we didn't build a Path tree for set operations. But having seen this example, I now have no faith in the proposition that create_plan_recurse doesn't need a stack check, so back-patch to all supported branches. Discussion: https://postgr.es/m/20180127050845.28812.58244@wrigleys.postgresql.org
* Update time zone data files to tzdata release 2018c.Tom Lane2018-01-27
| | | | | | DST law changes in Brazil, Sao Tome and Principe. Historical corrections for Bolivia, Japan, and South Sudan. The "US/Pacific-New" zone has been removed (it was only a link to America/Los_Angeles anyway).
* Teach reparameterize_path() to handle AppendPaths.Tom Lane2018-01-23
| | | | | | | | | | | | | | | | | | | | | | | If we're inside a lateral subquery, there may be no unparameterized paths for a particular child relation of an appendrel, in which case we *must* be able to create similarly-parameterized paths for each other child relation, else the planner will fail with "could not devise a query plan for the given query". This means that there are situations where we'd better be able to reparameterize at least one path for each child. This calls into question the assumption in reparameterize_path() that it can just punt if it feels like it. However, the only case that is known broken right now is where the child is itself an appendrel so that all its paths are AppendPaths. (I think possibly I disregarded that in the original coding on the theory that nested appendrels would get folded together --- but that only happens *after* reparameterize_path(), so it's not excused from handling a child AppendPath.) Given that this code's been like this since 9.3 when LATERAL was introduced, it seems likely we'd have heard of other cases by now if there were a larger problem. Per report from Elvis Pranskevichus. Back-patch to 9.3. Discussion: https://postgr.es/m/5981018.zdth1YWmNy@hammer.magicstack.net
* Report an ERROR if a parallel worker fails to start properly.Robert Haas2018-01-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 28724fd90d2f85a0573a8107b48abad062a86d83 fixed things so that if a background worker fails to start due to fork() failure or because it is terminated before startup succeeds, BGWH_STOPPED will be reported. However, that only helps if the code that uses the background worker machinery notices the change in status, and the code in parallel.c did not. To fix that, do two things. First, make sure that when a worker exits, it triggers the leader to read from error queues. That way, if a worker which has attached to an error queue exits uncleanly, the leader is sure to throw some error, either the contents of the ErrorResponse sent by the worker, or "lost connection to parallel worker" if it exited without sending one. To cover the case where the worker never starts up in the first place or exits before attaching to the error queue, the ParallelContext now keeps track of which workers have sent at least one message via the error queue. A worker which sends no messages by the time the parallel operation finishes will be checked to see whether it exited before attaching to the error queue; if so, a new error message, "parallel worker failed to initialize", will be reported. If not, we'll continue to wait until it either starts up and exits cleanly, starts up and exits uncleanly, or fails to start, and then take the appropriate action. Patch by me, reviewed by Amit Kapila. Discussion: http://postgr.es/m/CA+TgmoYnBgXgdTu6wk5YPdWhmgabYc9nY_pFLq=tB=FSLYkD8Q@mail.gmail.com
* Make pg_dump's ACL, sec label, and comment entries reliably identifiable.Tom Lane2018-01-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | _tocEntryRequired() expects that it can identify ACL, SECURITY LABEL, and COMMENT TOC entries that are for large objects by seeing whether the tag for them starts with "LARGE OBJECT ". While that works fine for actual large objects, which are indeed tagged that way, it's subject to false positives unless every such entry's tag starts with an appropriate type ID. And in fact it does not work for ACLs, because up to now we customarily tagged those entries with just the bare name of the object. This means that an ACL for an object named "LARGE OBJECT something" would be misclassified as data not schema, with undesirable results in a schema-only or data-only dump --- although pg_upgrade seems unaffected, due to the special case for binary-upgrade mode further down in _tocEntryRequired(). We can fix this by changing all the dumpACL calls to use the label strings already in use for comments and security labels, which do follow the convention of starting with an object type indicator. Well, mostly they follow it. dumpDatabase() got it wrong, using just the bare database name for those purposes, so that a database named "LARGE OBJECT something" would similarly be subject to having its comment or security label dropped or included when not wanted. Bring that into line too. (Note that up to now, database ACLs have not been processed by pg_dump, so that this issue doesn't affect them.) _tocEntryRequired() itself is not free of fault: it was overly liberal about matching object tags to "LARGE OBJECT " in binary-upgrade mode. This looks like it is probably harmless because there would be no data component to strip anyway in that mode, but at best it's trouble waiting to happen, so tighten that up too. The possible misclassification of SECURITY LABEL entries for databases is in principle a security problem, but the opportunities for actual exploits seem too narrow to be interesting. The other cases seem like just bugs, since an object owner can change its ACL or comment for himself, he needn't try to trick someone else into doing it by choosing a strange name. This has been broken since per-large-object TOC entries were introduced in 9.0, so back-patch to all supported branches. Discussion: https://postgr.es/m/21714.1516553459@sss.pgh.pa.us
* Fix StoreCatalogInheritance1 to use 32bit inhseqnoAlvaro Herrera2018-01-19
| | | | | | | | | | | | | | | | | | | | For no apparent reason, this function was using a 16bit-wide inhseqno value, rather than the correct 32 bit width which is what is stored in the pg_inherits catalog. This becomes evident if you try to create a table with more than 65535 parents, because this error appears: ERROR: duplicate key value violates unique constraint «pg_inherits_relid_seqno_index» DETAIL: Key (inhrelid, inhseqno)=(329371, 0) already exists. Needless to say, having so many parents is an uncommon situations, which explains why this error has never been reported despite being having been introduced with the Postgres95 1.01 sources in commit d31084e9d111: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/creatinh.c;hb=d31084e9d111#l349 Backpatch all the way back. David Rowley noticed this while reviewing a patch of mine. Discussion: https://postgr.es/m/CAKJS1f8Dn7swSEhOWwzZzssW7747YB=2Hi+T7uGud40dur69-g@mail.gmail.com
* Cope with indicator arrays that do not have the correct length.Michael Meskes2018-01-15
| | | | Patch by: "Rader, David" <davidr@openscg.com>
* Avoid unnecessary failure in SELECT concurrent with ALTER NO INHERIT.Tom Lane2018-01-12
| | | | | | | | | | | | | | | | | | | | | | If a query against an inheritance tree runs concurrently with an ALTER TABLE that's disinheriting one of the tree members, it's possible to get a "could not find inherited attribute" error because after obtaining lock on the removed member, make_inh_translation_list sees that its columns have attinhcount=0 and decides they aren't the columns it's looking for. An ideal fix, perhaps, would avoid including such a just-removed member table in the query at all; but there seems no way to accomplish that without adding expensive catalog rechecks or creating a likelihood of deadlocks. Instead, let's just drop the check on attinhcount. In this way, a query that's included a just-disinherited child will still succeed, which is not a completely unreasonable behavior. This problem has existed for a long time, so back-patch to all supported branches. Also add an isolation test verifying related behaviors. Patch by me; the new isolation test is based on Kyotaro Horiguchi's work. Discussion: https://postgr.es/m/20170626.174612.23936762.horiguchi.kyotaro@lab.ntt.co.jp
* Fix incorrect handling of subquery pullup in the presence of grouping sets.Tom Lane2018-01-12
| | | | | | | | | | | | | | | | | | | | | | | | | If we flatten a subquery whose target list contains constants or expressions, when those output columns are used in GROUPING SET columns, the planner was capable of doing the wrong thing by merging a pulled-up expression into the surrounding expression during const-simplification. Then the late processing that attempts to match subexpressions to grouping sets would fail to match those subexpressions to grouping sets, with the effect that they'd not go to null when expected. To fix, wrap such subquery outputs in PlaceHolderVars, ensuring that they preserve their separate identity throughout the planner's expression processing. This is a bit of a band-aid, because the wrapper defeats const-simplification even in places where it would be safe to allow. But a nicer fix would likely be too invasive to back-patch, and the consequences of the missed optimizations probably aren't large in most cases. Back-patch to 9.5 where grouping sets were introduced. Heikki Linnakangas, with small mods and better test cases by me; additional review by Andrew Gierth Discussion: https://postgr.es/m/7dbdcf5c-b5a6-ef89-4958-da212fe10176@iki.fi
* Remove dubious micro-optimization in ckpt_buforder_comparator().Tom Lane2018-01-10
| | | | | | | | | | | | | | | | | | | | | | | | | | It seems incorrect to assume that the list of CkptSortItems can never contain duplicate page numbers: concurrent activity could result in some page getting dropped from a low-numbered buffer and later loaded into a high-numbered buffer while BufferSync is scanning the buffer pool. If that happened, the comparator would give self-inconsistent results, potentially confusing qsort(). Saving one comparison step is not worth possibly getting the sort wrong. So far as I can tell, nothing would actually go wrong given our current implementation of qsort(). It might get a bit slower than expected if there were a large number of duplicates of one value, but that's surely a probability-epsilon case. Still, the comment is wrong, and if we ever switched to another sort implementation it might be less forgiving. In passing, avoid casting away const-ness of the argument pointers; I've not seen any compiler complaints from that, but it seems likely that some compilers would not like it. Back-patch to 9.6 where this code came in, just in case I've underestimated the possible consequences. Discussion: https://postgr.es/m/18437.1515607610@sss.pgh.pa.us
* Change some bogus PageGetLSN calls to BufferGetLSNAtomicAlvaro Herrera2018-01-09
| | | | | | | | | | | | | | | | | | | | | | | | | | As src/backend/access/transam/README says, PageGetLSN may only be called by processes holding either exclusive lock on buffer, or a shared lock on buffer plus buffer header lock. Therefore any place that only holds a shared buffer lock must use BufferGetLSNAtomic instead of PageGetLSN, which internally obtains buffer header lock prior to reading the LSN. A few callsites failed to comply with this rule. This was detected by running all tests under a new (not committed) assertion that verifies PageGetLSN locking contract. All but one of the callsites that failed the assertion are fixed by this patch. Remaining callsites were inspected manually and determined not to need any change. The exception (unfixed callsite) is in TestForOldSnapshot, which only has a Page argument, making it impossible to access the corresponding Buffer from it. Fixing that seems a much larger patch that will have to be done separately; and that's just as well, since it was only introduced in 9.6 and other bugs are much older. Some of these bugs are ancient; backpatch all the way back to 9.3. Authors: Jacob Champion, Asim Praveen, Ashwin Agrawal Reviewed-by: Michaël Paquier Discussion: https://postgr.es/m/CABAq_6GXgQDVu3u12mK9O5Xt5abBZWQ0V40LZCE+oUf95XyNFg@mail.gmail.com
* pg_upgrade: simplify code layout in a few placesBruce Momjian2018-01-05
| | | | Backpatch-through: 9.4 (9.3 didn't need improving)
* Fix failure to delete spill files of aborted transactionsAlvaro Herrera2018-01-05
| | | | | | | | | | | | | | | | | | Logical decoding's reorderbuffer.c may spill transaction files to disk when transactions are large. These are supposed to be removed when they become "too old" by xid; but file removal requires the boundary LSNs of the transaction to be known. The final_lsn is only set when we see the commit or abort record for the transaction, but nothing sets the value for transactions that crash, so the removal code misbehaves -- in assertion-enabled builds, it crashes by a failed assertion. To fix, modify the final_lsn of transactions that don't have a value set, to the LSN of the very latest change in the transaction. This causes the spilled files to be removed appropriately. Author: Atsushi Torikoshi Reviewed-by: Kyotaro HORIGUCHI, Craig Ringer, Masahiko Sawada Discussion: https://postgr.es/m/54e4e488-186b-a056-6628-50628e4e4ebc@lab.ntt.co.jp
* Back-port fix for accumulation of parallel worker instrumentation.Robert Haas2018-01-04
| | | | | | | | | | | | When a Gather or Gather Merge node is started and stopped multiple times, accumulate instrumentation data only once, at the end, instead of after each execution, to avoid recording inflated totals. This is a back-port of commit 8526bcb2df76d5171b4f4d6dc7a97560a73a5eff by Amit Kapila. Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com Discussion: http://postgr.es/m/CAA4eK1KT3BYj50qWhK5qBF=LDzQCoUVSFZjcK3mHoJJeWA+fNA@mail.gmail.com
* Revert "Fix isolation test to be less timing-dependent"Alvaro Herrera2018-01-03
| | | | | | | | This reverts commit 2268e6afd596. It turned out that inconsistency in the report is still possible, so go back to the simpler formulation of the test and instead add an alternate expected output. Discussion: https://postgr.es/m/20180103193728.ysqpcp2xjnqpiep7@alvherre.pgsql
* Rename pg_rewind's copy_file_range() to avoid conflict with new linux syscall.Andres Freund2018-01-03
| | | | | | | | | | | | | | | | | Upcoming versions of glibc will contain copy_file_range(2), a wrapper around a new linux syscall for in-kernel copying of data ranges. This conflicts with pg_rewinds function of the same name. Therefore rename pg_rewinds version. As our version isn't a generic copying facility we decided to choose a rewind specific function name. Per buildfarm animal caiman and subsequent discussion with Tom Lane. Author: Andres Freund Discussion: https://postgr.es/m/20180103033425.w7jkljth3e26sduc@alap3.anarazel.de https://postgr.es/m/31122.1514951044@sss.pgh.pa.us Backpatch: 9.5-, where pg_rewind was introduced
* Fix use of config-specific libraries for Windows OpenSSLAndrew Dunstan2018-01-03
| | | | | | | | | | | Commit 614350a3 allowed for an different builds of OpenSSL libraries on Windows, but ignored the fact that the alternative builds don't have config-specific libraries. This patch fixes the Solution file to ask for the correct libraries. per offline discussions with Leonardo Cecchi and Marco Nenciarini, Backpatch to all live branches.
* Make XactLockTableWait work for transactions that are not yet self-lockedAlvaro Herrera2018-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | XactLockTableWait assumed that its xid argument has already added itself to the lock table. That assumption led to another assumption that if locking the xid has succeeded but the xid is reported as still in progress, then the input xid must have been a subtransaction. These assumptions hold true for the original uses of this code in locking related to on-disk tuples, but they break down in logical replication slot snapshot building -- in particular, when a standby snapshot logged contains an xid that's already in ProcArray but not yet in the lock table. This leads to assertion failures that can be reproduced all the way back to 9.4, when logical decoding was introduced. To fix, change SubTransGetParent to SubTransGetTopmostTransaction which has a slightly different API: it returns the argument Xid if there is no parent, and it goes all the way to the top instead of moving up the levels one by one. Also, to avoid busy-waiting, add a 1ms sleep to give the other process time to register itself in the lock table. For consistency, change ConditionalXactLockTableWait the same way. Author: Petr Jelínek Discussion: https://postgr.es/m/1B3E32D8-FCF4-40B4-AEF9-5C0E3AC57969@postgrespro.ru Reported-by: Konstantin Knizhnik Diagnosed-by: Stas Kelvich, Petr Jelínek Reviewed-by: Andres Freund, Robert Haas
* Fix isolation test to be less timing-dependentAlvaro Herrera2018-01-03
| | | | | | | | I did this by adding another locking process, which makes the other two wait. This way the output should be stable enough. Per buildfarm and Andres Freund Discussion: https://postgr.es/m/20180103034445.t3utrtrnrevfsghm@alap3.anarazel.de
* Fix deadlock hazard in CREATE INDEX CONCURRENTLYAlvaro Herrera2018-01-02
| | | | | | | | | | | | | | | | | | | | | | Multiple sessions doing CREATE INDEX CONCURRENTLY simultaneously are supposed to be able to work in parallel, as evidenced by fixes in commit c3d09b3bd23f specifically to support this case. In reality, one of the sessions would be aborted by a misterious "deadlock detected" error. Jeff Janes diagnosed that this is because of leftover snapshots used for system catalog scans -- this was broken by 8aa3e47510b9 keeping track of (registering) the catalog snapshot. To fix the deadlocks, it's enough to de-register that snapshot prior to waiting. Backpatch to 9.4, which introduced MVCC catalog scans. Include an isolationtester spec that 8 out of 10 times reproduces the deadlock with the unpatched code for me (Álvaro). Author: Jeff Janes Diagnosed-by: Jeff Janes Reported-by: Jeremy Finzel Discussion: https://postgr.es/m/CAMa1XUhHjCv8Qkx0WOr1Mpm_R4qxN26EibwCrj0Oor2YBUFUTg%40mail.gmail.com
* In tests, await an LSN no later than the recovery target.Noah Misch2017-12-31
| | | | | | | | | | Otherwise, the test fails with "Timed out while waiting for standby to catch up". This happened rarely, perhaps only when autovacuum wrote WAL between our choosing the recovery target and choosing the LSN to await. Commit b26f7fa6ae2b4e5d64525b3d5bc66a0ddccd9e24 fixed one case of this. Fix two more. Back-patch to 9.6, which introduced the affected test. Discussion: https://postgr.es/m/20180101055227.GA2952815@rfd.leadboat.com
* Disallow UNION/INTERSECT/EXCEPT over no columns.Tom Lane2017-12-22
| | | | | | | | | | | | | | Since 9.4, we've allowed the syntax "select union select" and variants of that. However, the planner wasn't expecting a no-column set operation and ended up treating the set operation as if it were UNION ALL. Pre-v10, there seem to be some executor issues that would need to be fixed to support such cases, and it doesn't really seem worth expending much effort on. Just disallow it, instead. Per report from Victor Yegorov. Discussion: https://postgr.es/m/CAGnEbojGJrRSOgJwNGM7JSJZpVAf8xXcVPbVrGdhbVEHZ-BUMw@mail.gmail.com
* Fix bug in cancellation of non-exclusive backup to avoid assertion failure.Fujii Masao2017-12-19
| | | | | | | | | | | | | | | | | | | | | | | | Previously an assertion failure occurred when pg_stop_backup() for non-exclusive backup was aborted while it's waiting for WAL files to be archived. This assertion failure happened in do_pg_abort_backup() which was called when a non-exclusive backup was canceled. do_pg_abort_backup() assumes that there is at least one non-exclusive backup running when it's called. But pg_stop_backup() can be canceled even after it marks the end of non-exclusive backup (e.g., during waiting for WAL archiving). This broke the assumption that do_pg_abort_backup() relies on, and which caused an assertion failure. This commit changes do_pg_abort_backup() so that it does nothing when non-exclusive backup has been already marked as completed. That is, the asssumption is also changed, and do_pg_abort_backup() now can handle even the case where it's called when there is no running backup. Backpatch to 9.6 where SQL-callable non-exclusive backup was added. Author: Masahiko Sawada and Michael Paquier Reviewed-By: Robert Haas and Fujii Masao Discussion: https://www.postgresql.org/message-id/CAD21AoD2L1Fu2c==gnVASMyFAAaq3y-AQ2uEVj-zTCGFFjvmDg@mail.gmail.com
* Perform a lot more sanity checks when freezing tuples.Andres Freund2017-12-14
| | | | | | | | | | | | | | | | The previous commit has shown that the sanity checks around freezing aren't strong enough. Strengthening them seems especially important because the existance of the bug has caused corruption that we don't want to make even worse during future vacuum cycles. The errors are emitted with ereport rather than elog, despite being "should never happen" messages, so a proper error code is emitted. To avoid superflous translations, mark messages as internal. Author: Andres Freund and Alvaro Herrera Reviewed-By: Alvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-
* Fix pruning of locked and updated tuples.Andres Freund2017-12-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously it was possible that a tuple was not pruned during vacuum, even though its update xmax (i.e. the updating xid in a multixact with both key share lockers and an updater) was below the cutoff horizon. As the freezing code assumed, rightly so, that that's not supposed to happen, xmax would be preserved (as a member of a new multixact or xmax directly). That causes two problems: For one the tuple is below the xmin horizon, which can cause problems if the clog is truncated or once there's an xid wraparound. The bigger problem is that that will break HOT chains, which in turn can lead two to breakages: First, failing index lookups, which in turn can e.g lead to constraints being violated. Second, future hot prunes / vacuums can end up making invisible tuples visible again. There's other harmful scenarios. Fix the problem by recognizing that tuples can be DEAD instead of RECENTLY_DEAD, even if the multixactid has alive members, if the update_xid is below the xmin horizon. That's safe because newer versions of the tuple will contain the locking xids. A followup commit will harden the code somewhat against future similar bugs and already corrupted data. Author: Andres Freund, with changes by Alvaro Herrera Reported-By: Daniel Wood Analyzed-By: Andres Freund, Alvaro Herrera, Robert Haas, Peter Geoghegan, Daniel Wood, Yi Wen Wong, Michael Paquier Reviewed-By: Alvaro Herrera, Robert Haas, Michael Paquier Discussion: https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-
* Fix walsender timeouts when decoding a large transactionAndrew Dunstan2017-12-14
| | | | | | | | | | | | | | | | | | | | | | | The logical slots have a fast code path for sending data so as not to impose too high a per message overhead. The fast path skips checks for interrupts and timeouts. However, the existing coding failed to consider the fact that a transaction with a large number of changes may take a very long time to be processed and sent to the client. This causes the walsender to ignore interrupts for potentially a long time and more importantly it will result in the walsender being killed due to timeout at the end of such a transaction. This commit changes the fast path to also check for interrupts and only allows calling the fast path when the last keepalive check happened less than half the walsender timeout ago. Otherwise the slower code path will be taken. Backpatched to 9.4 Petr Jelinek, reviewed by Kyotaro HORIGUCHI, Yura Sokolov, Craig Ringer and Robert Haas. Discussion: https://postgr.es/m/e082a56a-fd95-a250-3bae-0fff93832510@2ndquadrant.com
* Fix corner-case coredump in _SPI_error_callback().Tom Lane2017-12-11
| | | | | | | | | | | I noticed that _SPI_execute_plan initially sets spierrcontext.arg = NULL, and only fills it in some time later. If an error were to happen in between, _SPI_error_callback would try to dereference the null pointer. This is unlikely --- there's not much between those points except push-snapshot calls --- but it's clearly not impossible. Tweak the callback to do nothing if the pointer isn't set yet. It's been like this for awhile, so back-patch to all supported branches.
* MSVC 2012+: Permit linking to 32-bit, MinGW-built libraries.Noah Misch2017-12-09
| | | | | | | | | | | | | | | | | | Notably, this permits linking to the 32-bit Perl binaries advertised on perl.org, namely Strawberry Perl and ActivePerl. This has a side effect of permitting linking to binaries built with obsolete MSVC versions. By default, MSVC 2012 and later require a "safe exception handler table" in each binary. MinGW-built, 32-bit DLLs lack the relevant exception handler metadata, so linking to them failed with error LNK2026. Restore the semantics of MSVC 2010, which omits the table from a given binary if some linker input lacks metadata. This has no effect on 64-bit builds or on MSVC 2010 and earlier. Back-patch to 9.3 (all supported versions). Reported by Victor Wagner. Discussion: https://postgr.es/m/20160326154321.7754ab8f@wagner.wagner.home
* MSVC: Test whether 32-bit Perl needs -D_USE_32BIT_TIME_T.Noah Misch2017-12-08
| | | | | | | | | | | | | | | | | | | | | Commits 5a5c2feca3fd858e70ea348822595547e6fa6c15 and b5178c5d08ca59e30f9d9428fa6fdb2741794e65 introduced support for modern MSVC-built, 32-bit Perl, but they broke use of MinGW-built, 32-bit Perl distributions like Strawberry Perl and modern ActivePerl. Perl has no robust means to report whether it expects a -D_USE_32BIT_TIME_T ABI, so test this. Back-patch to 9.3 (all supported versions). The chief alternative was a heuristic of adding -D_USE_32BIT_TIME_T when $Config{gccversion} is nonempty. That banks on every gcc-built Perl using the same ABI. gcc could change its default ABI the way MSVC once did, and one could build Perl with gcc and the non-default ABI. The GNU make build system could benefit from a similar test, without which it does not support MSVC-built Perl. For now, just add a comment. Most users taking the special step of building Perl with MSVC probably build PostgreSQL with MSVC. Discussion: https://postgr.es/m/20171130041441.GA3161526@rfd.leadboat.com
* MSVC: Remove cosmetic, cross-branch differences pertaining to Perl.Noah Misch2017-12-08
| | | | This simplifies back-patch of the next change to v9.5 and v9.6.
* Fix mistake in commentPeter Eisentraut2017-12-08
| | | | Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
* Report failure to start a background worker.Robert Haas2017-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a worker is flagged as BGW_NEVER_RESTART and we fail to start it, or if it is not marked BGW_NEVER_RESTART but is terminated before startup succeeds, what BgwHandleStatus should be reported? The previous code really hadn't considered this possibility (as indicated by the comments which ignore it completely) and would typically return BGWH_NOT_YET_STARTED, but that's not a good answer, because then there's no way for code using GetBackgroundWorkerPid() to tell the difference between a worker that has not started but will start later and a worker that has not started and will never be started. So, when this case happens, return BGWH_STOPPED instead. Update the comments to reflect this. The preceding fix by itself is insufficient to fix the problem, because the old code also didn't send a notification to the process identified in bgw_notify_pid when startup failed. That might've been technically correct under the theory that the status of the worker was BGWH_NOT_YET_STARTED, because the status would indeed not change when the worker failed to start, but now that we're more usefully reporting BGWH_STOPPED, a notification is needed. Without these fixes, code which starts background workers and then uses the recommended APIs to wait for those background workers to start would hang indefinitely if the postmaster failed to fork a worker. Amit Kapila and Robert Haas Discussion: http://postgr.es/m/CAA4eK1KDfKkvrjxsKJi3WPyceVi3dH1VCkbTJji2fuwKuB=3uw@mail.gmail.com
* Mark assorted variables PGDLLIMPORT.Robert Haas2017-12-05
| | | | | | | | | This makes life easier for extension authors who wish to support Windows. Brian Cloutier, slightly amended by me. Discussion: http://postgr.es/m/CAJCy68fscdNhmzFPS4kyO00CADkvXvEa-28H-OtENk-pa2OTWw@mail.gmail.com
* Clean up assorted messiness around AllocateDir() usage.Tom Lane2017-12-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a couple of low-probability bugs that could lead to reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE) concerning directory-open or file-open failures. It also fixes places where we took shortcuts in reporting such errors, either by using elog instead of ereport or by using ereport but forgetting to specify an errcode. And it eliminates a lot of just plain redundant error-handling code. In service of all this, export fd.c's formerly-static function ReadDirExtended, so that external callers can make use of the coding pattern dir = AllocateDir(path); while ((de = ReadDirExtended(dir, path, LOG)) != NULL) if they'd like to treat directory-open failures as mere LOG conditions rather than errors. Also fix FreeDir to be a no-op if we reach it with dir == NULL, as such a coding pattern would cause. Then, remove code at many call sites that was throwing an error or log message for AllocateDir failure, as ReadDir or ReadDirExtended can handle that job just fine. Aside from being a net code savings, this gets rid of a lot of not-quite-up-to-snuff reports, as mentioned above. (In some places these changes result in replacing a custom error message such as "could not open tablespace directory" with more generic wording "could not open directory", but it was agreed that the custom wording buys little as long as we report the directory name.) In some other call sites where we can't just remove code, change the error reports to be fully project-style-compliant. Also reorder code in restoreTwoPhaseData that was acquiring a lock between AllocateDir and ReadDir; in the unlikely but surely not impossible case that LWLockAcquire changes errno, AllocateDir failures would be misreported. There is no great value in opening the directory before acquiring TwoPhaseStateLock, so just do it in the other order. Also fix CheckXLogRemoved to guarantee that it preserves errno, as quite a number of call sites are implicitly assuming. (Again, it's unlikely but I think not impossible that errno could change during a SpinLockAcquire. If so, this function was broken for its own purposes as well as breaking callers.) And change a few places that were using not-per-project-style messages, such as "could not read directory" when "could not open directory" is more correct. Back-patch the exporting of ReadDirExtended, in case we have occasion to back-patch some fix that makes use of it; it's not needed right now but surely making it global is pretty harmless. Also back-patch the restoreTwoPhaseData and CheckXLogRemoved fixes. The rest of this is essentially cosmetic and need not get back-patched. Michael Paquier, with a bit of additional work by me Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com