aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
* Stamp 9.5.8.REL9_5_8Tom Lane2017-08-07
|
* Translation updatesPeter Eisentraut2017-08-07
| | | | | Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: d316c7f205275603a833ab9758ce51a76846ec58
* Require update permission for the large object written by lo_put().Tom Lane2017-08-07
| | | | | | | | | | lo_put() surely should require UPDATE permission, the same as lowrite(), but it failed to check for that, as reported by Chapman Flack. Oversight in commit c50b7c09d; backpatch to 9.4 where that was introduced. Tom Lane and Michael Paquier Security: CVE-2017-7548
* Again match pg_user_mappings to information_schema.user_mapping_options.Noah Misch2017-08-07
| | | | | | | | | | | | | | | Commit 3eefc51053f250837c3115c12f8119d16881a2d7 claimed to make pg_user_mappings enforce the qualifications user_mapping_options had been enforcing, but its removal of a longstanding restriction left them distinct when the current user is the subject of a mapping yet has no server privileges. user_mapping_options emits no rows for such a mapping, but pg_user_mappings includes full umoptions. Change pg_user_mappings to show null for umoptions. Back-patch to 9.2, like the above commit. Reviewed by Tom Lane. Reported by Jeff Janes. Security: CVE-2017-7547
* Don't allow logging in with empty password.Heikki Linnakangas2017-08-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some authentication methods allowed it, others did not. In the client-side, libpq does not even try to authenticate with an empty password, which makes using empty passwords hazardous: an administrator might think that an account with an empty password cannot be used to log in, because psql doesn't allow it, and not realize that a different client would in fact allow it. To clear that confusion and to be be consistent, disallow empty passwords in all authentication methods. All the authentication methods that used plaintext authentication over the wire, except for BSD authentication, already checked that the password received from the user was not empty. To avoid forgetting it in the future again, move the check to the recv_password_packet function. That only forbids using an empty password with plaintext authentication, however. MD5 and SCRAM need a different fix: * In stable branches, check that the MD5 hash stored for the user does not not correspond to an empty string. This adds some overhead to MD5 authentication, because the server needs to compute an extra MD5 hash, but it is not noticeable in practice. * In HEAD, modify CREATE and ALTER ROLE to clear the password if an empty string, or a password hash that corresponds to an empty string, is specified. The user-visible behavior is the same as in the stable branches, the user cannot log in, but it seems better to stop the empty password from entering the system in the first place. Secondly, it is fairly expensive to check that a SCRAM hash doesn't correspond to an empty string, because computing a SCRAM hash is much more expensive than an MD5 hash by design, so better avoid doing that on every authentication. We could clear the password on CREATE/ALTER ROLE also in stable branches, but we would still need to check at authentication time, because even if we prevent empty passwords from being stored in pg_authid, there might be existing ones there already. Reported by Jeroen van der Ham, Ben de Graaff and Jelte Fennema. Security: CVE-2017-7546
* Fix thinko introduced in 2bef06d516460 et al.Andres Freund2017-08-06
| | | | | | | | | | | | | | The callers for GetOldestSafeDecodingTransactionId() all inverted the argument for the argument introduced in 2bef06d516460. Luckily this appears to be inconsequential for the moment, as we wait for concurrent in-progress transaction when assembling a snapshot. Additionally this could only make a difference when adding a second logical slot, because only a pre-existing slot could cause an issue by lowering the returned xid dangerously much. Reported-By: Antonin Houska Discussion: https://postgr.es/m/32704.1496993134@localhost Backport: 9.4-, where 2bef06d516460 was backpatched to.
* Disallow SSL session tickets.Tom Lane2017-08-04
| | | | | | | | | | | | | | | | | | | | | | | | | We don't actually support session tickets, since we do not create an SSL session identifier. But it seems that OpenSSL will issue a session ticket on-demand anyway, which will then fail when used. This results in reconnection failures when using ticket-aware client-side SSL libraries (such as the Npgsql .NET driver), as reported by Shay Rojansky. To fix, just tell OpenSSL not to issue tickets. At some point in the far future, we might consider enabling tickets instead. But the security implications of that aren't entirely clear; and besides it would have little benefit except for very short-lived database connections, which is Something We're Bad At anyhow. It would take a lot of other work to get to a point where that would really be an exciting thing to do. While at it, also tell OpenSSL not to use a session cache. This doesn't really do anything, since a backend would never populate the cache anyway, but it might gain some micro-efficiencies and/or reduce security exposures. Patch by me, per discussion with Heikki Linnakangas and Shay Rojansky. Back-patch to all supported versions. Discussion: https://postgr.es/m/CADT4RqBU8N-csyZuzaook-c795dt22Zcwg1aHWB6tfVdAkodZA@mail.gmail.com
* Add missing ALTER USER variantsPeter Eisentraut2017-08-03
| | | | | | | ALTER USER ... SET did not support all the syntax variants of ALTER ROLE ... SET. Reported-by: Pavel Golub <pavel@microolap.com>
* Fix pg_dump/pg_restore to emit REFRESH MATERIALIZED VIEW commands last.Tom Lane2017-08-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because we push all ACL (i.e. GRANT/REVOKE) restore steps to the end, materialized view refreshes were occurring while the permissions on referenced objects were still at defaults. This led to failures if, say, an MV owned by user A reads from a table owned by user B, even if B had granted the necessary privileges to A. We've had multiple complaints about that type of restore failure, most recently from Jordan Gigov. The ideal fix for this would be to start treating ACLs as dependency- sortable objects, rather than hard-wiring anything about their dump order (the existing approach is a messy kluge dating to commit dc0e76ca3). But that's going to be a rather major change, and it certainly wouldn't lead to a back-patchable fix. As a short-term solution, convert the existing two-pass hack (ie, normal objects then ACLs) to a three-pass hack, ie, normal objects then ACLs then matview refreshes. Because this happens in RestoreArchive(), it will also fix the problem when restoring from an existing archive-format dump. (Note this means that if a matview refresh would have failed under the permissions prevailing at dump time, it'll fail during restore as well. We'll define that as user error rather than something we should try to work around.) To avoid performance loss in parallel restore, we need the matview refreshes to still be parallelizable. Hence, clean things up enough so that both ACLs and matviews are handled by the parallel restore infrastructure, instead of reverting back to serial restore for ACLs. There is still a final serial step, but it shouldn't normally have to do anything; it's only there to try to recover if we get stuck due to some problem like unresolved circular dependencies. Patch by me, but it owes something to an earlier attempt by Kevin Grittner. Back-patch to 9.3 where materialized views were introduced. Discussion: https://postgr.es/m/28572.1500912583@sss.pgh.pa.us
* Fix build on zlib-less environmentsAlvaro Herrera2017-08-03
| | | | | | | | | | | | | | | Commit 4d57e8381677 added support for getting I/O errors out of zlib, but it introduced a portability problem for systems without zlib. Repair by wrapping the zlib call inside #ifdef and restore the original code in the other branch. This serves to illustrate the inadequacy of the zlib abstraction in pg_backup_archiver: there is no way to call gzerror() in that abstraction. This means that the several places that call GZREAD and GZWRITE are currently doing error reporting wrongly, but ENOTIME to get it fixed before next week's release set. Backpatch to 9.4, like the commit that introduced the problem.
* Fix pg_dump's errno checking for zlib I/OAlvaro Herrera2017-08-02
| | | | | | | | | | | | | | | | | | | | | | Some error reports were reporting strerror(errno), which for some error conditions coming from zlib are wrong, resulting in confusing reports such as pg_restore: [compress_io] could not read from input file: Success which makes no sense. To correctly extract the error message we need to use gzerror(), so let's do that. This isn't as comprehensive or as neat as I would like, but at least it should improve things in many common cases. The zlib abstraction in compress_io does not seem to be applied consistently enough; we could perhaps improve that, but it seems master-only material, not a bug fix for back-patching. This problem goes back all the way, but I decided to apply back to 9.4 only, because older branches don't contain commit 14ea89366 which this change depends on. Authors: Vladimir Kunschikov, Álvaro Herrera Discussion: https://postgr.es/m/1498120508308.9826@infotecs.ru
* Silence warning from modern perl about unescaped bracesTom Lane2017-08-02
| | | | | | | | | | | Back-patch commit 76a1c97bf21c301f61bb41345e0cdd0d365b2086 into the older branches (9.5 and before). This is needed because perl 5.26 and later treats the case as an error not just a warning. Original patch by Andrew Dunstan, need for back-patch noted by Ashutosh Sharma Discussion: https://postgr.es/m/CAE9k0PkfNcmj9pA7Yx4qQ=K=3aY4TuiRhp7KYpayDWm9MYsnjg@mail.gmail.com
* Fix comment.Tatsuo Ishii2017-08-01
| | | | | | | | | XLByteToSeg and XLByteToPrevSeg calculate only a segment number. The definition of these macros were modified by commit dfda6ebaec6763090fb78b458a979b558c50b39b but the comment remain unchanged. Patch by Yugo Nagata. Back patched to 9.3 and beyond.
* PL/Perl portability fix: absorb relevant -D switches from Perl.Tom Lane2017-07-31
| | | | | | | | | | | | Back-patch of commit 3c163a7fc76debbbdad1bdd3c43721cffe72f4db, which see for more info. Also throw in commit b4cc35fbb709bd6fcae8998f041fd731c9acbf42, so Coverity doesn't whine about the back branches. Ashutosh Sharma, some adjustments by me Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com
* PL/Perl portability fix: avoid including XSUB.h in plperl.c.Tom Lane2017-07-31
| | | | | | | | | Back-patch of commit bebe174bb4462ef079a1d7eeafb82ff969f160a4, which see for more info. Patch by me, with some help from Ashutosh Sharma Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com
* Add missing comment in postgresql.conf.Tatsuo Ishii2017-07-31
| | | | | | | current_source requires to restart server to reflect the new value. Per Yugo Nagata and Masahiko Sawada. Back patched to 9.2 and beyond.
* Add missing comment in postgresql.conf.Tatsuo Ishii2017-07-31
| | | | | | | dynamic_shared_memory_type requires to restart server to reflect the new value. Per Yugo Nagata and Masahiko Sawada. Back pached to 9.4 and beyond.
* Fix psql tab completion for CREATE USER MAPPING.Tom Lane2017-07-27
| | | | | | | | | After typing CREATE USER M..., it would not fill in MAPPING FOR, even though that was clearly intended behavior. Jeff Janes Discussion: https://postgr.es/m/CAMkU=1wo2iQ6jWnN=egqOb5NxEPn0PpANEtKHr3uPooQ+nYPtw@mail.gmail.com
* Clean up SQL emitted by psql/describe.c.Tom Lane2017-07-26
| | | | | | | | | | | | | | | | | | | | | Fix assorted places that had not bothered with the convention of prefixing catalog and function names with "pg_catalog.". That could possibly result in query failure when running with a nondefault search_path. Also fix two places that weren't quoting OID literals. I think the latter hasn't mattered much since about 7.3, but it's still a bad idea to be doing it in 99 places and not in 2 others. Also remove a useless EXISTS sub-select that someone had stuck into describeOneTableDetails' queries for child tables. We just got the OID out of pg_class, so I hardly see how checking that it exists in pg_class was doing anything helpful. In passing, try to improve the emitted formatting of a couple of these queries, though I didn't work really hard on that. And merge unnecessarily duplicative coding in some other places. Much of this was new in HEAD, but some was quite old; back-patch as appropriate.
* Fix concurrent locking of tuple update chainAlvaro Herrera2017-07-26
| | | | | | | | | | | | | | | | | | | If several sessions are concurrently locking a tuple update chain with nonconflicting lock modes using an old snapshot, and they all succeed, it may happen that some of them fail because of restarting the loop (due to a concurrent Xmax change) and getting an error in the subsequent pass while trying to obtain a tuple lock that they already have in some tuple version. This can only happen with very high concurrency (where a row is being both updated and FK-checked by multiple transactions concurrently), but it's been observed in the field and can have unpleasant consequences such as an FK check failing to see a tuple that definitely exists: ERROR: insert or update on table "child_table" violates foreign key constraint "fk_constraint_name" DETAIL: Key (keyid)=(123456) is not present in table "parent_table". (where the key is observably present in the table). Discussion: https://postgr.es/m/20170714210011.r25mrff4nxjhmf3g@alvherre.pgsql
* Fix race condition in predicate-lock init code in EXEC_BACKEND builds.Tom Lane2017-07-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trading a little too heavily on letting the code path be the same whether we were creating shared data structures or only attaching to them, InitPredicateLocks() inserted the "scratch" PredicateLockTargetHash entry unconditionally. This is just wrong if we're in a postmaster child, which would only reach this code in EXEC_BACKEND builds. Most of the time, the hash_search(HASH_ENTER) call would simply report that the entry already existed, causing no visible effect since the code did not bother to check for that possibility. However, if this happened while some other backend had transiently removed the "scratch" entry, then that other backend's eventual RestoreScratchTarget would suffer an assert failure; this appears to be the explanation for a recent failure on buildfarm member culicidae. In non-assert builds, there would be no visible consequences there either. But nonetheless this is a pretty bad bug for EXEC_BACKEND builds, for two reasons: 1. Each new backend would perform the hash_search(HASH_ENTER) call without holding any lock that would prevent concurrent access to the PredicateLockTargetHash hash table. This creates a low but certainly nonzero risk of corruption of that hash table. 2. In the event that the race condition occurred, by reinserting the scratch entry too soon, we were defeating the entire purpose of the scratch entry, namely to guarantee that transaction commit could move hash table entries around with no risk of out-of-memory failure. The odds of an actual OOM failure are quite low, but not zero, and if it did happen it would again result in corruption of the hash table. The user-visible symptoms of such corruption are a little hard to predict, but would presumably amount to misbehavior of SERIALIZABLE transactions that'd require a crash or postmaster restart to fix. To fix, just skip the hash insertion if IsUnderPostmaster. I also inserted a bunch of assertions that the expected things happen depending on whether IsUnderPostmaster is true. That might be overkill, since most comparable code in other functions isn't quite that paranoid, but once burnt twice shy. In passing, also move a couple of lines to places where they seemed to make more sense. Diagnosis of problem by Thomas Munro, patch by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/10593.1500670709@sss.pgh.pa.us
* Ensure that pg_get_ruledef()'s output matches pg_get_viewdef()'s.Tom Lane2017-07-24
| | | | | | | | | | | | | | | | | | | | | Various cases involving renaming of view columns are handled by having make_viewdef pass down the view's current relation tupledesc to get_query_def, which then takes care to use the column names from the tupledesc for the output column names of the SELECT. For some reason though, we'd missed teaching make_ruledef to do similarly when it is printing an ON SELECT rule, even though this is exactly the same case. The results from pg_get_ruledef would then be different and arguably wrong. In particular, this breaks pre-v10 versions of pg_dump, which in some situations would define views by means of emitting a CREATE RULE ... ON SELECT command. Third-party tools might not be happy either. In passing, clean up some crufty code in make_viewdef; we'd apparently modernized the equivalent code in make_ruledef somewhere along the way, and missed this copy. Per report from Gilles Darold. Back-patch to all supported versions. Discussion: https://postgr.es/m/ec05659a-40ff-4510-fc45-ca9d965d0838@dalibo.com
* MSVC: Accept tcl86.lib in addition to tcl86t.lib.Noah Misch2017-07-23
| | | | | ActiveTcl8.6.4.1.299124-win32-x86_64-threaded.exe ships just tcl86.lib. Back-patch to 9.2, like the commit recognizing tcl86t.lib.
* Fix pg_dump's handling of event triggers.Tom Lane2017-07-22
| | | | | | | | | | | | | | pg_dump with the --clean option failed to emit DROP EVENT TRIGGER commands for event triggers. In a closely related oversight, it also did not emit ALTER OWNER commands for event triggers. Since only superusers can create event triggers, the latter oversight is of little practical consequence ... but if we're going to record an owner for event triggers, then surely pg_dump should preserve it. Per complaint from Greg Atkins. Back-patch to 9.3 where event triggers were introduced. Discussion: https://postgr.es/m/20170722191142.yi4e7tzcg3iacclg@gmail.com
* pg_rewind: Fix some problems when copying files >2GB.Robert Haas2017-07-21
| | | | | | | | | | | | | | | | | | | | | | When incrementally updating a file larger than 2GB, the old code could either fail outright (if the client asked the server for bytes beyond the 2GB boundary) or fail to copy all the blocks that had actually been modified (if the server reported a file size to the client in excess of 2GB), resulting in data corruption. Generally, such files won't occur anyway, but they might if using a non-default segment size or if there the directory contains stray files unrelated to PostgreSQL. Fix by a more prudent choice of data types. Even with these improvements, this code still uses a mix of different types (off_t, size_t, uint64, int64) to represent file sizes and offsets, not all of which necessarily have the same width or signedness, so further cleanup might be in order here. However, at least now they all have the potential to be 64 bits wide on 64-bit platforms. Kuntal Ghosh and Michael Paquier, with a tweak by me. Discussion: http://postgr.es/m/CAGz5QC+8gbkz=Brp0TgoKNqHWTzonbPtPex80U0O6Uh_bevbaA@mail.gmail.com
* pg_rewind: Fix busted sanity check.Robert Haas2017-07-21
| | | | | | | | As written, the code would only fail the sanity check if none of the columns returned by the server were of the expected type, but we want it to fail if even one column is not of the expected type. Discussion: http://postgr.es/m/CA+TgmoYuY5zW7JEs+1hSS1D=V5K8h1SQuESrq=bMNeo0B71Sfw@mail.gmail.com
* Fix dumping of outer joins with empty qual lists.Tom Lane2017-07-20
| | | | | | | | | | | | | | | | | Normally, a JoinExpr would have empty "quals" only if it came from CROSS JOIN syntax. However, it's possible to get to this state by specifying NATURAL JOIN between two tables with no common column names, and there might be other ways too. The code previously printed no ON clause if "quals" was empty; that's right for CROSS JOIN but syntactically invalid if it's some type of outer join. Fix by printing ON TRUE in that case. This got broken by commit 2ffa740be, which stopped using NATURAL JOIN syntax in ruleutils output due to its brittleness in the face of column renamings. Back-patch to 9.3 where that commit appeared. Per report from Tushar Ahuja. Discussion: https://postgr.es/m/98b283cd-6dda-5d3f-f8ac-87db8c76a3da@enterprisedb.com
* Merge large_object.sql test into largeobject.source.Tom Lane2017-07-17
| | | | | | | | | | | | | | | | | It seems pretty confusing to have tests named both largeobject and large_object. The latter is of very recent vintage (commit ff992c074), so get rid of it in favor of merging into the former. Also, enable the LO comment test that was added by commit 70ad7ed4e, since the later commit added the then-missing pg_upgrade functionality. The large_object.sql test case is almost completely redundant with that, but not quite: it seems like creating a user-defined LO with an OID in the system range might be an interesting case for pg_upgrade, so let's keep it. Like the earlier patch, back-patch to all supported branches. Discussion: https://postgr.es/m/18665.1500306372@sss.pgh.pa.us
* fix typoAndrew Dunstan2017-07-16
|
* Fix vcregress.pl PROVE_FLAGS bug in commit 93b7d9731fAndrew Dunstan2017-07-16
| | | | | | | This change didn't adjust the publicly visible taptest function, causing buildfarm failures on bowerbird. Backpatch to 9.4 like previous change.
* Fix pg_basebackup output to stdout on Windows.Heikki Linnakangas2017-07-14
| | | | | | | | | | | | | | | When writing a backup to stdout with pg_basebackup on Windows, put stdout to binary mode. Any CR bytes in the output will otherwise be output incorrectly as CR+LF. In the passing, standardize on using "_setmode" instead of "setmode", for the sake of consistency. They both do the same thing, but according to MSDN documentation, setmode is deprecated. Fixes bug #14634, reported by Henry Boehlert. Patch by Haribabu Kommi. Backpatch to all supported versions. Discussion: https://www.postgresql.org/message-id/20170428082818.24366.13134@wrigleys.postgresql.org
* Fix dumping of FUNCTION RTEs that contain non-function-call expressions.Tom Lane2017-07-13
| | | | | | | | | | | | | | | | | | | | | | | | | | The grammar will only accept something syntactically similar to a function call in a function-in-FROM expression. However, there are various ways to input something that ruleutils.c won't deparse that way, potentially leading to a view or rule that fails dump/reload. Fix by inserting a dummy CAST around anything that isn't going to deparse as a function (which is one of the ways to get something like that in there in the first place). In HEAD, also make use of the infrastructure added by this to avoid emitting unnecessary parentheses in CREATE INDEX deparsing. I did not change that in back branches, thinking that people might find it to be unexpected/unnecessary behavioral change. In HEAD, also fix incorrect logic for when to add extra parens to partition key expressions. Somebody apparently thought they could get away with simpler logic than pg_get_indexdef_worker has, but they were wrong --- a counterexample is PARTITION BY LIST ((a[1])). Ignoring the prettyprint flag for partition expressions isn't exactly a nice solution anyway. This has been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/10477.1499970459@sss.pgh.pa.us
* Fix race between GetNewTransactionId and GetOldestActiveTransactionId.Heikki Linnakangas2017-07-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The race condition goes like this: 1. GetNewTransactionId advances nextXid e.g. from 100 to 101 2. GetOldestActiveTransactionId reads the new nextXid, 101 3. GetOldestActiveTransactionId loops through the proc array. There are no active XIDs there, so it returns 101 as the oldest active XID. 4. GetNewTransactionid stores XID 100 to MyPgXact->xid So, GetOldestActiveTransactionId returned XID 101, even though 100 only just started and is surely still running. This would be hard to hit in practice, and even harder to spot any ill effect if it happens. GetOldestActiveTransactionId is only used when creating a checkpoint in a master server, and the race condition can only happen on an online checkpoint, as there are no backends running during a shutdown checkpoint. The oldestActiveXid value of an online checkpoint is only used when starting up a hot standby server, to determine the starting point where pg_subtrans is initialized from. For the race condition to happen, there must be no other XIDs in the proc array that would hold back the oldest-active XID value, which means that the missed XID must be a top transaction's XID. However, pg_subtrans is not used for top XIDs, so I believe an off-by-one error is in fact inconsequential. Nevertheless, let's fix it, as it's clearly wrong and the fix is simple. This has been wrong ever since hot standby was introduced, so backport to all supported versions. Discussion: https://www.postgresql.org/message-id/e7258662-82b6-7a45-56d4-99b337a32bf7@iki.fi
* Fix ruleutils.c for domain-over-array cases, too.Tom Lane2017-07-12
| | | | | | | | | | | | | Further investigation shows that ruleutils isn't quite up to speed either for cases where we have a domain-over-array: it needs to be prepared to look past a CoerceToDomain at the top level of field and element assignments, else it decompiles them incorrectly. Potentially this would result in failure to dump/reload a rule, if it looked like the one in the new test case. (I also added a test for EXPLAIN; that output isn't broken, but clearly we need more test coverage here.) Like commit b1cb32fb6, this bug is reachable in cases we already support, so back-patch all the way.
* Reduce memory usage of tsvector type analyze function.Heikki Linnakangas2017-07-12
| | | | | | | | | | | | | | | | | compute_tsvector_stats() detoasted and kept in memory every tsvector value in the sample, but that can be a lot of memory. The original bug report described a case using over 10 gigabytes, with statistics target of 10000 (the maximum). To fix, allocate a separate copy of just the lexemes that we keep around, and free the detoasted tsvector values as we go. This adds some palloc/pfree overhead, when you have a lot of distinct lexemes in the sample, but it's better than running out of memory. Fixes bug #14654 reported by James C. Reviewed by Tom Lane. Backport to all supported versions. Discussion: https://www.postgresql.org/message-id/20170514200602.1451.46797@wrigleys.postgresql.org
* Avoid integer overflow while sifting-up a heap in tuplesort.c.Tom Lane2017-07-12
| | | | | | | | | | | | | | | | If the number of tuples in the heap exceeds approximately INT_MAX/2, this loop's calculation "2*i+1" could overflow, resulting in a crash. Fix it by using unsigned int rather than int for the relevant local variables; that shouldn't cost anything extra on any popular hardware. Per bug #14722 from Sergey Koposov. Original patch by Sergey Koposov, modified by me per a suggestion from Heikki Linnakangas to use unsigned int not int64. Back-patch to 9.4, where tuplesort.c grew the ability to sort as many as INT_MAX tuples in-memory (commit 263865a48). Discussion: https://postgr.es/m/20170629161637.1478.93109@wrigleys.postgresql.org
* Fix variable and type name in comment.Heikki Linnakangas2017-07-12
| | | | | | Kyotaro Horiguchi Discussion: https://www.postgresql.org/message-id/20170711.163441.241981736.horiguchi.kyotaro@lab.ntt.co.jp
* Fix ordering of operations in SyncRepWakeQueue to avoid assertion failure.Heikki Linnakangas2017-07-12
| | | | | | | | | | | | | | | | Commit 14e8803f1 removed the locking in SyncRepWaitForLSN, but that introduced a race condition, where SyncRepWaitForLSN might see syncRepState already set to SYNC_REP_WAIT_COMPLETE, but the process was not yet removed from the queue. That tripped the assertion, that the process should no longer be in the uqeue. Reorder the operations in SyncRepWakeQueue to remove the process from the queue first, and update syncRepState only after that, and add a memory barrier in between to make sure the operations are made visible to other processes in that order. Fixes bug #14721 reported by Const Zhang. Analysis and fix by Thomas Munro. Backpatch down to 9.5, where the locking was removed. Discussion: https://www.postgresql.org/message-id/20170629023623.1480.26508%40wrigleys.postgresql.org
* Remove unnecessary braces, to match the surrounding style.Heikki Linnakangas2017-07-12
| | | | | | | | | Mostly in the new subscription-related commands. Backport the few that were also present in older versions. Thomas Munro Discussion: https://www.postgresql.org/message-id/CAEepm=3CyW1QmXcXJXmqiJXtXzFDc8SvSfnxkEGD3Bkv2SrkeQ@mail.gmail.com
* Fix multiple assignments to a column of a domain type.Tom Lane2017-07-11
| | | | | | | | | | | | | | | | | | | We allow INSERT and UPDATE commands to assign to the same column more than once, as long as the assignments are to subfields or elements rather than the whole column. However, this failed when the target column was a domain over array rather than plain array. Fix by teaching process_matched_tle() to look through CoerceToDomain nodes, and add relevant test cases. Also add a group of test cases exercising domains over array of composite. It's doubtless accidental that CREATE DOMAIN allows this case while not allowing straight domain over composite; but it does, so we'd better make sure we don't break it. (I could not find any documentation mentioning either side of that, so no doc changes.) It's been like this for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/4206.1499798337@sss.pgh.pa.us
* On Windows, retry process creation if we fail to reserve shared memory.Tom Lane2017-07-10
| | | | | | | | | | | | | | | We've heard occasional reports of backend launch failing because pgwin32_ReserveSharedMemoryRegion() fails, indicating that something has already used that address space in the child process. It's not very clear what, given that we disable ASLR in Windows builds, but suspicion falls on antivirus products. It'd be better if we didn't have to disable ASLR, anyway. So let's try to ameliorate the problem by retrying the process launch after such a failure, up to 100 times. Patch by me, based on previous work by Amit Kapila and others. This is a longstanding issue, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAA4eK1+R6hSx6t_yvwtx+NRzneVp+MRqXAdGJZChcau8Uij-8g@mail.gmail.com
* Fix potential data corruption during freezeTeodor Sigaev2017-07-06
| | | | | | | Fix oversight in 3b97e6823b94 bug fix. Bitwise AND is used instead of OR and it cleans all bits in t_infomask heap tuple field. Backpatch to 9.3
* Treat clean shutdown of an SSL connection same as the non-SSL case.Heikki Linnakangas2017-07-03
| | | | | | | | | | If the client closes an SSL connection, treat it the same as EOF on a non-SSL connection. In particular, don't write a message in the log about that. Michael Paquier. Discussion: https://www.postgresql.org/message-id/CAB7nPqSfyVV42Q2acFo%3DvrvF2gxoZAMJLAPq3S3KkjhZAYi7aw@mail.gmail.com
* Fix walsender to exit promptly if client requests shutdown.Tom Lane2017-06-30
| | | | | | | | | | | | | | It's possible for WalSndWaitForWal to be asked to wait for WAL that doesn't exist yet. That's fine, in fact it's the normal situation if we're caught up; but when the client requests shutdown we should not keep waiting. The previous coding could wait indefinitely if the source server was idle. In passing, improve the rather weak comments in this area, and slightly rearrange some related code for better readability. Back-patch to 9.4 where this code was introduced. Discussion: https://postgr.es/m/14154.1498781234@sss.pgh.pa.us
* Second try at fixing tcp_keepalives_idle option on Solaris.Tom Lane2017-06-28
| | | | | | | | | | | | | | | | | | | | | | | Buildfarm evidence shows that TCP_KEEPALIVE_THRESHOLD doesn't exist after all on Solaris < 11. This means we need to take positive action to prevent the TCP_KEEPALIVE code path from being taken on that platform. I've chosen to limit it with "&& defined(__darwin__)", since it's unclear that anyone else would follow Apple's precedent of spelling the symbol that way. Also, follow a suggestion from Michael Paquier of eliminating code duplication by defining a couple of intermediate symbols for the socket option. In passing, make some effort to reduce the number of translatable messages by replacing "setsockopt(foo) failed" with "setsockopt(%s) failed", etc, throughout the affected files. And update relevant documentation so that it doesn't claim to provide an exhaustive list of the possible socket option names. Like the previous commit (f0256c774), back-patch to all supported branches. Discussion: https://postgr.es/m/20170627163757.25161.528@wrigleys.postgresql.org
* Support tcp_keepalives_idle option on Solaris.Tom Lane2017-06-27
| | | | | | | | | | | | | | | | Turns out that the socket option for this is named TCP_KEEPALIVE_THRESHOLD, at least according to the tcp(7P) man page for Solaris 11. (But since that text refers to "SunOS", it's likely pretty ancient.) It appears that the symbol TCP_KEEPALIVE does get defined on that platform, but it doesn't seem to represent a valid protocol-level socket option. This leads to bleats in the postmaster log, and no tcp_keepalives_idle functionality. Per bug #14720 from Andrey Lizenko, as well as an earlier report from Dhiraj Chawla that nobody had followed up on. The issue's been there since we added the TCP_KEEPALIVE code path in commit 5acd417c8, so back-patch to all supported branches. Discussion: https://postgr.es/m/20170627163757.25161.528@wrigleys.postgresql.org
* Re-allow SRFs and window functions within sub-selects within aggregates.Tom Lane2017-06-27
| | | | | | | | | | | check_agg_arguments_walker threw an error upon seeing a SRF or window function, but that is too aggressive: if the function is within a sub-select then it's perfectly fine. I broke the SRF case in commit 0436f6bde by copying the logic for window functions ... but that was broken too, and had been since commit eaccfded9. Repair both cases in HEAD, and the window function case back to 9.3. 9.2 gets this right.
* Don't lose walreceiver start requests due to race condition in postmaster.Tom Lane2017-06-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a walreceiver dies, the startup process will notice that and send a PMSIGNAL_START_WALRECEIVER signal to the postmaster, asking for a new walreceiver to be launched. There's a race condition, which at least in HEAD is very easy to hit, whereby the postmaster might see that signal before it processes the SIGCHLD from the walreceiver process. In that situation, sigusr1_handler() just dropped the start request on the floor, reasoning that it must be redundant. Eventually, after 10 seconds (WALRCV_STARTUP_TIMEOUT), the startup process would make a fresh request --- but that's a long time if the connection could have been re-established almost immediately. Fix it by setting a state flag inside the postmaster that we won't clear until we do launch a walreceiver. In cases where that results in an extra walreceiver launch, it's up to the walreceiver to realize it's unwanted and go away --- but we have, and need, that logic anyway for the opposite race case. I came across this through investigating unexpected delays in the src/test/recovery TAP tests: it manifests there in test cases where a master server is stopped and restarted while leaving streaming slaves active. This logic has been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/21344.1498494720@sss.pgh.pa.us
* Ignore old stats file timestamps when starting the stats collector.Tom Lane2017-06-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The stats collector disregards inquiry messages that bear a cutoff_time before when it last wrote the relevant stats file. That's fine, but at startup when it reads the "permanent" stats files, it absorbed their timestamps as if they were the times at which the corresponding temporary stats files had been written. In reality, of course, there's no data out there at all. This led to disregarding inquiry messages soon after startup if the postmaster had been shut down and restarted within less than PGSTAT_STAT_INTERVAL; which is a pretty common scenario, both for testing and in the field. Requesting backends would hang for 10 seconds and then report failure to read statistics, unless they got bailed out by some other backend coming along and making a newer request within that interval. I came across this through investigating unexpected delays in the src/test/recovery TAP tests: it manifests there because the autovacuum launcher hangs for 10 seconds when it can't get statistics at startup, thus preventing a second shutdown from occurring promptly. We might want to do some things in the autovac code to make it less prone to getting stuck that way, but this change is a good bug fix regardless. In passing, also fix pgstat_read_statsfiles() to ensure that it re-zeroes its global stats variables if they are corrupted by a short read from the stats file. (Other reads in that function go into temp variables, so that the issue doesn't arise.) This has been broken since we created the separation between permanent and temporary stats files in 8.4, so back-patch to all supported branches. Discussion: https://postgr.es/m/16860.1498442626@sss.pgh.pa.us
* Fix typo in commentAlvaro Herrera2017-06-22
| | | | | | | | Once upon a time, WAL pointers could be NULL, but no longer. We talk about "valid" now. Reported-by: Amit Langote Discussion: https://postgr.es/m/33e9617d-27f1-eee8-3311-e27af98eaf2b@lab.ntt.co.jp