aboutsummaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAge
...
* pg_dump: Add missing newline to error messagePeter Eisentraut2018-12-27
|
* Remove entry tree root conflict checking from GIN predicate lockingAlexander Korotkov2018-12-27
| | | | | | | | | | | | | | According to README we acquire predicate locks on entry tree leafs and posting tree roots. However, when ginFindLeafPage() is going to lock leaf in exclusive mode, then it checks root for conflicts regardless whether it's a entry or posting tree. Assuming that we never place predicate lock on entry tree root (excluding corner case when root is leaf), this check is redundant. This commit removes this check. Now, root conflict checking is controlled by separate argument of ginFindLeafPage(). Discussion: https://postgr.es/m/CAPpHfdv7rrDyy%3DMgsaK-L9kk0AH7az0B-mdC3w3p0FSb9uoyEg%40mail.gmail.com Author: Alexander Korotkov Backpatch-through: 11
* Ignore inherited temp relations from other sessions when truncatingMichael Paquier2018-12-27
| | | | | | | | | | | | | | | | | | Inheritance trees can include temporary tables if the parent is permanent, which makes possible the presence of multiple temporary children from different sessions. Trying to issue a TRUNCATE on the parent in this scenario causes a failure, so similarly to any other queries just ignore such cases, which makes TRUNCATE work transparently. This makes truncation behave similarly to any other DML query working on the parent table with queries which need to be work on the children. A set of isolation tests is added to cover basic cases. Reported-by: Zhou Digoal Author: Amit Langote, Michael Paquier Discussion: https://postgr.es/m/15565-ce67a48d0244436a@postgresql.org Backpatch-through: 9.4
* Fix failure to check for open() or fsync() failures.Tom Lane2018-12-26
| | | | | | | | | | | While it seems OK to not be concerned about fsync() failure for a pre-existing signal file, it's not OK to not even check for open() failure. This at least causes complaints from static analyzers, and I think on some platforms passing -1 to fsync() or close() might trigger assertion-type failures. Also add (void) casts to make clear that we're ignoring fsync's result intentionally. Oversights in commit 2dedf4d9a, noted by Coverity.
* Fix portability failure introduced in commits d2b0b60e7 et al.Tom Lane2018-12-26
| | | | | | | | | | | | I made a frontend fprintf() format use %m, forgetting that that's only safe in HEAD not the back branches; prior to 96bf88d52 and d6c55de1f, it would work on glibc platforms but not elsewhere. Revert to using %s ... strerror(errno) as the code did before. We could have left HEAD as-is, but for code consistency across branches, I chose to apply this patch there too. Per Coverity and a few buildfarm members.
* Improve tab completion of ALTER INDEX/TABLE with SET STATISTICS in psqlMichael Paquier2018-12-25
| | | | | | | | | | | | | | | This fixes two issues with the completion of ALTER TABLE and ALTER INDEX after SET STATISTICS is typed, trying to suggest schema objects while the grammar only allows integers. The tab completion of ALTER INDEX is made smarter by handling properly more patterns. COLUMN is an optional keyword, but as no column numbers can be suggested yet as possible input simply adjust the completion so as no incorrect queries are generated. Author: Michael Paquier Reviewed-by: Tatsuro Yamada Discussion: https://postgr.es/m/20181219092255.GC680@paquier.xyz
* Prioritize history files when archivingMichael Paquier2018-12-24
| | | | | | | | | | | | | | | | | | | | | | | | At the end of recovery for the post-promotion process, a new history file is created followed by the last partial segment of the previous timeline. Based on the timing, the archiver would first try to archive the last partial segment and then the history file. This can delay the detection of a new timeline taken, particularly depending on the time it takes to transfer the last partial segment as it delays the moment the history file of the new timeline gets archived. This can cause promoted standbys to use the same timeline as one already taken depending on the circumstances if multiple instances look at archives at the same location. This commit changes the order of archiving so as history files are archived in priority over other file types, which reduces the likelihood of the same timeline being taken (still not reducing the window to zero), and it makes the archiver behave more consistently with the startup process doing its post-promotion business. Author: David Steele Reviewed-by: Michael Paquier, Kyotaro Horiguchi Discussion: https://postgr.es/m/929068cf-69e1-bba2-9dc0-e05986aed471@pgmasters.net Backpatch-through: 9.5
* Disable WAL-skipping optimization for COPY on views and foreign tablesMichael Paquier2018-12-23
| | | | | | | | | | | | | | | | | | | | COPY can skip writing WAL when loading data on a table which has been created in the same transaction as the one loading the data, however this cannot work on views or foreign table as this would result in trying to flush relation files which do not exist. So disable the optimization so as commands are able to work the same way with any configuration of wal_level. Tests are added to cover the different cases, which need to have wal_level set to minimal to allow the problem to show up, and that is not the default configuration. Reported-by: Luis M. Carril, Etsuro Fujita Author: Amit Langote, Michael Paquier Reviewed-by: Etsuro Fujita Discussion: https://postgr.es/m/15552-c64aa14c5c22f63c@postgresql.org Backpatch-through: 10, where support for COPY on views has been added, while v11 has added support for COPY on foreign tables.
* Add completion for storage parameters after CREATE TABLE WITH in psqlMichael Paquier2018-12-23
| | | | | | | | In passing, move the list of parameters where it can be used for both CREATE TABLE and ALTER TABLE, and reorder it alphabetically. Author: Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/d8j1s77kdbb.fsf@dalvik.ping.uio.no
* Add WRITE_*_ARRAY macrosPeter Eisentraut2018-12-22
| | | | | | | | | Add WRITE_ATTRNUMBER_ARRAY, WRITE_OID_ARRAY, WRITE_INT_ARRAY, WRITE_BOOL_ARRAY macros to outfuncs.c, mirroring the existing READ_*_ARRAY macros in readfuncs.c. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/8f2ebc67-e75f-9478-f5a5-bbbf090b1f8d%402ndquadrant.com
* Add some const decorationsPeter Eisentraut2018-12-22
| | | | These mainly help understanding the function signatures better.
* Fix ancient compiler warnings and typos in !HAVE_SYMLINK codePeter Eisentraut2018-12-22
| | | | This has never been correct since this code was introduced.
* Check for conflicting queries during replay of gistvacuumpage()Alexander Korotkov2018-12-21
| | | | | | | | | | | | | | | | | | | | | | | 013ebc0a7b implements so-called GiST microvacuum. That is gistgettuple() marks index tuples as dead when kill_prior_tuple is set. Later, when new tuple insertion claims page space, those dead index tuples are physically deleted from page. When this deletion is replayed on standby, it might conflict with read-only queries. But 013ebc0a7b doesn't handle this. That may lead to disappearance of some tuples from read-only snapshots on standby. This commit implements resolving of conflicts between replay of GiST microvacuum and standby queries. On the master we implement new WAL record type XLOG_GIST_DELETE, which comprises necessary information. On stable releases we've to be tricky to keep WAL compatibility. Information required for conflict processing is just appended to data of XLOG_GIST_PAGE_UPDATE record. So, PostgreSQL version, which doesn't know about conflict processing, will just ignore that. Reported-by: Andres Freund Diagnosed-by: Andres Freund Discussion: https://postgr.es/m/20181212224524.scafnlyjindmrbe6%40alap3.anarazel.de Author: Alexander Korotkov Backpatch-through: 9.6
* Base information_schema.sql_identifier domain on name, not varchar.Tom Lane2018-12-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SQL spec says that sql_identifier is a domain over varchar, but it also says that that domain is supposed to represent the set of valid identifiers for the implementation, in particular applying a length limit matching the implementation's identifier length limit. We were declaring sql_identifier as just "character varying", thus duplicating what the spec says about base type, but entirely failing at the rest of it. Instead, let's declare sql_identifier as a domain over type "name". (We can drop the COLLATE "C" added by commit 6b0faf723, since that's now implicit in "name".) With the recent improvements to name's comparison support, there's not a lot of functional difference between name and varchar. So although in principle this is a spec deviation, it's a pretty minor one. And correctly enforcing PG's name length limit is a good thing; on balance this seems closer to the intent of the spec than what we had. But that's all just language-lawyering. The *real* reason to do this is that it makes sql_identifier columns exposed by information_schema views be just direct representations of the underlying "name" catalog columns, eliminating a semantic mismatch that was disastrous for performance of typical queries on the information_schema. In combination with the recent change to allow dropping no-op CoerceToDomain nodes, this allows (for example) queries such as select ... from information_schema.tables where table_name = 'foo'; to produce an indexscan rather than a seqscan on pg_class. Discussion: https://postgr.es/m/CAFj8pRBUCX4LZ2rA2BbEkdD6NN59mgx+BLo1gO08Wod4RLtcTg@mail.gmail.com
* Avoid producing over-length specific_name outputs in information_schema.Tom Lane2018-12-20
| | | | | | | | | | | | | | | | | | | | | | | | information_schema output columns that are declared as being type sql_identifier are supposed to conform to the implementation's rules for valid identifiers, in particular the identifier length limit. Several places potentially violated this limit by concatenating a function's name and OID. (The OID is added to ensure name uniqueness within a schema, since the spec doesn't expect function name overloading.) Simply truncating the concatenation result to fit in "name" won't do, since losing part of the OID might wind up giving non-unique results. Instead, let's truncate the function name as necessary. The most practical way to do that is to do it in a C function; the information_schema.sql script doesn't have easy access to the value of NAMEDATALEN, nor does it have an easy way to truncate on the basis of resulting byte-length rather than number of characters. (There are still a couple of places that cast concatenation results to sql_identifier, but as far as I can see they are guaranteed not to produce over-length strings, at least with the normal value of NAMEDATALEN.) Discussion: https://postgr.es/m/23817.1545283477@sss.pgh.pa.us
* Fix lock level used for partition when detaching itAlvaro Herrera2018-12-20
| | | | | | | | | | | | | | For probably bogus reasons, we acquire only AccessShareLock on the partition when we try to detach it from its parent partitioned table. This can cause ugly things to happen if another transaction is doing any sort of DDL to the partition concurrently. Upgrade that lock to ShareUpdateExclusiveLock, which per discussion seems to be the minimum needed. Reported by Robert Haas. Discussion: https://postgr.es/m/CA+TgmoYruJQ+2qnFLtF1xQtr71pdwgfxy3Ziy-TxV28M6pEmyA@mail.gmail.com
* Make bitmapset.c use 64-bit bitmap words on 64-bit machines.Tom Lane2018-12-20
| | | | | | | | | | | | Using the full width of the CPU's native word size shouldn't cost anything in typical cases. When working with large bitmapsets, this halves the number of operations needed for many common BMS operations. On the right sort of test case, a measurable improvement is obtained. David Rowley Discussion: https://postgr.es/m/CAKJS1f9EGBd2h-VkXvb=51tf+X46zMX5T8h-KYgXEV_u2zmLUw@mail.gmail.com
* DETACH PARTITION: hold locks on indexes until end of transactionAlvaro Herrera2018-12-20
| | | | | | | | | | | | | | | | | | | | When a partition is detached from its parent, we acquire locks on all attached indexes to also detach them ... but we release those locks immediately. This is a violation of the policy of keeping locks on user objects to the end of the transaction. Bug introduced in 8b08f7d4820f. It's unclear that there are any ill effects possible, but it's clearly wrong nonetheless. It's likely that bad behavior *is* possible, but mostly because the relation that the index is for is only locked with AccessShareLock, which is an older bug that shall be fixed separately. While touching that line of code, close the index opened with index_open() using index_close() instead of relation_close(). No difference in practice, but let's be consistent. Unearthed by Robert Haas. Discussion: https://postgr.es/m/CA+TgmoYruJQ+2qnFLtF1xQtr71pdwgfxy3Ziy-TxV28M6pEmyA@mail.gmail.com
* Add more tab completion for CREATE TABLE in psqlMichael Paquier2018-12-20
| | | | | | | | | | | | The following completion patterns are added: - CREATE TABLE <name> with '(', OF or PARTITION OF. - CREATE TABLE <name> OF with list of composite types. - CREATE TABLE name (...) with PARTITION OF, WITH, TABLESPACE, ON COMMIT (depending on the presence of a temporary table). - CREATE TABLE ON COMMIT with actions (only for temporary tables). Author: Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/d8j1s77kdbb.fsf@dalvik.ping.uio.no
* Fix ADD IF NOT EXISTS used in conjunction with ALTER TABLE ONLYGreg Stark2018-12-19
| | | | | | The flag for IF NOT EXISTS was only being passed down in the normal recursing case. It's been this way since originally added in 9.6 in commit 2cd40adb85 so backpatch back to 9.6.
* Add text-vs-name cross-type operators, and unify name_ops with text_ops.Tom Lane2018-12-19
| | | | | | | | | | | | | | | | | | | Now that name comparison has effectively the same behavior as text comparison, we might as well merge the name_ops opfamily into text_ops, allowing cross-type comparisons to be processed without forcing a datatype coercion first. We need do little more than add cross-type operators to make the opfamily complete, and fix one or two places in the planner that assumed text_ops was a single-datatype opfamily. I chose to unify hash name_ops into hash text_ops as well, since the types have compatible hashing semantics. This allows marking the new cross-type equality operators as oprcanhash. (Note: this doesn't remove the name_ops opclasses, so there's no breakage of index definitions. Those opclasses are just reparented into the text_ops opfamily.) Discussion: https://postgr.es/m/15938.1544377821@sss.pgh.pa.us
* Make type "name" collation-aware.Tom Lane2018-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The "name" comparison operators now all support collations, making them functionally equivalent to "text" comparisons, except for the different physical representation of the datatype. They do, in fact, mostly share the varstr_cmp and varstr_sortsupport infrastructure, which has been slightly enlarged to handle the case. To avoid changes in the default behavior of the datatype, set name's typcollation to C_COLLATION_OID not DEFAULT_COLLATION_OID, so that by default comparisons to a name value will continue to use strcmp semantics. (This would have been the case for system catalog columns anyway, because of commit 6b0faf723, but doing this makes it true for user-created name columns as well. In particular, this avoids locale-dependent changes in our regression test results.) In consequence, tweak a couple of places that made assumptions about collatable base types always having typcollation DEFAULT_COLLATION_OID. I have not, however, attempted to relax the restriction that user- defined collatable types must have that. Hence, "name" doesn't behave quite like a user-defined type; it acts more like a domain with COLLATE "C". (Conceivably, if we ever get rid of the need for catalog name columns to be fixed-length, "name" could actually become such a domain over text. But that'd be a pretty massive undertaking, and I'm not volunteering.) Discussion: https://postgr.es/m/15938.1544377821@sss.pgh.pa.us
* Remove function names from error messagesAlvaro Herrera2018-12-19
| | | | | They are not necessary, and having them there gives useless work for translators.
* Small improvements for allocation logic in ginHeapTupleFastCollect().Tom Lane2018-12-19
| | | | | | | | | | | | Avoid repetitive calls to repalloc() when the required size of the collector array grows more than 2x in one call. Also ensure that the array size is a power of 2 (since palloc will probably consume a power of 2 anyway) and doesn't start out very small (which'd likely just lead to extra repallocs). David Rowley, tweaked a bit by me Discussion: https://postgr.es/m/CAKJS1f8vn-iSBE8PKeVHrnhvyjRNYCxguPFFY08QLYmjWG9hPQ@mail.gmail.com
* Remove obsolete nbtree duplicate entries comment.Peter Geoghegan2018-12-18
| | | | | | | | Remove a comment from the Berkeley days claiming that nbtree must disambiguate duplicate keys within _bt_moveright(). There is no special care taken around duplicates within _bt_moveright(), at least since commit 9e85183bfc3 removed inscrutable _bt_moveright() code to handle pages full of duplicates.
* Correct obsolete nbtree recovery comments.Peter Geoghegan2018-12-18
| | | | | | | | | | | | | | | Commit 40dae7ec537, which made the handling of interrupted nbtree page splits more robust, removed an nbtree-specific end-of-recovery cleanup step. This meant that it was no longer possible to complete an interrupted page split during recovery. However, a reference to recovery as a reason for using a NULL stack while inserting into a parent page was missed. Remove the reference. Remove a similar obsolete reference to recovery that was introduced much more recently, as part of the btree fastpath optimization enhancement that made it into Postgres 11 (commit 2b272734, and follow-up commits). Backpatch: 11-, where the fastpath optimization was introduced.
* Make collation-aware system catalog columns use "C" collation.Tom Lane2018-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Up to now we allowed text columns in system catalogs to use collation "default", but that isn't really safe because it might mean something different in template0 than it means in a database cloned from template0. In particular, this could mean that cloned pg_statistic entries for such columns weren't entirely valid, possibly leading to bogus planner estimates, though (probably) not any outright failures. In the wake of commit 5e0928005, a better solution is available: if we label such columns with "C" collation, then their pg_statistic entries will also use that collation and hence will be valid independently of the database collation. This also provides a cleaner solution for indexes on such columns than the hack added by commit 0b28ea79c: the indexes will naturally inherit "C" collation and don't have to be forced to use text_pattern_ops. Also, with the planned improvement of type "name" to be collation-aware, this policy will apply cleanly to both text and name columns. Because of the pg_statistic angle, we should also apply this policy to the tables in information_schema. This patch does that by adjusting information_schema's textual domain types to specify "C" collation. That has the user-visible effect that order-sensitive comparisons to textual information_schema view columns will now use "C" collation by default. The SQL standard says that the collation of those view columns is implementation-defined, so I think this is legal per spec. At some point this might allow for translation of such comparisons into indexable conditions on the underlying "name" columns, although additional work will be needed before that can happen. Discussion: https://postgr.es/m/19346.1544895309@sss.pgh.pa.us
* Fix ancient thinko in mergejoin cost estimation.Tom Lane2018-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | "rescanratio" was computed as 1 + rescanned-tuples / total-inner-tuples, which is sensible if it's to be multiplied by total-inner-tuples or a cost value corresponding to scanning all the inner tuples. But in reality it was (mostly) multiplied by inner_rows or a related cost, numbers that take into account the possibility of stopping short of scanning the whole inner relation thanks to a limited key range in the outer relation. This'd still make sense if we could expect that stopping short would result in a proportional decrease in the number of tuples that have to be rescanned. It does not, however. The argument that establishes the validity of our estimate for that number is independent of whether we scan all of the inner relation or stop short, and experimentation also shows that stopping short doesn't reduce the number of rescanned tuples. So the correct calculation is 1 + rescanned-tuples / inner_rows, and we should be sure to multiply that by inner_rows or a corresponding cost value. Most of the time this doesn't make much difference, but if we have both a high rescan rate (due to lots of duplicate values) and an outer key range much smaller than the inner key range, then the error can be significant, leading to a large underestimate of the cost associated with rescanning. Per report from Vijaykumar Jain. This thinko appears to go all the way back to the introduction of the rescan estimation logic in commit 70fba7043, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAE7uO5hMb_TZYJcZmLAgO6iD68AkEK6qCe7i=vZUkCpoKns+EQ@mail.gmail.com
* Include partitioned indexes to system view pg_indexesMichael Paquier2018-12-18
| | | | | | | | | pg_tables already includes partitioned tables, so for consistency pg_indexes should show partitioned indexes. Author: Suraj Kharage Reviewed-by: Amit Langote, Michael Paquier Discussion: https://postgr.es/m/CAF1DzPVrYo4XNTEnc=PqVp6aLJc7LFYpYR4rX=_5pV=wJ2KdZg@mail.gmail.com
* Tweak description comments in tests for partition functionsMichael Paquier2018-12-18
| | | | | | | | | The new wording is more generic and fixes one grammar mistake and one typo on the way. Per discussion between Amit Langote and me. Discussion: https://postgr.es/m/20181217064028.GJ31474@paquier.xyz
* Include ALTER INDEX SET STATISTICS in pg_dumpMichael Paquier2018-12-18
| | | | | | | | | | | | | | | | | | | | | | | The new grammar pattern of ALTER INDEX SET STATISTICS able to use column numbers on top of the existing column names introduced by commit 5b6d13e forgot to add support for the feature in pg_dump, so defining statistics on index columns was missing from the dumps, potentially causing silent planning problems with a subsequent restore. pg_dump ought to not use column names in what it generates as these are automatically generated by the server and could conflict with real relation attributes with matching patterns. "expr" and "exprN", N incremented automatically after the creation of the first one, are used as default attribute names for index expressions, and that could easily match what is defined in other relations, causing the dumps to fail if some of those attributes are renamed at some point. So to avoid any problems, the new grammar with column numbers gets used. Reported-by: Ronan Dunklau Author: Michael Paquier Reviewed-by: Tom Lane, Adrien Nayrat, Amul Sul Discussion: https://postgr.es/m/CAARsnT3UQ4V=yDNW468w8RqHfYiY9mpn2r_c5UkBJ97NAApUEw@mail.gmail.com Backpatch-through: 11, where the new syntax has been introduced.
* Drop support for getting signal descriptions from sys_siglist[].Tom Lane2018-12-17
| | | | | | | | | | It appears that all platforms that have sys_siglist[] also have strsignal(), making that fallback case in pg_strsignal() dead code. Getting rid of it allows dropping a configure test, which seems worth more than providing textual signal descriptions on whatever platforms might still hypothetically have use for the fallback case. Discussion: https://postgr.es/m/25758.1544983503@sss.pgh.pa.us
* Fix tablespace handling for partitioned tablesAlvaro Herrera2018-12-17
| | | | | | | | | | | | | | | | | | When partitioned tables were introduced, we failed to realize that by copying the tablespace handling for other relation kinds with no physical storage we were causing the secondary effect that their partitions would not automatically inherit the tablespace setting. This is surprising and unhelpful, so change it to adopt the behavior introduced in pg11 (commit 33e6c34c3267) for partitioned indexes: the parent relation remembers the tablespace specification, which is then used for any new partitions that don't declare one. Because this commit changes behavior of the TABLESPACE clause for partitioned tables (it's no longer a no-op), it is not backpatched. Author: David Rowley, Álvaro Herrera Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAKJS1f9SxVzqDrGD1teosFd6jBMM0UEaa14_8mRvcWE19Tu0hA@mail.gmail.com
* Remove extra semicolons.Amit Kapila2018-12-17
| | | | | | | | Reported-by: David Rowley Author: David Rowley Reviewed-by: Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/CAKJS1f8EneeYyzzvdjahVZ6gbAHFkHbSFB5m_C0Y6TUJs9Dgdg@mail.gmail.com
* Fix use-after-free bug when renaming constraintsMichael Paquier2018-12-17
| | | | | | | | | This is an oversight from recent commit b13fd344. While on it, tweak the previous test with a better name for the renamed primary key. Detected by buildfarm member prion which forces relation cache release with -DRELCACHE_FORCE_RELEASE. Back-patch down to 9.4 as the previous commit.
* Make constraint rename issue relcache invalidation on target relationMichael Paquier2018-12-17
| | | | | | | | | | | | | | | When a constraint gets renamed, it may have associated with it a target relation (for example domain constraints don't have one). Not invalidating the target relation cache when issuing the renaming can result in issues with subsequent commands that refer to the old constraint name using the relation cache, causing various failures. One pattern spotted was using CREATE TABLE LIKE after a constraint renaming. Reported-by: Stuart <sfbarbee@gmail.com> Author: Amit Langote Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/2047094.V130LYfLq4@station53.ousa.org
* Modernize our code for looking up descriptive strings for Unix signals.Tom Lane2018-12-16
| | | | | | | | | | | | | | | | | | | | | | At least as far back as the 2008 spec, POSIX has defined strsignal(3) for looking up descriptive strings for signal numbers. We hadn't gotten the word though, and were still using the crufty old sys_siglist array, which is in no standard even though most Unixen provide it. Aside from not being formally standards-compliant, this was just plain ugly because it involved #ifdef's at every place using the code. To eliminate the #ifdef's, create a portability function pg_strsignal, which wraps strsignal(3) if available and otherwise falls back to sys_siglist[] if available. The set of Unixen with neither API is probably empty these days, but on any platform with neither, you'll just get "unrecognized signal". All extant callers print the numeric signal number too, so no need to work harder than that. Along the way, upgrade pg_basebackup's child-error-exit reporting to match the rest of the system. Discussion: https://postgr.es/m/25758.1544983503@sss.pgh.pa.us
* Make error handling in parallel pg_upgrade less bogus.Tom Lane2018-12-16
| | | | | | | | | | | | reap_child() basically ignored the possibility of either an error in waitpid() itself or a child process failure on signal. We don't really need to do more than report and crash hard, but proceeding as though nothing is wrong is definitely Not Acceptable. The error report for nonzero child exit status was pretty off-point, as well. Noted while fooling around with child-process failure detection logic elsewhere. It's been like this a long time, so back-patch to all supported branches.
* Improve detection of child-process SIGPIPE failures.Tom Lane2018-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit ffa4cbd62 added logic to detect SIGPIPE failure of a COPY child process, but it only worked correctly if the SIGPIPE occurred in the immediate child process. Depending on the shell in use and the complexity of the shell command string, we might instead get back an exit code of 128 + SIGPIPE, representing a shell error exit reporting SIGPIPE in the child process. We could just hack up ClosePipeToProgram() to add the extra case, but it seems like this is a fairly general issue deserving a more general and better-documented solution. I chose to add a couple of functions in src/common/wait_error.c, which is a natural place to know about wait-result encodings, that will test for either a specific child-process signal type or any child-process signal failure. Then, adjust other places that were doing ad-hoc tests of this type to use the common functions. In RestoreArchivedFile, this fixes a race condition affecting whether the process will report an error or just silently proc_exit(1): before, that depended on whether the intermediate shell got SIGTERM'd itself or reported a child process failing on SIGTERM. Like the previous patch, back-patch to v10; we could go further but there seems no real need to. Per report from Erik Rijkers. Discussion: https://postgr.es/m/f3683f87ab1701bea5d86a7742b22432@xs4all.nl
* Make pg_statistic and related code account more honestly for collations.Tom Lane2018-12-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we first put in collations support, we basically punted on teaching pg_statistic, ANALYZE, and the planner selectivity functions about that. They've just used DEFAULT_COLLATION_OID independently of the actual collation of the data. It's time to improve that, so: * Add columns to pg_statistic that record the specific collation associated with each statistics slot. * Teach ANALYZE to use the column's actual collation when comparing values for statistical purposes, and record this in the appropriate slot. (Note that type-specific typanalyze functions are now expected to fill stats->stacoll with the appropriate collation, too.) * Teach assorted selectivity functions to use the actual collation of the stats they are looking at, instead of just assuming it's DEFAULT_COLLATION_OID. This should give noticeably better results in selectivity estimates for columns with nondefault collations, at least for query clauses that use that same collation (which would be the default behavior in most cases). It's still true that comparisons with explicit COLLATE clauses different from the stored data's collation won't be well-estimated, but that's no worse than before. Also, this patch does make the first step towards doing better with that, which is that it's now theoretically possible to collect stats for a collation other than the column's own collation. Patch by me; thanks to Peter Eisentraut for review. Discussion: https://postgr.es/m/14706.1544630227@sss.pgh.pa.us
* Introduce new extended routines for FDW and foreign server lookupsMichael Paquier2018-12-14
| | | | | | | | | | | | | | The cache lookup routines for foreign-data wrappers and foreign servers are extended with an extra argument to handle a set of flags. The only value which can be used now is to indicate if a missing object should result in an error or not, and are designed to be extensible on need. Those new routines are added into the existing set of user-visible FDW APIs and documented in consequence. They will be used for future patches to improve the SQL interface for object addresses. Author: Michael Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/CAB7nPqSZxrSmdHK-rny7z8mi=EAFXJ5J-0RbzDw6aus=wB5azQ@mail.gmail.com
* Create a separate oid range for oids assigned by genbki.pl.Andres Freund2018-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The changes I made in 578b229718e assigned oids below FirstBootstrapObjectId to objects in include/catalog/*.dat files that did not have an oid assigned, starting at the max oid explicitly assigned. Tom criticized that for mainly two reasons: 1) It's not clear which values are manually and which explicitly assigned. 2) The space below FirstBootstrapObjectId gets pretty crowded, and some PostgreSQL forks have used oids >= 9000 for their own objects, to avoid conflicting. Thus create a new range for objects not assigned explicit oids, but assigned by genbki.pl. For now 1-9999 is for explicitly assigned oids, FirstGenbkiObjectId (10000) to FirstBootstrapObjectId (1200) -1 is for genbki.pl assigned oids, and < FirstNormalObjectId (16384) is for oids assigned during bootstrap. It's possible that we'll have to adjust these boundaries, but there's some headroom for now. Add a note suggesting that oids in forks should be assigned in the 9000-9999 range. Catversion bump for obvious reasons. Per complaint from Tom Lane. Author: Andres Freund Discussion: https://postgr.es/m/16845.1544393682@sss.pgh.pa.us
* Fix bogus logic for skipping unnecessary partcollation dependencies.Tom Lane2018-12-13
| | | | | | | | | The idea here is to not call recordDependencyOn for the default collation, since we know that's pinned. But what the code actually did was to record the partition key's dependency on the opclass twice, instead. Evidently introduced by sloppy coding in commit 2186b608b. Back-patch to v10 where that came in.
* Drop no-op CoerceToDomain nodes from expressions at planning time.Tom Lane2018-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a domain has no constraints, then CoerceToDomain doesn't really do anything and can be simplified to a RelabelType. This not only eliminates cycles at execution, but allows the planner to optimize better (for instance, match the coerced expression to an index on the underlying column). However, we do have to support invalidating the plan later if a constraint gets added to the domain. That's comparable to the case of a change to a SQL function that had been inlined into a plan, so all the necessary logic already exists for plans depending on functions. We need only duplicate or share that logic for domains. ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval messages for the domain's pg_type entry, since those operations don't update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row, so no code change is needed for them.) Testing this revealed what's really a pre-existing bug in plpgsql: it caches the SQL-expression-tree expansion of type coercions and had no provision for invalidating entries in that cache. Up to now that was only a problem if such an expression had inlined a SQL function that got changed, which is unlikely though not impossible. But failing to track changes of domain constraints breaks an existing regression test case and would likely cause practical problems too. We could fix that locally in plpgsql, but what seems like a better idea is to build some generic infrastructure in plancache.c to store standalone expressions and track invalidation events for them. (It's tempting to wonder whether plpgsql's "simple expression" stuff could use this code with lower overhead than its current use of the heavyweight plancache APIs. But I've left that idea for later.) Other stuff fixed in passing: * Allow estimate_expression_value() to drop CoerceToDomain unconditionally, effectively assuming that the coercion will succeed. This will improve planner selectivity estimates for cases involving estimatable expressions that are coerced to domains. We could have done this independently of everything else here, but there wasn't previously any need for eval_const_expressions_mutator to know about CoerceToDomain at all. * Use a dlist for plancache.c's list of cached plans, rather than a manually threaded singly-linked list. That eliminates a potential performance problem in DropCachedPlan. * Fix a couple of inconsistencies in typecmds.c about whether operations on domains drop RowExclusiveLock on pg_type. Our common practice is that DDL operations do drop catalog locks, so standardize on that choice. Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
* Prevent GIN deleted pages from being reclaimed too earlyAlexander Korotkov2018-12-13
| | | | | | | | | | | | | | | | | | When GIN vacuum deletes a posting tree page, it assumes that no concurrent searchers can access it, thanks to ginStepRight() locking two pages at once. However, since 9.4 searches can skip parts of posting trees descending from the root. That leads to the risk that page is deleted and reclaimed before concurrent search can access it. This commit prevents the risk of above by waiting for every transaction, which might wait to reference this page, to finish. Due to binary compatibility we can't change GinPageOpaqueData to store corresponding transaction id. Instead we reuse page header pd_prune_xid field, which is unused in index pages. Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Andrey Borodin, Alexander Korotkov Reviewed-by: Alexander Korotkov Backpatch-through: 9.4
* Prevent deadlock in ginRedoDeletePage()Alexander Korotkov2018-12-13
| | | | | | | | | | | | | | | | | | | | On standby ginRedoDeletePage() can work concurrently with read-only queries. Those queries can traverse posting tree in two ways. 1) Using rightlinks by ginStepRight(), which locks the next page before unlocking its left sibling. 2) Using downlinks by ginFindLeafPage(), which locks at most one page at time. Original lock order was: page, parent, left sibling. That lock order can deadlock with ginStepRight(). In order to prevent deadlock this commit changes lock order to: left sibling, page, parent. Note, that position of parent in locking order seems insignificant, because we only lock one page at time while traversing downlinks. Reported-by: Chen Huajun Diagnosed-by: Chen Huajun, Peter Geoghegan, Andrey Borodin Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Alexander Korotkov Backpatch-through: 9.4
* Fix deadlock in GIN vacuum introduced by 218f51584d5Alexander Korotkov2018-12-13
| | | | | | | | | | | | | | | | | | | | | Before 218f51584d5 if posting tree page is about to be deleted, then the whole posting tree is locked by LockBufferForCleanup() on root preventing all the concurrent inserts. 218f51584d5 reduced locking to the subtree containing page to be deleted. However, due to concurrent parent split, inserter doesn't always holds pins on all the pages constituting path from root to the target leaf page. That could cause a deadlock between GIN vacuum process and GIN inserter. And we didn't find non-invasive way to fix this. This commit reverts VACUUM behavior to lock the whole posting tree before delete any page. However, we keep another useful change by 218f51584d5: the tree is locked only if there are pages to be deleted. Reported-by: Chen Huajun Diagnosed-by: Chen Huajun, Andrey Borodin, Peter Geoghegan Discussion: https://postgr.es/m/31a702a.14dd.166c1366ac1.Coremail.chjischj%40163.com Author: Alexander Korotkov, based on ideas from Andrey Borodin and Peter Geoghegan Reviewed-by: Andrey Borodin Backpatch-through: 10
* Repair bogus EPQ plans generated for postgres_fdw foreign joins.Tom Lane2018-12-12
| | | | | | | | | | | | | | | | | | | | | | | | | postgres_fdw's postgresGetForeignPlan() assumes without checking that the outer_plan it's given for a join relation must have a NestLoop, MergeJoin, or HashJoin node at the top. That's been wrong at least since commit 4bbf6edfb (which could cause insertion of a Sort node on top) and it seems like a pretty unsafe thing to Just Assume even without that. Through blind good fortune, this doesn't seem to have any worse consequences today than strange EXPLAIN output, but it's clearly trouble waiting to happen. To fix, test the node type explicitly before touching Join-specific fields, and avoid jamming the new tlist into a node type that can't do projection. Export a new support function from createplan.c to avoid building low-level knowledge about the latter into FDWs. Back-patch to 9.6 where the faulty coding was added. Note that the associated regression test cases don't show any changes before v11, apparently because the tests back-patched with 4bbf6edfb don't actually exercise the problem case before then (there's no top-level Sort in those plans). Discussion: https://postgr.es/m/8946.1544644803@sss.pgh.pa.us
* Repair bogus handling of multi-assignment Params in upper plan levels.Tom Lane2018-12-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our support for multiple-set-clauses in UPDATE assumes that the Params referencing a MULTIEXPR_SUBLINK SubPlan will appear before that SubPlan in the targetlist of the plan node that calculates the updated row. (Yeah, it's a hack...) In some PG branches it's possible that a Result node gets inserted between the primary calculation of the update tlist and the ModifyTable node. setrefs.c did the wrong thing in this case and left the upper-level Params as Params, causing a crash at runtime. What it should do is replace them with "outer" Vars referencing the child plan node's output. That's a result of careless ordering of operations in fix_upper_expr_mutator, so we can fix it just by reordering the code. Fix fix_join_expr_mutator similarly for consistency, even though join nodes could never appear in such a context. (In general, it seems likely to be a bit cheaper to use Vars than Params in such situations anyway, so this patch might offer a tiny performance improvement.) The hazard extends back to 9.5 where the MULTIEXPR_SUBLINK stuff was introduced, so back-patch that far. However, this may be a live bug only in 9.6.x and 10.x, as the other branches don't seem to want to calculate the final tlist below the Result node. (That plan shape change between branches might be a mini-bug in itself, but I'm not really interested in digging into the reasons for that right now. Still, add a regression test memorializing what we expect there, so we'll notice if it changes again.) Per bug report from Eduards Bezverhijs. Discussion: https://postgr.es/m/b6cd572a-3e44-8785-75e9-c512a5a17a73@tieto.com
* Tweak pg_partition_tree for undefined relations and unsupported relkindsMichael Paquier2018-12-12
| | | | | | | | | | | | This fixes a crash which happened when calling the function directly with a relation OID referring to a non-existing object, and changes the behavior so as NULL is returned for unsupported relkinds instead of generating an error. This puts the new function in line with many other system functions, and eases actions like full scans of pg_class. Author: Michael Paquier Reviewed-by: Amit Langote, Stephen Frost Discussion: https://postgr.es/m/20181207010406.GO2407@paquier.xyz