aboutsummaryrefslogtreecommitdiff
path: root/src/backend
Commit message (Collapse)AuthorAge
* Simplify do_pg_start_backup's API by opening pg_tblspc internally.Tom Lane2017-12-04
| | | | | | | | | | | | | do_pg_start_backup() expects its callers to pass in an open DIR pointer for the pg_tblspc directory, but there's no apparent advantage in that. It complicates the callers without adding any flexibility, and there's no robustness advantage, since we surely have to be prepared for errors during the scan of pg_tblspc anyway. In fact, by holding an extra kernel resource during operations like the preliminary checkpoint, we might be making things a fraction more failure-prone not less. Hence, remove that argument and open the directory just for the duration of the actual scan. Discussion: https://postgr.es/m/28752.1512413887@sss.pgh.pa.us
* Improve error handling in RemovePgTempFiles().Tom Lane2017-12-04
| | | | | | | | | | | | | | | | | | | | | | | | Modify this function and its subsidiaries so that syscall failures are reported via ereport(LOG), rather than silently ignored as before. We don't want to throw a hard ERROR, as that would prevent database startup, and getting rid of leftover temporary files is not important enough for that. On the other hand, not reporting trouble at all seems like an odd choice not in line with current project norms, especially since any failure here is quite unexpected. On the same reasoning, adjust these functions' AllocateDir/ReadDir calls so that failure to scan a directory results in LOG not ERROR. I also removed the previous practice of silently ignoring ENOENT failures during directory opens --- there are some corner cases where that could happen given a previous database crash, but that seems like a bad excuse for ignoring a condition that isn't expected in most cases. A LOG message during postmaster start seems OK in such situations, and better than no output at all. In passing, make RemovePgTempRelationFiles' test for "is the file name all digits" look more like the way it's done elsewhere. Discussion: https://postgr.es/m/19907.1512402254@sss.pgh.pa.us
* Clean up assorted messiness around AllocateDir() usage.Tom Lane2017-12-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a couple of low-probability bugs that could lead to reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE) concerning directory-open or file-open failures. It also fixes places where we took shortcuts in reporting such errors, either by using elog instead of ereport or by using ereport but forgetting to specify an errcode. And it eliminates a lot of just plain redundant error-handling code. In service of all this, export fd.c's formerly-static function ReadDirExtended, so that external callers can make use of the coding pattern dir = AllocateDir(path); while ((de = ReadDirExtended(dir, path, LOG)) != NULL) if they'd like to treat directory-open failures as mere LOG conditions rather than errors. Also fix FreeDir to be a no-op if we reach it with dir == NULL, as such a coding pattern would cause. Then, remove code at many call sites that was throwing an error or log message for AllocateDir failure, as ReadDir or ReadDirExtended can handle that job just fine. Aside from being a net code savings, this gets rid of a lot of not-quite-up-to-snuff reports, as mentioned above. (In some places these changes result in replacing a custom error message such as "could not open tablespace directory" with more generic wording "could not open directory", but it was agreed that the custom wording buys little as long as we report the directory name.) In some other call sites where we can't just remove code, change the error reports to be fully project-style-compliant. Also reorder code in restoreTwoPhaseData that was acquiring a lock between AllocateDir and ReadDir; in the unlikely but surely not impossible case that LWLockAcquire changes errno, AllocateDir failures would be misreported. There is no great value in opening the directory before acquiring TwoPhaseStateLock, so just do it in the other order. Also fix CheckXLogRemoved to guarantee that it preserves errno, as quite a number of call sites are implicitly assuming. (Again, it's unlikely but I think not impossible that errno could change during a SpinLockAcquire. If so, this function was broken for its own purposes as well as breaking callers.) And change a few places that were using not-per-project-style messages, such as "could not read directory" when "could not open directory" is more correct. Back-patch the exporting of ReadDirExtended, in case we have occasion to back-patch some fix that makes use of it; it's not needed right now but surely making it global is pretty harmless. Also back-patch the restoreTwoPhaseData and CheckXLogRemoved fixes. The rest of this is essentially cosmetic and need not get back-patched. Michael Paquier, with a bit of additional work by me Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com
* When VACUUM or ANALYZE skips a concurrently dropped table, log it.Robert Haas2017-12-04
| | | | | | | Hopefully, the additional logging will help avoid confusion that could otherwise result. Nathan Bossart, reviewed by Michael Paquier, FabrĂ­zio Mello, and me
* Support boolean columns in functional-dependency statistics.Tom Lane2017-12-04
| | | | | | | | | | | | | | | | | | | | | | | There's no good reason that the multicolumn stats stuff shouldn't work on booleans. But it looked only for "Var = pseudoconstant" clauses, and it will seldom find those for boolean Vars, since earlier phases of planning will fold "boolvar = true" or "boolvar = false" to just "boolvar" or "NOT boolvar" respectively. Improve dependencies_clauselist_selectivity() to recognize such clauses as equivalent to equality restrictions. This fixes a failure of the extended stats mechanism to apply in a case reported by Vitaliy Garnashevich. It's not a complete solution to his problem because the bitmap-scan costing code isn't consulting extended stats where it should, but that's surely an independent issue. In passing, improve some comments, get rid of a NumRelids() test that's redundant with the preceding bms_membership() test, and fix dependencies_clauselist_selectivity() so that estimatedclauses actually is a pure output argument as stated by its API contract. Back-patch to v10 where this code was introduced. Discussion: https://postgr.es/m/73a4936d-2814-dc08-ed0c-978f76f435b0@gmail.com
* Remove memory leak protection from Gather and Gather Merge nodes.Robert Haas2017-12-04
| | | | | | | | | | | | | | Before commit 6b65a7fe62e129d5c2b85cd74d6a91d8f7564608, tqueue.c could perform tuple remapping and thus leak memory, which is why commit af33039317ddc4a0e38a02e2255c2bf453115fd2 made TupleQueueReaderNext run in a short-lived context. Now, however, tqueue.c has been reduced to a shadow of its former self, and there shouldn't be any chance of leaks any more. Accordingly, remove some tuple copying and memory context manipulation to speed up processing. Patch by me, reviewed by Amit Kapila. Some testing by Rafia Sabih. Discussion: http://postgr.es/m/CAA4eK1LSDydwrNjmYSNkfJ3ZivGSWH9SVswh6QpNzsMdj_oOQA@mail.gmail.com
* Adjust #ifdef EXEC_BACKEND RemovePgTempFilesInDir() call.Andres Freund2017-12-01
| | | | | | | Other callers were adjusted in the course of dc6c4c9dc2a111519b76b22daaaac86c5608223b. Per buildfarm.
* Add infrastructure for sharing temporary files between backends.Andres Freund2017-12-01
| | | | | | | | | | | | | | | | | SharedFileSet allows temporary files to be created by one backend and then exported for read-only access by other backends, with clean-up managed by reference counting associated with a DSM segment. This includes changes to fd.c and buffile.c to support the new kind of temporary file. This will be used by an upcoming patch adding support for parallel hash joins. Author: Thomas Munro Reviewed-By: Peter Geoghegan, Andres Freund, Robert Haas, Rushabh Lathia Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com https://postgr.es/m/CAH2-WznJ_UgLux=_jTgCQ4yFz0iBntudsNKa1we3kN1BAG=88w@mail.gmail.com
* Minor code beautification in partition_bounds_equal.Robert Haas2017-12-01
| | | | | | | | | Use get_greatest_modulus more consistently, instead of doing the same thing in an ad-hoc manner in this one place. Ashutosh Bapat Discussion: http://postgr.es/m/CAFjFpReT9L4RCiJBKOyWC2=i02kv9uG2fx=4Fv7kFY2t0SPCgw@mail.gmail.com
* Re-allow INSERT .. ON CONFLICT DO NOTHING on partitioned tables.Robert Haas2017-12-01
| | | | | | | | | | | | Commit 8355a011a0124bdf7ccbada206a967d427039553 was reverted in f05230752d53c4aa74cffa9b699983bbb6bcb118, but this attempt is hopefully better-considered: we now pass the correct value to ExecOpenIndices, which should avoid the crash that we hit before. Amit Langote, reviewed by Simon Riggs and by me. Some final editing by me. Discussion: http://postgr.es/m/7ff1e8ec-dc39-96b1-7f47-ff5965dceeac@lab.ntt.co.jp
* Try to exclude partitioned tables in toto.Robert Haas2017-12-01
| | | | | | Ashutosh Bapat, reviewed by Jeevan Chalke. Comment by me. Discussion: http://postgr.es/m/CAFjFpRcuRaydz88CY_aQekmuvmN2A9ax5z0k=ppT+s8KS8xMRA@mail.gmail.com
* Fix uninitialized memory reference.Robert Haas2017-12-01
| | | | | | | | | | | Without this, when partdesc->nparts == 0, we end up calling ExecBuildSlotPartitionKeyDescription without initializing values and isnull. Reported by Coverity via Michael Paquier. Patch by Michael Paquier, reviewed and revised by Amit Langote. Discussion: http://postgr.es/m/CAB7nPqQ3mwkdMoPY-ocgTpPnjd8TKOadMxdTtMLvEzF8480Zfg@mail.gmail.com
* Check channel binding flag at end of SCRAM exchangePeter Eisentraut2017-12-01
| | | | | | | We need to check whether the channel-binding flag encoded in the client-final-message is the same one sent in the client-first-message. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Remove extra word from comment.Robert Haas2017-11-30
| | | | | | | | | David Rowley, who also was the primary author of the patch that added this function; the attribution in my previous commit, 84940644de931f331433b35e3a391822671f8c9c, was incorrect due to sloppiness on my part. Discussion: http://postgr.es/m/CAKJS1f_0iSiLQsf_c06AzOWAc3eS6ePjjVQFpcFv3W-O5aktnQ@mail.gmail.com
* SQL proceduresPeter Eisentraut2017-11-30
| | | | | | | | | | | | | | | | | | | | | This adds a new object type "procedure" that is similar to a function but does not have a return type and is invoked by the new CALL statement instead of SELECT or similar. This implementation is aligned with the SQL standard and compatible with or similar to other SQL implementations. This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well as ALTER/DROP ROUTINE that can refer to either a function or a procedure (or an aggregate function, as an extension to SQL). There is also support for procedures in various utility commands such as COMMENT and GRANT, as well as support in pg_dump and psql. Support for defining procedures is available in all the languages supplied by the core distribution. While this commit is mainly syntax sugar around existing functionality, future features will rely on having procedures as a separate object type. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
* Make create_unique_path manage memory like mark_dummy_rel.Robert Haas2017-11-30
| | | | | | | | | | | | | | | | | | Put the unique path in the same context as the owning RelOptInfo, rather than the toplevel planner context. This is how this function worked originally, but commit f41803bb39bc2949db200116a609fd242d0ec221 changed it without explanation. mark_dummy_rel adopted the older (or newer?) technique in commit eca75a12a27d28b972fc269c1c8813cd8eb15441, which also featured a much better explanation of why it is correct. So, switch back to that technique here, with the same explanation given there. Although this fixes a possible memory leak when GEQO is in use, the leak is minor and probably nobody cares, so no back-patch. Ashutosh Bapat, reviewed by Tom Lane and by me Discussion: http://postgr.es/m/CAFjFpRcXkHHrXyD9BCvkgGJV4TnHG2SWJ0PhJfrDu3NAcQvh7g@mail.gmail.com
* Fix neqjoinsel's behavior for semi/anti join cases.Tom Lane2017-11-29
| | | | | | | | | | | | | | | | | | | | | | | Previously, this function estimated the selectivity as 1 minus eqjoinsel() for the negator equality operator, regardless of join type (I think there was an expectation that eqjoinsel would handle the join type). But actually this is completely wrong for semijoin cases: the fraction of the LHS that has a non-matching row is not one minus the fraction of the LHS that has a matching row. In reality a semijoin with <> will nearly always succeed: it can only fail when the RHS is empty, or it contains a single distinct value that is equal to the particular LHS value, or the LHS value is null. The only one of those things we should have much confidence in estimating is the fraction of LHS values that are null, so let's just take the selectivity as 1 minus outer nullfrac. Per coding convention, antijoin should be estimated the same as semijoin. Arguably this is a bug fix, but in view of the lack of field complaints and the risk of destabilizing plans, no back-patch. Thomas Munro, reviewed by Ashutosh Bapat Discussion: https://postgr.es/m/CAEepm=270ze2hVxWkJw-5eKzc3AB4C9KpH3L2kih75R5pdSogg@mail.gmail.com
* Add a barrier primitive for synchronizing backends.Andres Freund2017-11-29
| | | | | | | | | | | | | | | | Provide support for dynamic or static parties of processes to wait for all processes to reach point in the code before continuing. This is similar to the mechanism of the same name in POSIX threads and MPI, though has explicit phasing and dynamic party support like the Java core library's Phaser. This will be used by an upcoming patch adding support for parallel hash joins. Author: Thomas Munro Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAEepm=2_y7oi01OjA_wLvYcWMc9_d=LaoxrY3eiROCZkB_qakA@mail.gmail.com
* New C function: bms_add_rangeRobert Haas2017-11-29
| | | | | | | | | | | This will be used by pending patches to improve partition pruning. Amit Langote and Kyotaro Horiguchi, per a suggestion from David Rowley. Review and testing of the larger patch set of which this is a part by Ashutosh Bapat, David Rowley, Dilip Kumar, Jesper Pedersen, Rajkumar Raghuwanshi, Beena Emerson, Amul Sul, and Kyotaro Horiguchi. Discussion: http://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp
* Update typedefs.list and re-run pgindentRobert Haas2017-11-29
| | | | Discussion: http://postgr.es/m/CA+TgmoaA9=1RWKtBWpDaj+sF3Stgc8sHgf5z=KGtbjwPLQVDMA@mail.gmail.com
* Clarify old comment about qual_is_pushdown_safe's handling of subplans.Tom Lane2017-11-28
| | | | | This comment glossed over the difference between initplans and subplans, but they are indeed different for our purposes here.
* Make memset() use sizeof() rather than re-compute sizeAlvaro Herrera2017-11-29
| | | | | | | This is simpler and more closely follows overwhelming precedent. Report and patch by Mark Dilger. Discussion: https://postgr.es/m/9A68FB88-5F45-4848-9926-8586E2D777D1@gmail.com
* Fix extstat collection when no stats are produced for a columnAlvaro Herrera2017-11-28
| | | | | | | This is a mistakenly placed conditional in bf2a691e02d7. Reported by Justin Pryzby Discussion: https://postgr.es/m/20171117214352.GE25796@telsasoft.com
* Fix wrong function name in comment.Robert Haas2017-11-28
| | | | | | Rushabh Lathia Discussion: http://postgr.es/m/CAGPqQf2z5g+7YmGZSZgKoiFsaUB+63Rzmz8-5PQHuS6hd14FEg@mail.gmail.com
* If a range-partitioned table has no default partition, reject null keys.Robert Haas2017-11-28
| | | | | | | | | | Commit 4e5fe9ad19e14af360de7970caa8b150436c9dec introduced this problem. Also add a test so it doesn't get broken again. Report by Rushabh Lathia. Fix by Amit Langote. Reviewed by Rushabh Lathia and Amul Sul. Tweaked by me. Discussion: http://postgr.es/m/CAGPqQf0Y1iJyk4QJBdMf=pS9i6Q0JUMM_h5-qkR3OMJ-e04PyA@mail.gmail.com
* Fix ReinitializeParallelDSM to tolerate finding no error queues.Robert Haas2017-11-28
| | | | | | | | | | | | | | | | Commit d4663350646ca0c069a36d906155a0f7e3372eb7 changed things so that shm_toc_lookup would fail with an error rather than silently returning NULL in the hope that such failures would be reported in a useful way rather than via a system crash. However, it overlooked the fact that the lookup of PARALLEL_KEY_ERROR_QUEUE in ReinitializeParallelDSM is expected to fail when no DSM segment was created in the first place; in that case, we end up with a backend-private memory segment that still contains an entry for PARALLEL_KEY_FIXED but no others. Consequently a benign failure to initialize parallelism can escalate into an elog(ERROR); repair. Discussion: http://postgr.es/m/CA+Tgmob8LFw55DzH1QEREpBEA9RJ_W_amhBFCVZ6WMwUhVpOqg@mail.gmail.com
* Teach bitmap heap scan to cope with absence of a DSA.Robert Haas2017-11-28
| | | | | | | | | | | | If we have a plan that uses parallelism but are unable to execute it using parallelism, for example due to a lack of available DSM segments, then the EState's es_query_dsa will be NULL. Parallel bitmap heap scan needs to fall back to a non-parallel scan in such cases. Patch by me, reviewed by Dilip Kumar Discussion: http://postgr.es/m/CAEepm=0kADK5inNf_KuemjX=HQ=PuTP0DykM--fO5jS5ePVFEA@mail.gmail.com
* Add null test to partition constraint for default range partitions.Robert Haas2017-11-28
| | | | | | | | | | | | | | | Non-default range partitions have a constraint which include null tests, and both default and non-default list partitions also have a constraint which includes null tests, but for some reason this was missed for default range partitions. This could cause the partition constraint to evaluate to false for rows that were (correctly) routed to that partition by insert tuple routing, which could in turn cause constraint exclusion to prune the default partition in cases where it should not. Amit Langote, reviewed by Kyotaro Horiguchi Discussion: http://postgr.es/m/ba7aaeb1-4399-220e-70b4-62eade1522d0@lab.ntt.co.jp
* Fix typo.Robert Haas2017-11-28
| | | | | | Masahiko Sawada Discussion: http://postgr.es/m/CAD21AoCq_QG6UEo6yb-purmhLQjDLsCA2_B+8Wh3ah9P2-XmrQ@mail.gmail.com
* Fix assorted syscache lookup sloppiness in partition-related code.Tom Lane2017-11-27
| | | | | | | | | | | | | | | | | | | | | heap_drop_with_catalog and ATExecDetachPartition neglected to check for SearchSysCache failures, as noted in bugs #14927 and #14928 from Pan Bian. Such failures are pretty unlikely, since we should already have some sort of lock on the rel at these points, but it's neither a good idea nor per project style to omit a check for failure. Also, StorePartitionKey contained a syscache lookup that it never did anything with, including never releasing the result. Presumably the reason why we don't see refcount-leak complaints is that the lookup always fails; but in any case it's pretty useless, so remove it. All of these errors were evidently introduced by the relation partitioning feature. Back-patch to v10 where that came in. Amit Langote and Tom Lane Discussion: https://postgr.es/m/20171127090105.1463.3962@wrigleys.postgresql.org Discussion: https://postgr.es/m/20171127091341.1468.72696@wrigleys.postgresql.org
* Fix creation of resjunk tlist entries for inherited mixed UPDATE/DELETE.Tom Lane2017-11-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rewriteTargetListUD's processing is dependent on the relkind of the query's target table. That was fine at the time it was made to act that way, even for queries on inheritance trees, because all tables in an inheritance tree would necessarily be plain tables. However, the 9.5 feature addition allowing some members of an inheritance tree to be foreign tables broke the assumption that rewriteTargetListUD's output tlist could be applied to all child tables with nothing more than column-number mapping. This led to visible failures if foreign child tables had row-level triggers, and would also break in cases where child tables belonged to FDWs that used methods other than CTID for row identification. To fix, delay running rewriteTargetListUD until after the planner has expanded inheritance, so that it is applied separately to the (already mapped) tlist for each child table. We can conveniently call it from preprocess_targetlist. Refactor associated code slightly to avoid the need to heap_open the target relation multiple times during preprocess_targetlist. (The APIs remain a bit ugly, particularly around the point of which steps scribble on parse->targetList and which don't. But avoiding such scribbling would require a change in FDW callback APIs, which is more pain than it's worth.) Also fix ExecModifyTable to ensure that "tupleid" is reset to NULL when we transition from rows providing a CTID to rows that don't. (That's really an independent bug, but it manifests in much the same cases.) Add a regression test checking one manifestation of this problem, which was that row-level triggers on a foreign child table did not work right. Back-patch to 9.5 where the problem was introduced. Etsuro Fujita, reviewed by Ildus Kurbangaliev and Ashutosh Bapat Discussion: https://postgr.es/m/20170514150525.0346ba72@postgrespro.ru
* Pad XLogReaderState's per-buffer data_bufsz more aggressively.Simon Riggs2017-11-27
| | | | | | | | | | | | Originally, we palloc'd this buffer just barely big enough to hold the largest xlog backup block seen so far. We now MAXALIGN the palloc size. The original coding could result in many repeated palloc cycles, in the worst case where we see a series ofgradually larger xlog records. We ameliorate that similarly to 8735978e7aebfbc499843630131c18d1f7346c79 by imposing a minimum buffer size of BLCKSZ. Discussion: https://postgr.es/m/E1eHa4J-0006hI-Q8@gemulon.postgresql.org
* Pad XLogReaderState's main_data buffer more aggressively.Tom Lane2017-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | Originally, we palloc'd this buffer just barely big enough to hold the largest xlog record seen so far. It turns out that that can result in valgrind complaints, because some compilers will emit code that assumes it can safely fetch padding bytes at the end of a struct, and those padding bytes were unallocated so far as aset.c was concerned. We can fix that by MAXALIGN'ing the palloc request size, ensuring that it is big enough to include any possible padding that might've been omitted from the on-disk record. An additional objection to the original coding is that it could result in many repeated palloc cycles, in the worst case where we see a series of gradually larger xlog records. We can ameliorate that cheaply by imposing a minimum buffer size that's large enough for most xlog records. BLCKSZ/2 was chosen after a bit of discussion. In passing, remove an obsolete comment in struct xl_heap_new_cid that the combocid field is free due to alignment considerations. Perhaps that was true at some point, but it's not now. Back-patch to 9.5 where this code came in. Discussion: https://postgr.es/m/E1eHa4J-0006hI-Q8@gemulon.postgresql.org
* Make has_sequence_privilege support WITH GRANT OPTIONJoe Conway2017-11-26
| | | | | | | | | | | The various has_*_privilege() functions all support an optional WITH GRANT OPTION added to the supported privilege types to test whether the privilege is held with grant option. That is, all except has_sequence_privilege() variations. Fix that. Back-patch to all supported branches. Discussion: https://postgr.es/m/005147f6-8280-42e9-5a03-dd2c1e4397ef@joeconway.com
* Repair failure with SubPlans in multi-row VALUES lists.Tom Lane2017-11-25
| | | | | | | | | | | | | | | | | | | | | | | When nodeValuesscan.c was written, it was impossible to have a SubPlan in VALUES --- any sub-SELECT there would have to be uncorrelated and thereby would produce an InitPlan instead. We therefore took a shortcut in the logic that throws away a ValuesScan's per-row expression evaluation data structures. This was broken by the introduction of LATERAL however; a sub-SELECT containing a lateral reference produces a correlated SubPlan. The cleanest fix for this would be to give up the optimization of discarding the expression eval state. But that still seems pretty unappetizing for long VALUES lists. It seems to work to just prevent the subexpressions from hooking into the ValuesScan node's subPlan list, so let's do that and see how well it works. (If this breaks, due to additional connections between the subexpressions and the outer query structures, we might consider compromises like throwing away data only for VALUES rows not containing SubPlans.) Per bug #14924 from Christian Duta. Back-patch to 9.3 where LATERAL was introduced. Discussion: https://postgr.es/m/20171124120836.1463.5310@wrigleys.postgresql.org
* Update buffile.h/.c comments for removal of non-temp option.Tom Lane2017-11-25
| | | | | | | | | | | | | | | | | | | | | Commit 11e264517 removed BufFile's isTemp flag, thereby eliminating the possibility of resurrecting BufFileCreate(). But it left that function in place, as well as a bunch of comments describing how things worked for the non-temp-file case. At best, that's now a source of confusion. So remove the long-since-commented-out function and change relevant comments. I (tgl) wanted to rename BufFileCreateTemp() to BufFileCreate(), but that seems not to be the consensus position, so leave it as-is. In passing, fix commit f0828b2fc's failure to update BufFileSeek's comment to match the change of its argument type from long to off_t. (I think that might actually have been intentional at the time, but now that 64-bit off_t is nearly universal, it looks anachronistic.) Thomas Munro and Tom Lane Discussion: https://postgr.es/m/E1eFVyl-0008J1-RO@gemulon.postgresql.org
* Improve planner's handling of set-returning functions in grouping columns.Tom Lane2017-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve query_is_distinct_for() to accept SRFs in the targetlist when we can prove distinctness from a DISTINCT clause. In that case the de-duplication will surely happen after SRF expansion, so the proof still works. Continue to punt in the case where we'd try to prove distinctness from GROUP BY (or, in the future, source relations). To do that, we'd have to determine whether the SRFs were in the grouping columns or elsewhere in the tlist, and it still doesn't seem worth the trouble. But this trivial change allows us to recognize that "SELECT DISTINCT unnest(foo) FROM ..." produces unique-ified output, which seems worth having. Also, fix estimate_num_groups() to consider the possibility of SRFs in the grouping columns. Its failure to do so was masked before v10 because grouping_planner() scaled up plan rowcount estimates by the estimated SRF multiplier after performing grouping. That doesn't happen anymore, which is more correct, but it means we need an adjustment in the estimate for the number of groups. Failure to do this leads to an underestimate for the number of output rows of subqueries like "SELECT DISTINCT unnest(foo)" compared to what 9.6 and earlier estimated, thus breaking plan choices in some cases. Per report from Dmitry Shalashov. Back-patch to v10 to avoid degraded plan choices compared to previous releases. Discussion: https://postgr.es/m/CAKPeCUGAeHgoh5O=SvcQxREVkoX7UdeJUMj1F5=aBNvoTa+O8w@mail.gmail.com
* Avoid projecting tuples unnecessarily in Gather and Gather Merge.Robert Haas2017-11-25
| | | | | | | | | | | | | It's most often the case that the target list for the Gather (Merge) node matches the target list supplied by the underlying plan node; when this is so, we can avoid the overhead of projecting. This depends on commit f455e1125e2588d4cd4fc663c6a10da4e003a3b5 for proper functioning. Idea by Andres Freund. Patch by me. Review by Amit Kapila. Discussion: http://postgr.es/m/CA+TgmoZ0ZL=cesZFq8c9NnfK6bqy-wwUd3_74iYGodYrSoQ7Fw@mail.gmail.com
* Improve valgrind logic in aset.c, and fix multiple issues in generation.c.Tom Lane2017-11-24
| | | | | | | | | | | | | | | | | | | | | Revise aset.c so that all the "private" fields of chunk headers are marked NOACCESS when outside the module, improving on the previous coding which protected only requested_size. Fix a couple of corner case bugs, such as failing to re-protect the header during a failure exit from AllocSetRealloc, and wrong padding-size calculation for an oversize allocation request. Apply the same design to generation.c, and also fix several bugs therein that I found by dint of hacking the code to use generation.c as the standard allocator and then running the core regression tests with it. Notably, we have to track the actual size of each block, else the wipe_mem call in GenerationReset clears the wrong amount of memory for an oversize-chunk block; and GenerationCheck needs a way of identifying freed chunks that isn't fooled by palloc(0). I chose to fix the latter by resetting the context pointer to NULL in a freed chunk, roughly like what happens in a freed aset.c chunk. Discussion: https://postgr.es/m/E1eHa4J-0006hI-Q8@gemulon.postgresql.org
* Mostly-cosmetic improvements in memory chunk header alignment coding.Tom Lane2017-11-24
| | | | | | | | | | | | | | | Add commentary about what we're doing and why. Apply the method used for padding in GenerationChunk to AllocChunkData, replacing the rather ad-hoc solution used in commit 7e3aa03b4. Reorder fields in GenerationChunk so that the padding calculation will work even if sizeof(size_t) is different from sizeof(void *) --- likely that will never happen, but we don't need the assumption if we do it like this. Improve static assertions about alignment. In passing, fix a couple of oversights in the "large chunk" path in GenerationAlloc(). Discussion: https://postgr.es/m/E1eHa4J-0006hI-Q8@gemulon.postgresql.org
* Fix bug in generation.c's valgrind support.Tom Lane2017-11-24
| | | | | | | | | This doesn't look like the last such bug, but it's one that the test_decoding regression test is tripping over. Per buildfarm. Tomas Vondra Discussion: https://postgr.es/m/c903f275-2150-fa52-64bf-dca7b53ebf8d@fuzzy.cz
* RLS comment fixes.Dean Rasheed2017-11-24
| | | | | | The comments in get_policies_for_relation() say that CREATE POLICY does not support defining restrictive policies. This is no longer true, starting from PG10.
* Fix handling of NULLs returned by aggregate combine functions.Andres Freund2017-11-23
| | | | | | | | | | | | | | | | | | | | | | | When strict aggregate combine functions, used in multi-stage/parallel aggregation, returned NULL, we didn't check for that, invoking the combine function with NULL the next round, despite it being strict. The equivalent code invoking normal transition functions has a check for that situation, which did not get copied in a7de3dc5c346. Fix the bug by adding the equivalent check. Based on a quick look I could not find any strict combine functions in core actually returning NULL, and it doesn't seem very likely external users have done so. So this isn't likely to have caused issues in practice. Add tests verifying transition / combine functions returning NULL is tested. Reported-By: Andres Freund Author: Andres Freund Discussion: https://postgr.es/m/20171121033642.7xvmjqrl4jdaaat3@alap3.anarazel.de Backpatch: 9.6, where parallel aggregation was introduced
* Ensure sizeof(GenerationChunk) is maxaligned.Tom Lane2017-11-23
| | | | | | Per buildfarm. Also improve some comments.
* Tweak code for older compilersSimon Riggs2017-11-23
| | | | | | Attempt to quiesce build farm Author: Tomas Vondra
* Generational memory allocatorSimon Riggs2017-11-23
| | | | | | | | | | | | Add new style of memory allocator, known as Generational appropriate for use in cases where memory is allocated and then freed in roughly oldest first order (FIFO). Use new allocator for logical decoding’s reorderbuffer to significantly reduce memory usage and improve performance. Author: Tomas Vondra Reviewed-by: Simon Riggs
* Set es_output_cid in replication workerSimon Riggs2017-11-22
| | | | | | | Allows triggers to operate correctly Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Konstantin Knizhnik <k.knizhnik@postgrespro.ru>
* Provide for forward compatibility with future minor protocol versions.Robert Haas2017-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, any attempt to request a 3.x protocol version other than 3.0 would lead to a hard connection failure, which made the minor protocol version really no different from the major protocol version and precluded gentle protocol version breaks. Instead, when the client requests a 3.x protocol version where x is greater than 0, send the new NegotiateProtocolVersion message to convey that we support only 3.0. This makes it possible to introduce new minor protocol versions without requiring a connection retry when the server is older. In addition, if the startup packet includes name/value pairs where the name starts with "_pq_.", assume that those are protocol options, not GUCs. Include those we don't support (i.e. all of them, at present) in the NegotiateProtocolVersion message so that the client knows they were not understood. This makes it possible for the client to request previously-unsupported features without bumping the protocol version at all; the client can tell from the server's response whether the option was understood. It will take some time before servers that support these new facilities become common in the wild; to speed things up and make things easier for a future 3.1 protocol version, back-patch to all supported releases. Robert Haas and Badrul Chowdhury Discussion: http://postgr.es/m/BN6PR21MB0772FFA0CBD298B76017744CD1730@BN6PR21MB0772.namprd21.prod.outlook.com Discussion: http://postgr.es/m/30788.1498672033@sss.pgh.pa.us
* Fix multiple problems with satisfies_hash_partition.Robert Haas2017-11-21
| | | | | | | | | | | | | | | | Fix the function header comment to describe the actual behavior. Check that table OID, modulus, and remainder arguments are not NULL before accessing them. Check that the modulus and remainder are sensible. If the table OID doesn't exist, return NULL instead of emitting an internal error, similar to what we do elsewhere. Check that the actual argument types match, or at least are binary coercible to, the expected argument types. Correctly handle invocation of this function using the VARIADIC syntax. Add regression tests. Robert Haas and Amul Sul, per a report by Andreas Seltenreich and subsequent followup investigation. Discussion: http://postgr.es/m/871sl4sdrv.fsf@ansel.ydns.eu
* Use out-of-line M68K spinlock code for OpenBSD as well as NetBSD.Tom Lane2017-11-20
| | | | | | David Carlier (from a patch being carried by OpenBSD packagers) Discussion: https://postgr.es/m/CA+XhMqzwFSGVU7MEnfhCecc8YdP98tigXzzpd0AAdwaGwaVXEA@mail.gmail.com