| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a shared iterator is used, each call to tbm_shared_iterate()
returns a result that has not yet been returned to any process
attached to the shared iterator. In other words, each cooperating
processes gets a disjoint subset of the full result set, but all
results are returned exactly once.
This is infrastructure for parallel bitmap heap scan.
Dilip Kumar. The larger patch set of which this is a part has been
reviewed and tested by (at least) Andres Freund, Amit Khandekar,
Tushar Ahuja, Rafia Sabih, Haribabu Kommi, and Thomas Munro.
Discussion: http://postgr.es/m/CAFiTN-uc4=0WxRGfCzs-xfkMYcSEWUC-Fon6thkJGjkh9i=13A@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the whole row is shown without column names. Instead,
adopt a style similar to _bt_check_unique() in ExecFindPartition()
and show the failing key: (key1, ...) = (val1, ...).
Amit Langote, per a complaint from Simon Riggs. Reviewed by me;
I also adjusted the grammar in one of the comments.
Discussion: http://postgr.es/m/9f9dc7ae-14f0-4a25-5485-964d9bfc19bd@lab.ntt.co.jp
|
|
|
|
|
|
|
|
|
|
| |
The final patch will be less messy if the prefetching support is
a bit better isolated, so do that.
Dilip Kumar, with some changes by me. The larger patch set of which
this is a part has been reviewed and tested by (at least) Andres
Freund, Amit Khandekar, Tushar Ahuja, Rafia Sabih, Haribabu Kommi, and
Thomas Munro.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, only IndexTuple format was supported for the output data of
an index-only scan. This is fine for btree, which is just returning a
verbatim index tuple anyway. It's not so fine for SP-GiST, which can
return reconstructed data that's much larger than a page.
To fix, extend the index AM API so that index-only scan data can be
returned in either HeapTuple or IndexTuple format. There's other ways
we could have done it, but this way avoids an API break for index AMs
that aren't concerned with the issue, and it costs little except a couple
more fields in IndexScanDescs.
I changed both GiST and SP-GiST to use the HeapTuple method. I'm not
very clear on whether GiST can reconstruct data that's too large for an
IndexTuple, but that seems possible, and it's not much of a code change to
fix.
Per a complaint from Vik Fearing. Reviewed by Jason Li.
Discussion: https://postgr.es/m/49527f79-530d-0bfe-3dad-d183596afa92@2ndquadrant.fr
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is expected to be useful mostly when performing such scans in
parallel, because in that case it allows (in combination with commit
acf555bc53acb589b5a2827e65d655fa8c9adee0) nodes below a Gather to get
control just before the DSM segment goes away.
KaiGai Kohei, except that I rewrote the documentation. Reviewed by
Claudio Freire.
Discussion: http://postgr.es/m/CADyhKSXJK0jUJ8rWv4AmKDhsUh124_rEn39eqgfC5D8fu6xVuw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Twiddle the replication-related code so that its timestamp variables
are declared TimestampTz, rather than the uninformative "int64" that
was previously used for meant-to-be-always-integer timestamps.
This resolves the int64-vs-TimestampTz declaration inconsistencies
introduced by commit 7c030783a, though in the opposite direction to
what was originally suggested.
This required including datatype/timestamp.h in a couple more places
than before. I decided it would be a good idea to slim down that
header by not having it pull in <float.h> etc, as those headers are
no longer at all relevant to its purpose. Unsurprisingly, a small number
of .c files turn out to have been depending on those inclusions, so add
them back in the .c files as needed.
Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us
Discussion: https://postgr.es/m/27694.1487456324@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
| |
With this change, you can see the query that a parallel worker is
executing in pg_stat_activity, and if the worker crashes you can
see what query it was executing when it crashed.
Rafia Sabih, reviewed by Kuntal Ghosh and Amit Kapila and slightly
revised by me.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It turns out that the original shutdown order here does not work well.
Multiple people attempting to develop further parallel query patches
have discovered that they need to do cleanup before the DSM goes away,
and you can't do that if the parent node gets cleaned up first.
Patch by me, reviewed by KaiGai Kohei and Dilip Kumar.
Discussion: http://postgr.es/m/CA+TgmoY6bOc1YnhcAQnMfCBDbsJzROQ3sYxSAL-SYB5tMJcTKg@mail.gmail.com
Discussion: http://postgr.es/m/9A28C8860F777E439AA12E8AEA7694F8012AEB82@BPXM15GP.gisp.nec.co.jp
Discussion: http://postgr.es/m/CA+TgmoYuPOc=+xrG1v0fCsoLbKAab9F1ddOeaaiLMzKOiBar1Q@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 5262f7a4fc44f651241d2ff1fa688dd664a34874 added similar support
for parallel index scans; this extends that work to index-only scans.
As with parallel index scans, this requires support from the index AM,
so currently parallel index-only scans will only be possible for btree
indexes.
Rafia Sabih, reviewed and tested by Rahila Syed, Tushar Ahuja,
and Amit Kapila
Discussion: http://postgr.es/m/CAOGQiiPEAs4C=TBp0XShxBvnWXuzGL2u++Hm1=qnCpd6_Mf8Fw@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The loops in ExecHashJoinNewBatch(), ExecHashIncreaseNumBatches(), and
ExecHashRemoveNextSkewBucket() are all capable of iterating over many
tuples without ever doing a CHECK_FOR_INTERRUPTS, so that the backend
might fail to respond to SIGINT or SIGTERM for an unreasonably long time.
Fix that. In the case of ExecHashJoinNewBatch(), it seems useful to put
the added CHECK_FOR_INTERRUPTS into ExecHashJoinGetSavedTuple() rather
than directly in the loop, because that will also ensure that both
principal code paths through ExecHashJoinOuterGetTuple() will do a
CHECK_FOR_INTERRUPTS, which seems like a good idea to avoid surprises.
Back-patch to all supported branches.
Tom Lane and Thomas Munro
Discussion: https://postgr.es/m/6044.1487121720@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
| |
In combination with 569174f1be92be93f5366212cc46960d28a5c5cd, which
taught the btree AM how to perform parallel index scans, this allows
parallel index scan plans on btree indexes. This infrastructure
should be general enough to support parallel index scans for other
index AMs as well, if someone updates them to support parallel
scans.
Amit Kapila, reviewed and tested by Anastasia Lubennikova, Tushar
Ahuja, and Haribabu Kommi, and me.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This doesn't do anything to make Param nodes anything other than
parallel-restricted, so this only helps with uncorrelated subplans,
and it's not necessarily very cheap because each worker will run the
subplan separately (just as a Hash Join will build a separate copy of
the hash table in each participating process), but it's a first step
toward supporting cases that are more likely to help in practice, and
is occasionally useful on its own.
Amit Kapila, reviewed and tested by Rafia Sabih, Dilip Kumar, and
me.
Discussion: http://postgr.es/m/CAA4eK1+e8Z45D2n+rnDMDYsVEb5iW7jqaCH_tvPMYau=1Rru9w@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Even if we don't emit definitions for SH_ALLOCATE and SH_FREE, we
still need prototypes. The user can't define them before including
simplehash.h because SH_TYPE isn't available yet.
For the allocator to be able to access private_data, it needs to
become an argument to SH_CREATE. Previously we relied on callers
to set that after returning from SH_CREATE, but SH_CREATE calls
SH_ALLOCATE before returning.
Dilip Kumar, reviewed by me.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's always been possible for index AMs to cache data across successive
amgettuple calls within a single SQL command: the IndexScanDesc.opaque
field is meant for precisely that. However, no comparable facility
exists for amortizing setup work across successive aminsert calls.
This patch adds such a feature and teaches GIN, GIST, and BRIN to use it
to amortize catalog lookups they'd previously been doing on every call.
(The other standard index AMs keep everything they need in the relcache,
so there's little to improve there.)
For GIN, the overall improvement in a statement that inserts many rows
can be as much as 10%, though it seems a bit less for the other two.
In addition, this makes a really significant difference in runtime
for CLOBBER_CACHE_ALWAYS tests, since in those builds the repeated
catalog lookups are vastly more expensive.
The reason this has been hard up to now is that the aminsert function is
not passed any useful place to cache per-statement data. What I chose to
do is to add suitable fields to struct IndexInfo and pass that to aminsert.
That's not widening the index AM API very much because IndexInfo is already
within the ken of ambuild; in fact, by passing the same info to aminsert
as to ambuild, this is really removing an inconsistency in the AM API.
Discussion: https://postgr.es/m/27568.1486508680@sss.pgh.pa.us
|
|
|
|
|
|
|
| |
This method is more elegant and more efficient.
Per a suggestion from Andres Freund, who also briefly reviewed
the patch.
|
|
|
|
|
|
|
|
| |
This is infrastructure for a pending patch to allow parallel bitmap
heap scans.
Dilip Kumar, reviewed (in earlier versions) by Andres Freund and
(more recently) by me. Some further renaming by me, also.
|
|
|
|
|
|
|
|
|
| |
Backpatch to all supported versions, where applicable, to make backpatching
of future fixes go more smoothly.
Josh Soref
Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com
|
|
|
|
| |
Rafia Sabih
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When I wrote commit ab1f0c822, I really missed the castNode() macro that
Peter E. had proposed shortly before. This back-fills the uses I would
have put it to. It's probably not all that significant, but there are
more assertions here than there were before, and conceivably they will
help catch any bugs associated with those representation changes.
I left behind a number of usages like "(Query *) copyObject(query_var)".
Those could have been converted as well, but Peter has proposed another
notational improvement that would handle copyObject cases automatically,
so I let that be for now.
|
|
|
|
|
|
|
|
|
| |
This is far from a pervasive conversion, but it's a good starting
point.
Author: Peter Eisentraut, with some minor changes by me
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/c5d387d9-3440-f5e0-f9d4-71d53b9fbe52@2ndquadrant.com
|
| |
|
|
|
|
|
|
|
|
| |
Previously, ExecInitModifyTable was missing handling for WITH CHECK
OPTION, and view_query_is_auto_updatable was missing handling for
RELKIND_PARTITIONED_TABLE.
Amit Langote, reviewed by me.
|
|
|
|
|
|
|
|
|
|
|
|
| |
In 2ac3ef7a01df859c62d0a02333b646d65eaec5ff, we changed things so that
it's possible for a different TupleTableSlot to be used for partitioned
tables at successively lower levels. If we do end up changing the slot
from the original, we must update ecxt_scantuple to point to the new one
for partition key of the tuple to be computed correctly.
Reported by Rajkumar Raghuwanshi. Patch by Amit Langote.
Discussion: http://postgr.es/m/CAKcux6%3Dm1qyqB2k6cjniuMMrYXb75O-MB4qGQMu8zg-iGGLjDw%40mail.gmail.com
|
|
|
|
|
|
| |
We've accumulated quite a bit of stuff with which pgindent is not
quite happy in this code; clean it up to provide a less-annoying base
for future pgindent runs.
|
|
|
|
|
|
|
|
|
| |
Coverity complained quite properly that commit ea15e1867 had introduced
unreachable code into ExecGather(); to wit, it was no longer possible to
iterate the final for-loop more or less than once. So remove the for().
In passing, clean up a couple of comments, and make better use of a local
variable.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Add PUBLICATION catalogs and DDL
- Add SUBSCRIPTION catalog and DDL
- Define logical replication protocol and output plugin
- Add logical replication workers
From: Petr Jelinek <petr@2ndquadrant.com>
Reviewed-by: Steve Singer <steve@ssinger.info>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Erik Rijkers <er@xs4all.nl>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 69f4b9c plain expression evaluation (and thus normal projection)
can't return sets of tuples anymore. Thus remove code dealing with
that possibility.
This will require adjustments in external code using
ExecEvalExpr()/ExecProject() - that should neither be hard nor very
common.
Author: Andres Freund and Tom Lane
Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
| |
In ExecInsert(), do not switch back to the root partitioned table
ResultRelInfo until after we finish ExecProcessReturning(), so that
RETURNING projection is done using the partition's descriptor. For
the projection to work correctly, we must initialize the same for each
leaf partition during ModifyTableState initialization.
Amit Langote
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a tuple is inherited into a partitioning root, no partition
constraints need to be enforced; when it is inserted into a leaf, the
parent's partitioning quals needed to be enforced. The previous
coding got both of those cases right. When a tuple is inserted into
an intermediate level of the partitioning hierarchy (i.e. a table
which is both a partition itself and in turn partitioned), it must
enforce the partitioning qual inherited from its parent. That case
got overlooked; repair.
Amit Langote
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Evaluation of set returning functions (SRFs_ in the targetlist (like SELECT
generate_series(1,5)) so far was done in the expression evaluation (i.e.
ExecEvalExpr()) and projection (i.e. ExecProject/ExecTargetList) code.
This meant that most executor nodes performing projection, and most
expression evaluation functions, had to deal with the possibility that an
evaluated expression could return a set of return values.
That's bad because it leads to repeated code in a lot of places. It also,
and that's my (Andres's) motivation, made it a lot harder to implement a
more efficient way of doing expression evaluation.
To fix this, introduce a new executor node (ProjectSet) that can evaluate
targetlists containing one or more SRFs. To avoid the complexity of the old
way of handling nested expressions returning sets (e.g. having to pass up
ExprDoneCond, and dealing with arguments to functions returning sets etc.),
those SRFs can only be at the top level of the node's targetlist. The
planner makes sure (via split_pathtarget_at_srfs()) that SRF evaluation is
only necessary in ProjectSet nodes and that SRFs are only present at the
top level of the node's targetlist. If there are nested SRFs the planner
creates multiple stacked ProjectSet nodes. The ProjectSet nodes always get
input from an underlying node.
We also discussed and prototyped evaluating targetlist SRFs using ROWS
FROM(), but that turned out to be more complicated than we'd hoped.
While moving SRF evaluation to ProjectSet would allow to retain the old
"least common multiple" behavior when multiple SRFs are present in one
targetlist (i.e. continue returning rows until all SRFs are at the end of
their input at the same time), we decided to instead only return rows till
all SRFs are exhausted, returning NULL for already exhausted ones. We
deemed the previous behavior to be too confusing, unexpected and actually
not particularly useful.
As a side effect, the previously prohibited case of multiple set returning
arguments to a function, is now allowed. Not because it's particularly
desirable, but because it ends up working and there seems to be no argument
for adding code to prohibit it.
Currently the behavior for COALESCE and CASE containing SRFs has changed,
returning multiple rows from the expression, even when the SRF containing
"arm" of the expression is not evaluated. That's because the SRFs are
evaluated in a separate ProjectSet node. As that's quite confusing, we're
likely to instead prohibit SRFs in those places. But that's still being
discussed, and the code would reside in places not touched here, so that's
a task for later.
There's a lot of, now superfluous, code dealing with set return expressions
around. But as the changes to get rid of those are verbose largely boring,
it seems better for readability to keep the cleanup as a separate commit.
Author: Tom Lane and Andres Freund
Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen_fmgrtab.pl creates a new file fmgrprotos.h, which contains
prototypes for all functions registered in pg_proc.h. This avoids
having to manually maintain these prototypes across a random variety of
header files. It also automatically enforces a correct function
signature, and since there are warnings about missing prototypes, it
will detect functions that are defined but not registered in
pg_proc.h (or otherwise used).
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes several changes that improve the consistency of
representation of lists of statements. It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list. This patch brings
similar consistency to the outputs of raw parsing and planning steps:
* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.
* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements. In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node. This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.
Now, every list of statements has a consistent head-node type depending
on how far along it is in processing. This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.
Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way. It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)
Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list. While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.
The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement. This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)
There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.
Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes. This allows
more intelligent handling of cases where a source query string contains
multiple statements. This patch doesn't actually do anything with the
information, but a follow-on patch will. (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)
catversion bump because addition of location fields to struct Query
affects stored rules.
This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.
Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A client copy can't work inside a function because the FE/BE wire protocol
doesn't support nesting of a COPY operation within query results. (Maybe
it could, but the protocol spec doesn't suggest that clients should support
this, and libpq for one certainly doesn't.)
In most PLs, this prohibition is enforced by spi.c, but SQL functions don't
use SPI. A comparison of _SPI_execute_plan() and init_execution_state()
shows that rejecting client COPY is the only discrepancy in what they
allow, so there's no other similar bugs.
This is an astonishingly ancient oversight, so back-patch to all supported
branches.
Report: https://postgr.es/m/BY2PR05MB2309EABA3DEFA0143F50F0D593780@BY2PR05MB2309.namprd05.prod.outlook.com
|
|
|
|
| |
Amit Langote
|
|
|
|
|
|
| |
This commit purported to use a variable hash seed for Partial
HashAggregate, but actually did the opposite - it made us use a
variable seed for any HashAggregate that is NOT partial. Woops.
|
|
|
|
|
|
|
|
|
|
| |
With the old code, a backend that read pg_stat_activity without ever
having executed a parallel query might see a backend in the midst of
executing one waiting on a DSA LWLock, resulting in a crash. The
solution is for backends to register the tranche at startup time, not
the first time a parallel query is executed.
Report by Andreas Seltenreich. Patch by me, reviewed by Thomas Munro.
|
|
|
|
|
|
|
| |
RelationGetPartitionQual() and generate_partition_qual() are always
called with recurse = true, so we don't need an argument for that.
Extracted by me from a larger patch by Amit Langote.
|
|
|
|
|
|
|
|
|
| |
After a tuple is routed to a partition, it has been converted from the
root table's row type to the partition's row type. ExecConstraints
needs to report the failure using the original tuple and the parent's
tuple descriptor rather than the ones for the selected partition.
Amit Langote
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 2ac3ef7a01df859c62d0a02333b646d65eaec5ff added a TupleTapleSlot
for partition tuple slot to EState (es_partition_tuple_slot) but it's
more logical to have it as part of ModifyTableState
(mt_partition_tuple_slot) and CopyState (partition_tuple_slot).
Discussion: http://postgr.es/m/1bd459d9-4c0c-197a-346e-e5e59e217d97@lab.ntt.co.jp
Amit Langote, per a gripe from me
|
| |
|
|
|
|
|
| |
makeNode() is already a macro that has the right result pointer type, so
casting it again to the same type is unnecessary.
|
|
|
|
|
|
|
|
|
|
| |
I got a little annoyed by reading documentation paragraphs containing
both spellings within a few lines of each other. My dictionary says
"descendant" is the preferred spelling, and it's certainly the majority
usage in our tree, so standardize on that.
For one usage in parallel.sgml, I thought it better to rewrite to avoid
the term altogether.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous coding failed to work correctly when we have a
multi-level partitioned hierarchy where tables at successive levels
have different attribute numbers for the partition key attributes. To
fix, have each PartitionDispatch object store a standalone
TupleTableSlot initialized with the TupleDesc of the corresponding
partitioned table, along with a TupleConversionMap to map tuples from
the its parent's rowtype to own rowtype. After tuple routing chooses
a leaf partition, we must use the leaf partition's tuple descriptor,
not the root table's. To that end, a dedicated TupleTableSlot for
tuple routing is now allocated in EState.
Amit Langote
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the input value to a CoerceToDomain expression node is a read-write
expanded datum, we should pass a read-only pointer to any domain CHECK
expressions and then return the original read-write pointer as the
expression result. Previously we were blindly passing the same pointer to
all the consumers of the value, making it possible for a function in CHECK
to modify or even delete the expanded value. (Since a plpgsql function
will absorb a passed-in read-write expanded array as a local variable
value, it will in fact delete the value on exit.)
A similar hazard of passing the same read-write pointer to multiple
consumers exists in domain_check() and in ExecEvalCase, so fix those too.
The fix requires adding MakeExpandedObjectReadOnly calls at the appropriate
places, which is simple enough except that we need to get the data type's
typlen from somewhere. For the domain cases, solve this by redefining
DomainConstraintRef.tcache as okay for callers to access; there wasn't any
reason for the original convention against that, other than not wanting the
API of typcache.c to be any wider than it had to be. For CASE, there's
no good solution except to add a syscache lookup during executor start.
Per bug #14472 from Marcos Castedo. Back-patch to 9.5 where expanded
values were introduced.
Discussion: https://postgr.es/m/15225.1482431619@sss.pgh.pa.us
|
|
|
|
| |
Amit Langote
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
aggstate->evalproj is always set up by ExecInitAgg, so there's no
need to test. Doing so led Coverity to think that we might be
intending "slot" to be possibly NULL here, and it quite properly
complained that the rest of combine_aggregates() wasn't prepared
for that.
Also fix a couple of obvious thinkos in Asserts checking that
"inputoff" isn't past the end of the slot.
Errors introduced in commit 8ed3f11bb, so no need for back-patch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a query contained two aggregates that could share the transition value,
we would correctly collect the input into a tuplesort only once, but
incorrectly run the transition function over the accumulated input twice,
in finalize_aggregates(). That caused a crash, when we tried to call
tuplesort_performsort() on an already-freed NULL tuplestore.
Backport to 9.6, where sharing of transition state and this bug were
introduced.
Analysis by Tom Lane.
Discussion: https://www.postgresql.org/message-id/ac5b0b69-744c-9114-6218-8300ac920e61@iki.fi
|
|
|
|
|
|
|
| |
This will allow future parallel query code to dynamically allocate
storage shared by all participants.
Thomas Munro, with assorted changes by me.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 5dfc198146b49ce7ecc8a1fc9d5e171fb75f6ba5 introduced the use
of a new type of hash table with linear reprobing for hash aggregates.
Such a hash table behaves very poorly if keys are inserted in hash
order, which does in fact happen in the case where a query use a
Finalize HashAggregate node fed (via Gather) by a Partial
HashAggregate node. In fact, queries with this type of plan tend
to run effectively forever.
Fix that by seeding the hash value differently in each worker
(and in the leader, if it participates).
Andres Freund and Robert Haas
|