| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
| |
Refactor for upcoming optimizations.
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/1cc3b400a0e8eead18ff967436fa9e42c0c14cfb.camel@j-davis.com
|
|
|
|
|
|
|
| |
This reverts commit e0ece2a981ee9068f50c4423e303836c2585eb02 due to
performance regressions.
Reported-by: David Rowley
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the caller needed to allocate the memory and the
TupleHashTable would store a pointer to it. That wastes space for the
palloc overhead as well as the size of the pointer itself.
Now, the TupleHashTable relies on the caller to correctly specify the
additionalsize, and allocates that amount of space. The caller can
then request a pointer into that space.
Discussion: https://postgr.es/m/b9cbf0219a9859dc8d240311643ff4362fd9602c.camel@j-davis.com
Reviewed-by: Heikki Linnakangas
|
|
|
|
| |
Backpatch-through: 13
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was reasonable to preserve the old API of BuildTupleHashTable()
in the back branches, but in HEAD we should actively discourage use
of that version. There are no remaining callers in core, so just
get rid of it. Then rename BuildTupleHashTableExt() back to
BuildTupleHashTable().
While at it, fix up the miserably-poorly-maintained header comment
for BuildTupleHashTable[Ext]. It looks like more than one patch in
this area has had the opinion that updating comments is beneath them.
Discussion: https://postgr.es/m/538343.1734646986@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
0f5738202 adjusted the execGrouping.c code so it made use of ExprStates to
generate hash values. That commit made a wrong assumption that the slot
type to pass to ExecBuildHash32FromAttrs() is always &TTSOpsMinimalTuple.
That's not the case as the slot type depends on the slot type passed to
LookupTupleHashEntry(), which for nodeRecursiveunion.c, could be any of
the current slot types.
Here we fix this by adding a new parameter to BuildTupleHashTableExt()
to allow the slot type to be passed in. In the case of nodeSubplan.c
and nodeAgg.c the slot type is always &TTSOpsVirtual, so for both of
those cases, it's beneficial to pass the known slot type as that allows
ExecBuildHash32FromAttrs() to skip adding the tuple deform step to the
resulting ExprState. Another possible fix would have been to have
ExecBuildHash32FromAttrs() set "fetch.kind" to NULL so that
ExecComputeSlotInfo() always determines the EEOP_INNER_FETCHSOME is
required, however, that option isn't favorable as slows down aggregation
and hashed subplan evaluation due to the extra (needless) deform step.
Thanks to Nathan Bossart for bisecting to find the offending commit
based on Paul's report.
Reported-by: Paul Ramsey <pramsey@cleverelephant.ca>
Discussion: https://postgr.es/m/99F064C1-B3EB-4BE7-97D2-D2A0AA487A71@cleverelephant.ca
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This speeds up obtaining hash values for GROUP BY and hashed SubPlans by
using the ExprState support for hashing, thus allowing JIT compilation for
obtaining hash values for these operations.
This, even without JIT compilation, has been shown to improve Hash
Aggregate performance in some cases by around 15% and hashed NOT IN
queries in one case by over 30%, however, real-world cases are likely to
see smaller gains as the test cases used were purposefully designed to
have high hashing overheads by keeping the hash table small to prevent
additional memory overheads that would be a factor when working with large
hash tables.
In passing, fix a hypothetical bug in ExecBuildHash32Expr() so that the
initial value is stored directly in the ExprState's result field if
there are no expressions to hash. None of the current users of this
function use an initial value, so the bug is only hypothetical.
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/CAApHDvpYSO3kc9UryMevWqthTBrxgfd9djiAjKHMPUSQeX9vdQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bf6c614a2 did some conversion work to use ExprState instead of manually
calling equality functions to check if one set of values is not distinct
from another set. That patch removed many of the fields that became
redundant as a result of that change, but it forgot to remove
SubPlanState.tab_eq_funcs. Fix that.
In passing, fix the header comment for TupleHashEntryData to correctly
spell the field name it's talking about.
Author: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/CA+FpmFeycdombFzrjZw7Rmc29CVm4OOzCWwu=dVBQ6q=PX8SvQ@mail.gmail.com
Discussion: https://postgr.es/m/CAApHDvrWR2jYVhec=COyF2g2BE_ns91NDsCHAMFiXbyhEujKdQ@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now we generated an ExprState for each parameter to a SubPlan and
evaluated them one-by-one ExecScanSubPlan. That's sub-optimal as creating lots
of small ExprStates
a) makes JIT compilation more expensive
b) wastes memory
c) is a bit slower to execute
This commit arranges to evaluate parameters to a SubPlan as part of the
ExprState referencing a SubPlan, using the new EEOP_PARAM_SET expression
step. We emit one EEOP_PARAM_SET for each argument to a subplan, just before
the EEOP_SUBPLAN step.
It likely is worth using EEOP_PARAM_SET in other places as well, e.g. for
SubPlan outputs, nestloop parameters and - more ambitiously - to get rid of
ExprContext->domainValue/caseValue/ecxt_agg*. But that's for later.
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Discussion: https://postgr.es/m/20230225214401.346ancgjqc3zmvek@awork3.anarazel.de
|
|
|
|
|
|
|
|
| |
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz
Backpatch-through: 12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We already tried to fix this in commits 3f7323cbb et al (and follow-on
fixes), but now it emerges that there are still unfixed cases;
moreover, these cases affect all branches not only pre-v14. I thought
we had eliminated all cases of making multiple clones of an UPDATE's
target list when we nuked inheritance_planner. But it turns out we
still do that in some partitioned-UPDATE cases, notably including
INSERT ... ON CONFLICT UPDATE, because ExecInitPartitionInfo thinks
it's okay to clone and modify the parent's targetlist.
This fix is based on a suggestion from Andres Freund: let's stop
abusing the ParamExecData.execPlan mechanism, which was only ever
meant to handle initplans, and instead solve the execution timing
problem by having the expression compiler move MULTIEXPR_SUBLINK steps
to the front of their expression step lists. This is feasible because
(a) all branches still in support compile the entire targetlist of
an UPDATE into a single ExprState, and (b) we know that all
MULTIEXPR_SUBLINKs do need to be evaluated --- none could be buried
inside a CASE, for example. There is a minor semantics change
concerning the order of execution of the MULTIEXPR's subquery versus
other parts of the parent targetlist, but that seems like something
we can get away with. By doing that, we no longer need to worry
about whether different clones of a MULTIEXPR_SUBLINK share output
Params; their usage of that data structure won't overlap.
Per bug #17800 from Alexander Lakhin. Back-patch to all supported
branches. In v13 and earlier, we can revert 3f7323cbb and follow-on
fixes; however, I chose to keep the SubPlan.subLinkId field added
in ccbb54c72. We don't need that anymore in the core code, but it's
cheap enough to fill, and removing a plan node field in a minor
release seems like it'd be asking for trouble.
Andres Freund and Tom Lane
Discussion: https://postgr.es/m/17800-ff90866b3906c964@postgresql.org
|
|
|
|
| |
Backpatch-through: 11
|
|
|
|
|
|
|
|
|
| |
While the bug I just fixed in the back branches doesn't exist in
HEAD, the requirement that MULTIEXPR SubPlans not share output
parameters still does. Add a comment to memorialize that, because
perhaps it could be an issue again someday.
Discussion: https://postgr.es/m/17596-c5357f61427a81dc@postgresql.org
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Several places in the planner tried to clamp a double value to fit
in a "long" by doing
(long) Min(x, (double) LONG_MAX);
This is subtly incorrect, because it casts LONG_MAX to double and
potentially back again. If long is 64 bits then the double value
is inexact, and the platform might round it up to LONG_MAX+1
resulting in an overflow and an undesirably negative output.
While it's not hard to rewrite the expression into a safe form,
let's put it into a common function to reduce the risk of someone
doing it wrong in future.
In principle this is a bug fix, but since the problem could only
manifest with group count estimates exceeding 2^63, it seems unlikely
that anyone has actually hit this or will do so anytime soon. We're
fixing it mainly to satisfy fuzzer-type tools. That being the case,
a HEAD-only fix seems sufficient.
Andrey Lepikhov
Discussion: https://postgr.es/m/ebbc2efb-7ef9-bf2f-1ada-d6ec48f70e58@postgrespro.ru
|
|
|
|
| |
Backpatch-through: 10
|
|
|
|
| |
Backpatch-through: 9.5
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the planner erroneously puts a non-parallel-safe SubPlan into
a parallelized portion of the query tree, nodeSubplan.c will fail
in the worker processes because it finds a null in es_subplanstates,
which it's unable to cope with. It seems worth a test-and-elog to
make that an error case rather than a core dump case.
This probably should have been included in commit 16ebab688, which
was responsible for allowing nulls to appear in es_subplanstates
to begin with. So, back-patch to v10 where that came in.
Discussion: https://postgr.es/m/924226.1604422326@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When commit bd3daddaf introduced AlternativeSubPlans, I had some
ambitions towards allowing the choice of subplan to change during
execution. That has not happened, or even been thought about, in the
ensuing twelve years; so it seems like a failed experiment. So let's
rip that out and resolve the choice of subplan at the end of planning
(in setrefs.c) rather than during executor startup. This has a number
of positive benefits:
* Removal of a few hundred lines of executor code, since
AlternativeSubPlans need no longer be supported there.
* Removal of executor-startup overhead (particularly, initialization
of subplans that won't be used).
* Removal of incidental costs of having a larger plan tree, such as
tree-scanning and copying costs in the plancache; not to mention
setrefs.c's own costs of processing the discarded subplans.
* EXPLAIN no longer has to print a weird (and undocumented)
representation of an AlternativeSubPlan choice; it sees only the
subplan actually used. This should mean less confusion for users.
* Since setrefs.c knows which subexpression of a plan node it's
working on at any instant, it's possible to adjust the estimated
number of executions of the subplan based on that. For example,
we should usually estimate more executions of a qual expression
than a targetlist expression. The implementation used here is
pretty simplistic, because we don't want to expend a lot of cycles
on the issue; but it's better than ignoring the point entirely,
as the executor had to.
That last point might possibly result in shifting the choice
between hashed and non-hashed EXISTS subplans in a few cases,
but in general this patch isn't meant to change planner choices.
Since we're doing the resolution so late, it's really impossible
to change any plan choices outside the AlternativeSubPlan itself.
Patch by me; thanks to David Rowley for review.
Discussion: https://postgr.es/m/1992952.1592785225@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nodeSubplan.c expects that the testexpr for a hashable ANY SubPlan
has the form of one or more OpExprs whose LHS is an expression of the
outer query's, while the RHS is an expression over Params representing
output columns of the subquery. However, the planner only went as far
as verifying that the clauses were all binary OpExprs. This works
99.99% of the time, because the clauses have the right shape when
emitted by the parser --- but it's possible for function inlining to
break that, as reported by PegoraroF10. To fix, teach the planner
to check that the LHS and RHS contain the right things, or more
accurately don't contain the wrong things. Given that this has been
broken for years without anyone noticing, it seems sufficient to just
give up hashing when it happens, rather than go to the trouble of
commuting the clauses back again (which wouldn't necessarily work
anyway).
While poking at that, I also noticed that nodeSubplan.c had a baked-in
assumption that the number of hash clauses is identical to the number
of subquery output columns. Again, that's fine as far as parser output
goes, but it's not hard to break it via function inlining. There seems
little reason for that assumption though --- AFAICS, the only thing
it's buying us is not having to store the number of hash clauses
explicitly. Adding code to the planner to reject such cases would take
more code than getting nodeSubplan.c to cope, so I fixed it that way.
This has been broken for as long as we've had hashable SubPlans,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/1549209182255-0.post@n3.nabble.com
|
|
|
|
|
|
|
|
| |
Refactor hash lookups in nodeAgg.c to improve performance.
Author: Andres Freund and Jeff Davis
Discussion: https://postgr.es/m/20200612213715.op4ye4q7gktqvpuo%40alap3.anarazel.de
Backpatch-through: 13
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 356687bd8 omitted to remove leftover code for destroying
a hashed subplan's hash tables, with the result that the tables
were always rebuilt not reused; this leads to severe memory
leakage if a hashed subplan is re-executed enough times.
Moreover, the code for reusing the hashnulls table had a typo
that would have made it do the wrong thing if it were reached.
Looking at the code coverage report shows severe under-coverage
of the potential callers of ResetTupleHashTable, so add some test
cases that exercise them.
Andreas Karlsson and Tom Lane, per reports from Ranier Vilela
and Justin Pryzby.
Backpatch to v11, as the faulty commit was.
Discussion: https://postgr.es/m/edb62547-c453-c35b-3ed6-a069e4d6b937@proxel.se
Discussion: https://postgr.es/m/CAEudQAo=DCebm1RXtig9OH+QivpS97sMkikt0A9qHmMUs+g6ZA@mail.gmail.com
Discussion: https://postgr.es/m/20200210032547.GA1412@telsasoft.com
|
|
|
|
| |
Backpatch-through: update all files in master, backpatch legal files through 9.4
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to commits 7e735035f2 and dddf4cdc33, this commit makes the order
of header file inclusion consistent for backend modules.
In the passing, removed a couple of duplicate inclusions.
Author: Vignesh C
Reviewed-by: Kuntal Ghosh and Amit Kapila
Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit bf6c614a2 rearranged the lookup of the comparison operators
needed in a hashed subplan, and in so doing, broke the cross-type
case: it caused the original LHS-vs-RHS operator to be used to compare
hash table entries too (which of course are all of the RHS type).
This leads to C functions being passed a Datum that is not of the
type they expect, with the usual hazards of crashes and unauthorized
server memory disclosure.
For the set of hashable cross-type operators present in v11 core
Postgres, this bug is nearly harmless on 64-bit machines, which
may explain why it escaped earlier detection. But it is a live
security hazard on 32-bit machines; and of course there may be
extensions that add more hashable cross-type operators, which
would increase the risk.
Reported by Andreas Seltenreich. Back-patch to v11 where the
problem came in.
Security: CVE-2019-10209
|
|
|
|
|
|
|
|
|
| |
Switch to 2.1 version of pg_bsd_indent. This formats
multiline function declarations "correctly", that is with
additional lines of parameter declarations indented to match
where the first line's left parenthesis is.
Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
|
|
|
|
|
|
|
|
| |
This is still using the 2.0 version of pg_bsd_indent.
I thought it would be good to commit this separately,
so as to document the differences between 2.0 and 2.1 behavior.
Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a flag "deterministic" to collations. If that is false,
such a collation disables various optimizations that assume that
strings are equal only if they are byte-wise equal. That then allows
use cases such as case-insensitive or accent-insensitive comparisons
or handling of strings with different Unicode normal forms.
This functionality is only supported with the ICU provider. At least
glibc doesn't appear to have any locales that work in a
nondeterministic way, so it's not worth supporting this for the libc
provider.
The term "deterministic comparison" in this context is from Unicode
Technical Standard #10
(https://unicode.org/reports/tr10/#Deterministic_Comparison).
This patch makes changes in three areas:
- CREATE COLLATION DDL changes and system catalog changes to support
this new flag.
- Many executor nodes and auxiliary code are extended to track
collations. Previously, this code would just throw away collation
information, because the eventually-called user-defined functions
didn't use it since they only cared about equality, which didn't
need collation information.
- String data type functions that do equality comparisons and hashing
are changed to take the (non-)deterministic flag into account. For
comparison, this just means skipping various shortcuts and tie
breakers that use byte-wise comparison. For hashing, we first need
to convert the input string to a canonical "sort key" using the ICU
analogue of strxfrm().
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This uses the facility added in the preceding commit to fix
performance issues caused by rebuilding the hashtable (with its
comparator expression being the most expensive bit), after every
reset. That's especially important when the comparator is JIT
compiled.
Bug: #15592 #15486
Reported-By: Jakub Janeček, Dmitry Marakasov
Author: Andres Freund
Discussion:
https://postgr.es/m/15486-05850f065da42931@postgresql.org
https://postgr.es/m/20190114180423.ywhdg2iagzvh43we@alap3.anarazel.de
Backpatch: 11, where I broke this in bf6c614a2f2c5
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move a few very simple node-creation and node-type-testing functions
from the planner's clauses.c to nodes/makefuncs and nodes/nodeFuncs.
There's nothing planner-specific about them, as evidenced by the
number of other places that were using them.
While at it, rename and_clause() etc to is_andclause() etc, to clarify
that they are node-type-testing functions not node-creation functions.
And use "static inline" implementations for the shortest ones.
Also, modify flatten_join_alias_vars() and some subsidiary functions
to take a Query not a PlannerInfo to define the join structure that
Vars should be translated according to. They were only using the
"parse" field of the PlannerInfo anyway, so this just requires removing
one level of indirection. The advantage is that now parse_agg.c can
use flatten_join_alias_vars() without the horrid kluge of creating an
incomplete PlannerInfo, which will allow that file to be decoupled from
relation.h in a subsequent patch.
Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
|
|
|
|
| |
Backpatch-through: certain files through 9.4
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.
This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row. Neither pg_dump nor COPY included the contents of the
oid column by default.
The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.
WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.
Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
issue a warning when dumping one (and ignore the oid column).
- restoring an pg_dump archive with pg_restore will warn when
restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.
The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.
The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such. This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.
The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.
Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).
The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.
While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get merge this
now. It's painful to maintain externally, too complicated to commit
after the code code freeze, and a dependency of a number of other
patches.
Catversion bump, for obvious reasons.
Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit completes the work prepared in 1a0586de36, splitting the
old TupleTableSlot implementation (which could store buffer, heap,
minimal and virtual slots) into four different slot types. As
described in the aforementioned commit, this is done with the goal of
making tuple table slots extensible, to allow for pluggable table
access methods.
To achieve runtime extensibility for TupleTableSlots, operations on
slots that can differ between types of slots are performed using the
TupleTableSlotOps struct provided at slot creation time. That
includes information from the size of TupleTableSlot struct to be
allocated, initialization, deforming etc. See the struct's definition
for more detailed information about callbacks TupleTableSlotOps.
I decided to rename TTSOpsBufferTuple to TTSOpsBufferHeapTuple and
ExecCopySlotTuple to ExecCopySlotHeapTuple, as that seems more
consistent with other naming introduced in recent patches.
There's plenty optimization potential in the slot implementation, but
according to benchmarking the state after this commit has similar
performance characteristics to before this set of changes, which seems
sufficient.
There's a few changes in execReplication.c that currently need to poke
through the slot abstraction, that'll be repaired once the pluggable
storage patchset provides the necessary infrastructure.
Author: Andres Freund and Ashutosh Bapat, with changes by Amit Khandekar
Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upcoming work intends to allow pluggable ways to introduce new ways of
storing table data. Accessing those table access methods from the
executor requires TupleTableSlots to be carry tuples in the native
format of such storage methods; otherwise there'll be a significant
conversion overhead.
Different access methods will require different data to store tuples
efficiently (just like virtual, minimal, heap already require fields
in TupleTableSlot). To allow that without requiring additional pointer
indirections, we want to have different structs (embedding
TupleTableSlot) for different types of slots. Thus different types of
slots are needed, which requires adapting creators of slots.
The slot that most efficiently can represent a type of tuple in an
executor node will often depend on the type of slot a child node
uses. Therefore we need to track the type of slot is returned by
nodes, so parent slots can create slots based on that.
Relatedly, JIT compilation of tuple deforming needs to know which type
of slot a certain expression refers to, so it can create an
appropriate deforming function for the type of tuple in the slot.
But not all nodes will only return one type of slot, e.g. an append
node will potentially return different types of slots for each of its
subplans.
Therefore add function that allows to query the type of a node's
result slot, and whether it'll always be the same type (whether it's
fixed). This can be queried using ExecGetResultSlotOps().
The scan, result, inner, outer type of slots are automatically
inferred from ExecInitScanTupleSlot(), ExecInitResultSlot(),
left/right subtrees respectively. If that's not correct for a node,
that can be overwritten using new fields in PlanState.
This commit does not introduce the actually abstracted implementation
of different kind of TupleTableSlots, that will be left for a followup
commit. The different types of slots introduced will, for now, still
use the same backing implementation.
While this already partially invalidates the big comment in
tuptable.h, it seems to make more sense to update it later, when the
different TupleTableSlot implementations actually exist.
Author: Ashutosh Bapat and Andres Freund, with changes by Amit Khandekar
Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The EvalPlanQual machinery assumes that any initplans (that is,
uncorrelated sub-selects) used during an EPQ recheck would have already
been evaluated during the main query; this is implicit in the fact that
execPlan pointers are not copied into the EPQ estate's es_param_exec_vals.
But it's possible for that assumption to fail, if the initplan is only
reached conditionally. For example, a sub-select inside a CASE expression
could be reached during a recheck when it had not been previously, if the
CASE test depends on a column that was just updated.
This bug is old, appearing to date back to my rewrite of EvalPlanQual in
commit 9f2ee8f28, but was not detected until Kyle Samson reported a case.
To fix, force all not-yet-evaluated initplans used within the EPQ plan
subtree to be evaluated at the start of the recheck, before entering the
EPQ environment. This could be inefficient, if such an initplan is
expensive and goes unused again during the recheck --- but that's piling
one layer of improbability atop another. It doesn't seem worth adding
more complexity to prevent that, at least not in the back branches.
It was convenient to use the new-in-v11 ExecEvalParamExecParams function
to implement this, but I didn't like either its name or the specifics of
its API, so revise that.
Back-patch all the way. Rather than rewrite the patch to avoid depending
on bms_next_member() in the oldest branches, I chose to back-patch that
function into 9.4 and 9.3. (This isn't the first time back-patches have
needed that, and it exhausted my patience.) I also chose to back-patch
some test cases added by commits 71404af2a and 342a1ffa2 into 9.4 and 9.3,
so that the 9.x versions of eval-plan-qual.spec are all the same.
Andrew Gierth diagnosed the problem and contributed the added test cases,
though the actual code changes are by me.
Discussion: https://postgr.es/m/A033A40A-B234-4324-BE37-272279F7B627@tripadvisor.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When executing a SubPlan in an expression, the EState's direction
field was left alone, resulting in an attempt to execute the subplan
backwards if it was encountered during a backwards scan of a cursor.
Also, though much less likely, it was possible to reach the execution
of an InitPlan while in backwards-scan state.
Repair by saving/restoring estate->es_direction and forcing forward
scan mode in the relevant places.
Backpatch all the way, since this has been broken since 8.3 (prior to
commit c7ff7663e, SubPlans had their own EStates rather than sharing
the parent plan's, so there was no confusion over scan direction).
Per bug #15336 reported by Vladimir Baranoff; analysis and patch by
me, review by Tom Lane.
Discussion: https://postgr.es/m/153449812167.1304.1741624125628126322@wrigleys.postgresql.org
|
|
|
|
| |
Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The reason for doing so is that it will allow expression evaluation to
optimize based on the underlying tupledesc. In particular it will
allow to JIT tuple deforming together with the expression itself.
For that expression initialization needs to be moved after the
relevant slots are initialized - mostly unproblematic, except in the
case of nodeWorktablescan.c.
After doing so there's no need for ExecAssignResultType() and
ExecAssignResultTypeFromTL() anymore, as all former callers have been
converted to create a slot with a fixed descriptor.
When creating a slot with a fixed descriptor, tts_values/isnull can be
allocated together with the main slot, reducing allocation overhead
and increasing cache density a bit.
Author: Andres Freund
Discussion: https://postgr.es/m/20171206093717.vqdxe5icqttpxs3p@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This has a performance benefit on own, although not hugely so. The
primary benefit is that it will allow for to JIT tuple deforming and
comparator invocations.
Large parts of this were previously committed (773aec7aa), but the
commit contained an omission around cross-type comparisons and was
thus reverted.
Author: Andres Freund
Discussion: https://postgr.es/m/20171129080934.amqqkke2zjtekd4t@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 773aec7aa98abd38d6d9435913bb8e14e392c274.
There's an unresolved issue in the reverted commit: It only creates
one comparator function, but in for the nodeSubplan.c case we need
more (c.f. FindTupleHashEntry vs LookupTupleHashEntry calls in
nodeSubplan.c).
This isn't too difficult to fix, but it's not entirely trivial
either. The fact that the issue only causes breakage on 32bit systems
shows that the current test coverage isn't that great. To avoid
turning half the buildfarm red till those two issues are addressed,
revert.
|
|
|
|
|
|
|
|
|
| |
This has a performance benefit on own, although not hugely so. The
primary benefit is that it will allow for to JIT tuple deforming and
comparator invocations.
Author: Andres Freund
Discussion: https://postgr.es/m/20171129080934.amqqkke2zjtekd4t@alap3.anarazel.de
|
|
|
|
| |
Backpatch-through: certain files through 9.3
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The lower case spellings are C and C++ standard and are used in most
parts of the PostgreSQL sources. The upper case spellings are only used
in some files/modules. So standardize on the standard spellings.
The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so
those are left as is when using those APIs.
In code comments, we use the lower-case spelling for the C concepts and
keep the upper-case spelling for the SQL concepts.
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This is a mechanical change in preparation for a later commit that
will change the layout of TupleDesc. Introducing a macro to abstract
the details of where attributes are stored will allow us to change
that in separate step and revise it in future.
Author: Thomas Munro, editorialized by Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a followup commit ExecProcNode(), and especially the large switch
it contains, will largely be replaced by a function pointer directly
to the correct node. The node functions will then get invoked by a
thin inline function wrapper. To avoid having to include miscadmin.h
in headers - CHECK_FOR_INTERRUPTS() - move the interrupt checks into
the individual executor routines.
While looking through all executor nodes, I noticed a number of
arguably missing interrupt checks, add these too.
Author: Andres Freund, Tom Lane
Reviewed-By: Tom Lane
Discussion:
https://postgr.es/m/22833.1490390175@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.
This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.
The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
out operation metadata sequentially; including the avoidance of
nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
constant re-checks at evaluation time
Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.
The new framework allows for significant future optimizations. E.g.:
- basic infrastructure for to later reduce the per executor-startup
overhead of expression evaluation, by caching state in prepared
statements. That'd be helpful in OLTPish scenarios where
initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
work has already been made.
- optimizing the interpreter. Similarly a number of proposals have
been made here too.
The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
initialization, whereas previously they were done during
execution. In edge cases this can lead to errors being raised that
previously wouldn't have been, e.g. a NULL array being coerced to a
different array type previously didn't perform checks.
- The set of domain constraints to be checked, is now evaluated once
during expression initialization, previously it was re-built
every time a domain check was evaluated. For normal queries this
doesn't change much, but e.g. for plpgsql functions, which caches
ExprStates, the old set could stick around longer. The behavior
around might still change.
Author: Andres Freund, with significant changes by Tom Lane,
changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
|
|
|
|
|
|
|
|
|
| |
This is far from a pervasive conversion, but it's a good starting
point.
Author: Peter Eisentraut, with some minor changes by me
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/c5d387d9-3440-f5e0-f9d4-71d53b9fbe52@2ndquadrant.com
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 69f4b9c plain expression evaluation (and thus normal projection)
can't return sets of tuples anymore. Thus remove code dealing with
that possibility.
This will require adjustments in external code using
ExecEvalExpr()/ExecProject() - that should neither be hard nor very
common.
Author: Andres Freund and Tom Lane
Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 5dfc198146b49ce7ecc8a1fc9d5e171fb75f6ba5 introduced the use
of a new type of hash table with linear reprobing for hash aggregates.
Such a hash table behaves very poorly if keys are inserted in hash
order, which does in fact happen in the case where a query use a
Finalize HashAggregate node fed (via Gather) by a Partial
HashAggregate node. In fact, queries with this type of plan tend
to run effectively forever.
Fix that by seeding the hash value differently in each worker
(and in the leader, if it participates).
Andres Freund and Robert Haas
|