| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
|
|
| |
where there was also a WHERE-clause restriction that applied to the
join. The check on restrictlist == NIL is really unnecessary anyway,
because select_mergejoin_clauses already checked for and complained
about any unmergejoinable join clauses. So just take it out.
|
|
|
|
|
|
|
|
|
| |
scans, using in-memory tuple ID bitmaps as the intermediary. The planner
frontend (path creation and cost estimation) is not there yet, so none
of this code can be executed. I have tested it using some hacked planner
code that is far too ugly to see the light of day, however. Committing
now so that the bulk of the infrastructure changes go in before the tree
drifts under me.
|
|
|
|
|
| |
left input's sorting, because null rows may be inserted at various points.
Per report from Ferenc Lutischá¸n.
|
|
|
|
|
|
|
|
| |
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
|
| |
|
| |
|
|
|
|
|
| |
list compatibility API by default. While doing this, I decided to keep
the llast() macro around and introduce llast_int() and llast_oid() variants.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.
The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled-by-default once the rest of
the tree has been updated to use the new API names.
|
|
|
|
|
|
|
| |
That particular corner case is not exactly compelling, but given 7.4's
ability to discard redundant join clauses, it is possible for the situation
to arise from queries that are not so obviously silly. Per bug report
of 6-Apr-04.
|
|
|
|
|
|
|
| |
join conditions in which each OR subclause includes a constraint on
the same relation. This implements the other useful side-effect of
conversion to CNF format, without its unpleasant side-effects. As
per pghackers discussion of a few weeks ago.
|
|
|
|
|
|
|
|
|
|
| |
fields: now they are valid whenever the clause is a binary opclause,
not only when it is a potential join clause (there is a new boolean
field canjoin to signal the latter condition). This lets us avoid
recomputing the relid sets over and over while examining indexes.
Still more work to do to make this as useful as it could be, because
there are places that could use the info but don't have access to the
RestrictInfo node.
|
| |
|
|
|
|
|
| |
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Instead of Lists of integers, we now store variable-length bitmap sets.
This should be faster as well as less error-prone.
|
|
|
|
|
|
|
| |
Try to model the effect of rescanning input tuples in mergejoins;
account for JOIN_IN short-circuiting where appropriate. Also, recognize
that mergejoin and hashjoin clauses may now be more than single operator
calls, so we have to charge appropriate execution costs.
|
|
|
|
|
|
|
|
|
|
| |
There are two implementation techniques: the executor understands a new
JOIN_IN jointype, which emits at most one matching row per left-hand row,
or the result of the IN's sub-select can be fed through a DISTINCT filter
and then joined as an ordinary relation.
Along the way, some minor code cleanup in the optimizer; notably, break
out most of the jointree-rearrangement preprocessing in planner.c and
put it in a new file prep/prepjointree.c.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
containing a volatile function), rather than only on 'Var = Var' clauses
as before. This makes it practical to do flatten_join_alias_vars at the
start of planning, which in turn eliminates a bunch of klugery inside the
planner to deal with alias vars. As a free side effect, we now detect
implied equality of non-Var expressions; for example in
SELECT ... WHERE a.x = b.y and b.y = 42
we will deduce a.x = 42 and use that as a restriction qual on a. Also,
we can remove the restriction introduced 12/5/02 to prevent pullup of
subqueries whose targetlists contain sublinks.
Still TODO: make statistical estimation routines in selfuncs.c and costsize.c
smarter about expressions that are more complex than plain Vars. The need
for this is considerably greater now that we have to be able to estimate
the suitability of merge and hash join techniques on such expressions.
|
|
|
|
| |
cost into account while planning.
|
|
|
|
|
|
| |
instead of only one. This should speed up planning (only one hash path
to consider for a given pair of relations) as well as allow more effective
hashing, when there are multiple hashable joinclauses.
|
|
|
|
|
|
|
|
|
|
|
| |
joinclauses is determined accurately for each join. Formerly, the code only
considered joinclauses that used all of the rels from the outer side of the
join; thus for example
FROM (a CROSS JOIN b) JOIN c ON (c.f1 = a.x AND c.f2 = b.y)
could not exploit a two-column index on c(f1,f2), since neither of the
qual clauses would be in the joininfo list it looked in. The new code does
this correctly, and also is able to eliminate redundant clauses, thus fixing
the problem noted 24-Oct-02 by Hans-Jürgen Schönig.
|
| |
|
|
|
|
| |
because c.h has sys/types.h.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
now has an RTE of its own, and references to its outputs now are Vars
referencing the JOIN RTE, rather than CASE-expressions. This allows
reverse-listing in ruleutils.c to use the correct alias easily, rather
than painfully reverse-engineering the alias namespace as it used to do.
Also, nested FULL JOINs work correctly, because the result of the inner
joins are simple Vars that the planner can cope with. This fixes a bug
reported a couple times now, notably by Tatsuo on 18-Nov-01. The alias
Vars are expanded into COALESCE expressions where needed at the very end
of planning, rather than during parsing.
Also, beginnings of support for showing plan qualifier expressions in
EXPLAIN. There are probably still cases that need work.
initdb forced due to change of stored-rule representation.
|
|
|
|
|
|
| |
mergeclauses in RIGHT/FULL join cases, just like the other routines have.
I'm not quite sure why I thought it didn't need one --- but Nick
Fankhauser's recent bug report proves that it does.
|
|
|
|
| |
tests pass.
|
|
|
|
|
|
|
|
|
| |
of costsize.c routines to pass Query root, so that costsize can figure
more things out by itself and not be so dependent on its callers to tell
it everything it needs to know. Use selectivity of hash or merge clause
to estimate number of tuples processed internally in these joins
(this is more useful than it would've been before, since eqjoinsel is
somewhat more accurate than before).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a separate statement (though it can still be invoked as part of VACUUM, too).
pg_statistic redesigned to be more flexible about what statistics are
stored. ANALYZE now collects a list of several of the most common values,
not just one, plus a histogram (not just the min and max values). Random
sampling is used to make the process reasonably fast even on very large
tables. The number of values and histogram bins collected is now
user-settable via an ALTER TABLE command.
There is more still to do; the new stats are not being used everywhere
they could be in the planner. But the remaining changes for this project
should be localized, and the behavior is already better than before.
A not-very-related change is that sorting now makes use of btree comparison
routines if it can find one, rather than invoking '<' twice.
|
|
|
|
|
|
|
|
| |
join clauses. The mergejoin executor wants all the join clauses to appear
as merge quals, not as extra joinquals, for these kinds of joins. But the
planner would consider plans in which partially-sorted input paths were
used, leading to only some of the join clauses becoming merge quals.
This is fine for inner/left joins, not fine for right/full joins.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
comparison does not consider paths different when they differ only in
uninteresting aspects of sort order. (We had a special case of this
consideration for indexscans already, but generalize it to apply to
ordered join paths too.) Be stricter about what is a canonical pathkey
to allow faster pathkey comparison. Cache canonical pathkeys and
dispersion stats for left and right sides of a RestrictInfo's clause,
to avoid repeated computation. Total speedup will depend on number of
tables in a query, but I see about 4x speedup of planning phase for
a sample seven-table query.
|
|
|
|
|
| |
if enable_mergejoin = OFF. Must do this, because we have no other
implementation method for full joins.
|
|
|
|
| |
Fix misspelling of disbursion to dispersion.
|
|
|
|
|
|
|
|
|
| |
(Don't forget that an alias is required.) Views reimplemented as expanding
to subselect-in-FROM. Grouping, aggregates, DISTINCT in views actually
work now (he says optimistically). No UNION support in subselects/views
yet, but I have some ideas about that. Rule-related permissions checking
moved out of rewriter and into executor.
INITDB REQUIRED!
|
|
|
|
|
| |
ends to clean up (see my message of same date to pghackers), but mostly
it works. INITDB REQUIRED!
|
| |
|
| |
|
|
|
|
| |
but this is as good as it'll get for this release...
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
accesses versus sequential accesses, a (very crude) estimate of the
effects of caching on random page accesses, and cost to evaluate WHERE-
clause expressions. Export critical parameters for this model as SET
variables. Also, create SET variables for the planner's enable flags
(enable_seqscan, enable_indexscan, etc) so that these can be controlled
more conveniently than via PGOPTIONS.
Planner now estimates both startup cost (cost before retrieving
first tuple) and total cost of each path, so it can optimize queries
with LIMIT on a reasonable basis by interpolating between these costs.
Same facility is a win for EXISTS(...) subqueries and some other cases.
Redesign pathkey representation to achieve a major speedup in planning
(I saw as much as 5X on a 10-way join); also minor changes in planner
to reduce memory consumption by recycling discarded Path nodes and
not constructing unnecessary lists.
Minor cleanups to display more-plausible costs in some cases in
EXPLAIN output.
Initdb forced by change in interface to index cost estimation
functions.
|
|
|
|
|
|
|
|
|
|
|
| |
fields in JoinPaths --- turns out that we do need that after all :-(.
Also, rearrange planner so that only one RelOptInfo is created for a
particular set of joined base relations, no matter how many different
subsets of relations it can be created from. This saves memory and
processing time compared to the old method of making a bunch of RelOptInfos
and then removing the duplicates. Clean up the jointree iteration logic;
not sure if it's better, but I sure find it more readable and plausible
now, particularly for the case of 'bushy plans'.
|
|
|
|
|
|
| |
nonoverlap_sets() and is_subset() to list.c, where they should have lived
to begin with, and rename to nonoverlap_setsi and is_subseti since they
only work on integer lists.
|
|
|
|
|
|
| |
* Portions Copyright (c) 1996-2000, PostgreSQL, Inc
to all files copyright Regents of Berkeley. Man, that's a lot of files.
|
|
|
|
|
|
| |
pghackers discussion of 5-Jan-2000. The amopselect and amopnpages
estimators are gone, and in their place is a per-AM amcostestimate
procedure (linked to from pg_am, not pg_amop).
|
|
|
|
|
| |
code cleanup; no major improvements yet. However, EXPLAIN does produce
more intuitive outputs for nested loops with indexscans now...
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sort order down into planner, instead of handling it only at the very top
level of the planner. This fixes many things. An explicit sort is now
avoided if there is a cheaper alternative (typically an indexscan) not
only for ORDER BY, but also for the internal sort of GROUP BY. It works
even when there is no other reason (such as a WHERE condition) to consider
the indexscan. It works for indexes on functions. It works for indexes
on functions, backwards. It's just so cool...
CAUTION: I have changed the representation of SortClause nodes, therefore
THIS UPDATE BREAKS STORED RULES. You will need to initdb.
|
|
|
|
|
|
|
|
|
| |
store all ordering information in pathkeys lists (which are now lists of
lists of PathKeyItem nodes, not just lists of lists of vars). This was
a big win --- the code is smaller and IMHO more understandable than it
was, even though it handles more cases. I believe the node changes will
not force an initdb for anyone; planner nodes don't show up in stored
rules.
|