aboutsummaryrefslogtreecommitdiff
path: root/src/backend/optimizer/path/joinpath.c
Commit message (Collapse)AuthorAge
...
* Previous fix for "x FULL JOIN y ON true" failed to handle the caseTom Lane2005-05-24
| | | | | | | where there was also a WHERE-clause restriction that applied to the join. The check on restrictlist == NIL is really unnecessary anyway, because select_mergejoin_clauses already checked for and complained about any unmergejoinable join clauses. So just take it out.
* Create executor and planner-backend support for decoupled heap and indexTom Lane2005-04-19
| | | | | | | | | scans, using in-memory tuple ID bitmaps as the intermediary. The planner frontend (path creation and cost estimation) is not there yet, so none of this code can be executed. I have tested it using some hacked planner code that is far too ugly to see the light of day, however. Committing now so that the bulk of the infrastructure changes go in before the tree drifts under me.
* The result of a FULL or RIGHT join can't be assumed to be sorted by theTom Lane2005-01-23
| | | | | left input's sorting, because null rows may be inserted at various points. Per report from Ferenc Lutischá¸n.
* Tag appropriate files for rc3PostgreSQL Daemon2004-12-31
| | | | | | | | Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...
* Pgindent run for 8.0.Bruce Momjian2004-08-29
|
* Update copyright to 2004.Bruce Momjian2004-08-29
|
* Use the new List API function names throughout the backend, and disable theNeil Conway2004-05-30
| | | | | list compatibility API by default. While doing this, I decided to keep the llast() macro around and introduce llast_int() and llast_oid() variants.
* Reimplement the linked list data structure used throughout the backend.Neil Conway2004-05-26
| | | | | | | | | | | | | | | | In the past, we used a 'Lispy' linked list implementation: a "list" was merely a pointer to the head node of the list. The problem with that design is that it makes lappend() and length() linear time. This patch fixes that problem (and others) by maintaining a count of the list length and a pointer to the tail node along with each head node pointer. A "list" is now a pointer to a structure containing some meta-data about the list; the head and tail pointers in that structure refer to ListCell structures that maintain the actual linked list of nodes. The function names of the list API have also been changed to, I hope, be more logically consistent. By default, the old function names are still available; they will be disabled-by-default once the rest of the tree has been updated to use the new API names.
* Support FULL JOIN with no join clauses, such as X FULL JOIN Y ON TRUE.Tom Lane2004-04-06
| | | | | | | That particular corner case is not exactly compelling, but given 7.4's ability to discard redundant join clauses, it is possible for the situation to arise from queries that are not so obviously silly. Per bug report of 6-Apr-04.
* Add the ability to extract OR indexscan conditions from OR-of-ANDTom Lane2004-01-05
| | | | | | | join conditions in which each OR subclause includes a constraint on the same relation. This implements the other useful side-effect of conversion to CNF format, without its unpleasant side-effects. As per pghackers discussion of a few weeks ago.
* Adjust the definition of RestrictInfo's left_relids and right_relidsTom Lane2003-12-30
| | | | | | | | | | fields: now they are valid whenever the clause is a binary opclause, not only when it is a potential join clause (there is a new boolean field canjoin to signal the latter condition). This lets us avoid recomputing the relid sets over and over while examining indexes. Still more work to do to make this as useful as it could be, because there are places that could use the info but don't have access to the RestrictInfo node.
* $Header: -> $PostgreSQL Changes ...PostgreSQL Daemon2003-11-29
|
* Message editing: remove gratuitous variations in message wording, standardizePeter Eisentraut2003-09-25
| | | | | terms, add some clarifications, fix some untranslatable attempts at dynamic message building.
* Update copyrights to 2003.Bruce Momjian2003-08-04
|
* pgindent run.Bruce Momjian2003-08-04
|
* Error message editing in backend/optimizer, backend/rewrite.Tom Lane2003-07-25
|
* Replace planner's representation of relation sets, per pghackers discussion.Tom Lane2003-02-08
| | | | | Instead of Lists of integers, we now store variable-length bitmap sets. This should be faster as well as less error-prone.
* Upgrade cost estimation for joins, per discussion with Bradley Baetz.Tom Lane2003-01-27
| | | | | | | Try to model the effect of rescanning input tuples in mergejoins; account for JOIN_IN short-circuiting where appropriate. Also, recognize that mergejoin and hashjoin clauses may now be more than single operator calls, so we have to charge appropriate execution costs.
* IN clauses appearing at top level of WHERE can now be handled as joins.Tom Lane2003-01-20
| | | | | | | | | | There are two implementation techniques: the executor understands a new JOIN_IN jointype, which emits at most one matching row per left-hand row, or the result of the IN's sub-select can be fed through a DISTINCT filter and then joined as an ordinary relation. Along the way, some minor code cleanup in the optimizer; notably, break out most of the jointree-rearrangement preprocessing in planner.c and put it in a new file prep/prepjointree.c.
* Allow merge and hash joins to occur on arbitrary expressions (anything notTom Lane2003-01-15
| | | | | | | | | | | | | | | | containing a volatile function), rather than only on 'Var = Var' clauses as before. This makes it practical to do flatten_join_alias_vars at the start of planning, which in turn eliminates a bunch of klugery inside the planner to deal with alias vars. As a free side effect, we now detect implied equality of non-Var expressions; for example in SELECT ... WHERE a.x = b.y and b.y = 42 we will deduce a.x = 42 and use that as a restriction qual on a. Also, we can remove the restriction introduced 12/5/02 to prevent pullup of subqueries whose targetlists contain sublinks. Still TODO: make statistical estimation routines in selfuncs.c and costsize.c smarter about expressions that are more complex than plain Vars. The need for this is considerably greater now that we have to be able to estimate the suitability of merge and hash join techniques on such expressions.
* Be more realistic about plans involving Materialize nodes: take theirTom Lane2002-11-30
| | | | cost into account while planning.
* Upgrade planner and executor to allow multiple hash keys for a hash join,Tom Lane2002-11-30
| | | | | | instead of only one. This should speed up planning (only one hash path to consider for a given pair of relations) as well as allow more effective hashing, when there are multiple hashable joinclauses.
* Restructure planning of nestloop inner indexscans so that the set of usableTom Lane2002-11-24
| | | | | | | | | | | joinclauses is determined accurately for each join. Formerly, the code only considered joinclauses that used all of the rels from the outer side of the join; thus for example FROM (a CROSS JOIN b) JOIN c ON (c.f1 = a.x AND c.f2 = b.y) could not exploit a two-column index on c(f1,f2), since neither of the qual clauses would be in the joininfo list it looked in. The new code does this correctly, and also is able to eliminate redundant clauses, thus fixing the problem noted 24-Oct-02 by Hans-Jürgen Schönig.
* pgindent run.Bruce Momjian2002-09-04
|
* Remove sys/types.h in files that include postgres.h, and hence c.h,Bruce Momjian2002-09-02
| | | | because c.h has sys/types.h.
* Update copyright to 2002.Bruce Momjian2002-06-20
|
* Restructure representation of join alias variables. An explicit JOINTom Lane2002-03-12
| | | | | | | | | | | | | | | now has an RTE of its own, and references to its outputs now are Vars referencing the JOIN RTE, rather than CASE-expressions. This allows reverse-listing in ruleutils.c to use the correct alias easily, rather than painfully reverse-engineering the alias namespace as it used to do. Also, nested FULL JOINs work correctly, because the result of the inner joins are simple Vars that the planner can cope with. This fixes a bug reported a couple times now, notably by Tatsuo on 18-Nov-01. The alias Vars are expanded into COALESCE expressions where needed at the very end of planning, rather than during parsing. Also, beginnings of support for showing plan qualifier expressions in EXPLAIN. There are probably still cases that need work. initdb forced due to change of stored-rule representation.
* sort_inner_and_outer needs a check to ensure that it's consumed all theTom Lane2001-11-11
| | | | | | mergeclauses in RIGHT/FULL join cases, just like the other routines have. I'm not quite sure why I thought it didn't need one --- but Nick Fankhauser's recent bug report proves that it does.
* pgindent run on all C files. Java run to follow. initdb/regressionBruce Momjian2001-10-25
| | | | tests pass.
* Further work on making use of new statistics in planner. Adjust APIsTom Lane2001-06-05
| | | | | | | | | of costsize.c routines to pass Query root, so that costsize can figure more things out by itself and not be so dependent on its callers to tell it everything it needs to know. Use selectivity of hash or merge clause to estimate number of tuples processed internally in these joins (this is more useful than it would've been before, since eqjoinsel is somewhat more accurate than before).
* Rewrite of planner statistics-gathering code. ANALYZE is now available asTom Lane2001-05-07
| | | | | | | | | | | | | | | | | a separate statement (though it can still be invoked as part of VACUUM, too). pg_statistic redesigned to be more flexible about what statistics are stored. ANALYZE now collects a list of several of the most common values, not just one, plus a histogram (not just the min and max values). Random sampling is used to make the process reasonably fast even on very large tables. The number of values and histogram bins collected is now user-settable via an ALTER TABLE command. There is more still to do; the new stats are not being used everywhere they could be in the planner. But the remaining changes for this project should be localized, and the behavior is already better than before. A not-very-related change is that sorting now makes use of btree comparison routines if it can find one, rather than invoking '<' twice.
* Prevent generation of invalid plans for RIGHT or FULL joins with multipleTom Lane2001-04-15
| | | | | | | | join clauses. The mergejoin executor wants all the join clauses to appear as merge quals, not as extra joinquals, for these kinds of joins. But the planner would consider plans in which partially-sorted input paths were used, leading to only some of the join clauses becoming merge quals. This is fine for inner/left joins, not fine for right/full joins.
* pgindent run. Make it all clean.Bruce Momjian2001-03-22
|
* Change Copyright from PostgreSQL, Inc to PostgreSQL Global Development Group.Bruce Momjian2001-01-24
|
* Planner speedup hacking. Avoid saving useless pathkeys, so that pathTom Lane2000-12-14
| | | | | | | | | | | | comparison does not consider paths different when they differ only in uninteresting aspects of sort order. (We had a special case of this consideration for indexscans already, but generalize it to apply to ordered join paths too.) Be stricter about what is a canonical pathkey to allow faster pathkey comparison. Cache canonical pathkeys and dispersion stats for left and right sides of a RestrictInfo's clause, to avoid repeated computation. Total speedup will depend on number of tables in a query, but I see about 4x speedup of planning phase for a sample seven-table query.
* Ensure that mergejoin plan will be considered for FULL OUTER JOIN evenTom Lane2000-11-23
| | | | | if enable_mergejoin = OFF. Must do this, because we have no other implementation method for full joins.
* Add proofreader's changes to docs.Bruce Momjian2000-10-05
| | | | Fix misspelling of disbursion to dispersion.
* Subselects in FROM clause, per ISO syntax: FROM (SELECT ...) [AS] alias.Tom Lane2000-09-29
| | | | | | | | | (Don't forget that an alias is required.) Views reimplemented as expanding to subselect-in-FROM. Grouping, aggregates, DISTINCT in views actually work now (he says optimistically). No UNION support in subselects/views yet, but I have some ideas about that. Rule-related permissions checking moved out of rewriter and into executor. INITDB REQUIRED!
* First cut at full support for OUTER JOINs. There are still a few looseTom Lane2000-09-12
| | | | | ends to clean up (see my message of same date to pghackers), but mostly it works. INITDB REQUIRED!
* Remove unused include files. Do not touch /port or includes used by defines.Bruce Momjian2000-05-30
|
* Ye-old pgindent run. Same 4-space tabs.Bruce Momjian2000-04-12
|
* Plug some more memory leaks in the planner. It still leaks like a sieve,Tom Lane2000-02-18
| | | | but this is as good as it'll get for this release...
* New cost model for planning, incorporating a penalty for random pageTom Lane2000-02-15
| | | | | | | | | | | | | | | | | | | | | | | | | accesses versus sequential accesses, a (very crude) estimate of the effects of caching on random page accesses, and cost to evaluate WHERE- clause expressions. Export critical parameters for this model as SET variables. Also, create SET variables for the planner's enable flags (enable_seqscan, enable_indexscan, etc) so that these can be controlled more conveniently than via PGOPTIONS. Planner now estimates both startup cost (cost before retrieving first tuple) and total cost of each path, so it can optimize queries with LIMIT on a reasonable basis by interpolating between these costs. Same facility is a win for EXISTS(...) subqueries and some other cases. Redesign pathkey representation to achieve a major speedup in planning (I saw as much as 5X on a 10-way join); also minor changes in planner to reduce memory consumption by recycling discarded Path nodes and not constructing unnecessary lists. Minor cleanups to display more-plausible costs in some cases in EXPLAIN output. Initdb forced by change in interface to index cost estimation functions.
* Repair planning bugs caused by my misguided removal of restrictinfo linkTom Lane2000-02-07
| | | | | | | | | | | fields in JoinPaths --- turns out that we do need that after all :-(. Also, rearrange planner so that only one RelOptInfo is created for a particular set of joined base relations, no matter how many different subsets of relations it can be created from. This saves memory and processing time compared to the old method of making a bunch of RelOptInfos and then removing the duplicates. Clean up the jointree iteration logic; not sure if it's better, but I sure find it more readable and plausible now, particularly for the case of 'bushy plans'.
* Rename same() to sameseti() to have a slightly less generic name. MoveTom Lane2000-02-06
| | | | | | nonoverlap_sets() and is_subset() to list.c, where they should have lived to begin with, and rename to nonoverlap_setsi and is_subseti since they only work on integer lists.
* Add:Bruce Momjian2000-01-26
| | | | | | * Portions Copyright (c) 1996-2000, PostgreSQL, Inc to all files copyright Regents of Berkeley. Man, that's a lot of files.
* Revise handling of index-type-specific indexscan cost estimation, perTom Lane2000-01-22
| | | | | | pghackers discussion of 5-Jan-2000. The amopselect and amopnpages estimators are gone, and in their place is a per-AM amcostestimate procedure (linked to from pg_am, not pg_amop).
* Another round of planner/optimizer work. This is just restructuring andTom Lane2000-01-09
| | | | | code cleanup; no major improvements yet. However, EXPLAIN does produce more intuitive outputs for nested loops with indexscans now...
* Major revision of sort-node handling: push knowledge of queryTom Lane1999-08-21
| | | | | | | | | | | | | sort order down into planner, instead of handling it only at the very top level of the planner. This fixes many things. An explicit sort is now avoided if there is a cheaper alternative (typically an indexscan) not only for ORDER BY, but also for the internal sort of GROUP BY. It works even when there is no other reason (such as a WHERE condition) to consider the indexscan. It works for indexes on functions. It works for indexes on functions, backwards. It's just so cool... CAUTION: I have changed the representation of SortClause nodes, therefore THIS UPDATE BREAKS STORED RULES. You will need to initdb.
* Major planner/optimizer revision: get rid of PathOrder node type,Tom Lane1999-08-16
| | | | | | | | | store all ordering information in pathkeys lists (which are now lists of lists of PathKeyItem nodes, not just lists of lists of vars). This was a big win --- the code is smaller and IMHO more understandable than it was, even though it handles more cases. I believe the node changes will not force an initdb for anyone; planner nodes don't show up in stored rules.