aboutsummaryrefslogtreecommitdiff
path: root/src
diff options
context:
space:
mode:
Diffstat (limited to 'src')
-rw-r--r--src/backend/optimizer/README124
1 files changed, 64 insertions, 60 deletions
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 65c95884201..e7aa29295f6 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -3,25 +3,24 @@ Summary
The optimizer generates optimial query plans by doing several steps:
-1) Take each relation in a query, and make a RelOptInfo structure for it.
-Find each way of accessing the relation, called a Path, including
-sequential and index scans, and add it to the RelOptInfo.path_order
-list.
+1) Take each relation in a query, and make a RelOptInfo structure for
+it. Find each way of accessing the relation, called a Path, including
+sequential and index scans, and add it to RelOptInfo.pathlist.
2) Join each RelOptInfo to each other RelOptInfo as specified in the
WHERE clause. At this point each RelOptInfo is a single relation, so
-you are joining every relation to every relation it is joined to in the
-WHERE clause.
+you are joining every relation to every relation as joined in the WHERE
+clause.
Joins occur using two RelOptInfos. One is outer, the other inner.
Outers drive lookups of values in the inner. In a nested loop, lookups
of values in the inner occur by scanning to find each matching inner
-row. In a mergejoin, inner rows are ordered, and are accessed in order,
-so only one scan of inner is required to perform the entire join. In a
-hashjoin, inner rows are hashed for lookups.
+row. In a mergejoin, inner and outer rows are ordered, and are accessed
+in order, so only one scan of inner is required to perform the entire
+join. In a hashjoin, inner rows are hashed for lookups.
Each unique join combination becomes a new RelOptInfo. The RelOptInfo
-is now the joining of two relations. RelOptInfo.path_order are various
+is now the joining of two relations. RelOptInfo.pathlist are various
paths to create the joined result, having different orderings depending
on the join method used.
@@ -30,27 +29,34 @@ a new relation added to each RelOptInfo. This continues until all
relations have been joined into one RelOptInfo, and the cheapest Path is
chosen.
- SELECT *
- FROM tab1, tab2, tab3, tab4
- WHERE tab1.col = tab2.col AND
- tab2.col = tab3.col AND
- tab3.col = tab4.col
-
- Tables 1, 2, 3, and 4 are joined as:
- {1 2},{2 3},{3 4}
- {1 2 3},{2 3 4}
- {1 2 3 4}
-
- SELECT *
- FROM tab1, tab2, tab3, tab4
- WHERE tab1.col = tab2.col AND
- tab1.col = tab3.col AND
- tab1.col = tab4.col
-
- Tables 1, 2, 3, and 4 are joined as:
- {1 2},{1 3},{1 4}
- {1 2 3},{1 3 4},{1,2,4}
- {1 2 3 4}
+ SELECT *
+ FROM tab1, tab2, tab3, tab4
+ WHERE tab1.col = tab2.col AND
+ tab2.col = tab3.col AND
+ tab3.col = tab4.col
+
+ Tables 1, 2, 3, and 4 are joined as:
+ {1 2},{2 3},{3 4}
+ {1 2 3},{2 3 4}
+ {1 2 3 4}
+
+ SELECT *
+ FROM tab1, tab2, tab3, tab4
+ WHERE tab1.col = tab2.col AND
+ tab1.col = tab3.col AND
+ tab1.col = tab4.col
+
+ Tables 1, 2, 3, and 4 are joined as:
+ {1 2},{1 3},{1 4}
+ {1 2 3},{1 3 4},{1,2,4}
+ {1 2 3 4}
+
+In the default left-handed joins, each RelOptInfo adds one
+single-relation RelOptInfo in each join pass, and the added RelOptInfo
+is always the inner relation in the join. In right-handed joins, the
+added RelOptInfo is the outer relation in the join. In bushy plans,
+multi-relation RelOptInfo's can be joined to other multi-relation
+RelOptInfo's.
Optimizer Functions
-------------------
@@ -95,28 +101,28 @@ planner()
---subplanner()
make list of relations in target
make list of relations in where clause
- split up the qual into restrictions (a=1) and joins (b=c)
- find which relations can do merge sort and hash joins
-----find_paths()
- find scan and all index paths for each relation not yet joined
- one relation, return
- find selectivity of columns used in joins
------find_join_paths()
+ split up the qual into restrictions (a=1) and joins (b=c)
+ find relation clauses can do merge sort and hash joins
+----make_one_rel()
+ set_base_rel_pathlist()
+ find scan and all index paths for each relation
+ find selectivity of columns used in joins
+-----make_one_rel_by_joins()
jump to geqo if needed
again:
- find_join_rels():
+ make_rels_by_joins():
for each joinrel:
- find_clause_joins()
- for each join on joinrel:
+ make_rels_by_clause_joins()
+ for each rel's joininfo list:
if a join from the join clause adds only one relation, do the join
- or find_clauseless_joins()
- find_all_join_paths()
- generate paths(nested,sortmerge) for joins found in find_join_rels()
- prune_joinrels()
- remove from the join list the relation we just added to each join
- prune_rel_paths()
- set cheapest and perhaps remove unordered path, recompute table sizes
- if we have not done all the tables, go to again:
+ or make_rels_by_clauseless_joins()
+ update_rels_pathlist_for_joins()
+ generate nested,merge,hash join paths for new rel's created above
+ merge_rels_with_same_relids()
+ merge RelOptInfo paths that have the same relids because of joins
+ rels_set_cheapest()
+ set cheapest path
+ if all relations in one RelOptInfo, return
do group(GROUP)
do aggregate
put back constants
@@ -129,17 +135,15 @@ planner()
Optimizer Structures
--------------------
-RelOptInfo - Every relation
-
- RestrictInfo - restriction clauses
- JoinInfo - join combinations
-
- Path - every way to access a relation(sequential, index)
- IndexPath - index scans
+RelOptInfo - a relation or joined relations
- JoinPath - joins
- MergePath - merge joins
- HashPath - hash joins
+ RestrictInfo - restriction clauses
+ JoinInfo - join clauses
- PathOrder - every ordering type (sort, merge of relations)
+ Path - every way to generate a RelOptInfo(sequential,index,joins)
+ IndexPath - index scans
+ NestPath - nested joins
+ MergePath - merge joins
+ HashPath - hash joins
+ PathOrder - every ordering type (sort, merge of relations)