aboutsummaryrefslogtreecommitdiff
path: root/src/backend/executor/nodeMergejoin.c
Commit message (Collapse)AuthorAge
* Update copyrights for 2013Bruce Momjian2013-01-01
| | | | | Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
* Run pgindent on 9.2 source tree in preparation for first 9.3Bruce Momjian2012-06-10
| | | | commit-fest.
* Lots of doc corrections.Robert Haas2012-04-23
| | | | Josh Kupershmidt
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Create a "sort support" interface API for faster sorting.Tom Lane2011-12-07
| | | | | | | | | | | | This patch creates an API whereby a btree index opclass can optionally provide non-SQL-callable support functions for sorting. In the initial patch, we only use this to provide a directly-callable comparator function, which can be invoked with a bit less overhead than the traditional SQL-callable comparator. While that should be of value in itself, the real reason for doing this is to provide a datatype-extensible framework for more aggressive optimizations, as in Peter Geoghegan's recent work. Robert Haas and Tom Lane
* Make EXPLAIN ANALYZE report the numbers of rows rejected by filter steps.Tom Lane2011-09-22
| | | | | | | | | | | | | | | This provides information about the numbers of tuples that were visited but not returned by table scans, as well as the numbers of join tuples that were considered and discarded within a join plan node. There is still some discussion going on about the best way to report counts for outer-join situations, but I think most of what's in the patch would not change if we revise that, so I'm going to go ahead and commit it as-is. Documentation changes to follow (they weren't in the submitted patch either). Marko Tiikkaja, reviewed by Marc Cousin, somewhat revised by Tom
* Remove unnecessary #include references, per pgrminclude script.Bruce Momjian2011-09-01
|
* Make a code-cleanup pass over the collations patch.Tom Lane2011-04-22
| | | | | | | This patch is almost entirely cosmetic --- mostly cleaning up a lot of neglected comments, and fixing code layout problems in places where the patch made lines too long and then pgindent did weird things with that. I did find a bug-of-omission in equalTupleDescs().
* Pass collations to functions in FunctionCallInfoData, not FmgrInfo.Tom Lane2011-04-12
| | | | | | | | | | | Since collation is effectively an argument, not a property of the function, FmgrInfo is really the wrong place for it; and this becomes critical in cases where a cached FmgrInfo is used for varying purposes that might need different collation settings. Fix by passing it in FunctionCallInfoData instead. In particular this allows a clean fix for bug #5970 (record_cmp not working). This requires touching a bit more code than the original method, but nobody ever thought that collations would not be an invasive patch...
* Clean up most -Wunused-but-set-variable warnings from gcc 4.6Peter Eisentraut2011-04-11
| | | | | | This warning is new in gcc 4.6 and part of -Wall. This patch cleans up most of the noise, but there are some still warnings that are trickier to remove.
* pgindent run before PG 9.1 beta 1.Bruce Momjian2011-04-10
|
* Revise collation derivation method and expression-tree representation.Tom Lane2011-03-19
| | | | | | | | | | | | | | | | | | | All expression nodes now have an explicit output-collation field, unless they are known to only return a noncollatable data type (such as boolean or record). Also, nodes that can invoke collation-aware functions store a separate field that is the collation value to pass to the function. This avoids confusion that arises when a function has collatable inputs and noncollatable output type, or vice versa. Also, replace the parser's on-the-fly collation assignment method with a post-pass over the completed expression tree. This allows us to use a more complex (and hopefully more nearly spec-compliant) assignment rule without paying for it in extra storage in every expression node. Fix assorted bugs in the planner's handling of collations by making collation one of the defining properties of an EquivalenceClass and by converting CollateExprs into discardable RelabelType nodes during expression preprocessing.
* Per-column collation supportPeter Eisentraut2011-02-08
| | | | | | | | This adds collation support for columns and domains, a COLLATE clause to override it per expression, and B-tree index support. Peter Eisentraut reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Move symbols for ExecMergeJoin's state machine into nodeMergejoin.c.Tom Lane2010-12-30
| | | | | There's no reason for these values to be known anywhere else. After doing this, executor/execdefs.h is vestigial and can be removed.
* Create core infrastructure for KNNGIST.Tom Lane2010-12-02
| | | | | | | | | | | | | | | | | | | This is a heavily revised version of builtin_knngist_core-0.9. The ordering operators are no longer mixed in with actual quals, which would have confused not only humans but significant parts of the planner. Instead, ordering operators are carried separately throughout planning and execution. Since the API for ambeginscan and amrescan functions had to be changed anyway, this commit takes the opportunity to rationalize that a bit. RelationGetIndexScan no longer forces a premature index_rescan call; instead, callers of index_beginscan must call index_rescan too. Aside from making the AM-side initialization logic a bit less peculiar, this has the advantage that we do not make a useless extra am_rescan call when there are runtime key values. AMs formerly could not assume that the key values passed to amrescan were actually valid; now they can. Teodor Sigaev and Tom Lane
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Make NestLoop plan nodes pass outer-relation variables into their innerTom Lane2010-07-12
| | | | | | | | | | | | relation using the general PARAM_EXEC executor parameter mechanism, rather than the ad-hoc kluge of passing the outer tuple down through ExecReScan. The previous method was hard to understand and could never be extended to handle parameters coming from multiple join levels. This patch doesn't change the set of possible plans nor have any significant performance effect, but it's necessary infrastructure for future generalization of the concept of an inner indexscan plan. ExecReScan's second parameter is now unused, so it's removed.
* pgindent run for 9.0, second runBruce Momjian2010-07-06
|
* Rejigger mergejoin logic so that a tuple with a null in the first merge columnTom Lane2010-05-28
| | | | | | | | | | | | | | | | | | | is treated like end-of-input, if nulls sort last in that column and we are not doing outer-join filling for that input. In such a case, the tuple cannot join to anything from the other input (because we assume mergejoinable operators are strict), and neither can any tuple following it in the sort order. If we're not interested in doing outer-join filling we can just pretend the tuple and its successors aren't there at all. This can save a great deal of time in situations where there are many nulls in the join column, as in a recent example from Scott Marlowe. Also, since the planner tends to not count nulls in its mergejoin scan selectivity estimates, this is an important fix to make the runtime behavior more like the estimate. I regard this as an omission in the patch I wrote years ago to teach mergejoin that tuples containing nulls aren't joinable, so I'm back-patching it. But only to 8.3 --- in older versions, we didn't have a solid notion of whether nulls sort high or low, so attempting to apply this optimization could break things.
* pgindent run for 9.0Bruce Momjian2010-02-26
|
* Add support for doing FULL JOIN ON FALSE. While this is really a ratherTom Lane2010-01-05
| | | | | | | | | | peculiar variant of UNION ALL, and so wouldn't likely get written directly as-is, it's possible for it to arise as a result of simplification of less-obviously-silly queries. In particular, now that we can do flattening of subqueries that have constant outputs and are underneath an outer join, it's possible for the case to result from simplification of queries of the type exhibited in bug #5263. Back-patch to 8.4 to avoid a functionality regression for this type of query.
* Update copyright for the year 2010.Bruce Momjian2010-01-02
|
* Remove no-longer-needed ExecCountSlots infrastructure.Tom Lane2009-09-27
|
* 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef listBruce Momjian2009-06-11
| | | | provided by Andrew.
* Revert DTrace patch from Robert LorBruce Momjian2009-04-02
|
* Add support for additional DTrace probes.Bruce Momjian2009-04-02
| | | | Robert Lor
* Update copyright for 2009.Bruce Momjian2009-01-01
|
* Performance fix for new anti-join code in nodeMergejoin.c: after finding aTom Lane2008-08-15
| | | | | | | | | | | | | | | | | | match in antijoin mode, we should advance to next outer tuple not next inner. We know we don't want to return this outer tuple, and there is no point in advancing over matching inner tuples now, because we'd just have to do it again if the next outer tuple has the same merge key. This makes a noticeable difference if there are lots of duplicate keys in both inputs. Similarly, after finding a match in semijoin mode, arrange to advance to the next outer tuple after returning the current match; or immediately, if it fails the extra quals. The rationale is the same. (This is a performance bug in existing releases; perhaps worth back-patching? The planner tries to avoid using mergejoin with lots of duplicates, so it may not be a big issue in practice.) Nestloop and hash got this right to start with, but I made some cosmetic adjustments there to make the corresponding bits of logic look more similar.
* Implement SEMI and ANTI joins in the planner and executor. (Semijoins replaceTom Lane2008-08-14
| | | | | | | | | | | | | | the old JOIN_IN code, but antijoins are new functionality.) Teach the planner to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti joins respectively. Also, LEFT JOINs with suitable upper-level IS NULL filters are recognized as being anti joins. Unify the InClauseInfo and OuterJoinInfo infrastructure into "SpecialJoinInfo". With that change, it becomes possible to associate a SpecialJoinInfo with every join attempt, which permits some cleanup of join selectivity estimation. That needs to be taken much further than this patch does, but the next step is to change the API for oprjoin selectivity functions, which seems like material for a separate patch. So for the moment the output size estimates for semi and especially anti joins are quite bogus.
* Since createplan.c no longer cares whether index operators are lossy, it hasTom Lane2008-04-13
| | | | | | | | | | no particular need to do get_op_opfamily_properties() while building an indexscan plan. Postpone that lookup until executor start. This simplifies createplan.c a lot more than it complicates nodeIndexscan.c, and makes things more uniform since we already had to do it that way for RowCompare expressions. Should be a bit faster too, at least for plans that aren't re-used many times, since we avoid palloc'ing and perhaps copying the intermediate list data structure.
* Update copyrights in source tree to 2008.Bruce Momjian2008-01-01
|
* pgindent run for 8.3.Bruce Momjian2007-11-15
|
* Teach tuplestore.c to throw away data before the "mark" point when the callerTom Lane2007-05-21
| | | | | | | | | | | | is using mark/restore but not rewind or backward-scan capability. Insert a materialize plan node between a mergejoin and its inner child if the inner child is a sort that is expected to spill to disk. The materialize shields the sort from the need to do mark/restore and thereby allows it to perform its final merge pass on-the-fly; while the materialize itself is normally cheap since it won't spill to disk unless the number of tuples with equal key values exceeds work_mem. Greg Stark, with some kibitzing from Tom Lane.
* Repair failure to check that a table is still compatible with a previouslyTom Lane2007-02-02
| | | | | | | | | | | | | | | | | | | | | | made query plan. Use of ALTER COLUMN TYPE creates a hazard for cached query plans: they could contain Vars that claim a column has a different type than it now has. Fix this by checking during plan startup that Vars at relation scan level match the current relation tuple descriptor. Since at that point we already have at least AccessShareLock, we can be sure the column type will not change underneath us later in the query. However, since a backend's locks do not conflict against itself, there is still a hole for an attacker to exploit: he could try to execute ALTER COLUMN TYPE while a query is in progress in the current backend. Seal that hole by rejecting ALTER TABLE whenever the target relation is already open in the current backend. This is a significant security hole: not only can one trivially crash the backend, but with appropriate misuse of pass-by-reference datatypes it is possible to read out arbitrary locations in the server process's memory, which could allow retrieving database content the user should not be able to see. Our thanks to Jeff Trout for the initial report. Security: CVE-2007-0556
* Teach nodeMergejoin how to handle DESC and/or NULLS FIRST sort orders.Tom Lane2007-01-11
| | | | So far only tested by hacking the planner ...
* Change the planner-to-executor API so that the planner tells the executorTom Lane2007-01-10
| | | | | | | | | | | | | | | | which comparison operators to use for plan nodes involving tuple comparison (Agg, Group, Unique, SetOp). Formerly the executor looked up the default equality operator for the datatype, which was really pretty shaky, since it's possible that the data being fed to the node is sorted according to some nondefault operator class that could have an incompatible idea of equality. The planner knows what it has sorted by and therefore can provide the right equality operator to use. Also, this change moves a couple of catalog lookups out of the executor and into the planner, which should help startup time for pre-planned queries by some small amount. Modify the planner to remove some other cavalier assumptions about always being able to use the default operators. Also add "nulls first/last" info to the Plan node for a mergejoin --- neither the executor nor the planner can cope yet, but at least the API is in place.
* Update CVS HEAD for 2007 copyright. Back branches are typically notBruce Momjian2007-01-05
| | | | back-stamped for this.
* Restructure operator classes to allow improved handling of cross-data-typeTom Lane2006-12-23
| | | | | | | | | | | | | | | | cases. Operator classes now exist within "operator families". While most families are equivalent to a single class, related classes can be grouped into one family to represent the fact that they are semantically compatible. Cross-type operators are now naturally adjunct parts of a family, without having to wedge them into a particular opclass as we had done originally. This commit restructures the catalogs and cleans up enough of the fallout so that everything still works at least as well as before, but most of the work needed to actually improve the planner's behavior will come later. Also, there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way to create a new family right now is to allow CREATE OPERATOR CLASS to make one by default. I owe some more documentation work, too. But that can all be done in smaller pieces once this infrastructure is in place.
* pgindent run for 8.2.Bruce Momjian2006-10-04
|
* Remove 576 references of include files that were not needed.Bruce Momjian2006-07-14
|
* Fix problems with cached tuple descriptors disappearing while still in useTom Lane2006-06-16
| | | | | | | | | | by creating a reference-count mechanism, similar to what we did a long time ago for catcache entries. The back branches have an ugly solution involving lots of extra copies, but this way is more efficient. Reference counting is only applied to tupdescs that are actually in caches --- there seems no need to use it for tupdescs that are generated in the executor, since they'll go away during plan shutdown by virtue of being in the per-query memory context. Neil Conway and Tom Lane
* Fix bug introduced into mergejoin logic by performance improvement patch ofTom Lane2006-03-17
| | | | | | | | | | | | | | 2005-05-13. When we find that a new inner tuple can't possibly match any outer tuple (because it contains a NULL), we can't immediately skip the tuple when we are in NEXTINNER state. Doing so can lead to emitting multiple copies of the tuple in FillInner mode, because we may rescan the tuple after returning to a previous marked tuple. Instead, proceed to NEXTOUTER state the same as we used to do. After we've found that there's no need to return to the marked position, we can go to SKIPINNER_ADVANCE state instead of SKIP_TEST when the inner tuple is unmatchable; this preserves the performance improvement. Per bug report from Bruce. I also made a couple of cosmetic code rearrangements and added a regression test for the problem.
* Update copyright for 2006. Update scripts.Bruce Momjian2006-03-05
|
* Extend the ExecInitNode API so that plan nodes receive a set of flagTom Lane2006-02-28
| | | | | | | | | | | | bits indicating which optional capabilities can actually be exercised at runtime. This will allow Sort and Material nodes, and perhaps later other nodes, to avoid unnecessary overhead in common cases. This commit just adds the infrastructure and arranges to pass the correct flag values down to plan nodes; none of the actual optimizations are here yet. I'm committing this separately in case anyone wants to measure the added overhead. (It should be negligible.) Simon Riggs and Tom Lane
* Re-run pgindent, fixing a problem where comment lines after a blankBruce Momjian2005-11-22
| | | | | | | | | comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.
* Standard pgindent run for 8.1.Bruce Momjian2005-10-15
|
* Fix latent bug in ExecSeqRestrPos: it leaves the plan node's result slotTom Lane2005-05-15
| | | | | | | in an inconsistent state. (This is only latent because in reality ExecSeqRestrPos is dead code at the moment ... but someday maybe it won't be.) Add some comments about what the API for plan node mark/restore actually is, because it's not immediately obvious.
* Minor refactoring to eliminate duplicate code and make startup aTom Lane2005-05-14
| | | | tad faster.
* Revise nodeMergejoin in light of example provided by Guillaume Smet.Tom Lane2005-05-13
| | | | | | | | | | When one side of the join has a NULL, we don't want to uselessly try to match it against every remaining tuple of the other side. While at it, rewrite the comparison machinery to avoid multiple evaluations of the left and right input expressions and to use a btree comparator where available, instead of double operator calls. Also revise the state machine to eliminate redundant comparisons and hopefully make it more readable too.