aboutsummaryrefslogtreecommitdiff
path: root/src/backend/executor/nodeHash.c
Commit message (Collapse)AuthorAge
* Update copyrights for 2013Bruce Momjian2013-01-01
| | | | | Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
* Add defenses against integer overflow in dynahash numbuckets calculations.Tom Lane2012-12-11
| | | | | | | | | | | | | | | The dynahash code requires the number of buckets in a hash table to fit in an int; but since we calculate the desired hash table size dynamically, there are various scenarios where we might calculate too large a value. The resulting overflow can lead to infinite loops, division-by-zero crashes, etc. I (tgl) had previously installed some defenses against that in commit 299d1716525c659f0e02840e31fbe4dea3, but that covered only one call path. Moreover it worked by limiting the request size to work_mem, but in a 64-bit machine it's possible to set work_mem high enough that the problem appears anyway. So let's fix the problem at the root by installing limits in the dynahash.c functions themselves. Trouble report and patch by Jeff Davis.
* Split tuple struct defs from htup.h to htup_details.hAlvaro Herrera2012-08-30
| | | | | | | | | | | | This reduces unnecessary exposure of other headers through htup.h, which is very widely included by many files. I have chosen to move the function prototypes to the new file as well, because that means htup.h no longer needs to include tupdesc.h. In itself this doesn't have much effect in indirect inclusion of tupdesc.h throughout the tree, because it's also required by execnodes.h; but it's something to explore in the future, and it seemed best to do the htup.h change now while I'm busy with it.
* Update copyright notices for year 2012.Bruce Momjian2012-01-01
|
* Rearrange the implementation of index-only scans.Tom Lane2011-10-11
| | | | | | | | | | | | | | | | | | | | | | This commit changes index-only scans so that data is read directly from the index tuple without first generating a faux heap tuple. The only immediate benefit is that indexes on system columns (such as OID) can be used in index-only scans, but this is necessary infrastructure if we are ever to support index-only scans on expression indexes. The executor is now ready for that, though the planner still needs substantial work to recognize the possibility. To do this, Vars in index-only plan nodes have to refer to index columns not heap columns. I introduced a new special varno, INDEX_VAR, to mark such Vars to avoid confusion. (In passing, this commit renames the two existing special varnos to OUTER_VAR and INNER_VAR.) This allows ruleutils.c to handle them with logic similar to what we use for subplan reference Vars. Since index-only scans are now fundamentally different from regular indexscans so far as their expression subtrees are concerned, I also chose to change them to have their own plan node type (and hence, their own executor source file).
* Make EXPLAIN ANALYZE report the numbers of rows rejected by filter steps.Tom Lane2011-09-22
| | | | | | | | | | | | | | | This provides information about the numbers of tuples that were visited but not returned by table scans, as well as the numbers of join tuples that were considered and discarded within a join plan node. There is still some discussion going on about the best way to report counts for outer-join situations, but I think most of what's in the patch would not change if we revise that, so I'm going to go ahead and commit it as-is. Documentation changes to follow (they weren't in the submitted patch either). Marko Tiikkaja, reviewed by Marc Cousin, somewhat revised by Tom
* Remove unnecessary #include references, per pgrminclude script.Bruce Momjian2011-09-01
|
* pgindent run before PG 9.1 beta 1.Bruce Momjian2011-04-10
|
* Stamp copyrights for year 2011.Bruce Momjian2011-01-01
|
* Support RIGHT and FULL OUTER JOIN in hash joins.Tom Lane2010-12-30
| | | | | | | | | | | | | | | | | | | | | | This is advantageous first because it allows us to hash the smaller table regardless of the outer-join type, and second because hash join can be more flexible than merge join in dealing with arbitrary join quals in a FULL join. For merge join all the join quals have to be mergejoinable, but hash join will work so long as there's at least one hashjoinable qual --- the others can be any condition. (This is true essentially because we don't keep per-inner-tuple match flags in merge join, while hash join can do so.) To do this, we need a has-it-been-matched flag for each tuple in the hashtable, not just one for the current outer tuple. The key idea that makes this practical is that we can store the match flag in the tuple's infomask, since there are lots of bits there that are of no interest for a MinimalTuple. So we aren't increasing the size of the hashtable at all for the feature. To write this without turning the hash code into even more of a pile of spaghetti than it already was, I rewrote ExecHashJoin in a state-machine style, similar to ExecMergeJoin. Other than that decision, it was pretty straightforward.
* Remove cvs keywords from all files.Magnus Hagander2010-09-20
|
* Make NestLoop plan nodes pass outer-relation variables into their innerTom Lane2010-07-12
| | | | | | | | | | | | relation using the general PARAM_EXEC executor parameter mechanism, rather than the ad-hoc kluge of passing the outer tuple down through ExecReScan. The previous method was hard to understand and could never be extended to handle parameters coming from multiple join levels. This patch doesn't change the set of possible plans nor have any significant performance effect, but it's necessary infrastructure for future generalization of the concept of an inner indexscan plan. ExecReScan's second parameter is now unused, so it's removed.
* pgindent run for 9.0Bruce Momjian2010-02-26
|
* Wrap calls to SearchSysCache and related functions using macros.Robert Haas2010-02-14
| | | | | | | | | | | | The purpose of this change is to eliminate the need for every caller of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists, GetSysCacheOid, and SearchSysCacheList to know the maximum number of allowable keys for a syscache entry (currently 4). This will make it far easier to increase the maximum number of keys in a future release should we choose to do so, and it makes the code shorter, too. Design and review by Tom Lane.
* Augment EXPLAIN output with more details on Hash nodes.Robert Haas2010-02-01
| | | | | | We show the number of buckets, the number of batches (and also the original number if it has changed), and the peak space used by the hash table. Minor executor changes to track peak space used.
* When estimating the selectivity of an inequality "column > constant" orTom Lane2010-01-04
| | | | | | | | | | | | | | | "column < constant", and the comparison value is in the first or last histogram bin or outside the histogram entirely, try to fetch the actual column min or max value using an index scan (if there is an index on the column). If successful, replace the lower or upper histogram bound with that value before carrying on with the estimate. This limits the estimation error caused by moving min/max values when the comparison value is close to the min or max. Per a complaint from Josh Berkus. It is tempting to consider using this mechanism for mergejoinscansel as well, but that would inject index fetches into main-line join estimation not just endpoint cases. I'm refraining from that until we can get a better handle on the costs of doing this type of lookup.
* Update copyright for the year 2010.Bruce Momjian2010-01-02
|
* Add the ability to store inheritance-tree statistics in pg_statistic,Tom Lane2009-12-29
| | | | | | | | and teach ANALYZE to compute such stats for tables that have subclasses. Per my proposal of yesterday. autovacuum still needs to be taught about running ANALYZE on parent tables when their subclasses change, but the feature is useful even without that.
* Make the overflow guards in ExecChooseHashTableSize be more protective.Tom Lane2009-10-30
| | | | | | | | | | | | | | | The original coding ensured nbuckets and nbatch didn't exceed INT_MAX, which while not insane on its own terms did nothing to protect subsequent code like "palloc(nbatch * sizeof(BufFile *))". Since enormous join size estimates might well be planner error rather than reality, it seems best to constrain the initial sizes to be not more than work_mem/sizeof(pointer), thus ensuring the allocated arrays don't exceed work_mem. We will allow nbatch to get bigger than that during subsequent ExecHashIncreaseNumBatches calls, but we should still guard against integer overflow in those palloc requests. Per bug #5145 from Bernt Marius Johnsen. Although the given test case only seems to fail back to 8.2, previous releases have variants of this issue, so patch all supported branches.
* Remove no-longer-needed ExecCountSlots infrastructure.Tom Lane2009-09-27
|
* 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef listBruce Momjian2009-06-11
| | | | provided by Andrew.
* Revert DTrace patch from Robert LorBruce Momjian2009-04-02
|
* Add support for additional DTrace probes.Bruce Momjian2009-04-02
| | | | Robert Lor
* Optimize multi-batch hash joins when the outer relation has a nonuniformTom Lane2009-03-21
| | | | | | | | | distribution, by creating a special fast path for the (first few) most common values of the outer relation. Tuples having hashvalues matching the MCVs are effectively forced to be in the first batch, so that we never write them out to the batch temp files. Bryce Cutt and Ramon Lawrence, with some editorialization by me.
* Update copyright for 2009.Bruce Momjian2009-01-01
|
* Update copyrights in source tree to 2008.Bruce Momjian2008-01-01
|
* pgindent run for 8.3.Bruce Momjian2007-11-15
|
* Rework temp_tablespaces patch so that temp tablespaces are assigned separatelyTom Lane2007-06-07
| | | | | | | | | for each temp file, rather than once per sort or hashjoin; this allows spreading the data of a large sort or join across multiple tablespaces. (I remain dubious that this will make any difference in practice, but certain people insisted.) Arrange to cache the results of parsing the GUC variable instead of recomputing from scratch on every demand, and push usage of the cache down to the bottommost fd.c level.
* Create a GUC parameter temp_tablespaces that allows selection of theTom Lane2007-06-03
| | | | | | | | | | tablespace(s) in which to store temp tables and temporary files. This is a list to allow spreading the load across multiple tablespaces (a random list element is chosen each time a temp object is to be created). Temp files are not stored in per-database pgsql_tmp/ directories anymore, but per-tablespace directories. Jaime Casanova and Albert Cervera, with review by Bernd Helmle and Tom Lane.
* Buy back some of the cycles spent in more-expensive hash functions byTom Lane2007-06-01
| | | | | | | selecting power-of-2, rather than prime, numbers of buckets in hash joins. If the hash functions are doing their jobs properly by making all hash bits equally random, this is good enough, and it saves expensive integer division and modulus operations.
* Fix bug I introduced in recent patch to make hash joins discard null tuplesTom Lane2007-02-22
| | | | | immediately: ExecHashGetHashValue failed to restore the caller's memory context before taking the failure exit.
* Add support for cross-type hashing in hash index searches and hash joins.Tom Lane2007-01-30
| | | | | | Hashing for aggregation purposes still needs work, so it's not time to mark any cross-type operators as hashable for general use, but these cases work if the operators are so marked by hand in the system catalogs.
* Improve hash join to discard input tuples immediately if they can'tTom Lane2007-01-28
| | | | | | match because they contain a null join key (and the join operator is known strict). Improves performance significantly when the inner relation contains a lot of nulls, as per bug #2930.
* Update CVS HEAD for 2007 copyright. Back branches are typically notBruce Momjian2007-01-05
| | | | back-stamped for this.
* Add additional includes needed on some platforms.Bruce Momjian2006-07-14
|
* Move math.h after postgresql.hBruce Momjian2006-07-13
|
* Allow include files to compile own their own.Bruce Momjian2006-07-13
| | | | | | | Strip unused include files out unused include files, and add needed includes to C files. The next step is to remove unused include files in C files.
* Convert hash join code to use MinimalTuple format in tuple hash tableTom Lane2006-06-27
| | | | and batch files. Should reduce memory and I/O demands for such joins.
* Make EXPLAIN sampling smarter, to avoid excessive sampling delay.Bruce Momjian2006-05-30
| | | | Martijn van Oosterhout
* Remove CXT_printf/CXT1_printf macros. If anyone had found them to be ofTom Lane2006-05-23
| | | | | | | | | any use in the past many years, we'd have made some effort to include them in all executor node types; but in fact they were only in nodeAppend.c and nodeIndexscan.c, up until I copied nodeIndexscan.c's occurrence into the new bitmap node types. Remove some other unused macros in execdebug.h, too. Some day the whole header probably ought to go away in favor of better-designed facilities.
* Update copyright for 2006. Update scripts.Bruce Momjian2006-03-05
|
* Extend the ExecInitNode API so that plan nodes receive a set of flagTom Lane2006-02-28
| | | | | | | | | | | | bits indicating which optional capabilities can actually be exercised at runtime. This will allow Sort and Material nodes, and perhaps later other nodes, to avoid unnecessary overhead in common cases. This commit just adds the infrastructure and arranges to pass the correct flag values down to plan nodes; none of the actual optimizations are here yet. I'm committing this separately in case anyone wants to measure the added overhead. (It should be negligible.) Simon Riggs and Tom Lane
* Get rid of ExecAssignResultTypeFromOuterPlan() and make all plan node typesTom Lane2005-11-23
| | | | | | | | | | | generate their output tuple descriptors from their target lists (ie, using ExecAssignResultTypeFromTL()). We long ago fixed things so that all node types have minimally valid tlists, so there's no longer any good reason to have two different ways of doing it. This change is needed to fix bug reported by Hayden James: the fix of 2005-11-03 to emit the correct column names after optimizing away a SubqueryScan node didn't work if the new top-level plan node used ExecAssignResultTypeFromOuterPlan to generate its tupdesc, since the next plan node down won't have the correct column labels.
* Re-run pgindent, fixing a problem where comment lines after a blankBruce Momjian2005-11-22
| | | | | | | | | comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.
* Remove the t_datamcxt field of HeapTupleData. This was introduced forTom Lane2005-11-20
| | | | | the convenience of tuptoaster.c and is no longer needed, so may as well get rid of some small amount of overhead.
* Standard pgindent run for 8.1.Bruce Momjian2005-10-15
|
* The original patch to avoid building a hash join's hashtable when theTom Lane2005-09-25
| | | | | | | | outer relation is empty did not work, per test case from Patrick Welche. It tried to use nodeHashjoin.c's high-level mechanisms for fetching an outer-relation tuple, but that code expected the hash table to be filled already. As patched, the code failed in corner cases such as having no outer-relation tuples for the first hash batch. Revert and rewrite.
* Change the implementation of hash join to attempt to avoid unnecessaryNeil Conway2005-06-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | work if either of the join relations are empty. The logic is: (1) if the inner relation's startup cost is less than the outer relation's startup cost and this is not an outer join, read a single tuple from the inner relation via ExecHash() - if NULL, we're done (2) read a single tuple from the outer relation - if NULL, we're done (3) build the hash table on the inner relation - if hash table is empty and this is not an outer join, we're done (4) otherwise, do hash join as usual The implementation uses the new MultiExecProcNode API, per a suggestion from Tom: invoking ExecHash() now produces the first tuple from the Hash node's child node, whereas MultiExecHash() builds the hash table. I had to put in a bit of a kludge to get the row count returned for EXPLAIN ANALYZE to be correct: since ExecHash() is invoked to return a tuple, and then MultiExecHash() is invoked, we would return one too many tuples to EXPLAIN ANALYZE. I hacked around this by just manually detecting this situation and subtracting 1 from the EXPLAIN ANALYZE row count.
* Create a new 'MultiExecProcNode' call API for plan nodes that don'tTom Lane2005-04-16
| | | | | | | return just a single tuple at a time. Currently the only such node type is Hash, but I expect we will soon have indexscans that can return tuple bitmaps. A side benefit is that EXPLAIN ANALYZE now shows the correct tuple count for a Hash node.
* Minor code cleanup: ExecHash() was returning a null TupleTableSlot, and anNeil Conway2005-03-31
| | | | | | old comment in the code claimed that this was necessary. Since it is not actually necessary any more, it is clearer to remove the comment and just return NULL instead -- the return value of ExecHash() is not used.