the old JOIN_IN code, but antijoins are new functionality.) Teach the planner
to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti
joins respectively. Also, LEFT JOINs with suitable upper-level IS NULL
filters are recognized as being anti joins. Unify the InClauseInfo and
OuterJoinInfo infrastructure into "SpecialJoinInfo". With that change,
it becomes possible to associate a SpecialJoinInfo with every join attempt,
which permits some cleanup of join selectivity estimation. That needs to be
taken much further than this patch does, but the next step is to change the
API for oprjoin selectivity functions, which seems like material for a
separate patch. So for the moment the output size estimates for semi and
especially anti joins are quite bogus.
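
A minimal standalone sketch of the join semantics involved, using a naive
nested loop; the names and types are invented for illustration and this is
not the planner/executor code itself:

    #include <stdbool.h>
    #include <stdio.h>

    /* A toy row with a single integer join key. */
    typedef struct { int key; } Row;

    /*
     * Semi join: emit each outer row at most once, as soon as one inner
     * match is found (the shape an EXISTS subquery turns into).
     */
    static void
    semi_join(const Row *outer, int nouter, const Row *inner, int ninner)
    {
        for (int i = 0; i < nouter; i++)
            for (int j = 0; j < ninner; j++)
                if (outer[i].key == inner[j].key)
                {
                    printf("semi: %d\n", outer[i].key);
                    break;      /* at most one output per outer row */
                }
    }

    /*
     * Anti join: emit each outer row only if NO inner match exists (the
     * shape of NOT EXISTS, or of a LEFT JOIN with an IS NULL filter).
     */
    static void
    anti_join(const Row *outer, int nouter, const Row *inner, int ninner)
    {
        for (int i = 0; i < nouter; i++)
        {
            bool matched = false;

            for (int j = 0; j < ninner; j++)
                if (outer[i].key == inner[j].key)
                {
                    matched = true;
                    break;
                }
            if (!matched)
                printf("anti: %d\n", outer[i].key);
        }
    }

    int
    main(void)
    {
        Row outer[] = {{1}, {2}, {3}};
        Row inner[] = {{2}, {2}, {3}};

        semi_join(outer, 3, inner, 3);  /* prints 2 and 3, once each */
        anti_join(outer, 3, inner, 3);  /* prints 1 */
        return 0;
    }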
for each temp file, rather than once per sort or hashjoin; this allows
spreading the data of a large sort or join across multiple tablespaces.
(I remain dubious that this will make any difference in practice, but certain
people insisted.) Arrange to cache the results of parsing the GUC variable
instead of recomputing from scratch on every demand, and push usage of the
cache down to the bottommost fd.c level.
tablespace(s) in which to store temp tables and temporary files. This is a
list to allow spreading the load across multiple tablespaces (a random list
element is chosen each time a temp object is to be created). Temp files are
not stored in per-database pgsql_tmp/ directories anymore, but in
per-tablespace directories.
Jaime Casanova and Albert Cervera, with review by Bernd Helmle and Tom Lane.
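
A minimal sketch of the selection scheme just described: parse the list
once, cache it, and pick a random element per temp object. All names here
are invented; the real work happens in the GUC machinery and fd.c.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    /* Hypothetical cached form of a parsed temp-tablespace list. */
    static char **cached_spaces;
    static int    n_cached;

    /* Parse the comma-separated setting once and cache the result. */
    static void
    cache_temp_tablespaces(const char *guc_value)
    {
        char *copy = strdup(guc_value);

        n_cached = 0;
        cached_spaces = NULL;
        for (char *tok = strtok(copy, ","); tok; tok = strtok(NULL, ","))
        {
            cached_spaces = realloc(cached_spaces,
                                    (n_cached + 1) * sizeof(char *));
            cached_spaces[n_cached++] = strdup(tok);
        }
        free(copy);
    }

    /* Choose a random tablespace for the next temp table or temp file. */
    static const char *
    next_temp_tablespace(void)
    {
        return cached_spaces[rand() % n_cached];
    }

    int
    main(void)
    {
        srand((unsigned) time(NULL));
        cache_temp_tablespaces("ts1,ts2,ts3");
        for (int i = 0; i < 5; i++)
            printf("temp object %d -> %s\n", i, next_temp_tablespace());
        return 0;
    }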
made query plan. Use of ALTER COLUMN TYPE creates a hazard for cached
query plans: they could contain Vars that claim a column has a different
type than it now has. Fix this by checking during plan startup that Vars
at relation scan level match the current relation tuple descriptor. Since
at that point we already have at least AccessShareLock, we can be sure the
column type will not change underneath us later in the query. However,
since a backend's locks do not conflict against itself, there is still a
hole for an attacker to exploit: he could try to execute ALTER COLUMN TYPE
while a query is in progress in the current backend. Seal that hole by
rejecting ALTER TABLE whenever the target relation is already open in
the current backend.
This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see. Our thanks to Jeff Trout for the initial report.
Security: CVE-2007-0556
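
A standalone sketch of the startup defense described above; the structures
are invented stand-ins for a cached plan's Vars and the live tuple
descriptor, not the actual executor code.

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct { int attnum; unsigned type_oid; } Var;
    typedef struct { unsigned type_oid; } ColumnDesc;

    /*
     * Every Var read at relation-scan level must still match the live
     * tuple descriptor; otherwise the cached plan is stale (e.g., after
     * ALTER COLUMN TYPE) and must not be executed.
     */
    static bool
    plan_vars_match(const Var *vars, int nvars,
                    const ColumnDesc *desc, int ncols)
    {
        for (int i = 0; i < nvars; i++)
        {
            int col = vars[i].attnum - 1;   /* attnums are 1-based */

            if (col < 0 || col >= ncols ||
                vars[i].type_oid != desc[col].type_oid)
                return false;               /* stale plan detected */
        }
        return true;
    }

    int
    main(void)
    {
        Var        vars[] = {{1, 23}, {2, 25}};  /* plan expects int4, text */
        ColumnDesc live[] = {{23}, {1700}};      /* column 2 is now numeric */

        puts(plan_vars_match(vars, 2, live, 2)
             ? "plan ok" : "stale plan detected");
        return 0;
    }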
Hashing for aggregation purposes still needs work, so it's not time to
mark any cross-type operators as hashable for general use, but these cases
work if the operators are so marked by hand in the system catalogs.
match because they contain a null join key (and the join operator is
known strict). Improves performance significantly when the inner
relation contains a lot of nulls, as per bug #2930.
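
The short-circuit can be sketched as follows (invented names, not the
nodeHashjoin.c code): with a strict join operator a NULL key can never
match, so the outer tuple is emitted null-extended (outer join) or simply
discarded, without ever probing -- or batching to disk -- the hash table.

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct { bool key_is_null; int key; } OuterTuple;

    static bool probe_hash_table(int key) { (void) key; return false; } /* stub */
    static void emit_null_extended(const OuterTuple *t) { printf("null-extended %d\n", t->key); }

    static void
    process_outer_tuple(const OuterTuple *t, bool op_is_strict, bool is_outer_join)
    {
        if (op_is_strict && t->key_is_null)
        {
            if (is_outer_join)
                emit_null_extended(t);  /* LEFT JOIN still outputs the row */
            return;                     /* never touch the hash table */
        }

        if (probe_hash_table(t->key))
            printf("matched %d\n", t->key);
        else if (is_outer_join)
            emit_null_extended(t);
    }

    int
    main(void)
    {
        OuterTuple t = {true, 0};

        process_outer_tuple(&t, true, true);
        return 0;
    }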
back-stamped for this.
and batch files. Should reduce memory and I/O demands for such joins.
by creating a reference-count mechanism, similar to what we did a long time
ago for catcache entries. The back branches have an ugly solution involving
lots of extra copies, but this way is more efficient. Reference counting is
only applied to tupdescs that are actually in caches --- there seems no need
to use it for tupdescs that are generated in the executor, since they'll go
away during plan shutdown by virtue of being in the per-query memory context.
Neil Conway and Tom Lane
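
The general shape of the mechanism, as a hedged sketch (the actual code
lives in the cache and tupdesc support routines):

    #include <assert.h>
    #include <stdlib.h>

    /* Toy tuple descriptor carrying a reference count. */
    typedef struct { int refcount; int natts; } TupleDescr;

    static TupleDescr *
    make_cached_descr(int natts)
    {
        TupleDescr *td = malloc(sizeof(TupleDescr));

        td->refcount = 1;       /* the cache itself holds one reference */
        td->natts = natts;
        return td;
    }

    /* Callers pin the descriptor while they use it... */
    static void
    pin_descr(TupleDescr *td)
    {
        assert(td->refcount > 0);
        td->refcount++;
    }

    /* ...and release it when done; the last release frees it. */
    static void
    release_descr(TupleDescr *td)
    {
        assert(td->refcount > 0);
        if (--td->refcount == 0)
            free(td);
    }

    int
    main(void)
    {
        TupleDescr *td = make_cached_descr(4);

        pin_descr(td);      /* a cache lookup hands out a pinned descriptor */
        release_descr(td);  /* the user is done with it */
        release_descr(td);  /* cache entry dropped: descriptor freed */
        return 0;
    }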
bits indicating which optional capabilities can actually be exercised
at runtime. This will allow Sort and Material nodes, and perhaps later
other nodes, to avoid unnecessary overhead in common cases.
This commit just adds the infrastructure and arranges to pass the correct
flag values down to plan nodes; none of the actual optimizations are here
yet. I'm committing this separately in case anyone wants to measure the
added overhead. (It should be negligible.)
Simon Riggs and Tom Lane
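
A sketch of the pattern this enables (flag names follow the executor's
style, but values and the node are simplified stand-ins): at init time a
node inspects the bits passed down by its parent and only pays for a
capability some caller may actually exercise.

    #include <stdbool.h>
    #include <stdio.h>

    #define EXEC_FLAG_REWIND   0x01    /* rescans expected */
    #define EXEC_FLAG_BACKWARD 0x02    /* backward scan expected */
    #define EXEC_FLAG_MARK     0x04    /* mark/restore expected */

    typedef struct { bool random_access; } SortState;

    static void
    init_sort(SortState *state, int eflags)
    {
        /* keep all tuples around only if someone may revisit them */
        state->random_access =
            (eflags & (EXEC_FLAG_REWIND |
                       EXEC_FLAG_BACKWARD |
                       EXEC_FLAG_MARK)) != 0;
    }

    int
    main(void)
    {
        SortState s;

        init_sort(&s, 0);                 /* plain one-pass read */
        printf("random access: %d\n", s.random_access);
        init_sort(&s, EXEC_FLAG_REWIND);  /* parent may rescan us */
        printf("random access: %d\n", s.random_access);
        return 0;
    }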
it's worth probing the outer relation for emptiness before building the
hash table. To wit, if we're rescanning a join previously performed,
remember whether we found it nonempty the previous time, and don't bother
with the probe if it was nonempty. This buys back the performance lost
in examples like Mario Weilguni's.
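
Sketched with invented names, the rescan heuristic looks like this: pay
for the emptiness probe only while there is no evidence yet, and skip it
once the outer relation has been seen to be nonempty.

    #include <stdbool.h>

    typedef struct
    {
        bool first_scan;
        bool outer_was_nonempty;    /* learned on a previous scan */
    } JoinScanState;

    static bool fetch_first_outer_tuple(void) { return true; }  /* stub */
    static void build_hash_table(void) {}                       /* stub */

    static void
    begin_hash_join(JoinScanState *s)
    {
        if (s->first_scan || !s->outer_was_nonempty)
        {
            if (!fetch_first_outer_tuple())
                return;             /* outer empty: join produces nothing */
            s->outer_was_nonempty = true;
        }
        s->first_scan = false;
        build_hash_table();
    }

    int
    main(void)
    {
        JoinScanState s = {true, false};

        begin_hash_join(&s);    /* first scan: probes the outer relation */
        begin_hash_join(&s);    /* rescan: probe skipped, table built directly */
        return 0;
    }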
one child or the other had a problem: they did not leave the node in a
state that ExecReScanHashJoin would understand. In particular it would
tend to fail to reset the child plans when needed. Per report from
Mario Weilguni.
comment lines were output too long, and update typedefs for the /lib
directory. Also fix cases where identifiers were used as variable names
in the backend but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.
the convenience of tuptoaster.c and is no longer needed, so we may as well
get rid of a small amount of overhead.
by a recent HP C compiler. Mostly, get rid of useless local variables
that are assigned to but never used.
outer relation is empty did not work, per test case from Patrick Welche.
It tried to use nodeHashjoin.c's high-level mechanisms for fetching an
outer-relation tuple, but that code expected the hash table to be filled
already. As patched, the code failed in corner cases such as having no
outer-relation tuples for the first hash batch. Revert and rewrite.
work if either of the join relations is empty. The logic is:
(1) if the inner relation's startup cost is less than the outer
relation's startup cost and this is not an outer join, read
a single tuple from the inner relation via ExecHash()
- if NULL, we're done
(2) read a single tuple from the outer relation
- if NULL, we're done
(3) build the hash table on the inner relation
- if hash table is empty and this is not an outer join,
we're done
(4) otherwise, do hash join as usual
The implementation uses the new MultiExecProcNode API, per a
suggestion from Tom: invoking ExecHash() now produces the first
tuple from the Hash node's child node, whereas MultiExecHash()
builds the hash table.
I had to put in a bit of a kludge to get the row count returned
for EXPLAIN ANALYZE to be correct: since ExecHash() is invoked to
return a tuple, and then MultiExecHash() is invoked, we would
return one too many tuples to EXPLAIN ANALYZE. I hacked around
this by just manually detecting this situation and subtracting 1
from the EXPLAIN ANALYZE row count.
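
The four-step logic above, condensed into a standalone sketch (all names
are invented stand-ins for the executor pieces involved; this is not the
actual nodeHashjoin.c code):

    #include <stdbool.h>
    #include <stddef.h>

    typedef struct { int dummy; } Tuple;

    static Tuple *first_inner_tuple(void) { static Tuple t; return &t; } /* stub */
    static Tuple *first_outer_tuple(void) { static Tuple t; return &t; } /* stub */
    static bool   build_hash_table(void)  { return true; }  /* false if empty */
    static void   hash_join_as_usual(Tuple *outer_first) { (void) outer_first; }

    static double inner_startup_cost = 1.0;
    static double outer_startup_cost = 2.0;

    static void
    exec_hash_join(bool is_outer_join)
    {
        /* (1) maybe peek at the inner relation first */
        if (!is_outer_join && inner_startup_cost < outer_startup_cost)
            if (first_inner_tuple() == NULL)
                return;             /* inner empty: no output possible */

        /* (2) peek at the outer relation */
        Tuple *outer_first = first_outer_tuple();

        if (outer_first == NULL)
            return;                 /* outer empty: no output possible */

        /* (3) build the hash table on the inner relation */
        if (!build_hash_table() && !is_outer_join)
            return;                 /* inner empty: no output possible */

        /* (4) proceed with the normal hash join */
        hash_join_as_usual(outer_first);
    }

    int
    main(void)
    {
        exec_hash_join(false);
        return 0;
    }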
return just a single tuple at a time. Currently the only such node
type is Hash, but I expect we will soon have indexscans that can return
tuple bitmaps. A side benefit is that EXPLAIN ANALYZE now shows the
correct tuple count for a Hash node.
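
The split can be pictured as two entry points (simplified, invented
signatures): the demand-pull one returns a tuple per call, while the
multi-exec one returns its node's complete result in a single call.

    #include <stddef.h>

    typedef struct { int dummy; } Tuple;
    typedef struct { int dummy; } HashTable;

    static Tuple *
    exec_proc_node(void)        /* demand-pull: one tuple per call */
    {
        return NULL;            /* stub */
    }

    static HashTable *
    multi_exec_proc_node(void)  /* one call: e.g. the filled hash table */
    {
        static HashTable ht;

        return &ht;             /* stub */
    }

    int
    main(void)
    {
        (void) exec_proc_node();
        (void) multi_exec_proc_node();
        return 0;
    }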
old comment in the code claimed that this was necessary. Since it is not
actually necessary any more, it is clearer to remove the comment and
just return NULL instead -- the return value of ExecHash() is not used.
of tuples when passing data up through multiple plan nodes. A slot can now
hold either a normal "physical" HeapTuple, or a "virtual" tuple consisting
of Datum/isnull arrays. Upper plan levels can usually just copy the Datum
arrays, avoiding heap_formtuple() and possible subsequent nocachegetattr()
calls to extract the data again. This work extends Atsushi Ogawa's earlier
patch, which provided the key idea of adding Datum arrays to TupleTableSlots.
(I believe however that something like this was foreseen way back in Berkeley
days --- see the old comment on ExecProject.) A test case involving many
levels of join of fairly wide tables (about 80 columns altogether) showed
about 3x overall speedup, though simple queries will probably not be
helped very much.
I have also duplicated some code in heaptuple.c in order to provide versions
of heap_formtuple and friends that use "bool" arrays to indicate null
attributes, instead of the old convention of "char" arrays containing either
'n' or ' '. This provides a better match to the convention used by
ExecEvalExpr. While I have not made a concerted effort to get rid of uses
of the old routines, I think they should be deprecated and eventually removed.
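
The core of the slot design, reduced to a compilable toy (simplified
names; the real slot carries more state): a slot is either "physical" (a
formed tuple) or "virtual" (just Datum/isnull arrays), and upper plan
levels copy the arrays directly.

    #include <stdbool.h>
    #include <stddef.h>

    typedef unsigned long Datum;

    typedef struct
    {
        bool    is_virtual;
        void   *phys_tuple;     /* valid when !is_virtual */
        int     natts;
        Datum  *values;         /* valid when is_virtual */
        bool   *isnull;
    } TupleSlot;

    /* Copy column values without heap_formtuple() and without any
     * later getattr-style extraction. */
    static void
    copy_virtual(TupleSlot *dst, const TupleSlot *src)
    {
        dst->is_virtual = true;
        dst->phys_tuple = NULL;
        dst->natts = src->natts;
        for (int i = 0; i < src->natts; i++)
        {
            dst->values[i] = src->values[i];
            dst->isnull[i] = src->isnull[i];
        }
    }

    int
    main(void)
    {
        Datum v1[2] = {42, 0};
        bool  n1[2] = {false, true};
        Datum v2[2];
        bool  n2[2];
        TupleSlot a = {true, NULL, 2, v1, n1};
        TupleSlot b = {true, NULL, 2, v2, n2};

        copy_virtual(&b, &a);
        return 0;
    }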
on-the-fly, and thereby avoid blowing out memory when the planner has
underestimated the hash table size. Hash join will now obey the
work_mem limit with some faithfulness. Per my recent proposal
(hash aggregate part isn't done yet though).
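
Sketch of the growth rule (invented names): whenever the in-memory hash
table overruns its allowance, double the number of batches and evict the
tuples that no longer belong to the current batch.

    #include <stddef.h>

    typedef struct
    {
        size_t space_used;      /* bytes currently held in memory */
        size_t space_allowed;   /* the work_mem budget, in bytes */
        int    nbatch;
    } HashTableState;

    static void
    evict_tuples_not_in_current_batch(HashTableState *ht)
    {
        /* stub: really, rehash and write evicted tuples to temp files;
         * about half of the stored tuples move out on average */
        ht->space_used /= 2;
    }

    static void
    note_tuple_inserted(HashTableState *ht, size_t tuple_size)
    {
        ht->space_used += tuple_size;
        while (ht->space_used > ht->space_allowed)
        {
            ht->nbatch *= 2;    /* finer partitioning of the key space */
            evict_tuples_not_in_current_batch(ht);
        }
    }

    int
    main(void)
    {
        HashTableState ht = {0, 1024 * 1024, 1};

        note_tuple_inserted(&ht, 2 * 1024 * 1024);  /* overrun: nbatch grows */
        return 0;
    }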
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
that the inner one is completely empty. Per recent discussion. Also some
cosmetic cleanups in nearby code.
was large enough to be batched and the tuples fell into a batch where
there were no inner tuples at all. Thanks to Xiaoyu Wang for finding a
test case that exposed this long-standing bug.
list compatibility API by default. While doing this, I decided to keep
the llast() macro around and introduce llast_int() and llast_oid() variants.
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.
The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled by default once the rest of
the tree has been updated to use the new API names.
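
The essence of the new representation, as a compilable toy (simplified
from the real pg_list.h):

    #include <assert.h>
    #include <stdlib.h>

    typedef struct ListCell
    {
        void            *data;
        struct ListCell *next;
    } ListCell;

    /* The header makes lappend() and length() O(1) instead of O(n). */
    typedef struct List
    {
        int       length;
        ListCell *head;
        ListCell *tail;
    } List;

    static List *
    lappend(List *list, void *datum)
    {
        ListCell *cell = malloc(sizeof(ListCell));

        cell->data = datum;
        cell->next = NULL;
        if (list == NULL)       /* an empty list is a NULL pointer */
        {
            list = malloc(sizeof(List));
            list->length = 0;
            list->head = list->tail = NULL;
        }
        if (list->tail)
            list->tail->next = cell;    /* O(1): no walk to the end */
        else
            list->head = cell;
        list->tail = cell;
        list->length++;
        return list;
    }

    static int
    list_length(const List *list)
    {
        return list ? list->length : 0; /* O(1): no walk at all */
    }

    int
    main(void)
    {
        List *l = NULL;
        int   x = 1, y = 2;

        l = lappend(l, &x);
        l = lappend(l, &y);
        assert(list_length(l) == 2);
        return 0;
    }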
pointer type when it is not necessary to do so.
For future reference, casting NULL to a pointer type is only necessary
when (a) invoking a function AND either (b) the function has no prototype
OR (c) the function is a varargs function.
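
A concrete instance of rule (a)+(c): in a varargs call the compiler cannot
know the expected type, so a bare NULL (which may be a plain integer zero)
must be cast; in ordinary assignments and comparisons it need not be.

    #include <stdio.h>

    static void
    take_strings(const char *first, ...)    /* varargs function */
    {
        (void) first;
    }

    int
    main(void)
    {
        take_strings("a", "b", (const char *) NULL); /* cast required */

        const char *p = NULL;   /* no cast needed: ordinary assignment */

        if (p == NULL)          /* no cast needed: comparison */
            puts("ok");
        return 0;
    }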
the hashclauses field of the parent HashJoin. This avoids problems with
duplicated links to SubPlans in hash clauses, as per report from
Andrew Holm-Hansen.
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
specific hash functions used by hash indexes, rather than the old
not-datatype-aware ComputeHashFunc routine. This makes it safe to do
hash joining on several datatypes that previously couldn't use hashing.
The sets of datatypes that are hash indexable and hash joinable are now
exactly the same, whereas before each had some that weren't in the other.
when the plan is ReScanned, we don't have to rebuild the hash table
if there is no parameter change for its child node. This idea has
been used for a long time in Sort and Material nodes, but was not in
the hash code till now.
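
Sketched with invented names, the rescan rule is: if the inner subplan's
parameters did not change, its output is unchanged too, so the hash table
already built from it can simply be reused.

    #include <stdbool.h>
    #include <stddef.h>

    typedef struct
    {
        void *hashtable;            /* NULL until built */
        bool  inner_params_changed; /* did the inner subplan's params change? */
    } HashJoinRescanState;

    static void
    rescan_hash_join(HashJoinRescanState *s)
    {
        if (s->hashtable != NULL && !s->inner_params_changed)
            return;             /* keep the table; just restart the outer side */

        s->hashtable = NULL;    /* otherwise drop it and rebuild on demand */
    }

    int
    main(void)
    {
        int dummy;
        HashJoinRescanState s = {&dummy, false};

        rescan_hash_join(&s);   /* params unchanged: hash table kept */
        return 0;
    }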
which does the same thing. Perhaps at one time there was a reason to
allow plan nodes to store their result types in different places, but
AFAICT that's been unnecessary for a good while.
(materialization into a tuple store) discussed on pgsql-hackers earlier.
I've updated the documentation and the regression tests.
Notes on the implementation:
- I needed to change the tuple store API slightly -- it assumes that it
won't be used to hold data across transaction boundaries, so the temp
files that it uses for on-disk storage are automatically reclaimed at
end-of-transaction. I added a flag to tuplestore_begin_heap() to control
this behavior. Is changing the tuple store API in this fashion OK?
- in order to store executor results in a tuple store, I added a new
CommandDest. This works well for the most part, with one exception: the
current DestFunction API doesn't provide enough information to allow the
Executor to store results into an arbitrary tuple store (where the
particular tuple store to use is chosen by the call site of
ExecutorRun). To work around this, I've temporarily hacked up a solution
that works, but is not ideal: since the receiveTuple DestFunction is
passed the portal name, we can use that to lookup the Portal data
structure for the cursor and then use that to get at the tuple store the
Portal is using. This unnecessarily ties the Portal code with the
tupleReceiver code, but it works...
The proper fix for this is probably to change the DestFunction API --
Tom suggested passing the full QueryDesc to the receiveTuple function.
In that case, callers of ExecutorRun could "subclass" QueryDesc to add
any additional fields that their particular CommandDest needed to get
access to. This approach would work, but I'd like to think about it for
a little bit longer before deciding which route to go. In the meantime,
the code works fine, so I don't think a fix is urgent.
- (semi-related) I added a NO SCROLL keyword to DECLARE CURSOR, and
adjusted the behavior of SCROLL in accordance with the discussion on
-hackers.
- (unrelated) Cleaned up some SGML markup in sql.sgml, copy.sgml
Neil Conway
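
The API change to the tuple store can be pictured with a toy (invented
names; the real function is tuplestore_begin_heap): the creator says up
front whether the store's temp files must survive transaction end, as a
holdable cursor's must.

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct
    {
        bool  inter_xact;   /* keep temp files across transaction end? */
        FILE *temp_file;    /* created lazily when memory fills up */
    } TupleStore;

    static TupleStore *
    tuplestore_begin(bool inter_xact)
    {
        TupleStore *ts = malloc(sizeof(TupleStore));

        ts->inter_xact = inter_xact;
        ts->temp_file = NULL;
        return ts;
    }

    /* Transaction cleanup: ordinary stores lose their temp files,
     * holdable-cursor stores keep theirs. */
    static void
    at_end_of_transaction(TupleStore *ts)
    {
        if (!ts->inter_xact && ts->temp_file)
        {
            fclose(ts->temp_file);
            ts->temp_file = NULL;
        }
    }

    int
    main(void)
    {
        TupleStore *holdable = tuplestore_begin(true);  /* WITH HOLD cursor */
        TupleStore *ordinary = tuplestore_begin(false);

        at_end_of_transaction(ordinary);    /* files reclaimed */
        at_end_of_transaction(holdable);    /* files preserved */
        free(ordinary);
        free(holdable);
        return 0;
    }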
Try to model the effect of rescanning input tuples in mergejoins;
account for JOIN_IN short-circuiting where appropriate. Also, recognize
that mergejoin and hashjoin clauses may now be more than single operator
calls, so we have to charge appropriate execution costs.
There are two implementation techniques: the executor understands a new
JOIN_IN jointype, which emits at most one matching row per left-hand row,
or the result of the IN's sub-select can be fed through a DISTINCT filter
and then joined as an ordinary relation.
Along the way, some minor code cleanup in the optimizer; notably, break
out most of the jointree-rearrangement preprocessing in planner.c and
put it in a new file prep/prepjointree.c.
computation: reduce the bucket number mod nbatch. This changes the
association between original bucket numbers and batches, but that
doesn't matter. Minor other cleanups in hashjoin code to help
centralize decisions.
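
The centralized computation being described reduces to a couple of lines
(the function name here is invented):

    #include <stdio.h>

    static void
    get_bucket_and_batch(unsigned hashvalue, int nbuckets, int nbatch,
                         int *bucketno, int *batchno)
    {
        *bucketno = (int) (hashvalue % (unsigned) nbuckets);
        *batchno = *bucketno % nbatch;  /* batch derived from the bucket */
    }

    int
    main(void)
    {
        int bucketno, batchno;

        get_bucket_and_batch(123456789u, 1024, 4, &bucketno, &batchno);
        printf("bucket %d, batch %d\n", bucketno, batchno);
        return 0;
    }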
a per-query memory context created by CreateExecutorState --- and destroyed
by FreeExecutorState. This provides a final solution to the longstanding
problem of memory leaked by various ExecEndNode calls.