Diffstat (limited to 'src/backend/access/transam')
-rw-r--r-- | src/backend/access/transam/README | 106
-rw-r--r-- | src/backend/access/transam/xact.c | 151
2 files changed, 191 insertions, 66 deletions
diff --git a/src/backend/access/transam/README b/src/backend/access/transam/README
index 87b40591702..7c69e09cb5d 100644
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -1,4 +1,4 @@
-$PostgreSQL: pgsql/src/backend/access/transam/README,v 1.7 2007/09/05 18:10:47 tgl Exp $
+$PostgreSQL: pgsql/src/backend/access/transam/README,v 1.8 2007/09/07 20:59:26 tgl Exp $
 
 The Transaction System
 ----------------------
@@ -221,6 +221,110 @@ InvalidSubTransactionId.) Note that subtransactions do not have their own
 VXIDs; they use the parent top transaction's VXID.
 
 
+Interlocking transaction begin, transaction end, and snapshots
+--------------------------------------------------------------
+
+We try hard to minimize the amount of overhead and lock contention involved
+in the frequent activities of beginning/ending a transaction and taking a
+snapshot. Unfortunately, we must have some interlocking for this, because
+we must ensure consistency about the commit order of transactions.
+For example, suppose an UPDATE in xact A is blocked by xact B's prior
+update of the same row, and xact B is doing commit while xact C gets a
+snapshot. Xact A can complete and commit as soon as B releases its locks.
+If xact C's GetSnapshotData sees xact B as still running, then it had
+better see xact A as still running as well, or it will be able to see two
+tuple versions - one deleted by xact B and one inserted by xact A. Another
+reason why this would be bad is that C would see (in the row inserted by A)
+earlier changes by B, and it would be inconsistent for C not to see any
+of B's changes elsewhere in the database.
+
+Formally, the correctness requirement is "if A sees B as committed,
+and B sees C as committed, then A must see C as committed".
+
+What we actually enforce is strict serialization of commits and rollbacks
+with snapshot-taking: we do not allow any transaction to exit the set of
+running transactions while a snapshot is being taken. (This rule is
+stronger than necessary for consistency, but is relatively simple to
+enforce, and it assists with some other issues as explained below.) The
+implementation of this is that GetSnapshotData takes the ProcArrayLock in
+shared mode (so that multiple backends can take snapshots in parallel),
+but xact.c must take the ProcArrayLock in exclusive mode while clearing
+MyProc->xid at transaction end (either commit or abort).
+
+GetSnapshotData must in fact acquire ProcArrayLock before it calls
+ReadNewTransactionId. Otherwise it would be possible for a transaction A
+postdating the xmax to commit, and then an existing transaction B that saw
+A as committed to commit, before GetSnapshotData is able to acquire
+ProcArrayLock and finish taking its snapshot. This would violate the
+consistency requirement, because A would be still running and B not
+according to this snapshot.
+
+In short, then, the rule is that no transaction may exit the set of
+currently-running transactions between the time we fetch xmax and the time
+we finish building our snapshot. However, this restriction only applies
+to transactions that have an XID --- read-only transactions can end without
+acquiring ProcArrayLock, since they don't affect anyone else's snapshot.
+
+Transaction start, per se, doesn't have any interlocking with these
+considerations, since we no longer assign an XID immediately at transaction
+start. But when we do decide to allocate an XID, we must require
+GetNewTransactionId to store the new XID into the shared ProcArray before
+releasing XidGenLock. This ensures that when GetSnapshotData calls
+ReadNewTransactionId (which also takes XidGenLock), all active XIDs before
+the returned value of nextXid are already present in the ProcArray and
+can't be missed by GetSnapshotData. Unfortunately, we can't have
+GetNewTransactionId take ProcArrayLock to do this, else it could deadlock
+against GetSnapshotData. Therefore, we simply let GetNewTransactionId
+store into MyProc->xid without any lock. We are thereby relying on
+fetch/store of an XID to be atomic, else other backends might see a
+partially-set XID. (NOTE: for multiprocessors that need explicit memory
+access fence instructions, this means that acquiring/releasing XidGenLock
+is just as necessary as acquiring/releasing ProcArrayLock for
+GetSnapshotData to ensure it sees up-to-date xid fields.) This also means
+that readers of the ProcArray xid fields must be careful to fetch a value
+only once, rather than assume they can read it multiple times and get the
+same answer each time.
+
+Another important activity that uses the shared ProcArray is GetOldestXmin,
+which must determine a lower bound for the oldest xmin of any active MVCC
+snapshot, system-wide. Each individual backend advertises the smallest
+xmin of its own snapshots in MyProc->xmin, or zero if it currently has no
+live snapshots (eg, if it's between transactions or hasn't yet set a
+snapshot for a new transaction). GetOldestXmin takes the MIN() of the
+valid xmin fields. It does this with only shared lock on ProcArrayLock,
+which means there is a potential race condition against other backends
+doing GetSnapshotData concurrently: we must be certain that a concurrent
+backend that is about to set its xmin does not compute an xmin less than
+what GetOldestXmin returns. We ensure that by including all the active
+XIDs into the MIN() calculation, along with the valid xmins. The rule that
+transactions can't exit without taking exclusive ProcArrayLock ensures that
+concurrent holders of shared ProcArrayLock will compute the same minimum of
+currently-active XIDs: no xact, in particular not the oldest, can exit
+while we hold shared ProcArrayLock. So GetOldestXmin's view of the minimum
+active XID will be the same as that of any concurrent GetSnapshotData, and
+so it can't produce an overestimate. If there is no active transaction at
+all, GetOldestXmin returns the result of ReadNewTransactionId. Note that
+two concurrent executions of GetOldestXmin might not see the same result
+from ReadNewTransactionId --- but if there is a difference, the intervening
+execution(s) of GetNewTransactionId must have stored their XIDs into the
+ProcArray, so the later execution of GetOldestXmin will see them and
+compute the same global xmin anyway.
+
+GetSnapshotData also performs an oldest-xmin calculation (which had better
+match GetOldestXmin's) and stores that into RecentGlobalXmin, which is used
+for some tuple age cutoff checks where a fresh call of GetOldestXmin seems
+too expensive. Note that while it is certain that two concurrent
+executions of GetSnapshotData will compute the same xmin for their own
+snapshots, as argued above, it is not certain that they will arrive at the
+same estimate of RecentGlobalXmin. This is because we allow XID-less
+transactions to clear their MyProc->xmin asynchronously (without taking
+ProcArrayLock), so one execution might see what had been the oldest xmin,
+and another not. This is OK since RecentGlobalXmin need only be a valid
+lower bound. As noted above, we are already assuming that fetch/store
+of the xid fields is atomic, so assuming it for xmin as well is no extra
+risk.
+
+
 pg_clog and pg_subtrans
 -----------------------
 
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 2e972d56f60..02b064179f2 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -10,7 +10,7 @@
  *
  *
  * IDENTIFICATION
- *    $PostgreSQL: pgsql/src/backend/access/transam/xact.c,v 1.248 2007/09/05 18:10:47 tgl Exp $
+ *    $PostgreSQL: pgsql/src/backend/access/transam/xact.c,v 1.249 2007/09/07 20:59:26 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -747,6 +747,8 @@ AtSubStart_ResourceOwner(void)
 
 /*
  *  RecordTransactionCommit
+ *
+ * This is exported only to support an ugly hack in VACUUM FULL.
  */
 void
 RecordTransactionCommit(void)
@@ -1552,46 +1554,53 @@ CommitTransaction(void)
      */
     RecordTransactionCommit();
 
-    /*----------
+    PG_TRACE1(transaction__commit, MyProc->lxid);
+
+    /*
      * Let others know about no transaction in progress by me. Note that
      * this must be done _before_ releasing locks we hold and _after_
      * RecordTransactionCommit.
      *
-     * LWLockAcquire(ProcArrayLock) is required; consider this example:
-     *      UPDATE with xid 0 is blocked by xid 1's UPDATE.
-     *      xid 1 is doing commit while xid 2 gets snapshot.
-     * If xid 2's GetSnapshotData sees xid 1 as running then it must see
-     * xid 0 as running as well, or it will be able to see two tuple versions
-     * - one deleted by xid 1 and one inserted by xid 0. See notes in
-     * GetSnapshotData.
-     *
      * Note: MyProc may be null during bootstrap.
-     *----------
      */
     if (MyProc != NULL)
     {
-        /*
-         * Lock ProcArrayLock because that's what GetSnapshotData uses.
-         * You might assume that we can skip this step if we had no
-         * transaction id assigned, because the failure case outlined
-         * in GetSnapshotData cannot happen in that case. This is true,
-         * but we *still* need the lock guarantee that two concurrent
-         * computations of the *oldest* xmin will get the same result.
-         */
-        LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
-        MyProc->xid = InvalidTransactionId;
-        MyProc->lxid = InvalidLocalTransactionId;
-        MyProc->xmin = InvalidTransactionId;
-        MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
+        if (TransactionIdIsValid(MyProc->xid))
+        {
+            /*
+             * We must lock ProcArrayLock while clearing MyProc->xid, so
+             * that we do not exit the set of "running" transactions while
+             * someone else is taking a snapshot. See discussion in
+             * src/backend/access/transam/README.
+             */
+            LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
 
-        /* Clear the subtransaction-XID cache too while holding the lock */
-        MyProc->subxids.nxids = 0;
-        MyProc->subxids.overflowed = false;
+            MyProc->xid = InvalidTransactionId;
+            MyProc->lxid = InvalidLocalTransactionId;
+            MyProc->xmin = InvalidTransactionId;
+            MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
 
-        LWLockRelease(ProcArrayLock);
-    }
+            /* Clear the subtransaction-XID cache too while holding the lock */
+            MyProc->subxids.nxids = 0;
+            MyProc->subxids.overflowed = false;
 
-    PG_TRACE1(transaction__commit, s->transactionId);
+            LWLockRelease(ProcArrayLock);
+        }
+        else
+        {
+            /*
+             * If we have no XID, we don't need to lock, since we won't
+             * affect anyone else's calculation of a snapshot. We might
+             * change their estimate of global xmin, but that's OK.
+             */
+            MyProc->lxid = InvalidLocalTransactionId;
+            MyProc->xmin = InvalidTransactionId;
+            MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
+
+            Assert(MyProc->subxids.nxids == 0);
+            Assert(MyProc->subxids.overflowed == false);
+        }
+    }
 
     /*
      * This is all post-commit cleanup. Note that if an error is raised here,
@@ -1815,28 +1824,21 @@ PrepareTransaction(void)
      * Let others know about no transaction in progress by me. This has to be
      * done *after* the prepared transaction has been marked valid, else
      * someone may think it is unlocked and recyclable.
+     *
+     * We can skip locking ProcArrayLock here, because this action does not
+     * actually change anyone's view of the set of running XIDs: our entry
+     * is duplicate with the gxact that has already been inserted into the
+     * ProcArray.
      */
-
-    /*
-     * Lock ProcArrayLock because that's what GetSnapshotData uses.
-     * You might assume that we can skip this step if we have no
-     * transaction id assigned, because the failure case outlined
-     * in GetSnapshotData cannot happen in that case. This is true,
-     * but we *still* need the lock guarantee that two concurrent
-     * computations of the *oldest* xmin will get the same result.
-     */
-    LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
     MyProc->xid = InvalidTransactionId;
     MyProc->lxid = InvalidLocalTransactionId;
     MyProc->xmin = InvalidTransactionId;
     MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
 
-    /* Clear the subtransaction-XID cache too while holding the lock */
+    /* Clear the subtransaction-XID cache too */
     MyProc->subxids.nxids = 0;
     MyProc->subxids.overflowed = false;
 
-    LWLockRelease(ProcArrayLock);
-
     /*
      * This is all post-transaction cleanup. Note that if an error is raised
      * here, it's too late to abort the transaction. This should be just
@@ -1987,36 +1989,55 @@ AbortTransaction(void)
      */
     RecordTransactionAbort(false);
 
+    PG_TRACE1(transaction__abort, MyProc->lxid);
+
     /*
      * Let others know about no transaction in progress by me. Note that this
      * must be done _before_ releasing locks we hold and _after_
      * RecordTransactionAbort.
+     *
+     * Note: MyProc may be null during bootstrap.
      */
     if (MyProc != NULL)
     {
-        /*
-         * Lock ProcArrayLock because that's what GetSnapshotData uses.
-         * You might assume that we can skip this step if we have no
-         * transaction id assigned, because the failure case outlined
-         * in GetSnapshotData cannot happen in that case. This is true,
-         * but we *still* need the lock guarantee that two concurrent
-         * computations of the *oldest* xmin will get the same result.
-         */
-        LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
-        MyProc->xid = InvalidTransactionId;
-        MyProc->lxid = InvalidLocalTransactionId;
-        MyProc->xmin = InvalidTransactionId;
-        MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
-        MyProc->inCommit = false;    /* be sure this gets cleared */
-
-        /* Clear the subtransaction-XID cache too while holding the lock */
-        MyProc->subxids.nxids = 0;
-        MyProc->subxids.overflowed = false;
-
-        LWLockRelease(ProcArrayLock);
-    }
+        if (TransactionIdIsValid(MyProc->xid))
+        {
+            /*
+             * We must lock ProcArrayLock while clearing MyProc->xid, so
+             * that we do not exit the set of "running" transactions while
+             * someone else is taking a snapshot. See discussion in
+             * src/backend/access/transam/README.
+             */
+            LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
 
-    PG_TRACE1(transaction__abort, s->transactionId);
+            MyProc->xid = InvalidTransactionId;
+            MyProc->lxid = InvalidLocalTransactionId;
+            MyProc->xmin = InvalidTransactionId;
+            MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
+            MyProc->inCommit = false;    /* be sure this gets cleared */
+
+            /* Clear the subtransaction-XID cache too while holding the lock */
+            MyProc->subxids.nxids = 0;
+            MyProc->subxids.overflowed = false;
+
+            LWLockRelease(ProcArrayLock);
+        }
+        else
+        {
+            /*
+             * If we have no XID, we don't need to lock, since we won't
+             * affect anyone else's calculation of a snapshot. We might
+             * change their estimate of global xmin, but that's OK.
+             */
+            MyProc->lxid = InvalidLocalTransactionId;
+            MyProc->xmin = InvalidTransactionId;
+            MyProc->inVacuum = false;    /* must be cleared with xid/xmin */
+            MyProc->inCommit = false;    /* be sure this gets cleared */
+
+            Assert(MyProc->subxids.nxids == 0);
+            Assert(MyProc->subxids.overflowed == false);
+        }
+    }
 
     /*
      * Post-abort cleanup. See notes in CommitTransaction() concerning
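
The rule this patch applies at commit and abort (take ProcArrayLock exclusively only when the backend actually holds an XID) and the lower-bound calculation described for GetOldestXmin can be illustrated with a small toy model. This is a minimal single-threaded sketch, not PostgreSQL code: every `toy_`/`Toy` name is hypothetical, the "lock" is just a counter so the two code paths can be told apart, and XIDs are compared with plain `<`, which ignores the wraparound handling that real XID comparisons need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for TransactionId and PGPROC; not the real types. */
typedef uint32_t ToyXid;
#define TOY_INVALID_XID ((ToyXid) 0)
#define TOY_MAX_PROCS 4

typedef struct ToyProc
{
    ToyXid xid;     /* assigned XID, or TOY_INVALID_XID if none */
    ToyXid xmin;    /* advertised snapshot xmin, or TOY_INVALID_XID */
} ToyProc;

ToyProc toy_procs[TOY_MAX_PROCS];   /* toy "ProcArray", zero-initialized */
ToyXid  toy_next_xid = 100;         /* toy "nextXid" counter */
int     toy_exclusive_acquisitions = 0;

/* The lock is modeled as a counter only; there is no real concurrency here. */
void toy_lock_exclusive(void) { toy_exclusive_acquisitions++; }
void toy_unlock(void) { }

/*
 * End a transaction. Returns true if the exclusive lock was taken.
 * A backend with an XID leaves the set of running transactions, so it
 * must not do that while someone else is building a snapshot. A backend
 * without an XID only clears its xmin, which cannot change anyone's
 * snapshot, so it skips the lock.
 */
bool toy_end_transaction(ToyProc *proc)
{
    if (proc->xid != TOY_INVALID_XID)
    {
        toy_lock_exclusive();
        proc->xid = TOY_INVALID_XID;
        proc->xmin = TOY_INVALID_XID;
        toy_unlock();
        return true;
    }
    proc->xmin = TOY_INVALID_XID;
    return false;
}

/*
 * Lower bound for the oldest xmin: the minimum over all valid xids and
 * valid xmins, falling back to the next XID when nothing is active.
 * Each shared field is fetched exactly once into a local variable.
 */
ToyXid toy_get_oldest_xmin(void)
{
    ToyXid result = toy_next_xid;

    for (int i = 0; i < TOY_MAX_PROCS; i++)
    {
        ToyXid xid = toy_procs[i].xid;
        ToyXid xmin = toy_procs[i].xmin;

        if (xid != TOY_INVALID_XID && xid < result)
            result = xid;
        if (xmin != TOY_INVALID_XID && xmin < result)
            result = xmin;
    }
    return result;
}
```

In this model, ending an XID-less transaction can raise the value returned by `toy_get_oldest_xmin` without any lock being taken, which mirrors the README's point that such a change only affects an estimate that needs to be a valid lower bound.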