| Commit message | Author | Age |
| |
While the in-core authentication mechanism doesn't need to access
pg_shseclabel at all, it's reasonable to think that an authentication
hook will want to look at the label for the role logging in, or for rows
in other catalogs used during the authentication phase of startup.
Catalog version bumped, because this changes the "is nailed" status for
pg_shseclabel.
Author: Adam Brightwell
| |
Some time back we agreed that row_security=off should not be a way to
bypass RLS entirely, but only a way to get an error if it was being
applied. However, the code failed to act that way for table owners.
Per discussion, this is a must-fix bug for 9.5.0.
Adjust the logic in rls.c to behave as expected; also, modify the
error message to be more consistent with the new interpretation.
The regression tests need minor corrections as well. Also update
the comments about row_security in ddl.sgml to be correct. (The
official description of the GUC in config.sgml is already correct.)
I failed to resist the temptation to do some other very minor
cleanup as well, such as getting rid of a duplicate extern declaration.
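As a sketch of the new behavior (hypothetical table and role names), a
non-owner without BYPASSRLS now gets an error instead of a silent bypass:
    SET row_security = off;
    SELECT * FROM rls_protected_table;
    -- ERROR:  query would be affected by row-level security policy
    --         for table "rls_protected_table"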
| |
We discussed this but somehow failed to implement it...
| |
Aside from any consistency arguments, this is logically necessary because
the I/O functions for these types also handle numeric OID values. Without
a quoting rule it is impossible to distinguish numeric OIDs from role or
namespace names that happen to contain only digits.
Also change the to_regrole and to_regnamespace functions to dequote their
arguments. While not logically essential, this seems like a good idea
since the other to_reg* functions do it. Anyone who really wants raw
lookup of an uninterpreted name can fall back on the time-honored solution
of (SELECT oid FROM pg_namespace WHERE nspname = whatever).
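For illustration (hypothetical role name), the quoting rules now work
roughly like this:
    CREATE ROLE "12345";           -- a role name consisting only of digits
    SELECT oid::regrole FROM pg_authid WHERE rolname = '12345';
    -- prints "12345", quoted, so it can't be mistaken for a numeric OID
    SELECT to_regrole('"12345"');  -- now dequotes, finding the role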
Report and patch by Jim Nasby, reviewed by Michael Paquier
| |
Can't release the AccessExclusiveLock on the target table until commit.
Otherwise there is a race condition whereby other backends might service
our cache invalidation signals before they can actually see the updated
catalog rows.
Just to add insult to injury, RemovePolicyById was closing the rel (with
incorrect lock drop) and then passing the now-dangling rel pointer to
CacheInvalidateRelcache. Probably the reason this doesn't fall over on
CLOBBER_CACHE buildfarm members is that some outer level of the DROP logic
is still holding the rel open ... but it'd have bitten us on the arse
eventually, no doubt.
| |
The CHECK_IS_BINARY_UPGRADE macro is not sufficient security protection
if we're going to dereference pass-by-reference arguments before it.
But in any case we really need to explicitly check PG_ARGISNULL for all
the arguments of a non-strict function, not only the ones we expect null
values for.
Oversight in commits 30982be4e5019684e1772dd9170aaa53f5a8e894 and
f92fc4c95ddcc25978354a8248d3df22269201bc. Found by Andreas Seltenreich.
(The other usages in pg_upgrade_support.c seem safe.)
| |
pgwin32_recv() has treated a non-error return of zero bytes from WSARecv()
as being a reason to block ever since the current implementation was
introduced in commit a4c40f140d23cefb. However, so far as one can tell
from Microsoft's documentation, that is just wrong: what it means is
graceful connection closure (in stream protocols) or receipt of a
zero-length message (in message protocols), and neither case should result
in blocking here. The only reason the code worked at all was that control
then fell into the retry loop, which did *not* treat zero bytes specially,
so we'd get out after only wasting some cycles. But as of 9.5 we do not
normally reach the retry loop and so the bug is exposed, as reported by
Shay Rojansky and diagnosed by Andres Freund.
Remove the unnecessary test on the byte count, and rearrange the code
in the retry loop so that it looks identical to the initial sequence.
Back-patch to 9.5. The code is wrong all the way back, AFAICS, but
since it's relatively harmless in earlier branches we'll leave it alone.
| |
spg_text_inner_consistent is capable of reconstructing an empty string
to pass down to the next index level; this happens if we have an empty
string coming in, no prefix, and a dummy node label. (In practice, what
is needed to trigger that is insertion of a whole bunch of empty-string
values.) Then, we will arrive at the next level with in->level == 0
and a non-NULL (but zero length) in->reconstructedValue, which is valid
but the Assert tests weren't expecting it.
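A reproduction sketch (hypothetical names) that drives many empty strings
into an SP-GiST text index:
    CREATE TABLE t (s text);
    CREATE INDEX t_s_idx ON t USING spgist (s);
    INSERT INTO t SELECT '' FROM generate_series(1, 10000);
    -- in assert-enabled builds, this could trip the overly tight Asserts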
Per report from Andreas Seltenreich. This has no impact in non-Assert
builds, so should not be a problem in production, but back-patch to
all affected branches anyway.
In passing, remove a couple of useless variable initializations and
shorten the code by not duplicating DatumGetPointer() calls.
| |
Backpatch certain files through 9.1
| |
Jeff Janes, reviewed by Jim Nasby.
| |
flatten_reloptions() supposed that it didn't really need to do anything
beyond inserting commas between reloption array elements. However, in
principle the value of a reloption could be nearly anything, since the
grammar allows a quoted string there. Any restrictions on it would come
from validity checking appropriate to the particular option, if any.
A reloption value that isn't a simple identifier or number could thus lead
to dump/reload failures due to syntax errors in CREATE statements issued
by pg_dump. We've gotten away with not worrying about this so far with
the core-supported reloptions, but extensions might allow reloption values
that cause trouble, as in bug #13840 from Kouhei Sutou.
To fix, split the reloption array elements explicitly, and then convert
any value that doesn't look like a safe identifier to a string literal.
(The details of the quoting rule could be debated, but this way is safe
and requires little code.) While we're at it, also quote reloption names
if they're not safe identifiers; that may not be a likely problem in the
field, but we might as well try to be bulletproof here.
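For instance, with a hypothetical extension-defined reloption whose value
isn't a bare identifier:
    CREATE INDEX idx ON tbl USING some_am (col)
        WITH (tokenizer = 'foo bar');
    -- pg_dump now emits WITH (tokenizer='foo bar') rather than an
    -- unquoted value that would cause a syntax error on reload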
It's been like this for a long time, so back-patch to all supported
branches.
Kouhei Sutou, adjusted some by me
| |
A report from Andy Colson showed that gincostestimate() was not being
nearly paranoid enough about whether to believe the statistics it finds in
the index metapage. The problem is that the metapage stats (other than the
pending-pages count) are only updated by VACUUM, and in the worst case
could still reflect the index's original empty state even when it has grown
to many entries. We attempted to deal with that by scaling up the stats to
match the current index size, but if nEntries is zero then scaling it up
still gives zero. Moreover, the proportion of pages that are entry pages
vs. data pages vs. pending pages is unlikely to be estimated very well by
scaling if the index is now orders of magnitude larger than before.
We can improve matters by expanding the use of the rule-of-thumb estimates
I introduced in commit 7fb008c5ee59b040: if the index has grown by more
than a cutoff amount (here set at 4X growth) since VACUUM, then use the
rule-of-thumb numbers instead of scaling. This might not be exactly right
but it seems much less likely to produce insane estimates.
I also improved both the scaling estimate and the rule-of-thumb estimate
to account for numPendingPages, since it's reasonable to expect that that
is accurate in any case, and certainly pages that are in the pending list
are not either entry or data pages.
As a somewhat separate issue, adjust the estimation equations that are
concerned with extra fetches for partial-match searches. These equations
suppose that a fraction partialEntries / numEntries of the entry and data
pages will be visited as a consequence of a partial-match search. Now,
it's physically impossible for that fraction to exceed one, but our
estimate of partialEntries is mostly bunk, and our estimate of numEntries
isn't exactly gospel either, so we could arrive at a silly value. In the
example presented by Andy we were coming out with a value of 100, leading
to insane cost estimates. Clamp the fraction to one to avoid that.
Like the previous patch, back-patch to all supported branches; this
problem can be demonstrated in one form or another in all of them.
| |
Recovery does not achieve its goal of zeroing all pg_multixact entries
whose accompanying WAL records never reached disk. Remove that claim
and justify its expendability. Detail the need for TrimMultiXact(),
which has little in common with the TrimCLOG() rationale. Merge two
tightly-related comments. Stop presenting pg_multixact as specific to
heap_lock_tuple(); PostgreSQL 9.3 extended its use to heap_update().
Noticed while investigating a report from Andres Freund.
| |
Fix an oversight in commit 321eed5f0f7563a0: replacing an operator's
selectivity functions needs to result in a corresponding update in
pg_depend. We have a function that can handle that, but it was not
called by AlterOperator().
To fix this without enlarging pg_operator.h's #include list beyond
what clients can safely include, split off the function definitions
into a new file pg_operator_fn.h, similarly to what we've done for
some other catalog header files. It's not entirely clear whether
any client-side code needs to include pg_operator.h, but it seems
prudent to assume that there is some such code somewhere.
| |
In the extreme edge case where contended pages are the only ones that
escape being scanned, the previous commit would have allowed us to think
that relfrozenxid could be advanced, which is exactly wrong.
| |
VACUUM can skip heap pages altogether when there's a run of consecutive
pages that are all-visible according to the visibility map. This causes it
to not update its nonempty_pages count, just as if those pages were empty,
which means that at the end we will think they are candidates for deletion.
Thus, we may take the table's AccessExclusive lock only to find that no
pages are really truncatable. This usually causes no real problems on a
master server, thanks to the lock being acquired only conditionally; but on
hot-standby servers, the same lock must be acquired unconditionally which
can result in unnecessary query cancellations.
To improve matters, force examination of the table's last page whenever
we reach there with a nonempty_pages count that would allow a truncation
attempt. If it's not empty, we'll advance nonempty_pages and thereby
prevent the truncation attempt.
If we are unable to acquire cleanup lock on that page, there's no need to
force it, unless we're doing an anti-wraparound vacuum. We can just check
for tuples with a shared buffer lock and then give up. (When we are doing
an anti-wraparound vacuum, and decide it's okay to skip the page because it
contains no freezable tuples, this patch still improves matters because
nonempty_pages is properly updated, which it was not before.)
Since only the last page is special-cased in this way, we might attempt a
truncation that will release many fewer pages than the normal heuristic
would suggest; at worst, only one page would be truncated. But that seems
all right, because the situation won't repeat during the next vacuum.
The real problem with the old logic is that the useless truncation attempt
happens every time we vacuum, so long as the state of the last few dozen
pages doesn't change.
This is a longstanding deficiency, but since the consequences aren't very
severe in most scenarios, I'm not going to risk a back-patch.
Jeff Janes and Tom Lane
| |
The rationale for the way targetlist processing is done wasn't clearly
stated anywhere, and I for one had forgotten some of the details. Having
just painfully re-learned them, add some breadcrumbs for the next person.
| |
Commit 6f8cb1e23485bd6d tried to centralize rewriteTargetView's copying
of a target view's Query struct. However, it ignored the fact that the
jointree->quals field was used twice. This only accidentally failed to
fail immediately because the same ChangeVarNodes mutation is applied in
both cases, so that we end up with logically identical expression trees
for both uses (and, as the code stands, the second ChangeVarNodes call
actually does nothing). However, we end up linking *physically*
identical expression trees into both an RTE's securityQuals list and
the WithCheckOption list. That's pretty dangerous, mainly because
prepsecurity.c is utterly cavalier about further munging such structures
without copying them first.
There may be no live bug in HEAD as a consequence of the fact that we apply
preprocess_expression in between here and prepsecurity.c, and that will
make a copy of the tree anyway. Or it may just be that the regression
tests happen to not trip over it. (I noticed this only because things
fell over pretty badly when I tried to relocate the planner's call of
expand_security_quals to before expression preprocessing.) In any case
it's very fragile because if anyone tried to make the securityQuals and
WithCheckOption trees diverge before we reach preprocess_expression, it
would not work. The fact that the current code will preprocess
securityQuals and WithCheckOptions lists at completely different times in
different query levels does nothing to increase my trust that that can't
happen.
In view of the fact that 9.5.0 is almost upon us and the aforesaid commit
has seen exactly zero field testing, the prudent course is to make an extra
copy of the quals so that the behavior is not different from what has been
in the field during beta.
| |
The variables newestCommitTs and oldestCommitTs sound as if they are
timestamps, but in fact they are the transaction Ids that correspond
to the newest and oldest timestamps rather than the actual timestamps.
Rename these variables to reflect that they are actually xids: to wit,
newestCommitTsXid and oldestCommitTsXid, respectively. Also modify
related code in a similar fashion, particularly the user-facing output
emitted by pg_controldata and pg_resetxlog.
Complaint and patch by me, review by Tom Lane and Alvaro Herrera.
Backpatch to 9.5 where these variables were first introduced.
| |
MergeAttributes() rejects cases where columns to be merged have the same
type but different typmod, which is correct; but the error message it
printed didn't show either typmod, which is unhelpful. Changing this
requires using format_type_with_typemod() in place of TypeNameToString(),
which will have some minor side effects on the way some type names are
printed, but on balance this is an improvement: the old code sometimes
printed one type according to one set of rules and the other type according
to the other set, which could be confusing in its own way.
Oddly, there were no regression test cases covering any of this behavior,
so add some.
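For example (sketch), the message now shows both typmods:
    CREATE TABLE p1 (a varchar(10));
    CREATE TABLE p2 (a varchar(20));
    CREATE TABLE c () INHERITS (p1, p2);
    -- ERROR:  inherited column "a" has a type conflict
    -- DETAIL:  character varying(10) versus character varying(20)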
Complaint and fix by Amit Langote
| |
brin_summarize_new_values() did not check that the passed OID was for
an index at all, much less that it was a BRIN index, and would fail in
obscure ways if it wasn't (possibly damaging data first?). It also
lacked any permissions test; by analogy to VACUUM, we should only allow
the table's owner to summarize.
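A sketch of the tightened behavior (hypothetical index names):
    SELECT brin_summarize_new_values('my_brin_idx'::regclass);
    -- allowed only for the table's owner
    SELECT brin_summarize_new_values('my_btree_idx'::regclass);
    -- now fails cleanly, roughly: ERROR:  "my_btree_idx" is not a BRIN index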
Noted by Jeff Janes, fix by Michael Paquier and me
| |
The original coding read tuples from workers in round-robin fashion,
but performance testing shows that it works much better to read enough
to empty one queue before moving on to the next. I believe the
reason for this is that, with the old approach, we could easily wake
up a worker repeatedly to write only one new tuple into the shm_mq
each time. With this approach, by the time the process gets scheduled,
it has a decent chance of being able to fill the entire buffer in
one go.
Patch by me. Dilip Kumar helped with performance testing.
| |
This should have been part of the original commit, but was missed.
Pushing data between processes is expensive, so we definitely want
to project away unneeded columns here, just as we do for other nodes
like Sort and Hash that care about the volume of data.
| |
'\"' is more commonly written simply as '"'.
| |
Omitted boundaries represent the upper or lower limit of the corresponding
array subscript. This allows simpler specification of many common
use-cases.
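For example:
    SELECT ('{1,2,3,4,5}'::int[])[2:];  -- {2,3,4,5}: upper bound omitted
    SELECT ('{1,2,3,4,5}'::int[])[:3];  -- {1,2,3}: lower bound omitted
    SELECT ('{1,2,3,4,5}'::int[])[:];   -- {1,2,3,4,5}: both omitted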
(Revised version of commit 9246af6799819847faa33baf441251003acbb8fe)
YUriy Zhuravlev
| |
Peter Geoghegan and Robert Haas
| |
When use_remote_estimate is enabled, consider adding ORDER BY to the
query we send to the remote server so that we can use that ordered
data for a merge join. Commit f18c944b6137329ac4a6b2dce5745c5dc21a8578
arranges to push down the query pathkeys, which seems like the case
most likely to be a win, but testing shows this can sometimes win,
too.
For a regular table, we know which indexes are present and therefore
test whether the ordering provided by each such index is useful. Here,
we take the opposite approach: guess what orderings would be useful if
they could be generated cheaply, and then ask the remote side what those
will cost.
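A sketch (hypothetical tables) of the kind of plan this enables:
    EXPLAIN (VERBOSE, COSTS OFF)
        SELECT * FROM foreign_tab f JOIN local_tab l ON f.a = l.a;
    --  Merge Join
    --    ...
    --    ->  Foreign Scan on public.foreign_tab f
    --          Remote SQL: SELECT a, b FROM public.tab
    --                      ORDER BY a ASC NULLS LAST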
Ashutosh Bapat, with very substantial cosmetic revisions by me. Also
reviewed by Rushabh Lathia.
| |
Rather than expect the Query returned by get_view_query() to be
read-only and then copy bits and pieces of it out, simply copy the
entire structure when we get it. This addresses an issue where
AcquireRewriteLocks, which is called by acquireLocksOnSubLinks(),
scribbles on the parsetree passed in, which was actually an entry
in the relcache, leading to segfaults with certain view definitions.
This also future-proofs us a bit for anyone adding more code to this
path.
The acquireLocksOnSubLinks() call was added in commit c3e0ddd40.
Back-patch to 9.3 as that commit was.
| |
I missed too much. The patch is returned to the commitfest process.
| |
Kyotaro Horiguchi
| |
This was a silly goof on my (rhaas's) part.
Report and fix by Rushabh Lathia.
| |
This could result in the error context misidentifying where the error
actually occurred.
Craig Ringer
| |
Amit Langote
| |
Allow omitting the lower or upper boundary, or both, in an array
subscript, for selecting a slice of an array.
Author: YUriy Zhuravlev
| |
Previously, each subroutine in initdb fired up its own standalone backend
session. Over time we'd grown as many as fifteen of these sessions,
and the cumulative startup and shutdown work for them was getting pretty
noticeable. Combining things so that all these steps share a single
backend session cuts a good 10% off the total runtime of initdb, more
if you're not fsync'ing.
The main stumbling block to doing this before was that some of the sessions
were run with -j and some not. The improved definition of -j mode
implemented by my previous commit makes it possible to fix that by running
all the post-bootstrap steps with -j; we just have to use double instead of
single newlines to end command strings. (This is only absolutely necessary
around the VACUUM and CREATE DATABASE steps, since those can't be run in a
transaction block. But it seems best to make them all use double newlines
so that the commands remain separate for error-reporting purposes.)
A minor disadvantage is that since initdb can't tell how much of its
output the backend has executed, we can no longer have the per-step
progress reporting initdb used to print. But things are fast enough
nowadays that that's not really all that useful anyway.
In passing, add more const decoration to some of the static arrays in
initdb.c.
| |
Previously, -j caused the entire input file to be read in and executed as
a single command string. That's undesirable, not least because any error
causes the entire file to be regurgitated as the "failing query". Some
experimentation suggests a better rule: end the command string when we see
a semicolon immediately followed by two newlines, ie, an empty line after
a query. This serves nicely to break up the existing examples such as
information_schema.sql and system_views.sql. A limitation is that it's
no longer possible to write such a sequence within a string literal or
multiline comment in a file meant to be read with -j; but there are no
instances of such a problem within the data currently used by initdb.
(If someone does make such a mistake in future, it'll be obvious because
they'll get an unterminated-literal or unterminated-comment syntax error.)
Other than that, there shouldn't be any negative consequences; you're not
forced to end statements that way, it's just a better idea in most cases.
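A sketch of how a -j input file is now split:
    -- each semicolon followed by an empty line ends one command string;
    -- newlines within a statement no longer matter
    CREATE VIEW v1 AS
        SELECT 1 AS x;

    CREATE VIEW v2 AS
        SELECT 2 AS y;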
In passing, remove src/include/tcop/tcopdebug.h, which is dead code
because it's not included anywhere, and hasn't been for more than
ten years. One of the debug-support symbols it purported to describe
has been unreferenced for at least the same amount of time, and the
other is removed by this commit on the grounds that it was useless:
forcing -j mode all the time would have broken initdb. The lack of
complaints about that, or about the missing inclusion, shows that
no one has tried to use TCOP_DONTUSENEWLINE in many years.
| |
This is necessary so that REASSIGN OWNED does the right thing with
composite types, to wit, that it also alters ownership of the type's
pg_class entry -- previously, the pg_class entry remained owned by the
original user, which later caused other failures, such as the new owner's
inability to use ALTER TYPE to rename an attribute of the affected
composite. Also, if the original owner is later dropped, the pg_class
entry becomes owned by a non-existent user, which is bogus.
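A sketch of the failure mode (hypothetical names):
    CREATE TYPE comp AS (a int, b text);    -- created by old_role
    REASSIGN OWNED BY old_role TO new_role;
    -- previously pg_class still recorded old_role as owner, so this
    -- failed for new_role with an ownership error:
    ALTER TYPE comp RENAME ATTRIBUTE b TO c;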
To fix, create a new routine AlterTypeOwner_oid which knows whether to
pass the request to ATExecChangeOwner or deal with it directly, and use
that in shdepReassignOwner rather than calling AlterTypeOwnerInternal
directly. AlterTypeOwnerInternal is now simpler in that it only
modifies the pg_type entry and recurses to handle a possible array type;
higher-level tasks are handled by either AlterTypeOwner directly or
AlterTypeOwner_oid.
I took the opportunity to add a few more objects to the test rig for
REASSIGN OWNED, so that more cases are exercised. Additional ones could
be added for superuser-only-ownable objects (such as FDWs and event
triggers) but I didn't want to push my luck by adding a new superuser to
the tests on a backpatchable bug fix.
Per bug #13666 reported by Chris Pacejo.
Backpatch to 9.5.
(I would back-patch this all the way back, except that it doesn't apply
cleanly in 9.4 and earlier because 59367fdf9 wasn't backpatched. If we
decide that we need this in earlier branches too, we should backpatch
both.)
| |
Encode TIDs as 64-bit integers to speed up comparisons. This seems to
speed things up on all platforms, but is even more beneficial when
8-byte integers are passed by value.
Peter Geoghegan. Design suggestions and review by Tom Lane. Review
also by Simon Riggs and by me.
| |
FOREIGN KEY constraints have behaved this way for a long time, but for
some reason the behavior of CHECK constraints has been inconsistent up
until now.
Amit Langote and Amul Sul, with assorted tweaks by me.
| |
It's entirely surprising that mdnblocks() has the side effect of
creating new files on disk, so let's make it not do that. One
consequence of the old behavior is that, if running on a damaged
cluster that is missing a file, mdnblocks() can recreate the file
and allow a subsequent _mdfd_getseg() for a higher segment to succeed.
This happens because, while mdnblocks() stops when it finds a segment
that is shorter than 1GB, _mdfd_getseg() has no such check, and thus
the empty file created by mdnblocks() can allow it to continue its
traversal and find higher-numbered segments which remain.
It might be a good idea for _mdfd_getseg() to actually verify that
each segment it finds is exactly 1GB before proceeding to the next
one, but that would involve some additional system calls, so for
now I'm just doing this much.
Patch by me, per off-list analysis by Kevin Grittner and Rahila Syed.
Review by Andres Freund.
| |
Move the content lock directly into the BufferDesc, so that locking and
pinning a buffer touches only one cache line rather than two. Adjust
the definition of BufferDesc slightly so that this doesn't make the
BufferDesc any larger than one cache line (at least on platforms where
a spinlock is only 1 or 2 bytes).
We can't fit the I/O locks into the BufferDesc and stay within one
cache line, so move those to a completely separate tranche. This
leaves a relatively limited number of LWLocks in the main tranche, so
increase the padding of those remaining locks to a full cache line,
rather than allowing adjacent locks to share a cache line, hopefully
reducing false sharing.
Performance testing shows that these changes make little difference
on laptop-class machines, but help significantly on larger servers,
especially those with more than 2 sockets.
Andres Freund, originally based on an earlier patch by Simon Riggs.
Review and cosmetic adjustments (including heavy rewriting of the
comments) by me.
| |
It's a bit cumbersome to use LWLockNewTrancheId(), because the returned
value needs to be shared between backends so that each backend can call
LWLockRegisterTranche() with the correct ID. So, for built-in tranches,
use a hard-coded value instead.
This is motivated by an upcoming patch adding further built-in tranches.
Andres Freund and Robert Haas
| |
We carry around information about whether a given query has row security
or not, to allow the plancache to use that information to invalidate a
planned query in the event that the environment changes.
Previously, the flag of one of the subqueries was simply being copied
into place to indicate whether the query overall included RLS components.
That's wrong, as we need the global OR of all subqueries. Fix by
changing the code to match how fireRIRules works, which results
in OR'ing all of the flags.
Noted by Tom.
Back-patch to 9.5 where RLS was introduced.
| |
Apparently, there are bugs in this code that cause it to loop endlessly.
That bug still needs more research, but in the meantime it's clear that
the loop is missing a check for interrupts that would allow it to be
cancelled in a timely manner.
Backpatch to 9.1 -- this has been missing since 49475aab8d0d.
| |
Previously, if find_multixact_start() failed, SetOffsetVacuumLimit() would
install 0 into MultiXactState->offsetStopLimit if it had previously succeeded.
Luckily, there are no known cases where find_multixact_start() will return
an error in 9.5 and above. But if it were to happen, for example due to
filesystem permission issues, it'd be somewhat bad: GetNewMultiXactId()
could continue allocating mxids even if close to a wraparound, or it could
erroneously stop allocating mxids, even if no wraparound is looming. The
wrong value would be corrected the next time SetOffsetVacuumLimit() is
called, or by a restart.
Reported-By: Noah Misch, although this is not his preferred fix
Discussion: 20151210140450.GA22278@alap3.anarazel.de
Backpatch: 9.5, where the bug was introduced as part of 4f627f
| |
e3f4cfc7 introduced an LWLockHeldByMe() call, without the corresponding
Assert() surrounding it.
Spotted by Coverity.
Backpatch: 9.1+, like the previous commit
| |
Previously the "sent" field would be set to 0 and all other xlog
pointers would be set to NULL if there were no valid values (such as
in a walsender that is sending a base backup).
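For example (sketch), for a walsender that is only sending a base backup:
    SELECT sent_location, write_location, flush_location, replay_location
    FROM pg_stat_replication;
    -- with no valid values, these now consistently read as NULL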
| |
These would leak random xlog positions if a walsender used for a backup
reused a walsender slot previously used by a replication walsender.
In passing also fix a couple of cases where the xlog pointer is directly
compared to zero instead of using XLogRecPtrIsInvalid, noted by
Michael Paquier.
| |
Changing the tablespace of an unlogged relation did not WAL-log the
creation and content of the init fork. Thus, after a standby is
promoted, the unlogged relation cannot be accessed anymore, with errors
like:
ERROR: 58P01: could not open file "pg_tblspc/...": No such file or directory
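A reproduction sketch (hypothetical names), on a primary with a streaming
standby:
    CREATE TABLESPACE ts1 LOCATION '/some/path';
    CREATE UNLOGGED TABLE u1 (a int);
    ALTER TABLE u1 SET TABLESPACE ts1;
    -- promote the standby, then on the promoted server:
    SELECT * FROM u1;   -- failed as above before this fix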
Additionally, the init fork was not synced to disk, independent of the
configured wal_level, which is a relatively small durability risk.
Investigation of that problem also brought to light that, even for
permanent relations, the creation of !main forks was not WAL-logged,
i.e. no XLOG_SMGR_CREATE records were emitted. That mostly turns out not
to be a problem, because these files are created when the actual
relation data is copied; nonexistent files are not treated as an error
condition during replay. But that doesn't work for empty files, and
generally feels a bit haphazard. Luckily, outside the init and main
forks, empty forks either don't occur often or are not a problem.
Add the required WAL logging and syncing to disk.
Reported-By: Michael Paquier
Author: Michael Paquier and Andres Freund
Discussion: 20151210163230.GA11331@alap3.anarazel.de
Backpatch: 9.1, where unlogged relations were introduced
| |
As reported in bug #13809 by Alexander Ashurkov, the code for REASSIGN
OWNED hadn't gotten word about user mappings. Deal with them in the
same way default ACLs do, which is to ignore them altogether; they are
handled just fine by DROP OWNED. The other foreign object cases are
already handled correctly by both commands.
Also add a REASSIGN OWNED statement to the foreign_data test to exercise
the foreign data objects. (The changes are just before the "cleanup"
phase, so they shouldn't remove any existing live test.)
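A sketch (hypothetical names) of the now-working sequence:
    CREATE USER MAPPING FOR old_role SERVER some_server;
    REASSIGN OWNED BY old_role TO new_role;  -- now ignores the user mapping
    DROP OWNED BY old_role;                  -- still drops the user mapping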
Reported by Alexander Ashurkov, then independently by Jaime Casanova.