path: root/src/backend/utils
Commit message (Author, Age)
...
* Again fix initialization of auto-tuned effective_cache_size. (Tom Lane, 2014-03-20)
| | | | | | | | | | | | | | | | | | | The previous method was overly complex and underly correct; in particular, by assigning the default value with PGC_S_OVERRIDE, it prevented later attempts to change the setting in postgresql.conf, as noted by Jeff Janes. We should just assign the default value with source PGC_S_DYNAMIC_DEFAULT, which will have the desired priority relative to the boot_val as well as user-set values. There is still a gap in this method: if there's an explicit assignment of effective_cache_size = -1 in the postgresql.conf file, and that assignment appears before shared_buffers is assigned, the code will substitute 4 times the bootstrap default for shared_buffers, and that value will then persist (since it will have source PGC_S_FILE). I don't see any very nice way to avoid that though, and it's not a case to be expected in practice. The existing comments in guc-file.l look forward to a redesign of the DYNAMIC_DEFAULT mechanism; if that ever happens, we should consider this case as one of the things we'd like to improve.
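The priority argument above can be illustrated with a small model. This is only a sketch under stated assumptions: the enum, struct, and function names are hypothetical stand-ins, and the real GUC machinery has more sources and more rules than "a weaker source cannot overwrite a stronger one".

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for GUC sources, weakest first. */
typedef enum
{
    SRC_BOOT_DEFAULT,
    SRC_DYNAMIC_DEFAULT,
    SRC_FILE,
    SRC_OVERRIDE
} SettingSource;

typedef struct
{
    int           value;
    SettingSource source;
} Setting;

/*
 * Apply an assignment only if its source is at least as strong as the
 * source that supplied the current value.  Returns true if applied.
 */
static bool
assign_setting(Setting *s, int value, SettingSource source)
{
    if (source < s->source)
        return false;           /* weaker source: keep the current value */
    s->value = value;
    s->source = source;
    return true;
}
```

In this model, an auto-tuned value stored with SRC_DYNAMIC_DEFAULT can still be replaced by a later SRC_FILE assignment, while one stored with SRC_OVERRIDE cannot, which is the behavior the commit message describes.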
* Fix typos in comments. (Fujii Masao, 2014-03-17)
| | | | Thom Brown
* Make punctuation consistent (Peter Eisentraut, 2014-03-16)
* Cleanups from the remove-native-krb5 patch (Magnus Hagander, 2014-03-16)
| | | | | | | | | | | krb_srvname is actually not available anymore as a parameter server-side, since with gssapi we accept all principals in our keytab. It's still used in libpq for client side specification. In passing remove declaration of krb_server_hostname, where all the functionality was already removed. Noted by Stephen Frost, though a different solution than his suggestion
* Prevent interrupts while reporting non-ERROR elog messages. (Tom Lane, 2014-03-13)
| | | | | | | | | | | | | | | | This should eliminate the risk of recursive entry to syslog(3), which appears to be the cause of the hang reported in bug #9551 from James Morton. Arguably, the real problem here is auth.c's willingness to turn on ImmediateInterruptOK while executing fairly wide swaths of backend code. We may well need to work at narrowing the code ranges in which the authentication_timeout interrupt is enabled. For the moment, though, this is a cheap and reasonably noninvasive fix for a field-reported failure; the other approach would be complex and not necessarily bug-free itself. Back-patch to all supported branches.
* C comments: remove odd blank lines after #ifdef WIN32 lines (Bruce Momjian, 2014-03-13)
| | | | A few more
* C comments: remove odd blank lines after #ifdef WIN32 lines (Bruce Momjian, 2014-03-13)
* Items on GIN data pages are no longer always 6 bytes; update gincostestimate. (Heikki Linnakangas, 2014-03-12)
| | | | Also improve the comments a bit.
* Show PIDs of lock holders and waiters in log_lock_waits log message. (Fujii Masao, 2014-03-13)
| | | | Christian Kruse, reviewed by Kumar Rajeev Rastogi.
* Fix incorrect assertion about historical snapshots. (Robert Haas, 2014-03-12)
| | | | | | Also fix some nearby comments. Andres Freund
* Allow opclasses to provide tri-valued GIN consistent functions. (Heikki Linnakangas, 2014-03-12)
| | | | With the GIN "fast scan" feature, GIN can skip items without fetching all the keys for them, if it can prove that they don't match regardless of those keys. So far, it has done the proving by calling the boolean consistent function with all combinations of TRUE/FALSE for the unfetched keys, but since that's O(2^n), it becomes infeasible with more than a few keys. We can avoid calling consistent with all the combinations, if we can tell the operator class implementation directly which keys are unknown. This commit includes a triConsistent function for the built-in array and tsvector opclasses. Alexander Korotkov, with some changes by me.
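To make the idea concrete: with n unknown keys, trying every TRUE/FALSE combination takes 2^n calls, whereas a three-valued check can answer in one pass. The following is a minimal sketch for a purely conjunctive ("all keys must match") query; the type and function names are hypothetical and are not the opclass API added by this commit.

```c
/* Hypothetical three-valued logic type for this sketch. */
typedef enum
{
    TRI_FALSE = 0,
    TRI_TRUE = 1,
    TRI_MAYBE = 2               /* key not fetched yet, value unknown */
} TriValue;

/*
 * Three-valued consistency check for an "all keys must match" query.
 * One definite FALSE settles the answer no matter what the unknown
 * keys turn out to be; otherwise any unknown key makes it MAYBE.
 */
static TriValue
tri_consistent_all(const TriValue *check, int nkeys)
{
    TriValue result = TRI_TRUE;
    int      i;

    for (i = 0; i < nkeys; i++)
    {
        if (check[i] == TRI_FALSE)
            return TRI_FALSE;
        if (check[i] == TRI_MAYBE)
            result = TRI_MAYBE;
    }
    return result;
}
```

A TRI_FALSE result is what lets the scan skip an item without fetching the remaining keys.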
* Allow logical decoding via the walsender interface. (Robert Haas, 2014-03-10)
| | | | | | | | | | | | | | | In order for this to work, walsenders need the optional ability to connect to a database, so the "replication" keyword now allows true or false, for backward-compatibility, and the new value "database" (which causes the "dbname" parameter to be respected). walsender needs to loop not only when idle but also when sending decoded data to the user and when waiting for more xlog data to decode. This means that there are now three separate loops inside walsender.c; although some refactoring has been done here, this is still a bit ugly. Andres Freund, with contributions from Álvaro Herrera, and further review by me.
* Avoid memcpy() with same source and destination address. (Heikki Linnakangas, 2014-03-07)
| | | | | | | The behavior of that is undefined, although unlikely to lead to problems in practice. Found by running regression tests with Valgrind.
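One common way to avoid the undefined behavior is to skip the call when both pointers are equal. The wrapper below is a hypothetical sketch, not the change made in this commit.

```c
#include <string.h>

/*
 * Hypothetical wrapper: copy n bytes from src to dst, but skip the
 * memcpy() call entirely when both pointers refer to the same address.
 */
static void *
copy_unless_same(void *dst, const void *src, size_t n)
{
    if (dst != src)
        memcpy(dst, src, n);
    return dst;
}
```

This only covers the exact-same-address case; buffers that partially overlap need memmove() instead.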
* Avoid getting more than AccessShareLock when deparsing a query. (Tom Lane, 2014-03-06)
| | | | In make_ruledef and get_query_def, we have long used AcquireRewriteLocks to ensure that the querytree we are about to deparse is up-to-date and the schemas of the underlying relations aren't changing. However, that function thinks the query is about to be executed, so it acquires locks that are stronger than necessary for the purpose of deparsing. Thus for example, if pg_dump asks to deparse a rule that includes "INSERT INTO t", we'd acquire RowExclusiveLock on t. That results in interference with concurrent transactions that might for example ask for ShareLock on t. Since pg_dump is documented as being purely read-only, this is unexpected. (Worse, it used to actually be read-only; this behavior dates back only to 8.1, cf commit ba4200246.) Fix this by adding a parameter to AcquireRewriteLocks to tell it whether we want the "real" execution locks or only AccessShareLock. Report, diagnosis, and patch by Dean Rasheed. Back-patch to all supported branches.
* isdigit() needs an unsigned char argument. (Heikki Linnakangas, 2014-03-06)
| | | | | | | Per the C standard, the routine should be passed an int, with a value that's representable as an unsigned char or EOF. Passing a signed char is wrong, because a negative value is not representable as an unsigned char. Unfortunately no compiler warns about that.
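The usual fix is a cast at the call site. A minimal sketch, with a hypothetical helper name:

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Hypothetical helper: test whether a plain char is a decimal digit.
 * Plain char may be signed, so a byte such as 0xE9 can arrive here as a
 * negative value; casting to unsigned char first keeps the argument in
 * the range isdigit() is defined for.
 */
static bool
char_is_digit(char c)
{
    return isdigit((unsigned char) c) != 0;
}
```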
* Fix portability issues in recently added make_timestamp/make_interval code. (Tom Lane, 2014-03-05)
| | | | | | | | Explicitly reject infinity/NaN inputs, rather than just assuming that something else will do it for us. Per buildfarm. While at it, make some over-parenthesized and under-legible code more readable.
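An explicit check of this kind might look like the sketch below. The function name is hypothetical, and the real code reports an error rather than returning a flag.

```c
#include <math.h>
#include <stdbool.h>

/*
 * Hypothetical input check: accept a floating-point seconds value only
 * if it is neither NaN nor an infinity, instead of relying on later
 * arithmetic to fail in some platform-dependent way.
 */
static bool
seconds_input_ok(double secs)
{
    return !isnan(secs) && !isinf(secs);
}
```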
* Constructors for interval, timestamp, timestamptz (Alvaro Herrera, 2014-03-04)
| | | | | | Author: Pavel Stěhule, editorialized somewhat by Álvaro Herrera Reviewed-by: Tomáš Vondra, Marko Tiikkaja With input from Fabrízio de Royes Mello, Jim Nasby
* Introduce logical decoding. (Robert Haas, 2014-03-03)
| | | | This feature, building on previous commits, allows the write-ahead log stream to be decoded into a series of logical changes; that is, inserts, updates, and deletes and the transactions which contain them. It is capable of handling decoding even across changes to the schema of the affected tables. The output format is controlled by a so-called "output plugin"; an example is included. To make use of this in a real replication system, the output plugin will need to be modified to produce output in the format appropriate to that system, and to perform filtering. Currently, information can be extracted from the logical decoding system only via SQL; future commits will add the ability to stream changes via walsender. Andres Freund, with review and other contributions from many other people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Geoghegan, Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Michael Paquier, Simon Riggs, Craig Ringer, and Steve Singer.
* Rename huge_tlb_pages to huge_pages, and improve docs. (Heikki Linnakangas, 2014-03-03)
| | | | Christian Kruse
* Another round of Coverity fixes (Stephen Frost, 2014-03-03)
| | | | Additional non-security issues/improvements spotted by Coverity. In backend/libpq, no sense trying to protect against port->hba being NULL after we've already dereferenced it in the switch() statement. Protect against possible overflow due to 32-bit arithmetic in basebackup throttling (not yet released, so no security concern). Remove nonsensical check of array pointer against NULL in procarray.c, looks to be a holdover from 9.1 and earlier when there were pointers being used but now it's just an array. Remove pointer check-against-NULL in tsearch/spell.c as we had already dereferenced it above (in the strcmp()). Remove dead code from adt/orderedsetaggs.c, isnull is checked immediately after each tuplesort_getdatum() call and if true we return, so no point checking it again down at the bottom. Remove recently added minor error-condition memory leak in pg_regress.
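The 32-bit arithmetic hazard mentioned for basebackup throttling is a general one: a product of two 32-bit values is computed in 32 bits unless an operand is widened first. The sketch below illustrates the pattern only; the function name and the kB-per-second unit are assumptions, not the actual throttling code.

```c
#include <stdint.h>

/*
 * Hypothetical conversion of a transfer rate from kilobytes per second
 * to bytes per second.  Casting to a 64-bit type BEFORE multiplying
 * makes the multiplication itself 64-bit; writing (int64_t) (rate_kb *
 * 1024) instead would wrap in 32 bits first and widen the wrong value.
 */
static int64_t
rate_kb_to_bytes(uint32_t rate_kb)
{
    return (int64_t) rate_kb * 1024;
}
```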
* Allow regex operations to be terminated early by query cancel requests. (Tom Lane, 2014-03-01)
| | | | | | | | | | | | | | | | | | | | | | | | | The regex code didn't have any provision for query cancel; which is unsurprising given its non-Postgres origin, but still problematic since some operations can take a long time. Introduce a callback function to check for a pending query cancel or session termination request, and call it in a couple of strategic spots where we can make the regex code exit with an error indicator. If we ever actually split out the regex code as a standalone library, some additional work will be needed to let the cancel callback function be specified externally to the library. But that's straightforward (certainly so by comparison to putting the locale-dependent character classification logic on a similar arms-length basis), and there seems no need to do it right now. A bigger issue is that there may be more places than these two where we need to check for cancels. We can always add more checks later, now that the infrastructure is in place. Since there are known examples of not-terribly-long regexes that can lock up a backend for a long time, back-patch to all supported branches. I have hopes of fixing the known performance problems later, but adding query cancel ability seems like a good idea even if they were all fixed.
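The callback approach can be sketched generically: a long-running loop polls a caller-supplied function and bails out with an error indicator when it reports a pending cancel. All names below are hypothetical; the real regex code and its callback differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns true if a cancel is pending. */
typedef bool (*cancel_check_fn) (void *arg);

#define SCAN_OK        0
#define SCAN_CANCELLED 1        /* error indicator for the caller */

/* Example callback: treats arg as a pointer to a "cancel requested" flag. */
static bool
flag_is_set(void *arg)
{
    return *(const bool *) arg;
}

/*
 * Count occurrences of byte c in s[0..len), polling the callback on
 * every iteration so a long scan can be abandoned promptly.
 */
static int
scan_with_cancel(const char *s, size_t len, char c,
                 cancel_check_fn is_cancelled, void *arg, size_t *count)
{
    size_t i;

    *count = 0;
    for (i = 0; i < len; i++)
    {
        if (is_cancelled != NULL && is_cancelled(arg))
            return SCAN_CANCELLED;
        if (s[i] == c)
            (*count)++;
    }
    return SCAN_OK;
}
```

Passing the callback as a parameter, as this sketch does, corresponds to the "specified externally" arrangement the message says a standalone library would need.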
* Fix crash in json_to_record(). (Jeff Davis, 2014-02-26)
| | | | | | | | | | | | json_to_record() depends on get_call_result_type() for the tuple descriptor of the record that should be returned, but in some cases that cannot be determined. Add a guard to check if the tuple descriptor has been properly resolved, similar to other callers of get_call_result_type(). Also add guard for two other callers of get_call_result_type() in jsonfuncs.c. Although json_to_record() is the only actual bug, it's a good idea to follow convention.
* Use SnapshotDirty rather than an active snapshot to probe index endpoints. (Tom Lane, 2014-02-25)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there are lots of uncommitted tuples at the end of the index range, get_actual_variable_range() ends up fetching each one and doing an MVCC visibility check on it, until it finally hits a visible tuple. This is bad enough in isolation, considering that we don't need an exact answer only an approximate one. But because the tuples are not yet committed, each visibility check does a TransactionIdIsInProgress() test, which involves scanning the ProcArray. When multiple sessions do this concurrently, the ensuing contention results in horrid performance loss. 20X overall throughput loss on not-too-complicated queries is easy to demonstrate in the back branches (though someone's made it noticeably less bad in HEAD). We can dodge the problem fairly effectively by using SnapshotDirty rather than a normal MVCC snapshot. This will cause the index probe to take uncommitted tuples as good, so that we incur only one tuple fetch and test even if there are many such tuples. The extent to which this degrades the estimate is debatable: it's possible the result is actually a more accurate prediction than before, if the endmost tuple has become committed by the time we actually execute the query being planned. In any case, it's not very likely that it makes the estimate a lot worse. SnapshotDirty will still reject tuples that are known committed dead, so we won't give bogus answers if an invalid outlier has been deleted but not yet vacuumed from the index. (Because btrees know how to mark such tuples dead in the index, we shouldn't have a big performance problem in the case that there are many of them at the end of the range.) This consideration motivates not using SnapshotAny, which was also considered as a fix. Note: the back branches were using SnapshotNow instead of an MVCC snapshot, but the problem and solution are the same. Per performance complaints from Bartlomiej Romanski, Josh Berkus, and others. 
Back-patch to 9.0, where the issue was introduced (by commit 40608e7f949fb7e4025c0ddd5be01939adc79eec).
* Show xid and xmin in pg_stat_activity and pg_stat_replication. (Robert Haas, 2014-02-25)
| | | | | Christian Kruse, reviewed by Andres Freund and myself, with further minor adjustments by me.
* Update and clarify ssl_ciphers default (Peter Eisentraut, 2014-02-24)
| | | | | | | | | | | | - Write HIGH:MEDIUM instead of DEFAULT:!LOW:!EXP for clarity. - Order 3DES last to work around inappropriate OpenSSL default. - Remove !MD5 and @STRENGTH, because they are irrelevant. - Add clarifying documentation. Effectively, the new default is almost the same as the old one, but it is arguably easier to understand and modify. Author: Marko Kreen <markokr@gmail.com>
* Increase work_mem and maintenance_work_mem defaults by 4x (Bruce Momjian, 2014-02-24)
| | | | New defaults are 4MB and 64MB.
* Allow single-point polygons to be converted to circles (Bruce Momjian, 2014-02-24)
| | | | | | | This allows finding the center of a single-point polygon and converting it to a point. Per report from Josef Grahn
* docs: document behavior of CHAR() comparisons with chars < space (Bruce Momjian, 2014-02-24)
| | | | | | | Space trimming rather than space-padding causes unusual behavior, which might not be standards-compliant. Also remove recently-added now-redundant C comment.
* Prefer pg_any_to_server/pg_server_to_any over pg_do_encoding_conversion. (Tom Lane, 2014-02-23)
| | | | | | | | | | | | | | | | | | | | A large majority of the callers of pg_do_encoding_conversion were specifying the database encoding as either source or target of the conversion, meaning that we can use the less general functions pg_any_to_server/pg_server_to_any instead. The main advantage of using the latter functions is that they can make use of a cached conversion-function lookup in the common case that the other encoding is the current client_encoding. It's notationally cleaner too in most cases, not least because of the historical artifact that the latter functions use "char *" rather than "unsigned char *" in their APIs. Note that pg_any_to_server will apply an encoding verification step in some cases where pg_do_encoding_conversion would have just done nothing. This seems to me to be a good idea at most of these call sites, though it partially negates the performance benefit. Per discussion of bug #9210.
* Plug some more holes in encoding conversion. (Tom Lane, 2014-02-23)
| | | | | | | | | | | | | | | | | | | | | | | | Various places assume that pg_do_encoding_conversion() and pg_server_to_any() will ensure encoding validity of their results; but they failed to do so in the case that the source encoding is SQL_ASCII while the destination is not. We cannot perform any actual "conversion" in that scenario, but we should still validate the string according to the destination encoding. Per bug #9210 from Digoal Zhou. Arguably this is a back-patchable bug fix, but on the other hand adding more enforcing of encoding checks might break existing applications that were being sloppy. On balance there doesn't seem to be much enthusiasm for a back-patch, so fix in HEAD only. While at it, remove some apparently-no-longer-needed provisions for letting pg_do_encoding_conversion() "work" outside a transaction --- if you consider it "working" to silently fail to do the requested conversion. Also, make a few cosmetic improvements in mbutils.c, notably removing some Asserts that are certainly dead code since the variables they assert aren't null are never null, even at process start. (I think this wasn't true at one time, but it is now.)
* Do ScalarArrayOp estimation correctly when array is a stable expression. (Tom Lane, 2014-02-21)
| | | | | | | | | | | | | | Most estimation functions apply estimate_expression_value to see if they can reduce an expression to a constant; the key difference is that it allows evaluation of stable as well as immutable functions in hopes of ending up with a simple Const node. scalararraysel didn't get the memo though, and neither did gincost_opexpr/gincost_scalararrayopexpr. Fix that, and remove a now-unnecessary estimate_expression_value step in the subsidiary function scalararraysel_containment. Per complaint from Alexey Klyukin. Back-patch to 9.3. The problem goes back further, but I'm hesitant to change estimation behavior in long-stable release branches.
* Further code review for pg_lsn data type. (Robert Haas, 2014-02-19)
| | | | Change input function error messages to be more consistent with what is done elsewhere. Remove a bunch of redundant type casts, so that the compiler will warn us if we screw up. Don't pass LSNs by value on platforms where a Datum is only 32 bits, per buildfarm. Move macros for packing and unpacking LSNs to pg_lsn.h so that we can include access/xlogdefs.h, to avoid an unsatisfied dependency on XLogRecPtr.
* pg_lsn macro naming and type behavior revisions. (Robert Haas, 2014-02-19)
| | | | | Change pg_lsn_mi so that it can return negative values when subtracting LSNs, and clean up some perhaps ill-considered macro names.
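Why negative results need special handling: LSNs are unsigned 64-bit positions, so a plain subtraction wraps around when the second operand is larger. The sketch below shows one way to keep the sign; the sign-plus-magnitude struct and the names are assumptions for illustration and say nothing about the return type pg_lsn_mi actually uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: a sign flag plus an unsigned magnitude. */
typedef struct
{
    bool     negative;
    uint64_t magnitude;
} LsnDelta;

/*
 * Subtract two unsigned 64-bit log positions without wrapping: always
 * subtract the smaller from the larger and record the sign separately.
 */
static LsnDelta
lsn_subtract(uint64_t a, uint64_t b)
{
    LsnDelta d;

    d.negative = a < b;
    d.magnitude = d.negative ? b - a : a - b;
    return d;
}
```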
* Add a pg_lsn data type, to represent an LSN. (Robert Haas, 2014-02-19)
| | | | Robert Haas and Michael Paquier
* Prevent potential overruns of fixed-size buffers. (Tom Lane, 2014-02-17)
| | | | | | | | | | | | | | | | | | | | | | | Coverity identified a number of places in which it couldn't prove that a string being copied into a fixed-size buffer would fit. We believe that most, perhaps all of these are in fact safe, or are copying data that is coming from a trusted source so that any overrun is not really a security issue. Nonetheless it seems prudent to forestall any risk by using strlcpy() and similar functions. Fixes by Peter Eisentraut and Jozef Mlich based on Coverity reports. In addition, fix a potential null-pointer-dereference crash in contrib/chkpass. The crypt(3) function is defined to return NULL on failure, but chkpass.c didn't check for that before using the result. The main practical case in which this could be an issue is if libc is configured to refuse to execute unapproved hashing algorithms (e.g., "FIPS mode"). This ideally should've been a separate commit, but since it touches code adjacent to one of the buffer overrun changes, I included it in this commit to avoid last-minute merge issues. This issue was reported by Honza Horak. Security: CVE-2014-0065 for buffer overruns, CVE-2014-0066 for crypt()
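A bounded copy in the spirit of strlcpy() can be sketched as follows. strlcpy() itself is not available in every C library, so this hypothetical helper uses snprintf() as a stand-in; it is an illustration, not the code from this commit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy src into dst, which holds dstsize bytes.
 * The result is always NUL-terminated when dstsize > 0.  Returns false
 * if src did not fit and was truncated.
 */
static bool
bounded_copy(char *dst, size_t dstsize, const char *src)
{
    int needed;

    if (dstsize == 0)
        return false;
    needed = snprintf(dst, dstsize, "%s", src);
    return needed >= 0 && (size_t) needed < dstsize;
}
```

The crypt(3) part of the fix follows the same defensive pattern: check the returned pointer for NULL before using it.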
* Predict integer overflow to avoid buffer overruns. (Noah Misch, 2014-02-17)
| | | | | | | | | | | | | | | | | Several functions, mostly type input functions, calculated an allocation size such that the calculation wrapped to a small positive value when arguments implied a sufficiently-large requirement. Writes past the end of the inadvertent small allocation followed shortly thereafter. Coverity identified the path_in() vulnerability; code inspection led to the rest. In passing, add check_stack_depth() to prevent stack overflow in related functions. Back-patch to 8.4 (all supported versions). The non-comment hstore changes touch code that did not exist in 8.4, so that part stops at 9.0. Noah Misch and Heikki Linnakangas, reviewed by Tom Lane. Security: CVE-2014-0064
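"Predicting" the overflow means testing the operands before multiplying, since the wrapped result looks like a valid small size. A minimal sketch, with a hypothetical function name and a header-plus-array layout chosen only for illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute header + count * elem_size into *result.
 * Returns false, leaving *result untouched, if the true value would not
 * fit in size_t.  The test divides instead of multiplying, so the check
 * itself cannot wrap.
 */
static bool
alloc_size_checked(size_t header, size_t count, size_t elem_size,
                   size_t *result)
{
    if (elem_size != 0 && count > (SIZE_MAX - header) / elem_size)
        return false;
    *result = header + count * elem_size;
    return true;
}
```

A caller would report an error when this returns false rather than allocating the wrapped, too-small size.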
* Prevent privilege escalation in explicit calls to PL validators. (Noah Misch, 2014-02-17)
| | | | | | | | | | | | | | The primary role of PL validators is to be called implicitly during CREATE FUNCTION, but they are also normal functions that a user can call explicitly. Add a permissions check to each validator to ensure that a user cannot use explicit validator calls to achieve things he could not otherwise achieve. Back-patch to 8.4 (all supported versions). Non-core procedural language extensions ought to make the same two-line change to their own validators. Andres Freund, reviewed by Tom Lane and Noah Misch. Security: CVE-2014-0061
* Shore up ADMIN OPTION restrictions. (Noah Misch, 2014-02-17)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Granting a role without ADMIN OPTION is supposed to prevent the grantee from adding or removing members from the granted role. Issuing SET ROLE before the GRANT bypassed that, because the role itself had an implicit right to add or remove members. Plug that hole by recognizing that implicit right only when the session user matches the current role. Additionally, do not recognize it during a security-restricted operation or during execution of a SECURITY DEFINER function. The restriction on SECURITY DEFINER is not security-critical. However, it seems best for a user testing his own SECURITY DEFINER function to see the same behavior others will see. Back-patch to 8.4 (all supported versions). The SQL standards do not conflate roles and users as PostgreSQL does; only SQL roles have members, and only SQL users initiate sessions. An application using PostgreSQL users and roles as SQL users and roles will never attempt to grant membership in the role that is the session user, so the implicit right to add or remove members will never arise. The security impact was mostly that a role member could revoke access from others, contrary to the wishes of his own grantor. Unapproved role member additions are less notable, because the member can still largely achieve that by creating a view or a SECURITY DEFINER function. Reviewed by Andres Freund and Tom Lane. Reported, independently, by Jonas Sundman and Noah Misch. Security: CVE-2014-0060
* Add C comment about problems with CHAR() space trimming (Bruce Momjian, 2014-02-13)
* Separate multixact freezing parameters from xid's (Alvaro Herrera, 2014-02-13)
| | | | Previously we were piggybacking on transaction ID parameters to freeze multixacts; but since there isn't necessarily any relationship between rates of Xid and multixact consumption, this turns out not to be a good idea. Therefore, we now have multixact-specific freezing parameters. vacuum_multixact_freeze_min_age: when to remove multis as we come across them in vacuum (defaults to 5 million, i.e. early in comparison to Xid's default of 50 million). vacuum_multixact_freeze_table_age: when to force whole-table scans instead of scanning only the pages marked as not all visible in the visibility map (defaults to 150 million, same as for Xids). Whichever of the two reaches the 150 million mark first will cause a whole-table scan. autovacuum_multixact_freeze_max_age: when to cause emergency, uninterruptible whole-table scans (defaults to 400 million, double that for Xids). This means there shouldn't be more frequent emergency vacuuming than previously, unless multixacts are being used very rapidly. Backpatch to 9.3 where multixacts were made to persist enough to require freezing. To avoid an ABI break in 9.3, VacuumStmt has a couple of fields in an unnatural place, and StdRdOptions is split in two so that the newly added fields can go at the end. Patch by me, reviewed by Robert Haas, with additional input from Andres Freund and Tom Lane.
* Mark some more variables as static or include the appropriate header (Peter Eisentraut, 2014-02-08)
| | | | | | Detected by clang's -Wmissing-variable-declarations. From: Andres Freund <andres@anarazel.de>
* In RelationClearRelation, postpone cache reload if !IsTransactionState(). (Tom Lane, 2014-02-06)
| | | | | | | | | | | | | | | | | | We may process relcache flush requests during transaction startup or shutdown. In general it's not terribly safe to do catalog access at those times, so the code's habit of trying to immediately revalidate unflushable relcache entries is risky. Although there are no field trouble reports that are positively traceable to this, we have been able to demonstrate failure of the assertions recently added in RelationIdGetRelation() and SearchCatCache(). On the other hand, it seems safe to just postpone revalidation of the cache entry until we're inside a valid transaction. The one case where this is questionable is where we're exiting a subtransaction and the outer transaction is holding the relcache entry open --- but if we made any significant changes to the rel inside such a subtransaction, we've got problems anyway. There are mechanisms in place to prevent that (to wit, locks for cross-session cases and CheckTableNotInUse() for intra-session cases), so let's trust to those mechanisms to keep us out of trouble.
* Alphabeticize list in OBJS definition in utils/adt Makefile. (Andrew Dunstan, 2014-02-06)
* Assert(IsTransactionState()) in RelationIdGetRelation(). (Tom Lane, 2014-02-06)
| | | | | | | Commit 42c80c696e9c8323841180029cc62741c21bd356 added an Assert(IsTransactionState()) in SearchCatCache(), to catch any code that thought it could do a catcache lookup outside transactions. Extend the same idea to relcache lookups.
* Fix whitespace (Peter Eisentraut, 2014-02-05)
* Fix comparison of an array of characters with zero to compare with '\0' instead. (Fujii Masao, 2014-02-04)
| | | | Report from Andres Freund.
* In json code, clean up temp memory contexts after processing. (Andrew Dunstan, 2014-02-03)
| | | | Craig Ringer.
* Make pg_basebackup skip temporary statistics files. (Fujii Masao, 2014-02-03)
| | | | | | | | The temporary statistics files don't need to be included in the backup because they are always reset at the beginning of the archive recovery. This patch changes pg_basebackup so that it skips all files located in $PGDATA/pg_stat_tmp or the directory specified by stats_temp_directory parameter.
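Skipping "all files located in" a directory comes down to a path-prefix test that respects component boundaries. The helper below is a hypothetical sketch that assumes relative, '/'-separated paths; it is not the code pg_basebackup uses.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true if path names something strictly inside
 * directory dir.  Requiring a '/' right after the prefix keeps a
 * sibling such as "pg_stat_tmpfoo" from matching "pg_stat_tmp".
 */
static bool
path_is_inside(const char *path, const char *dir)
{
    size_t n = strlen(dir);

    return strncmp(path, dir, n) == 0 && path[n] == '/';
}
```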
* arrays: tighten checks for multi-dimensional input (Bruce Momjian, 2014-02-01)
| | | | | | | Previously an input array string that started with a single-element array dimension would then later accept a multi-dimensional segment. BACKWARD INCOMPATIBILITY
* Introduce replication slots. (Robert Haas, 2014-01-31)
| | | | | | | | | | | | | | | | Replication slots are a crash-safe data structure which can be created on either a master or a standby to prevent premature removal of write-ahead log segments needed by a standby, as well as (with hot_standby_feedback=on) pruning of tuples whose removal would cause replication conflicts. Slots have some advantages over existing techniques, as explained in the documentation. In a few places, we refer to the type of replication slots introduced by this patch as "physical" slots, because forthcoming patches for logical decoding will also have slots, but with somewhat different properties. Andres Freund and Robert Haas