aboutsummaryrefslogtreecommitdiff
path: root/src/common
Commit message (Collapse)AuthorAge
* Factor out system call names from error messagesPeter Eisentraut2021-04-23
| | | | | | | | | | Instead, put them in via a format placeholder. This reduces the number of distinct translatable messages and also reduces the chances of typos during translation. We already did this for the system call arguments in a number of cases, so this is just the same thing taken a bit further. Discussion: https://www.postgresql.org/message-id/flat/92d6f545-5102-65d8-3c87-489f71ea0a37%40enterprisedb.com
* Fix typos and grammar in comments and docsMichael Paquier2021-04-19
| | | | | Author: Justin Pryzby Discussion: https://postgr.es/m/20210416070310.GG3315@telsasoft.com
* Refactor HMAC implementationsMichael Paquier2021-04-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | Similarly to the cryptohash implementations, this refactors the existing HMAC code into a single set of APIs that can be plugged with any crypto libraries PostgreSQL is built with (only OpenSSL currently). If there is no such libraries, a fallback implementation is available. Those new APIs are designed similarly to the existing cryptohash layer, so there is no real new design here, with the same logic around buffer bound checks and memory handling. HMAC has a dependency on cryptohashes, so all the cryptohash types supported by cryptohash{_openssl}.c can be used with HMAC. This refactoring is an advantage mainly for SCRAM, that included its own implementation of HMAC with SHA256 without relying on the existing crypto libraries even if PostgreSQL was built with their support. This code has been tested on Windows and Linux, with and without OpenSSL, across all the versions supported on HEAD from 1.1.1 down to 1.0.1. I have also checked that the implementations are working fine using some sample results, a custom extension of my own, and doing cross-checks across different major versions with SCRAM with the client and the backend. Author: Michael Paquier Reviewed-by: Bruce Momjian Discussion: https://postgr.es/m/X9m0nkEJEzIPXjeZ@paquier.xyz
* pg_upgrade: Check version of target cluster binariesPeter Eisentraut2021-03-03
| | | | | | | | | | | | This expands the binary validation in pg_upgrade with a version check per binary to ensure that the target cluster installation only contains binaries from the target version. In order to reduce duplication, validate_exec is exported from port.h and the local copy in pg_upgrade is removed. Author: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/flat/9328.1552952117@sss.pgh.pa.us
* Improve reporting for syntax errors in multi-line JSON data.Tom Lane2021-03-01
| | | | | | | | | | | Point to the specific line where the error was detected; the previous code tended to include several preceding lines as well. Avoid re-scanning the entire input to recompute which line that was. Simplify the logic a bit. Add test cases. Simon Riggs and Hamid Akhtar, reviewed by Daniel Gustafsson and myself Discussion: https://postgr.es/m/CANbhV-EPBnXm3MF_TTWBwwqgn1a1Ghmep9VHfqmNBQ8BT0f+_g@mail.gmail.com
* Add result size as argument of pg_cryptohash_final() for overflow checksMichael Paquier2021-02-15
| | | | | | | | | | | | | | | | | | | | | | | | | | With its current design, a careless use of pg_cryptohash_final() could would result in an out-of-bound write in memory as the size of the destination buffer to store the result digest is not known to the cryptohash internals, without the caller knowing about that. This commit adds a new argument to pg_cryptohash_final() to allow such sanity checks, and implements such defenses. The internals of SCRAM for HMAC could be tightened a bit more, but as everything is based on SCRAM_KEY_LEN with uses particular to this code there is no need to complicate its interface more than necessary, and this comes back to the refactoring of HMAC in core. Except that, this minimizes the uses of the existing DIGEST_LENGTH variables, relying instead on sizeof() for the result sizes. In ossp-uuid, this also makes the code more defensive, as it already relied on dce_uuid_t being at least the size of a MD5 digest. This is in philosophy similar to cfc40d3 for base64.c and aef8948 for hex.c. Reported-by: Ranier Vilela Author: Michael Paquier, Ranier Vilela Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/CAEudQAoqEGmcff3J4sTSV-R_16Monuz-UpJFbf_dnVH=APr02Q@mail.gmail.com
* Fix copy-paste error with SHA256 digest length in checksum_helper.cMichael Paquier2021-02-11
| | | | | Issue introduced by 87ae969, noticed while working on the area. While on it, fix some grammar in the surrounding static assertions.
* Introduce --with-ssl={openssl} as a configure optionMichael Paquier2021-02-01
| | | | | | | | | | | | | This is a replacement for the existing --with-openssl, extending the logic to make easier the addition of new SSL libraries. The grammar is chosen to be similar to --with-uuid, where multiple values can be chosen, with "openssl" as the only supported value for now. The original switch, --with-openssl, is kept for compatibility. Author: Daniel Gustafsson, Michael Paquier Reviewed-by: Jacob Champion Discussion: https://postgr.es/m/FAB21FC8-0F62-434F-AA78-6BD9336D630A@yesql.se
* Add mbverifystr() functions specific to each encoding.Heikki Linnakangas2021-01-28
| | | | | | | | | | | This makes pg_verify_mbstr() function faster, by allowing more efficient encoding-specific implementations. All the implementations included in this commit are pretty naive, they just call the same encoding-specific verifychar functions that were used previously, but that already gives a performance boost because the tight character-at-a-time loop is simpler. Reviewed-by: John Naylor Discussion: https://www.postgresql.org/message-id/e7861509-3960-538a-9025-b75a61188e01@iki.fi
* Introduce SHA1 implementations in the cryptohash infrastructureMichael Paquier2021-01-23
| | | | | | | | | | | | | | | | | | | | | | With this commit, SHA1 goes through the implementation provided by OpenSSL via EVP when building the backend with it, and uses as fallback implementation KAME which was located in pgcrypto and already shaped for an integration with a set of init, update and final routines. Structures and routines have been renamed to make things consistent with the fallback implementations of MD5 and SHA2. uuid-ossp has used for ages a shortcut with pgcrypto to fetch a copy of SHA1 if needed. This was built depending on the build options within ./configure, so this cleans up some code and removes the build dependency between pgcrypto and uuid-ossp. Note that this will help with the refactoring of HMAC, as pgcrypto offers the option to use MD5, SHA1 or SHA2, so only the second option was missing to make that possible. Author: Michael Paquier Reviewed-by: Heikki Linnakangas Discussion: https://postgr.es/m/X9HXKTgrvJvYO7Oh@paquier.xyz
* Rework refactoring of hex and encoding routinesMichael Paquier2021-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit addresses some issues with c3826f83 that moved the hex decoding routine to src/common/: - The decoding function lacked overflow checks, so when used for security-related features it was an open door to out-of-bound writes if not carefully used that could remain undetected. Like the base64 routines already in src/common/ used by SCRAM, this routine is reworked to check for overflows by having the size of the destination buffer passed as argument, with overflows checked before doing any writes. - The encoding routine was missing. This is moved to src/common/ and it gains the same overflow checks as the decoding part. On failure, the hex routines of src/common/ issue an error as per the discussion done to make them usable by frontend tools, but not by shared libraries. Note that this is why ECPG is left out of this commit, and it still includes a duplicated logic doing hex encoding and decoding. While on it, this commit uses better variable names for the source and destination buffers in the existing escape and base64 routines in encode.c and it makes them more robust to overflow detection. The previous core code issued a FATAL after doing out-of-bound writes if going through the SQL functions, which would be enough to detect problems when working on changes that impacted this area of the code. Instead, an error is issued before doing an out-of-bound write. The hex routines were being directly called for bytea conversions and backup manifests without such sanity checks. The current calls happen to not have any problems, but careless uses of such APIs could easily lead to CVE-class bugs. Author: Bruce Momjian, Michael Paquier Reviewed-by: Sehrope Sarkuni Discussion: https://postgr.es/m/20201231003557.GB22199@momjian.us
* Fix and simplify some code related to cryptohashesMichael Paquier2021-01-08
| | | | | | | | | | | This commit addresses two issues: - In pgcrypto, MD5 computation called pg_cryptohash_{init,update,final} without checking for the result status. - Simplify pg_checksum_raw_context to use only one variable for all the SHA2 options available in checksum manifests. Reported-by: Heikki Linnakangas Discussion: https://postgr.es/m/f62f26bb-47a5-8411-46e5-4350823e06a5@iki.fi
* Fix allocation logic of cryptohash context data with OpenSSLMichael Paquier2021-01-07
| | | | | | | | | | | | | | | | | | | | The allocation of the cryptohash context data when building with OpenSSL was happening in the memory context of the caller of pg_cryptohash_create(), which could lead to issues with resowner cleanup if cascading resources are cleaned up on an error. Like other facilities using resowners, move the base allocation to TopMemoryContext to ensure a correct cleanup on failure. The resulting code gets simpler with this commit as the context data is now hold by a unique opaque pointer, so as there is only one single allocation done in TopMemoryContext. After discussion, also change the cryptohash subroutines to return an error if the caller provides NULL for the context data to ease error detection on OOM. Author: Heikki Linnakangas Discussion: https://postgr.es/m/X9xbuEoiU3dlImfa@paquier.xyz
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* Use setenv() in preference to putenv().Tom Lane2020-12-30
| | | | | | | | | | | | | | | | | | | | | | | | Since at least 2001 we've used putenv() and avoided setenv(), on the grounds that the latter was unportable and not in POSIX. However, POSIX added it that same year, and by now the situation has reversed: setenv() is probably more portable than putenv(), since POSIX now treats the latter as not being a core function. And setenv() has cleaner semantics too. So, let's reverse that old policy. This commit adds a simple src/port/ implementation of setenv() for any stragglers (we have one in the buildfarm, but I'd not be surprised if that code is never used in the field). More importantly, extend win32env.c to also support setenv(). Then, replace usages of putenv() with setenv(), and get rid of some ad-hoc implementations of setenv() wannabees. Also, adjust our src/port/ implementation of unsetenv() to follow the POSIX spec that it returns an error indicator, rather than returning void as per the ancient BSD convention. I don't feel a need to make all the call sites check for errors, but the portability stub ought to match real-world practice. Discussion: https://postgr.es/m/2065122.1609212051@sss.pgh.pa.us
* Revert "Add key management system" (978f869b99) & later commitsBruce Momjian2020-12-27
| | | | | | | | | | The patch needs test cases, reorganization, and cfbot testing. Technically reverts commits 5c31afc49d..e35b2bad1a (exclusive/inclusive) and 08db7c63f3..ccbe34139b. Reported-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/E1ktAAG-0002V2-VB@gemulon.postgresql.org
* Fix function call typo in frontend Win32 code, commit 978f869b99Bruce Momjian2020-12-25
| | | | | | Reported-by: buildfarm member walleye Backpatch-through: master
* Really fix the dummy implementations in cipher.c.Tom Lane2020-12-25
| | | | 945083b2f wasn't enough to silence compiler warnings.
* fix no-return function call in cipher.c from commit 978f869b99Bruce Momjian2020-12-25
| | | | | | Reported-by: buildfarm member sifaka Backpatch-through: master
* remove uint128 requirement from patch 978f869b99 (CFE)Bruce Momjian2020-12-25
| | | | | | | | Used char[16] instead. Reported-by: buildfarm member florican Backpatch-through: master
* Fix return value and const declaration from commit 978f869b99Bruce Momjian2020-12-25
| | | | | | | | This fixes the non-OpenSSL compile case. Reported-by: buildfarm member sifaka Backpatch-through: master
* Add key management systemBruce Momjian2020-12-25
| | | | | | | | | | | | | | | This adds a key management system that stores (currently) two data encryption keys of length 128, 192, or 256 bits. The data keys are AES256 encrypted using a key encryption key, and validated via GCM cipher mode. A command to obtain the key encryption key must be specified at initdb time, and will be run at every database server start. New parameters allow a file descriptor open to the terminal to be passed. pg_upgrade support has also been added. Discussion: https://postgr.es/m/CA+fd4k7q5o6Nc_AaX6BcYM9yqTbC6_pnH-6nSD=54Zp6NBQTCQ@mail.gmail.com Discussion: https://postgr.es/m/20201202213814.GG20285@momjian.us Author: Masahiko Sawada, me, Stephen Frost
* move hex_decode() to /common so it can be called from frontendBruce Momjian2020-12-24
| | | | | | | This allows removal of a copy of hex_decode() from ecpg, and will be used by the soon-to-be added pg_alterckey command. Backpatch-through: master
* Refactor logic to check for ASCII-only characters in stringMichael Paquier2020-12-21
| | | | | | | | | The same logic was present for collation commands, SASLprep and pgcrypto, so this removes some code. Author: Michael Paquier Reviewed-by: Stephen Frost, Heikki Linnakangas Discussion: https://postgr.es/m/X9womIn6rne6Gud2@paquier.xyz
* Improve some code around cryptohash functionsMichael Paquier2020-12-14
| | | | | | | | | | | | | | | This adjusts some code related to recent changes for cryptohash functions: - Add a variable in md5.h to track down the size of a computed result, moved from pgcrypto. Note that pg_md5_hash() assumed a result of this size already. - Call explicit_bzero() on the hashed data when freeing the context for fallback implementations. For MD5, particularly, it would be annoying to leave some non-zeroed data around. - Clean up some code related to recent changes of uuid-ossp. .gitignore still included md5.c and a comment was incorrect. Discussion: https://postgr.es/m/X9HXKTgrvJvYO7Oh@paquier.xyz
* Refactor MD5 implementations according to new cryptohash infrastructureMichael Paquier2020-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit heavily reorganizes the MD5 implementations that exist in the tree in various aspects. First, MD5 is added to the list of options available in cryptohash.c and cryptohash_openssl.c. This means that if building with OpenSSL, EVP is used for MD5 instead of the fallback implementation that Postgres had for ages. With the recent refactoring work for cryptohash functions, this change is straight-forward. If not building with OpenSSL, a fallback implementation internal to src/common/ is used. Second, this reduces the number of MD5 implementations present in the tree from two to one, by moving the KAME implementation from pgcrypto to src/common/, and by removing the implementation that existed in src/common/. KAME was already structured with an init/update/final set of routines by pgcrypto (see original pgcrypto/md5.h) for compatibility with OpenSSL, so moving it to src/common/ has proved to be a straight-forward move, requiring no actual manipulation of the internals of each routine. Some benchmarking has not shown any performance gap between both implementations. Similarly to the fallback implementation used for SHA2, the fallback implementation of MD5 is moved to src/common/md5.c with an internal header called md5_int.h for the init, update and final routines. This gets then consumed by cryptohash.c. The original routines used for MD5-hashed passwords are moved to a separate file called md5_common.c, also in src/common/, aimed at being shared between all MD5 implementations as utility routines to keep compatibility with any code relying on them. Like the SHA2 changes, this commit had its round of tests on both Linux and Windows, across all versions of OpenSSL supported on HEAD, with and even without OpenSSL. Author: Michael Paquier Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/20201106073434.GA4961@paquier.xyz
* Simplify code for getting a unicode codepoint's canonical class.Michael Paquier2020-12-09
| | | | | | | | | | | | Three places of unicode_norm.c use a similar logic for getting the combining class from a codepoint. Commit 2991ac5 has added the function get_canonical_class() for this purpose, but it was only called by the backend. This commit refactors the code to use this function in all the places where the combining class is retrieved from a given codepoint. Author: John Naylor Discussion: https://postgr.es/m/CAFBsxsHUV7s7YrOm6hFz-Jq8Sc7K_yxTkfNZxsDV-DuM-k-gwg@mail.gmail.com
* Change SHA2 implementation based on OpenSSL to use EVP digest routinesMichael Paquier2020-12-04
| | | | | | | | | | | | | | | | | | | | | | | | The use of low-level hash routines is not recommended by upstream OpenSSL since 2000, and pgcrypto already switched to EVP as of 5ff4a67. This takes advantage of the refactoring done in 87ae969 that has introduced the allocation and free routines for cryptographic hashes. Since 1.1.0, OpenSSL does not publish the contents of the cryptohash contexts, forcing any consumers to rely on OpenSSL for all allocations. Hence, the resource owner callback mechanism gains a new set of routines to track and free cryptohash contexts when using OpenSSL, preventing any risks of leaks in the backend. Nothing is needed in the frontend thanks to the refactoring of 87ae969, and the resowner knowledge is isolated into cryptohash_openssl.c. Note that this also fixes a failure with SCRAM authentication when using FIPS in OpenSSL, but as there have been few complaints about this problem and as this causes an ABI breakage, no backpatch is done. Author: Michael Paquier Reviewed-by: Daniel Gustafsson, Heikki Linnakangas Discussion: https://postgr.es/m/20200924025314.GE7405@paquier.xyz Discussion: https://postgr.es/m/20180911030250.GA27115@paquier.xyz
* Fix compilation warnings in cryptohash_openssl.cMichael Paquier2020-12-02
| | | | | | | These showed up with -O2. Oversight in 87ae969. Author: Fujii Masao Discussion: https://postgr.es/m/cee3df00-566a-400c-1252-67c3701f918a@oss.nttdata.com
* Move SHA2 routines to a new generic API layer for crypto hashesMichael Paquier2020-12-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two new routines to allocate a hash context and to free it are created, as these become necessary for the goal behind this refactoring: switch the all cryptohash implementations for OpenSSL to use EVP (for FIPS and also because upstream does not recommend the use of low-level cryptohash functions for 20 years). Note that OpenSSL hides the internals of cryptohash contexts since 1.1.0, so it is necessary to leave the allocation to OpenSSL itself, explaining the need for those two new routines. This part is going to require more work to properly track hash contexts with resource owners, but this not introduced here. Still, this refactoring makes the move possible. This reduces the number of routines for all SHA2 implementations from twelve (SHA{224,256,386,512} with init, update and final calls) to five (create, free, init, update and final calls) by incorporating the hash type directly into the hash context data. The new cryptohash routines are moved to a new file, called cryptohash.c for the fallback implementations, with SHA2 specifics becoming a part internal to src/common/. OpenSSL specifics are part of cryptohash_openssl.c. This infrastructure is usable for more hash types, like MD5 or HMAC. Any code paths using the internal SHA2 routines are adapted to report correctly errors, which are most of the changes of this commit. The zones mostly impacted are checksum manifests, libpq and SCRAM. Note that e21cbb4 was a first attempt to switch SHA2 to EVP, but it lacked the refactoring needed for libpq, as done here. This patch has been tested on Linux and Windows, with and without OpenSSL, and down to 1.0.1, the oldest version supported on HEAD. Author: Michael Paquier Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/20200924025314.GE7405@paquier.xyz
* Add support for abstract Unix-domain socketsPeter Eisentraut2020-11-25
| | | | | | | | | | This is a variant of the normal Unix-domain sockets that don't use the file system but a separate "abstract" namespace. At the user interface, such sockets are represented by names starting with "@". Supported on Linux and Windows right now. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/6dee8574-b0ad-fc49-9c8c-2edc796f0033@2ndquadrant.com
* Fix minor issues with new unicode {de,re}composition codeMichael Paquier2020-11-07
| | | | | | | | | | | | The table generation script would incorrectly complain in the recomposition sorting when matching code points. This would not have caused the generation of an incorrect table. Note that this condition is not reachable yet, but could have been reached with future updates. pg_bswap.h does not need to be included in the frontend.x Author: John Naylor Discussion: https://postgr.es/m/CAFBsxsGWmExpvv=61vtDKCs7+kBbhkwBDL2Ph9CacziFKnV_yw@mail.gmail.com
* Second thoughts on TOAST decompression.Tom Lane2020-11-02
| | | | | | | | | On detecting a corrupted match tag, pglz_decompress() should just summarily return -1. Breaking out of the loop, as I did in dfc797730, doesn't quite guarantee that will happen. Also, we can use unlikely() on that check, just in case it helps. Backpatch to v13, like the previous patch.
* Fix two issues in TOAST decompression.Tom Lane2020-11-01
| | | | | | | | | | | | | | | | | | | | | | | | | | pglz_maximum_compressed_size() potentially underestimated the amount of compressed data required to produce N bytes of decompressed data; this is a fault in commit 11a078cf8. Separately from that, pglz_decompress() failed to protect itself against corrupt compressed data, particularly off == 0 in a match tag. Commit c60e520f6 turned such a situation into an infinite loop, where before it'd just have resulted in garbage output. The combination of these two bugs seems like it may explain bug #16694 from Tom Vijlbrief, though it's impossible to be quite sure without direct inspection of the failing session. (One needs to assume that the pglz_maximum_compressed_size() bug caused us to fail to fetch the second byte of a match tag, and what happened to be there instead was a zero. The reported infinite loop is hard to explain without off == 0, though.) Aside from fixing the bugs, rewrite associated comments for more clarity. Back-patch to v13 where both these commits landed. Discussion: https://postgr.es/m/16694-f107871e499ec114@postgresql.org
* Fix issue with --enable-coverage and the new unicode {de,re}composition codeMichael Paquier2020-10-24
| | | | | | | | | | | | | | | | | | | genhtml has been generating the following warning with this new code: WARNING: function data mismatch at /path/src/common/unicode_norm.c:102 HTML coverage reports care about the uniqueness of functions defined in source files, ignoring any assumptions around CFLAGS. 783f0cc introduced a duplicated definition of get_code_entry(), leading to a warning and potentially some incorrect data generated in the reports. This refactors the code so as the code has only one function declaration, fixing the warning. Oversight in 783f0cc. Reported-by: Tom Lane Author: Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/207789.1603469272@sss.pgh.pa.us
* Improve performance of Unicode {de,re}composition in the backendMichael Paquier2020-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | This replaces the existing binary search with two perfect hash functions for the composition and the decomposition in the backend code, at the cost of slightly-larger binaries there (35kB in libpgcommon_srv.a). Per the measurements done, this improves the speed of the recomposition and decomposition by up to 30~40 times for the NFC and NFKC conversions, while all other operations get at least 40% faster. This is not as "good" as what libicu has, but it closes the gap a lot as per the feedback from Daniel Verite. The decomposition table remains the same, getting used for the binary search in the frontend code, where we care more about the size of the libraries like libpq over performance as this gets involved only in code paths related to the SCRAM authentication. In consequence, note that the perfect hash function for the recomposition needs to use a new inverse lookup array back to to the existing decomposition table. The size of all frontend deliverables remains unchanged, even with --enable-debug, including libpq. Author: John Naylor Reviewed-by: Michael Paquier, Tom Lane Discussion: https://postgr.es/m/CAFBsxsHUuMFCt6-pU+oG-F1==CmEp8wR+O+bRouXWu6i8kXuqA@mail.gmail.com
* Fix -Wcast-function-type warnings on Windows/MinGWPeter Eisentraut2020-10-21
| | | | | | | | After de8feb1f3a23465b5737e8a8c160e8ca62f61339, some warnings remained that were only visible when using GCC on Windows. Fix those as well. Note that the ecpg test source files don't use the full pg_config.h, so we can't use pg_funcptr_t there but have to do it the long way.
* Fix compilation warning in unicode_norm.cMichael Paquier2020-10-12
| | | | | | | | | | | | | | 80f8eb7 has introduced in unicode_norm.c some new code that uses htonl(). On at least some FreeBSD environments, it is possible to find that this function is undeclared, causing a compilation warning. It is worth noting that no buildfarm members have reported this issue. Instead of adding a new inclusion to arpa/inet.h, switch to use the equivalent defined in pg_bswap.h, to benefit from any built-in function if the compiler has one. Reported-by: Masahiko Sawada Discussion: https://postgr.es/m/CA+fd4k7D4b12ShywWj=AbcHZzV1-OqMjNe7RZAu+tgz5rd_11A@mail.gmail.com
* Use perfect hash for NFC and NFKC Unicode Normalization quick checkMichael Paquier2020-10-11
| | | | | | | | | | | | This makes the normalization quick check about 30% faster for NFC and 50% faster for NFKC than the binary search used previously. The hash lookup reuses the existing array of bit fields used for the binary search to get the quick check property and is generated as part of "make update-unicode" in src/common/unicode/. Author: John Naylor Reviewed-by: Mark Dilger, Michael Paquier Discussion: https://postgr.es/m/CACPNZCt4fbJ0_bGrN5QPt34N4whv=mszM0LMVQdoa2rC9UMRXA@mail.gmail.com
* Remove logging.c from the shared library of src/common/Michael Paquier2020-10-01
| | | | | | | | | | | | | | | | As fe0a1dc has proved, it is not a good concept to add to libpq dependencies that would enforce the error output to a central logging facility because it breaks the promise of reporting the error back to an application in a consistent way, with the application to potentially exit() suddenly if using pieces from for example jsonapi.c. prairiedog has allowed to report an actual design problem with fe0a1dc, but it will not be around forever, so removing logging.c from libpgcommon_shlib is a simple and much better long-term way to prevent any attempt to load the central logging in libraries with general purposes. Author: Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/20200928073330.GC2316@paquier.xyz
* Revert "Change SHA2 implementation based on OpenSSL to use EVP digest routines"Michael Paquier2020-09-29
| | | | | | | | | | | | | | | | | This reverts commit e21cbb4, as the switch to EVP routines requires a more careful design where we would need to have at least our wrapper routines return a status instead of issuing an error by themselves to let the caller do the error handling. The memory handling was also incorrect and could cause leaks in the backend if a failure happened, requiring most likely a callback to do the necessary cleanup as the only clean way to be able to allocate an EVP context requires the use of an allocation within OpenSSL. The potential rework of the wrappers also impacts the fallback implementation when not building with OpenSSL. Originally, prairiedog has reported a compilation failure, but after discussion with Tom Lane this needs a better design. Discussion: https://postgr.es/m/20200928073330.GC2316@paquier.xyz
* Change SHA2 implementation based on OpenSSL to use EVP digest routinesMichael Paquier2020-09-28
| | | | | | | | | | | | | The use of low-level hash routines is not recommended by upstream OpenSSL since 2000, and pgcrypto already switched to EVP as of 5ff4a67. Note that this also fixes a failure with SCRAM authentication when using FIPS in OpenSSL, but as there have been few complaints about this problem and as this causes an ABI breakage, no backpatch is done. Author: Michael Paquier, Alessandro Gherardi Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/20200924025314.GE7405@paquier.xyz Discussion: https://postgr.es/m/20180911030250.GA27115@paquier.xyz
* Rethink API for pg_get_line.c, one more time.Tom Lane2020-09-22
| | | | | | | | | | | | Further experience says that the appending behavior offered by pg_get_line_append is useful to only a very small minority of callers. For most, the requirement to reset the buffer after each line is just an error-prone nuisance. Hence, invent another alternative call pg_get_line_buf, which takes care of that detail. Noted while reviewing a patch from Daniel Gustafsson. Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
* Allow most keywords to be used as column labels without requiring AS.Tom Lane2020-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | Up to now, if you tried to omit "AS" before a column label in a SELECT list, it would only work if the column label was an IDENT, that is not any known keyword. This is rather unfriendly considering that we have so many keywords and are constantly growing more. In the wake of commit 1ed6b8956 it's possible to improve matters quite a bit. We'd originally tried to make this work by having some of the existing keyword categories be allowed without AS, but that didn't work too well, because each category contains a few special cases that don't work without AS. Instead, invent an entirely orthogonal keyword property "can be bare column label", and mark all keywords that way for which we don't get shift/reduce errors by doing so. It turns out that of our 450 current keywords, all but 39 can be made bare column labels, improving the situation by over 90%. This number might move around a little depending on future grammar work, but it's a pretty nice improvement. Mark Dilger, based on work by myself and Robert Haas; review by John Naylor Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
* Improve common/logging.c's support for multiple verbosity levels.Tom Lane2020-09-17
| | | | | | | | | | | | | | | | | | Instead of hard-wiring specific verbosity levels into the option processing of client applications, invent pg_logging_increase_verbosity() and encourage clients to implement --verbose by calling that. Then, the common convention that more -v's gets you more verbosity just works. In particular, this allows resurrection of the debug-grade messages that have long existed in pg_dump and its siblings. They were unreachable before this commit due to lack of a way to select PG_LOG_DEBUG logging level. (It appears that they may have been unreachable for some time before common/logging.c was introduced, too, so I'm not specifically blaming cc8d41511 for the oversight. One reason for thinking that is that it's now apparent that _allocAH()'s message needs a null-pointer guard. Testing might have failed to reveal that before 96bf88d52.) Discussion: https://postgr.es/m/1173106.1600116625@sss.pgh.pa.us
* Skip unnecessary stat() calls in walkdir().Thomas Munro2020-09-07
| | | | | | | | | | | Some kernels can tell us the type of a "dirent", so we can avoid a call to stat() or lstat() in many cases. Define a new function get_dirent_type() to contain that logic, for use by the backend and frontend versions of walkdir(), and perhaps other callers in future. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BFzxupGGN4GpUdbzZN%2Btn6FQPHo8w0Q%2BAPH5Wz8RG%2Bww%40mail.gmail.com
* Refactor pg_get_line() to expose an alternative StringInfo-based API.Tom Lane2020-09-06
| | | | | | | | | | | | | | | | | Letting the caller provide a StringInfo to read into is helpful when the caller needs to merge lines or otherwise modify the data after it's been read. Notably, now the code added by commit 8f8154a50 can use pg_get_line_append() instead of having its own copy of that logic. A follow-on commit will also make use of this. Also, since StringInfo buffers are a minimum of 1KB long, blindly using pg_get_line() in a loop can eat a lot more memory than one would expect. I discovered for instance that commit e0f05cd5b caused initdb to consume circa 10MB to read postgres.bki, even though that's under 1MB worth of data. A less memory-hungry alternative is to re-use the same StringInfo for all lines and pg_strdup the results. Discussion: https://postgr.es/m/1315832.1599345736@sss.pgh.pa.us
* Remove arbitrary restrictions on password length.Tom Lane2020-09-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch started out with the goal of harmonizing various arbitrary limits on password length, but after awhile a better idea emerged: let's just get rid of those fixed limits. recv_password_packet() has an arbitrary limit on the packet size, which we don't really need, so just drop it. (Note that this doesn't really affect anything for MD5 or SCRAM password verification, since those will hash the user's password to something shorter anyway. It does matter for auth methods that require a cleartext password.) Likewise remove the arbitrary error condition in pg_saslprep(). The remaining limits are mostly in client-side code that prompts for passwords. To improve those, refactor simple_prompt() so that it allocates its own result buffer that can be made as big as necessary. Actually, it proves best to make a separate routine pg_get_line() that has essentially the semantics of fgets(), except that it allocates a suitable result buffer and hence will never return a truncated line. (pg_get_line has a lot of potential applications to replace randomly-sized fgets buffers elsewhere, but I'll leave that for another patch.) I built pg_get_line() atop stringinfo.c, which requires moving that code to src/common/; but that seems fine since it was a poor fit for src/port/ anyway. This patch is mostly mine, but it owes a good deal to Nathan Bossart who pressed for a solution to the password length problem and created a predecessor patch. Also thanks to Peter Eisentraut and Stephen Frost for ideas and discussion. Discussion: https://postgr.es/m/09512C4F-8CB9-4021-B455-EF4C4F0D55A0@amazon.com
* Replace remaining StrNCpy() by strlcpy()Peter Eisentraut2020-08-10
| | | | | | | | | | | | | | | | | They are equivalent, except that StrNCpy() zero-fills the entire destination buffer instead of providing just one trailing zero. For all but a tiny number of callers, that's just overhead rather than being desirable. Remove StrNCpy() as it is now unused. In some cases, namestrcpy() is the more appropriate function to use. While we're here, simplify the API of namestrcpy(): Remove the return value, don't check for NULL input. Nothing was using that anyway. Also, remove a few unused name-related functions. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/44f5e198-36f6-6cdb-7fa9-60e34784daae%402ndquadrant.com
* Prevent compilation of frontend-only files in src/common/ with backendMichael Paquier2020-06-30
| | | | | | | | | | Any frontend-only file of src/common/ should include a protection to prevent such code to be included in the backend compilation. fe_memutils.c and restricted_token.c have been doing that, while file_utils.c (since bf5bb2e) and logging.c (since fc9a62a) forgot it. Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/20200625080757.GI130132@paquier.xyz