aboutsummaryrefslogtreecommitdiff
path: root/src/backend/replication/logical/reorderbuffer.c
Commit message (Collapse)AuthorAge
...
* Improve hash_create's API for selecting simple-binary-key hash functions.Tom Lane2014-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, if you wanted anything besides C-string hash keys, you had to specify a custom hashing function to hash_create(). Nearly all such callers were specifying tag_hash or oid_hash; which is tedious, and rather error-prone, since a caller could easily miss the opportunity to optimize by using hash_uint32 when appropriate. Replace this with a design whereby callers using simple binary-data keys just specify HASH_BLOBS and don't need to mess with specific support functions. hash_create() itself will take care of optimizing when the key size is four bytes. This nets out saving a few hundred bytes of code space, and offers a measurable performance improvement in tidbitmap.c (which was not exploiting the opportunity to use hash_uint32 for its 4-byte keys). There might be some wins elsewhere too, I didn't analyze closely. In future we could look into offering a similar optimized hashing function for 8-byte keys. Under this design that could be done in a centralized and machine-independent fashion, whereas getting it right for keys of platform-dependent sizes would've been notationally painful before. For the moment, the old way still works fine, so as not to break source code compatibility for loadable modules. Eventually we might want to remove tag_hash and friends from the exported API altogether, since there's no real need for them to be explicitly referenced from outside dynahash.c. Teodor Sigaev and Tom Lane
* Fix assorted confusion between Oid and int32.Tom Lane2014-12-11
| | | | | | | | | | | In passing, also make some debugging elog's in pgstat.c a bit more consistently worded. Back-patch as far as applicable (9.3 or 9.4; none of these mistakes are really old). Mark Dilger identified and patched the type violations; the message rewordings are mine.
* Revamp the WAL record format.Heikki Linnakangas2014-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each WAL record now carries information about the modified relation and block(s) in a standardized format. That makes it easier to write tools that need that information, like pg_rewind, prefetching the blocks to speed up recovery, etc. There's a whole new API for building WAL records, replacing the XLogRecData chains used previously. The new API consists of XLogRegister* functions, which are called for each buffer and chunk of data that is added to the record. The new API also gives more control over when a full-page image is written, by passing flags to the XLogRegisterBuffer function. This also simplifies the XLogReadBufferForRedo() calls. The function can dig the relation and block number from the WAL record, so they no longer need to be passed as arguments. For the convenience of redo routines, XLogReader now disects each WAL record after reading it, copying the main data part and the per-block data into MAXALIGNed buffers. The data chunks are not aligned within the WAL record, but the redo routines can assume that the pointers returned by XLogRecGet* functions are. Redo routines are now passed the XLogReaderState, which contains the record in the already-disected format, instead of the plain XLogRecord. The new record format also makes the fixed size XLogRecord header smaller, by removing the xl_len field. The length of the "main data" portion is now stored at the end of the WAL record, and there's a separate header after XLogRecord for it. The alignment padding at the end of XLogRecord is also removed. This compansates for the fact that the new format would otherwise be more bulky than the old format. Reviewed by Andres Freund, Amit Kapila, Michael Paquier, Alvaro Herrera, Fujii Masao.
* Fix and improve cache invalidation logic for logical decoding.Andres Freund2014-11-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | There are basically three situations in which logical decoding needs to perform cache invalidation. During/After replaying a transaction with catalog changes, when skipping a uninteresting transaction that performed catalog changes and when erroring out while replaying a transaction. Unfortunately these three cases were all done slightly differently - partially because 8de3e410fa, which greatly simplifies matters, got committed in the midst of the development of logical decoding. The actually problematic case was when logical decoding skipped transaction commits (and thus processed invalidations). When used via the SQL interface cache invalidation could access the catalog - bad, because we didn't set up enough state to allow that correctly. It'd not be hard to setup sufficient state, but the simpler solution is to always perform cache invalidation outside a valid transaction. Also make the different cache invalidation cases look as similar as possible, to ease code review. This fixes the assertion failure reported by Antonin Houska in 53EE02D9.7040702@gmail.com. The presented testcase has been expanded into a regression test. Backpatch to 9.4, where logical decoding was introduced.
* Message improvementsPeter Eisentraut2014-11-11
|
* Assorted message fixes and improvementsPeter Eisentraut2014-09-05
|
* Fix decoding of MULTI_INSERTs when rows other than the last are toasted.Andres Freund2014-07-06
| | | | | | | | | | | | | | | | | | | | | | | When decoding the results of a HEAP2_MULTI_INSERT (currently only generated by COPY FROM) toast columns for all but the last tuple weren't replaced by their actual contents before being handed to the output plugin. The reassembled toast datums where disregarded after every REORDER_BUFFER_CHANGE_(INSERT|UPDATE|DELETE) which is correct for plain inserts, updates, deletes, but not multi inserts - there we generate several REORDER_BUFFER_CHANGE_INSERTs for a single xl_heap_multi_insert record. To solve the problem add a clear_toast_afterwards boolean to ReorderBufferChange's union member that's used by modifications. All row changes but multi_inserts always set that to true, but multi_insert sets it only for the last change generated. Add a regression test covering decoding of multi_inserts - there was none at all before. Backpatch to 9.4 where logical decoding was introduced. Bug found by Petr Jelinek.
* Rename logical decoding's pg_llog directory to pg_logical.Andres Freund2014-07-02
| | | | | | | | | | | | | | | The old name wasn't very descriptive as of actual contents of the directory, which are historical snapshots in the snapshots/ subdirectory and mappingdata for rewritten tuples in mappings/. There's been a fair amount of discussion what would be a good name. I'm settling for pg_logical because it's likely that further data around logical decoding and replication will need saving in the future. Also add the missing entry for the directory into storage.sgml's list of PGDATA contents. Bumps catversion as the data directories won't be compatible.
* Fix a bunch of functions that were declared static then defined not-static.Tom Lane2014-05-17
| | | | Per testing with a compiler that whines about this.
* Code review for logical decoding patch.Robert Haas2014-05-09
| | | | | | | | | | Post-commit review identified a number of places where addition was used instead of multiplication or memory wasn't zeroed where it should have been. This commit also fixes one case where a structure member was mis-initialized, and moves another memory allocation closer to the place where the allocated storage is used for clarity. Andres Freund
* pgindent run for 9.4Bruce Momjian2014-05-06
| | | | | This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
* Pass sensible value to memset() when randomizing reorderbuffer's tuple slab.Heikki Linnakangas2014-05-05
| | | | | | This is entirely harmless, but still wrong. Noticed by coverity. Andres Freund
* Improve error messages in reorderbuffer.c.Tom Lane2014-04-30
| | | | | | | | Be more clear about failure cases in relfilenode->relation lookup, and fix some other places that were inconsistent or not per our message style guidelines. Andres Freund and Tom Lane
* Rationalize common/relpath.[hc].Tom Lane2014-04-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a73018392636ce832b09b5c31f6ad1f18a4643ea created rather a mess by putting dependencies on backend-only include files into include/common. We really shouldn't do that. To clean it up: * Move TABLESPACE_VERSION_DIRECTORY back to its longtime home in catalog/catalog.h. We won't consider this symbol part of the FE/BE API. * Push enum ForkNumber from relfilenode.h into relpath.h. We'll consider relpath.h as the source of truth for fork numbers, since relpath.c was already partially serving that function, and anyway relfilenode.h was kind of a random place for that enum. * So, relfilenode.h now includes relpath.h rather than vice-versa. This direction of dependency is fine. (That allows most, but not quite all, of the existing explicit #includes of relpath.h to go away again.) * Push forkname_to_number from catalog.c to relpath.c, just to centralize fork number stuff a bit better. * Push GetDatabasePath from catalog.c to relpath.c; it was rather odd that the previous commit didn't keep this together with relpath(). * To avoid needing relfilenode.h in common/, redefine the underlying function (now called GetRelationPath) as taking separate OID arguments, and make the APIs using RelFileNode or RelFileNodeBackend into macro wrappers. (The macros have a potential multiple-eval risk, but none of the existing call sites have an issue with that; one of them had such a risk already anyway.) * Fix failure to follow the directions when "init" fork type was added; specifically, the errhint in forkname_to_number wasn't updated, and neither was the SGML documentation for pg_relation_size(). * Fix tablespace-path-too-long check in CreateTableSpace() to account for fork-name component of maximum-length pathnames. This requires putting FORKNAMECHARS into a header file, but it was rather useless (and actually unreferenced) where it was. The last couple of items are potentially back-patchable bug fixes, if anyone is sufficiently excited about them; but personally I'm not. Per a gripe from Christoph Berg about how include/common wasn't self-contained.
* Remove unportable use of anonymous unions from reorderbuffer.h.Tom Lane2014-03-07
| | | | | | | | | | | | | In b89e151054a I had assumed it was ok to use anonymous unions as struct members, but while a longstanding extension in many compilers, it's only been standardized in C11. To fix, remove one of the anonymous unions which tried to hide some implementation specific enum values and give the other a name. The latter unfortunately requires changes in output plugins, but since the feature has only been added a few days ago... Andres Freund
* Fix some typos introduced by the logical decoding patch.Robert Haas2014-03-05
| | | | Erik Rijkers
* Introduce logical decoding.Robert Haas2014-03-03
This feature, building on previous commits, allows the write-ahead log stream to be decoded into a series of logical changes; that is, inserts, updates, and deletes and the transactions which contain them. It is capable of handling decoding even across changes to the schema of the effected tables. The output format is controlled by a so-called "output plugin"; an example is included. To make use of this in a real replication system, the output plugin will need to be modified to produce output in the format appropriate to that system, and to perform filtering. Currently, information can be extracted from the logical decoding system only via SQL; future commits will add the ability to stream changes via walsender. Andres Freund, with review and other contributions from many other people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan, Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve Singer.