aboutsummaryrefslogtreecommitdiff
path: root/src/backend/access/spgist
Commit message (Collapse)AuthorAge
* Fix typos in comments, code and documentationMichael Paquier2023-01-03
| | | | | | | | | | While on it, newlines are removed from the end of two elog() strings. The others are simple grammar mistakes. One comment in pg_upgrade referred incorrectly to sequences since a7e5457. Author: Justin Pryzby Discussion: https://postgr.es/m/20221230231257.GI1153@telsasoft.com Backpatch-through: 11
* Replace RelationOpenSmgr() with RelationGetSmgr().Tom Lane2022-11-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a back-patch of the v15-era commit f10f0ae42 into older supported branches. The idea is to design out bugs in which an ill-timed relcache flush clears rel->rd_smgr partway through some code sequence that wasn't expecting that. We had another report today of a corner case that reliably crashes v14 under debug_discard_caches (nee CLOBBER_CACHE_ALWAYS), and therefore would crash once in a blue moon in the field. We're unlikely to get rid of all such code paths unless we adopt the more rigorous coding rules instituted by f10f0ae42. Therefore, even though this is a bit invasive, it's time to back-patch. Some comfort can be taken in the fact that f10f0ae42 has been in v15 for 16 months without problems. I left the RelationOpenSmgr macro present in the back branches, even though no core code should use it anymore, in order to not break third-party extensions in minor releases. Such extensions might opt to start using RelationGetSmgr instead, to reduce their code differential between v15 and earlier branches. This carries a hazard of failing to compile against headers from existing minor releases. However, once compiled the extension should work fine even with such releases, because RelationGetSmgr is a "static inline" function so it creates no link-time dependency. So depending on distribution practices, that might be an OK tradeoff. Per report from Spyridon Dimitrios Agathos. Original patch by Amul Sul. Discussion: https://postgr.es/m/CAFM5RaqdgyusQvmWkyPYaWMwoK5gigdtW-7HcgHgOeAw7mqJ_Q@mail.gmail.com Discussion: https://postgr.es/m/CANiYTQsU7yMFpQYnv=BrcRVqK_3U3mtAzAsJCaqtzsDHfsUbdQ@mail.gmail.com
* Fix SP-GiST scan initialization logic for binary-compatible cases.Tom Lane2021-11-20
| | | | | | | | | | | | | | | | | | | | | | | Commit ac9099fc1 rearranged the logic in spgGetCache() that determines the index's attType (nominal input data type) and leafType (actual type stored in leaf index tuples). Turns out this broke things for the case where (a) the actual input data type is different from the nominal type, (b) the opclass's config function leaves leafType defaulted, and (c) the opclass has no "compress" function. (b) caused us to assign the actual input data type as leafType, and then since that's not attType, we complained that a "compress" function is required. For non-polymorphic opclasses, condition (a) arises in binary-compatible cases, such as using SP-GiST text_ops for a varchar column, or using any opclass on a domain over its nominal input type. To fix, use attType for leafType when the index's declared column type is different from but binary-compatible with attType. Do this only in the defaulted-leafType case, to avoid overriding any explicit selection made by the opclass. Per bug #17294 from Ilya Anfimov. Back-patch to v14. Discussion: https://postgr.es/m/17294-8f6c7962ce877edc@postgresql.org
* Count SP-GiST index scans in pg_stat statistics.Tom Lane2021-08-27
| | | | | | | | | | | | | | | | | | | Somehow, spgist overlooked the need to call pgstat_count_index_scan(). Hence, pg_stat_all_indexes.idx_scan and equivalent columns never became nonzero for an SP-GiST index, although the related per-tuple counters worked fine. This fix works a bit differently from other index AMs, in that the counter increment occurs in spgrescan not spggettuple/spggetbitmap. It looks like this won't make the user-visible semantics noticeably different, so I won't go to the trouble of introducing an is-this- the-first-call flag just to make the counter bumps happen in the same places. Per bug #17163 from Christian Quest. Back-patch to all supported versions. Discussion: https://postgr.es/m/17163-b8c5cc88322a5e92@postgresql.org
* Re-order pg_attribute columns to eliminate some padding space.Tom Lane2021-05-23
| | | | | | | | | | | | | | | | | | Now that attcompression is just a char, there's a lot of wasted padding space after it. Move it into the group of char-wide columns to save a net of 4 bytes per pg_attribute entry. While we're at it, swap the order of attstorage and attalign to make for a more logical grouping of these columns. Also re-order actions in related code to match the new field ordering. This patch also fixes one outright bug: equalTupleDescs() failed to compare attcompression. That could, for example, cause relcache reload to fail to adopt a new value following a change. Michael Paquier and Tom Lane, per a gripe from Andres Freund. Discussion: https://postgr.es/m/20210517204803.iyk5wwvwgtjcmc5w@alap3.anarazel.de
* Prevent infinite insertion loops in spgdoinsert().Tom Lane2021-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Formerly we just relied on operator classes that assert longValuesOK to eventually shorten the leaf value enough to fit on an index page. That fails since the introduction of INCLUDE-column support (commit 09c1c6ab4), because the INCLUDE columns might alone take up more than a page, meaning no amount of leaf-datum compaction will get the job done. At least with spgtextproc.c, that leads to an infinite loop, since spgtextproc.c won't throw an error for not being able to shorten the leaf datum anymore. To fix without breaking cases that would otherwise work, add logic to spgdoinsert() to verify that the leaf tuple size is decreasing after each "choose" step. Some opclasses might not decrease the size on every single cycle, and in any case, alignment roundoff of the tuple size could obscure small gains. Therefore, allow up to 10 cycles without additional savings before throwing an error. (Perhaps this number will need adjustment, but it seems quite generous right now.) As long as we've developed this logic, let's back-patch it. The back branches don't have INCLUDE columns to worry about, but this seems like a good defense against possible bugs in operator classes. We already know that an infinite loop here is pretty unpleasant, so having a defense seems to outweigh the risk of breaking things. (Note that spgtextproc.c is actually the only known opclass with longValuesOK support, so that this is all moot for known non-core opclasses anyway.) Per report from Dilip Kumar. Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com
* Fix query-cancel handling in spgdoinsert().Tom Lane2021-05-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Knowing that a buggy opclass could cause an infinite insertion loop, spgdoinsert() intended to allow its loop to be interrupted by query cancel. However, that never actually worked, because in iterations after the first, we'd be holding buffer lock(s) which would cause InterruptHoldoffCount to be positive, preventing servicing of the interrupt. To fix, check if an interrupt is pending, and if so fall out of the insertion loop and service the interrupt after we've released the buffers. If it was indeed a query cancel, that's the end of the matter. If it was a non-canceling interrupt reason, make use of the existing provision to retry the whole insertion. (This isn't as wasteful as it might seem, since any upper-level index tuples we already created should be usable in the next attempt.) While there's no known instance of such a bug in existing release branches, it still seems like a good idea to back-patch this to all supported branches, since the behavior is fairly nasty if a loop does happen --- not only is it uncancelable, but it will quickly consume memory to the point of an OOM failure. In any case, this code is certainly not working as intended. Per report from Dilip Kumar. Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com
* Remove redundant memset(0) calls for page init of some index AMsMichael Paquier2021-04-07
| | | | | | | | | | | | | | | | | | | Bloom, GIN, GiST and SP-GiST rely on PageInit() to initialize the contents of a page, and this routine fills entirely a page with zeros for a size of BLCKSZ, including the special space. Those index AMs have been using an extra memset() call to fill with zeros the special page space, or even the whole page, which is not necessary as PageInit() already does this work, so let's remove them. GiST was not doing this extra call, but has commented out a system call that did so since 6236991. While on it, remove one MAXALIGN() for SP-GiST as PageInit() takes care of that. This makes the whole page initialization logic more consistent across all index AMs. Author: Bharath Rupireddy Reviewed-by: Vignesh C, Mahendra Singh Thalor Discussion: https://postgr.es/m/CALj2ACViOo2qyaPT7krWm4LRyRTw9kOXt+g6PfNmYuGA=YHj9A@mail.gmail.com
* Support INCLUDE'd columns in SP-GiST.Tom Lane2021-04-05
| | | | | | | | | | | | | Not much to say here: does what it says on the tin. We steal a previously-always-zero bit from the nextOffset field of leaf index tuples in order to track whether there is a nulls bitmap. Otherwise it works about like included columns in other index types. Pavel Borisov, reviewed by Andrey Borodin and Anastasia Lubennikova, and rather heavily editorialized on by me Discussion: https://postgr.es/m/CALT9ZEFi-vMp4faht9f9Junb1nO3NOSjhpxTmbm1UGLMsLqiEQ@mail.gmail.com
* Fix more confusion in SP-GiST.Tom Lane2021-04-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | spg_box_quad_leaf_consistent unconditionally returned the leaf datum as leafValue, even though in its usage for poly_ops that value is of completely the wrong type. In versions before 12, that was harmless because the core code did nothing with leafValue in non-index-only scans ... but since commit 2a6368343, if we were doing a KNN-style scan, spgNewHeapItem would unconditionally try to copy the value using the wrong datatype parameters. Said copying is a waste of time and space if we're not going to return the data, but it accidentally failed to fail until I fixed the datatype confusion in ac9099fc1. Hence, change spgNewHeapItem to not copy the datum unless we're actually going to return it later. This saves cycles and dodges the question of whether lossy opclasses are returning the right type. Also change spg_box_quad_leaf_consistent to not return data that might be of the wrong type, as insurance against somebody introducing a similar bug into the core code in future. It seems like a good idea to back-patch these two changes into v12 and v13, although I'm afraid to change spgNewHeapItem's mistaken idea of which datatype to use in those branches. Per buildfarm results from ac9099fc1. Discussion: https://postgr.es/m/3728741.1617381471@sss.pgh.pa.us
* Fix confusion in SP-GiST between attribute type and leaf storage type.Tom Lane2021-04-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to the documentation, the attType passed to the opclass config function (and also relied on by the core code) is the type of the heap column or expression being indexed. But what was actually being passed was the type stored for the index column. This made no difference for user-defined SP-GiST opclasses, because we weren't allowing the STORAGE clause of CREATE OPCLASS to be used, so the two types would be the same. But it's silly not to allow that, seeing that the built-in poly_ops opclass has a different value for opckeytype than opcintype, and that if you want to do lossy storage then the types must really be different. (Thus, user-defined opclasses doing lossy storage had to lie about what type is in the index.) Hence, remove the restriction, and make sure that we use the input column type not opckeytype where relevant. For reasons of backwards compatibility with existing user-defined opclasses, we can't quite insist that the specified leafType match the STORAGE clause; instead just add an amvalidate() warning if they don't match. Also fix some bugs that would only manifest when trying to return index entries when attType is different from attLeafType. It's not too surprising that these have not been reported, because the only usual reason for such a difference is to store the leaf value lossily, rendering index-only scans impossible. Add a src/test/modules module to exercise cases where attType is different from attLeafType and yet index-only scan is supported. Discussion: https://postgr.es/m/3728741.1617381471@sss.pgh.pa.us
* Rethink handling of pass-by-value leaf datums in SP-GiST.Tom Lane2021-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing convention in SP-GiST is that any pass-by-value datatype is stored in Datum representation, i.e. it's of width sizeof(Datum) even when typlen is less than that. This is okay, or at least it's too late to change it, for prefix datums and node-label datums in inner (upper) tuples. But it's problematic for leaf datums, because we'd prefer those to be stored in Postgres' standard on-disk representation so that we can easily extend leaf tuples to carry additional "included" columns. I believe, however, that we can get away with just up and changing that. This would be an unacceptable on-disk-format break, but there are two big mitigating factors: 1. It seems quite unlikely that there are any SP-GiST opclasses out there that use pass-by-value leaf datatypes. Certainly none of the ones in core do, nor has codesearch.debian.net heard of any. Given what SP-GiST is good for, it's hard to conceive of a use-case where the leaf-level values would be both small and fixed-width. (As an example, if you wanted to index text values with the leaf level being just a byte, then every text string would have to be represented with one level of inner tuple per preceding byte, which would be horrendously space-inefficient and slow to access. You always want to use as few inner-tuple levels as possible, leaving as much as possible in the leaf values.) 2. Even granting that you have such an index, this change only breaks things on big-endian machines. On little-endian, the high order bytes of the Datum format will now just appear to be alignment padding space. So, change the code to store pass-by-value leaf datums in their usual on-disk form. Inner-tuple datums are not touched. This is extracted from a larger patch that intends to add support for "included" columns. I'm committing it separately for visibility in our commit logs. Pavel Borisov and Tom Lane, reviewed by Andrey Borodin Discussion: https://postgr.es/m/CALT9ZEFi-vMp4faht9f9Junb1nO3NOSjhpxTmbm1UGLMsLqiEQ@mail.gmail.com
* VACUUM VERBOSE: Count "newly deleted" index pages.Peter Geoghegan2021-02-25
| | | | | | | | | | | | | | | | | | | | | | Teach VACUUM VERBOSE to report on pages deleted by the _current_ VACUUM operation -- these are newly deleted pages. VACUUM VERBOSE continues to report on the total number of deleted pages in the entire index (no change there). The former is a subset of the latter. The distinction between each category of deleted index page only arises with index AMs where page deletion is supported and is decoupled from page recycling for performance reasons. This is follow-up work to commit e5d8a999, which made nbtree store 64-bit XIDs (not 32-bit XIDs) in pages at the point at which they're deleted. Note that the btm_last_cleanup_num_delpages metapage field added by that commit usually gets set to pages_newly_deleted. The exceptions (the scenarios in which they're not equal) all seem to be tricky cases for the implementation (of page deletion and recycling) in general. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-WznpdHvujGUwYZ8sihX%3Dd5u-tRYhi-F4wnV2uN2zHpMUXw%40mail.gmail.com
* Pass down "logically unchanged index" hint.Peter Geoghegan2021-01-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an executor aminsert() hint mechanism that informs index AMs that the incoming index tuple (the tuple that accompanies the hint) is not being inserted by execution of an SQL statement that logically modifies any of the index's key columns. The hint is received by indexes when an UPDATE takes place that does not apply an optimization like heapam's HOT (though only for indexes where all key columns are logically unchanged). Any index tuple that receives the hint on insert is expected to be a duplicate of at least one existing older version that is needed for the same logical row. Related versions will typically be stored on the same index page, at least within index AMs that apply the hint. Recognizing the difference between MVCC version churn duplicates and true logical row duplicates at the index AM level can help with cleanup of garbage index tuples. Cleanup can intelligently target tuples that are likely to be garbage, without wasting too many cycles on less promising tuples/pages (index pages with little or no version churn). This is infrastructure for an upcoming commit that will teach nbtree to perform bottom-up index deletion. No index AM actually applies the hint just yet. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Victor Yegorov <vyegorov@gmail.com> Discussion: https://postgr.es/m/CAH2-Wz=CEKFa74EScx_hFVshCOn6AA5T-ajFASTdzipdkLTNQQ@mail.gmail.com
* Update copyright for 2021Bruce Momjian2021-01-02
| | | | Backpatch-through: 9.5
* Rename the "point is strictly above/below point" comparison operators.Tom Lane2020-11-23
| | | | | | | | | | | | | | | | Historically these were called >^ and <^, but that is inconsistent with the similar box, polygon, and circle operators, which are named |>> and <<| respectively. Worse, the >^ and <^ names are used for *not* strict above/below tests for the box type. Hence, invent new operators following the more common naming. The old operators remain available for now, and are still accepted by the relevant index opclasses too. But there's a deprecation notice, so maybe we can get rid of them someday. Emre Hasegeli, reviewed by Pavel Borisov Discussion: https://postgr.es/m/24348.1587444160@sss.pgh.pa.us
* snapshot scalability: Don't compute global horizons while building snapshots.Andres Freund2020-08-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To make GetSnapshotData() more scalable, it cannot not look at at each proc's xmin: While snapshot contents do not need to change whenever a read-only transaction commits or a snapshot is released, a proc's xmin is modified in those cases. The frequency of xmin modifications leads to, particularly on higher core count systems, many cache misses inside GetSnapshotData(), despite the data underlying a snapshot not changing. That is the most significant source of GetSnapshotData() scaling poorly on larger systems. Without accessing xmins, GetSnapshotData() cannot calculate accurate horizons / thresholds as it has so far. But we don't really have to: The horizons don't actually change that much between GetSnapshotData() calls. Nor are the horizons actually used every time a snapshot is built. The trick this commit introduces is to delay computation of accurate horizons until there use and using horizon boundaries to determine whether accurate horizons need to be computed. The use of RecentGlobal[Data]Xmin to decide whether a row version could be removed has been replaces with new GlobalVisTest* functions. These use two thresholds to determine whether a row can be pruned: 1) definitely_needed, indicating that rows deleted by XIDs >= definitely_needed are definitely still visible. 2) maybe_needed, indicating that rows deleted by XIDs < maybe_needed can definitely be removed GetSnapshotData() updates definitely_needed to be the xmin of the computed snapshot. When testing whether a row can be removed (with GlobalVisTestIsRemovableXid()) and the tested XID falls in between the two (i.e. XID >= maybe_needed && XID < definitely_needed) the boundaries can be recomputed to be more accurate. As it is not cheap to compute accurate boundaries, we limit the number of times that happens in short succession. As the boundaries used by GlobalVisTestIsRemovableXid() are never reset (with maybe_needed updated by GetSnapshotData()), it is likely that further test can benefit from an earlier computation of accurate horizons. To avoid regressing performance when old_snapshot_threshold is set (as that requires an accurate horizon to be computed), heap_page_prune_opt() doesn't unconditionally call TransactionIdLimitedForOldSnapshots() anymore. Both the computation of the limited horizon, and the triggering of errors (with SetOldSnapshotThresholdTimestamp()) is now only done when necessary to remove tuples. This commit just removes the accesses to PGXACT->xmin from GetSnapshotData(), but other members of PGXACT residing in the same cache line are accessed. Therefore this in itself does not result in a significant improvement. Subsequent commits will take advantage of the fact that GetSnapshotData() now does not need to access xmins anymore. Note: This contains a workaround in heap_page_prune_opt() to keep the snapshot_too_old tests working. While that workaround is ugly, the tests currently are not meaningful, and it seems best to address them separately. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Robert Haas <robertmhaas@gmail.com> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
* Invent "amadjustmembers" AM method for validating opclass members.Tom Lane2020-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | This allows AM-specific knowledge to be applied during creation of pg_amop and pg_amproc entries. Specifically, the AM knows better than core code which entries to consider as required or optional. Giving the latter entries the appropriate sort of dependency allows them to be dropped without taking out the whole opclass or opfamily; which is something we'd like to have to correct obsolescent entries in extensions. This callback also opens the door to performing AM-specific validity checks during opclass creation, rather than hoping than an opclass developer will remember to test with "amvalidate". For the most part I've not actually added any such checks yet; that can happen in a follow-on patch. (Note that we shouldn't remove any tests from "amvalidate", as those are still needed to cross-check manually constructed entries in the initdb data. So adding tests to "amadjustmembers" will be somewhat duplicative, but it seems like a good idea anyway.) Patch by me, reviewed by Alexander Korotkov, Hamid Akhtar, and Anastasia Lubennikova. Discussion: https://postgr.es/m/4578.1565195302@sss.pgh.pa.us
* Fix masking of SP-GiST pages during xlog consistency checkAlexander Korotkov2020-06-20
| | | | | | | | | | spg_mask() didn't take into account that pd_lower equal to SizeOfPageHeaderData is still valid value. This commit fixes that. Backpatch to 11, where spg_mask() pg_lower check was introduced. Reported-by: Michael Paquier Discussion: https://postgr.es/m/20200615131405.GM52676%40paquier.xyz Backpatch-through: 11
* Initial pgindent and pgperltidy run for v13.Tom Lane2020-05-14
| | | | | | | | | | | Includes some manual cleanup of places that pgindent messed up, most of which weren't per project style anyway. Notably, it seems some people didn't absorb the style rules of commit c9d297751, because there were a bunch of new occurrences of function calls with a newline just after the left paren, all with faulty expectations about how the rest of the call would get indented.
* Fix missing SP-GiST support in 911e702077Alexander Korotkov2020-03-30
| | | | | 911e702077 misses setting of amoptsprocnum for SP-GiST. This commit fixes that.
* Implement operator class parametersAlexander Korotkov2020-03-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
* Clean up newlines following left parenthesesAlvaro Herrera2020-01-30
| | | | | | | | | | | | We used to strategically place newlines after some function call left parentheses to make pgindent move the argument list a few chars to the left, so that the whole line would fit under 80 chars. However, pgindent no longer does that, so the newlines just made the code vertically longer for no reason. Remove those newlines, and reflow some of those lines for some extra naturality. Reviewed-by: Michael Paquier, Tom Lane Discussion: https://postgr.es/m/20200129200401.GA6303@alvherre.pgsql
* Introduce IndexAM fields for parallel vacuum.Amit Kapila2020-01-15
| | | | | | | | | | | | | | | | | Introduce new fields amusemaintenanceworkmem and amparallelvacuumoptions in IndexAmRoutine for parallel vacuum. The amusemaintenanceworkmem tells whether a particular IndexAM uses maintenance_work_mem or not. This will help in controlling the memory used by individual workers as otherwise, each worker can consume memory equal to maintenance_work_mem. The amparallelvacuumoptions tell whether a particular IndexAM participates in a parallel vacuum and if so in which phase (bulkdelete, vacuumcleanup) of vacuum. Author: Masahiko Sawada and Amit Kapila Reviewed-by: Dilip Kumar, Amit Kapila, Tomas Vondra and Robert Haas Discussion: https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com https://postgr.es/m/CAA4eK1LmcD5aPogzwim5Nn58Ki+74a6Edghx4Wd8hAskvHaq5A@mail.gmail.com
* Update copyrights for 2020Bruce Momjian2020-01-01
| | | | Backpatch-through: update all files in master, backpatch legal files through 9.4
* Revert "Rename files and headers related to index AM"Michael Paquier2019-12-27
| | | | | | | | This follows multiple complains from Peter Geoghegan, Andres Freund and Alvaro Herrera that this issue ought to be dug more before actually happening, if it happens. Discussion: https://postgr.es/m/20191226144606.GA5659@alvherre.pgsql
* Rename files and headers related to index AMMichael Paquier2019-12-25
| | | | | | | | | | | | | | | | | | | | | The following renaming is done so as source files related to index access methods are more consistent with table access methods (the original names used for index AMs ware too generic, and could be confused as including features related to table AMs): - amapi.h -> indexam.h. - amapi.c -> indexamapi.c. Here we have an equivalent with backend/access/table/tableamapi.c. - amvalidate.c -> indexamvalidate.c. - amvalidate.h -> indexamvalidate.h. - genam.c -> indexgenam.c. - genam.h -> indexgenam.h. This has been discussed during the development of v12 when table AM was worked on, but the renaming never happened. Author: Michael Paquier Reviewed-by: Fabien Coelho, Julien Rouhaud Discussion: https://postgr.es/m/20191223053434.GF34339@paquier.xyz
* Refactor reloption handling for index AMs in-coreMichael Paquier2019-11-25
| | | | | | | | | | | | | | | | | | This reworks the reloption parsing and build of a couple of index AMs by creating new structures for each index AM's options. This split was already done for BRIN, GIN and GiST (which actually has a fillfactor parameter), but not for hash, B-tree and SPGiST which relied on StdRdOptions due to an overlap with the default option set. This saves a couple of bytes for rd_options in each relcache entry with indexes making use of relation options, and brings more consistency between all index AMs. While on it, add a couple of AssertMacro() calls to make sure that utility macros to grab values of reloptions are used with the expected index AM. Author: Nikolay Shaplov Reviewed-by: Amit Langote, Michael Paquier, Álvaro Herrera, Dent John Discussion: https://postgr.es/m/4127670.gFlpRb6XCm@x200m
* Make the order of the header file includes consistent in backend modules.Amit Kapila2019-11-12
| | | | | | | | | | | Similar to commits 7e735035f2 and dddf4cdc33, this commit makes the order of header file inclusion consistent for backend modules. In the passing, removed a couple of duplicate inclusions. Author: Vignesh C Reviewed-by: Kuntal Ghosh and Amit Kapila Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
* Pass ItemPointer not HeapTuple to IndexBuildCallback.Andres Freund2019-11-08
| | | | | | | | | | Not all AMs use HeapTuples internally, making it inconvenient to pass a HeapTuple. As the index callbacks really only need the TID, not the full tuple, modify callback to only take ItemPointer. Author: Ashwin Agrawal Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CALfoeis6=8ehuR=VNtHvj3z16cYfCwPdTcpaxU+sfSUJ5QgR3g@mail.gmail.com
* Split all OBJS style lines in makefiles into one-line-per-entry style.Andres Freund2019-11-05
| | | | | | | | | | | | | | | When maintaining or merging patches, one of the most common sources for conflicts are the list of objects in makefiles. Especially when the split across lines has been changed on both sides, which is somewhat common due to attempting to stay below 80 columns, those conflicts are unnecessarily laborious to resolve. By splitting, and alphabetically sorting, OBJS style lines into one object per line, conflicts should be less frequent, and easier to resolve when they still occur. Author: Andres Freund Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de
* Fix bug in pairingheap_SpGistSearchItem_cmp()Alexander Korotkov2019-09-25
| | | | | | | | | Our item contains only so->numberOfNonNullOrderBys of distances. Reflect that in the loop upper bound. Discussion: https://postgr.es/m/53536807-784c-e029-6e92-6da802ab8d60%40postgrespro.ru Author: Nikita Glukhov Backpatch-through: 12
* Improve handling of NULLs in KNN-GiST and KNN-SP-GiSTAlexander Korotkov2019-09-19
| | | | | | | | | | | | | | | | | | | | | This commit improves subject in two ways: * It removes ugliness of 02f90879e7, which stores distance values and null flags in two separate arrays after GISTSearchItem struct. Instead we pack both distance value and null flag in IndexOrderByDistance struct. Alignment overhead should be negligible, because we typically deal with at most few "col op const" expressions in ORDER BY clause. * It fixes handling of "col op NULL" expression in KNN-SP-GiST. Now, these expression are not passed to support functions, which can't deal with them. Instead, NULL result is implicitly assumed. It future we may decide to teach support functions to deal with NULL arguments, but current solution is bugfix suitable for backpatch. Reported-by: Nikita Glukhov Discussion: https://postgr.es/m/826f57ee-afc7-8977-c44c-6111d18b02ec%40postgrespro.ru Author: Nikita Glukhov Reviewed-by: Alexander Korotkov Backpatch-through: 9.4
* Fix handling of NULL distances in KNN-GiSTAlexander Korotkov2019-09-08
| | | | | | | | | | | | | | In order to implement NULL LAST semantic GiST previously assumed distance to the NULL value to be Inf. However, our distance functions can return Inf and NaN for non-null values. In such cases, NULL LAST semantic appears to be broken. This commit fixes that by introducing separate array of null flags for distances. Backpatch to all supported versions. Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com Author: Alexander Korotkov Backpatch-through: 9.4
* Fix inconsistencies and typos in the treeMichael Paquier2019-07-29
| | | | | | | | This is numbered take 8, and addresses again a set of issues with code comments, variable names and unreferenced variables. Author: Alexander Lakhin Discussion: https://postgr.es/m/b137b5eb-9c95-9c2f-586e-38aba7d59788@gmail.com
* Fix inconsistencies and typos in the treeMichael Paquier2019-07-22
| | | | | | | | This is numbered take 7, and addresses a set of issues with code comments, variable names and unreferenced variables. Author: Alexander Lakhin Discussion: https://postgr.es/m/dff75442-2468-f74f-568c-6006e141062f@gmail.com
* Fix inconsistencies and typos in the treeMichael Paquier2019-07-16
| | | | | | | | | | | This is numbered take 7, and addresses a set of issues around: - Fixes for typos and incorrect reference names. - Removal of unneeded comments. - Removal of unreferenced functions and structures. - Fixes regarding variable name consistency. Author: Alexander Lakhin Discussion: https://postgr.es/m/10bfd4ac-3e7c-40ab-2b2e-355ed15495e8@gmail.com
* Initial pgindent run for v12.Tom Lane2019-05-22
| | | | | | | | This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
* Standardize ItemIdData terminology.Peter Geoghegan2019-05-13
| | | | | | | | | | | | | The term "item pointer" should not be used to refer to ItemIdData variables, since that is needlessly ambiguous. Only ItemPointerData/ItemPointer variables should be called item pointers. To fix, establish the convention that ItemIdData variables should always be referred to either as "item identifiers" or "line pointers". The term "item identifier" already predominates in docs and translatable messages, and so should be the preferred alternative there. Discussion: https://postgr.es/m/CAH2-Wz=c=MZQjUzde3o9+2PLAPuHTpVZPPdYxN=E4ndQ2--8ew@mail.gmail.com
* Generate less WAL during GiST, GIN and SP-GiST index build.Heikki Linnakangas2019-04-03
| | | | | | | | | | | | | | | | | | | | Instead of WAL-logging every modification during the build separately, first build the index without any WAL-logging, and make a separate pass through the index at the end, to write all pages to the WAL. This significantly reduces the amount of WAL generated, and is usually also faster, despite the extra I/O needed for the extra scan through the index. WAL generated this way is also faster to replay. For GiST, the LSN-NSN interlock makes this a little tricky. All pages must be marked with a valid (i.e. non-zero) LSN, so that the parent-child LSN-NSN interlock works correctly. We now use magic value 1 for that during index build. Change the fake LSN counter to begin from 1000, so that 1 is safely smaller than any real or fake LSN. 2 would've been enough for our purposes, but let's reserve a bigger range, in case we need more special values in the future. Author: Anastasia Lubennikova, Andrey V. Lepikhov Reviewed-by: Heikki Linnakangas, Dmitry Dolgov
* Report progress of CREATE INDEX operationsAlvaro Herrera2019-04-02
| | | | | | | | | | | | | | | | | | | | This uses the progress reporting infrastructure added by c16dc1aca5e0, adding support for CREATE INDEX and CREATE INDEX CONCURRENTLY. There are two pieces to this: one is index-AM-agnostic, and the other is AM-specific. The latter is fairly elaborate for btrees, including reportage for parallel index builds and the separate phases that btree index creation uses; other index AMs, which are much simpler in their building procedures, have simplistic reporting only, but that seems sufficient, at least for non-concurrent builds. The index-AM-agnostic part is fairly complete, providing insight into the CONCURRENTLY wait phases as well as block-based progress during the index validation table scan. (The index validation index scan requires patching each AM, which has not been included here.) Reviewers: Rahila Syed, Pavan Deolasee, Tatsuro Yamada Discussion: https://postgr.es/m/20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql
* tableam: Support for an index build's initial table scan(s).Andres Freund2019-03-27
| | | | | | | | | | | | | To support building indexes over tables of different AMs, the scans to do so need to be routed through the table AM. While moving a fair amount of code, nearly all the changes are just moving code to below a callback. Currently the range based interface wouldn't make much sense for non block based table AMs. But that seems aceptable for now. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
* Collations with nondeterministic comparisonPeter Eisentraut2019-03-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a flag "deterministic" to collations. If that is false, such a collation disables various optimizations that assume that strings are equal only if they are byte-wise equal. That then allows use cases such as case-insensitive or accent-insensitive comparisons or handling of strings with different Unicode normal forms. This functionality is only supported with the ICU provider. At least glibc doesn't appear to have any locales that work in a nondeterministic way, so it's not worth supporting this for the libc provider. The term "deterministic comparison" in this context is from Unicode Technical Standard #10 (https://unicode.org/reports/tr10/#Deterministic_Comparison). This patch makes changes in three areas: - CREATE COLLATION DDL changes and system catalog changes to support this new flag. - Many executor nodes and auxiliary code are extended to track collations. Previously, this code would just throw away collation information, because the eventually-called user-defined functions didn't use it since they only cared about equality, which didn't need collation information. - String data type functions that do equality comparisons and hashing are changed to take the (non-)deterministic flag into account. For comparison, this just means skipping various shortcuts and tie breakers that use byte-wise comparison. For hashing, we first need to convert the input string to a canonical "sort key" using the ICU analogue of strxfrm(). Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com
* tableam: Add and use scan APIs.Andres Freund2019-03-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Too allow table accesses to be not directly dependent on heap, several new abstractions are needed. Specifically: 1) Heap scans need to be generalized into table scans. Do this by introducing TableScanDesc, which will be the "base class" for individual AMs. This contains the AM independent fields from HeapScanDesc. The previous heap_{beginscan,rescan,endscan} et al. have been replaced with a table_ version. There's no direct replacement for heap_getnext(), as that returned a HeapTuple, which is undesirable for a other AMs. Instead there's table_scan_getnextslot(). But note that heap_getnext() lives on, it's still used widely to access catalog tables. This is achieved by new scan_begin, scan_end, scan_rescan, scan_getnextslot callbacks. 2) The portion of parallel scans that's shared between backends need to be able to do so without the user doing per-AM work. To achieve that new parallelscan_{estimate, initialize, reinitialize} callbacks are introduced, which operate on a new ParallelTableScanDesc, which again can be subclassed by AMs. As it is likely that several AMs are going to be block oriented, block oriented callbacks that can be shared between such AMs are provided and used by heap. table_block_parallelscan_{estimate, intiialize, reinitialize} as callbacks, and table_block_parallelscan_{nextpage, init} for use in AMs. These operate on a ParallelBlockTableScanDesc. 3) Index scans need to be able to access tables to return a tuple, and there needs to be state across individual accesses to the heap to store state like buffers. That's now handled by introducing a sort-of-scan IndexFetchTable, which again is intended to be subclassed by individual AMs (for heap IndexFetchHeap). The relevant callbacks for an AM are index_fetch_{end, begin, reset} to create the necessary state, and index_fetch_tuple to retrieve an indexed tuple. Note that index_fetch_tuple implementations need to be smarter than just blindly fetching the tuples for AMs that have optimizations similar to heap's HOT - the currently alive tuple in the update chain needs to be fetched if appropriate. Similar to table_scan_getnextslot(), it's undesirable to continue to return HeapTuples. Thus index_fetch_heap (might want to rename that later) now accepts a slot as an argument. Core code doesn't have a lot of call sites performing index scans without going through the systable_* API (in contrast to loads of heap_getnext calls and working directly with HeapTuples). Index scans now store the result of a search in IndexScanDesc->xs_heaptid, rather than xs_ctup->t_self. As the target is not generally a HeapTuple anymore that seems cleaner. To be able to sensible adapt code to use the above, two further callbacks have been introduced: a) slot_callbacks returns a TupleTableSlotOps* suitable for creating slots capable of holding a tuple of the AMs type. table_slot_callbacks() and table_slot_create() are based upon that, but have additional logic to deal with views, foreign tables, etc. While this change could have been done separately, nearly all the call sites that needed to be adapted for the rest of this commit also would have been needed to be adapted for table_slot_callbacks(), making separation not worthwhile. b) tuple_satisfies_snapshot checks whether the tuple in a slot is currently visible according to a snapshot. That's required as a few places now don't have a buffer + HeapTuple around, but a slot (which in heap's case internally has that information). Additionally a few infrastructure changes were needed: I) SysScanDesc, as used by systable_{beginscan, getnext} et al. now internally uses a slot to keep track of tuples. While systable_getnext() still returns HeapTuples, and will so for the foreseeable future, the index API (see 1) above) now only deals with slots. The remainder, and largest part, of this commit is then adjusting all scans in postgres to use the new APIs. Author: Andres Freund, Haribabu Kommi, Alvaro Herrera Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
* Refactor planner's header files.Tom Lane2019-01-29
| | | | | | | | | | | | | | | | | | | | | | | | Create a new header optimizer/optimizer.h, which exposes just the planner functions that can be used "at arm's length", without need to access Paths or the other planner-internal data structures defined in nodes/relation.h. This is intended to provide the whole planner API seen by most of the rest of the system; although FDWs still need to use additional stuff, and more thought is also needed about just what selfuncs.c should rely on. The main point of doing this now is to limit the amount of new #include baggage that will be needed by "planner support functions", which I expect to introduce later, and which will be in relevant datatype modules rather than anywhere near the planner. This commit just moves relevant declarations into optimizer.h from other header files (a couple of which go away because everything got moved), and adjusts #include lists to match. There's further cleanup that could be done if we want to decide that some stuff being exposed by optimizer.h doesn't belong in the planner at all, but I'll leave that for another day. Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
* Move remaining code from tqual.[ch] to heapam.h / heapam_visibility.c.Andres Freund2019-01-21
| | | | | | | | | | | | | | Given these routines are heap specific, and that there will be more generic visibility support in via table AM, it makes sense to move the prototypes to heapam.h (routines like HeapTupleSatisfiesVacuum will not be exposed in a generic fashion, because they are too storage specific). Similarly, the code in tqual.c is specific to heap, so moving it into access/heap/ makes sense. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
* Update copyright for 2019Bruce Momjian2019-01-02
| | | | Backpatch-through: certain files through 9.4
* Fix memory leak in repeated SPGIST index scans.Tom Lane2018-10-31
| | | | | | | | | | | | | | | | | | | | | | | spgendscan neglected to pfree all the memory allocated by spgbeginscan. It's possible to get away with that in most normal queries, since the memory is allocated in the executor's per-query context which is about to get deleted anyway; but it causes severe memory leakage during creation or filling of large exclusion-constraint indexes. Also, document that amendscan is supposed to free what ambeginscan allocates. The docs' lack of clarity on that point probably caused this bug to begin with. (There is discussion of changing that API spec going forward, but I don't think it'd be appropriate for the back branches.) Per report from Bruno Wolff. It's been like this since the beginning, so back-patch to all active branches. In HEAD, also fix an independent leak caused by commit 2a6368343 (allocating memory during spgrescan instead of spgbeginscan, which might be all right if it got cleaned up, but it didn't). And do a bit of code beautification on that commit, too. Discussion: https://postgr.es/m/20181024012314.GA27428@wolff.to
* Minor formatting cleanup for 2a6368343fAlexander Korotkov2018-09-27
|
* Remove extra usage of BoxPGetDatum() macroAlexander Korotkov2018-09-27
| | | | | Author: Mark Dilger Discussion: https://postgr.es/m/B2AEFCD0-836D-4654-9D59-3DF616E0A6F3%40gmail.com