From ab596105b55f1d7fbd5a66b66f65227d210b047d Mon Sep 17 00:00:00 2001
From: Tomas Vondra
Date: Fri, 26 Mar 2021 13:54:29 +0100
Subject: BRIN minmax-multi indexes

Adds BRIN opclasses similar to the existing minmax, except that instead
of summarizing the page range into a single [min,max] range, the summary
consists of multiple ranges and/or points, allowing gaps. This allows
more efficient handling of data with poor correlation to physical
location within the table and/or outlier values, for which the regular
minmax opclasses tend to work poorly.

It's possible to specify the number of values kept for each page range,
either as a single point or an interval boundary.

  CREATE TABLE t (a int);
  CREATE INDEX ON t USING brin (a int4_minmax_multi_ops(values_per_range=16));

When building the summary, the values are combined into intervals with
the goal to minimize the "covering" (sum of interval lengths), using a
support procedure computing distance between two values.

Bump catversion, due to various catalog changes.

Author: Tomas Vondra
Reviewed-by: Alvaro Herrera
Reviewed-by: Alexander Korotkov
Reviewed-by: Sokolov Yura
Reviewed-by: John Naylor
Discussion: https://postgr.es/m/c1138ead-7668-f0e1-0638-c3be3237e812@2ndquadrant.com
Discussion: https://postgr.es/m/5d78b774-7e9c-c94e-12cf-fef51cc89b1a%402ndquadrant.com
---
 doc/src/sgml/brin.sgml | 280 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 270 insertions(+), 10 deletions(-)
 (limited to 'doc/src')

diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml
index 9524ef55d34..d2f12bb605f 100644
--- a/doc/src/sgml/brin.sgml
+++ b/doc/src/sgml/brin.sgml
@@ -116,7 +116,10 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
 in the indexed column within the range. The inclusion operator classes store a value which includes the values in the indexed column within the range. The bloom operator
- classes build a Bloom filter for all values in the range.
+ classes build a Bloom filter for all values in the range. The + minmax-multi operator classes store multiple + minimum and maximum values, representing values appearing in the indexed + column within the range. @@ -211,6 +214,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was > (date,date)>= (date,date) + + date_minmax_multi_ops + = (date,date) + + < (date,date) + <= (date,date) + > (date,date) + >= (date,date) + float4_bloom_ops = (float4,float4) @@ -225,6 +237,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was <= (float4,float4) >= (float4,float4) + + float4_minmax_multi_ops + = (float4,float4) + + < (float4,float4) + > (float4,float4) + <= (float4,float4) + >= (float4,float4) + float8_bloom_ops = (float8,float8) @@ -239,6 +260,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was > (float8,float8) >= (float8,float8) + + float8_minmax_multi_ops + = (float8,float8) + + < (float8,float8) + <= (float8,float8) + > (float8,float8) + >= (float8,float8) + inet_inclusion_ops << (inet,inet) @@ -263,6 +293,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was > (inet,inet) >= (inet,inet) + + inet_minmax_multi_ops + = (inet,inet) + + < (inet,inet) + <= (inet,inet) + > (inet,inet) + >= (inet,inet) + int2_bloom_ops = (int2,int2) @@ -277,6 +316,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was <= (int2,int2) >= (int2,int2) + + int2_minmax_multi_ops + = (int2,int2) + + < (int2,int2) + > (int2,int2) + <= (int2,int2) + >= (int2,int2) + int4_bloom_ops = (int4,int4) @@ -291,6 +339,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was <= (int4,int4) >= (int4,int4) + + int4_minmax_multi_ops + = (int4,int4) + + < (int4,int4) + > (int4,int4) + <= (int4,int4) + >= (int4,int4) + int8_bloom_ops = (bigint,bigint) @@ -305,6 +362,15 @@ LOG: request for BRIN range summarization for index 
"brin_wi_idx" page 128 was <= (bigint,bigint) >= (bigint,bigint) + + int8_minmax_multi_ops + = (bigint,bigint) + + < (bigint,bigint) + > (bigint,bigint) + <= (bigint,bigint) + >= (bigint,bigint) + interval_bloom_ops = (interval,interval) @@ -319,6 +385,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was > (interval,interval) >= (interval,interval) + + interval_minmax_multi_ops + = (interval,interval) + + < (interval,interval) + <= (interval,interval) + > (interval,interval) + >= (interval,interval) + macaddr_bloom_ops = (macaddr,macaddr) @@ -333,6 +408,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was > (macaddr,macaddr) >= (macaddr,macaddr) + + macaddr_minmax_multi_ops + = (macaddr,macaddr) + + < (macaddr,macaddr) + <= (macaddr,macaddr) + > (macaddr,macaddr) + >= (macaddr,macaddr) + macaddr8_bloom_ops = (macaddr8,macaddr8) @@ -347,6 +431,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was > (macaddr8,macaddr8) >= (macaddr8,macaddr8) + + macaddr8_minmax_multi_ops + = (macaddr8,macaddr8) + + < (macaddr8,macaddr8) + <= (macaddr8,macaddr8) + > (macaddr8,macaddr8) + >= (macaddr8,macaddr8) + name_bloom_ops = (name,name) @@ -375,6 +468,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was > (numeric,numeric) >= (numeric,numeric) + + numeric_minmax_multi_ops + = (numeric,numeric) + + < (numeric,numeric) + <= (numeric,numeric) + > (numeric,numeric) + >= (numeric,numeric) + oid_bloom_ops = (oid,oid) @@ -389,6 +491,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was <= (oid,oid) >= (oid,oid) + + oid_minmax_multi_ops + = (oid,oid) + + < (oid,oid) + > (oid,oid) + <= (oid,oid) + >= (oid,oid) + pg_lsn_bloom_ops = (pg_lsn,pg_lsn) @@ -403,6 +514,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was <= (pg_lsn,pg_lsn) >= (pg_lsn,pg_lsn) + + pg_lsn_minmax_multi_ops + = (pg_lsn,pg_lsn) + + < 
(pg_lsn,pg_lsn) + > (pg_lsn,pg_lsn) + <= (pg_lsn,pg_lsn) + >= (pg_lsn,pg_lsn) + range_inclusion_ops = (anyrange,anyrange) @@ -449,6 +569,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was <= (tid,tid) >= (tid,tid) + + tid_minmax_multi_ops + = (tid,tid) + + < (tid,tid) + > (tid,tid) + <= (tid,tid) + >= (tid,tid) + timestamp_bloom_ops = (timestamp,timestamp) @@ -463,6 +592,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was > (timestamp,timestamp) >= (timestamp,timestamp) + + timestamp_minmax_multi_ops + = (timestamp,timestamp) + + < (timestamp,timestamp) + <= (timestamp,timestamp) + > (timestamp,timestamp) + >= (timestamp,timestamp) + timestamptz_bloom_ops = (timestamptz,timestamptz) @@ -477,6 +615,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was > (timestamptz,timestamptz) >= (timestamptz,timestamptz) + + timestamptz_minmax_multi_ops + = (timestamptz,timestamptz) + + < (timestamptz,timestamptz) + <= (timestamptz,timestamptz) + > (timestamptz,timestamptz) + >= (timestamptz,timestamptz) + time_bloom_ops = (time,time) @@ -491,6 +638,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was > (time,time) >= (time,time) + + time_minmax_multi_ops + = (time,time) + + < (time,time) + <= (time,time) + > (time,time) + >= (time,time) + timetz_bloom_ops = (timetz,timetz) @@ -505,6 +661,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was > (timetz,timetz) >= (timetz,timetz) + + timetz_minmax_multi_ops + = (timetz,timetz) + + < (timetz,timetz) + <= (timetz,timetz) + > (timetz,timetz) + >= (timetz,timetz) + uuid_bloom_ops = (uuid,uuid) @@ -519,6 +684,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was <= (uuid,uuid) >= (uuid,uuid) + + uuid_minmax_multi_ops + = (uuid,uuid) + + < (uuid,uuid) + > (uuid,uuid) + <= (uuid,uuid) + >= (uuid,uuid) + varbit_minmax_ops = (varbit,varbit) @@ -537,8 
+711,8 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was Some of the built-in operator classes allow specifying parameters affecting behavior of the operator class. Each operator class has its own set of - allowed parameters. Only the bloom operator class - allows specifying parameters: + allowed parameters. Only the bloom and minmax-multi + operator classes allow specifying parameters: @@ -577,6 +751,25 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was + + + minmax-multi operator classes accept these parameters: + + + + + values_per_range + + + Defines the maximum number of values stored by BRIN + minmax indexes to summarize a block range. Each value may represent + either a point, or a boundary of an interval. Values must be between + 8 and 256, and the default value is 32. + + + + + @@ -715,13 +908,14 @@ typedef struct BrinOpcInfo - The core distribution includes support for two types of operator classes: - minmax and inclusion. Operator class definitions using them are shipped for - in-core data types as appropriate. Additional operator classes can be - defined by the user for other data types using equivalent definitions, - without having to write any source code; appropriate catalog entries being - declared is enough. Note that assumptions about the semantics of operator - strategies are embedded in the support functions' source code. + The core distribution includes support for four types of operator classes: + minmax, minmax-multi, inclusion and bloom. Operator class definitions + using them are shipped for in-core data types as appropriate. Additional + operator classes can be defined by the user for other data types using + equivalent definitions, without having to write any source code; + appropriate catalog entries being declared is enough. Note that + assumptions about the semantics of operator strategies are embedded in the + support functions' source code. 
@@ -1018,6 +1212,72 @@ typedef struct BrinOpcInfo and return a hash of the value. + + The minmax-multi operator class is also intended for data types implementing + a totally ordered set, and may be seen as a simple extension of the minmax + operator class. While the minmax operator class summarizes values from each block + range into a single contiguous interval, minmax-multi allows summarization + into multiple smaller intervals to improve handling of outlier values. + It is possible to use the minmax-multi support procedures alongside the + corresponding operators, as shown in + . + All operator class members (procedures and operators) are mandatory. + + +
+ Procedure and Support Numbers for minmax-multi Operator Classes + + + + Operator class member + Object + + + + + Support Procedure 1 + internal function brin_minmax_multi_opcinfo() + + + Support Procedure 2 + internal function brin_minmax_multi_add_value() + + + Support Procedure 3 + internal function brin_minmax_multi_consistent() + + + Support Procedure 4 + internal function brin_minmax_multi_union() + + + Support Procedure 11 + function to compute distance between two values (length of a range) + + + Operator Strategy 1 + operator less-than + + + Operator Strategy 2 + operator less-than-or-equal-to + + + Operator Strategy 3 + operator equal-to + + + Operator Strategy 4 + operator greater-than-or-equal-to + + + Operator Strategy 5 + operator greater-than + + + +
+ Both minmax and inclusion operator classes support cross-data-type operators, though with these the dependencies become more complicated.
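
The commit message describes combining values into intervals so that the "covering" (sum of interval lengths) stays small, using a distance function between two values. The Python sketch below illustrates that idea only; it is not the code added by this patch. The function names `summarize` and `may_contain`, the greedy "merge the closest neighbours first" strategy, and the rule that a point costs one stored value while an interval costs two are all assumptions made for illustration.

```python
def summarize(values, values_per_range=32, distance=lambda a, b: b - a):
    """Greedy illustration: collapse values into ranges until at most
    values_per_range boundary values need to be stored.

    Assumption: a single point costs one stored value and a proper
    interval costs two. This is a heuristic sketch, not PostgreSQL's
    actual algorithm.
    """
    # Start with every distinct value as a degenerate range [v, v].
    ranges = [(v, v) for v in sorted(set(values))]

    def stored(rs):
        return sum(1 if lo == hi else 2 for lo, hi in rs)

    while stored(ranges) > values_per_range and len(ranges) > 1:
        # Gap between each pair of neighbouring ranges, via the distance function.
        gaps = [distance(ranges[i][1], ranges[i + 1][0])
                for i in range(len(ranges) - 1)]
        # Merging across the smallest gap adds the least to the covering.
        i = min(range(len(gaps)), key=gaps.__getitem__)
        ranges[i:i + 2] = [(ranges[i][0], ranges[i + 1][1])]
    return ranges


def may_contain(ranges, value):
    """True if the summary cannot rule out that value is in the page range."""
    return any(lo <= value <= hi for lo, hi in ranges)


# A dense cluster plus a few outliers, summarized with 8 stored values.
data = [1, 2, 3, 4, 5, 100, 101, 102, 5000, 9000, 9001]
summary = summarize(data, values_per_range=8)
covering = sum(hi - lo for lo, hi in summary)
print(summary, covering)
```

With this input the sketch keeps the outliers as separate points, so a lookup for 50 can skip the page range, whereas a single [1, 9001] interval as kept by plain minmax could not rule it out.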