From 12e611d43e6efbf0e36014a3055ed47366facee7 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sun, 11 May 2014 12:06:04 -0400 Subject: Rename jsonb_hash_ops to jsonb_path_ops. There's no longer much pressure to switch the default GIN opclass for jsonb, but there was still some unhappiness with the name "jsonb_hash_ops", since hashing is no longer a distinguishing property of that opclass, and anyway it seems like a relatively minor detail. At the suggestion of Heikki Linnakangas, we'll use "jsonb_path_ops" instead; that captures the important characteristic that each index entry depends on the entire path from the document root to the indexed value. Also add a user-facing explanation of the implementation properties of these two opclasses. --- doc/src/sgml/gin.sgml | 4 ++-- doc/src/sgml/json.sgml | 45 ++++++++++++++++++++++++++++++++++++++++----- 2 files changed, 42 insertions(+), 7 deletions(-) (limited to 'doc/src') diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml index 0b3d6eeb633..1cbc73c70cf 100644 --- a/doc/src/sgml/gin.sgml +++ b/doc/src/sgml/gin.sgml @@ -395,7 +395,7 @@ - jsonb_hash_ops + jsonb_path_ops jsonb @> @@ -415,7 +415,7 @@ Of the two operator classes for type jsonb, jsonb_ops - is the default. jsonb_hash_ops supports fewer operators but + is the default. jsonb_path_ops supports fewer operators but offers better performance for those operators. See for details. diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml index 518fe63873e..66426189ca5 100644 --- a/doc/src/sgml/json.sgml +++ b/doc/src/sgml/json.sgml @@ -156,7 +156,7 @@ - <type>jsonb</> Input and Output Syntax + JSON Input and Output Syntax The input/output syntax for the JSON data types is as specified in RFC 7159. @@ -366,11 +366,11 @@ SELECT '"foo"'::jsonb ? 'foo'; CREATE INDEX idxgin ON api USING gin (jdoc); - The non-default GIN operator class jsonb_hash_ops + The non-default GIN operator class jsonb_path_ops supports indexing the @> operator only. An example of creating an index with this operator class is: -CREATE INDEX idxginh ON api USING gin (jdoc jsonb_hash_ops); +CREATE INDEX idxginp ON api USING gin (jdoc jsonb_path_ops); @@ -444,10 +444,10 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qu - Although the jsonb_hash_ops operator class supports + Although the jsonb_path_ops operator class supports only queries with the @> operator, it has notable performance advantages over the default operator - class jsonb_ops. A jsonb_hash_ops + class jsonb_ops. A jsonb_path_ops index is usually much smaller than a jsonb_ops index over the same data, and the specificity of searches is better, particularly when queries contain keys that appear frequently in the @@ -455,6 +455,41 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qu than with the default operator class. + + The technical difference between a jsonb_ops + and a jsonb_path_ops GIN index is that the former + creates independent index items for each key and value in the data, + while the latter creates index items only for each value in the + data.For this purpose, the term value + includes array elements, though JSON terminology sometimes considers + array elements distinct from values within objects. + But in jsonb_path_ops, each index item is a hash + of both the value and the key(s) leading to it; for example to index + {"foo": {"bar": "baz"}}, a single index item would + be created incorporating all three of foo, bar, + and baz into the hash value. Thus a containment query + looking for this structure would result in an extremely specific index + search; but there is no way at all to find out whether foo + appears as a key. On the other hand, a jsonb_ops + index would create three index items representing foo, + bar, and baz separately; then to do the + containment query, it would look for rows containing all three of + these items. While GIN indexes can perform such an AND search fairly + efficiently, it will still be less specific and slower than the + equivalent jsonb_path_ops search, especially if + there are a very large number of rows containing any single one of the + three index items. + + + + A disadvantage of the jsonb_path_ops approach is + that it produces no index entries for JSON structures not containing + any values, such as {"a": {}}. If a search for + documents containing such a structure is requested, it will require a + full-index scan, which is quite slow. jsonb_path_ops is + therefore ill-suited for applications that often perform such searches. + + jsonb also supports btree and hash indexes. These are usually useful only if it's important to check -- cgit v1.2.3