Rename jsonb_hash_ops to jsonb_path_ops.

There's no longer much pressure to switch the default GIN opclass for jsonb, but there was still some unhappiness with the name "jsonb_hash_ops", since hashing is no longer a distinguishing property of that opclass, and anyway it seems like a relatively minor detail. At the suggestion of Heikki Linnakangas, we'll use "jsonb_path_ops" instead; that captures the important characteristic that each index entry depends on the entire path from the document root to the indexed value. Also add a user-facing explanation of the implementation properties of these two opclasses.
author: Tom Lane <tgl@sss.pgh.pa.us> 2014-05-11 12:06:04 -0400
committer: Tom Lane <tgl@sss.pgh.pa.us> 2014-05-11 12:06:04 -0400
commit: 12e611d43e6efbf0e36014a3055ed47366facee7 (patch)
tree: 9c012d4a86bb7aed5cd0fcdf971a28a2bd82c8e0 /doc/src
parent: e136271a947d142eaa7093a789e6fa9ceda18c17 (diff)
download: postgresql-12e611d43e6efbf0e36014a3055ed47366facee7.tar.gz
postgresql-12e611d43e6efbf0e36014a3055ed47366facee7.zip
2 files changed, 42 insertions, 7 deletions
diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
index 0b3d6eeb633..1cbc73c70cf 100644
--- a/doc/src/sgml/gin.sgml
+++ b/doc/src/sgml/gin.sgml
@@ -395,7 +395,7 @@
       </entry>
      </row>
      <row>
-      <entry><literal>jsonb_hash_ops</></entry>
+      <entry><literal>jsonb_path_ops</></entry>
       <entry><type>jsonb</></entry>
       <entry>
        <literal>@&gt;</>
@@ -415,7 +415,7 @@
 
  <para>
   Of the two operator classes for type <type>jsonb</>, <literal>jsonb_ops</>
-  is the default.  <literal>jsonb_hash_ops</> supports fewer operators but
+  is the default.  <literal>jsonb_path_ops</> supports fewer operators but
   offers better performance for those operators.
   See <xref linkend="json-indexing"> for details.
  </para>
diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 518fe63873e..66426189ca5 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -156,7 +156,7 @@
    </table>
 
  <sect2 id="json-keys-elements">
-  <title><type>jsonb</> Input and Output Syntax</title>
+  <title>JSON Input and Output Syntax</title>
   <para>
    The input/output syntax for the JSON data types is as specified in
    <acronym>RFC</> 7159.
@@ -366,11 +366,11 @@ SELECT '"foo"'::jsonb ? 'foo';
   <programlisting>
 CREATE INDEX idxgin ON api USING gin (jdoc);
   </programlisting>
-    The non-default GIN operator class <literal>jsonb_hash_ops</>
+    The non-default GIN operator class <literal>jsonb_path_ops</>
     supports indexing the <literal>@&gt;</> operator only.
     An example of creating an index with this operator class is:
   <programlisting>
-CREATE INDEX idxginh ON api USING gin (jdoc jsonb_hash_ops);
+CREATE INDEX idxginp ON api USING gin (jdoc jsonb_path_ops);
   </programlisting>
   </para>
 
@@ -444,10 +444,10 @@ SELECT jdoc-&gt;'guid', jdoc-&gt;'name' FROM api WHERE jdoc @&gt; '{"tags": ["qu
   </para>
 
   <para>
-    Although the <literal>jsonb_hash_ops</literal> operator class supports
+    Although the <literal>jsonb_path_ops</literal> operator class supports
     only queries with the <literal>@&gt;</> operator, it has notable
     performance advantages over the default operator
-    class <literal>jsonb_ops</literal>.  A <literal>jsonb_hash_ops</literal>
+    class <literal>jsonb_ops</literal>.  A <literal>jsonb_path_ops</literal>
     index is usually much smaller than a <literal>jsonb_ops</literal>
     index over the same data, and the specificity of searches is better,
     particularly when queries contain keys that appear frequently in the
@@ -456,6 +456,41 @@ SELECT jdoc-&gt;'guid', jdoc-&gt;'name' FROM api WHERE jdoc @&gt; '{"tags": ["qu
   </para>
 
   <para>
+    The technical difference between a <literal>jsonb_ops</literal>
+    and a <literal>jsonb_path_ops</literal> GIN index is that the former
+    creates independent index items for each key and value in the data,
+    while the latter creates index items only for each value in the
+    data.<footnote><para>For this purpose, the term <quote>value</>
+    includes array elements, though JSON terminology sometimes considers
+    array elements distinct from values within objects.</para></footnote>
+    But in <literal>jsonb_path_ops</literal>, each index item is a hash
+    of both the value and the key(s) leading to it; for example to index
+    <literal>{"foo": {"bar": "baz"}}</literal>, a single index item would
+    be created incorporating all three of <literal>foo</>, <literal>bar</>,
+    and <literal>baz</> into the hash value.  Thus a containment query
+    looking for this structure would result in an extremely specific index
+    search; but there is no way at all to find out whether <literal>foo</>
+    appears as a key.  On the other hand, a <literal>jsonb_ops</literal>
+    index would create three index items representing <literal>foo</>,
+    <literal>bar</>, and <literal>baz</> separately; then to do the
+    containment query, it would look for rows containing all three of
+    these items.  While GIN indexes can perform such an AND search fairly
+    efficiently, it will still be less specific and slower than the
+    equivalent <literal>jsonb_path_ops</literal> search, especially if
+    there are a very large number of rows containing any single one of the
+    three index items.
+  </para>
+
+  <para>
+    A disadvantage of the <literal>jsonb_path_ops</literal> approach is
+    that it produces no index entries for JSON structures not containing
+    any values, such as <literal>{"a": {}}</literal>.  If a search for
+    documents containing such a structure is requested, it will require a
+    full-index scan, which is quite slow.  <literal>jsonb_path_ops</> is
+    therefore ill-suited for applications that often perform such searches.
+  </para>
+
+  <para>
     <type>jsonb</> also supports <literal>btree</> and <literal>hash</>
     indexes.  These are usually useful only if it's important to check
     equality of complete JSON documents.
author	Tom Lane <tgl@sss.pgh.pa.us>	2014-05-11 12:06:04 -0400
committer	Tom Lane <tgl@sss.pgh.pa.us>	2014-05-11 12:06:04 -0400
commit	12e611d43e6efbf0e36014a3055ed47366facee7 (patch)
tree	9c012d4a86bb7aed5cd0fcdf971a28a2bd82c8e0 /doc/src
parent	e136271a947d142eaa7093a789e6fa9ceda18c17 (diff)
download	postgresql-12e611d43e6efbf0e36014a3055ed47366facee7.tar.gz postgresql-12e611d43e6efbf0e36014a3055ed47366facee7.zip