diff options
Diffstat (limited to 'doc/src/sgml/json.sgml')
-rw-r--r-- | doc/src/sgml/json.sgml | 174 |
1 files changed, 87 insertions, 87 deletions
diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml index 7dfdf967645..05ecef2ffc9 100644 --- a/doc/src/sgml/json.sgml +++ b/doc/src/sgml/json.sgml @@ -1,7 +1,7 @@ <!-- doc/src/sgml/json.sgml --> <sect1 id="datatype-json"> - <title><acronym>JSON</> Types</title> + <title><acronym>JSON</acronym> Types</title> <indexterm zone="datatype-json"> <primary>JSON</primary> @@ -22,25 +22,25 @@ </para> <para> - There are two JSON data types: <type>json</> and <type>jsonb</>. - They accept <emphasis>almost</> identical sets of values as + There are two JSON data types: <type>json</type> and <type>jsonb</type>. + They accept <emphasis>almost</emphasis> identical sets of values as input. The major practical difference is one of efficiency. The - <type>json</> data type stores an exact copy of the input text, + <type>json</type> data type stores an exact copy of the input text, which processing functions must reparse on each execution; while - <type>jsonb</> data is stored in a decomposed binary format that + <type>jsonb</type> data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing - is needed. <type>jsonb</> also supports indexing, which can be a + is needed. <type>jsonb</type> also supports indexing, which can be a significant advantage. </para> <para> - Because the <type>json</> type stores an exact copy of the input text, it + Because the <type>json</type> type stores an exact copy of the input text, it will preserve semantically-insignificant white space between tokens, as well as the order of keys within JSON objects. Also, if a JSON object within the value contains the same key more than once, all the key/value pairs are kept. (The processing functions consider the last value as the - operative one.) By contrast, <type>jsonb</> does not preserve white + operative one.) By contrast, <type>jsonb</type> does not preserve white space, does not preserve the order of object keys, and does not keep duplicate object keys. If duplicate keys are specified in the input, only the last value is kept. @@ -48,7 +48,7 @@ <para> In general, most applications should prefer to store JSON data as - <type>jsonb</>, unless there are quite specialized needs, such as + <type>jsonb</type>, unless there are quite specialized needs, such as legacy assumptions about ordering of object keys. </para> @@ -64,15 +64,15 @@ <para> RFC 7159 permits JSON strings to contain Unicode escape sequences - denoted by <literal>\u<replaceable>XXXX</></literal>. In the input - function for the <type>json</> type, Unicode escapes are allowed + denoted by <literal>\u<replaceable>XXXX</replaceable></literal>. In the input + function for the <type>json</type> type, Unicode escapes are allowed regardless of the database encoding, and are checked only for syntactic - correctness (that is, that four hex digits follow <literal>\u</>). - However, the input function for <type>jsonb</> is stricter: it disallows - Unicode escapes for non-ASCII characters (those above <literal>U+007F</>) - unless the database encoding is UTF8. The <type>jsonb</> type also - rejects <literal>\u0000</> (because that cannot be represented in - <productname>PostgreSQL</productname>'s <type>text</> type), and it insists + correctness (that is, that four hex digits follow <literal>\u</literal>). + However, the input function for <type>jsonb</type> is stricter: it disallows + Unicode escapes for non-ASCII characters (those above <literal>U+007F</literal>) + unless the database encoding is UTF8. The <type>jsonb</type> type also + rejects <literal>\u0000</literal> (because that cannot be represented in + <productname>PostgreSQL</productname>'s <type>text</type> type), and it insists that any use of Unicode surrogate pairs to designate characters outside the Unicode Basic Multilingual Plane be correct. Valid Unicode escapes are converted to the equivalent ASCII or UTF8 character for storage; @@ -84,8 +84,8 @@ Many of the JSON processing functions described in <xref linkend="functions-json"> will convert Unicode escapes to regular characters, and will therefore throw the same types of errors - just described even if their input is of type <type>json</> - not <type>jsonb</>. The fact that the <type>json</> input function does + just described even if their input is of type <type>json</type> + not <type>jsonb</type>. The fact that the <type>json</type> input function does not make these checks may be considered a historical artifact, although it does allow for simple storage (without processing) of JSON Unicode escapes in a non-UTF8 database encoding. In general, it is best to @@ -95,22 +95,22 @@ </note> <para> - When converting textual JSON input into <type>jsonb</>, the primitive - types described by <acronym>RFC</> 7159 are effectively mapped onto + When converting textual JSON input into <type>jsonb</type>, the primitive + types described by <acronym>RFC</acronym> 7159 are effectively mapped onto native <productname>PostgreSQL</productname> types, as shown in <xref linkend="json-type-mapping-table">. Therefore, there are some minor additional constraints on what constitutes valid <type>jsonb</type> data that do not apply to the <type>json</type> type, nor to JSON in the abstract, corresponding to limits on what can be represented by the underlying data type. - Notably, <type>jsonb</> will reject numbers that are outside the - range of the <productname>PostgreSQL</productname> <type>numeric</> data - type, while <type>json</> will not. Such implementation-defined - restrictions are permitted by <acronym>RFC</> 7159. However, in + Notably, <type>jsonb</type> will reject numbers that are outside the + range of the <productname>PostgreSQL</productname> <type>numeric</type> data + type, while <type>json</type> will not. Such implementation-defined + restrictions are permitted by <acronym>RFC</acronym> 7159. However, in practice such problems are far more likely to occur in other - implementations, as it is common to represent JSON's <type>number</> + implementations, as it is common to represent JSON's <type>number</type> primitive type as IEEE 754 double precision floating point - (which <acronym>RFC</> 7159 explicitly anticipates and allows for). + (which <acronym>RFC</acronym> 7159 explicitly anticipates and allows for). When using JSON as an interchange format with such systems, the danger of losing numeric precision compared to data originally stored by <productname>PostgreSQL</productname> should be considered. @@ -134,23 +134,23 @@ </thead> <tbody> <row> - <entry><type>string</></entry> - <entry><type>text</></entry> - <entry><literal>\u0000</> is disallowed, as are non-ASCII Unicode + <entry><type>string</type></entry> + <entry><type>text</type></entry> + <entry><literal>\u0000</literal> is disallowed, as are non-ASCII Unicode escapes if database encoding is not UTF8</entry> </row> <row> - <entry><type>number</></entry> - <entry><type>numeric</></entry> + <entry><type>number</type></entry> + <entry><type>numeric</type></entry> <entry><literal>NaN</literal> and <literal>infinity</literal> values are disallowed</entry> </row> <row> - <entry><type>boolean</></entry> - <entry><type>boolean</></entry> + <entry><type>boolean</type></entry> + <entry><type>boolean</type></entry> <entry>Only lowercase <literal>true</literal> and <literal>false</literal> spellings are accepted</entry> </row> <row> - <entry><type>null</></entry> + <entry><type>null</type></entry> <entry>(none)</entry> <entry>SQL <literal>NULL</literal> is a different concept</entry> </row> @@ -162,10 +162,10 @@ <title>JSON Input and Output Syntax</title> <para> The input/output syntax for the JSON data types is as specified in - <acronym>RFC</> 7159. + <acronym>RFC</acronym> 7159. </para> <para> - The following are all valid <type>json</> (or <type>jsonb</>) expressions: + The following are all valid <type>json</type> (or <type>jsonb</type>) expressions: <programlisting> -- Simple scalar/primitive value -- Primitive values can be numbers, quoted strings, true, false, or null @@ -185,8 +185,8 @@ SELECT '{"foo": [true, "bar"], "tags": {"a": 1, "b": null}}'::json; <para> As previously stated, when a JSON value is input and then printed without - any additional processing, <type>json</> outputs the same text that was - input, while <type>jsonb</> does not preserve semantically-insignificant + any additional processing, <type>json</type> outputs the same text that was + input, while <type>jsonb</type> does not preserve semantically-insignificant details such as whitespace. For example, note the differences here: <programlisting> SELECT '{"bar": "baz", "balance": 7.77, "active":false}'::json; @@ -202,9 +202,9 @@ SELECT '{"bar": "baz", "balance": 7.77, "active":false}'::jsonb; (1 row) </programlisting> One semantically-insignificant detail worth noting is that - in <type>jsonb</>, numbers will be printed according to the behavior of the - underlying <type>numeric</> type. In practice this means that numbers - entered with <literal>E</> notation will be printed without it, for + in <type>jsonb</type>, numbers will be printed according to the behavior of the + underlying <type>numeric</type> type. In practice this means that numbers + entered with <literal>E</literal> notation will be printed without it, for example: <programlisting> SELECT '{"reading": 1.230e-5}'::json, '{"reading": 1.230e-5}'::jsonb; @@ -213,7 +213,7 @@ SELECT '{"reading": 1.230e-5}'::json, '{"reading": 1.230e-5}'::jsonb; {"reading": 1.230e-5} | {"reading": 0.00001230} (1 row) </programlisting> - However, <type>jsonb</> will preserve trailing fractional zeroes, as seen + However, <type>jsonb</type> will preserve trailing fractional zeroes, as seen in this example, even though those are semantically insignificant for purposes such as equality checks. </para> @@ -231,7 +231,7 @@ SELECT '{"reading": 1.230e-5}'::json, '{"reading": 1.230e-5}'::jsonb; have a somewhat fixed structure. The structure is typically unenforced (though enforcing some business rules declaratively is possible), but having a predictable structure makes it easier to write - queries that usefully summarize a set of <quote>documents</> (datums) + queries that usefully summarize a set of <quote>documents</quote> (datums) in a table. </para> <para> @@ -249,7 +249,7 @@ SELECT '{"reading": 1.230e-5}'::json, '{"reading": 1.230e-5}'::jsonb; </sect2> <sect2 id="json-containment"> - <title><type>jsonb</> Containment and Existence</title> + <title><type>jsonb</type> Containment and Existence</title> <indexterm> <primary>jsonb</primary> <secondary>containment</secondary> @@ -259,10 +259,10 @@ SELECT '{"reading": 1.230e-5}'::json, '{"reading": 1.230e-5}'::jsonb; <secondary>existence</secondary> </indexterm> <para> - Testing <firstterm>containment</> is an important capability of - <type>jsonb</>. There is no parallel set of facilities for the - <type>json</> type. Containment tests whether - one <type>jsonb</> document has contained within it another one. + Testing <firstterm>containment</firstterm> is an important capability of + <type>jsonb</type>. There is no parallel set of facilities for the + <type>json</type> type. Containment tests whether + one <type>jsonb</type> document has contained within it another one. These examples return true except as noted: </para> <programlisting> @@ -282,7 +282,7 @@ SELECT '[1, 2, 3]'::jsonb @> '[1, 2, 2]'::jsonb; -- within the object on the left side: SELECT '{"product": "PostgreSQL", "version": 9.4, "jsonb": true}'::jsonb @> '{"version": 9.4}'::jsonb; --- The array on the right side is <emphasis>not</> considered contained within the +-- The array on the right side is <emphasis>not</emphasis> considered contained within the -- array on the left, even though a similar array is nested within it: SELECT '[1, 2, [1, 3]]'::jsonb @> '[1, 3]'::jsonb; -- yields false @@ -319,10 +319,10 @@ SELECT '"bar"'::jsonb @> '["bar"]'::jsonb; -- yields false </programlisting> <para> - <type>jsonb</> also has an <firstterm>existence</> operator, which is + <type>jsonb</type> also has an <firstterm>existence</firstterm> operator, which is a variation on the theme of containment: it tests whether a string - (given as a <type>text</> value) appears as an object key or array - element at the top level of the <type>jsonb</> value. + (given as a <type>text</type> value) appears as an object key or array + element at the top level of the <type>jsonb</type> value. These examples return true except as noted: </para> <programlisting> @@ -353,11 +353,11 @@ SELECT '"foo"'::jsonb ? 'foo'; <para> Because JSON containment is nested, an appropriate query can skip explicit selection of sub-objects. As an example, suppose that we have - a <structfield>doc</> column containing objects at the top level, with - most objects containing <literal>tags</> fields that contain arrays of + a <structfield>doc</structfield> column containing objects at the top level, with + most objects containing <literal>tags</literal> fields that contain arrays of sub-objects. This query finds entries in which sub-objects containing - both <literal>"term":"paris"</> and <literal>"term":"food"</> appear, - while ignoring any such keys outside the <literal>tags</> array: + both <literal>"term":"paris"</literal> and <literal>"term":"food"</literal> appear, + while ignoring any such keys outside the <literal>tags</literal> array: <programlisting> SELECT doc->'site_name' FROM websites WHERE doc @> '{"tags":[{"term":"paris"}, {"term":"food"}]}'; @@ -385,7 +385,7 @@ SELECT doc->'site_name' FROM websites </sect2> <sect2 id="json-indexing"> - <title><type>jsonb</> Indexing</title> + <title><type>jsonb</type> Indexing</title> <indexterm> <primary>jsonb</primary> <secondary>indexes on</secondary> @@ -394,23 +394,23 @@ SELECT doc->'site_name' FROM websites <para> GIN indexes can be used to efficiently search for keys or key/value pairs occurring within a large number of - <type>jsonb</> documents (datums). - Two GIN <quote>operator classes</> are provided, offering different + <type>jsonb</type> documents (datums). + Two GIN <quote>operator classes</quote> are provided, offering different performance and flexibility trade-offs. </para> <para> - The default GIN operator class for <type>jsonb</> supports queries with - top-level key-exists operators <literal>?</>, <literal>?&</> - and <literal>?|</> operators and path/value-exists operator - <literal>@></>. + The default GIN operator class for <type>jsonb</type> supports queries with + top-level key-exists operators <literal>?</literal>, <literal>?&</literal> + and <literal>?|</literal> operators and path/value-exists operator + <literal>@></literal>. (For details of the semantics that these operators implement, see <xref linkend="functions-jsonb-op-table">.) An example of creating an index with this operator class is: <programlisting> CREATE INDEX idxgin ON api USING GIN (jdoc); </programlisting> - The non-default GIN operator class <literal>jsonb_path_ops</> - supports indexing the <literal>@></> operator only. + The non-default GIN operator class <literal>jsonb_path_ops</literal> + supports indexing the <literal>@></literal> operator only. An example of creating an index with this operator class is: <programlisting> CREATE INDEX idxginp ON api USING GIN (jdoc jsonb_path_ops); @@ -438,8 +438,8 @@ CREATE INDEX idxginp ON api USING GIN (jdoc jsonb_path_ops); ] } </programlisting> - We store these documents in a table named <structname>api</>, - in a <type>jsonb</> column named <structfield>jdoc</>. + We store these documents in a table named <structname>api</structname>, + in a <type>jsonb</type> column named <structfield>jdoc</structfield>. If a GIN index is created on this column, queries like the following can make use of the index: <programlisting> @@ -447,23 +447,23 @@ CREATE INDEX idxginp ON api USING GIN (jdoc jsonb_path_ops); SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"company": "Magnafone"}'; </programlisting> However, the index could not be used for queries like the - following, because though the operator <literal>?</> is indexable, - it is not applied directly to the indexed column <structfield>jdoc</>: + following, because though the operator <literal>?</literal> is indexable, + it is not applied directly to the indexed column <structfield>jdoc</structfield>: <programlisting> -- Find documents in which the key "tags" contains key or array element "qui" SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc -> 'tags' ? 'qui'; </programlisting> Still, with appropriate use of expression indexes, the above query can use an index. If querying for particular items within - the <literal>"tags"</> key is common, defining an index like this + the <literal>"tags"</literal> key is common, defining an index like this may be worthwhile: <programlisting> CREATE INDEX idxgintags ON api USING GIN ((jdoc -> 'tags')); </programlisting> - Now, the <literal>WHERE</> clause <literal>jdoc -> 'tags' ? 'qui'</> + Now, the <literal>WHERE</literal> clause <literal>jdoc -> 'tags' ? 'qui'</literal> will be recognized as an application of the indexable - operator <literal>?</> to the indexed - expression <literal>jdoc -> 'tags'</>. + operator <literal>?</literal> to the indexed + expression <literal>jdoc -> 'tags'</literal>. (More information on expression indexes can be found in <xref linkend="indexes-expressional">.) </para> @@ -473,11 +473,11 @@ CREATE INDEX idxgintags ON api USING GIN ((jdoc -> 'tags')); -- Find documents in which the key "tags" contains array element "qui" SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qui"]}'; </programlisting> - A simple GIN index on the <structfield>jdoc</> column can support this + A simple GIN index on the <structfield>jdoc</structfield> column can support this query. But note that such an index will store copies of every key and - value in the <structfield>jdoc</> column, whereas the expression index + value in the <structfield>jdoc</structfield> column, whereas the expression index of the previous example stores only data found under - the <literal>tags</> key. While the simple-index approach is far more + the <literal>tags</literal> key. While the simple-index approach is far more flexible (since it supports queries about any key), targeted expression indexes are likely to be smaller and faster to search than a simple index. @@ -485,7 +485,7 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qu <para> Although the <literal>jsonb_path_ops</literal> operator class supports - only queries with the <literal>@></> operator, it has notable + only queries with the <literal>@></literal> operator, it has notable performance advantages over the default operator class <literal>jsonb_ops</literal>. A <literal>jsonb_path_ops</literal> index is usually much smaller than a <literal>jsonb_ops</literal> @@ -503,7 +503,7 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qu data. <footnote> <para> - For this purpose, the term <quote>value</> includes array elements, + For this purpose, the term <quote>value</quote> includes array elements, though JSON terminology sometimes considers array elements distinct from values within objects. </para> @@ -511,13 +511,13 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qu Basically, each <literal>jsonb_path_ops</literal> index item is a hash of the value and the key(s) leading to it; for example to index <literal>{"foo": {"bar": "baz"}}</literal>, a single index item would - be created incorporating all three of <literal>foo</>, <literal>bar</>, - and <literal>baz</> into the hash value. Thus a containment query + be created incorporating all three of <literal>foo</literal>, <literal>bar</literal>, + and <literal>baz</literal> into the hash value. Thus a containment query looking for this structure would result in an extremely specific index - search; but there is no way at all to find out whether <literal>foo</> + search; but there is no way at all to find out whether <literal>foo</literal> appears as a key. On the other hand, a <literal>jsonb_ops</literal> - index would create three index items representing <literal>foo</>, - <literal>bar</>, and <literal>baz</> separately; then to do the + index would create three index items representing <literal>foo</literal>, + <literal>bar</literal>, and <literal>baz</literal> separately; then to do the containment query, it would look for rows containing all three of these items. While GIN indexes can perform such an AND search fairly efficiently, it will still be less specific and slower than the @@ -531,15 +531,15 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qu that it produces no index entries for JSON structures not containing any values, such as <literal>{"a": {}}</literal>. If a search for documents containing such a structure is requested, it will require a - full-index scan, which is quite slow. <literal>jsonb_path_ops</> is + full-index scan, which is quite slow. <literal>jsonb_path_ops</literal> is therefore ill-suited for applications that often perform such searches. </para> <para> - <type>jsonb</> also supports <literal>btree</> and <literal>hash</> + <type>jsonb</type> also supports <literal>btree</literal> and <literal>hash</literal> indexes. These are usually useful only if it's important to check equality of complete JSON documents. - The <literal>btree</> ordering for <type>jsonb</> datums is seldom + The <literal>btree</literal> ordering for <type>jsonb</type> datums is seldom of great interest, but for completeness it is: <synopsis> <replaceable>Object</replaceable> > <replaceable>Array</replaceable> > <replaceable>Boolean</replaceable> > <replaceable>Number</replaceable> > <replaceable>String</replaceable> > <replaceable>Null</replaceable> |