aboutsummaryrefslogtreecommitdiff
path: root/src/backend/utils/adt/json.c
Commit message (Collapse)AuthorAge
* Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n).Andres Freund2017-08-20
| | | | | | | | | | | This is a mechanical change in preparation for a later commit that will change the layout of TupleDesc. Introducing a macro to abstract the details of where attributes are stored will allow us to change that in separate step and revise it in future. Author: Thomas Munro, editorialized by Andres Freund Reviewed-By: Andres Freund Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
* Phase 3 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Phase 2 of pgindent updates.Tom Lane2017-06-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Re-run pgindent.Tom Lane2017-06-13
| | | | | | | | This is just to have a clean base state for testing of Piotr Stefaniak's latest version of FreeBSD indent. I fixed up a couple of places where pgindent would have changed format not-nicely. perltidy not included. Discussion: https://postgr.es/m/VI1PR03MB119959F4B65F000CA7CD9F6BF2CC0@VI1PR03MB1199.eurprd03.prod.outlook.com
* Assorted translatable string fixesAlvaro Herrera2017-06-04
| | | | | Mark our rusage reportage string translatable; remove quotes from type names; unify formatting of very similar messages.
* Fix typo in comment.Heikki Linnakangas2017-05-18
| | | | Daniel Gustafsson
* Post-PG 10 beta1 pgindent runBruce Momjian2017-05-17
| | | | perltidy run not included.
* Use wrappers of PG_DETOAST_DATUM_PACKED() more.Noah Misch2017-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | This makes almost all core code follow the policy introduced in the previous commit. Specific decisions: - Text search support functions with char* and length arguments, such as prsstart and lexize, may receive unaligned strings. I doubt maintainers of non-core text search code will notice. - Use plain VARDATA() on values detoasted or synthesized earlier in the same function. Use VARDATA_ANY() on varlenas sourced outside the function, even if they happen to always have four-byte headers. As an exception, retain the universal practice of using VARDATA() on return values of SendFunctionCall(). - Retain PG_GETARG_BYTEA_P() in pageinspect. (Page images are too large for a one-byte header, so this misses no optimization.) Sites that do not call get_page_from_raw() typically need the four-byte alignment. - For now, do not change btree_gist. Its use of four-byte headers in memory is partly entangled with storage of 4-byte headers inside GBT_VARKEY, on disk. - For now, do not change gtrgm_consistent() or gtrgm_distance(). They incorporate the varlena header into a cache, and there are multiple credible implementation strategies to consider.
* Correctly handle array pseudotypes in to_json and to_jsonbAndrew Dunstan2017-02-22
| | | | | | | Columns with array pseudotypes have not been identified as arrays, so they have been rendered as strings in the json and jsonb conversion routines. This change allows them to be rendered as json arrays, making it possible to deal correctly with the anyarray columns in pg_stats.
* Make messages mentioning type names more uniformAlvaro Herrera2017-01-18
| | | | | | | | | This avoids additional translatable strings for each distinct type, as well as making our quoting style around type names more consistent (namely, that we don't quote type names). This continues what started as f402b9950120. Discussion: https://postgr.es/m/20160401170642.GA57509@alvherre.pgsql
* Update copyright via script for 2017Bruce Momjian2017-01-03
|
* Fix typos.Robert Haas2016-03-15
| | | | Oskari Saarenmaa
* Fix IsValidJsonNumber() to notice trailing non-alphanumeric garbage.Tom Lane2016-02-03
| | | | | | | | | | | Commit e09996ff8dee3f70 was one brick shy of a load: it didn't insist that the detected JSON number be the whole of the supplied string. This allowed inputs such as "2016-01-01" to be misdetected as valid JSON numbers. Per bug #13906 from Dmitry Ryabov. In passing, be more wary of zero-length input (I'm not sure this can happen given current callers, but better safe than sorry), and do some minor cosmetic cleanup.
* Update copyright for 2016Bruce Momjian2016-01-02
| | | | Backpatch certain files through 9.1
* Remove unnecessary escaping in C character literalsPeter Eisentraut2015-12-22
| | | | '\"' is more commonly written simply as '"'.
* Fix incorrect translation of minus-infinity datetimes for json/jsonb.Tom Lane2015-10-20
| | | | | | | | | | | | | Commit bda76c1c8cfb1d11751ba6be88f0242850481733 caused both plus and minus infinity to be rendered as "infinity", which is not only wrong but inconsistent with the pre-9.4 behavior of to_json(). Fix that by duplicating the coding in date_out/timestamp_out/timestamptz_out more closely. Per bug #13687 from Stepan Perlov. Back-patch to 9.4, like the previous commit. In passing, also re-pgindent json.c, since it had gotten a bit messed up by recent patches (and I was already annoyed by indentation-related problems in back-patching this fix ...)
* Prevent stack overflow in json-related functions.Noah Misch2015-10-05
| | | | | | | | | | | | | | Sufficiently-deep recursion heretofore elicited a SIGSEGV. If an application constructs PostgreSQL json or jsonb values from arbitrary user input, application users could have exploited this to terminate all active database connections. That applies to 9.3, where the json parser adopted recursive descent, and later versions. Only row_to_json() and array_to_json() were at risk in 9.2, both in a non-security capacity. Back-patch to 9.2, where the json type was introduced. Oskari Saarenmaa, reviewed by Michael Paquier. Security: CVE-2015-5289
* Cache argument type information in json(b) aggregate functions.Andrew Dunstan2015-09-18
| | | | | | | | | | | | | | These functions have been looking up type info for every row they process. Instead of doing that we only look them up the first time through and stash the information in the aggregate state object. Affects json_agg, json_object_agg, jsonb_agg and jsonb_object_agg. There is plenty more work to do in making these more efficient, especially the jsonb functions, but this is a virtually cost free improvement that can be done right away. Backpatch to 9.5 where the jsonb variants were introduced.
* Support JSON negative array subscripts everywhereAndrew Dunstan2015-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, there was an inconsistency across json/jsonb operators that operate on datums containing JSON arrays -- only some operators supported negative array count-from-the-end subscripting. Specifically, only a new-to-9.5 jsonb deletion operator had support (the new "jsonb - integer" operator). This inconsistency seemed likely to be counter-intuitive to users. To fix, allow all places where the user can supply an integer subscript to accept a negative subscript value, including path-orientated operators and functions, as well as other extraction operators. This will need to be called out as an incompatibility in the 9.5 release notes, since it's possible that users are relying on certain established extraction operators changed here yielding NULL in the event of a negative subscript. For the json type, this requires adding a way of cheaply getting the total JSON array element count ahead of time when parsing arrays with a negative subscript involved, necessitating an ad-hoc lex and parse. This is followed by a "conversion" from a negative subscript to its equivalent positive-wise value using the count. From there on, it's as if a positive-wise value was originally provided. Note that there is still a minor inconsistency here across jsonb deletion operators. Unlike the aforementioned new "-" deletion operator that accepts an integer on its right hand side, the new "#-" path orientated deletion variant does not throw an error when it appears like an array subscript (input that could be recognized by as an integer literal) is being used on an object, which is wrong-headed. The reason for not being stricter is that it could be the case that an object pair happens to have a key value that looks like an integer; in general, these two possibilities are impossible to differentiate with rhs path text[] argument elements. However, we still don't allow the "#-" path-orientated deletion operator to perform array-style subscripting. Rather, we just return the original left operand value in the event of a negative subscript (which seems analogous to how the established "jsonb/json #> text[]" path-orientated operator may yield NULL in the event of an invalid subscript). In passing, make SetArrayPath() stricter about not accepting cases where there is trailing non-numeric garbage bytes rather than a clean NUL byte. This means, for example, that strings like "10e10" are now not accepted as an array subscript of 10 by some new-to-9.5 path-orientated jsonb operators (e.g. the new #- operator). Finally, remove dead code for jsonb subscript deletion; arguably, this should have been done in commit b81c7b409. Peter Geoghegan and Andrew Dunstan
* pgindent run for 9.5Bruce Momjian2015-05-23
|
* Remove spurious semicolons.Heikki Linnakangas2015-03-31
| | | | Petr Jelinek
* Render infinite date/timestamps as 'infinity' for json/jsonbAndrew Dunstan2015-02-26
| | | | | | | | | | | | | | Commit ab14a73a6c raised an error in these cases and later the behaviour was copied to jsonb. This is what the XML code, which we then adopted, does, as the XSD types don't accept infinite values. However, json dates and timestamps are just strings as far as json is concerned, so there is no reason not to render these values as 'infinity'. The json portion of this is backpatched to 9.4 where the behaviour was introduced. The jsonb portion only affects the development branch. Per gripe on pgsql-general.
* Fix jsonb Unicode escape processing, and in consequence disallow \u0000.Tom Lane2015-01-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've been trying to support \u0000 in JSON values since commit 78ed8e03c67d7333, and have introduced increasingly worse hacks to try to make it work, such as commit 0ad1a816320a2b53. However, it fundamentally can't work in the way envisioned, because the stored representation looks the same as for \\u0000 which is not the same thing at all. It's also entirely bogus to output \u0000 when de-escaped output is called for. The right way to do this would be to store an actual 0x00 byte, and then throw error only if asked to produce de-escaped textual output. However, getting to that point seems likely to take considerable work and may well never be practical in the 9.4.x series. To preserve our options for better behavior while getting rid of the nasty side-effects of 0ad1a816320a2b53, revert that commit in toto and instead throw error if \u0000 is used in a context where it needs to be de-escaped. (These are the same contexts where non-ASCII Unicode escapes throw error if the database encoding isn't UTF8, so this behavior is by no means without precedent.) In passing, make both the \u0000 case and the non-ASCII Unicode case report ERRCODE_UNTRANSLATABLE_CHARACTER / "unsupported Unicode escape sequence" rather than claiming there's something wrong with the input syntax. Back-patch to 9.4, where we have to do something because 0ad1a816320a2b53 broke things for many cases having nothing to do with \u0000. 9.3 also has bogus behavior, but only for that specific escape value, so given the lack of field complaints it seems better to leave 9.3 alone.
* Update copyright for 2015Bruce Momjian2015-01-06
| | | | Backpatch certain files through 9.0
* Add several generator functions for jsonb that exist for json.Andrew Dunstan2014-12-12
| | | | | | | | | | | | | | | | The functions are: to_jsonb() jsonb_object() jsonb_build_object() jsonb_build_array() jsonb_agg() jsonb_object_agg() Also along the way some better logic is implemented in json_categorize_type() to match that in the newly implemented jsonb_categorize_type(). Andrew Dunstan, reviewed by Pavel Stehule and Alvaro Herrera.
* Fix JSON aggregates to work properly when final function is re-executed.Tom Lane2014-12-02
| | | | | | | | | | | | Davide S. reported that json_agg() sometimes produced multiple trailing right brackets. This turns out to be because json_agg_finalfn() attaches the final right bracket, and was doing so by modifying the aggregate state in-place. That's verboten, though unfortunately it seems there's no way for nodeAgg.c to check for such mistakes. Fix that back to 9.3 where the broken code was introduced. In 9.4 and HEAD, likewise fix json_object_agg(), which had copied the erroneous logic. Make some cosmetic cleanups as well.
* Fix hstore_to_json_loose's detection of valid JSON number values.Andrew Dunstan2014-12-01
| | | | | | | | | | We expose a function IsValidJsonNumber that internally calls the lexer for json numbers. That allows us to use the same test everywhere, instead of inventing a broken test for hstore conversions. The new function is also used in datum_to_json, replacing the code that is now moved to the new function. Backpatch to 9.3 where hstore_to_json_loose was introduced.
* Revert 95d737ff to add 'ignore_nulls'Stephen Frost2014-09-29
| | | | | | | | | | Per discussion, revert the commit which added 'ignore_nulls' to row_to_json. This capability would be better added as an independent function rather than being bolted on to row_to_json. Additionally, the implementation didn't address complex JSON objects, and so was incomplete anyway. Pointed out by Tom and discussed with Andrew and Robert.
* Remove ill-conceived ban on zero length json object keys.Andrew Dunstan2014-09-25
| | | | | | | | | | We removed a similar ban on this in json_object recently, but the ban in datum_to_json was left, which generate4d sprutious errors in othee json generators, notable json_build_object. Along the way, add an assertion that datum_to_json is not passed a null key. All current callers comply with this rule, but the assertion will catch any possible future misbehaviour.
* Return NULL from json_object_agg if it gets no rows.Andrew Dunstan2014-09-25
| | | | | This makes it consistent with the docs and with all other builtin aggregates apart from count().
* Add 'ignore_nulls' option to row_to_jsonStephen Frost2014-09-11
| | | | | | | | | | | | | | | Provide an option to skip NULL values in a row when generating a JSON object from that row with row_to_json. This can reduce the size of the JSON object in cases where columns are NULL without really reducing the information in the JSON object. This also makes row_to_json into a single function with default values, rather than having multiple functions. In passing, change array_to_json to also be a single function with default values (we don't add an 'ignore_nulls' option yet- it's not clear that there is a sensible use-case there, and it hasn't been asked for in any case). Pavel Stehule
* Use ISO 8601 format for dates converted to JSON, too.Tom Lane2014-08-17
| | | | | | | | Commit f30015b6d794c15d52abbb3df3a65081fbefb1ed made this happen for timestamp and timestamptz, but it seems pretty inconsistent to not do it for simple dates as well. (In passing, I re-pgindent'd json.c.)
* Clean up handling of unknown-type inputs in json_build_object and friends.Tom Lane2014-08-09
| | | | | | | | There's actually no need for any special case for unknown-type literals, since we only need to push the value through its output function and unknownout() works fine. The code that was here was completely bizarre anyway, and would fail outright in cases that should work, not to mention suffering from some copy-and-paste bugs.
* Further cleanup of JSON-specific error messages.Tom Lane2014-08-09
| | | | | | | | | | | | | | | | | | Fix an obvious typo in json_build_object()'s complaint about invalid number of arguments, and make the errhint a bit more sensible too. Per discussion about how to word the improved hint, change the few places in the documentation that refer to JSON object field names as "names" to say "keys" instead, since that's what we've said in the vast majority of places in the docs. Arguably "name" is more correct, since that's the terminology used in RFC 7159; but we're stuck with "key" in view of the naming of json_object_keys() so let's at least be self-consistent. I adjusted a few code comments to match this as well, and failed to resist the temptation to clean up some odd whitespace choices in the same area, as well as a useless duplicate PG_ARGISNULL() check. There's still quite a bit of code that uses the phrase "field name" in non-user- visible ways, so I left those usages alone.
* Improve some JSON error messages.Robert Haas2014-08-05
| | | | | | | These messages are new in 9.4, which hasn't been released yet, so back-patch to REL9_4_STABLE. Daniele Varrazzo
* Allow empty string object keys in json_object().Andrew Dunstan2014-07-22
| | | | | This makes the behaviour consistent with the json parser, other json-generating functions, and the JSON standards.
* Add missing serial commasPeter Eisentraut2014-07-15
| | | | | Also update one place where the wal_level "logical" was not added to an error message.
* Consistently pass an "unsigned char" to ctype.h functions.Noah Misch2014-07-06
| | | | | | The isxdigit() calls relied on undefined behavior. The isascii() call was well-defined, but our prevailing style is to include the cast. Back-patch to 9.4, where the isxdigit() calls were introduced.
* Fix typosAlvaro Herrera2014-06-12
|
* Use EncodeDateTime instead of to_char to render JSON timestamps.Andrew Dunstan2014-06-03
| | | | | | | | | | Per gripe from Peter Eisentraut and Tom Lane. The output is slightly different, but still ISO 8601 compliant: to_char doesn't output the minutes when time zone offset is an integer number of hours, while EncodeDateTime outputs ":00". The code is slightly adapted from code in xml.c
* Do not escape a unicode sequence when escaping JSON text.Andrew Dunstan2014-06-03
| | | | | | | | | | | | | | Previously, any backslash in text being escaped for JSON was doubled so that the result was still valid JSON. However, this led to some perverse results in the case of Unicode sequences, These are now detected and the initial backslash is no longer escaped. All other backslashes are still escaped. No validity check is performed, all that is looked for is \uXXXX where X is a hexidecimal digit. This is a change from the 9.2 and 9.3 behaviour as noted in the Release notes. Per complaint from Teodor Sigaev.
* Output timestamps in ISO 8601 format when rendering JSON.Andrew Dunstan2014-06-03
| | | | | | | | | | | Many JSON processors require timestamp strings in ISO 8601 format in order to convert the strings. When converting a timestamp, with or without timezone, to a JSON datum we therefore now use such a format rather than the type's default text output, in functions such as to_json(). This is a change in behaviour from 9.2 and 9.3, as noted in the release notes.
* Get rid of bogus dependency on typcategory in to_json() and friends.Tom Lane2014-05-09
| | | | | | | | | | | | | | | | | | | These functions were relying on typcategory to identify arrays and composites, which is not reliable and not the normal way to do it. Using typcategory to identify boolean, numeric types, and json itself is also pretty questionable, though the code in those cases didn't seem to be at risk of anything worse than wrong output. Instead, use the standard lsyscache functions to identify arrays and composites, and rely on a direct check of the type OID for the other cases. In HEAD, also be sure to look through domains so that a domain is treated the same as its base type for conversions to JSON. However, this is a small behavioral change; given the lack of field complaints, we won't back-patch it. In passing, refactor so that there's only one copy of the code that decides which conversion strategy to apply, not multiple copies that could (and have) gotten out of sync.
* Teach add_json() that jsonb is of TYPCATEGORY_JSON.Tom Lane2014-05-09
| | | | | | | This code really needs to be refactored so that there aren't so many copies that can diverge. Not to mention that this whole approach is probably wrong. But for the moment I'll just stick my finger in the dike. Per report from Michael Paquier.
* Avoid some pnstrdup()s when constructing jsonbHeikki Linnakangas2014-05-09
| | | | | This speeds up text to jsonb parsing and hstore to jsonb conversions somewhat.
* pgindent run for 9.4Bruce Momjian2014-05-06
| | | | | This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
* Introduce jsonb, a structured format for storing json.Andrew Dunstan2014-03-23
| | | | | | | | | | | | | | | | | | | | | | The new format accepts exactly the same data as the json type. However, it is stored in a format that does not require reparsing the orgiginal text in order to process it, making it much more suitable for indexing and other operations. Insignificant whitespace is discarded, and the order of object keys is not preserved. Neither are duplicate object keys kept - the later value for a given key is the only one stored. The new type has all the functions and operators that the json type has, with the exception of the json generation functions (to_json, json_agg etc.) and with identical semantics. In addition, there are operator classes for hash and btree indexing, and two classes for GIN indexing, that have no equivalent in the json type. This feature grew out of previous work by Oleg Bartunov and Teodor Sigaev, which was intended to provide similar facilities to a nested hstore type, but which in the end proved to have some significant compatibility issues. Authors: Oleg Bartunov, Teodor Sigaev, Peter Geoghegan and Andrew Dunstan. Review: Andres Freund
* Fix typos in comments.Fujii Masao2014-03-17
| | | | Thom Brown
* New json functions.Andrew Dunstan2014-01-28
| | | | | | | | | | | | | json_build_array() and json_build_object allow for the construction of arbitrarily complex json trees. json_object() turns a one or two dimensional array, or two separate arrays, into a json_object of name/value pairs, similarly to the hstore() function. json_object_agg() aggregates its two arguments into a single json object as name value pairs. Catalog version bumped. Andrew Dunstan, reviewed by Marko Tiikkaja.
* Reindent json.c and jsonfuncs.c.Andrew Dunstan2014-01-22
| | | | | This will help in preparation of clean patches for upcoming json work.