postgresql - postgresql mirror

author	Tom Lane <tgl@sss.pgh.pa.us>	2017-11-18 12:42:52 -0500
committer	Tom Lane <tgl@sss.pgh.pa.us>	2017-11-18 12:42:52 -0500
commit	976a1a48fc35cde3c750982be64f872c4de4d343 (patch)
tree	de499afa5cf118d8e5615b23bd4bf22be286c8bf /contrib/ltree_plpython/ltree_plpython.c
parent	63ca86318dc3d6a768eed78efbc6ca014a0622a8 (diff)

Improve to_date/to_number/to_timestamp behavior with multibyte characters.

The documentation says that these functions skip one input character per literal (non-pattern) format character. Actually, though, they skipped one input *byte* per literal *byte*, which could be hugely confusing if either data or format contained multibyte characters.

To fix, adjust the FormatNode representation and parse_format() so that multibyte format characters are stored as one FormatNode not several, and adjust the data-skipping bits to advance by pg_mblen() not necessarily one byte. There's no user-visible behavior change on the to_char() side, although the internal representation changes.

Commit e87d4965b had already fixed most places where we skip characters on the basis of non-literal format patterns to advance by characters not bytes, but this gets one more place, the SKIP_THth macro. I think everything in formatting.c gets that right now.

It'd be nice to have some regression test cases covering this behavior; but of course there's no way to do so in an encoding-agnostic way, and many of the interesting aspects would also require unportable locale selections. So I've not bothered here.

Discussion: https://postgr.es/m/28186.1510957703@sss.pgh.pa.us
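For illustration only, the difference between skipping bytes and skipping characters can be sketched outside the server. This is not the code from formatting.c: utf8_char_len() is a hypothetical stand-in for pg_mblen() that assumes valid UTF-8 input, and skip_literal_bytes() and skip_literal_chars() are made-up names for the old and new skipping behavior.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for pg_mblen(): byte length of the UTF-8
 * character starting at s.  Assumes valid, complete UTF-8 sequences.
 */
static int
utf8_char_len(const char *s)
{
    unsigned char c = (unsigned char) *s;

    if (c < 0x80)
        return 1;
    if ((c & 0xE0) == 0xC0)
        return 2;
    if ((c & 0xF0) == 0xE0)
        return 3;
    if ((c & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte: treat as one byte */
}

/* Old behavior (sketch): skip one input byte per literal format byte. */
static const char *
skip_literal_bytes(const char *input, const char *literal)
{
    size_t      n = strlen(literal);

    while (n-- > 0 && *input != '\0')
        input++;
    return input;
}

/* New behavior (sketch): skip one input character per literal character. */
static const char *
skip_literal_chars(const char *input, const char *literal)
{
    while (*literal != '\0' && *input != '\0')
    {
        literal += utf8_char_len(literal);
        input += utf8_char_len(input);
    }
    return input;
}
```

With a three-byte UTF-8 literal such as "\xE5\xB9\xB4" in the format and the ASCII input "-01", the byte-wise version consumes all three input bytes and leaves nothing to parse, while the character-wise version consumes only "-" and leaves "01".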

Diffstat (limited to 'contrib/ltree_plpython/ltree_plpython.c')

0 files changed, 0 insertions, 0 deletions

