A UTF-8 octet sequence cannot start with a 11111xxx byte (0xf8 and above),
see https://datatracker.ietf.org/doc/html/rfc3629#section-3. Previously,
such bytes were accepted by ngx_utf8_decode() and misinterpreted as 11110xxx
bytes (as in a 4-byte sequence). While unlikely, this can potentially cause
issues.
The fix is to explicitly reject such bytes in ngx_utf8_decode().
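
For illustration only, the lead-byte ranges involved can be sketched as a
standalone classifier. The helper name utf8_lead_len() is hypothetical and
is not part of nginx; it checks only the bit pattern of the first byte and
does not replicate the other validation done by ngx_utf8_decode().

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not nginx code): returns the sequence length
 * implied by the bit pattern of a lead byte, or 0 if the byte cannot
 * start a UTF-8 sequence.  Only the bit pattern is examined; overlong
 * leads (0xc0, 0xc1) and leads beyond U+10FFFF (0xf5..0xf7) would need
 * separate checks in a full decoder.
 */
static size_t
utf8_lead_len(uint8_t u)
{
    if (u >= 0xf8) {
        return 0;               /* 11111xxx: never a valid lead byte */
    }

    if (u >= 0xf0) {
        return 4;               /* 11110xxx: 4-byte sequence */
    }

    if (u >= 0xe0) {
        return 3;               /* 1110xxxx: 3-byte sequence */
    }

    if (u >= 0xc0) {
        return 2;               /* 110xxxxx: 2-byte sequence */
    }

    if (u >= 0x80) {
        return 0;               /* 10xxxxxx: continuation byte */
    }

    return 1;                   /* 0xxxxxxx: single-byte (ASCII) */
}
```

Without the first branch, a byte such as 0xf8 would fall through to the
"u >= 0xf0" case and be treated as the start of a 4-byte sequence, which is
the behaviour described above.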
u = **p;
- if (u >= 0xf0) {
+ if (u >= 0xf8) {
+
+ (*p)++;
+ return 0xffffffff;
+
+ } else if (u >= 0xf0) {
u &= 0x07;
valid = 0xffff;