Ruslan Ermilov [Tue, 6 Nov 2018 13:29:49 +0000 (16:29 +0300)]
HTTP/2: limit the number of idle state switches.
An attack that continuously switches HTTP/2 connection between
idle and active states can result in excessive CPU usage.
This is because when a connection switches to the idle state,
all of its memory pool caches are freed.
This change limits the maximum allowed number of idle state
switches to 10 * http2_max_requests (i.e., 10000 by default).
This limits possible CPU usage in one connection, and also
imposes a limit on the maximum lifetime of a connection.
Initially reported by Gal Goldshtein from F5 Networks.
Ruslan Ermilov [Tue, 6 Nov 2018 13:29:35 +0000 (16:29 +0300)]
HTTP/2: flood detection.
Fixed uncontrolled memory growth in case peer is flooding us with
some frames (e.g., SETTINGS and PING) and doesn't read data. Fix
is to limit the number of allocated control frames.
Previously there was no validation for the size of a 64-bit atom
in an mp4 file. This could lead to a CPU hog when the size is 0,
or various other problems due to integer underflow when calculating
atom data size, including segmentation fault or worker process
memory disclosure.
Previously, ngx_http_upstream_process_header() might be called after
we've finished reading response headers and switched to a different read
event handler, leading to errors with gRPC proxying. Additionally,
the u->conf->read_timeout timer might be re-armed during reading response
headers (while this is expected to be a single timeout on reading
the whole response header).
Previously, ngx_http_upstream_test_next() used an outdated condition on
whether it will be possible to switch to a different server or not. It
did not take into account restrictions on non-idempotent requests, requests
with non-buffered request body, and the next upstream timeout.
For such requests, switching to the next upstream server was rejected
later in ngx_http_upstream_next(), resulting in nginx own error page
being returned instead of the original upstream response.
Ruslan Ermilov [Mon, 2 Apr 2018 15:40:04 +0000 (18:40 +0300)]
Core: revised the PROXY protocol v2 code.
- use normal prefixes for types and macros
- removed some macros and types
- revised debug messages
- removed useless check of ngx_sock_ntop() returning 0
- removed special processing of AF_UNSPEC
Maxim Dounin [Thu, 22 Mar 2018 12:56:07 +0000 (15:56 +0300)]
Configure: restored "no-threads" in OpenSSL builds.
This was previously used, but was incorrectly removed in 83d54192e97b
while removing old threads remnants. Instead of using it conditionally
when threads are not used, we now set in unconditionally, as even with
thread pools enabled we never call OpenSSL functions in threads.
This fixes resulting binary when using --with-openssl with OpenSSL 1.1.0+
and without -lpthread linked (notably on FreeBSD without PCRE).
Maxim Dounin [Thu, 22 Mar 2018 12:55:57 +0000 (15:55 +0300)]
Configure: fixed static compilation with OpenSSL 1.1.1.
OpenSSL now uses pthread_atfork(), and this requires -lpthread on Linux
to compile. Introduced NGX_LIBPTHREAD to add it as appropriate, similar
to existing NGX_LIBDL.
The fields "uri", "location", and "url" from ngx_http_upstream_conf_t
moved to ngx_http_proxy_loc_conf_t and ngx_http_proxy_vars_t, reflect
this change in create_loc_conf comments.
Maxim Dounin [Sat, 17 Mar 2018 20:04:26 +0000 (23:04 +0300)]
gRPC: special handling of "trailer only" responses.
The gRPC protocol makes a distinction between HEADERS frame with
the END_STREAM flag set, and a HEADERS frame followed by an empty
DATA frame with the END_STREAM flag. The latter is not permitted,
and results in errors not being propagated through nginx. Instead,
gRPC clients complain that "server closed the stream without sending
trailers" (seen in grpc-go) or "13: Received RST_STREAM with error
code 2" (seen in grpc-c).
To fix this, nginx now returns HEADERS with the END_STREAM flag if
the response length is known to be 0, and we are not expecting
any trailer headers to be added. And the response length is
explicitly set to 0 in the gRPC proxy if we see initial HEADERS frame
with the END_STREAM flag set.
Maxim Dounin [Sat, 17 Mar 2018 20:04:25 +0000 (23:04 +0300)]
gRPC: special handling of the TE request header.
According to the gRPC protocol specification, the "TE" header is used
to detect incompatible proxies, and at least grpc-c server rejects
requests without "TE: trailers".
To preserve the logic, we have to pass "TE: trailers" to the backend if
and only if the original request contains "trailers" in the "TE" header.
Note that no other TE values are allowed in HTTP/2, so we have to remove
anything else.
Maxim Dounin [Sat, 17 Mar 2018 20:04:24 +0000 (23:04 +0300)]
The gRPC proxy module.
The module allows passing requests to upstream gRPC servers.
The module is built by default as long as HTTP/2 support is compiled in.
Example configuration:
grpc_pass 127.0.0.1:9000;
Alternatively, the "grpc://" scheme can be used:
grpc_pass grpc://127.0.0.1:9000;
Keepalive support is available via the upstream keepalive module. Note
that keepalive connections won't currently work with grpc-go as it fails
to handle SETTINGS_HEADER_TABLE_SIZE.
To use with SSL:
grpc_pass grpcs://127.0.0.1:9000;
SSL connections use ALPN "h2" when available. At least grpc-go works fine
without ALPN, so if ALPN is not available we just establish a connection
without it.
Maxim Dounin [Sat, 17 Mar 2018 20:04:23 +0000 (23:04 +0300)]
Upstream: u->conf->preserve_output flag.
The flag can be used to continue sending request body even after we've
got a response from the backend. In particular, this is needed for gRPC
proxying of bidirectional streaming RPCs, and also to send control frames
in other forms of RPCs.
Maxim Dounin [Sat, 17 Mar 2018 20:04:22 +0000 (23:04 +0300)]
Upstream: u->request_body_blocked flag.
The flag indicates whether last ngx_output_chain() returned NGX_AGAIN
or not. If the flag is set, we arm the u->conf->send_timeout timer.
The flag complements c->write->ready test, and allows to stop sending
the request body in an output filter due to protocol-specific flow
control.
Basic trailer headers support allows one to access response trailers
via the $upstream_trailer_* variables.
Additionally, the u->conf->pass_trailers flag was introduced. When the
flag is set, trailer headers from the upstream response are passed to
the client. Like normal headers, trailer headers will be hidden
if present in u->conf->hide_headers_hash.
Maxim Dounin [Thu, 1 Mar 2018 17:25:50 +0000 (20:25 +0300)]
Core: ngx_current_msec now uses monotonic time if available.
When clock_gettime(CLOCK_MONOTONIC) (or faster variants, _FAST on FreeBSD,
and _COARSE on Linux) is available, we now use it for ngx_current_msec.
This should improve handling of timers if system time changes (ticket #189).
The r->out chain link could be left uninitialized in case of error.
A segfault could happen if the subrequest handler accessed it.
The issue was introduced in commit 20f139e9ffa8.
Roman Arutyunyan [Wed, 28 Feb 2018 13:56:58 +0000 (16:56 +0300)]
Generic subrequests in memory.
Previously, only the upstream response body could be accessed with the
NGX_HTTP_SUBREQUEST_IN_MEMORY feature. Now any response body from a subrequest
can be saved in a memory buffer. It is available as a single buffer in r->out
and the buffer size is configured by the subrequest_output_buffer_size
directive.
Upstream, proxy and fastcgi code used to handle the old-style feature is
removed.
Roman Arutyunyan [Thu, 22 Feb 2018 10:16:21 +0000 (13:16 +0300)]
Generate error for unsupported IPv6 transparent proxy.
On some platforms (for example, Linux with glibc 2.12-2.25) IPv4 transparent
proxying is available, but IPv6 transparent proxying is not. The entire feature
is enabled in this case and NGX_HAVE_TRANSPARENT_PROXY macro is set to 1.
Previously, an attempt to enable transparency for an IPv6 socket was silently
ignored in this case and was usually followed by a bind(2) EADDRNOTAVAIL error
(ticket #1487). Now the error is generated for unavailable IPv6 transparent
proxy.
If during configuration parsing of the geo directive the memory
allocation has failed, pool used to parse configuration inside
the block, and sometimes the temporary pool were not destroyed.
Maxim Dounin [Thu, 15 Feb 2018 16:06:22 +0000 (19:06 +0300)]
HTTP/2: precalculate hash for "Cookie".
There is no need to calculate hashes of static strings at runtime. The
ngx_hash() macro can be used to do it during compilation instead, similarly
to how it is done in ngx_http_proxy_module.c for "Server" and "Date" headers.
Ruslan Ermilov [Thu, 15 Feb 2018 14:51:37 +0000 (17:51 +0300)]
HTTP/2: fixed ngx_http_v2_push_stream() allocation error handling.
In particular, if a stream object allocation failed, and a client sent
the PRIORITY frame for this stream, ngx_http_v2_set_dependency() could
dereference a null pointer while trying to re-parent a dependency node.
Ruslan Ermilov [Fri, 9 Feb 2018 20:20:08 +0000 (23:20 +0300)]
HTTP/2: fixed null pointer dereference with server push.
r->headers_in.host can be NULL in ngx_http_v2_push_resource().
This happens when a request is terminated with 400 before the :authority
or Host header is parsed, and either pushing is enabled on the server{}
level or error_page 400 redirects to a location with pushes configured.
Ruslan Ermilov [Thu, 8 Feb 2018 06:55:03 +0000 (09:55 +0300)]
HTTP/2: server push.
Resources to be pushed are configured with the "http2_push" directive.
Also, preload links from the Link response headers, as described in
https://www.w3.org/TR/preload/#server-push-http-2, can be pushed, if
enabled with the "http2_push_preload" directive.
Only relative URIs with absolute paths can be pushed.
The number of concurrent pushes is normally limited by a client, but
cannot exceed a hard limit set by the "http2_max_concurrent_pushes"
directive.
Previously, when request body was not available or was previously read in
memory rather than a file, client received HTTP 500 error, but no explanation
was logged in error log. This could happen, for example, if request body was
read or discarded prior to error_page redirect, or if mirroring was enabled
along with dav.
Sergey Kandaurov [Tue, 30 Jan 2018 14:46:31 +0000 (17:46 +0300)]
SSL: using default server context in session remove (closes #1464).
This fixes segfault in configurations with multiple virtual servers sharing
the same port, where a non-default virtual server block misses certificate.
Maxim Dounin [Thu, 11 Jan 2018 18:43:49 +0000 (21:43 +0300)]
Upstream: fixed "header already sent" alerts on backend errors.
Following ad3f342f14ba046c (1.9.13), it is possible that a request where
header was already sent will be finalized with NGX_HTTP_BAD_GATEWAY,
triggering an attempt to return additional error response and the
"header already sent" alert as a result.
In particular, it is trivial to reproduce the problem with a HEAD request
and caching enabled. With caching enabled nginx will change HEAD to GET
and will set u->pipe->downstream_error to suppress sending the response
body to the client. When a backend-related error occurs (for example,
proxy_read_timeout expires), ngx_http_finalize_upstream_request() will
be called with NGX_HTTP_BAD_GATEWAY. After ad3f342f14ba046c this will
result in ngx_http_finalize_request(NGX_HTTP_BAD_GATEWAY).
Fix is to move u->pipe->downstream_error handling to a later point,
where all special response codes are changed to NGX_ERROR.
Reported by Jan Prachar,
http://mailman.nginx.org/pipermail/nginx-devel/2018-January/010737.html.
Roman Arutyunyan [Thu, 21 Dec 2017 10:29:40 +0000 (13:29 +0300)]
Allowed configuration token to start with a variable.
Specifically, it is now allowed to start with a variable expression with braces:
${name}. The opening curly bracket in such a token was previously considered
the start of a new block. Variables located anywhere else in a token worked
fine: foo${name}.
Roman Arutyunyan [Tue, 19 Dec 2017 16:00:27 +0000 (19:00 +0300)]
Fixed capabilities version.
Previously, capset(2) was called with the 64-bit capabilities version
_LINUX_CAPABILITY_VERSION_3. With this version Linux kernel expected two
copies of struct __user_cap_data_struct, while only one was submitted. As a
result, random stack memory was accessed and random capabilities were requested
by the worker. This sometimes caused capset() errors. Now the 32-bit version
_LINUX_CAPABILITY_VERSION_1 is used instead. This is OK since CAP_NET_RAW is
a 32-bit capability (CAP_NET_RAW = 13).
Roman Arutyunyan [Mon, 18 Dec 2017 18:09:39 +0000 (21:09 +0300)]
Improved the capabilities feature detection.
Previously included file sys/capability.h mentioned in capset(2) man page,
belongs to the libcap-dev package, which may not be installed on some Linux
systems when compiling nginx. This prevented the capabilities feature from
being detected and compiled on that systems.
Now linux/capability.h system header is included instead. Since capset()
declaration is located in sys/capability.h, now capset() syscall is defined
explicitly in code using the SYS_capset constant, similarly to other
Linux-specific features in nginx.
Roman Arutyunyan [Wed, 13 Dec 2017 17:40:53 +0000 (20:40 +0300)]
Retain CAP_NET_RAW capability for transparent proxying.
The capability is retained automatically in unprivileged worker processes after
changing UID if transparent proxying is enabled at least once in nginx
configuration.