Cleaned up r->headers_out.headers allocation error handling.
If initialization of a header failed for some reason after ngx_list_push(),
leaving the header as is can result in uninitialized memory access by
the header filter or the log module. The fix is to clear partially
initialized headers in case of errors.
For the Cache-Control header, the fix is to postpone pushing
r->headers_out.cache_control until its value is completed.
Sub filter: restored ngx_http_set_ctx() at the proper place.
Previously, ngx_http_sub_header_filter() could fail with a partially
initialized context, later accessed in ngx_http_sub_body_filter()
if called from the perl content handler.
The SSL_CTRL_SET_CURVES_LIST macro is removed in the OpenSSL master branch.
SSL_CTX_set1_curves_list is preserved as compatibility with previous versions.
SSL: disabled renegotiation detection in client mode.
CVE-2009-3555 is no longer relevant and mitigated by the renegotiation
info extension (secure renegotiation). On the other hand, unexpected
renegotiation still introduces potential security risks, and hence we do
not allow renegotiation on the server side, as we never request renegotiation.
On the client side the situation is different though. There are backends
which explicitly request renegotiation, and disabled renegotiation
introduces interoperability problems. This change allows renegotiation
on the client side, and fixes interoperability problems as observed with
such backends (ticket #872).
Additionally, with TLSv1.3 the SSL_CB_HANDSHAKE_START flag is currently set
by OpenSSL when receiving a NewSessionTicket message, and was detected by
nginx as a renegotiation attempt. This looks like a bug in OpenSSL, though
this change also allows better interoperability till the problem is fixed.
Roman Arutyunyan [Tue, 11 Apr 2017 13:41:53 +0000 (16:41 +0300)]
Set UDP datagram source address (ticket #1239).
Previously, the source IP address of a response UDP datagram could differ from
the original datagram destination address. This could happen if the server UDP
socket is bound to a wildcard address and the network interface chosen to output
the response packet has a different default address than the destination address
of the original packet. For example, if two addresses from the same network are
configured on an interface.
Now source address is set explicitly if a response is sent for a server UDP
socket bound to a wildcard address.
Piotr Sikora [Fri, 24 Mar 2017 09:48:03 +0000 (02:48 -0700)]
Upstream: allow recovery from "429 Too Many Requests" response.
This change adds "http_429" parameter to "proxy_next_upstream" for
retrying rate-limited requests, and to "proxy_cache_use_stale" for
serving stale cached responses after being rate-limited.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Piotr Sikora [Fri, 24 Mar 2017 09:48:03 +0000 (02:48 -0700)]
Added support for "429 Too Many Requests" response (RFC6585).
This change adds reason phrase in status line and pretty response body
when "429" status code is used in "return", "limit_conn_status" and/or
"limit_req_status" directives.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
When a slice subrequest was redirected to a new location, its context was lost.
After its completion, a new slice subrequest for the same slice was created.
This could lead to infinite loop. Now the slice module makes sure each slice
subrequest starts output with the slice context available.
Moved handling of wev->delayed to the connection event handler.
With post_action or subrequests, it is possible that the timer set for
wev->delayed will expire while the active subrequest write event handler
is not ready to handle this. This results in request hangs as observed
with limit_rate / sendfile_max_chunk and post_action (ticket #776) or
subrequests (ticket #1228).
Moving the handling to the connection event handler fixes the hangs observed,
and also slightly simplifies the code.
Since limit_req uses connection's write event to delay request processing,
it can conflict with timers in other subrequests. In particular, even
if applied to an active subrequest, it can break things if wev->delayed
is already set (due to limit_rate or sendfile_max_chunk), since after
limit_req finishes the wev->delayed flag will be set and no timer will be
active.
Fix is to use the wev->delayed flag in limit_req as well. This ensures that
wev->delayed won't be set after limit_req finishes, and also ensures that
limit_req's timers will be properly handled by other subrequests if the one
delayed by limit_req is not active.
All streams in connection must be finalized before the connection
itself can be finalized and all related memory is freed. That's
not always possible on the current event loop iteration.
Thus when the last stream is finalized, it sets the special read
event handler ngx_http_v2_handle_connection_handler() and posts
the event.
Previously, this handler didn't check the connection state and
could call the regular event handler on a connection that was
already in finalization stage. In the worst case that could
lead to a segmentation fault, since some data structures aren't
supposed to be used during connection finalization. Particularly,
the waiting queue can contain already freed streams, so the
WINDOW_UPDATE frame received by that moment could trigger
accessing to these freed streams.
Now, the connection error flag is explicitly checked in
ngx_http_v2_handle_connection_handler().
In order to finalize stream the error flag is set on fake connection and
either "write" or "read" event handler is called. The read events of fake
connections are always ready, but it's not the case with the write events.
When the ready flag isn't set, the error flag can be not checked in some
cases and as a result stream isn't finalized. Now the ready flag is
explicilty set on write events for proper finalization in all cases.
Piotr Sikora [Sun, 26 Mar 2017 08:25:04 +0000 (01:25 -0700)]
HTTP/2: fix flow control with padded DATA frames.
Previously, flow control didn't account for padding in DATA frames,
which meant that its view of the world could drift from peer's view
by up to 256 bytes per received padded DATA frame, which could lead
to a deadlock.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Piotr Sikora [Sun, 26 Mar 2017 08:25:02 +0000 (01:25 -0700)]
HTTP/2: fix $bytes_sent variable.
Previously, its value accounted for payloads of HEADERS, CONTINUATION
and DATA frames, as well as frame headers of HEADERS and DATA frames,
but it didn't account for frame headers of CONTINUATION frames.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Maxim Dounin [Tue, 28 Mar 2017 15:15:42 +0000 (18:15 +0300)]
Copy filter: wake up subrequests after aio operations.
Previously, connection write handler was called, resulting in wake up
of the active subrequest. This change makes it possible to read data
in non-active subrequests as well. For example, this allows SSI to
process instructions in non-active subrequests earlier and start
additional subrequests if needed, reducing overall response time.
Maxim Dounin [Tue, 28 Mar 2017 15:15:41 +0000 (18:15 +0300)]
Threads: fixed request hang with aio_write and subrequests.
If the subrequest is already finalized, the handler set with aio_write
may still be used by sendfile in threads when using range requests
(see also e4c1f5b32868, and the original note in 9fd738b85fad). Calling
already finalized subrequest's r->write_event_handler in practice
results in request hang in some cases.
Fix is to trigger connection event handler if the subrequest was already
finalized.
Maxim Dounin [Tue, 28 Mar 2017 15:15:39 +0000 (18:15 +0300)]
Simplified and improved sendfile() code on Linux.
The ngx_linux_sendfile() function is now used for both normal sendfile()
and sendfile in threads. The ngx_linux_sendfile_thread() function was
modified to use the same interface as ngx_linux_sendfile(), and is simply
called from ngx_linux_sendfile() when threads are enabled.
Special return code NGX_DONE is used to indicate that a thread task was
posted and no further actions are needed.
If number of bytes sent is less that what we were sending, we now always
retry sending. This is needed for sendfile() in threads as the number
of bytes we are sending might have been changed since the thread task
was posted. And this is also needed for Linux 4.3+, as sendfile() might
be interrupted at any time and provides no indication if it was interrupted
or not (ticket #1174).
Sergey Kandaurov [Tue, 28 Mar 2017 11:21:38 +0000 (14:21 +0300)]
Fixed ngx_open_cached_file() error handling.
If of.err is 0, it means that there was a memory allocation error
and no further logging and/or processing is needed. The of.failed
string can be only accessed if of.err is not 0.
Explicit initialization of ngx_module_libs and ngx_module_link
matches what we already do when processing mail modules, and
is also required after the next change.
Maxim Dounin [Tue, 7 Mar 2017 15:51:16 +0000 (18:51 +0300)]
Introduced worker_shutdown_timeout.
The directive configures a timeout to be used when gracefully shutting down
worker processes. When the timer expires, nginx will try to close all
the connections currently open to facilitate shutdown.
Maxim Dounin [Tue, 7 Mar 2017 15:51:15 +0000 (18:51 +0300)]
Cancelable timers are now preserved if there are other timers.
There is no need to cancel timers early if there are other timers blocking
shutdown anyway. Preserving such timers allows nginx to continue some
periodic work till the shutdown is actually possible.
With the new approach, timers with ev->cancelable are simply ignored when
checking if there are any timers left during shutdown.
Maxim Dounin [Tue, 7 Mar 2017 15:51:12 +0000 (18:51 +0300)]
Access log: removed dead ev->timedout check in flush timer handler.
The ev->timedout flag is set on first timer expiration, and never reset
after it. Due to this the code to stop the timer when the timer was
canceled never worked (except in a very specific time frame immediately
after start), and the timer was always armed again. This essentially
resulted in a buffer flush at the end of an event loop iteration.
This behaviour actually seems to be better than just stopping the flush
timer for the whole shutdown, so it is preserved as is instead of fixing
the code to actually remove the timer. It will be further improved by
upcoming changes to preserve cancelable timers if there are other timers
blocking shutdown.
Maxim Dounin [Tue, 7 Mar 2017 15:49:31 +0000 (18:49 +0300)]
Converted hc->busy/hc->free to use chain links.
Most notably, this fixes possible buffer overflows if number of large
client header buffers in a virtual server is different from the one in
the default server.
Maxim Dounin [Mon, 27 Feb 2017 19:36:15 +0000 (22:36 +0300)]
Fixed background update with "if".
Cloned subrequests should inherit r->content_handler. This way they will
be able to use the same location configuration as the original request
if there are "if" directives in the configuration.
Without r->content_handler inherited, the following configuration tries
to access a static file in the update request:
location / {
set $true 1;
if ($true) {
# nothing
}
Maxim Dounin [Fri, 10 Feb 2017 17:24:26 +0000 (20:24 +0300)]
Upstream: read handler cleared on upstream finalization.
With "proxy_ignore_client_abort off" (the default), upstream module changes
r->read_event_handler to ngx_http_upstream_rd_check_broken_connection().
If the handler is not cleared during upstream finalization, it can be
triggered later, causing unexpected effects, if, for example, a request
was redirected to a different location using error_page or X-Accel-Redirect.
In particular, it makes "proxy_ignore_client_abort on" non-working after
a redirection in a configuration like this:
It is also known to cause segmentation faults with aio used, see
http://mailman.nginx.org/pipermail/nginx-ru/2015-August/056570.html.
Fix is to explicitly set r->read_event_handler to ngx_http_block_reading()
during upstream finalization, similar to how it is done in the request body
reading code and in the limit_req module.
Maxim Dounin [Fri, 10 Feb 2017 14:49:19 +0000 (17:49 +0300)]
Cache: increased cache header Vary and ETag lengths to 128.
This allows to store larger ETag values for proxy_cache_revalidate,
including ones generated as SHA256, and cache responses with longer
Vary (ticket #826).
In particular, this fixes caching of Amazon S3 responses with CORS
enabled, which now use "Vary: Origin, Access-Control-Request-Headers,
Access-Control-Request-Method".
Roman Arutyunyan [Fri, 10 Feb 2017 13:33:12 +0000 (16:33 +0300)]
Slice filter: fetch slices in cloned subrequests.
Previously, slice subrequest location was selected based on request URI.
If request is then redirected to a new location, its context array is cleared,
making the slice module loose current slice range information. This lead to
broken output. Now subrequests with the NGX_HTTP_SUBREQUEST_CLONE flag are
created for slices. Such subrequests stay in the same location as the parent
request and keep the right slice context.
Roman Arutyunyan [Thu, 22 Dec 2016 11:25:34 +0000 (14:25 +0300)]
Cache: support for stale-while-revalidate and stale-if-error.
Previously, there was no way to enable the proxy_cache_use_stale behavior by
reading the backend response. Now, stale-while-revalidate and stale-if-error
Cache-Control extensions (RFC 5861) are supported. They specify, how long a
stale response can be used when a cache entry is being updated, or in case of
an error.
The function may leave error in the error queue while returning success,
e.g., when taking a DSO reference to itself as of OpenSSL 1.1.0d:
https://git.openssl.org/?p=openssl.git;a=commit;h=4af9f7f
Notably, this fixes alert seen with statically linked OpenSSL on some platforms.
While here, check OPENSSL_init_ssl() return value.
Maxim Dounin [Thu, 2 Feb 2017 17:29:16 +0000 (20:29 +0300)]
SSL: fixed ssl_buffer_size on SNI virtual hosts (ticket #1192).
Previously, buffer size was not changed from the one saved during
initial ngx_ssl_create_connection(), even if the buffer itself was not
yet created. Fix is to change c->ssl->buffer_size in the SNI callback.
Note that it should be also possible to update buffer size even in non-SNI
virtual hosts as long as the buffer is not yet allocated. This looks
like an overcomplication though.
Maxim Dounin [Fri, 20 Jan 2017 18:14:19 +0000 (21:14 +0300)]
Upstream: fixed cache corruption and socket leaks with aio_write.
The ngx_event_pipe() function wasn't called on write events with
wev->delayed set. As a result, threaded writing results weren't
properly collected in ngx_event_pipe_write_to_downstream() when a
write event was triggered for a completed write.
Further, this wasn't detected, as p->aio was reset by a thread completion
handler, and results were later collected in ngx_event_pipe_read_upstream()
instead of scheduling a new write of additional data. If this happened
on the last reading from an upstream, last part of the response was never
written to the cache file.
Similar problems might also happen in case of timeouts when writing to
client, as this also results in ngx_event_pipe() not being called on write
events. In this scenario socket leaks were observed.
Fix is to check if p->writing is set in ngx_event_pipe_read_upstream(), and
therefore collect results of previous write operations in case of read events
as well, similar to how we do so in ngx_event_pipe_write_downstream().
This is enough to fix the wev->delayed case. Additionally, we now call
ngx_event_pipe() from ngx_http_upstream_process_request() if there are
uncollected write operations (p->writing and !p->aio). This also fixes
the wev->timedout case.