Maxim Dounin [Fri, 20 Jan 2017 18:12:48 +0000 (21:12 +0300)]
Fixed trailer construction with limit on FreeBSD and macOS.
The ngx_chain_coalesce_file() function may produce more bytes to send then
requested in the limit passed, as it aligns the last file position
to send to memory page boundary. As a result, (limit - send) may become
negative. This resulted in big positive number when converted to size_t
while calling ngx_output_chain_to_iovec().
Another part of the problem is in ngx_chain_coalesce_file(): it changes cl
to the next chain link even if the current buffer is only partially sent
due to limit.
Therefore, if a file buffer was not expected to be fully sent due to limit,
and was followed by a memory buffer, nginx called sendfile() with a part
of the file buffer, and the memory buffer in trailer. If there were enough
room in the socket buffer, this resulted in a part of the file buffer being
skipped, and corresponding part of the memory buffer sent instead.
The bug was introduced in 8e903522c17a (1.7.8). Configurations affected
are ones using limits, that is, limit_rate and/or sendfile_max_chunk, and
memory buffers after file ones (may happen when using subrequests or
with proxying with disk buffering).
Fix is to explicitly check if (send < limit) before constructing trailer
with ngx_output_chain_to_iovec(). Additionally, ngx_chain_coalesce_file()
was modified to preserve unfinished file buffers in cl.
A bug was introduced by 82efcedb310b that could lead to timing out of
responses or segmentation fault, when accept_mutex was enabled.
The output queue in HTTP/2 can contain frames from different streams.
When the queue is sent, all related write handlers need to be called.
In order to do so, the streams were added to the h2c->posted queue
after handling sent frames. Then this queue was processed in
ngx_http_v2_write_handler().
If accept_mutex is enabled, the event's "ready" flag is set but its
handler is not called immediately. Instead, the event is added to
the ngx_posted_events queue. At the same time in this queue can be
events from upstream connections. Such events can result in sending
output queue before ngx_http_v2_write_handler() is triggered. And
at the time ngx_http_v2_write_handler() is called, the output queue
can be already empty with some streams added to h2c->posted.
But after 82efcedb310b, these streams weren't processed if all frames
have already been sent and the output queue was empty. This might lead
to a situation when a number of streams were get stuck in h2c->posted
queue for a long time. Eventually these streams might get closed by
the send timeout.
In the worst case this might also lead to a segmentation fault, if
already freed stream was left in the h2c->posted queue. This could
happen if one of the streams was terminated but wasn't closed, due to
the HEADERS frame or a partially sent DATA frame left in the output
queue. If this happened the ngx_http_v2_filter_cleanup() handler
removed the stream from the h2c->waiting or h2c->posted queue on
termination stage, before the frame has been sent, and the stream
was again added to the h2c->posted queue after the frame was sent.
In order to fix all these problems and simplify the code, write
events of fake stream connections are now added to ngx_posted_events
instead of using a custom h2c->posted queue.
HTTP/2: fixed saving preread buffer to temp file (ticket #1143).
Previously, a request body bigger than "client_body_buffer_size" wasn't written
into a temporary file if it has been pre-read entirely. The preread buffer
is freed after processing, thus subsequent use of it might result in sending
corrupted body or cause a segfault.
HTTP/2: graceful shutdown of active connections (closes #1106).
Previously, while shutting down gracefully, the HTTP/2 connections were
closed in transition to idle state after all active streams have been
processed. That might never happen if the client continued opening new
streams.
Now, nginx sends GOAWAY to all HTTP/2 connections and ignores further
attempts to open new streams. A worker process will quit as soon as
processing of already opened streams is finished.
Maxim Dounin [Mon, 10 Oct 2016 13:15:41 +0000 (16:15 +0300)]
Core: sockaddr lengths now respected by ngx_cmp_sockaddr().
Linux can return AF_UNIX sockaddrs with partially filled sun_path,
resulting in spurious comparison failures and failed binary upgrades.
Added proper checking of the lengths provided.
Reported by Jan Seda,
http://mailman.nginx.org/pipermail/nginx-devel/2016-September/008832.html.
Addition filter: set last_in_chain flag when clearing last_buf.
When the last_buf flag is cleared for add_after_body to append more data from a
subrequest, other filters may still have buffered data, which should be flushed
at this point. For example, the sub_filter may have a partial match buffered,
which will only be flushed after the subrequest is done, ending up with
interleaved data in output.
Setting last_in_chain instead of last_buf flushes the data and fixes the order
of output buffers.
Event pipe: do not set file's thread_handler if not needed.
This fixes a problem with aio threads and sendfile with aio_write switched
off, as observed with range requests after fc72784b1f52 (1.9.13). Potential
problems with sendfile in threads were previously described in 9fd738b85fad,
and this seems to be one of them.
The problem occurred as file's thread_handler was set to NULL by event pipe
code after a sendfile thread task was scheduled. As a result, no sendfile
completion code was executed, and the same buffer was additionally sent
using non-threaded sendfile. Fix is to avoid modifying file's thread_handler
if aio_write is switched off.
Note that with "aio_write on" it is still possible that sendfile will use
thread_handler as set by event pipe. This is believed to be safe though,
as handlers used are compatible.
Sergey Kandaurov [Mon, 22 Aug 2016 15:53:21 +0000 (18:53 +0300)]
SSL: adopted session ticket handling for OpenSSL 1.1.0.
Return 1 in the SSL_CTX_set_tlsext_ticket_key_cb() callback function
to indicate that a new session ticket is created, as per documentation.
Until 1.1.0, OpenSSL didn't make a distinction between non-negative
return values.
See https://git.openssl.org/?p=openssl.git;a=commitdiff;h=5c753de for details.
HTTP/2: flushing of the SSL buffer in transition to the idle state.
It fixes potential connection leak if some unsent data was left in the SSL
buffer. Particularly, that could happen when a client canceled the stream
after the HEADERS frame has already been created. In this case no other
frames might be produced and the HEADERS frame alone didn't flush the buffer.
Checking for return value of c->send_chain() isn't sufficient since there
are data can be left in the SSL buffer. Now the wew->ready flag is used
instead.
In particular, this fixed a connection leak in cases when all streams were
closed, but there's still some data to be sent in the SSL buffer and the
client forgot about the connection.
HTTP/2: avoid sending output queue if there's nothing to send.
Particularly this fixes alerts on OS X and NetBSD systems when HTTP/2 is
configured over plain TCP sockets.
On these systems calling writev() with no data leads to EINVAL errors
being logged as "writev() failed (22: Invalid argument) while processing
HTTP/2 connection".
HTTP/2: always send GOAWAY while worker is shutting down.
Previously, if the worker process exited, GOAWAY was sent to connections in
idle state, but connections with active streams were closed without GOAWAY.
Previously, when a buffer was processed by the sub filter, its final bytes
could be buffered by the filter even if they don't match any pattern.
This happened because the Boyer-Moore algorithm, employed by the sub filter
since b9447fc457b4 (1.9.4), matches the last characters of patterns prior to
checking other characters. If the last character is out of scope, initial
bytes of a potential match are buffered until the last character is available.
Now, after receiving a flush or recycled buffer, the filter performs
additional checks to reduce the number of buffered bytes. The potential match
is checked against the initial parts of all patterns. Non-matching bytes are
not buffered. This improves processing of a chunked response from upstream
by sending the entire chunks without buffering unless a partial match is found
at the end of a chunk.
HTTP/2: fixed the "http request count is zero" alert.
When the stream is terminated the HEADERS frame can still wait in the output
queue. This frame can't be removed and must be sent to the client anyway,
since HTTP/2 uses stateful compression for headers. So in order to postpone
closing and freeing memory of such stream the special close stream handler
is set to the write event. After the HEADERS frame is sent the write event
is called and the stream will be finally closed.
Some events like receiving a RST_STREAM can trigger the read handler of such
stream in closing state and cause unexpected processing that can result in
another attempt to finalize the request. To prevent it the read handler is
now set to ngx_http_empty_handler.
HTTP/2: fixed a segfault while processing unbuffered upload.
The ngx_http_v2_finalize_connection() closes current stream, but that is an
invalid operation while processing unbuffered upload. This results in access
to already freed memory, since the upstream module sets a cleanup handler that
also finalizes the request.
HTTP/2: implemented preread buffer for request body (closes #959).
Previously, the stream's window was kept zero in order to prevent a client
from sending the request body before it was requested (see 887cca40ba6a for
details). Until such initial window was acknowledged all requests with
data were rejected (see 0aa07850922f for details).
That approach revealed a number of problems:
1. Some clients (notably MS IE/Edge, Safari, iOS applications) show an error
or even crash if a stream is rejected;
2. This requires at least one RTT for every request with body before the
client receives window update and able to send data.
To overcome these problems the new directive "http2_body_preread_size" is
introduced. It sets the initial window and configures a special per stream
preread buffer that is used to save all incoming data before the body is
requested and processed.
If the directive's value is lower than the default initial window (65535),
as previously, all streams with data will be rejected until the new window
is acknowledged. Otherwise, no special processing is used and all requests
with data are welcome right from the connection start.
The default value is chosen to be 64k, which is bigger than the default
initial window. Setting it to zero is fully complaint to the previous
behavior.
HTTP/2: the "421 Misdirected Request" response (closes #848).
Since 4fbef397c753 nginx rejects with the 400 error any attempts of
requesting different host over the same connection, if the relevant
virtual server requires verification of a client certificate.
While requesting hosts other than negotiated isn't something legal
in HTTP/1.x, the HTTP/2 specification explicitly permits such requests
for connection reuse and has introduced a special response code 421.
According to RFC 7540 Section 9.1.2 this code can be sent by a server
that is not configured to produce responses for the combination of
scheme and authority that are included in the request URI. And the
client may retry the request over a different connection.
Now this code is used for requests that aren't authorized in current
connection. After receiving the 421 response a client will be able
to open a new connection, provide the required certificate and retry
the request.
Unfortunately, not all clients currently are able to handle it well.
Notably Chrome just shows an error, while at least the latest version
of Firefox retries the request over a new connection.
Maxim Dounin [Tue, 31 May 2016 02:13:30 +0000 (05:13 +0300)]
Core: skip special buffers on writing (ticket #981).
A special last buffer with cl->buf->pos set to NULL can be present in
a chain when writing request body if chunked encoding was used. This
resulted in a NULL pointer dereference if it happened to be the only
buffer left after a do...while loop iteration in ngx_write_chain_to_file().
The problem originally appeared in nginx 1.3.9 with chunked encoding
support. Additionally, rev. 3832b608dc8d (nginx 1.9.13) changed the
minimum number of buffers to trigger this from IOV_MAX (typically 1024)
to NGX_IOVS_PREALLOCATE (typically 64).
Fix is to skip such buffers in ngx_chain_to_iovec(), much like it is
done in other places.
Thread pools: memory barriers in task completion notifications.
The ngx_thread_pool_done object isn't volatile, and at least some
compilers assume that it is permitted to reorder modifications of
volatile and non-volatile objects. Added appropriate ngx_memory_barrier()
calls to make sure all modifications will happen before the lock is released.
Reported by Mindaugas Rasiukevicius,
http://mailman.nginx.org/pipermail/nginx-devel/2016-April/008160.html.
HTTP/2: write logs when refusing streams with data.
Refusing streams is known to be incorrectly handled at least by IE, Edge
and Safari. Make sure to provide appropriate logging to simplify fixing
this in the affected browsers.
HTTP/2: send WINDOW_UPDATE instead of RST_STREAM with NO_ERROR.
After the 92464ebace8e change, it has been discovered that not all
clients follow the RFC and handle RST_STREAM with NO_ERROR properly.
Notably, Chrome currently interprets it as INTERNAL_ERROR and discards
the response.
As a workaround, instead of RST_STREAM the maximum stream window update
will be sent, which will let client to send up to 2 GB of a request body
data before getting stuck on flow control. All the received data will
be silently discarded.
See for details:
http://mailman.nginx.org/pipermail/nginx-devel/2016-April/008143.html
https://bugs.chromium.org/p/chromium/issues/detail?id=603182
HTTP/2: refuse streams with data until SETTINGS is acknowledged.
A client is allowed to send requests before receiving and acknowledging
the SETTINGS frame. Such a client having a wrong idea about the stream's
could send the request body that nginx isn't ready to process.
The previous behavior was to send RST_STREAM with FLOW_CONTROL_ERROR in
such case, but it didn't allow retrying requests that have been rejected.
FastCGI: skip special bufs in buffered request body chain.
This prevents forming empty records out of such buffers. Particularly it fixes
double end-of-stream records with chunked transfer encoding, or when HTTP/2 is
used and the END_STREAM flag has been sent without data. In both cases there
is an empty buffer at the end of the request body chain with the "last_buf"
flag set.
The canonical libfcgi, as well as php implementation, tolerates such records,
while the HHVM parser is more strict and drops the connection (ticket #950).
Fixed a regression introduced in rev. 434548349838 that prevented
auto/types/sizeof and auto/types/typedef properly reporting autotest
source code to autoconf.err in case of test failure.
2. Receiving of request body is started only after
the ngx_http_read_client_request_body() call.
The last one fixes the problem when the client_max_body_size value might not be
respected from the right location if the location was changed either during the
process of receiving body or after the whole body had been received.
HTTP/2: sending RST_STREAM with NO_ERROR to discard request body.
RFC 7540 states that "A server can send a complete response prior to the client
sending an entire request if the response does not depend on any portion of the
request that has not been sent and received. When this is true, a server MAY
request that the client abort transmission of a request without error by sending
a RST_STREAM with an error code of NO_ERROR after sending a complete response
(i.e., a frame with the END_STREAM flag)."
This should prevent a client from blocking on the stream window, since it isn't
maintained for closed streams. Currently, quite big initial stream windows are
used, so such blocking is very unlikly, but that will be changed in the further
patches.
Maxim Dounin [Thu, 31 Mar 2016 20:38:33 +0000 (23:38 +0300)]
SSL: initialization changes for OpenSSL 1.1.0.
OPENSSL_config() deprecated in OpenSSL 1.1.0. Additionally,
SSL_library_init(), SSL_load_error_strings() and OpenSSL_add_all_algorithms()
are no longer available if OPENSSL_API_COMPAT is set to 0x10100000L.
The OPENSSL_init_ssl() function is now used instead with appropriate
arguments to trigger the same behaviour. The configure test changed to
use SSL_CTX_set_options().
Deinitialization now happens automatically in OPENSSL_cleanup() called
via atexit(3), so we no longer call EVP_cleanup() and ENGINE_cleanup()
directly.
Maxim Dounin [Thu, 31 Mar 2016 20:38:29 +0000 (23:38 +0300)]
SSL: reasonable version for LibreSSL.
LibreSSL defines OPENSSL_VERSION_NUMBER to 0x20000000L, but uses an old
API derived from OpenSSL at the time LibreSSL forked. As a result, every
version check we use to test for new API elements in newer OpenSSL versions
requires an explicit check for LibreSSL.
To reduce clutter, redefine OPENSSL_VERSION_NUMBER to 0x1000107fL if
LibreSSL is used. The same is done by FreeBSD port of LibreSSL.
Maxim Dounin [Tue, 29 Mar 2016 06:52:15 +0000 (09:52 +0300)]
Win32: replaced NGX_EXDEV with more appropriate error code.
Correct error code for NGX_EXDEV on Windows is ERROR_NOT_SAME_DEVICE,
"The system cannot move the file to a different disk drive".
Previously used ERROR_WRONG_DISK is about wrong diskette in the drive and
is not appropriate.
There is no real difference though, as MoveFile() is able to copy files
between disk drives, and will fail with ERROR_ACCESS_DENIED when asked
to copy directories. The ERROR_NOT_SAME_DEVICE error is only used
by MoveFileEx() when called without the MOVEFILE_COPY_ALLOWED flag.
On Windows there are two possible error codes which correspond to
the EEXIST error code: ERROR_FILE_EXISTS used by CreateFile(CREATE_NEW),
and ERROR_ALREADY_EXISTS used by CreateDirectory().
MoveFile() seems to use both: ERROR_ALREADY_EXISTS when moving within
one filesystem, and ERROR_FILE_EXISTS when copying a file to a different
drive.
Maxim Dounin [Mon, 28 Mar 2016 16:50:19 +0000 (19:50 +0300)]
Upstream: proxy_next_upstream non_idempotent.
By default, requests with non-idempotent methods (POST, LOCK, PATCH)
are no longer retried in case of errors if a request was already sent
to a backend. Previous behaviour can be restored by using
"proxy_next_upstream ... non_idempotent".
Ruslan Ermilov [Mon, 28 Mar 2016 16:29:18 +0000 (19:29 +0300)]
Fixed --test-build-*.
Fixes various aspects of --test-build-devpoll, --test-build-eventport, and
--test-build-epoll.
In particular, if --test-build-devpoll was used on Linux, then "devpoll"
event method would be preferred over "epoll". Also, wrong definitions of
event macros were chosen.