HTTP/2: fixed the "http request count is zero" alert.
When the stream is terminated the HEADERS frame can still wait in the output
queue. This frame can't be removed and must be sent to the client anyway,
since HTTP/2 uses stateful compression for headers. So in order to postpone
closing and freeing memory of such a stream, a special close stream handler
is set as the write event handler. After the HEADERS frame is sent, the write
event handler is called and the stream is finally closed.
Some events, like receiving a RST_STREAM, can trigger the read handler of such
a stream in the closing state and cause unexpected processing that can result
in another attempt to finalize the request. To prevent this, the read handler
is now set to ngx_http_empty_handler.
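The handler swap described above can be sketched roughly as follows. This is
a simplified model, not nginx code: request_t, terminate_stream() and the
other names are hypothetical stand-ins for the real request and event types.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for nginx's request and event types. */
typedef struct request_s request_t;
typedef void (*event_handler_pt)(request_t *r);

struct request_s {
    event_handler_pt  read_event_handler;
    event_handler_pt  write_event_handler;
    int               headers_queued;  /* HEADERS frame still in output queue */
    int               closed;
    int               finalize_calls;
};

/* Does nothing: events arriving for a closing stream are ignored. */
void empty_handler(request_t *r)
{
    (void) r;
}

/* Runs from the write event once the queued HEADERS frame has been sent. */
void close_stream_handler(request_t *r)
{
    r->headers_queued = 0;
    r->closed = 1;
    r->finalize_calls++;
}

/* Regular read handler; without the fix it could finalize a second time. */
void read_handler(request_t *r)
{
    r->finalize_calls++;
}

void terminate_stream(request_t *r)
{
    if (r->headers_queued) {
        /* Postpone closing until HEADERS is sent; ignore read events. */
        r->read_event_handler = empty_handler;
        r->write_event_handler = close_stream_handler;
        return;
    }

    r->closed = 1;
    r->finalize_calls++;
}
```

With this arrangement, a RST_STREAM arriving between termination and the
final write only reaches empty_handler(), so the request is finalized once.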
HTTP/2: fixed a segfault while processing unbuffered upload.
The ngx_http_v2_finalize_connection() function closes the current stream, but
that is an invalid operation while processing an unbuffered upload. This
results in access to already freed memory, since the upstream module sets a
cleanup handler that also finalizes the request.
HTTP/2: implemented preread buffer for request body (closes #959).
Previously, the stream's window was kept zero in order to prevent a client
from sending the request body before it was requested (see 887cca40ba6a for
details). Until such initial window was acknowledged all requests with
data were rejected (see 0aa07850922f for details).
That approach revealed a number of problems:
1. Some clients (notably MS IE/Edge, Safari, iOS applications) show an error
or even crash if a stream is rejected;
2. This requires at least one RTT for every request with body before the
client receives the window update and is able to send data.
To overcome these problems the new directive "http2_body_preread_size" is
introduced. It sets the initial window and configures a special per stream
preread buffer that is used to save all incoming data before the body is
requested and processed.
If the directive's value is lower than the default initial window (65535),
as previously, all streams with data will be rejected until the new window
is acknowledged. Otherwise, no special processing is used and all requests
with data are welcome right from the connection start.
The default value is chosen to be 64k, which is bigger than the default
initial window. Setting it to zero is fully compliant with the previous
behavior.
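The acceptance rule above can be sketched as a small predicate. This is an
illustration only; refuse_stream_with_data() and H2_DEFAULT_WINDOW are
hypothetical names, and the real code tracks more state than two arguments.

```c
#include <stddef.h>

#define H2_DEFAULT_WINDOW  65535   /* default initial window, per the text */

/*
 * Returns 1 if a new stream that carries data must be refused,
 * 0 if it can be accepted right away.
 */
int refuse_stream_with_data(size_t preread_size, int settings_acked)
{
    if (preread_size >= H2_DEFAULT_WINDOW) {
        /* The client's default window already fits the preread buffer. */
        return 0;
    }

    /* A smaller window takes effect only once SETTINGS is acknowledged. */
    return !settings_acked;
}
```

With the default of 64k the first branch is always taken, so requests with a
body are accepted from the start of the connection.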
HTTP/2: the "421 Misdirected Request" response (closes #848).
Since 4fbef397c753 nginx rejects with the 400 error any attempt to
request a different host over the same connection, if the relevant
virtual server requires verification of a client certificate.
While requesting hosts other than negotiated isn't something legal
in HTTP/1.x, the HTTP/2 specification explicitly permits such requests
for connection reuse and has introduced a special response code 421.
According to RFC 7540 Section 9.1.2 this code can be sent by a server
that is not configured to produce responses for the combination of
scheme and authority that are included in the request URI. And the
client may retry the request over a different connection.
Now this code is used for requests that aren't authorized in the current
connection. After receiving the 421 response, a client will be able
to open a new connection, provide the required certificate and retry
the request.
Unfortunately, not all clients currently are able to handle it well.
Notably Chrome just shows an error, while at least the latest version
of Firefox retries the request over a new connection.
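A minimal sketch of the status selection described above. It is heavily
simplified: the real check compares virtual server configurations rather
than host strings, and check_requested_host() is a hypothetical name.

```c
#include <string.h>

/*
 * Returns the status code to reject the request with, or 0 if the
 * request may proceed. "verify_client" means that the virtual server
 * for the requested host requires client certificate verification.
 */
int check_requested_host(const char *negotiated, const char *requested,
                         int verify_client, int http_major)
{
    if (!verify_client || strcmp(negotiated, requested) == 0) {
        return 0;
    }

    /* HTTP/2 permits connection reuse, so ask the client to retry. */
    return (http_major == 2) ? 421 : 400;
}
```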
Maxim Dounin [Tue, 31 May 2016 02:13:30 +0000 (05:13 +0300)]
Core: skip special buffers on writing (ticket #981).
A special last buffer with cl->buf->pos set to NULL can be present in
a chain when writing request body if chunked encoding was used. This
resulted in a NULL pointer dereference if it happened to be the only
buffer left after a do...while loop iteration in ngx_write_chain_to_file().
The problem originally appeared in nginx 1.3.9 with chunked encoding
support. Additionally, rev. 3832b608dc8d (nginx 1.9.13) changed the
minimum number of buffers to trigger this from IOV_MAX (typically 1024)
to NGX_IOVS_PREALLOCATE (typically 64).
Fix is to skip such buffers in ngx_chain_to_iovec(), much like it is
done in other places.
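The idea of skipping special buffers while building the iovec array can be
sketched like this. The buf_t type and chain_to_iovec() are hypothetical
simplifications; the real function works on ngx_chain_t and also coalesces
adjacent buffers.

```c
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical, simplified buffer chain link. */
typedef struct buf_s {
    unsigned char  *pos;    /* start of data, NULL for a special buffer */
    unsigned char  *last;   /* end of data */
    struct buf_s   *next;
} buf_t;

/* Fills iov with at most max entries and returns the number used. */
int chain_to_iovec(struct iovec *iov, int max, const buf_t *in)
{
    int  n = 0;

    for ( /* void */ ; in != NULL && n < max; in = in->next) {

        if (in->pos == NULL || in->last == in->pos) {
            /* Special or empty buffer: nothing to write, skip it. */
            continue;
        }

        iov[n].iov_base = in->pos;
        iov[n].iov_len = (size_t) (in->last - in->pos);
        n++;
    }

    return n;
}
```

A chain that consists only of a special last buffer yields zero entries
instead of an entry with a NULL base pointer.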
Thread pools: memory barriers in task completion notifications.
The ngx_thread_pool_done object isn't volatile, and at least some
compilers assume that it is permitted to reorder modifications of
volatile and non-volatile objects. Added appropriate ngx_memory_barrier()
calls to make sure all modifications will happen before the lock is released.
Reported by Mindaugas Rasiukevicius,
http://mailman.nginx.org/pipermail/nginx-devel/2016-April/008160.html.
HTTP/2: write logs when refusing streams with data.
Refusing streams is known to be incorrectly handled at least by IE, Edge
and Safari. Make sure to provide appropriate logging to simplify fixing
this in the affected browsers.
HTTP/2: send WINDOW_UPDATE instead of RST_STREAM with NO_ERROR.
After the 92464ebace8e change, it has been discovered that not all
clients follow the RFC and handle RST_STREAM with NO_ERROR properly.
Notably, Chrome currently interprets it as INTERNAL_ERROR and discards
the response.
As a workaround, instead of RST_STREAM the maximum stream window update
will be sent, which will let the client send up to 2 GB of request body
data before getting stuck on flow control. All the received data will
be silently discarded.
See for details:
http://mailman.nginx.org/pipermail/nginx-devel/2016-April/008143.html
https://bugs.chromium.org/p/chromium/issues/detail?id=603182
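For illustration, a WINDOW_UPDATE frame that raises a stream window to the
maximum of 2^31 - 1 could be serialized as below. The frame layout follows
RFC 7540; write_window_update() and the macro names are hypothetical, and
the caller is assumed to pass a current window below the maximum.

```c
#include <stddef.h>
#include <stdint.h>

#define H2_WINDOW_UPDATE_FRAME  0x8
#define H2_MAX_WINDOW           0x7fffffffU   /* 2^31 - 1 */

/* Writes a 13-byte WINDOW_UPDATE frame and returns the bytes written. */
size_t write_window_update(uint8_t out[13], uint32_t stream_id,
                           uint32_t current_window)
{
    uint32_t  inc = H2_MAX_WINDOW - current_window;

    /* 9-byte header: 24-bit length, type, flags, 31-bit stream id. */
    out[0] = 0;
    out[1] = 0;
    out[2] = 4;
    out[3] = H2_WINDOW_UPDATE_FRAME;
    out[4] = 0;
    out[5] = (uint8_t) ((stream_id >> 24) & 0x7f);
    out[6] = (uint8_t) (stream_id >> 16);
    out[7] = (uint8_t) (stream_id >> 8);
    out[8] = (uint8_t) stream_id;

    /* 4-byte payload: reserved bit plus 31-bit window size increment. */
    out[9] = (uint8_t) ((inc >> 24) & 0x7f);
    out[10] = (uint8_t) (inc >> 16);
    out[11] = (uint8_t) (inc >> 8);
    out[12] = (uint8_t) inc;

    return 13;
}
```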
HTTP/2: refuse streams with data until SETTINGS is acknowledged.
A client is allowed to send requests before receiving and acknowledging
the SETTINGS frame. Such a client, having a wrong idea about the stream's
window, could send a request body that nginx isn't ready to process.
The previous behavior was to send RST_STREAM with FLOW_CONTROL_ERROR in
such case, but it didn't allow retrying requests that have been rejected.
FastCGI: skip special bufs in buffered request body chain.
This prevents forming empty records out of such buffers. Particularly it fixes
double end-of-stream records with chunked transfer encoding, or when HTTP/2 is
used and the END_STREAM flag has been sent without data. In both cases there
is an empty buffer at the end of the request body chain with the "last_buf"
flag set.
The canonical libfcgi, as well as php implementation, tolerates such records,
while the HHVM parser is more strict and drops the connection (ticket #950).
Fixed a regression introduced in rev. 434548349838 that prevented
auto/types/sizeof and auto/types/typedef properly reporting autotest
source code to autoconf.err in case of test failure.
2. Receiving of request body is started only after
the ngx_http_read_client_request_body() call.
The last one fixes the problem when the client_max_body_size value might not be
respected from the right location if the location was changed either during the
process of receiving body or after the whole body had been received.
HTTP/2: sending RST_STREAM with NO_ERROR to discard request body.
RFC 7540 states that "A server can send a complete response prior to the client
sending an entire request if the response does not depend on any portion of the
request that has not been sent and received. When this is true, a server MAY
request that the client abort transmission of a request without error by sending
a RST_STREAM with an error code of NO_ERROR after sending a complete response
(i.e., a frame with the END_STREAM flag)."
This should prevent a client from blocking on the stream window, since it isn't
maintained for closed streams. Currently, quite big initial stream windows are
used, so such blocking is very unlikely, but that will be changed in later
patches.
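A rough sketch of the behaviour quoted above: once the complete response is
sent and the request body is still incoming, a RST_STREAM frame with
NO_ERROR follows. The frame layout is from RFC 7540; the function names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define H2_RST_STREAM_FRAME  0x3
#define H2_NO_ERROR          0x0

/* RST_STREAM is only useful if the client may still be sending a body. */
int should_discard_with_rst(int response_complete, int request_body_complete)
{
    return response_complete && !request_body_complete;
}

/* Writes a 13-byte RST_STREAM frame and returns the bytes written. */
size_t write_rst_stream(uint8_t out[13], uint32_t stream_id,
                        uint32_t error_code)
{
    /* 9-byte header: 24-bit length, type, flags, 31-bit stream id. */
    out[0] = 0;
    out[1] = 0;
    out[2] = 4;
    out[3] = H2_RST_STREAM_FRAME;
    out[4] = 0;
    out[5] = (uint8_t) ((stream_id >> 24) & 0x7f);
    out[6] = (uint8_t) (stream_id >> 16);
    out[7] = (uint8_t) (stream_id >> 8);
    out[8] = (uint8_t) stream_id;

    /* 4-byte payload: the error code. */
    out[9] = (uint8_t) (error_code >> 24);
    out[10] = (uint8_t) (error_code >> 16);
    out[11] = (uint8_t) (error_code >> 8);
    out[12] = (uint8_t) error_code;

    return 13;
}
```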
Maxim Dounin [Thu, 31 Mar 2016 20:38:33 +0000 (23:38 +0300)]
SSL: initialization changes for OpenSSL 1.1.0.
OPENSSL_config() is deprecated in OpenSSL 1.1.0. Additionally,
SSL_library_init(), SSL_load_error_strings() and OpenSSL_add_all_algorithms()
are no longer available if OPENSSL_API_COMPAT is set to 0x10100000L.
The OPENSSL_init_ssl() function is now used instead with appropriate
arguments to trigger the same behaviour. The configure test was changed to
use SSL_CTX_set_options().
Deinitialization now happens automatically in OPENSSL_cleanup() called
via atexit(3), so we no longer call EVP_cleanup() and ENGINE_cleanup()
directly.
Maxim Dounin [Thu, 31 Mar 2016 20:38:29 +0000 (23:38 +0300)]
SSL: reasonable version for LibreSSL.
LibreSSL defines OPENSSL_VERSION_NUMBER to 0x20000000L, but uses an old
API derived from OpenSSL at the time LibreSSL forked. As a result, every
version check we use to test for new API elements in newer OpenSSL versions
requires an explicit check for LibreSSL.
To reduce clutter, redefine OPENSSL_VERSION_NUMBER to 0x1000107fL if
LibreSSL is used. The same is done by FreeBSD port of LibreSSL.
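The redefinition can be sketched with the preprocessor as follows. To keep
the example self-contained, the two macros that LibreSSL's headers would
normally provide are defined by hand; the LIBRESSL_VERSION_NUMBER value is a
placeholder, and has_openssl_110_api() is a hypothetical helper.

```c
/* Stand-ins for what LibreSSL's headers define (placeholder value). */
#define LIBRESSL_VERSION_NUMBER  0x20000000L
#define OPENSSL_VERSION_NUMBER   0x20000000L

/* Pretend to be the OpenSSL version whose API LibreSSL derives from. */
#if defined(LIBRESSL_VERSION_NUMBER) \
    && OPENSSL_VERSION_NUMBER == 0x20000000L
#undef OPENSSL_VERSION_NUMBER
#define OPENSSL_VERSION_NUMBER   0x1000107fL
#endif

/* Version checks no longer need an explicit LibreSSL special case. */
int has_openssl_110_api(void)
{
    return OPENSSL_VERSION_NUMBER >= 0x10100000L;
}
```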
Maxim Dounin [Tue, 29 Mar 2016 06:52:15 +0000 (09:52 +0300)]
Win32: replaced NGX_EXDEV with more appropriate error code.
The correct error code for NGX_EXDEV on Windows is ERROR_NOT_SAME_DEVICE,
"The system cannot move the file to a different disk drive".
The previously used ERROR_WRONG_DISK is about a wrong diskette in the drive
and is not appropriate.
There is no real difference though, as MoveFile() is able to copy files
between disk drives, and will fail with ERROR_ACCESS_DENIED when asked
to copy directories. The ERROR_NOT_SAME_DEVICE error is only used
by MoveFileEx() when called without the MOVEFILE_COPY_ALLOWED flag.
On Windows there are two possible error codes which correspond to
the EEXIST error code: ERROR_FILE_EXISTS used by CreateFile(CREATE_NEW),
and ERROR_ALREADY_EXISTS used by CreateDirectory().
MoveFile() seems to use both: ERROR_ALREADY_EXISTS when moving within
one filesystem, and ERROR_FILE_EXISTS when copying a file to a different
drive.
Maxim Dounin [Mon, 28 Mar 2016 16:50:19 +0000 (19:50 +0300)]
Upstream: proxy_next_upstream non_idempotent.
By default, requests with non-idempotent methods (POST, LOCK, PATCH)
are no longer retried in case of errors if a request was already sent
to a backend. Previous behaviour can be restored by using
"proxy_next_upstream ... non_idempotent".
Ruslan Ermilov [Mon, 28 Mar 2016 16:29:18 +0000 (19:29 +0300)]
Fixed --test-build-*.
Fixes various aspects of --test-build-devpoll, --test-build-eventport, and
--test-build-epoll.
In particular, if --test-build-devpoll was used on Linux, then "devpoll"
event method would be preferred over "epoll". Also, wrong definitions of
event macros were chosen.
Piotr Sikora [Sat, 27 Feb 2016 01:30:27 +0000 (17:30 -0800)]
Core: allow strings without null-termination in ngx_parse_url().
This fixes buffer over-read while using variables in the "proxy_pass",
"fastcgi_pass", "scgi_pass", and "uwsgi_pass" directives, where result
of string evaluation isn't null-terminated.
Found with MemorySanitizer.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
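Parsing a string that is bounded by a length instead of a terminating null
byte can be sketched as below. parse_port() is a hypothetical helper, far
simpler than ngx_parse_url(): it handles only "host:port" and no IPv6
literals. The point is that every read stays inside [s, s + len).

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns the port from "host:port" stored in the first len bytes
 * of s, or -1 if there is no valid port. Never reads s[len].
 */
int parse_port(const char *s, size_t len)
{
    int          port = 0;
    const char  *p, *end = s + len;
    const char  *colon = memchr(s, ':', len);  /* bounded, unlike strchr() */

    if (colon == NULL || colon + 1 == end) {
        return -1;
    }

    for (p = colon + 1; p < end; p++) {
        if (*p < '0' || *p > '9') {
            return -1;
        }

        port = port * 10 + (*p - '0');

        if (port > 65535) {
            return -1;
        }
    }

    return port;
}
```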
Roman Arutyunyan [Fri, 25 Mar 2016 11:10:38 +0000 (14:10 +0300)]
Fixed socket inheritance on reload and binary upgrade.
On nginx reload or binary upgrade, an attempt is made to inherit listen sockets
from the previous configuration. Previously, no check for socket type was made
and the inherited socket could have the wrong type. On binary upgrade, socket
type was not detected at all. Wrong socket type could lead to errors on that
socket due to different logic and unsupported syscalls. For example, a UDP
socket, inherited as TCP, led to the following error after arrival of a
datagram: "accept() failed (102: Operation not supported on socket)".
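One way to detect the type of an inherited descriptor is the SO_TYPE socket
option. The sketch below assumes a POSIX system; socket_type_matches() is a
hypothetical helper, not the function used in nginx.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if fd is a socket of the expected type (SOCK_STREAM,
 * SOCK_DGRAM, ...), 0 if the type differs, and -1 on error.
 */
int socket_type_matches(int fd, int expected_type)
{
    int        type;
    socklen_t  len = sizeof(type);

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) == -1) {
        return -1;
    }

    return type == expected_type;
}
```

A listen socket from the previous configuration would then be reused only
when this check returns 1 for the type required by the new configuration.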
It allows turning off the accumulation of small pool allocations into a big
preallocated chunk of memory. This is useful for debugging memory access
with a sanitizer, since such accumulation can hide buffer overruns from
detection.
Core: use ngx_palloc_small() to allocate ngx_pool_large_t.
This structure cannot be allocated as a large block anyway, since that would
result in infinite recursion: each large allocation requires allocating
another ngx_pool_large_t.
The room for the structure is guaranteed by the NGX_MIN_POOL_SIZE constant.
Dmitry Volyntsev [Fri, 18 Mar 2016 12:08:21 +0000 (15:08 +0300)]
Cache: added watermark to reduce IO load when keys_zone is full.
When a keys_zone is full, each subsequent request to the cache is
penalized. That is, the cache has to evict older files synchronously
to get a slot from the keys_zone. The patch introduces new behavior
in this scenario: the manager will try to maintain available free
slots in the keys_zone by cleaning old files in the background.
Maxim Dounin [Fri, 18 Mar 2016 03:44:49 +0000 (06:44 +0300)]
Threads: writing via threads pools in event pipe.
The "aio_write" directive is introduced, which enables use of aio
for writing. Currently it is meaningful only with "aio threads".
Note that aio operations can be done by both event pipe and output
chain, so proper mapping between r->aio and p->aio is provided when
calling ngx_event_pipe() and in output filter.
Maxim Dounin [Fri, 18 Mar 2016 03:44:03 +0000 (06:44 +0300)]
Threads: offloading of temp files writing to thread pools.
The ngx_thread_write_chain_to_file() function is introduced, which
uses the ngx_file_t thread_handler, thread_ctx and thread_task fields.
The task context structure (ngx_thread_file_ctx_t) is the same for
both reading and writing, and can be safely shared as long as
operations are serialized.
The task->handler field is now always set (and not only when task is
allocated), as the same task can be used with different handlers.
The thread_write flag is introduced in the ngx_temp_file_t structure
to explicitly enable use of ngx_thread_write_chain_to_file() in
ngx_write_chain_to_temp_file() when supported by caller.
Maxim Dounin [Fri, 18 Mar 2016 03:43:52 +0000 (06:43 +0300)]
Threads: task pointer stored in ngx_file_t.
This simplifies the interface of the ngx_thread_read() function.
Additionally, most of the thread operations now explicitly set
file->thread_task, file->thread_handler and file->thread_ctx,
to facilitate use of thread operations in other places.
(Potential problems remain with sendfile in threads though - it uses
file->thread_handler as set in ngx_output_chain(), and it should not
be overwritten to an incompatible one.)
Maxim Dounin [Fri, 18 Mar 2016 02:04:45 +0000 (05:04 +0300)]
Fixed timeouts with threaded sendfile() and subrequests.
If a write event happens after sendfile() but before we've got the
sendfile results in the main thread, this write event will be ignored.
And if no more events happen, the connection will hang.
Removing the events works in the simple cases, but not always, as
in some cases events are added back by an unrelated code. E.g.,
the upstream module adds write event in the ngx_http_upstream_init()
to track client aborts.
Fix is to use wev->complete instead. It is now set to 0 before
a sendfile() task is posted, and it is set to 1 once a write event
happens. If on completion of the sendfile() task wev->complete is 1,
we know that an event happened while we were executing sendfile(), and
the socket is still ready for writing even if sendfile() did not send
all the data or returned EAGAIN.
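The flag protocol described above can be modelled in a few lines. This is a
single-threaded sketch with hypothetical names; write_event_t only mimics
the "complete" and "ready" fields of a write event.

```c
/* Hypothetical, simplified write event. */
typedef struct {
    int  complete;   /* set when a write event fires during the task */
    int  ready;      /* socket is considered writable */
} write_event_t;

/* Called right before the sendfile() task is posted to a thread. */
void post_sendfile_task(write_event_t *wev)
{
    wev->complete = 0;
}

/* Called when a write event is reported for the connection. */
void on_write_event(write_event_t *wev)
{
    wev->complete = 1;
}

/*
 * Called in the main thread when the task finishes. "would_block" is
 * nonzero if sendfile() sent only part of the data or returned EAGAIN.
 */
void on_sendfile_done(write_event_t *wev, int would_block)
{
    if (wev->complete) {
        /* An event arrived meanwhile: the socket is still writable. */
        wev->ready = 1;
        return;
    }

    wev->ready = !would_block;
}
```

Because the event is recorded in the flag rather than dropped, the
connection does not hang when no further write events arrive.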