Ruslan Ermilov [Fri, 3 Nov 2017 19:22:23 +0000 (22:22 +0300)]
Upstream: pre-resolve servers on reload.
After configuration is reloaded, it may take some time for the
re-resolvable upstream servers to resolve and become available
as peers. During this time, client requests might get dropped.
Such servers are now pre-resolved using the "cache" of already
resolved peers from the old shared memory zone.
Ruslan Ermilov [Sat, 15 Feb 2014 11:12:34 +0000 (15:12 +0400)]
Upstream: re-resolvable servers.
Specifying the upstream server by a hostname together with the
"resolve" parameter causes the hostname to be resolved periodically,
with upstream servers added or removed as necessary.
This requires a "resolver" in the "http" configuration block.
The "resolver_timeout" parameter also affects when failed
DNS requests are retried. Responses with NXDOMAIN
are retried in 10 seconds.
Upstream has a configuration generation number that is incremented each
time servers are added/removed to the primary/backup list. This number
is remembered by the peer.init method, and if peer.get detects a change
in configuration, it returns NGX_BUSY.
Each server has a reference counter. It is incremented by peer.get and
decremented by peer.free. When a server is removed, it is removed from
the list of servers and is marked as "zombie". The memory allocated by
a zombie peer is freed only when its reference count becomes zero.
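The generation check and the zombie reference counting described above can be sketched roughly as follows. This is an illustration only: the sketch_* types, functions and return codes are hypothetical names rather than the nginx structures, and the "freed" flag stands in for actually releasing the peer's memory.

```c
#include <stddef.h>

#define SKETCH_OK    0
#define SKETCH_BUSY  1   /* hypothetical stand-in for NGX_BUSY */

/* Hypothetical peer: not the real nginx peer structure. */
typedef struct {
    unsigned  refs;     /* incremented by get, decremented by free */
    int       zombie;   /* set once the server is removed from the list */
    int       freed;    /* stands in for releasing the peer's memory */
} sketch_peer_t;

typedef struct {
    unsigned  generation;   /* bumped whenever servers are added/removed */
} sketch_upstream_t;

typedef struct {
    unsigned  generation;   /* value remembered at init time */
} sketch_request_t;

/* peer.init analogue: remember the current configuration generation. */
void
sketch_init(sketch_request_t *r, const sketch_upstream_t *u)
{
    r->generation = u->generation;
}

/* peer.get analogue: refuse to pick a peer if the list has changed. */
int
sketch_get(sketch_request_t *r, const sketch_upstream_t *u, sketch_peer_t *p)
{
    if (r->generation != u->generation) {
        return SKETCH_BUSY;
    }

    p->refs++;
    return SKETCH_OK;
}

/* peer.free analogue: the last reference to a zombie releases it. */
void
sketch_free(sketch_peer_t *p)
{
    p->refs--;

    if (p->zombie && p->refs == 0) {
        p->freed = 1;
    }
}

/* Removing a server: bump the generation and mark the peer as zombie. */
void
sketch_remove(sketch_upstream_t *u, sketch_peer_t *p)
{
    u->generation++;
    p->zombie = 1;

    if (p->refs == 0) {
        p->freed = 1;
    }
}
```

A request that already holds a reference keeps the peer alive until it calls the free analogue, while a request that remembered an older generation gets the busy code instead of a peer.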
Co-authored-by: Roman Arutyunyan <arut@nginx.com>
Co-authored-by: Sergey Kandaurov <pluknet@nginx.com>
Co-authored-by: Vladimir Homutov <vl@nginx.com>
TLSv1 and TLSv1.1 are formally deprecated and must not be negotiated,
for the security reasons outlined in RFC 8996.
TLSv1 and TLSv1.1 are disabled in BoringSSL e95b0cad9 and LibreSSL 3.8.1
in such a way that they cannot be enabled in nginx configuration. In OpenSSL 3.0,
they are only permitted at security level 0 (disabled by default).
Support was dropped in Chrome 84 and Firefox 78, and is deprecated in Safari.
This change disables TLSv1 and TLSv1.1 by default for OpenSSL 1.0.1 and
newer, where TLSv1.2 support is available. For older library versions,
which do not have alternatives, these protocol versions remain enabled.
Since a2a513b93cae, stream frames no longer need to be retransmitted after the
stream is deleted. The frames that were retransmitted before could be stream data
frames sent prior to a RESET_STREAM. Such retransmissions are explicitly
prohibited by RFC 9000, Section 19.4.
The certificate cache provides each chain as a unique stack of reference
counted elements. This shallow copy is required because OpenSSL stacks
aren't reference counted.
Added ngx_openssl_cache_module, which indexes a type-aware object cache.
It maps an id to a unique instance, and provides references to it, which
are dropped when the cycle's pool is destroyed.
Instead of cross-linking the objects using exdata, pointers to configured
certificates are now stored in ngx_ssl_t, and OCSP staples are now accessed
with rbtree in it. This allows sharing these objects between SSL contexts.
SSL: optional ssl_client_certificate for ssl_verify_client.
Starting from TLSv1.1 (as seen since draft-ietf-tls-rfc2246-bis-00),
the "certificate_authorities" field grammar of the CertificateRequest
message was redone to allow no distinguished names. In TLSv1.3, with
the restructured CertificateRequest message, this can be similarly
done by optionally including the "certificate_authorities" extension.
This makes it possible to avoid sending DNs at all.
In practice, aside from published TLS specifications, all supported
SSL/TLS libraries allow requesting client certificates with an empty
DN list for any protocol version. For instance, when operating in
TLSv1, this results in sending the "certificate_authorities" list as
a zero-length vector, which corresponds to the TLSv1.1 specification.
Such behaviour goes back to SSLeay.
The change relaxes the requirement to specify at least one trusted CA
certificate in the ssl_client_certificate directive, which resulted in
sending DNs of these certificates (closes #142). Instead, all trusted
CA certificates can be specified now using the ssl_trusted_certificate
directive if needed. A notable difference remains: certificates specified
in ssl_trusted_certificate are always loaded (see 3648ba7db).
Roman Arutyunyan [Mon, 12 Aug 2024 14:20:45 +0000 (18:20 +0400)]
Mp4: rejecting unordered chunks in stsc atom.
Unordered chunks could result in trak->end_chunk smaller than trak->start_chunk
in ngx_http_mp4_crop_stsc_data(). Later in ngx_http_mp4_update_stco_atom()
this caused buffer overread while trying to calculate trak->end_offset.
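A minimal sketch of an ordering check of this kind is shown below. It assumes that each stsc entry carries a first-chunk number and that entries must not go backwards; the function name and the flat array layout are hypothetical and do not mirror the actual mp4 module code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 0 if the first-chunk numbers never decrease,
 * or -1 as soon as an entry is smaller than its predecessor.
 * Hypothetical helper: the real module parses the atom in place.
 */
int
sketch_stsc_chunks_ordered(const uint32_t *first_chunk, size_t n)
{
    size_t  i;

    for (i = 1; i < n; i++) {
        if (first_chunk[i] < first_chunk[i - 1]) {
            return -1;   /* unordered chunks: reject the file */
        }
    }

    return 0;
}
```

Rejecting such input up front means later code can rely on the end chunk not being smaller than the start chunk.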
Roman Arutyunyan [Mon, 12 Aug 2024 14:20:43 +0000 (18:20 +0400)]
Mp4: fixed buffer underread while updating stsz atom.
While cropping an stsc atom in ngx_http_mp4_crop_stsc_data(), a 32-bit integer
overflow could happen, which could result in incorrect seeking and a very large
value stored in "samples". This resulted in a large invalid value of
trak->end_chunk_samples. This value is further used to calculate the value of
trak->end_chunk_samples_size in ngx_http_mp4_update_stsz_atom(). While doing
this, a large invalid value of trak->end_chunk_samples could result in reading
memory before stsz atom start. This could potentially result in a segfault.
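The message does not say which expression overflowed or how it was fixed. As a generic illustration only, and assuming the overflow comes from multiplying two 32-bit counts, one common way to detect it is to do the arithmetic in 64 bits and reject results that do not fit. The function name below is hypothetical.

```c
#include <stdint.h>

/*
 * Multiplies two 32-bit counts in 64-bit arithmetic.
 * Returns 0 and stores the product if it fits in 32 bits,
 * or -1 if the product would overflow.
 */
int
sketch_mul_u32(uint32_t chunks, uint32_t samples_per_chunk, uint32_t *out)
{
    uint64_t  v;

    v = (uint64_t) chunks * samples_per_chunk;

    if (v > UINT32_MAX) {
        return -1;
    }

    *out = (uint32_t) v;
    return 0;
}
```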
Kasei Wang [Thu, 18 Jul 2024 13:43:25 +0000 (17:43 +0400)]
HTTP/2: close connections initialized during graceful shutdown.
In some rare cases, graceful shutdown may happen while initializing an HTTP/2
connection. Previously, such a connection ignored the shutdown and remained
active. Now it is gracefully closed prior to processing any streams to
eliminate the shutdown delay.
Roman Arutyunyan [Thu, 27 Jun 2024 13:29:56 +0000 (17:29 +0400)]
Stream: allow servers with no handler.
Previously, handlers were mandatory. However, they are not always needed.
For example, a server configured with ssl_reject_handshake does not need a
handler. Such servers required a fake handler to pass the check. Now the handler
absence check is moved to runtime. If the handler is missing, the connection is
closed with the 500 code.
Previously, the last_chain field of ngx_quic_buffer_t could still reference freed
chains and buffers after calling ngx_quic_free_buffer(). While normally an
ngx_quic_buffer_t object should not be used after freeing, resetting the last_chain
field prevents a potential use-after-free.
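The general pattern is to clear every pointer into the released chain, not only the head. The sketch below uses hypothetical sketch_* types with malloc()/free() in place of the QUIC buffer and chain allocators, so it only illustrates the idea.

```c
#include <stdlib.h>

typedef struct sketch_chain_s  sketch_chain_t;

struct sketch_chain_s {
    sketch_chain_t  *next;
};

/* Hypothetical buffer with a head pointer and a cached tail pointer. */
typedef struct {
    sketch_chain_t  *chain;
    sketch_chain_t  *last_chain;
} sketch_buffer_t;

/* Appends one empty link; returns 0 on success, -1 on allocation failure. */
int
sketch_append(sketch_buffer_t *b)
{
    sketch_chain_t  *cl;

    cl = malloc(sizeof(sketch_chain_t));
    if (cl == NULL) {
        return -1;
    }

    cl->next = NULL;

    if (b->last_chain != NULL) {
        b->last_chain->next = cl;
    } else {
        b->chain = cl;
    }

    b->last_chain = cl;
    return 0;
}

/* Releases all links and resets both pointers, including the cached tail. */
void
sketch_free_buffer(sketch_buffer_t *b)
{
    sketch_chain_t  *cl, *next;

    for (cl = b->chain; cl != NULL; cl = next) {
        next = cl->next;
        free(cl);
    }

    b->chain = NULL;
    b->last_chain = NULL;   /* without this, the tail would dangle */
}
```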
Roman Arutyunyan [Tue, 28 May 2024 13:19:08 +0000 (17:19 +0400)]
QUIC: ignore CRYPTO frames after handshake completion.
Sending handshake-level CRYPTO frames after the client's Finished message could
lead to memory disclosure and a potential segfault, if those frames are sent in
one packet with the Finished frame.
Roman Arutyunyan [Tue, 28 May 2024 13:18:50 +0000 (17:18 +0400)]
HTTP/3: fixed dynamic table overflow.
While inserting a new entry into the dynamic table, first the entry is added,
and then older entries are evicted until table size is within capacity. After
the first step, the number of entries may temporarily exceed the maximum
calculated from capacity by one entry, which previously caused table overflow.
The easiest way to trigger the issue is to keep adding entries with empty names
and values until first eviction.
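The add-then-evict order can be illustrated with a small model. It assumes the QPACK entry size rule from RFC 9204 (name length plus value length plus 32 bytes) and a capacity of 4096 bytes, so at most 128 entries fit. All names are hypothetical, and reserving one spare slot is just one way to tolerate the temporary extra entry, not necessarily what the fix does.

```c
#include <stddef.h>

#define SKETCH_CAPACITY        4096
#define SKETCH_ENTRY_OVERHEAD  32
#define SKETCH_MAX_ENTRIES     (SKETCH_CAPACITY / SKETCH_ENTRY_OVERHEAD)

typedef struct {
    /* one spare slot for the entry that is added before eviction */
    size_t  sizes[SKETCH_MAX_ENTRIES + 1];
    size_t  head;    /* index of the oldest entry */
    size_t  count;   /* current number of entries */
    size_t  used;    /* sum of entry sizes */
    size_t  peak;    /* highest count observed, kept for illustration */
} sketch_table_t;

/* Adds an entry first, then evicts the oldest ones until within capacity. */
int
sketch_insert(sketch_table_t *t, size_t name_len, size_t value_len)
{
    size_t  size, slot;

    size = name_len + value_len + SKETCH_ENTRY_OVERHEAD;

    if (size > SKETCH_CAPACITY) {
        return -1;
    }

    /* step 1: add; count may now be SKETCH_MAX_ENTRIES + 1 */
    slot = (t->head + t->count) % (SKETCH_MAX_ENTRIES + 1);
    t->sizes[slot] = size;
    t->count++;
    t->used += size;

    if (t->count > t->peak) {
        t->peak = t->count;
    }

    /* step 2: evict the oldest entries until the table fits again */
    while (t->used > SKETCH_CAPACITY) {
        t->used -= t->sizes[t->head];
        t->head = (t->head + 1) % (SKETCH_MAX_ENTRIES + 1);
        t->count--;
    }

    return 0;
}
```

With empty names and values every entry takes exactly 32 bytes, so the 129th insertion is the first one to exceed the capacity and the entry count briefly reaches 129 before the oldest entry is evicted.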
Roman Arutyunyan [Tue, 28 May 2024 13:18:28 +0000 (17:18 +0400)]
HTTP/3: decoder stream pre-creation.
Previously a decoder stream was created on demand for sending Section
Acknowledgement, Stream Cancellation and Insert Count Increment. If conditions
for sending any of these instructions never happen, a decoder stream is not
created at all. These conditions include client not using the dynamic table and
no streams abandoned by server (RFC 9204, Section 2.2.2.2). However RFC 9204,
Section 4.2 defines only one condition for not creating a decoder stream:
An endpoint MAY avoid creating a decoder stream if its decoder sets
the maximum capacity of the dynamic table to zero.
The change enables pre-creation of the decoder stream at HTTP/3 session
initialization if maximum dynamic table capacity is not zero. Note that this
value is currently hardcoded to 4096 bytes and is not configurable, so the
stream is now always created.
Also, the change fixes a potential stack overflow when creating a decoder
stream in ngx_http_v3_send_cancel_stream() while draining a request stream by
ngx_drain_connections(). Creating a decoder stream involves calling
ngx_get_connection(), which calls ngx_drain_connections(), which will drain the
same request stream again. If client's MAX_STREAMS for uni stream is high
enough, these recursive calls will continue until we run out of stack.
Otherwise, decoder stream creation will fail at some point and the request
stream connection will be drained. This may result in use-after-free, since
this connection could still be referenced up the stack.
Edgar Bonet [Thu, 16 May 2024 09:15:10 +0000 (11:15 +0200)]
Configure: fixed building libatomic test.
Using "long *" instead of "AO_t *" leads either to -Wincompatible-pointer-types
or -Wpointer-sign warnings, depending on whether long and size_t are compatible
types (e.g., ILP32 versus LP64 data models). Notably, -Wpointer-sign warnings
are enabled by default in Clang only, and -Wincompatible-pointer-types is an
error starting from GCC 14.
Signed-off-by: Edgar Bonet <bonet@grenoble.cnrs.fr>
Passing from udp was not possible for the most part due to preread buffer
restriction. Passing to udp could occasionally work, but the connection would
still be bound to the original listen rbtree, which prevented it from being
deleted on connection closure.
SSL: fixed possible configuration overwrite loading "engine:" keys.
When loading certificate keys via ENGINE_load_private_key() in runtime,
it was possible to overwrite configuration on ENGINE_by_id() failure.
OpenSSL documentation doesn't describe errors in detail; the only reason
I found, in a comment to an example, is when the engine is not available.
HTTP/3: fixed handling of malformed request body length.
Previously, a request body larger than declared in Content-Length resulted in
a 413 status code, because Content-Length was mistakenly used as the maximum
allowed request body, similar to client_max_body_size. Following the HTTP/3
specification, such requests are now rejected with the 400 error as malformed.
QUIC: fixed close timer processing with early data.
The ngx_quic_run() function uses qc->close timer to limit the handshake
duration. Normally it is removed by ngx_quic_do_init_streams() which is
called once when we are done with initial SSL processing.
The problem happens when the client sends early data and streams are
initialized in the ngx_quic_run() -> ngx_quic_handle_datagram() call.
In this case the order of the set/remove timer calls is reversed: the close
timer is set up after the removal, so it fires at its assigned time and starts
an unexpected connection close process.
The fix is to skip setting the timer if streams were initialized during
handling of the initial datagram. The idle timer for quic is set anyway,
and stream-related timeouts are managed by application layer.
Sergey Kandaurov [Mon, 18 Mar 2024 13:14:30 +0000 (17:14 +0400)]
Fixed undefined behaviour with IPv4-mapped IPv6 addresses.
Previously, undefined behaviour could result from left-shifting a signed integer
produced by implicit integer promotion, such that the most significant bit landed
on the sign bit.
In practice, though, this results in the same left value as with an explicit
cast, at least on known compilers, such as GCC and Clang. The reason is that
in_addr_t, which is equivalent to uint32_t and same as "unsigned int" in ILP32
and LP64 data type models, has the same type width as the intermediate after
integer promotion, so there are no side effects such as sign-extension. This
explains why adding an explicit cast does not change object files in practice.
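The promotion issue can be shown with a short sketch. An unsigned char operand is promoted to int before the shift, so shifting a byte of 0x80 or more left by 24 moves a bit into the sign bit of a 32-bit int. Casting each byte to an unsigned 32-bit type first keeps the shift well defined. The function name is hypothetical and uint32_t is used here in place of in_addr_t.

```c
#include <stdint.h>

/*
 * Extracts the IPv4 address from the last four bytes of a
 * 16-byte IPv4-mapped IPv6 address, in host byte order.
 */
uint32_t
sketch_mapped_to_ipv4(const unsigned char *p)
{
    /* each byte is cast before shifting, so no signed int is shifted */
    return ((uint32_t) p[12] << 24)
           | ((uint32_t) p[13] << 16)
           | ((uint32_t) p[14] << 8)
           | (uint32_t) p[15];
}
```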
Piotr Sikora [Thu, 14 Mar 2024 14:37:20 +0000 (18:37 +0400)]
Geo: fixed uninitialized memory access.
While copying ngx_http_variable_value_t structures to geo binary base
in ngx_http_geo_copy_values(), and similarly in the stream module,
uninitialized parts of these structures are copied as well. These
include the "escape" field and possible holes. Calculating crc32 of
this data triggers uninitialized memory access.
In preparation for adding more parameters to the listen directive,
and to be in sync with the corresponding structure in the http module.
No functional changes.
Sergey Kandaurov [Fri, 22 Mar 2024 10:18:51 +0000 (14:18 +0400)]
Stream: using ngx_stream_ssl_srv_conf_t *sscf naming convention.
Originally, the stream module was developed based on the mail module,
following the existing style. Later it diverged to closely follow
the http module development. This change updates style to use sscf
naming convention throughout the stream module, which matches the http
module code style. No functional changes.
Roman Arutyunyan [Wed, 21 Feb 2024 13:36:02 +0000 (17:36 +0400)]
Stream: ngx_stream_pass_module.
The module allows passing connections from Stream to other modules such as HTTP
or Mail, as well as back to Stream. Previously, this was only possible with
proxying. Connections with preread buffer read out from socket cannot be
passed.
The module allows selective SSL termination based on SNI.
Roman Arutyunyan [Thu, 14 Dec 2023 17:58:39 +0000 (21:58 +0400)]
Stream: virtual servers.
Server name is taken either from ngx_stream_ssl_module or
ngx_stream_ssl_preread_module.
The change adds "default_server" parameter to the "listen" directive,
as well as the following directives: "server_names_hash_max_size",
"server_names_hash_bucket_size", "server_name" and "ssl_reject_handshake".
Roman Arutyunyan [Wed, 13 Dec 2023 14:04:55 +0000 (18:04 +0400)]
Stream: socket peek in preread phase.
Previously, preread buffer was always read out from socket, which made it
impossible to terminate SSL on the connection without introducing additional
SSL BIOs. The following patches will rely on this.
Now, when possible, recv(MSG_PEEK) is used instead, which keeps data in socket.
It's called if SSL is not already terminated and if an edge-triggered event
method is used. For epoll, EPOLLRDHUP support is also required.
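The effect of MSG_PEEK can be demonstrated outside of nginx with a socket pair: the peeked bytes stay in the socket and are returned again by the next ordinary read. This is a POSIX-only sketch with hypothetical function names; it does not show the event handling conditions mentioned above.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Copies up to "size" pending bytes without consuming them. */
ssize_t
sketch_peek(int fd, void *buf, size_t size)
{
    return recv(fd, buf, size, MSG_PEEK);
}

/*
 * Sends 5 bytes over a socket pair, peeks them, then reads them.
 * Returns 0 if the peeked data was still available to the normal read.
 */
int
sketch_peek_demo(void)
{
    int      sv[2], rc;
    char     peeked[8], consumed[8];
    ssize_t  n1, n2;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
        return -1;
    }

    rc = -1;

    if (send(sv[0], "hello", 5, 0) == 5) {
        n1 = sketch_peek(sv[1], peeked, sizeof(peeked));
        n2 = recv(sv[1], consumed, sizeof(consumed), 0);

        if (n1 == 5 && n2 == 5 && memcmp(peeked, consumed, 5) == 0) {
            rc = 0;
        }
    }

    close(sv[0]);
    close(sv[1]);

    return rc;
}
```

Because the data remains in the socket, whatever later takes over the connection can read the same bytes from the start.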
Roman Arutyunyan [Wed, 14 Feb 2024 11:55:37 +0000 (15:55 +0400)]
QUIC: fixed stream cleanup (ticket #2586).
Stream connection cleanup handler ngx_quic_stream_cleanup_handler() calls
ngx_quic_shutdown_stream() after which it resets the pointer from quic stream
to the connection (sc->connection = NULL). Previously if this call failed,
sc->connection retained the old value, while the connection was freed by the
application code. This resulted later in a second attempt to close the freed
connection, which led to an allocator double free error.
The fix is to reset the sc->connection pointer in case of error.
Sergey Kandaurov [Wed, 14 Feb 2024 11:55:34 +0000 (15:55 +0400)]
QUIC: trial packet decryption in response to invalid key update.
Inspired by RFC 9001, Section 6.3, trial packet decryption with the current
keys is now used to avoid a timing side-channel signal. Further, this fixes
a segfault while accessing missing next keys (ticket #2585).
Roman Arutyunyan [Wed, 14 Feb 2024 12:56:28 +0000 (16:56 +0400)]
QUIC: fixed unsent MTU probe acknowledgement.
Previously if an MTU probe send failed early in ngx_quic_frame_sendto()
due to allocation error or congestion control, the application level packet
number was not increased, but was still saved as MTU probe packet number.
Later when a packet with this number was acknowledged, the unsent MTU probe
was acknowledged as well. This could result in discovering a bigger MTU than
supported by the path, which could lead to EMSGSIZE (Message too long) errors
while sending further packets.
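The message describes the failure but not the exact code change. A rough model of the intended invariant, with hypothetical names, is that a probe packet number is recorded only when the probe was actually sent, so an acknowledgement of an unrelated packet with that number cannot be mistaken for a probe acknowledgement.

```c
#include <stdint.h>

/* Hypothetical path state; not the nginx QUIC structures. */
typedef struct {
    uint64_t  next_pnum;    /* next application level packet number */
    uint64_t  probe_pnum;   /* packet number of the outstanding probe */
    int       probe_sent;   /* set only if the probe really went out */
} sketch_path_t;

/*
 * Tries to send an MTU probe. "send_ok" stands in for the outcome of
 * the send attempt (allocation and congestion control included).
 */
int
sketch_send_probe(sketch_path_t *p, int send_ok)
{
    if (!send_ok) {
        /* nothing was sent: no packet number consumed or recorded */
        return -1;
    }

    p->probe_pnum = p->next_pnum++;
    p->probe_sent = 1;

    return 0;
}

/* Returns 1 if the acknowledged packet number belongs to a sent probe. */
int
sketch_probe_acked(const sketch_path_t *p, uint64_t acked_pnum)
{
    return p->probe_sent && p->probe_pnum == acked_pnum;
}
```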
The problem existed since PMTUD was introduced in 58afcd72446f (1.25.2).
Back then only the unlikely memory allocation error could trigger it. However
in efcdaa66df2e congestion control was added to ngx_quic_frame_sendto() which
can now trigger the issue with a higher probability.
Maxim Dounin [Tue, 30 Jan 2024 00:20:10 +0000 (03:20 +0300)]
Upstream: fixed usage of closed sockets with filter finalization.
When filter finalization is triggered when working with an upstream server,
and error_page redirects request processing to some simple handler,
ngx_http_request_finalize() triggers request termination when the response
is sent. In particular, via the upstream cleanup handler, nginx will close
the upstream connection and the corresponding socket.
Still, this can happen with ngx_event_pipe() on the stack. While
the code will set p->downstream_error due to NGX_ERROR returned from the
output filter chain by filter finalization, otherwise the error will be
ignored till control returns to ngx_http_upstream_process_request().
And event pipe might try reading from the (already closed) socket, resulting
in "readv() failed (9: Bad file descriptor) while reading upstream" errors
(or even segfaults with SSL).
Such errors were seen with the following configuration:
Fix is to clear p->upstream in ngx_http_upstream_finalize_request(),
and ensure that p->upstream is checked in ngx_event_pipe_read_upstream()
and when handling events at ngx_event_pipe() exit.
Maxim Dounin [Tue, 30 Jan 2024 00:20:05 +0000 (03:20 +0300)]
Fixed request termination with AIO and subrequests (ticket #2555).
When a request was terminated due to an error via ngx_http_terminate_request()
while an AIO operation was running in a subrequest, various issues were
observed. This happened because ngx_http_request_finalizer() was only set
in the subrequest where ngx_http_terminate_request() was called, but not
in the subrequest where the AIO operation was running. After completion
of the AIO operation normal processing of the subrequest was resumed, leading
to issues.
In particular, in case of the upstream module, termination of the request
called upstream cleanup, which closed the upstream connection. Attempts to
further work with the upstream connection after AIO operation completion
resulted in segfaults in ngx_ssl_recv(), "readv() failed (9: Bad file
descriptor) while reading upstream" errors, or socket leaks.
In ticket #2555, issues were observed with the following configuration
with cache background update (with thread writing instrumented to
introduce a delay, when a client closes the connection during an update):
Similarly, the same issue can be seen with SSI, and can be caused by
errors in subrequests, such as in the following configuration
(where "/proxy" uses AIO, and "/sleep" returns 444 after some delay,
causing request termination):
Similarly, issues can be observed with just static files. However,
with static files potential impact is limited due to timeout safeguards
in ngx_http_writer(), and the fact that c->error is set during request
termination.
In a simple configuration with an AIO operation in the active subrequest,
such as in the following configuration, the connection is closed right
after completion of the AIO operation anyway, since ngx_http_writer()
tries to write to the connection and fails due to c->error set:
Fix is to introduce r->main->terminated flag, which is to be checked
by AIO event handlers when the r->main->blocked counter is decremented.
When the flag is set, handlers are expected to wake up the connection
instead of the subrequest (which might be already cleaned up).
Additionally, now ngx_http_request_finalizer() is always set in the
active subrequest, so waking up the connection properly finalizes the
request even if termination happened in a non-active subrequest.
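The decision made by an AIO completion handler under this scheme can be sketched as follows. The structure and the enum are hypothetical simplifications: the real handlers operate on the request and connection objects, and only the "blocked" counter and the "terminated" flag correspond to names mentioned above.

```c
/* Hypothetical stand-in for the fields of the main request. */
typedef struct {
    unsigned  blocked;      /* number of AIO operations in flight */
    int       terminated;   /* set when the request was terminated */
} sketch_main_request_t;

enum {
    SKETCH_WAKE_SUBREQUEST = 0,   /* resume normal subrequest processing */
    SKETCH_WAKE_CONNECTION = 1    /* let the connection finalize the request */
};

/*
 * Called when an AIO operation completes: drops the blocked counter
 * and reports what should be woken up.
 */
int
sketch_aio_done(sketch_main_request_t *m)
{
    m->blocked--;

    if (m->terminated) {
        /* the subrequest might be already cleaned up */
        return SKETCH_WAKE_CONNECTION;
    }

    return SKETCH_WAKE_SUBREQUEST;
}
```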
Maxim Dounin [Mon, 29 Jan 2024 07:31:37 +0000 (10:31 +0300)]
AIO operations now add timers (ticket #2162).
Each AIO (thread IO) operation being run is now accompanied by a 1-minute
timer. This timer prevents unexpected shutdown of the worker process while
an AIO operation is running, and logs an alert if the operation is running
for too long.
This fixes "open socket left" alerts during worker processes shutdown
due to pending AIO (or thread IO) operations while corresponding requests
have no timers. In particular, such errors were observed while reading
cache headers (ticket #2162), and with worker_shutdown_timeout.
Maxim Dounin [Mon, 29 Jan 2024 07:29:39 +0000 (10:29 +0300)]
Silenced complaints about socket leaks on forced termination.
When graceful shutdown was requested, and then nginx was forced to
do fast shutdown, it used to (incorrectly) complain about open sockets
left in connections which weren't yet closed when fast shutdown
was requested.
Fix is to avoid complaining about open sockets when fast shutdown was
requested after graceful one. Abnormal termination, if requested with
the WINCH signal, can still happen though.