path: root/src/os/unix
Commit message (Author, Date)
* Core: fix build without libcrypt. (Piotr Sikora, 2025-02-18)
| libcrypt is no longer part of glibc, so it might not be available.
|
| Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
* On DragonFly BSD 5.8+, TCP_KEEPIDLE and TCP_KEEPINTVL are in secs. (Andy Pan, 2024-11-19)
|
* Detect cache line size at runtime on macOS. (Piotr Sikora, 2024-02-26)
| Notably, Apple Silicon CPUs have a 128-byte cache line size, which is twice the default configured for generic aarch64.
|
| Signed-off-by: Piotr Sikora <piotr@aviatrix.com>
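A minimal sketch of runtime cache line detection, not the code from this commit. It assumes the macOS `hw.cachelinesize` sysctl and, elsewhere, the glibc `_SC_LEVEL1_DCACHE_LINESIZE` sysconf name; `DEFAULT_CACHE_LINE` and `detect_cache_line_size()` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <unistd.h>

#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical compile-time default, standing in for the configured value. */
#define DEFAULT_CACHE_LINE  64

/* Returns the cache line size in bytes, or the default if detection fails. */
static size_t
detect_cache_line_size(void)
{
#if defined(__APPLE__)
    long    value = 0;
    size_t  len = sizeof(value);

    /* Ask the kernel; on Apple Silicon this is expected to report 128. */
    if (sysctlbyname("hw.cachelinesize", &value, &len, NULL, 0) == 0
        && value > 0)
    {
        return (size_t) value;
    }
#elif defined(_SC_LEVEL1_DCACHE_LINESIZE)
    long  value = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);

    /* sysconf() may return 0 or -1 when the value is not known. */
    if (value > 0) {
        return (size_t) value;
    }
#endif

    return DEFAULT_CACHE_LINE;
}
```

The fallback keeps the compile-time default in place whenever the platform cannot answer, so a failed lookup never makes the result worse than before.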
* AIO operations now add timers (ticket #2162). (Maxim Dounin, 2024-01-29)
| Each AIO (thread IO) operation being run is now accompanied by a 1-minute timer. This timer prevents unexpected shutdown of the worker process while an AIO operation is running, and logs an alert if the operation is running for too long.
|
| This fixes "open socket left" alerts during worker processes shutdown due to pending AIO (or thread IO) operations while corresponding requests have no timers. In particular, such errors were observed while reading cache headers (ticket #2162), and with worker_shutdown_timeout.
* Silenced complaints about socket leaks on forced termination. (Maxim Dounin, 2024-01-29)
| When graceful shutdown was requested, and then nginx was forced to do fast shutdown, it used to (incorrectly) complain about open sockets left in connections which weren't yet closed when fast shutdown was requested.
|
| Fix is to avoid complaining about open sockets when fast shutdown was requested after graceful one. Abnormal termination, if requested with the WINCH signal, can still happen though.
* QUIC: path MTU discovery. (Roman Arutyunyan, 2023-08-14)
| MTU selection starts by doubling the initial MTU until the first failure. Then binary search is used to find the path MTU.
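The two phases described above (doubling, then binary search) can be sketched as follows. This is an illustration under stated assumptions, not the QUIC code from the commit: the probe callback type, `simulated_probe()`, and `discover_path_mtu()` are hypothetical names, and a real probe would send a padded packet and wait for an acknowledgement or a loss.

```c
#include <stddef.h>

/* Hypothetical probe callback: nonzero if a packet of `size` bytes got through. */
typedef int (*mtu_probe_pt)(size_t size, void *data);

/*
 * Simulated path for experimenting: `data` points to a size_t holding the
 * real path MTU, and any probe not larger than it succeeds.
 */
static int
simulated_probe(size_t size, void *data)
{
    return size <= *(size_t *) data;
}

/* Returns the largest size in [initial, max] that the probe accepts. */
static size_t
discover_path_mtu(size_t initial, size_t max, mtu_probe_pt probe, void *data)
{
    size_t  good, bad, mid;

    if (initial == 0) {
        return 0;
    }

    good = initial;    /* assumed to be known to work already */

    /* Phase 1: keep doubling until a probe fails or the cap is reached. */
    for ( ;; ) {
        if (good >= max) {
            return good;
        }

        bad = (good > max / 2) ? max : good * 2;

        if (!probe(bad, data)) {
            break;
        }

        good = bad;
    }

    /* Phase 2: binary search between the last success and the first failure. */
    while (bad - good > 1) {
        mid = good + (bad - good) / 2;

        if (probe(mid, data)) {
            good = mid;

        } else {
            bad = mid;
        }
    }

    return good;
}
```

Doubling finds a failing upper bound in a logarithmic number of probes, which is what lets the binary search start from a bracket instead of scanning every size up to the cap.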
* Merged with the default branch. (Sergey Kandaurov, 2023-01-02)
|\
| * Style. (BullerDu, 2022-12-16)
| |
* | Merged with the default branch. (Sergey Kandaurov, 2022-12-15)
|\|
| * Fixed segfault when switching off master process during upgrade. (Maxim Dounin, 2022-11-23)
| | Binary upgrades are not supported without a master process. It is, however, possible that nginx running with a master process is asked to upgrade the binary while the configuration file available on disk at that time includes "master_process off;".
| |
| | If this happens, listening sockets inherited from the previous binary will have ls[i].previous set. But the old cycle on initial process startup, including startup after binary upgrade, is destroyed by ngx_init_cycle() once configuration parsing is complete. As a result, an attempt to dereference ls[i].previous in ngx_event_process_init() accesses already freed memory.
| |
| | Fix is to avoid looking into ls[i].previous if the old cycle is already freed.
| |
| | With this change it is also no longer needed to clear ls[i].previous in worker processes, so the relevant code was removed.
| * Process events posted by ngx_close_idle_connections() immediately. (Roman Arutyunyan, 2022-11-18)
| | Previously, if an event was posted by a read event handler, called by ngx_close_idle_connections(), that event was not processed until the next event loop iteration, which could happen after a timeout.
* | Merged with the default branch. (Sergey Kandaurov, 2022-07-26)
|\|
| * Events: fixed EPOLLRDHUP with FIONREAD (ticket #2367). (Maxim Dounin, 2022-07-15)
| | When reading exactly rev->available bytes, rev->available might become 0 after FIONREAD usage introduction in efd71d49bde0. On the next call of ngx_readv_chain() on systems with EPOLLRDHUP this resulted in return without any actions, that is, with rev->ready set, and this in turn resulted in no timers set in event pipe, leading to socket leaks.
| |
| | Fix is to reset rev->ready in ngx_readv_chain() when returning due to rev->available being 0 with EPOLLRDHUP, much like it is already done in ngx_unix_recv(). This ensures that if rev->available becomes 0, on systems with EPOLLRDHUP support appropriate EPOLLRDHUP-specific handling will happen on the next ngx_readv_chain() call.
| |
| | While here, also synced ngx_readv_chain() to match ngx_unix_recv() and reset rev->ready when returning due to rev->available being 0 with kqueue. This is a mostly cosmetic change, as rev->ready is anyway reset when rev->available is set to 0.
* | Merged with the default branch. (Sergey Kandaurov, 2022-06-22)
|\|
| * Fixed runtime handling of systems without EPOLLRDHUP support. (Marcus Ball, 2022-05-30)
| | In 7583:efd71d49bde0 (nginx 1.17.5) along with introduction of the ioctl(FIONREAD) support proper handling of systems without EPOLLRDHUP support in the kernel (but with EPOLLRDHUP in headers) was broken.
| |
| | Before the change, rev->available was never set to 0 unless ngx_use_epoll_rdhup was also set (that is, runtime test for EPOLLRDHUP introduced in 6536:f7849bfb6d21 succeeded). After the change, rev->available might reach 0 on systems without runtime EPOLLRDHUP support, stopping further reading in ngx_readv_chain() and ngx_unix_recv(). And, if EOF happened to be already reported along with the last event, it is not reported again by epoll_wait(), leading to connection hangs and timeouts on such systems.
| |
| | This affects Linux kernels before 2.6.17 if nginx was compiled with newer headers, and, more importantly, emulation layers, such as DigitalOcean's App Platform's / gVisor's epoll emulation layer.
| |
| | Fix is to explicitly check ngx_use_epoll_rdhup before the corresponding rev->pending_eof tests in ngx_readv_chain() and ngx_unix_recv().
| * Core: added autotest for UDP segmentation offloading. (Vladimir Homutov, 2022-01-26)
| |
| * Core: added function for local source address cmsg. (Vladimir Homutov, 2022-01-25)
| |
| * Core: made the ngx_sendmsg() function non-static. (Vladimir Homutov, 2022-01-25)
| | The NGX_HAVE_ADDRINFO_CMSG macro is defined when at least one of the methods to deal with the corresponding control message is available.
* | QUIC: fixed macro style. (Vladimir Homutov, 2022-01-25)
| |
* | Merged with the default branch. (Sergey Kandaurov, 2021-12-29)
|\|
| * Support for sendfile(SF_NOCACHE). (Maxim Dounin, 2021-12-27)
| | The SF_NOCACHE flag, introduced in FreeBSD 11 along with the new non-blocking sendfile() implementation by glebius@, makes it possible to use sendfile() along with the "directio" directive.
| * Simplified sendfile(SF_NODISKIO) usage. (Maxim Dounin, 2021-12-27)
| | Starting with FreeBSD 11, there is no need to use AIO operations to preload data into cache for sendfile(SF_NODISKIO) to work. Instead, sendfile() handles non-blocking loading of data from disk by itself. It still can, however, return EBUSY if a page is already being loaded (for example, by a different process). If this happens, we now post an event for the next event loop iteration, so sendfile() is retried "after a short period", as the manpage recommends.
| |
| | The limit on the number of EBUSY errors tolerated without any progress is preserved, but now it does not result in an alert, since on an idle system an event loop iteration might be very short and EBUSY can happen many times in a row. Instead, SF_NODISKIO is simply disabled for one call once the limit is reached.
| |
| | With this change, sendfile(SF_NODISKIO) is now used automatically as long as sendfile() is enabled, and no longer requires "aio on;".
* | Merged with the default branch. (Ruslan Ermilov, 2021-12-24)
|\|
| * HTTP/2: fixed "task already active" with sendfile in threads. (Maxim Dounin, 2021-11-25)
| | With sendfile in threads, "task already active" alerts might appear in logs if a write event happens on the main HTTP/2 connection, triggering a sendfile in threads while another thread operation is already running. Observed with "aio threads; aio_write on; sendfile on;" and with thread event handlers modified to post a write event to the main HTTP/2 connection (though it can happen without any modifications).
| |
| | Similarly, sendfile() with AIO preloading on FreeBSD can trigger a duplicate aio operation, resulting in "second aio post" alerts. This is, however, harder to reproduce, especially on modern FreeBSD systems, since sendfile() usually does not return EBUSY.
| |
| | Fix is to avoid starting a sendfile operation if another thread operation is active by checking r->aio in the thread handler (and, similarly, in the aio preload handler). The added check also makes the duplicate calls protection redundant, so it is removed.
* | QUIC: write and full stream shutdown support. (Roman Arutyunyan, 2021-12-13)
| | Full stream shutdown is now called from stream cleanup handler instead of explicitly sending frames.
* | Merged with the default branch. (Sergey Kandaurov, 2021-11-03)
|\|
| * Fixed sendfile() limit handling on Linux. (Maxim Dounin, 2021-10-29)
| | On Linux starting with 2.6.16, sendfile() silently limits all operations to MAX_RW_COUNT, defined as (INT_MAX & PAGE_MASK). This incorrectly triggered the interrupt check, and resulted in 0-sized writev() on the next loop iteration.
| |
| | Fix is to make sure the limit is always checked, so we will return from the loop if the limit is already reached even if number of bytes sent is not exactly equal to the number of bytes we've tried to send.
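As a worked example of the limit mentioned above, the following sketch computes (INT_MAX & PAGE_MASK) for a given page size and clamps a request to it. The helper names `max_rw_count()` and `clamp_request()` are hypothetical and are not taken from nginx or the kernel; the page size is assumed to be a power of two.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Largest byte count a single call transfers, per the formula quoted in
 * the commit message: INT_MAX with the in-page offset bits cleared.
 */
static size_t
max_rw_count(size_t page_size)
{
    size_t  page_mask = ~(page_size - 1);

    return (size_t) INT_MAX & page_mask;
}

/* Clamps a requested size to what one call can actually transfer. */
static size_t
clamp_request(size_t wanted, size_t page_size)
{
    size_t  max = max_rw_count(page_size);

    return (wanted > max) ? max : wanted;
}
```

With 4096-byte pages the limit is 0x7ffff000 (2147479552) bytes, so a request of INT_MAX bytes comes back 4095 bytes short even though nothing interrupted the call. A caller that treats "sent less than requested" as an interruption will misread this case.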
* | Merged with the default branch. (Sergey Kandaurov, 2021-09-01)
|\|
| * Give GCC atomics precedence over deprecated Darwin atomic(3). (Sergey Kandaurov, 2021-08-30)
| | This allows building nginx on macOS with -Wdeprecated-declarations.
* | Core: fixed errno clobbering in ngx_sendmsg(). (Vladimir Homutov, 2021-07-20)
| | This was broken by 2dfd313f22f2.
* | Merged with the default branch. (Sergey Kandaurov, 2021-07-15)
|\|
| * Use only preallocated memory in ngx_readv_chain() (ticket #1408). (Ruslan Ermilov, 2021-07-05)
| | In d1bde5c3c5d2, the number of preallocated iovec's for ngx_readv_chain() was increased. Still, in some setups, the function might allocate memory for iovec's from a connection pool, which is only freed when closing the connection.
| |
| | The ngx_readv_chain() function was modified to use only preallocated memory, similarly to the ngx_writev_chain() change in 8e903522c17a.
* | Core: added separate function for local source address cmsg. (Vladimir Homutov, 2021-07-15)
| |
* | QUIC: added support for segmentation offloading. (Vladimir Homutov, 2021-07-15)
| | To improve output performance, UDP segmentation offloading is used if available. If there is a significant amount of data in an output queue and the path is verified, QUIC packets are not sent one-by-one, but instead are collected in a buffer, which is then passed to the kernel in a single sendmsg call, using UDP GSO. This method greatly decreases the number of system calls and thus system load.
* | Core: made the ngx_sendmsg() function non-static. (Vladimir Homutov, 2021-07-15)
|/ Additionally, the ngx_init_srcaddr_cmsg() function is introduced, which initializes a control message with the connection local address.
|
| The NGX_HAVE_ADDRINFO_CMSG macro is defined when at least one of the methods to deal with the corresponding control message is available.
* Restored zeroing of ngx_channel_t in ngx_pass_open_channel(). (Ruslan Ermilov, 2021-04-22)
| Due to structure's alignment, some uninitialized memory contents may have been passed between processes.
|
| Zeroing was removed in 0215ec9aaa8a.
|
| Reported by Johnny Wang.
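To illustrate why zeroing matters here, the sketch below fills a message structure only after clearing it, so that alignment padding never carries stale memory. The `channel_msg_t` layout and `channel_msg_init()` are hypothetical stand-ins and do not reproduce ngx_channel_t.

```c
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical message layout. On typical LP64 platforms the compiler
 * inserts padding after `pid` and after `fd`.
 */
typedef struct {
    unsigned long  command;
    pid_t          pid;
    long           slot;
    int            fd;
} channel_msg_t;

/* Clears the whole structure, padding included, then sets the fields. */
static void
channel_msg_init(channel_msg_t *ch, unsigned long command, pid_t pid,
    long slot, int fd)
{
    memset(ch, 0, sizeof(channel_msg_t));

    ch->command = command;
    ch->pid = pid;
    ch->slot = slot;
    ch->fd = fd;
}
```

Assigning the fields one by one without the memset() would leave the padding bytes holding whatever was previously on the stack, and those bytes are copied along when the structure is written to a socket.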
* Removed "ch" argument from ngx_pass_open_channel(). (Ruslan Ermilov, 2021-03-11)
|
* Introduced strerrordesc_np() support. (Maxim Dounin, 2021-03-01)
| The strerrordesc_np() function, introduced in glibc 2.32, provides an async-signal-safe way to obtain error messages. This makes it possible to avoid copying error messages.
* Improved maximum errno detection. (Maxim Dounin, 2021-03-01)
| Previously, systems without sys_nerr (or _sys_nerr) were handled with an assumption that errors start at 0 and are continuous. This is, however, not something POSIX requires, and not true on some platforms.
|
| Notably, on Linux, where sys_nerr is no longer available for newly linked binaries starting with glibc 2.32, there are gaps in the error list, which used to stop us from properly detecting maximum errno. Further, on GNU/Hurd errors start at 0x40000001.
|
| With this change, maximum errno detection is moved to the runtime code, now able to ignore gaps, and also detects the first error if needed. This fixes observed "Unknown error" messages as seen on Linux with glibc 2.32 and on GNU/Hurd.
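A rough sketch of runtime detection that tolerates gaps, under the assumption that unknown codes are reported either through errno set to EINVAL or through a message starting with "Unknown error". This is not the code from the commit: `SCAN_LIMIT`, `error_is_known()`, and `find_last_errno()` are hypothetical, and the fixed scan window is a simplification.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical scan window; an assumption made for this sketch only. */
#define SCAN_LIMIT  1000

/* Returns 1 if strerror() appears to have a real message for `err`. */
static int
error_is_known(int err)
{
    const char  *msg;

    errno = 0;
    msg = strerror(err);

    if (msg == NULL || errno == EINVAL) {
        return 0;
    }

    if (strncmp(msg, "Unknown error", 13) == 0) {
        return 0;
    }

    return 1;
}

/*
 * Scans SCAN_LIMIT codes starting at `first` and returns the highest
 * known one, or -1 if none is known. Unknown codes in the middle of the
 * range are simply skipped instead of ending the scan.
 */
static int
find_last_errno(int first)
{
    int  err, last;

    last = -1;

    for (err = first; err < first + SCAN_LIMIT; err++) {
        if (error_is_known(err)) {
            last = err;
        }
    }

    return last;
}
```

Passing the first code as a parameter covers platforms where error numbers do not start at 0. On C libraries that return a generic message for unknown codes without setting errno, this check cannot tell them apart and the scan simply reports the end of the window.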
* Cache: introduced min_free cache clearing. (Maxim Dounin, 2020-06-22)
| Clearing cache based on free space left on a file system is expected to allow better disk utilization in some cases, notably when disk space might be also used for something other than nginx cache (including nginx own temporary files) and while loading cache (when cache size might be inaccurate for a while, effectively disabling max_size cache clearing).
|
| Based on a patch by Adam Bambuch.
* Too large st_blocks values are now ignored (ticket #157). (Maxim Dounin, 2020-06-22)
| With XFS, using "allocsize=64m" mount option results in large preallocation being reported in the st_blocks as returned by fstat() till the file is closed. This in turn results in incorrect cache size calculations and wrong clearing based on max_size.
|
| To avoid too aggressive cache clearing on such volumes, st_blocks values which result in sizes larger than st_size and eight blocks (an arbitrary limit) are no longer trusted, and we use st_size instead.
|
| The ngx_de_fs_size() counterpart is intentionally not modified, as it is used on closed files and hence not affected by this problem.
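The rule above can be sketched as a small helper. This is an illustration, not nginx's macro: `file_fs_size()` is a hypothetical name, and taking st_blksize as the size of a "block" in the eight-block allowance is an assumption.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Returns the on-disk size to account for a file: the allocated size
 * from st_blocks, unless it exceeds st_size by more than eight blocks,
 * in which case st_size is used instead.
 */
static off_t
file_fs_size(const struct stat *sb)
{
    /* st_blocks counts 512-byte units. */
    off_t  allocated = (off_t) sb->st_blocks * 512;

    /* Assumption: the eight-block allowance is measured in st_blksize. */
    if (allocated > sb->st_size + 8 * (off_t) sb->st_blksize) {
        return sb->st_size;
    }

    return allocated;
}
```

For a 1000-byte file with 4096-byte blocks, an allocation of 4096 bytes is accepted as is, while a 64 MB preallocation is far beyond the allowance and falls back to 1000.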
* Large block sizes on Linux are now ignored (ticket #1168). (Maxim Dounin, 2020-06-22)
| NFS on Linux is known to report wsize as a block size (in both f_bsize and f_frsize, both in statfs() and statvfs()). On the other hand, typical file system block sizes on Linux (ext2/ext3/ext4, XFS) are limited to pagesize. (With FAT, block sizes can be at least up to 512k in extreme cases, but this doesn't really matter, see below.)
|
| To avoid too aggressive cache clearing on NFS volumes on Linux, block sizes larger than pagesize are now ignored.
|
| Note that it is safe to ignore large block sizes. Since 3899:e7cd13b7f759 (1.0.1) cache size is calculated based on fstat() st_blocks, and rounding to file system block size is preserved mostly for Windows.
|
| Note well that on other OSes valid block sizes seen are at least up to 65536. In particular, UFS on FreeBSD is known to work well with block and fragment sizes set to 65536.
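A minimal sketch of the check described above, assuming that an ignored block size is replaced by 512 bytes. That fallback value and the names `effective_block_size()` and `fs_block_size()` are assumptions for illustration; per the commit message, the page size cap is meant for Linux only.

```c
#include <stddef.h>
#include <sys/statvfs.h>
#include <unistd.h>

/* Hypothetical fallback used when the reported block size is not trusted. */
#define FALLBACK_BLOCK_SIZE  512

/* Applies the rule: block sizes larger than the page size are ignored. */
static size_t
effective_block_size(unsigned long reported, size_t page_size)
{
    if (reported == 0 || reported > page_size) {
        return FALLBACK_BLOCK_SIZE;
    }

    return (size_t) reported;
}

/* Looks up the block size of the file system that holds `path`. */
static size_t
fs_block_size(const char *path)
{
    struct statvfs  fs;

    if (statvfs(path, &fs) == -1) {
        return FALLBACK_BLOCK_SIZE;
    }

    return effective_block_size(fs.f_frsize, (size_t) sysconf(_SC_PAGESIZE));
}
```

An NFS mount that reports a 1 MB wsize as its block size would thus be treated as having 512-byte blocks, so rounding file sizes up to the block size no longer inflates the computed cache size.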
* Stream: fixed processing of zero length UDP packets (ticket #1982). (Vladimir Homutov, 2020-06-08)
|
* Fixed SIGQUIT not removing listening UNIX sockets (closes #753). (Ruslan Ermilov, 2020-06-01)
| Listening UNIX sockets were not removed on graceful shutdown, preventing subsequent runs. The fix is to replace the custom socket closing code in ngx_master_process_cycle() by the ngx_close_listening_sockets() call.
* Events: available bytes calculation via ioctl(FIONREAD). (Maxim Dounin, 2019-10-17)
| This makes it possible to avoid looping for a long time while working with a fast enough peer when data are added to the socket buffer faster than we are able to read and process them (ticket #1431). This is basically what we already do on FreeBSD with kqueue, where information about the number of bytes in the socket buffer is returned by the kevent() call.
|
| With other event methods rev->available is now set to -1 when the socket is ready for reading. Later in ngx_recv() and ngx_recv_chain(), if full buffer is received, real number of bytes in the socket buffer is retrieved using ioctl(FIONREAD). Reading more than this number of bytes ensures that even with edge-triggered event methods the event will be triggered again, so it is safe to stop processing of the socket and switch to other connections.
|
| Using ioctl(FIONREAD) only after reading a full buffer is an optimization. With this approach we only call ioctl(FIONREAD) when there are at least two recv()/readv() calls.
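The optimization described above (query FIONREAD only after a read that filled the buffer) can be sketched as follows. This is an illustration, not nginx code: `socket_bytes_available()` and `recv_and_count()` are hypothetical helpers, and treating a short read as "nothing pending" is a simplifying assumption.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the number of unread bytes in the socket buffer, or -1 on error. */
static int
socket_bytes_available(int fd)
{
    int  n;

    if (ioctl(fd, FIONREAD, &n) == -1) {
        return -1;
    }

    return n;
}

/*
 * Reads once into buf. Only when the buffer was filled completely does it
 * ask the kernel how much is still pending; a short read is taken to mean
 * the socket buffer was drained. On error *available is left untouched.
 */
static ssize_t
recv_and_count(int fd, char *buf, size_t size, int *available)
{
    ssize_t  n;

    n = recv(fd, buf, size, 0);

    if (n > 0 && (size_t) n == size) {
        *available = socket_bytes_available(fd);

    } else if (n >= 0) {
        *available = 0;
    }

    return n;
}
```

With 11 bytes queued and a 4-byte buffer, the first call returns 4 and reports 7 bytes still pending. A caller can use that number to decide how much more to read before yielding to other connections.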
* Fixed portability issues with union sigval. (Sergey Kandaurov, 2019-01-28)
| AIO support in nginx was originally developed against FreeBSD versions 4-6, where the sival_ptr field was named sigval_ptr (seemingly by mistake[1]), which made nginx use the only name available then. The standard-compliant name was restored in 2005 (first appeared in FreeBSD 7.0, 2008), retaining compatibility with previous versions[2][3]. In DragonFly, similar changes were committed in 2009[4], with backward compatibility recently removed[5].
|
| The change switches to the standard name, retaining compatibility with old FreeBSD versions.
|
| [1] https://svnweb.freebsd.org/changeset/base/48621
| [2] https://svnweb.freebsd.org/changeset/base/152029
| [3] https://svnweb.freebsd.org/changeset/base/174003
| [4] https://gitweb.dragonflybsd.org/dragonfly.git/commit/3693401
| [5] https://gitweb.dragonflybsd.org/dragonfly.git/commit/7875042
* Win32: removed NGX_DIR_MASK concept. (Maxim Dounin, 2018-12-24)
| Previous interface of ngx_open_dir() assumed that the passed directory name has room for NGX_DIR_MASK at the end (NGX_DIR_MASK_LEN bytes). While all direct users of ngx_open_dir() followed this interface, this also implied similar requirements for indirect uses - in particular, via ngx_walk_tree().
|
| Currently none of the ngx_walk_tree() uses provides appropriate space, and fixing this does not look like the right way to go. Instead, the ngx_open_dir() interface was changed to not require any additional space and use appropriate allocations instead.
* Fixed NGX_TID_T_FMT format specification for uint64_t. (Maxim Dounin, 2018-07-22)
| Previously, "%uA" was used, which corresponds to ngx_atomic_uint_t. Size of ngx_atomic_uint_t can be easily different from uint64_t, leading to undefined results.
* Removed glibc crypt_r() bug workaround (ticket #1469). (Maxim Dounin, 2018-05-23)
| The bug in question was fixed in glibc 2.3.2 and is no longer expected to manifest itself on real servers. On the other hand, the workaround causes compilation problems on various systems. Previously, we've already fixed the code to compile with musl libc (fd6fd02f6a4d), and now it is broken on Fedora 28 where glibc's crypt library was replaced by libxcrypt. So the workaround was removed.
* Fixed checking ngx_tcp_push() and ngx_tcp_nopush() return values. (Ruslan Ermilov, 2018-03-19)
| No functional changes.