Commit 6f66c56

Merge branch 'main' into branch-unique-reference-tracking
2 parents 7df2176 + 5f57f69 commit 6f66c56

34 files changed: +604 -83 lines changed

Android/testbed/app/build.gradle.kts

Lines changed: 6 additions & 1 deletion
@@ -92,7 +92,12 @@ android {
             }
             throw GradleException("Failed to find API level in $androidEnvFile")
         }
-        targetSdk = 35
+
+        // This controls the API level of the maxVersion managed emulator, which is used
+        // by CI and cibuildwheel. 34 takes up too much disk space (#142289), 35 has
+        // issues connecting to the internet (#142387), and 36 and later are not
+        // available as aosp_atd images yet.
+        targetSdk = 33
 
         versionCode = 1
         versionName = "1.0"

Doc/c-api/memory.rst

Lines changed: 5 additions & 1 deletion
@@ -677,7 +677,11 @@ The pymalloc allocator
 Python has a *pymalloc* allocator optimized for small objects (smaller or equal
 to 512 bytes) with a short lifetime. It uses memory mappings called "arenas"
 with a fixed size of either 256 KiB on 32-bit platforms or 1 MiB on 64-bit
-platforms. It falls back to :c:func:`PyMem_RawMalloc` and
+platforms. When Python is configured with :option:`--with-pymalloc-hugepages`,
+the arena size on 64-bit platforms is increased to 2 MiB to match the huge page
+size, and arena allocation will attempt to use huge pages (``MAP_HUGETLB`` on
+Linux, ``MEM_LARGE_PAGES`` on Windows) with automatic fallback to regular pages.
+It falls back to :c:func:`PyMem_RawMalloc` and
 :c:func:`PyMem_RawRealloc` for allocations larger than 512 bytes.
 
 *pymalloc* is the :ref:`default allocator <default-memory-allocators>` of the

Doc/library/subprocess.rst

Lines changed: 18 additions & 3 deletions
@@ -803,14 +803,29 @@ Instances of the :class:`Popen` class have the following methods:
 
    .. note::
 
-      When the ``timeout`` parameter is not ``None``, then (on POSIX) the
-      function is implemented using a busy loop (non-blocking call and short
-      sleeps). Use the :mod:`asyncio` module for an asynchronous wait: see
+      When ``timeout`` is not ``None`` and the platform supports it, an
+      efficient event-driven mechanism is used to wait for process termination:
+
+      - Linux >= 5.3 uses :func:`os.pidfd_open` + :func:`select.poll`
+      - macOS and other BSD variants use :func:`select.kqueue` +
+        ``KQ_FILTER_PROC`` + ``KQ_NOTE_EXIT``
+      - Windows uses ``WaitForSingleObject``
+
+      If none of these mechanisms are available, the function falls back to a
+      busy loop (non-blocking call and short sleeps).
+
+   .. note::
+
+      Use the :mod:`asyncio` module for an asynchronous wait: see
       :class:`asyncio.create_subprocess_exec`.
 
    .. versionchanged:: 3.3
       *timeout* was added.
 
+   .. versionchanged:: 3.15
+      if *timeout* is not ``None``, use efficient event-driven implementation
+      on Linux >= 5.3 and macOS / BSD.
+
 .. method:: Popen.communicate(input=None, timeout=None)
 
    Interact with process: Send data to stdin. Read data from stdout and stderr,
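
Note (illustrative, not part of the commit): the documentation change above concerns only how `Popen.wait(timeout=...)` waits internally; the public behaviour (return the exit code, or raise `subprocess.TimeoutExpired`) is unchanged. A minimal usage sketch follows. The helper name `wait_or_kill` is hypothetical and chosen only for this example.

```python
import subprocess
import sys


def wait_or_kill(proc, timeout):
    """Wait up to `timeout` seconds; kill the child if it is still running.

    Hypothetical helper for illustration. It relies only on the documented
    Popen.wait(timeout=...) contract, not on which wait mechanism is used.
    """
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        # Reap the killed child so it does not linger as a zombie.
        return proc.wait()


if __name__ == "__main__":
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    print("exit status:", wait_or_kill(child, timeout=0.2))
```

Whichever mechanism the interpreter picks, calling code like this should not need to change.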

Doc/using/configure.rst

Lines changed: 15 additions & 0 deletions
@@ -783,6 +783,21 @@ also be used to improve performance.
 
    See also :envvar:`PYTHONMALLOC` environment variable.
 
+.. option:: --with-pymalloc-hugepages
+
+   Enable huge page support for :ref:`pymalloc <pymalloc>` arenas (disabled by
+   default). When enabled, the arena size on 64-bit platforms is increased to
+   2 MiB and arena allocation uses ``MAP_HUGETLB`` (Linux) or
+   ``MEM_LARGE_PAGES`` (Windows) with automatic fallback to regular pages.
+
+   The configure script checks that the platform supports ``MAP_HUGETLB``
+   and emits a warning if it is not available.
+
+   On Windows, use the ``--pymalloc-hugepages`` flag with ``build.bat`` or
+   set the ``UsePymallocHugepages`` MSBuild property.
+
+   .. versionadded:: 3.15
+
 .. option:: --without-doc-strings
 
    Disable static documentation strings to reduce the memory footprint (enabled

Doc/whatsnew/3.15.rst

Lines changed: 20 additions & 0 deletions
@@ -743,6 +743,20 @@ ssl
 
   (Contributed by Ron Frederick in :gh:`138252`.)
 
+subprocess
+----------
+
+* :meth:`subprocess.Popen.wait`: when ``timeout`` is not ``None`` and the
+  platform supports it, an efficient event-driven mechanism is used to wait for
+  process termination:
+
+  - Linux >= 5.3 uses :func:`os.pidfd_open` + :func:`select.poll`.
+  - macOS and other BSD variants use :func:`select.kqueue` + ``KQ_FILTER_PROC`` + ``KQ_NOTE_EXIT``.
+  - Windows keeps using ``WaitForSingleObject`` (unchanged).
+
+  If none of these mechanisms are available, the function falls back to the
+  traditional busy loop (non-blocking call and short sleeps).
+  (Contributed by Giampaolo Rodola in :gh:`83069`).
 
 symtable
 --------
@@ -1463,6 +1477,12 @@ Build changes
   modules that are missing or packaged separately.
   (Contributed by Stan Ulbrych and Petr Viktorin in :gh:`139707`.)
 
+* The new configure option :option:`--with-pymalloc-hugepages` enables huge
+  page support for :ref:`pymalloc <pymalloc>` arenas. When enabled, arena size
+  increases to 2 MiB and allocation uses ``MAP_HUGETLB`` (Linux) or
+  ``MEM_LARGE_PAGES`` (Windows) with automatic fallback to regular pages.
+  On Windows, use ``build.bat --pymalloc-hugepages``.
+
 * Annotating anonymous mmap usage is now supported if Linux kernel supports
   :manpage:`PR_SET_VMA_ANON_NAME <PR_SET_VMA(2const)>` (Linux 5.17 or newer).
   Annotations are visible in ``/proc/<pid>/maps`` if the kernel supports the feature

Include/internal/pycore_obmalloc.h

Lines changed: 19 additions & 3 deletions
@@ -208,7 +208,11 @@ typedef unsigned int pymem_uint; /* assuming >= 16 bits */
  * mappings to reduce heap fragmentation.
  */
 #ifdef USE_LARGE_ARENAS
-#define ARENA_BITS 20 /* 1 MiB */
+# ifdef PYMALLOC_USE_HUGEPAGES
+# define ARENA_BITS 21 /* 2 MiB */
+# else
+# define ARENA_BITS 20 /* 1 MiB */
+# endif
 #else
 #define ARENA_BITS 18 /* 256 KiB */
 #endif
@@ -469,7 +473,7 @@ nfp free pools in usable_arenas.
 */
 
 /* How many arena_objects do we initially allocate?
- * 16 = can allocate 16 arenas = 16 * ARENA_SIZE = 4MB before growing the
+ * 16 = can allocate 16 arenas = 16 * ARENA_SIZE before growing the
  * `arenas` vector.
  */
 #define INITIAL_ARENA_OBJECTS 16
@@ -512,14 +516,26 @@ struct _obmalloc_mgmt {
 
    memory address bit allocation for keys
 
-   64-bit pointers, IGNORE_BITS=0 and 2^20 arena size:
+   ARENA_BITS is configurable: 20 (1 MiB) by default on 64-bit, or
+   21 (2 MiB) when PYMALLOC_USE_HUGEPAGES is enabled. All bit widths
+   below are derived from ARENA_BITS automatically.
+
+   64-bit pointers, IGNORE_BITS=0 and 2^20 arena size (default):
      15 -> MAP_TOP_BITS
      15 -> MAP_MID_BITS
      14 -> MAP_BOT_BITS
      20 -> ideal aligned arena
    ----
      64
 
+   64-bit pointers, IGNORE_BITS=0 and 2^21 arena size (hugepages):
+     15 -> MAP_TOP_BITS
+     15 -> MAP_MID_BITS
+     13 -> MAP_BOT_BITS
+     21 -> ideal aligned arena
+   ----
+     64
+
    64-bit pointers, IGNORE_BITS=16, and 2^20 arena size:
      16 -> IGNORE_BITS
      10 -> MAP_TOP_BITS
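
Note (illustrative, not part of the commit): the bit tables in the comment above can be checked with a few lines of arithmetic. The sketch below assumes that the top and middle levels each get `(ADDRESS_BITS - ARENA_BITS + 2) // 3` bits and the bottom level gets the remainder. That formula is an assumption made for this example; it reproduces the numbers listed in the comment but is not quoted from the header. The function name `radix_layout` is hypothetical.

```python
def radix_layout(arena_bits, ignore_bits=0, pointer_bits=64):
    """Split a pointer into radix-tree key fields for a given arena size.

    Assumption: interior levels get (address_bits - arena_bits + 2) // 3
    bits each, and the bottom level takes whatever is left over.
    """
    address_bits = pointer_bits - ignore_bits
    interior_bits = (address_bits - arena_bits + 2) // 3
    bottom_bits = address_bits - arena_bits - 2 * interior_bits
    return {
        "IGNORE_BITS": ignore_bits,
        "MAP_TOP_BITS": interior_bits,
        "MAP_MID_BITS": interior_bits,
        "MAP_BOT_BITS": bottom_bits,
        "ARENA_BITS": arena_bits,
        "ARENA_SIZE": 1 << arena_bits,  # bytes
    }


if __name__ == "__main__":
    for bits in (20, 21):
        layout = radix_layout(bits)
        field_names = ("IGNORE_BITS", "MAP_TOP_BITS", "MAP_MID_BITS",
                       "MAP_BOT_BITS", "ARENA_BITS")
        total = sum(layout[name] for name in field_names)
        print(bits, layout, "total bits:", total)
```

With `ARENA_BITS = 21`, the 16 initial arena objects cover 16 * 2 MiB = 32 MiB rather than 16 MiB, which is presumably why the hard-coded size was dropped from the `INITIAL_ARENA_OBJECTS` comment.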

Lib/profiling/sampling/sample.py

Lines changed: 17 additions & 12 deletions
@@ -42,6 +42,7 @@ def _pause_threads(unwinder, blocking):
     LiveStatsCollector = None
 
 _FREE_THREADED_BUILD = sysconfig.get_config_var("Py_GIL_DISABLED") is not None
+
 # Minimum number of samples required before showing the TUI
 # If fewer samples are collected, we skip the TUI and just print a message
 MIN_SAMPLES_FOR_TUI = 200
@@ -64,19 +65,23 @@ def __init__(self, pid, sample_interval_usec, all_threads, *, mode=PROFILING_MOD
         self.realtime_stats = False
 
     def _new_unwinder(self, native, gc, opcodes, skip_non_matching_threads):
-        if _FREE_THREADED_BUILD:
-            unwinder = _remote_debugging.RemoteUnwinder(
-                self.pid, all_threads=self.all_threads, mode=self.mode, native=native, gc=gc,
-                opcodes=opcodes, skip_non_matching_threads=skip_non_matching_threads,
-                cache_frames=True, stats=self.collect_stats
-            )
+        kwargs = {}
+        if _FREE_THREADED_BUILD or self.all_threads:
+            kwargs['all_threads'] = self.all_threads
         else:
-            unwinder = _remote_debugging.RemoteUnwinder(
-                self.pid, only_active_thread=bool(self.all_threads), mode=self.mode, native=native, gc=gc,
-                opcodes=opcodes, skip_non_matching_threads=skip_non_matching_threads,
-                cache_frames=True, stats=self.collect_stats
-            )
-        return unwinder
+            kwargs['only_active_thread'] = bool(self.all_threads)
+
+        return _remote_debugging.RemoteUnwinder(
+            self.pid,
+            mode=self.mode,
+            native=native,
+            gc=gc,
+            opcodes=opcodes,
+            skip_non_matching_threads=skip_non_matching_threads,
+            cache_frames=True,
+            stats=self.collect_stats,
+            **kwargs
+        )
 
     def sample(self, collector, duration_sec=None, *, async_aware=False):
         sample_interval_sec = self.sample_interval_usec / 1_000_000

Lib/subprocess.py

Lines changed: 143 additions & 2 deletions
@@ -748,6 +748,60 @@ def _use_posix_spawn():
     return False
 
 
+def _can_use_pidfd_open():
+    # Availability: Linux >= 5.3
+    if not hasattr(os, "pidfd_open"):
+        return False
+    try:
+        pidfd = os.pidfd_open(os.getpid(), 0)
+    except OSError as err:
+        if err.errno in {errno.EMFILE, errno.ENFILE}:
+            # transitory 'too many open files'
+            return True
+        # likely blocked by security policy like SECCOMP (EPERM,
+        # EACCES, ENOSYS)
+        return False
+    else:
+        os.close(pidfd)
+        return True
+
+
+def _can_use_kqueue():
+    # Availability: macOS, BSD
+    names = (
+        "kqueue",
+        "KQ_EV_ADD",
+        "KQ_EV_ONESHOT",
+        "KQ_FILTER_PROC",
+        "KQ_NOTE_EXIT",
+    )
+    if not all(hasattr(select, x) for x in names):
+        return False
+    kq = None
+    try:
+        kq = select.kqueue()
+        kev = select.kevent(
+            os.getpid(),
+            filter=select.KQ_FILTER_PROC,
+            flags=select.KQ_EV_ADD | select.KQ_EV_ONESHOT,
+            fflags=select.KQ_NOTE_EXIT,
+        )
+        kq.control([kev], 1, 0)
+        return True
+    except OSError as err:
+        if err.errno in {errno.EMFILE, errno.ENFILE}:
+            # transitory 'too many open files'
+            return True
+        return False
+    finally:
+        if kq is not None:
+            kq.close()
+
+
+_CAN_USE_PIDFD_OPEN = not _mswindows and _can_use_pidfd_open()
+_CAN_USE_KQUEUE = not _mswindows and _can_use_kqueue()
+
+
 # These are primarily fail-safe knobs for negatives. A True value does not
 # guarantee the given libc/syscall API will be used.
 _USE_POSIX_SPAWN = _use_posix_spawn()
@@ -2046,14 +2100,100 @@ def _try_wait(self, wait_flags):
                 sts = 0
             return (pid, sts)
 
+        def _wait_pidfd(self, timeout):
+            """Wait for PID to terminate using pidfd_open() + poll().
+            Linux >= 5.3 only.
+            """
+            if not _CAN_USE_PIDFD_OPEN:
+                return False
+            try:
+                pidfd = os.pidfd_open(self.pid, 0)
+            except OSError:
+                # May be:
+                # - ESRCH: no such process
+                # - EMFILE, ENFILE: too many open files (usually 1024)
+                # - ENODEV: anonymous inode filesystem not supported
+                # - EPERM, EACCES, ENOSYS: undocumented; may happen if
+                #   blocked by security policy like SECCOMP
+                return False
+
+            try:
+                poller = select.poll()
+                poller.register(pidfd, select.POLLIN)
+                events = poller.poll(timeout * 1000)
+                if not events:
+                    raise TimeoutExpired(self.args, timeout)
+                return True
+            finally:
+                os.close(pidfd)
+
+        def _wait_kqueue(self, timeout):
+            """Wait for PID to terminate using kqueue(). macOS and BSD only."""
+            if not _CAN_USE_KQUEUE:
+                return False
+            try:
+                kq = select.kqueue()
+            except OSError:
+                # likely EMFILE / ENFILE (too many open files)
+                return False
+
+            try:
+                kev = select.kevent(
+                    self.pid,
+                    filter=select.KQ_FILTER_PROC,
+                    flags=select.KQ_EV_ADD | select.KQ_EV_ONESHOT,
+                    fflags=select.KQ_NOTE_EXIT,
+                )
+                try:
+                    events = kq.control([kev], 1, timeout)  # wait
+                except OSError:
+                    return False
+                else:
+                    if not events:
+                        raise TimeoutExpired(self.args, timeout)
+                    return True
+            finally:
+                kq.close()
 
         def _wait(self, timeout):
-            """Internal implementation of wait() on POSIX."""
+            """Internal implementation of wait() on POSIX.
+
+            Uses efficient pidfd_open() + poll() on Linux or kqueue()
+            on macOS/BSD when available. Falls back to polling
+            waitpid(WNOHANG) otherwise.
+            """
             if self.returncode is not None:
                 return self.returncode
 
             if timeout is not None:
-                endtime = _time() + timeout
+                if timeout < 0:
+                    raise TimeoutExpired(self.args, timeout)
+                started = _time()
+                endtime = started + timeout
+
+                # Try efficient wait first.
+                if self._wait_pidfd(timeout) or self._wait_kqueue(timeout):
+                    # Process is gone. At this point os.waitpid(pid, 0)
+                    # will return immediately, but in very rare races
+                    # the PID may have been reused.
+                    # os.waitpid(pid, WNOHANG) ensures we attempt a
+                    # non-blocking reap without blocking indefinitely.
+                    with self._waitpid_lock:
+                        if self.returncode is not None:
+                            return self.returncode  # Another thread waited.
+                        (pid, sts) = self._try_wait(os.WNOHANG)
+                        assert pid == self.pid or pid == 0
+                        if pid == self.pid:
+                            self._handle_exitstatus(sts)
+                            return self.returncode
+                        # os.waitpid(pid, WNOHANG) returned 0 instead
+                        # of our PID, meaning PID has not yet exited,
+                        # even though poll() / kqueue() said so. Very
+                        # rare and mostly theoretical. Fallback to busy
+                        # polling.
+                        elapsed = _time() - started
+                        endtime -= elapsed
+
                 # Enter a busy loop if we have a timeout. This busy loop was
                 # cribbed from Lib/threading.py in Thread.wait() at r71065.
                 delay = 0.0005 # 500 us -> initial delay of 1 ms
@@ -2085,6 +2225,7 @@ def _wait(self, timeout):
                         # http://bugs.python.org/issue14396.
                         if pid == self.pid:
                             self._handle_exitstatus(sts)
+
             return self.returncode
 
 
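
Note (illustrative, not part of the commit): the Linux path added above can be tried in isolation. The sketch below is a simplified, standalone version of the same idea: open a pidfd for a child, `poll()` it with a timeout, and report whether the child exited. The function name `wait_pidfd` and its three-state return value are assumptions made for this example; the commit's own `_wait_pidfd` raises `TimeoutExpired` instead.

```python
import os
import select
import subprocess
import sys


def wait_pidfd(pid, timeout):
    """Return True if `pid` exited within `timeout` seconds, False on
    timeout, or None when pidfd_open() is unavailable or refused.

    Simplified illustration only; it does not reap the child.
    """
    if not hasattr(os, "pidfd_open"):  # non-Linux or an older Python
        return None
    try:
        pidfd = os.pidfd_open(pid, 0)
    except OSError:
        # e.g. ESRCH, EMFILE, or the call is blocked by a sandbox.
        return None
    try:
        poller = select.poll()
        # A pidfd becomes readable once the process has terminated.
        poller.register(pidfd, select.POLLIN)
        # poll() takes its timeout in milliseconds.
        return bool(poller.poll(timeout * 1000))
    finally:
        os.close(pidfd)


if __name__ == "__main__":
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(0.5)"]
    )
    print("exited within 10 ms:", wait_pidfd(child.pid, 0.01))
    print("exited within 30 s:", wait_pidfd(child.pid, 30))
    child.wait()  # still needed to collect the exit status
```

As in the commit, readiness of the pidfd only signals termination; a `waitpid()` call (here via `child.wait()`) is still required to reap the process.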

Lib/test/test_binascii.py

Lines changed: 11 additions & 0 deletions
@@ -202,6 +202,17 @@ def assertNonBase64Data(data, expected, ignorechars):
         assertNonBase64Data(b'a\nb==', b'i', ignorechars=bytearray(b'\n'))
         assertNonBase64Data(b'a\nb==', b'i', ignorechars=memoryview(b'\n'))
 
+        # Same cell in the cache: '\r' >> 3 == '\n' >> 3.
+        data = self.type2test(b'\r\n')
+        with self.assertRaises(binascii.Error):
+            binascii.a2b_base64(data, ignorechars=b'\r')
+        self.assertEqual(binascii.a2b_base64(data, ignorechars=b'\r\n'), b'')
+        # Same bit mask in the cache: '*' & 31 == '\n' & 31.
+        data = self.type2test(b'*\n')
+        with self.assertRaises(binascii.Error):
+            binascii.a2b_base64(data, ignorechars=b'*')
+        self.assertEqual(binascii.a2b_base64(data, ignorechars=b'*\n'), b'')
+
         data = self.type2test(b'a\nb==')
         with self.assertRaises(TypeError):
             binascii.a2b_base64(data, ignorechars='')
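
Note (illustrative, not part of the commit): the two comments in the new test rely on specific byte values colliding under `>> 3` and `& 31`. The sketch below only verifies that arithmetic. The helper `cache_keys` is hypothetical and makes no claim about how the C cache for `ignorechars` is actually laid out.

```python
def cache_keys(byte):
    """Return the two quantities named in the test comments for one byte.

    Hypothetical helper: (byte >> 3, byte & 31). It shows which byte pairs
    collide; it is not a model of the real cache structure.
    """
    return byte >> 3, byte & 31


cr, lf, star = ord("\r"), ord("\n"), ord("*")

# '\r' (13) and '\n' (10) share the same value under >> 3.
same_cell = cache_keys(cr)[0] == cache_keys(lf)[0]

# '*' (42) and '\n' (10) share the same value under & 31.
same_mask = cache_keys(star)[1] == cache_keys(lf)[1]

print(same_cell, same_mask)
```

These pairs are therefore good probes for a cache that might confuse bytes sharing a cell or a mask, which seems to be what the added assertions guard against.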
