bpo-37986: Improve performance of PyLong_FromDouble() #15611

sir-sigurd · 2019-08-30T09:00:08Z

https://bugs.python.org/issue37986

Objects/longobject.c

gpshead · 2019-09-11T15:16:05Z

Objects/longobject.c

@@ -434,8 +446,7 @@ PyLong_FromDouble(double dval)
        dval = -dval;
    }
    frac = frexp(dval, &expo); /* dval = frac*2**expo; 0.0 <= frac < 1.0 */
-    if (expo <= 0)


This is already on the slow path; it seems safest to keep this check in place even though the int range checks above should already have handled it. Smart compilers would see that (no idea how many are smart enough to unroll frexp and understand).

I do not think it makes sense to keep this code.

Either seems fine to me. Personally, I'd probably keep the check out of defensiveness (someone could, for whatever reason, move the fast path out at some point in the future; it's nice if the slow path remains valid in that case), but I'm happy for this to be merged as is. Do we at least have unit tests that cover this case?

smart compilers would see that (no idea how many are smart enough to unroll frexp and understand).

At least gcc is not smart enough.

A compromise is to turn it into assert(expo > 0); as protection against future code changes breaking our assumption. Our buildbots run --with-pydebug builds where assertions are enabled.
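The assumption behind dropping the check (or turning it into an assertion) can be checked with a small Python sketch. Python's math.frexp mirrors C's frexp; the helper name exponent_of is made up for illustration, and the sketch assumes the fast path has already handled every value whose magnitude is below 1.

```python
import math

def exponent_of(dval: float) -> int:
    """Return expo such that abs(dval) == frac * 2**expo, as frexp does."""
    _frac, expo = math.frexp(abs(dval))
    return expo

# Any finite double with magnitude >= 1 has a strictly positive exponent,
# so a slow path that only ever sees such values can assert expo > 0.
big_values = [1.0, 1.5, 2.0**31, 2.0**63, 1e300]
assert all(exponent_of(v) > 0 for v in big_values)

# Values with magnitude below 1 are the only ones with expo <= 0. They
# truncate to 0, which fits in a C long, so the fast path catches them.
small_values = [0.0, 0.25, 0.5, 0.999999]
assert all(exponent_of(v) <= 0 for v in small_values)
```

If the fast path were ever moved out, the second group of values would reach the slow path again, which is the situation the assertion is meant to flag in debug builds.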

gpshead · 2019-09-11T15:47:08Z

Objects/longobject.c

+     * and someone using a non-default option on Sun also bumped into
+     * that).
+     */
+    const double int_max = (unsigned long)LONG_MAX + 1;


int_max is an imprecise value on platforms where sizeof(long) >= sizeof(double). Most 64-bit systems have longs larger than a double's 53-bit mantissa (and likely all platforms when considering long long per the above comment).

Will it be truncated in the right direction (towards zero) to avoid this triggering on values with undefined conversion behavior?

The previous code used LONG_MIN < v and v < LONG_MAX directly rather than LONG_MAX + 1 stored into a double. (I believe C will convert those values to double before the comparison, since the usual arithmetic conversions convert an integer operand to the floating type of the other operand.)
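The precision concern can be reproduced in Python, whose float is an IEEE 754 double on common platforms. This is only a sketch: it assumes a 64-bit long and round-to-nearest conversion, and the constant names mirror the C macros but are defined locally.

```python
LONG_BITS = 64  # assumption: 64-bit C long, as on most 64-bit Unix platforms
LONG_MAX = 2**(LONG_BITS - 1) - 1

# LONG_MAX needs 63 significant bits, more than a double's 53, so the
# conversion rounds. Under round-to-nearest it rounds up to 2**63.
long_max_as_double = float(LONG_MAX)
assert long_max_as_double == 2.0**63
assert int(long_max_as_double) != LONG_MAX

# LONG_MAX + 1 is a power of two, so it converts to double exactly.
assert int(float(LONG_MAX + 1)) == LONG_MAX + 1
```

Under these assumptions, a comparison written as v < LONG_MAX and evaluated in double arithmetic behaves like v < 2**63, which is the same bound the new code spells out explicitly.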

The original comment explains why you should use < LONG_MAX. I would keep the original comment and the code, and just move it into PyLong_FromDouble().

I think I should have added a comment about this: I assumed that LONG_MAX == 2 ** (CHAR_BIT * sizeof(long) - 1) - 1 and LONG_MIN == -2 ** (CHAR_BIT * sizeof(long) - 1), i.e. (unsigned long)LONG_MAX + 1 is a power of two and can be exactly represented by double (assuming that FLT_RADIX == 2). Does that make sense?

(Originally I wrote it like this: const double int_max = pow(2, CHAR_BIT * sizeof(long) - 1), see #15611 (comment))

Here I'm trying to demonstrate correctness of this approach:

In [66]: SIZEOF_LONG = 8; CHAR_BITS = 8

In [67]: LONG_MAX = (1 << (SIZEOF_LONG * CHAR_BITS - 1)) - 1; LONG_MIN = -LONG_MAX - 1

In [68]: int_max = float(LONG_MAX + 1)

In [69]: int_max == LONG_MAX + 1
Out[69]: True

In [70]: def cast_to_long(dval):
    ...:     assert isinstance(dval, float)
    ...:     wholepart = math.trunc(dval)
    ...:     if LONG_MIN <= wholepart <= LONG_MAX:
    ...:         return wholepart
    ...:     raise RuntimeError('undefined behavior')

In [71]: def long_from_double(dval):
    ...:     assert isinstance(dval, float)
    ...:     if -int_max <= dval < int_max:
    ...:         return cast_to_long(dval)
    ...:     raise ValueError('float is out of range, use frexp()')

In [72]: long_from_double(int_max)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-72-280887471997> in <module>()
----> 1 long_from_double(int_max)

<ipython-input-71-ccaef6014bf1> in long_from_double(dval)
      3     if -int_max <= dval < int_max:
      4         return cast_to_long(dval)
----> 5     raise ValueError('float is out of range, use frexp()')

ValueError: float is out of range, use frexp()

In [73]: int_max.hex()
Out[73]: '0x1.0000000000000p+63'

In [74]: long_from_double(float.fromhex('0x1.fffffffffffffp+62'))
Out[74]: 9223372036854774784

In [75]: long_from_double(float.fromhex('-0x1.fffffffffffffp+62'))
Out[75]: -9223372036854774784

In [76]: long_from_double(-int_max)
Out[76]: -9223372036854775808

In [77]: long_from_double(float.fromhex('-0x1.0000000000001p+63'))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-77-de5e9e1eba23> in <module>()
----> 1 long_from_double(float.fromhex('-0x1.0000000000001p+63'))

<ipython-input-71-ccaef6014bf1> in long_from_double(dval)
      3     if -int_max <= dval < int_max:
      4         return cast_to_long(dval)
----> 5     raise ValueError('float is out of range, use frexp()')

ValueError: float is out of range, use frexp()

I think this is fine, under reasonable assumptions on the platform. LONG_MAX + 1 must be a power of 2 (follows from C99 §6.2.6.2p2), and while it's theoretically possible that double will be unable to represent LONG_MAX + 1 exactly, that seems highly unlikely in practice. So the conversion to double must be exact (C99 §6.3.1.4p2).

It's not safe based purely on the C standard to assume that LONG_MIN = -LONG_MAX - 1: the integer representation could be ones' complement or sign-magnitude, in which case LONG_MIN = -LONG_MAX. But that assumption is safe in practice for any platform that Python's likely to meet, and we make the assumption of two's complement for signed integers elsewhere in the codebase. If we're worried enough about this, we could change the -int_max <= dval comparison to -int_max < dval. On balance, I'd suggest making that change (partly just for the aesthetics of the symmetry).
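To see what the suggested switch from -int_max <= dval to -int_max < dval changes, here is a small Python sketch. It assumes a 64-bit two's complement long, so int_max is exactly 2**63; the two function names are invented for illustration and do not appear in the patch.

```python
# Assumption: 64-bit two's complement long, so int_max is exactly 2**63.
INT_MAX = 2.0**63

def fast_path_nonstrict(dval: float) -> bool:
    """Bound as originally proposed: -int_max <= dval < int_max."""
    return -INT_MAX <= dval < INT_MAX

def fast_path_strict(dval: float) -> bool:
    """Symmetric bound suggested in review: -int_max < dval < int_max."""
    return -INT_MAX < dval < INT_MAX

lowest = -INT_MAX
# The next double toward zero after -2**63.
next_toward_zero = float.fromhex('-0x1.fffffffffffffp+62')

# Only the single value -2**63 moves from the fast path to the slow path.
assert fast_path_nonstrict(lowest) and not fast_path_strict(lowest)
assert fast_path_nonstrict(next_toward_zero) and fast_path_strict(next_toward_zero)
# The upper bound is exclusive in both variants.
assert not fast_path_nonstrict(INT_MAX) and not fast_path_strict(INT_MAX)
```

With the strict bound, -2**63 simply takes the frexp-based slow path, so the result is unchanged; the gain is that the check no longer relies on LONG_MIN being -LONG_MAX - 1.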

Believe it or not, it's also not safe based purely on the C standard to assume that (unsigned long)LONG_MAX + 1 is representable as an unsigned long: C99 §6.2.5p9 only guarantees that nonnegative long values are representable as unsigned long. But the chance of that not being true in practice is negligible (at least until someone tries to port CPython to the DS9000). And the failure mode is benign: we'd just end up never taking the fast path.

Re-reading all this, I had one more worry (which is why I dismissed my own review): what happens if the exact value of dval lies strictly between LONG_MAX and LONG_MAX + 1? In that case we could end up converting a double that, strictly speaking, is outside the range of long. But it turns out that we're safe, because C99 is quite explicit here: §6.3.1.4p1 says (emphasis mine):

If the value of the integral part cannot be represented by the integer type, the behavior is undefined.

So any double value that's strictly smaller than LONG_MAX + 1 should be fine.
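With a 64-bit long no double lies strictly between LONG_MAX and LONG_MAX + 1, but with a 32-bit long such values exist. The following Python sketch assumes a 32-bit long and uses math.trunc as a stand-in for the C conversion, which discards the fractional part.

```python
import math

LONG_BITS = 32  # assumption: 32-bit long, as on 64-bit MSVC
LONG_MAX = 2**(LONG_BITS - 1) - 1
int_max = float(LONG_MAX + 1)  # 2**31, exactly representable

# 2147483647.5 is exactly representable as a double and lies strictly
# between LONG_MAX and LONG_MAX + 1.
dval = LONG_MAX + 0.5
assert LONG_MAX < dval < int_max

# Only the integral part has to fit in a long, and it does.
assert math.trunc(dval) == LONG_MAX
```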

it's also not safe based purely on the C standard to assume that (unsigned long)LONG_MAX + 1 is representable as an unsigned long

Then I think we could use ((double)(LONG_MAX / 2 + 1)) * 2, but is it worth it?
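As a quick sanity check of that alternative expression, the Python sketch below evaluates it for a few hypothetical long widths, assuming LONG_MAX has the usual form 2**(bits - 1) - 1. Integer division // stands in for C's integer division of a positive long.

```python
for bits in (16, 32, 64):
    LONG_MAX = 2**(bits - 1) - 1

    # LONG_MAX / 2 + 1 is 2**(bits - 2): it fits in a long and, being a
    # power of two, converts to double exactly. Doubling it is exact too.
    int_max_alt = float(LONG_MAX // 2 + 1) * 2

    assert LONG_MAX // 2 + 1 <= LONG_MAX
    assert int_max_alt == float(2**(bits - 1))
```

The alternative never leaves the range of long, so it avoids relying on (unsigned long)LONG_MAX + 1 being representable.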

It's not safe based purely on the C standard to assume that LONG_MIN = -LONG_MAX - 1: the integer representation could be ones' complement or sign-magnitude, in which case LONG_MIN = -LONG_MAX. But that assumption is safe in practice for any platform that Python's likely to meet, and we make the assumption of two's complement for signed integers elsewhere in the codebase.

Shouldn't we formally state that we support only two's complement representation?
BTW, it was proposed to abandon the other representations, and it looks like the committee agrees with that.

Then I think we could use ((double)(LONG_MAX / 2 + 1)) * 2, but is it worth it?

Definitely not worth it! The C standard permits LONG_MAX == ULONG_MAX, but I'd be astonished if you ever find a real implementation (now or in the future) that has this property.

Shouldn't we formally state that we support only two's complement representation?

Yes, we should, though I'm not sure where would be the best place. But I think it's a non-issue in practice.

serhiy-storchaka · 2019-09-16T16:34:14Z

I'll leave this to @mdickinson.

serhiy-storchaka · 2019-10-21T06:46:57Z

@mdickinson, could you please take a look?

ghost · 2019-10-21T08:24:40Z

The C type long is a 4-byte integer on 64-bit MSVC. [1]
Please consider using Py_ssize_t.

[1] https://stackoverflow.com/questions/384502
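The width difference can be inspected from Python with ctypes, which reports the sizes used by the platform's C ABI. This is only an illustrative check; what it prints depends on the machine it runs on.

```python
import ctypes

# Width in bits of C long and of Py_ssize_t's ctypes counterpart.
long_bits = 8 * ctypes.sizeof(ctypes.c_long)
ssize_bits = 8 * ctypes.sizeof(ctypes.c_ssize_t)

print(f"long: {long_bits} bits, ssize_t: {ssize_bits} bits")

# On 64-bit Windows this is expected to report 32 and 64; on typical
# 64-bit Linux or macOS builds both are expected to be 64.
if ssize_bits > long_bits:
    print("a long-based fast path covers fewer values than Py_ssize_t would")
```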

mdickinson · 2019-10-21T10:35:29Z

@mdickinson, could you please take a look?

Will do, but not before this evening (UTC+01:00)

Objects/longobject.c

mdickinson

Changes LGTM; one minor suggested change, but it's not one I feel that strongly about.

mdickinson · 2019-10-26T11:16:53Z

The C type long is a 4-byte integer on 64-bit MSVC. [...] Please consider using Py_ssize_t.

I doubt that this is worth it: the real value of the fast path is for small values, especially values that will fit into a single PyLong digit. I'd expect that (a) converting values in the range [2**31, 2**63) is much rarer than converting values smaller than 2**31, and (b) the speedup wouldn't be all that significant.

Sorry; I didn't look hard enough. I think there's still potentially an issue here. Will comment further.

sir-sigurd · 2019-10-26T12:48:51Z

the speedup wouldn't be all that significant.

From the issue on bpo:

+---------------------+---------------------+------------------------------+
| math.ceil(2.**30)   | 64.2 ns             | 43.9 ns: 1.46x faster (-32%) |
+---------------------+---------------------+------------------------------+
| math.ceil(2.**60)   | 66.3 ns             | 42.3 ns: 1.57x faster (-36%) |
+---------------------+---------------------+------------------------------+
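For anyone wanting to re-run a rough version of these measurements, the sketch below uses the standard-library timeit module. It is not the harness used for the numbers above, and absolute timings will vary with the machine and the build.

```python
import math
import timeit

loops = 100_000
for expr in ("math.ceil(2.**30)", "math.ceil(2.**60)"):
    # Time the expression and report the mean cost per call.
    seconds = timeit.timeit(expr, globals={"math": math}, number=loops)
    print(f"{expr}: {seconds / loops * 1e9:.1f} ns per call")
```

Running it on builds with and without the patch gives a comparison of the same kind as the table, though with more noise than a dedicated benchmarking tool.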

mdickinson · 2019-10-26T12:57:18Z

From the issue on bpo:

Hmm, that's not nothing. :-) Are those timings with 30-bit digits or 15-bit digits? (Last time I checked, even 64-bit Windows was using 15-bit digits, but that was some time ago.)

I'd still maintain that converting values in that range would be infrequent enough that the effect on real-world code would be minimal; so I'm still -1 on switching long to Py_ssize_t here.

mdickinson · 2019-10-26T13:11:11Z

Last time I checked, even 64-bit Windows was using 15-bit digits, but that was some time ago.

Self-correction: it's using 30-bit digits. The choice is based on SIZEOF_VOID_P.
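The digit size of a given build can be read from sys.int_info. The sketch below also counts how many internal digits a value needs, which relates to the earlier point about values that fit into a single PyLong digit; digits_needed is a made-up helper, not a CPython API.

```python
import sys

bits = sys.int_info.bits_per_digit  # 30 or 15, depending on the build

def digits_needed(value: int) -> int:
    """Number of internal digits needed for abs(value), at least 1."""
    n = abs(value).bit_length()
    return max(1, -(-n // bits))  # ceiling division

print(f"{bits}-bit digits")
for exponent in (30, 60):
    print(f"2**{exponent} needs {digits_needed(2**exponent)} digit(s)")
```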

mdickinson · 2019-10-26T13:15:42Z

@sir-sigurd Please could you add a news entry?

ghost · 2019-10-27T11:37:08Z

I'd still maintain that converting values in that range would be infrequent enough that the effect on real-world code would be minimal; so I'm still -1 on switching long to Py_ssize_t here.

Yes, values in that range are relatively uncommon.
But this performance improvement on Windows is free: just use another C type.

sir-sigurd · 2020-05-10T06:59:05Z

Is there anything else that should be done to get this merged?

mdickinson · 2020-05-10T08:39:41Z

Is there anything else that should be done to get this merged?

I'll re-review today.

serhiy-storchaka · 2020-05-10T08:48:26Z

Misc/NEWS.d/next/Core and Builtins/2019-11-20-09-50-58.bpo-37986.o0lmA7.rst

@@ -0,0 +1,4 @@
+Improve performance of :c:func:`PyLong_FromDouble` for values that fit into
+:c:type:`long`. Now :meth:`float.__trunc__` is faster up to 10%,
+:func:`math.floor()` and :func:`math.ceil()` are faster up to 30% when used


Are these numbers still correct? There are other changes related to trunc/floor/ceil in 3.9; they could reduce or increase the relative effect of this optimization.

I think we could just drop the second sentence and keep the first here.

@gvanrossum

* Update docs. * bpo-40513: Per-interpreter signals pending (GH-19924) Move signals_pending from _PyRuntime.ceval to PyInterpreterState.ceval. * bpo-40513: Per-interpreter gil_drop_request (GH-19927) Move gil_drop_request member from _PyRuntimeState.ceval to PyInterpreterState.ceval. * bpo-40514: Add --with-experimental-isolated-subinterpreters (GH-19926) Add --with-experimental-isolated-subinterpreters build option to configure: better isolate subinterpreters, experimental build mode. When used, force the usage of the libc malloc() memory allocator, since pymalloc relies on the unique global interpreter lock (GIL). * bpo-32117: Updated Simpsons names in docs (GH-19737) `sally` is not a Simpsons character Automerge-Triggered-By: @gvanrossum * bpo-40513: Per-interpreter recursion_limit (GH-19929) Move recursion_limit member from _PyRuntimeState.ceval to PyInterpreterState.ceval. * Py_SetRecursionLimit() now only sets _Py_CheckRecursionLimit of ceval.c if the current Python thread is part of the main interpreter. * Inline _Py_MakeEndRecCheck() into _Py_LeaveRecursiveCall(). * Convert _Py_RecursionLimitLowerWaterMark() macro into a static inline function. * bpo-29587: _PyErr_ChainExceptions() checks exception (GH-19902) _PyErr_ChainExceptions() now ensures that the first parameter is an exception type, as done by _PyErr_SetObject(). * The following function now check PyExceptionInstance_Check() in an assertion using a new _PyBaseExceptionObject_cast() helper function: * PyException_GetTraceback(), PyException_SetTraceback() * PyException_GetCause(), PyException_SetCause() * PyException_GetContext(), PyException_SetContext() * PyExceptionClass_Name() now checks PyExceptionClass_Check() with an assertion. * Remove XXX comment and add gi_exc_state variable to _gen_throw(). 
* Remove comment from test_generators * bpo-40520: Remove redundant comment in pydebug.h (GH-19931) Automerge-Triggered-By: @corona10 * Revert "bpo-40513: Per-interpreter signals pending (GH-19924)" (GH-19932) This reverts commit 4e01946. * bpo-40521: Disable Unicode caches in isolated subinterpreters (GH-19933) When Python is built in the experimental isolated subinterpreters mode, disable Unicode singletons and Unicode interned strings since they are shared by all interpreters. Temporary workaround until these caches are made per-interpreter. * bpo-40458: Increase reserved stack space to prevent overflow crash on Windows (GH-19845) * bpo-40521: Disable free lists in subinterpreters (GH-19937) When Python is built with experimental isolated interpreters, disable tuple, dict and free free lists. Temporary workaround until these caches are made per-interpreter. Add frame_alloc() and frame_get_builtins() subfunctions to simplify _PyFrame_New_NoTrack(). * bpo-40522: _PyThreadState_Swap() sets autoTSSkey (GH-19939) In the experimental isolated subinterpreters build mode, _PyThreadState_GET() gets the autoTSSkey variable and _PyThreadState_Swap() sets the autoTSSkey variable. * Add _PyThreadState_GetTSS() * _PyRuntimeState_GetThreadState() and _PyThreadState_GET() return _PyThreadState_GetTSS() * PyEval_SaveThread() sets the autoTSSkey variable to current Python thread state rather than NULL. * eval_frame_handle_pending() doesn't check that _PyThreadState_Swap() result is NULL. * _PyThreadState_Swap() gets the current Python thread state with _PyThreadState_GetTSS() rather than _PyRuntimeGILState_GetThreadState(). * PyGILState_Ensure() no longer checks _PyEval_ThreadsInitialized() since it cannot access the current interpreter. * bpo-40513: new_interpreter() init GIL earlier (GH-19942) Fix also code to handle init_interp_main() failure. * bpo-40513: Per-interpreter GIL (GH-19943) In the experimental isolated subinterpreters build mode, the GIL is now per-interpreter. 
Move gil from _PyRuntimeState.ceval to PyInterpreterState.ceval. new_interpreter() always get the config from the main interpreter. * bpo-40513: _xxsubinterpreters.run_string() releases the GIL (GH-19944) In the experimental isolated subinterpreters build mode, _xxsubinterpreters.run_string() now releases the GIL. * bpo-40355: Improve error messages in ast.literal_eval with malformed Dict nodes (GH-19868) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com> * bpo-40504: Allow weakrefs to lru_cache objects (GH-19938) * bpo-40523: Add pass-throughs for hash() and reversed() to weakref.proxy objects (GH-19946) * bpo-40480 "fnmatch" exponential execution time (GH-19908) bpo-40480: create different regexps in the presence of multiple `*` patterns to prevent fnmatch() from taking exponential time. * bpo-40517: Implement syntax highlighting support for ASDL (#19928) * Revert "bpo-40517: Implement syntax highlighting support for ASDL (#19928)" (#19950) This reverts commit d60040b. * bpo-40527: Fix command line argument parsing (GH-19955) * bpo-40528: Improve and clear several aspects of the ASDL definition code for the AST (GH-19952) * bpo-40521: Disable method cache in subinterpreters (GH-19960) When Python is built with experimental isolated interpreters, disable the type method cache. Temporary workaround until the cache is made per-interpreter. * bpo-40533: Disable GC in subinterpreters (GH-19961) When Python is built with experimental isolated interpreters, a garbage collection now does nothing in an isolated interpreter. Temporary workaround until subinterpreters stop sharing Python objects. * bpo-40521: Disable list free list in subinterpreters (GH-19959) When Python is built with experimental isolated interpreters, disable the list free list. Temporary workaround until this cache is made per-interpreter. * bpo-40334: Add type to the assignment rule in the grammar file (GH-19963) * Fix typo in sqlite3 documentation (GH-19965) *first* is repeated twice. 
* bpo-40334: Allow trailing comma in parenthesised context managers (GH-19964) * bpo-40334: Generate comments in the parser code to improve debugging (GH-19966) * bpo-40397: Refactor typing._GenericAlias (GH-19719) Make the design more object-oriented. Split _GenericAlias on two almost independent classes: for special generic aliases like List and for parametrized generic aliases like List[int]. Add specialized subclasses for Callable, Callable[...], Tuple and Union[...]. * bpo-1635741: Port errno module to multiphase initialization (GH-19923) * bpo-40334: Fix error location upon parsing an invalid string literal (GH-19962) When parsing a string with an invalid escape, the old parser used to point to the beginning of the invalid string. This commit changes the new parser to match that behaviour, since it's currently pointing to the end of the string (or to be more precise, to the beginning of the next token). * bpo-40334: Error message for invalid default args in function call (GH-19973) When parsing something like `f(g()=2)`, where the name of a default arg is not a NAME, but an arbitrary expression, a specialised error message is emitted. * bpo-38787: C API for module state access from extension methods (PEP 573) (GH-19936) Module C state is now accessible from C-defined heap type methods (PEP 573). Patch by Marcel Plch and Petr Viktorin. Co-authored-by: Marcel Plch <mplch@redhat.com> Co-authored-by: Victor Stinner <vstinner@python.org> * bpo-40545: Export _PyErr_GetTopmostException() function (GH-19978) Declare _PyErr_GetTopmostException() with PyAPI_FUNC() to properly export the function in the C API. The function remains private ("_Py") prefix. Co-Authored-By: Julien Danjou <julien@danjou.info> * bpo-32604: [_xxsubinterpreters] Propagate exceptions. (GH-19768) (Note: PEP 554 is not accepted and the implementation in the code base is a private one for use in the test suite.) 
If code running in a subinterpreter raises an uncaught exception then the "run" call in the calling interpreter fails. A RunFailedError is raised there that summarizes the original exception as a string. The actual exception type, __cause__, __context__, state, etc. are all discarded. This turned out to be functionally insufficient in practice. There is a more helpful solution (and PEP 554 has been updated appropriately). This change adds the exception propagation behavior described in PEP 554 to the _xxsubinterpreters module. With this change a copy of the original exception is set to __cause__ on the RunFailedError. For now we are using "pickle", which preserves the exception's state. We also preserve the original __cause__, __context__, and __traceback__ (since "pickle" does not preserve those). https://bugs.python.org/issue32604 * bpo-38787: Update structures.rst docs (PEP 573) (GH-19980) * bpo-40548: Always run GitHub action, even on doc PRs (GH-19981) Always run GitHub action jobs, even on documentation-only pull requests. So it will be possible to make a GitHub action job, like the Windows (64-bit) job, mandatory. * bpo-40517: Implement syntax highlighting support for ASDL (GH-19967) * bpo-40555: Check for p->error_indicator in loop rules after the main loop is done (GH-19986) * bpo-40273: Reversible mappingproxy (FH-19513) * bpo-40559: Add Py_DECREF to _asynciomodule.c:task_step_impl() (GH-19990) This fixes a possible memory leak in the C implementation of asyncio.Task. * Make the first dataclass example more useful (GH-19994) * bpo-40541: Add optional *counts* parameter to random.sample() (GH-19970) * bpo-40502: Initialize n->n_col_offset (GH-19988) * initialize n->n_col_offset * 📜🤖 Added by blurb_it. 
* Move initialization Co-authored-by: nanjekyejoannah <joannah.nanjekye@ibm.com> Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> * bpo-39791: Add files() to importlib.resources (GH-19722) * bpo-39791: Update importlib.resources to support files() API (importlib_resources 1.5). * 📜🤖 Added by blurb_it. * Add some documentation about the new objects added. Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> * bpo-40566: Apply PEP 573 to abc module (GH-20005) * bpo-40570: Improve compatibility of uname_result with late-bound .platform (#20015) * bpo-40570: Improve compatibility of uname_result with late-bound .platform. * Add test capturing ability to cast uname to a tuple. * bpo-40334: Avoid collisions between parser variables and grammar variables (GH-19987) This is for the C generator: - Disallow rule and variable names starting with `_` - Rename most local variable names generated by the parser to start with `_` Exceptions: - Renaming `p` to `_p` will be a separate PR - There are still some names that might clash, e.g. - anything starting with `Py` - C reserved words (`if` etc.) - Macros like `EXTRA` and `CHECK` * Add link to Enum class (GH-19884) * bpo-40397: Remove __args__ and __parameters__ from _SpecialGenericAlias (GH-19984) * bpo-40549: Convert posixmodule.c to multiphase init (GH-19982) Convert posixmodule.c ("posix" or "nt" module) to the multiphase initialization (PEP 489). * Create the module using PyModuleDef_Init(). * Create ScandirIteratorType and DirEntryType with the new PyType_FromModuleAndSpec() (PEP 573) * Get the module state from ScandirIteratorType and DirEntryType with the new PyType_GetModule() (PEP 573) * Pass module to functions which access the module state. * convert_sched_param() gets a new module parameter. It is now called directly since Argument Clinic doesn't support passing the module to an argument converter callback. * Remove _posixstate_global macro. 
* bpo-37986: Improve perfomance of PyLong_FromDouble() (GH-15611) * bpo-37986: Improve perfomance of PyLong_FromDouble() * Use strict bound check for safety and symmetry * Remove possibly outdated performance claims Co-authored-by: Mark Dickinson <dickinsm@gmail.com> * bpo-40397: Fix subscription of nested generic alias without parameters. (GH-20021) * bpo-40257: Tweak docstrings for special generic aliases. (GH-20022) * Add the terminating period. * Omit module name for builtin types. * Improve code clarity for the set lookup logic (GH-20028) * bpo-40585: Normalize errors messages in codeop when comparing them (GH-20030) With the new parser, the error message contains always the trailing newlines, causing the comparison of the repr of the error messages in codeop to fail. This commit makes the new parser mirror the old parser's behaviour regarding trailing newlines. * bpo-40575: Avoid unnecessary overhead in _PyDict_GetItemIdWithError() (GH-20018) Avoid unnecessary overhead in _PyDict_GetItemIdWithError() by calling _PyDict_GetItem_KnownHash() instead of the more generic PyDict_GetItemWithError(), since we already know the hash of interned strings. * bpo-36346: array: Don't use deprecated APIs (GH-19653) * Py_UNICODE -> wchar_t * Py_UNICODE -> unicode in Argument Clinic * PyUnicode_AsUnicode -> PyUnicode_AsWideCharString * Don't use "u#" format. Co-authored-by: Victor Stinner <vstinner@python.org> * bpo-40561: Add docstrings for webbrowser open functions (GH-19999) Co-authored-by: Brad Solomon <brsolomon@deloitte.com> Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu> * bpo-40584: Update PyType_FromModuleAndSpec() to process tp_vectorcall_offset (GH-20026) * bpo-40334: produce specialized errors for invalid del targets (GH-19911) * bpo-39465: Don't access directly _Py_Identifier members (GH-20043) * Replace id->object with _PyUnicode_FromId(&id) * Use _Py_static_string_init(str) macro to initialize statically name_op in typeobject.c. 
* bpo-40571: Make lru_cache(maxsize=None) more discoverable (GH-20019) * bpo-40602: Rename hashtable.h to pycore_hashtable.h (GH-20044) * Move Modules/hashtable.h to Include/internal/pycore_hashtable.h * Move Modules/hashtable.c to Python/hashtable.c * Python is now linked to hashtable.c. _tracemalloc is no longer linked to hashtable.c. Previously, marshal.c got hashtable.c via _tracemalloc.c which is built as a builtin module. * bpo-40602: _Py_hashtable_new() uses PyMem_Malloc() (GH-20046) _Py_hashtable_new() now uses PyMem_Malloc/PyMem_Free allocator by default, rather than PyMem_RawMalloc/PyMem_RawFree. PyMem_Malloc is faster than PyMem_RawMalloc for memory blocks smaller than or equal to 512 bytes. * bpo-40480: restore ability to join fnmatch.translate() results (GH-20049) In translate(), generate unique group names across calls. The restores the undocumented ability to get a valid regexp by joining multiple translate() results via `|`. * bpo-39481: remove generic classes from ipaddress/mmap (GH-20045) These were added by mistake (see https://bugs.python.org/issue39481#msg366288). * bpo-40593: Improve syntax errors for invalid characters in source code. (GH-20033) * bpo-40602: Optimize _Py_hashtable for pointer keys (GH-20051) Optimize _Py_hashtable_get() and _Py_hashtable_get_entry() for pointer keys: * key_size == sizeof(void*) * hash_func == _Py_hashtable_hash_ptr * compare_func == _Py_hashtable_compare_direct Changes: * Add get_func and get_entry_func members to _Py_hashtable_t * Convert _Py_hashtable_get() and _Py_hashtable_get_entry() functions to static nline functions. * Add specialized get and get entry for pointer keys. * bpo-40596: Fix str.isidentifier() for non-canonicalized strings containing non-BMP characters on Windows. (GH-20053) * bpo-38787: Add PyCFunction_CheckExact() macro for exact type checks (GH-20024) … now that we allow subtypes of PyCFunction. 
Also add PyCMethod_CheckExact() and PyCMethod_Check() for checks against the PyCMethod subtype. * bpo-40602: Add _Py_HashPointerRaw() function (GH-20056) Add a new _Py_HashPointerRaw() function which avoids replacing -1 with -2 to micro-optimize hash table using pointer keys: using _Py_hashtable_hash_ptr() hash function. * bpo-40501: Replace ctypes code in uuid with native module (GH-19948) * Fix Wikipedia link (GH-20031) * bpo-40609: Rewrite how _tracemalloc handles domains (GH-20059) Rewrite how the _tracemalloc module stores traces of other domains. Rather than storing the domain inside the key, it now uses a new hash table with the domain as the key, and the data is a per-domain traces hash table. * Add tracemalloc_domain hash table. * Remove _Py_tracemalloc_config.use_domain. * Remove pointer_t and related functions. * bpo-40609: Remove _Py_hashtable_t.key_size (GH-20060) Rewrite _Py_hashtable_t type to always store the key as a "const void *" pointer. Add an explicit "key" member to _Py_hashtable_entry_t. Remove _Py_hashtable_t.key_size member. hash and compare functions drop their hash table parameter, and their 'key' parameter type becomes "const void *". * bpo-40609: Add destroy functions to _Py_hashtable (GH-20062) Add key_destroy_func and value_destroy_func parameters to _Py_hashtable_new_full(). marshal.c and _tracemalloc.c use these destroy functions. * bpo-40609: _tracemalloc allocates traces (GH-20064) Rewrite _tracemalloc to store "trace_t*" rather than directly "trace_t" in traces hash tables. Traces are now allocated on the heap memory, outside the hash table. Add tracemalloc_copy_traces() and tracemalloc_copy_domains() helper functions. Remove _Py_hashtable_copy() function since there is no API to copy a key or a value. Remove also _Py_hashtable_delete() function which was commented. * bpo-40609: _Py_hashtable_t values become void* (GH-20065) _Py_hashtable_t values become regular "void *" pointers. 
* Add _Py_hashtable_entry_t.data member * Remove _Py_hashtable_t.data_size member * Remove _Py_hashtable_t.get_func member. It is no longer needed to specialize _Py_hashtable_get() for a specific value size, since all entries now have the same size (void*). * Remove the following macros: * _Py_HASHTABLE_GET() * _Py_HASHTABLE_SET() * _Py_HASHTABLE_SET_NODATA() * _Py_HASHTABLE_POP() * Rename _Py_hashtable_pop() to _Py_hashtable_steal() * _Py_hashtable_foreach() callback now gets key and value rather than entry. * Remove _Py_hashtable_value_destroy_func type. value_destroy_func callback now only has a single parameter: data (void*). * bpo-40602: Optimize _Py_hashtable_get_ptr() (GH-20066) _Py_hashtable_get_entry_ptr() avoids comparing the entry hash: compare directly keys. Move _Py_hashtable_get_entry_ptr() just after _Py_hashtable_get_entry_generic(). * bpo-40331: Increase test coverage for the statistics module (GH-19608) * bpo-40613: Remove compiler warning from _xxsubinterpretersmodule (GH-20069) * bpo-34790: add version of removal of explicit passing of coros to `asyncio.wait`'s documentation (#20008) * bpo-40334: Always show the caret on SyntaxErrors (GH-20050) This commit fixes SyntaxError locations when the caret is not displayed, by doing the following: - `col_number` always gets set to the location of the offending node/expr. When no caret is to be displayed, this gets achieved by setting the object holding the error line to None. - Introduce a new function `_PyPegen_raise_error_known_location`, which can be called, when an arbitrary `lineno`/`col_offset` needs to be passed. This function then gets used in the grammar (through some new macros and inline functions) so that SyntaxError locations of the new parser match that of the old. 
* bpo-38787: Fix Argument Clinic defining_class_converter (GH-20074) Don't hardcode defining_class parameter name to "cls": * Define CConverter.set_template_dict(): do nothing by default * CLanguage.render_function() now calls set_template_dict() on all converters. * issue-25872: Fix KeyError using linecache from multiple threads (GH-18007) The crash that this fixes occurs when using traceback and other modules from multiple threads; del cache[filename] can raise a KeyError. * bpo-39465: Remove _PyUnicode_ClearStaticStrings() from C API (GH-20078) Remove the _PyUnicode_ClearStaticStrings() function from the C API. Make the function fully private (declare it with "static"). * bpo-29587: Make gen.throw() chain exceptions with yield from (GH-19858) The previous commits on bpo-29587 got exception chaining working with gen.throw() in the `yield` case. This patch also gets the `yield from` case working. As a consequence, implicit exception chaining now also works in the asyncio scenario of awaiting on a task when an exception is already active. Tests are included for both the asyncio case and the pure generator-only case. * bpo-40521: Add PyInterpreterState.unicode (GH-20081) Move PyInterpreterState.fs_codec into a new PyInterpreterState.unicode structure. Give a name to the fs_codec structure and use this structure in unicodeobject.c. * bpo-40597: email: Use CTE if lines are longer than max_line_length consistently (gh-20038) raw_data_manager (default for EmailPolicy, EmailMessage) does correct wrapping of 'text' parts as long as the message contains characters outside of 7bit US-ASCII set: base64 or qp Content-Transfer-Encoding is applied if the lines would be too long without it. It did not, however, do this for ascii-only text, which could result in lines that were longer than policy.max_line_length or even the rfc 998 maximum. 
This changeset fixes the heuristic so that if lines are longer than policy.max_line_length, it will always apply a content-transfer-encoding so that the lines are wrapped correctly. * bpo-40275: Import locale module lazily in gettext (GH-19905) * bpo-40495: compileall option to hardlink duplicate pyc files (GH-19901) compileall is now able to use hardlinks to prevent duplicates in a case when .pyc files for different optimization levels have the same content. Co-authored-by: Miro Hrončok <miro@hroncok.cz> Co-authored-by: Victor Stinner <vstinner@python.org> * bpo-40549: posixmodule.c uses defining_class (GH-20075) Pass PEP 573 defining_class to os.DirEntry methods. The module state is now retrieved from defining_class rather than Py_TYPE(self), to support subclasses (even if DirEntry doesn't support subclasses yet). * Pass the module rather than defining_class to DirEntry_fetch_stat(). * Only get the module state once in _posix_clear(), _posix_traverse() and _posixmodule_exec(). * Revert "bpo-32604: [_xxsubinterpreters] Propagate exceptions. (GH-19768)" (GH-20089) * Revert "bpo-40613: Remove compiler warning from _xxsubinterpretersmodule (GH-20069)" This reverts commit fa0a66e. * Revert "bpo-32604: [_xxsubinterpreters] Propagate exceptions. (GH-19768)" This reverts commit a1d9e0a. * bpo-40602: Write unit tests for _Py_hashtable_t (GH-20091) Cleanup also hashtable.c. Rename _Py_hashtable_t members: * Rename entries to nentries * Rename num_buckets to nbuckets * bpo-40619: Correctly handle error lines in programs without file mode (GH-20090) * bpo-40618: Disallow invalid targets in augassign and except clauses (GH-20083) This commit fixes the new parser to disallow invalid targets in the following scenarios: - Augmented assignments must only accept a single target (Name, Attribute or Subscript), but no tuples or lists. - `except` clauses should only accept a single `Name` as a target. 
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com> * bpo-40602: _Py_hashtable_set() reports rehash failure (GH-20077) If _Py_hashtable_set() fails to grow the hash table (rehash), it now fails rather than ignoring the error. * bpo-40548: GitHub Action workflow: skip jobs on doc only PRs (GH-19983) Signed-off-by: Filipe Laíns <lains@archlinux.org> * bpo-40460: Fix typo in idlelib/zzdummy.py (GH-20093) Replace ztest with ztext. * bpo-40462: Fix typo in test_json (GH-20094) * bpo-38872: Document exec symbol for codeop.compile_command (GH-20047) * Document exec symbol for codeop.compile_command * Remove extra statements Co-authored-by: nanjekyejoannah <joannah.nanjekye@ibm.com> * bpo-40334: Correctly identify invalid target in assignment errors (GH-20076) Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com> * bpo-40548: github actions: pass the changes check on no source changes (GH-20097) Signed-off-by: Filipe Laíns <lains@archlinux.org> * Update code comment re: location of struct _is. (GH-20067) * bpo-40612: Fix SyntaxError edge cases in traceback formatting (GH-20072) This fixes both the traceback.py module and the C code for formatting syntax errors (in Python/pythonrun.c). They now both consistently do the following: - Suppress caret if it points left of text - Allow caret pointing just past end of line - If caret points past end of line, clip to *just* past end of line The syntax error formatting code in traceback.py was mostly rewritten; small, subtle changes were applied to the C code in pythonrun.c. There's still a difference when the text contains embedded newlines. Neither handles these very well, and I don't think the case occurs in practice. Automerge-Triggered-By: @gvanrossum * Fix typo in code comment in main_loop label. (GH-20068) * Trivial typo fix in _tkinter.c (GH-19622) Change spelling of a #define in _tkinter.c from HAVE_LIBTOMMAMTH to HAVE_LIBTOMMATH, since this is used to keep track of tclTomMath.h, not tclTomMamth.h. 
No other file seems to refer to this variable. * bpo-40055: test_distutils leaves warnings filters unchanged (GH-20095) distutils.tests now saves/restores warnings filters to leave them unchanged. Importing tests imports docutils which imports pkg_resources which adds a warnings filter. * bpo-40479: Fix hashlib issue with OpenSSL 3.0.0 (GH-20107) OpenSSL 3.0.0-alpha2 was released today. The FIPS_mode() function has been deprecated and removed. It no longer makes sense with the new provider and context system in OpenSSL 3.0.0. EVP_default_properties_is_fips_enabled() is good enough for our needs in unit tests. It's an internal API, too. Signed-off-by: Christian Heimes <christian@python.org> * bpo-40479: Test with latest OpenSSL versions (GH-20108) * 1.0.2u (EOL) * 1.1.0l (EOL) * 1.1.1g * 3.0.0-alpha2 (disabled for now) Build the FIPS provider and create a FIPS configuration file for OpenSSL 3.0.0. Signed-off-by: Christian Heimes <christian@python.org> Automerge-Triggered-By: @tiran * Update NEWS. 
Co-authored-by: Victor Stinner <vstinner@python.org> Co-authored-by: Javier Buzzi <buzzi.javier@gmail.com> Co-authored-by: Hai Shi <shihai1992@gmail.com> Co-authored-by: Steve Dower <steve.dower@python.org> Co-authored-by: Curtis Bucher <cpbucher5@gmail.com> Co-authored-by: Pablo Galindo <Pablogsal@gmail.com> Co-authored-by: Dennis Sweeney <36520290+sweeneyde@users.noreply.github.com> Co-authored-by: Tim Peters <tim.peters@gmail.com> Co-authored-by: Batuhan Taskaya <batuhanosmantaskaya@gmail.com> Co-authored-by: Raymond Hettinger <rhettinger@users.noreply.github.com> Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com> Co-authored-by: Naglis <naglis@users.noreply.github.com> Co-authored-by: Dong-hee Na <donghee.na92@gmail.com> Co-authored-by: Petr Viktorin <encukou@gmail.com> Co-authored-by: Marcel Plch <mplch@redhat.com> Co-authored-by: Julien Danjou <julien@danjou.info> Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com> Co-authored-by: Zackery Spytz <zspytz@gmail.com> Co-authored-by: Chris Jerdonek <chris.jerdonek@gmail.com> Co-authored-by: Ned Batchelder <ned@nedbatchelder.com> Co-authored-by: Joannah Nanjekye <33177550+nanjekyejoannah@users.noreply.github.com> Co-authored-by: nanjekyejoannah <joannah.nanjekye@ibm.com> Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Jason R. 
Coombs <jaraco@jaraco.com> Co-authored-by: Andre Delfino <adelfino@gmail.com> Co-authored-by: Sergey Fedoseev <fedoseev.sergey@gmail.com> Co-authored-by: Mark Dickinson <dickinsm@gmail.com> Co-authored-by: scoder <stefan_ml@behnel.de> Co-authored-by: Inada Naoki <songofacandy@gmail.com> Co-authored-by: Brad Solomon <brad.solomon.1124@gmail.com> Co-authored-by: Brad Solomon <brsolomon@deloitte.com> Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu> Co-authored-by: Shantanu <hauntsaninja@users.noreply.github.com> Co-authored-by: Allen Guo <guoguo12@gmail.com> Co-authored-by: Tzanetos Balitsaris <tbalitsaris@gmail.com> Co-authored-by: jack1142 <6032823+jack1142@users.noreply.github.com> Co-authored-by: Michael Graczyk <mgraczyk@users.noreply.github.com> Co-authored-by: Arkadiusz Hiler <arek.l1@gmail.com> Co-authored-by: Lumír 'Frenzy' Balhar <lbalhar@redhat.com> Co-authored-by: Miro Hrončok <miro@hroncok.cz> Co-authored-by: Filipe Laíns <filipe.lains@gmail.com> Co-authored-by: Filipe Laíns <lains@archlinux.org> Co-authored-by: Guido van Rossum <guido@python.org> Co-authored-by: Andrew York <andrew.g.york+github@gmail.com> Co-authored-by: Christian Heimes <christian@python.org>

@gvanrossum

* Update docs. * bpo-40513: Per-interpreter signals pending (GH-19924) Move signals_pending from _PyRuntime.ceval to PyInterpreterState.ceval. * bpo-40513: Per-interpreter gil_drop_request (GH-19927) Move gil_drop_request member from _PyRuntimeState.ceval to PyInterpreterState.ceval. * bpo-40514: Add --with-experimental-isolated-subinterpreters (GH-19926) Add --with-experimental-isolated-subinterpreters build option to configure: better isolate subinterpreters, experimental build mode. When used, force the usage of the libc malloc() memory allocator, since pymalloc relies on the unique global interpreter lock (GIL). * bpo-32117: Updated Simpsons names in docs (GH-19737) `sally` is not a Simpsons character Automerge-Triggered-By: @gvanrossum * bpo-40513: Per-interpreter recursion_limit (GH-19929) Move recursion_limit member from _PyRuntimeState.ceval to PyInterpreterState.ceval. * Py_SetRecursionLimit() now only sets _Py_CheckRecursionLimit of ceval.c if the current Python thread is part of the main interpreter. * Inline _Py_MakeEndRecCheck() into _Py_LeaveRecursiveCall(). * Convert _Py_RecursionLimitLowerWaterMark() macro into a static inline function. * bpo-29587: _PyErr_ChainExceptions() checks exception (GH-19902) _PyErr_ChainExceptions() now ensures that the first parameter is an exception type, as done by _PyErr_SetObject(). * The following functions now check PyExceptionInstance_Check() in an assertion using a new _PyBaseExceptionObject_cast() helper function: * PyException_GetTraceback(), PyException_SetTraceback() * PyException_GetCause(), PyException_SetCause() * PyException_GetContext(), PyException_SetContext() * PyExceptionClass_Name() now checks PyExceptionClass_Check() with an assertion. * Remove XXX comment and add gi_exc_state variable to _gen_throw(). 
* Remove comment from test_generators * bpo-40520: Remove redundant comment in pydebug.h (GH-19931) Automerge-Triggered-By: @corona10 * Revert "bpo-40513: Per-interpreter signals pending (GH-19924)" (GH-19932) This reverts commit 4e01946. * bpo-40521: Disable Unicode caches in isolated subinterpreters (GH-19933) When Python is built in the experimental isolated subinterpreters mode, disable Unicode singletons and Unicode interned strings since they are shared by all interpreters. Temporary workaround until these caches are made per-interpreter. * bpo-40458: Increase reserved stack space to prevent overflow crash on Windows (GH-19845) * bpo-40521: Disable free lists in subinterpreters (GH-19937) When Python is built with experimental isolated interpreters, disable tuple, dict and frame free lists. Temporary workaround until these caches are made per-interpreter. Add frame_alloc() and frame_get_builtins() subfunctions to simplify _PyFrame_New_NoTrack(). * bpo-40522: _PyThreadState_Swap() sets autoTSSkey (GH-19939) In the experimental isolated subinterpreters build mode, _PyThreadState_GET() gets the autoTSSkey variable and _PyThreadState_Swap() sets the autoTSSkey variable. * Add _PyThreadState_GetTSS() * _PyRuntimeState_GetThreadState() and _PyThreadState_GET() return _PyThreadState_GetTSS() * PyEval_SaveThread() sets the autoTSSkey variable to current Python thread state rather than NULL. * eval_frame_handle_pending() doesn't check that _PyThreadState_Swap() result is NULL. * _PyThreadState_Swap() gets the current Python thread state with _PyThreadState_GetTSS() rather than _PyRuntimeGILState_GetThreadState(). * PyGILState_Ensure() no longer checks _PyEval_ThreadsInitialized() since it cannot access the current interpreter. * bpo-40513: new_interpreter() init GIL earlier (GH-19942) Fix also code to handle init_interp_main() failure. * bpo-40513: Per-interpreter GIL (GH-19943) In the experimental isolated subinterpreters build mode, the GIL is now per-interpreter. 
Move gil from _PyRuntimeState.ceval to PyInterpreterState.ceval. new_interpreter() always gets the config from the main interpreter. * bpo-40513: _xxsubinterpreters.run_string() releases the GIL (GH-19944) In the experimental isolated subinterpreters build mode, _xxsubinterpreters.run_string() now releases the GIL. * bpo-40355: Improve error messages in ast.literal_eval with malformed Dict nodes (GH-19868) Co-authored-by: Pablo Galindo <Pablogsal@gmail.com> * bpo-40504: Allow weakrefs to lru_cache objects (GH-19938) * bpo-40523: Add pass-throughs for hash() and reversed() to weakref.proxy objects (GH-19946) * bpo-40480 "fnmatch" exponential execution time (GH-19908) bpo-40480: create different regexps in the presence of multiple `*` patterns to prevent fnmatch() from taking exponential time. * bpo-40517: Implement syntax highlighting support for ASDL (#19928) * Revert "bpo-40517: Implement syntax highlighting support for ASDL (#19928)" (#19950) This reverts commit d60040b. * bpo-40527: Fix command line argument parsing (GH-19955) * bpo-40528: Improve and clear several aspects of the ASDL definition code for the AST (GH-19952) * bpo-40521: Disable method cache in subinterpreters (GH-19960) When Python is built with experimental isolated interpreters, disable the type method cache. Temporary workaround until the cache is made per-interpreter. * bpo-40533: Disable GC in subinterpreters (GH-19961) When Python is built with experimental isolated interpreters, a garbage collection now does nothing in an isolated interpreter. Temporary workaround until subinterpreters stop sharing Python objects. * bpo-40521: Disable list free list in subinterpreters (GH-19959) When Python is built with experimental isolated interpreters, disable the list free list. Temporary workaround until this cache is made per-interpreter. * bpo-40334: Add type to the assignment rule in the grammar file (GH-19963) * Fix typo in sqlite3 documentation (GH-19965) *first* is repeated twice. 
* bpo-40334: Allow trailing comma in parenthesised context managers (GH-19964) * bpo-40334: Generate comments in the parser code to improve debugging (GH-19966) * bpo-40397: Refactor typing._GenericAlias (GH-19719) Make the design more object-oriented. Split _GenericAlias on two almost independent classes: for special generic aliases like List and for parametrized generic aliases like List[int]. Add specialized subclasses for Callable, Callable[...], Tuple and Union[...]. * bpo-1635741: Port errno module to multiphase initialization (GH-19923) * bpo-40334: Fix error location upon parsing an invalid string literal (GH-19962) When parsing a string with an invalid escape, the old parser used to point to the beginning of the invalid string. This commit changes the new parser to match that behaviour, since it's currently pointing to the end of the string (or to be more precise, to the beginning of the next token). * bpo-40334: Error message for invalid default args in function call (GH-19973) When parsing something like `f(g()=2)`, where the name of a default arg is not a NAME, but an arbitrary expression, a specialised error message is emitted. * bpo-38787: C API for module state access from extension methods (PEP 573) (GH-19936) Module C state is now accessible from C-defined heap type methods (PEP 573). Patch by Marcel Plch and Petr Viktorin. Co-authored-by: Marcel Plch <mplch@redhat.com> Co-authored-by: Victor Stinner <vstinner@python.org> * bpo-40545: Export _PyErr_GetTopmostException() function (GH-19978) Declare _PyErr_GetTopmostException() with PyAPI_FUNC() to properly export the function in the C API. The function remains private ("_Py") prefix. Co-Authored-By: Julien Danjou <julien@danjou.info> * bpo-32604: [_xxsubinterpreters] Propagate exceptions. (GH-19768) (Note: PEP 554 is not accepted and the implementation in the code base is a private one for use in the test suite.) 
If code running in a subinterpreter raises an uncaught exception then the "run" call in the calling interpreter fails. A RunFailedError is raised there that summarizes the original exception as a string. The actual exception type, __cause__, __context__, state, etc. are all discarded. This turned out to be functionally insufficient in practice. There is a more helpful solution (and PEP 554 has been updated appropriately). This change adds the exception propagation behavior described in PEP 554 to the _xxsubinterpreters module. With this change a copy of the original exception is set to __cause__ on the RunFailedError. For now we are using "pickle", which preserves the exception's state. We also preserve the original __cause__, __context__, and __traceback__ (since "pickle" does not preserve those). https://bugs.python.org/issue32604 * bpo-38787: Update structures.rst docs (PEP 573) (GH-19980) * bpo-40548: Always run GitHub action, even on doc PRs (GH-19981) Always run GitHub action jobs, even on documentation-only pull requests. So it will be possible to make a GitHub action job, like the Windows (64-bit) job, mandatory. * bpo-40517: Implement syntax highlighting support for ASDL (GH-19967) * bpo-40555: Check for p->error_indicator in loop rules after the main loop is done (GH-19986) * bpo-40273: Reversible mappingproxy (GH-19513) * bpo-40559: Add Py_DECREF to _asynciomodule.c:task_step_impl() (GH-19990) This fixes a possible memory leak in the C implementation of asyncio.Task. * Make the first dataclass example more useful (GH-19994) * bpo-40541: Add optional *counts* parameter to random.sample() (GH-19970) * bpo-40502: Initialize n->n_col_offset (GH-19988) * initialize n->n_col_offset * 📜🤖 Added by blurb_it. 
* Move initialization Co-authored-by: nanjekyejoannah <joannah.nanjekye@ibm.com> Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> * bpo-39791: Add files() to importlib.resources (GH-19722) * bpo-39791: Update importlib.resources to support files() API (importlib_resources 1.5). * 📜🤖 Added by blurb_it. * Add some documentation about the new objects added. Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> * bpo-40566: Apply PEP 573 to abc module (GH-20005) * bpo-40570: Improve compatibility of uname_result with late-bound .platform (#20015) * bpo-40570: Improve compatibility of uname_result with late-bound .platform. * Add test capturing ability to cast uname to a tuple. * bpo-40334: Avoid collisions between parser variables and grammar variables (GH-19987) This is for the C generator: - Disallow rule and variable names starting with `_` - Rename most local variable names generated by the parser to start with `_` Exceptions: - Renaming `p` to `_p` will be a separate PR - There are still some names that might clash, e.g. - anything starting with `Py` - C reserved words (`if` etc.) - Macros like `EXTRA` and `CHECK` * Add link to Enum class (GH-19884) * bpo-40397: Remove __args__ and __parameters__ from _SpecialGenericAlias (GH-19984) * bpo-40549: Convert posixmodule.c to multiphase init (GH-19982) Convert posixmodule.c ("posix" or "nt" module) to the multiphase initialization (PEP 489). * Create the module using PyModuleDef_Init(). * Create ScandirIteratorType and DirEntryType with the new PyType_FromModuleAndSpec() (PEP 573) * Get the module state from ScandirIteratorType and DirEntryType with the new PyType_GetModule() (PEP 573) * Pass module to functions which access the module state. * convert_sched_param() gets a new module parameter. It is now called directly since Argument Clinic doesn't support passing the module to an argument converter callback. * Remove _posixstate_global macro. 
* bpo-37986: Improve perfomance of PyLong_FromDouble() (GH-15611) * bpo-37986: Improve perfomance of PyLong_FromDouble() * Use strict bound check for safety and symmetry * Remove possibly outdated performance claims Co-authored-by: Mark Dickinson <dickinsm@gmail.com> * bpo-40397: Fix subscription of nested generic alias without parameters. (GH-20021) * bpo-40257: Tweak docstrings for special generic aliases. (GH-20022) * Add the terminating period. * Omit module name for builtin types. * Improve code clarity for the set lookup logic (GH-20028) * bpo-40585: Normalize errors messages in codeop when comparing them (GH-20030) With the new parser, the error message contains always the trailing newlines, causing the comparison of the repr of the error messages in codeop to fail. This commit makes the new parser mirror the old parser's behaviour regarding trailing newlines. * bpo-40575: Avoid unnecessary overhead in _PyDict_GetItemIdWithError() (GH-20018) Avoid unnecessary overhead in _PyDict_GetItemIdWithError() by calling _PyDict_GetItem_KnownHash() instead of the more generic PyDict_GetItemWithError(), since we already know the hash of interned strings. * bpo-36346: array: Don't use deprecated APIs (GH-19653) * Py_UNICODE -> wchar_t * Py_UNICODE -> unicode in Argument Clinic * PyUnicode_AsUnicode -> PyUnicode_AsWideCharString * Don't use "u#" format. Co-authored-by: Victor Stinner <vstinner@python.org> * bpo-40561: Add docstrings for webbrowser open functions (GH-19999) Co-authored-by: Brad Solomon <brsolomon@deloitte.com> Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu> * bpo-40584: Update PyType_FromModuleAndSpec() to process tp_vectorcall_offset (GH-20026) * bpo-40334: produce specialized errors for invalid del targets (GH-19911) * bpo-39465: Don't access directly _Py_Identifier members (GH-20043) * Replace id->object with _PyUnicode_FromId(&id) * Use _Py_static_string_init(str) macro to initialize statically name_op in typeobject.c. 
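As a rough illustration of the bpo-37986 entry above: once the range check for values that fit in a C long has been skipped, the slow path of PyLong_FromDouble() splits the double with frexp() and peels off one digit at a time. The sketch below is a pure-Python model of that loop, not CPython's C code. The name `long_from_double_model` and the 30-bit `SHIFT` are assumptions made for this example, and the fast path is deliberately left out.

```python
import math

SHIFT = 30  # assumed digit width; CPython's PyLong_SHIFT is a build-time constant


def long_from_double_model(dval: float) -> int:
    """Pure-Python model of a frexp()-based float-to-int conversion (illustrative only)."""
    if math.isinf(dval):
        raise OverflowError("cannot convert float infinity to integer")
    if math.isnan(dval):
        raise ValueError("cannot convert float NaN to integer")
    negative = dval < 0.0
    if negative:
        dval = -dval
    # dval == frac * 2**expo with 0.5 <= frac < 1.0 (or frac == 0.0 for zero)
    frac, expo = math.frexp(dval)
    if expo <= 0:
        return 0  # |dval| < 1 truncates to zero
    ndig = (expo - 1) // SHIFT + 1  # number of SHIFT-bit digits in the result
    frac = math.ldexp(frac, (expo - 1) % SHIFT + 1)  # scale so int(frac) is the top digit
    result = 0
    for _ in range(ndig):
        bits = int(frac)  # integer part is the next digit, always < 2**SHIFT
        result = (result << SHIFT) | bits
        frac = math.ldexp(frac - bits, SHIFT)  # move the next digit into the integer part
    return -result if negative else result
```

The `expo <= 0` branch is the check discussed in the review: it only matters for inputs with magnitude below 1, which a range check placed in front of this code would already have handled.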
* bpo-40571: Make lru_cache(maxsize=None) more discoverable (GH-20019) * bpo-40602: Rename hashtable.h to pycore_hashtable.h (GH-20044) * Move Modules/hashtable.h to Include/internal/pycore_hashtable.h * Move Modules/hashtable.c to Python/hashtable.c * Python is now linked to hashtable.c. _tracemalloc is no longer linked to hashtable.c. Previously, marshal.c got hashtable.c via _tracemalloc.c which is built as a builtin module. * bpo-40602: _Py_hashtable_new() uses PyMem_Malloc() (GH-20046) _Py_hashtable_new() now uses PyMem_Malloc/PyMem_Free allocator by default, rather than PyMem_RawMalloc/PyMem_RawFree. PyMem_Malloc is faster than PyMem_RawMalloc for memory blocks smaller than or equal to 512 bytes. * bpo-40480: restore ability to join fnmatch.translate() results (GH-20049) In translate(), generate unique group names across calls. This restores the undocumented ability to get a valid regexp by joining multiple translate() results via `|`. * bpo-39481: remove generic classes from ipaddress/mmap (GH-20045) These were added by mistake (see https://bugs.python.org/issue39481#msg366288). * bpo-40593: Improve syntax errors for invalid characters in source code. (GH-20033) * bpo-40602: Optimize _Py_hashtable for pointer keys (GH-20051) Optimize _Py_hashtable_get() and _Py_hashtable_get_entry() for pointer keys: * key_size == sizeof(void*) * hash_func == _Py_hashtable_hash_ptr * compare_func == _Py_hashtable_compare_direct Changes: * Add get_func and get_entry_func members to _Py_hashtable_t * Convert _Py_hashtable_get() and _Py_hashtable_get_entry() functions to static inline functions. * Add specialized get and get entry for pointer keys. * bpo-40596: Fix str.isidentifier() for non-canonicalized strings containing non-BMP characters on Windows. (GH-20053) * bpo-38787: Add PyCFunction_CheckExact() macro for exact type checks (GH-20024) … now that we allow subtypes of PyCFunction. 
Also add PyCMethod_CheckExact() and PyCMethod_Check() for checks against the PyCMethod subtype. * bpo-40602: Add _Py_HashPointerRaw() function (GH-20056) Add a new _Py_HashPointerRaw() function which avoids replacing -1 with -2 to micro-optimize hash table using pointer keys: using _Py_hashtable_hash_ptr() hash function. * bpo-40501: Replace ctypes code in uuid with native module (GH-19948) * Fix Wikipedia link (GH-20031) * bpo-40609: Rewrite how _tracemalloc handles domains (GH-20059) Rewrite how the _tracemalloc module stores traces of other domains. Rather than storing the domain inside the key, it now uses a new hash table with the domain as the key, and the data is a per-domain traces hash table. * Add tracemalloc_domain hash table. * Remove _Py_tracemalloc_config.use_domain. * Remove pointer_t and related functions. * bpo-40609: Remove _Py_hashtable_t.key_size (GH-20060) Rewrite _Py_hashtable_t type to always store the key as a "const void *" pointer. Add an explicit "key" member to _Py_hashtable_entry_t. Remove _Py_hashtable_t.key_size member. hash and compare functions drop their hash table parameter, and their 'key' parameter type becomes "const void *". * bpo-40609: Add destroy functions to _Py_hashtable (GH-20062) Add key_destroy_func and value_destroy_func parameters to _Py_hashtable_new_full(). marshal.c and _tracemalloc.c use these destroy functions. * bpo-40609: _tracemalloc allocates traces (GH-20064) Rewrite _tracemalloc to store "trace_t*" rather than directly "trace_t" in traces hash tables. Traces are now allocated on the heap memory, outside the hash table. Add tracemalloc_copy_traces() and tracemalloc_copy_domains() helper functions. Remove _Py_hashtable_copy() function since there is no API to copy a key or a value. Remove also _Py_hashtable_delete() function which was commented. * bpo-40609: _Py_hashtable_t values become void* (GH-20065) _Py_hashtable_t values become regular "void *" pointers. 
* Add _Py_hashtable_entry_t.data member * Remove _Py_hashtable_t.data_size member * Remove _Py_hashtable_t.get_func member. It is no longer needed to specialize _Py_hashtable_get() for a specific value size, since all entries now have the same size (void*). * Remove the following macros: * _Py_HASHTABLE_GET() * _Py_HASHTABLE_SET() * _Py_HASHTABLE_SET_NODATA() * _Py_HASHTABLE_POP() * Rename _Py_hashtable_pop() to _Py_hashtable_steal() * _Py_hashtable_foreach() callback now gets key and value rather than entry. * Remove _Py_hashtable_value_destroy_func type. value_destroy_func callback now only has a single parameter: data (void*). * bpo-40602: Optimize _Py_hashtable_get_ptr() (GH-20066) _Py_hashtable_get_entry_ptr() avoids comparing the entry hash: compare directly keys. Move _Py_hashtable_get_entry_ptr() just after _Py_hashtable_get_entry_generic(). * bpo-40331: Increase test coverage for the statistics module (GH-19608) * bpo-40613: Remove compiler warning from _xxsubinterpretersmodule (GH-20069) * bpo-34790: add version of removal of explicit passing of coros to `asyncio.wait`'s documentation (#20008) * bpo-40334: Always show the caret on SyntaxErrors (GH-20050) This commit fixes SyntaxError locations when the caret is not displayed, by doing the following: - `col_number` always gets set to the location of the offending node/expr. When no caret is to be displayed, this gets achieved by setting the object holding the error line to None. - Introduce a new function `_PyPegen_raise_error_known_location`, which can be called, when an arbitrary `lineno`/`col_offset` needs to be passed. This function then gets used in the grammar (through some new macros and inline functions) so that SyntaxError locations of the new parser match that of the old. 
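The bpo-40480 entry earlier in this log (GH-20049) refers to joining several fnmatch.translate() results with `|`. A minimal sketch of that usage follows, assuming a Python version that includes the change. The pattern list and the helper name `matches_any` are hypothetical, chosen only to show patterns that each contain more than one `*`.

```python
import fnmatch
import re

# Hypothetical patterns; each has several '*' wildcards, the case bpo-40480 touched.
patterns = ["a*b*c", "x*y*z"]

# Join the translated regexps into one alternation and compile it once.
combined = re.compile("|".join(fnmatch.translate(p) for p in patterns))


def matches_any(name: str) -> bool:
    """Return True if name matches at least one of the shell-style patterns."""
    return combined.match(name) is not None
```

The commit message itself describes this joining ability as undocumented, so code that relies on it is depending on an implementation detail.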
* bpo-38787: Fix Argument Clinic defining_class_converter (GH-20074) Don't hardcode defining_class parameter name to "cls": * Define CConverter.set_template_dict(): do nothing by default * CLanguage.render_function() now calls set_template_dict() on all converters. * issue-25872: Fix KeyError using linecache from multiple threads (GH-18007) The crash that this fixes occurs when using traceback and other modules from multiple threads; del cache[filename] can raise a KeyError. * bpo-39465: Remove _PyUnicode_ClearStaticStrings() from C API (GH-20078) Remove the _PyUnicode_ClearStaticStrings() function from the C API. Make the function fully private (declare it with "static"). * bpo-29587: Make gen.throw() chain exceptions with yield from (GH-19858) The previous commits on bpo-29587 got exception chaining working with gen.throw() in the `yield` case. This patch also gets the `yield from` case working. As a consequence, implicit exception chaining now also works in the asyncio scenario of awaiting on a task when an exception is already active. Tests are included for both the asyncio case and the pure generator-only case. * bpo-40521: Add PyInterpreterState.unicode (GH-20081) Move PyInterpreterState.fs_codec into a new PyInterpreterState.unicode structure. Give a name to the fs_codec structure and use this structure in unicodeobject.c. * bpo-40597: email: Use CTE if lines are longer than max_line_length consistently (gh-20038) raw_data_manager (default for EmailPolicy, EmailMessage) does correct wrapping of 'text' parts as long as the message contains characters outside of 7bit US-ASCII set: base64 or qp Content-Transfer-Encoding is applied if the lines would be too long without it. It did not, however, do this for ascii-only text, which could result in lines that were longer than policy.max_line_length or even the rfc 998 maximum. 
This changeset fixes the heuristic so that if lines are longer than policy.max_line_length, it will always apply a content-transfer-encoding so that the lines are wrapped correctly. * bpo-40275: Import locale module lazily in gettext (GH-19905) * bpo-40495: compileall option to hardlink duplicate pyc files (GH-19901) compileall is now able to use hardlinks to prevent duplicates in a case when .pyc files for different optimization levels have the same content. Co-authored-by: Miro Hrončok <miro@hroncok.cz> Co-authored-by: Victor Stinner <vstinner@python.org> * bpo-40549: posixmodule.c uses defining_class (GH-20075) Pass PEP 573 defining_class to os.DirEntry methods. The module state is now retrieved from defining_class rather than Py_TYPE(self), to support subclasses (even if DirEntry doesn't support subclasses yet). * Pass the module rather than defining_class to DirEntry_fetch_stat(). * Only get the module state once in _posix_clear(), _posix_traverse() and _posixmodule_exec(). * Revert "bpo-32604: [_xxsubinterpreters] Propagate exceptions. (GH-19768)" (GH-20089) * Revert "bpo-40613: Remove compiler warning from _xxsubinterpretersmodule (GH-20069)" This reverts commit fa0a66e. * Revert "bpo-32604: [_xxsubinterpreters] Propagate exceptions. (GH-19768)" This reverts commit a1d9e0a. * bpo-40602: Write unit tests for _Py_hashtable_t (GH-20091) Cleanup also hashtable.c. Rename _Py_hashtable_t members: * Rename entries to nentries * Rename num_buckets to nbuckets * bpo-40619: Correctly handle error lines in programs without file mode (GH-20090) * bpo-40618: Disallow invalid targets in augassign and except clauses (GH-20083) This commit fixes the new parser to disallow invalid targets in the following scenarios: - Augmented assignments must only accept a single target (Name, Attribute or Subscript), but no tuples or lists. - `except` clauses should only accept a single `Name` as a target. 
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com> * bpo-40602: _Py_hashtable_set() reports rehash failure (GH-20077) If _Py_hashtable_set() fails to grow the hash table (rehash), it now fails rather than ignoring the error. * bpo-40548: GitHub Action workflow: skip jobs on doc only PRs (GH-19983) Signed-off-by: Filipe Laíns <lains@archlinux.org> * bpo-40460: Fix typo in idlelib/zzdummy.py (GH-20093) Replace ztest with ztext. * bpo-40462: Fix typo in test_json (GH-20094) * bpo-38872: Document exec symbol for codeop.compile_command (GH-20047) * Document exec symbol for codeop.compile_command * Remove extra statements Co-authored-by: nanjekyejoannah <joannah.nanjekye@ibm.com> * bpo-40334: Correctly identify invalid target in assignment errors (GH-20076) Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com> * bpo-40548: github actions: pass the changes check on no source changes (GH-20097) Signed-off-by: Filipe Laíns <lains@archlinux.org> * Update code comment re: location of struct _is. (GH-20067) * bpo-40612: Fix SyntaxError edge cases in traceback formatting (GH-20072) This fixes both the traceback.py module and the C code for formatting syntax errors (in Python/pythonrun.c). They now both consistently do the following: - Suppress caret if it points left of text - Allow caret pointing just past end of line - If caret points past end of line, clip to *just* past end of line The syntax error formatting code in traceback.py was mostly rewritten; small, subtle changes were applied to the C code in pythonrun.c. There's still a difference when the text contains embedded newlines. Neither handles these very well, and I don't think the case occurs in practice. Automerge-Triggered-By: @gvanrossum * Fix typo in code comment in main_loop label. (GH-20068) * Trivial typo fix in _tkinter.c (GH-19622) Change spelling of a #define in _tkinter.c from HAVE_LIBTOMMAMTH to HAVE_LIBTOMMATH, since this is used to keep track of tclTomMath.h, not tclTomMamth.h. 
No other file seems to refer to this variable. * bpo-40055: test_distutils leaves warnings filters unchanged (GH-20095) distutils.tests now saves/restores warnings filters to leave them unchanged. Importing tests imports docutils which imports pkg_resources which adds a warnings filter. * bpo-40479: Fix hashlib issue with OpenSSL 3.0.0 (GH-20107) OpenSSL 3.0.0-alpha2 was released today. The FIPS_mode() function has been deprecated and removed. It no longer makes sense with the new provider and context system in OpenSSL 3.0.0. EVP_default_properties_is_fips_enabled() is good enough for our needs in unit tests. It's an internal API, too. Signed-off-by: Christian Heimes <christian@python.org> * bpo-40479: Test with latest OpenSSL versions (GH-20108) * 1.0.2u (EOL) * 1.1.0l (EOL) * 1.1.1g * 3.0.0-alpha2 (disabled for now) Build the FIPS provider and create a FIPS configuration file for OpenSSL 3.0.0. Signed-off-by: Christian Heimes <christian@python.org> Automerge-Triggered-By: @tiran * Update NEWS. 
Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Javier Buzzi <buzzi.javier@gmail.com>
Co-authored-by: Hai Shi <shihai1992@gmail.com>
Co-authored-by: Steve Dower <steve.dower@python.org>
Co-authored-by: Curtis Bucher <cpbucher5@gmail.com>
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
Co-authored-by: Dennis Sweeney <36520290+sweeneyde@users.noreply.github.com>
Co-authored-by: Tim Peters <tim.peters@gmail.com>
Co-authored-by: Batuhan Taskaya <batuhanosmantaskaya@gmail.com>
Co-authored-by: Raymond Hettinger <rhettinger@users.noreply.github.com>
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Naglis <naglis@users.noreply.github.com>
Co-authored-by: Dong-hee Na <donghee.na92@gmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Marcel Plch <mplch@redhat.com>
Co-authored-by: Julien Danjou <julien@danjou.info>
Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
Co-authored-by: Zackery Spytz <zspytz@gmail.com>
Co-authored-by: Chris Jerdonek <chris.jerdonek@gmail.com>
Co-authored-by: Ned Batchelder <ned@nedbatchelder.com>
Co-authored-by: Joannah Nanjekye <33177550+nanjekyejoannah@users.noreply.github.com>
Co-authored-by: nanjekyejoannah <joannah.nanjekye@ibm.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
Co-authored-by: Andre Delfino <adelfino@gmail.com>
Co-authored-by: Sergey Fedoseev <fedoseev.sergey@gmail.com>
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Co-authored-by: scoder <stefan_ml@behnel.de>
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
Co-authored-by: Brad Solomon <brad.solomon.1124@gmail.com>
Co-authored-by: Brad Solomon <brsolomon@deloitte.com>
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
Co-authored-by: Shantanu <hauntsaninja@users.noreply.github.com>
Co-authored-by: Allen Guo <guoguo12@gmail.com>
Co-authored-by: Tzanetos Balitsaris <tbalitsaris@gmail.com>
Co-authored-by: jack1142 <6032823+jack1142@users.noreply.github.com>
Co-authored-by: Michael Graczyk <mgraczyk@users.noreply.github.com>
Co-authored-by: Arkadiusz Hiler <arek.l1@gmail.com>
Co-authored-by: Lumír 'Frenzy' Balhar <lbalhar@redhat.com>
Co-authored-by: Miro Hrončok <miro@hroncok.cz>
Co-authored-by: Filipe Laíns <filipe.lains@gmail.com>
Co-authored-by: Filipe Laíns <lains@archlinux.org>
Co-authored-by: Guido van Rossum <guido@python.org>
Co-authored-by: Andrew York <andrew.g.york+github@gmail.com>
Co-authored-by: Christian Heimes <christian@python.org>

the-knights-who-say-ni added the CLA signed label Aug 30, 2019

bedevere-bot added the awaiting review label Aug 30, 2019

sir-sigurd force-pushed the float-as-double-macro branch from 7555a03 to 1df092e Compare August 30, 2019 09:01

sir-sigurd changed the title ~~Improve perfomance of PyLong_FromDouble()~~ bpo-37986: Improve perfomance of PyLong_FromDouble() Aug 30, 2019

sir-sigurd commented Aug 30, 2019

serhiy-storchaka reviewed Aug 30, 2019

sir-sigurd force-pushed the float-as-double-macro branch from 1df092e to 0572857 Compare August 30, 2019 09:27

gpshead self-assigned this Sep 10, 2019

gpshead reviewed Sep 11, 2019

serhiy-storchaka requested a review from mdickinson September 11, 2019 16:20

mdickinson reviewed Oct 26, 2019

mdickinson previously approved these changes Oct 26, 2019

bedevere-bot added awaiting merge and removed awaiting review labels Oct 26, 2019

mdickinson approved these changes Oct 26, 2019

mdickinson mentioned this pull request Oct 29, 2019

bpo-38629: implement __floor__ and __ceil__ for float #16985

Merged

bpo-37986: Improve perfomance of PyLong_FromDouble()

5dab8e2

sir-sigurd force-pushed the float-as-double-macro branch from 1ad7603 to 5dab8e2 Compare November 20, 2019 05:13

Use strict bound check for safety and symmetry

0b90ead
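The strict bound check named in this commit can be illustrated with a minimal sketch. It is not the code merged in the PR: `fits_in_long` and `try_double_to_long` are hypothetical helper names, and only the `int_max` expression is taken from the diff quoted in review. The sketch assumes a binary floating-point `double` and a two's-complement `long`.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not CPython's code: true when dval lies strictly
 * between -(LONG_MAX + 1) and LONG_MAX + 1, so that casting it to long
 * is well defined. */
static bool
fits_in_long(double dval)
{
    /* LONG_MAX + 1 is a power of two, so it converts to double exactly
     * even when LONG_MAX itself does not (64-bit long vs. 53-bit
     * mantissa). The addition is done in unsigned long to avoid signed
     * overflow. */
    const double int_max = (unsigned long)LONG_MAX + 1;

    /* Strict on both sides: (double)LONG_MIN equals -int_max and would
     * convert safely, but it is excluded to keep the test symmetric.
     * NaN fails both comparisons and is rejected as well. */
    return -int_max < dval && dval < int_max;
}

/* Hypothetical fast path: convert only when the range check passes. */
static bool
try_double_to_long(double dval, long *out)
{
    if (!fits_in_long(dval)) {
        return false;       /* caller falls back to a slow path */
    }
    *out = (long)dval;      /* the cast truncates toward zero */
    return true;
}
```

Values that fail the check, including `(double)LONG_MIN` itself, would be left to the slower general conversion, which costs a little speed at one boundary value but keeps the comparison simple.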

serhiy-storchaka reviewed May 10, 2020

Remove possibly outdated performance claims

8be513e

mdickinson merged commit 86a93fd into python:master May 10, 2020

bedevere-bot removed the awaiting merge label May 10, 2020

sir-sigurd deleted the float-as-double-macro branch May 10, 2020 09:17

sir-sigurd commented Aug 30, 2019 (edited by bedevere-bot)

gpshead Sep 11, 2019

serhiy-storchaka Sep 11, 2019

mdickinson Oct 26, 2019

sir-sigurd Oct 26, 2019

gpshead Oct 26, 2019 (edited)

serhiy-storchaka Oct 26, 2019

gpshead Sep 11, 2019

serhiy-storchaka Sep 11, 2019

sir-sigurd Sep 11, 2019 (edited)

sir-sigurd Sep 11, 2019

mdickinson Oct 26, 2019

mdickinson Oct 26, 2019

sir-sigurd Oct 26, 2019

mdickinson Oct 27, 2019

serhiy-storchaka commented Sep 16, 2019

serhiy-storchaka commented Oct 21, 2019

ghost commented Oct 21, 2019

mdickinson commented Oct 21, 2019

mdickinson left a comment

mdickinson commented Oct 26, 2019 (edited)

sir-sigurd commented Oct 26, 2019

mdickinson commented Oct 26, 2019

mdickinson commented Oct 26, 2019 (edited)

mdickinson commented Oct 26, 2019

ghost commented Oct 27, 2019

sir-sigurd commented May 10, 2020

mdickinson commented May 10, 2020

serhiy-storchaka May 10, 2020

mdickinson May 10, 2020
