dtoa: thread safety in `--disable-gil` builds #111962

colesbury · 2023-11-10T15:50:45Z

Feature or enhancement

The Python/dtoa.c library is responsible for formatting floating point numbers (double) as strings and for parsing strings into numbers. The shared state is not thread-safe without the GIL:

Balloc and Bfree use a per-interpreter free-list to avoid some allocations of Bigint objects
pow5mult uses a per-interpreter append-only list of Bigint powers of 5

For (1), we can just skip using the freelists in --disable-gil builds. We already have a code path (Py_USING_MEMORY_DEBUGGER) that doesn't use freelists.

cpython/Python/dtoa.c

Line 312 in d61313b

#ifndef Py_USING_MEMORY_DEBUGGER

For (2), we can use atomic operations to append to the powers-of-5 linked list in a thread-safe manner. I don't think this needs to be guarded by a Py_NOGIL checks, since each power-of-5 is only ever created once.

For context, here is the modification to Python/dtoa.c in nogil-3.12. Note that it uses a PyMutex for (2), which I think we can avoid.

dragonbox, Ryū, Grisu, Schubfach

In the past 5 or so years, there have been a number of faster float-to-string algorithms with the desirable attributes (correctly rounded, no "garabage" digits, etc.). To my knowledge they are also all thread-safe. "dragonbox" looks the most promising, but porting it to C is a bigger undertaking than making the existing dtoa.c thread-safe.

Linked PRs

gh-111962: Make dtoa thread-safe in --disable-gil builds. #112049

The text was updated successfully, but these errors were encountered:

ericvsmith · 2023-11-11T00:19:27Z

@mdickinson for awareness.

mdickinson · 2023-11-12T10:00:32Z

Thanks, @ericvsmith.

@colesbury

In the past 5 or so years, there have been a number of faster float-to-string algorithms [...]

I'm not opposed to moving away from dtoa.c (quite the opposite), but it's important to recognise just how much dtoa.c is doing for us right now - not just float-to-string in all its wonderful variants (shortest string, correctly-rounded fixed precision including the case of negative precision, correctly-rounded fixed number of significant digits) but correctly-rounded string-to-float, too. I like the look of dragonbox, but it only covers one subcase (shortest string) of one direction (float-to-string). We'd need to find substitutes for all the various pieces of dtoa.c.

mdickinson · 2023-11-12T10:04:36Z

The suggested mitigations for (1) and (2) sound good to me, FWIW.

mdickinson · 2023-11-12T14:18:03Z

I guess we should probably open a separate issue if we want to explore the idea of moving away from dtoa.c entirely. Dragonbox looks good for shortest-string float-to-str. Daniel Lemire's fast_double_parser might do in the other direction for correctly-rounded str-to-float, but I haven't looked at it in detail and I don't know how battle-tested it is. The main speed priority would be for numeric strings with 17 or fewer significant decimal digits (since that's the max number of digits that the shortest-string algorithm will produce); it seems fine to me if a slower algorithm is used for larger numbers of significant digits. That leaves support for e-style formatting, f-style formatting, and two-argument round (including the case where the second argument to round is negative). Again it would seem fine to have slower fallback options for e and f-style formatting with stupid numbers of digits, since that's a niche use-case, but it would be a shame to lose Python's correct rounding here. E.g., format(0.1, '.55f') should continue to give '0.1000000000000000055511151231257827021181583404541015625' rather than producing something misleading like 0.1000000000000000055500000000000000000000000000000000000 (which is what some other implementations/languages do).

mdickinson · 2023-11-12T14:20:57Z

Daniel Lemire's fast_double_parser might do in the other direction

Scratch that; the README says "We encourage users to adopt fast_float library instead".

colesbury · 2023-11-13T16:17:29Z

@mdickinson, thanks for the context. I'll plan on only making minimal changes to dtoa to make it thread-safe.

This avoids using the Bigint free-list in `--disable-gil` builds, and pre-computes the needed powers of 5 during interpreter initialization. This makes two changes to dtoa

This avoids using the Bigint free-list in `--disable-gil` builds and pre-computes the needed powers of 5 during interpreter initialization.

This updates `dtoa.c` to avoid using the Bigint free-list in --disable-gil builds and to pre-computes the needed powers of 5 during interpreter initialization. * gh-111962: Make dtoa thread-safe in `--disable-gil` builds. This avoids using the Bigint free-list in `--disable-gil` builds and pre-computes the needed powers of 5 during interpreter initialization. * Fix size of cached powers of 5 array. We need the powers of 5 up to 5**512 because we only jump straight to underflow when the exponent is less than -512 (or larger than 308). * Rename Py_NOGIL to Py_GIL_DISABLED * Changes from review * Fix assertion placement

…thon#112049) This updates `dtoa.c` to avoid using the Bigint free-list in --disable-gil builds and to pre-computes the needed powers of 5 during interpreter initialization. * pythongh-111962: Make dtoa thread-safe in `--disable-gil` builds. This avoids using the Bigint free-list in `--disable-gil` builds and pre-computes the needed powers of 5 during interpreter initialization. * Fix size of cached powers of 5 array. We need the powers of 5 up to 5**512 because we only jump straight to underflow when the exponent is less than -512 (or larger than 308). * Rename Py_NOGIL to Py_GIL_DISABLED * Changes from review * Fix assertion placement

colesbury added type-feature A feature request or enhancement 3.13 bugs and security fixes topic-free-threading labels Nov 10, 2023

colesbury mentioned this issue Nov 10, 2023

PEP 703 -- Making the Global Interpreter Lock Optional in CPython #108219

Open

bedevere-app bot mentioned this issue Nov 13, 2023

gh-111962: Make dtoa thread-safe in --disable-gil builds. #112049

Merged

colesbury self-assigned this Nov 14, 2023

colesbury closed this as completed Dec 7, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dtoa: thread safety in `--disable-gil` builds #111962

dtoa: thread safety in `--disable-gil` builds #111962

colesbury commented Nov 10, 2023 •

edited by bedevere-app bot

Loading

ericvsmith commented Nov 11, 2023

mdickinson commented Nov 12, 2023

mdickinson commented Nov 12, 2023

mdickinson commented Nov 12, 2023 •

edited

Loading

mdickinson commented Nov 12, 2023

colesbury commented Nov 13, 2023

dtoa: thread safety in --disable-gil builds #111962

dtoa: thread safety in --disable-gil builds #111962

Comments

colesbury commented Nov 10, 2023 • edited by bedevere-app bot Loading

Feature or enhancement

dragonbox, Ryū, Grisu, Schubfach

Linked PRs

ericvsmith commented Nov 11, 2023

mdickinson commented Nov 12, 2023

mdickinson commented Nov 12, 2023

mdickinson commented Nov 12, 2023 • edited Loading

mdickinson commented Nov 12, 2023

colesbury commented Nov 13, 2023

dtoa: thread safety in `--disable-gil` builds #111962

dtoa: thread safety in `--disable-gil` builds #111962

colesbury commented Nov 10, 2023 •

edited by bedevere-app bot

Loading

mdickinson commented Nov 12, 2023 •

edited

Loading