Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-101291: Low level opt-in API for pylong #101685

Merged
merged 3 commits into from
May 21, 2023

Conversation

markshannon
Copy link
Member

@markshannon markshannon commented Feb 8, 2023

This is a stop-gap solution, until we decide on future directions for the C-API.

Adds some inline functions to allow fast access to the value of PyLongObject objects, without explicitly probing the internals.

These new functions will be guarded by PY_UNSTABLE_PYLONG_ABI.

In the future we may:

  • Add the non-inline versions of the functions to the full API (removing the underscore)
  • Replace PY_UNSTABLE_PYLONG_ABI with a more general ABI selector flag.

@encukou Is this OK as a stopgap?
@scoder Is this usable by Cython?

@markshannon markshannon changed the title Low level opt-in API for pylong GH-101291: Low level opt-in API for pylong Feb 8, 2023
Copy link
Member

@arhadthedev arhadthedev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A minor typo.

/* Return 1 if the argument fits into a (N-1) bit integer,
* where N is the number of bits in a intptr_t, and the value
* can be extracted efficiently by _PyLong_GetSingleDigitValue
* NOTE: Some values that could into (N-1) bit integer will return 0.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* NOTE: Some values that could into (N-1) bit integer will return 0.
* NOTE: Some values that could fit into (N-1) bit integer will return 0.

@encukou
Copy link
Member

encukou commented Feb 8, 2023

The _Py prefix marks these as internal. But they're intended for third-party projects :/

What does you will need to recompile for *every* release mean? Every patch version? If so, I don't think packaging tooling can handle that, did you check if that would be useful for Cython/Numpy?

@mdboom
Copy link
Contributor

mdboom commented Feb 8, 2023

I worked on some Python/C calling benchmarks last year. We could test those against a Cython hacked to use this API.

@scoder
Copy link
Contributor

scoder commented Feb 14, 2023

So, what we currently do in Cython is that we generate a switch statement for Py_SIZE() (depending on the compile time size of a C long, or the specific C integer type that we are converting to) and then just build the long from ob_digit using unrolled loops.

https://github.com/cython/cython/blob/1dba3d3b44ce942dafe4c77dec4c64def22c57e1/Cython/Utility/TypeConversion.c#L939-L1085

The pylong_join() helper function that generates the digit merge is here:

https://github.com/cython/cython/blob/1dba3d3b44ce942dafe4c77dec4c64def22c57e1/Cython/Utility/__init__.py

Note that we generate independent unpacking functions for different C integer/float sizes, so we'd need an inline function for int, long, ssize_t and the respective unsigned types as well, in order to cut down the switch statement at compile time.

I certainly agree that the single digit case is an important special case. However, we currently do a lot more in the cases where the target C integer (or float) size is substantially larger than the PyLong digit size. Switching to the proposed API would get us half the way, and we'd end up with using the official public API, this new inofficial API, and the bare struct access, depending on the compile time environment/config and runtime value. The advantage then seems to be that users could still benefit from fast single-digit access if they chose to opt out of the direct struct access.

If CPython doesn't want to go all the way of providing different inline conversion functions for different C target type sizes (as Cython does), then separating out the single-digit case still sounds like a reasonable option to provide to users, I think.

@markshannon
Copy link
Member Author

The concept of a "digit" is something we want to get away from in the API.

What matters is whether the Python int small enough to fit into a machine int (intptr_t), and whether false negatives are OK.

Given the highly skewed distribution of ints, false negatives to should be OK for larger numbers.
Most ints represent a number and are small (less than a billion), or represent some sort of code, for crypto etc, and won't fit in a machine int.

@scoder
Copy link
Contributor

scoder commented Feb 15, 2023

The concept of a "digit" is something we want to get away from in the API.

In the API or in the implementation? I'm asking because the digit's were never really part of the API but an implementation detail that CPython and Cython were both making good use of. As long as they remain a part of the implementation, I don't see why Cython shouldn't make use of them.

@markshannon
Copy link
Member Author

In the API or in the implementation?

Both. But it needs to be in the API first, or we won't be able to do so in the implementation.

I don't see why Cython shouldn't make use of them.

Because there is no advantage in doing so. It doesn't improve performance, it just makes maintenance harder.

@markshannon
Copy link
Member Author

I'm now inclined to make this part of the full API. With some way to opt in to the faster version.

Something like this:

/* Probably need a better name for this macro :) */
#ifdef Py_GIVE_ME_THE_FAST_VERSION_I_DONT_MIND_RECOMPILING_EVERY_RELEASE 
static inline int
PyLong_IsSingleDigit(PyLongObject* op) {
    assert(PyLong_Check(op));
    Py_ssize_t signed_size = op->long_value.ob_size;
    return -1 <= signed_size && signed_size <= 1;
}
#else
PyAPI_FUNC(int) PyLong_IsSingleDigit(PyLongObject* op);
#endif

@encukou does this sound like a reasonable approach?

@encukou
Copy link
Member

encukou commented Feb 20, 2023

What do you mean by "every release"?

  • if it's recompiling for every minor release (3.x.0) and alphas/betas, that's currently needed for the default API/ABI. No need to guard by a macro. (It might not be the best default, but changing it would be a big PEP.)
  • if it's really every release, including patch releases (3.x.y), wheel tags don't support that, so packages on PyPI wouldn't be able to use it. The feature would only be useful in things like a monorepo which includes CPython itself, and at that point it could just be private API.

@scoder
Copy link
Contributor

scoder commented Feb 20, 2023

if it's recompiling for every minor release (3.x.0) and alphas/betas, that's currently needed for the default API/ABI

Exactly. And that's the level of cross-version compatibility that CPython currently provides to its users and what Cython currently provides to its users (with the default C configuration). In theory, you could deprecate any non-stable API function tomorrow and remove it in CPython 3.13 or 3.14. Depending on which function you chose, that could end up being a blow to the existing ecosystem, but it wouldn't really be different from removing any other publicly exposed C function that is used outside of CPython itself.

Personally, I don't see much of an advantage of an additional private-but-exposed API. If it's exposed, and fills a purpose (which it probably does, because it exists), why can't others use it? As long as CPython sticks to adding and removing functions only in minor (3.x.0) releases, that's just like any other kind of functionality that a specific CPython 3.x release series supports. And keeping the C-API stable for the lifetime of a minor release series seems to be widely agreed on.

For a broader compatibility range, the current CPython offer is the limited API and the stable ABI, but that's a different beast and we're not discussing that here.

@markshannon
Copy link
Member Author

What do you mean by "every release"?

Every minor release.

@markshannon
Copy link
Member Author

markshannon commented Feb 20, 2023

It does need to be guarded by a macro, though. Otherwise you can't use it in a stable ABI build.

@encukou
Copy link
Member

encukou commented Feb 20, 2023

You'll want something like:

#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= 0x030C0000
PyAPI_FUNC(int) PyLong_IsSingleDigit(PyLongObject* op);
#endif

#ifndef Py_LIMITED_API
// (Instead of the #ifndef guard, you can put this in a Include/cpython/*.h file that's guarded as a whole)
static inline int
_PyLong_IsSingleDigit(PyLongObject* op) {
    assert(PyLong_Check(op));
    Py_ssize_t signed_size = op->long_value.ob_size;
    return -1 <= signed_size && signed_size <= 1;
}
#define PyLong_IsSingleDigit _PyLong_IsSingleDigit
#endif

and then in the .c,

#undef PyLong_IsSingleDigit

...
PyLong_IsSingleDigit(PyLongObject* op) {
   return _PyLong_IsSingleDigit(PyLongObject* op);
}

@encukou
Copy link
Member

encukou commented Feb 20, 2023

Note that libpython should provide the function, regardless of whether there's a faster C macro for it.

@markshannon
Copy link
Member Author

Sure.

However, if we provide the function regardless of whether or not Py_LIMITED_API is defined, then Py_LIMITED_API doesn't limit the API.
Rather, it seems that defining Py_LIMITED_API changes the ABI not the API. This seems confusing.

@encukou
Copy link
Member

encukou commented Feb 20, 2023 via email

@markshannon
Copy link
Member Author

But for users, one without the other doesn't really make much sense.

An unstable ABI with a stable API makes a lot of sense, to me at least.

The inline form of PyLong_IsSingleDigit depends on the layout of the int object, which will change in 3.12, so its ABI is unstable. The non-inline form is a published symbol and is stable.

The reason the API counts of stable, even though the ABI might not be, is that the unstable form can be chosen at build time without any change to the code, much as we do for Py_DEBUG.

@encukou
Copy link
Member

encukou commented Feb 20, 2023 via email

@markshannon markshannon requested a review from a team as a code owner February 21, 2023 11:28
@encukou
Copy link
Member

encukou commented Feb 21, 2023

Now, I don't think the functions with undefined behavior should be in the stable ABI. These also hardcode the fact that an efficient “digit” fits in intptr_t -- isn't that an implementation detail?
For most uses, PyLong_AsLong/PyLong_AsLongAndOverflow should be good enough, and it doesn't expose undefined behavior.

@zooba
Copy link
Member

zooba commented Feb 21, 2023

I don't see what this PR offers other than confusion for someone trying to read our API.

For starters, I don't think anyone who isn't used to CPython internals will assume the correct meaning for "digit" (and even most of us will need to double check whether it means a single element from our internal storage or... a digit). For small numbers, AsSsize_t ought to be perfectly fine.

If we want to skip the type check and just crash on a non-longobject, then we could add that API. But the size check isn't optional, so it may as well be encapsulated with extracting the value, as it currently is.

I don't see why it should get into the limited API. The point of that API set isn't to grow over time - it's to remain stable, so that extensions can build once and be compatible with many versions. Every time we add a function to it, we hurt that. In some cases, it's worth it, but I don't see why IsSingleDigit needs to be stable. Extensions that want compatibility will have to avoid it for ~5 years and use AsSsize_t anyway, and those that want speed should avoid the entire limited API.

I'm sympathetic to extension authors who want to extract values larger than ssize_t. Adding a (stable) API to copy the integer into a byte array provided by the caller would be fine, and ought to sufficiently hide our implementation details.

I have very little sympathy for extension authors who need to check whether the value fits into a ssize_t without actually getting the value as a ssize_t or an OverflowError (which ought to still be fairly cheap, right? As it never makes it back into Python code in this scenario.)

@markshannon
Copy link
Member Author

Now, I don't think the functions with undefined behavior should be in the stable ABI.

Almost all the API has undefined behavior if you don't fulfill the pre-conditions.
Requiring that you check PyLong_IsSingleDigit() before calling PyLong_GetSingleDigitValue() is no different to calling PyDict_Check() before calling PyDict_GetItem()

These also hardcode the fact that an efficient “digit” fits in intptr_t -- isn't that an implementation detail?

A large integer has to be made up of digits. So the existence of "digits" isn't really an implementation detail.
Their size and representation is. This API doesn't define what a digit is. All is says is that some set of values will return true from PyLong_IsSingleDigit() and that PyLong_GetSingleDigitValue() will give you its value.
There is the implicit promise that this is an efficient way to get that value.

Feel free free to suggest better names, but remember that "small" int is already in use for a very restricted set of values around 0.

@zooba

If we don't provide functions like these, then Cython, NumPy, MyPyC, etc will just use the C struct.
It will take them ages updating to 3.12, have latent bugs when they do, and generally make life worse for everyone.

If we want to skip the type check and just crash on a non-longobject, then we could add that API.

The functions take PyLongObject *. So, unlike much of our API the type is enforced by the C compiler.

@encukou
Copy link
Member

encukou commented Feb 21, 2023

Almost all the API has undefined behavior if you don't fulfill the pre-conditions.

Yes, but I believe that new additions to the limited API should be “nicer” than the status quo.
(If we disagree, I can formalize that and write up an design direction guidelines and run that by the SC, but that'd take time. I don't think this PR should be blocked on that.)

A large integer has to be made up of digits. So the existence of "digits" isn't really an implementation detail.
Their size and representation is. This API doesn't define what a digit is. All is says is that some set of values will return true from PyLong_IsSingleDigit() and that PyLong_GetSingleDigitValue() will give you its value.

How would you use this API?
Either you call PyLong_IsSingleDigit and then either PyLong_GetSingleDigitValue, or fall back to [PyLong_AsLongAndOverflow or an exception]. I don't see why this should be faster than PyLong_AsLongAndOverflow (or maybe a variant without the type check -- should we add that?) in the happy case.
Or you are confident that you don't need to call PyLong_AsLongAndOverflow -- but I don't see a way to get that confidence if you don't already know the value.

If we don't provide functions like these, then Cython, NumPy, MyPyC, etc will just use the C struct.

Those don't use the limited API. And it seems everyone who can afford to call a DLL function would be fine with PyLong_AsLong*.

@markshannon
Copy link
Member Author

Those don't use the limited API. And it seems everyone who can afford to call a DLL function would be fine with PyLong_AsLong*.

Which is probably true, and why I proposed Py_GIVE_ME_THE_FAST_VERSION_I_DONT_MIND_RECOMPILING_EVERY_RELEASE (or a more sensible name) first.

I thought you were saying that these functions should be part of the limited API. If we are to have CPython perform well and have Cython, etc be available during the alpha and beta phases, we need a low-level API.

@markshannon
Copy link
Member Author

How would you use this API?

https://github.com/python/cpython/pull/101685/files#diff-1a6e70e2beeecad88840c67284ac4d54a36998029244771fcc820e801390726aR621

Either you call PyLong_IsSingleDigit and then either PyLong_GetSingleDigitValue, or fall back to [PyLong_AsLongAndOverflow or an exception]. I don't see why this should be faster than PyLong_AsLongAndOverflow

It is considerably faster because PyLong_IsSingleDigit and PyLong_GetSingleDigitValue get inlined, and do very little work: a couple of memory reads, a multiply and comparison and no calls.

With the internals of longs hidden behind these functions, we can improve the layout and will only need a single memory read. faster-cpython/ideas#548

@encukou
Copy link
Member

encukou commented Feb 21, 2023

It is considerably faster because PyLong_IsSingleDigit and PyLong_GetSingleDigitValue get inlined, and do very little work: a couple of memory reads, a multiply and comparison and no calls.

At least the inlining should be possible for PyLong_AsLong as well :)

I thought you were saying that these functions should be part of the limited API. If we are to have CPython perform well and have Cython, etc be available during the alpha and beta phases, we need a low-level API.

Sorry, that was a misunderstanding :(
They need to be part of the limited API to be usable in a stable ABI build -- the stable ABI that should work with all future 3.x releases. The one that calls everything as functions. That's not what Cython or NumPy use. (Well, with Cython you can opt in to it, but it'll generate different code.)
For a low-level API, the regular C-API (no Py_LIMITED_API) is perfect. And it's easier to add things to it.

@markshannon
Copy link
Member Author

For a low-level API, the regular C-API (no Py_LIMITED_API) is perfect. And it's easier to add things to it.

"Perfect" is not the term I would use 🙂
The problem is that the "regular C-API" isn't really an API at all, but a grab bag of functions (which could form an OK-ish API) and structs that expose CPython internals, with no defined semantics.

What we need is an API that includes low-level API functions, but not structs and internal functions.
Ideally, we also want the ability to switch between portability and performance at compile time, without the proliferation of #ifdefs that plague both CPython and Cython.

@markshannon
Copy link
Member Author

That's comparing the inlined version to the non-inlined version. Not the same thing as comparing a single expensive call to two cheap ones.

Two successive functions calls don't need any more register spilling than a single call.

@zooba
Copy link
Member

zooba commented May 11, 2023

That's comparing the inlined version to the non-inlined version.

Because that's the change they want: for the fast path of PyLong_AsSsize_t to be inlined. Hence why I was proposing an inlineable wrapper that does the fast path if it can, but falls back to the function call if not (and marked "unstable" because now the caller has to recompile more often).

@wjakob
Copy link
Contributor

wjakob commented May 11, 2023

@zooba I would still prefer if those are two separate functions.

Suppose I am trying to fetch a 64 bit unsigned integer from Python on a 64 bit machine. Then the full-blown PyLong_AsSsize_t is actually the wrong function to call because it cannot return a result bigger than 2<<62.

Of course most values are likely to still be compact. So what I want to be able to do is to first try the fast path for compact values. If the number is compact and positive, the cast succeeds, and it fails if it is negative. In the rare case that the number is not compact, I can fall back to PyLong_AsLongLong (instead of PyLong_AsSsize_t).

If you bake together those functions in the CPython side, it removes that flexibility.

@zooba
Copy link
Member

zooba commented May 11, 2023

In the rare case that the number is not compact, I can fall back to PyLong_AsLongLong (instead of PyLong_AsSsize_t).

If you bake together those functions in the CPython side, it removes that flexibility.

It just means you're in the same place as you would be for numbers that are bigger than 2<<63, which is that you have to handle an overflow.

Besides, I gave up on my preferred API for this because nobody seemed to like it, and have simplified my discussion point down to an existing API that other people mentioned.

If I get over the argument-burnout to the point where I would actually code something up, you'd see. But everyone just wants to argue past everything I say, and I can't tell if it's me (in which case I'll bow out... again) or if it's something else (in which case I change my messaging... again... and things still don't go the way I'd like to see them go).

Best of luck!

@markshannon
Copy link
Member Author

Why is 2 << 63 important? Why not or 2 << 37 or 2 << 49?

ints are either numbers in the usual sense of being a count of something, and are clustered closely around zero, or
are some sort of mathematical device like a crypto key or bitset.
99.9...% (add some more nines) of ints are going to be less than 2 << N, it doesn't matter much whether N is 30, 40, 50 or 60.
Things like crypto keys or bitsets will be larger than 2 << N, but they will also be larger than 2 << 63, so won't fit into any C int anyway.

The point is that the ints in range 2 << N ... 2 << 63 are vanishingly rare, so we shouldn't care about them.

Given all that, we should choose the N in PyUnstable_Int_IsCompact() for speed. The actual value of N is unimportant to the user.

@wjakob
Copy link
Contributor

wjakob commented May 12, 2023

Why is 2 << 63 important? Why not or 2 << 37 or 2 << 49?

Put yourself into the shoes of the consumer of this API (binding project tools like pybind11, Cython, etc.). They must expose a C++ function taking a standard integer type (int, uint32_t, uint64_t, etc.) and perform a suitable conversion from Python. The specific API needed for this will depend on the C/C++ type being targeted. My example referencing the special case of 2<<63 was merely to explain why PyLong_AsSsize_t isn't good as a fallback for all case because it cannot convert the full range of 64 bit unsigned integers.

So what these tools will ideally do is to first try the fast path (if PyUnstable_Int_IsCompact is true), and then fallback to one of several other functions if that fails (PyLong_AsLong, PyLong_AsLongLong, PyLong_AsUnsignedLong, PyLong_AsUnsignedLongLong). Based on the function interface, the specific sequence is generated in source code form (Cython) or at compile time, using template metaprogramming (pybind11, nanobind).

My example was simply to point out why @zooba's suggestion of gluing PyUnstable_Int_IsCompact and PyUnstable_Int_IsCompact together into a one-size-fits-all API that internally also does a fallback via PyLong_AsSsize_t would not be sufficiently general.

All of this is completely unrelated to the question of small/compact integers, which are naturally much smaller than this threshold.

@zooba
Copy link
Member

zooba commented May 12, 2023

Would my proposal from #101685 (comment) be sufficiently general? (Slightly tweaked from the first time I posted it)

static inline int PyUnstable_Long_AsByteArray(PyLongObject* v,
    unsigned char* bytes, size_t n,
    int little_endian, int is_signed) {
    switch (n) {
        case 1:
        case 2:
        case 4:
        case 8:
            if (/* value won't fit in n bytes */)
                return -1;
            // read directly from v's internals
            return 0;
    }
    int r =_PyLong_AsByteArray(v, bytes, n, little_endian, is_signed);
    if (r < 0) PyErr_Clear(); // leave no errors set, or else we'd have to set them above too
    return r;
}

Bearing in mind that this is all in a header file and is always inlined, when n, little_endian and is_signed are literals, every optimising compiler in the world is going to only generate the code it needs. For n of 1, 2, 4 or 8, it will never generate a call to any actual function, and will always do it using internals directly.

Usage looks like:

int x;
if (PyUnstable_Long_AsByteArray(v, (unsigned char*)&x, sizeof(x), 1, 1) < 0) {
    // raise error
}

It only depends on unsigned char being 8 bits (and we're in a lot of trouble if that ever changes), and personally I'd even be okay with saying that the pointer needs to be aligned correctly for its size so that the fast path can just cast back to int* or whatever and write to it. Again, every optimising compiler will figure this out and generate decent code - you'd need volatile to get them to actually write through the pointer.

@scoder
Copy link
Contributor

scoder commented May 12, 2023 via email

@markshannon
Copy link
Member Author

@zooba
The devil is in the detail. The expression if (/* value won't fit in n bytes */) is expensive to compute if n is larger than the compact value, as it will be for the common case of n == sizeof(Py_ssize_t).

@encukou
Copy link
Member

encukou commented May 15, 2023

For unstable API, let's expose what CPython uses.
We can add API designed for users, but not to 3.12. As said in the OP, this is a stop-gap solution.

@zooba
Copy link
Member

zooba commented May 15, 2023

The expression if (/* value won't fit in n bytes */) is expensive to compute if n is larger than the compact value

It doesn't have to be a perfect calculation, though. We're only guaranteeing a fast path when it's fast, so if ssize_t is 64 bits but the fast path is only used for up to 60 bits, we're still meeting the promise. We just can't fail fast (unless it's 90 bits or whatever the next digit is), but the caller doesn't want us to fail at all, so they'll be fine if big-but-not-overflow numbers are slower than smaller numbers.

@encukou
Copy link
Member

encukou commented May 15, 2023

Ah, I missed that you used Int rather than Long for these.
If we're taking back the Int and Str namespaces, let's please first have a general discussion about that. (I don't think it's worth the inconsistency, FWIW.)

@encukou
Copy link
Member

encukou commented May 15, 2023

See faster-cpython#16 for what I have in mind.

@encukou
Copy link
Member

encukou commented May 18, 2023

I've added some tests to faster-cpython#16. Feel free to cherry-pick/adapt anything from there if it helps.
I'm off for the weekend, the beta 1 deadline is Monday.

@markshannon
Copy link
Member Author

Thanks @encukou for doing this properly

@markshannon
Copy link
Member Author

@zooba Any objections to me merging this?

@scoder
Copy link
Contributor

scoder commented May 21, 2023

Looks good to me, FWIW.

scoder added a commit to cython/cython that referenced this pull request May 21, 2023
@markshannon
Copy link
Member Author

OK, it looks like we have some sort of consensus that this is worth having.
Should @zooba come up with a broader solution that has as good performance, we can drop (or deprecate) this API.

@markshannon markshannon merged commit 9392379 into python:main May 21, 2023
@rwgk
Copy link

rwgk commented May 22, 2023

This PR breaks building pybind11. See below. Is this expected? What do you recommend we do?

https://github.com/pybind/pybind11/tree/d72ffb448c58b4ffb08b5ad629bc788646e2d59e

g++ (Debian 12.2.0-14) 12.2.0
g++ -o pybind11/tests/test_embed/catch.o -c -std=c++17 -O0 -g -Wall -Wextra -Wconversion -Wcast-qual -Wdeprecated -Wundef -Wnon-virtual-dtor -Wunused-result -Werror -DPYBIND11_STRICT_ASSERTS_CLASS_HOLDER_VS_TYPE_CASTER_MIX -DPYBIND11_ENABLE_TYPE_CASTER_ODR_GUARD_IF_AVAILABLE -Ipybind11/include -I/usr/local/google/home/rwgk/forked/pybind11/include -I/usr/local/google/home/rwgk/clone/pybind11/include -I/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12 -ICatch2/single_include/catch2 -I/usr/local/google/home/rwgk/forked/Catch2/single_include/catch2 -I/usr/local/google/home/rwgk/clone/Catch2/single_include/catch2 /usr/local/google/home/rwgk/forked/pybind11/tests/test_embed/catch.cpp
In file included from /usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/Python.h:35,
                 from /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../detail/common.h:266,
                 from /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/../attr.h:13,
                 from /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/class.h:12,
                 from /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:13,
                 from /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/embed.h:12,
                 from /usr/local/google/home/rwgk/forked/pybind11/tests/test_embed/catch.cpp:4:
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/cpython/longintrepr.h: In function ‘int _PyLong_IsCompact(const PyLongObject*)’:
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/pyport.h:24:31: error: cast from type ‘const PyLongObject*’ {aka ‘const _longobject*’} to type ‘PyObject*’ {aka ‘_object*’} casts away qualifiers [-Werror=cast-qual]
   24 | #define _Py_CAST(type, expr) ((type)(expr))
      |                               ^~~~~~~~~~~~
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/longobject.h:13:9: note: in expansion of macro ‘PyType_FastSubclass’
   13 |         PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LONG_SUBCLASS)
      |         ^~~~~~~~~~~~~~~~~~~
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/object.h:178:28: note: in expansion of macro ‘_Py_CAST’
  178 | #define _PyObject_CAST(op) _Py_CAST(PyObject*, (op))
      |                            ^~~~~~~~
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/object.h:207:31: note: in expansion of macro ‘_PyObject_CAST’
  207 | #  define Py_TYPE(ob) Py_TYPE(_PyObject_CAST(ob))
      |                               ^~~~~~~~~~~~~~
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/longobject.h:13:29: note: in expansion of macro ‘Py_TYPE’
   13 |         PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LONG_SUBCLASS)
      |                             ^~~~~~~
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/cpython/longintrepr.h:109:12: note: in expansion of macro PyLong_Check’
  109 |     assert(PyLong_Check(op));
      |            ^~~~~~~~~~~~
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/cpython/longintrepr.h: In function ‘Py_ssize_t _PyLong_CompactValue(const PyLongObject*)’:
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/pyport.h:24:31: error: cast from type ‘const PyLongObject*’ {aka ‘const _longobject*’} to type ‘PyObject*’ {aka ‘_object*’} casts away qualifiers [-Werror=cast-qual]
   24 | #define _Py_CAST(type, expr) ((type)(expr))
      |                               ^~~~~~~~~~~~
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/longobject.h:13:9: note: in expansion of macro ‘PyType_FastSubclass’
   13 |         PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LONG_SUBCLASS)
      |         ^~~~~~~~~~~~~~~~~~~
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/object.h:178:28: note: in expansion of macro ‘_Py_CAST’
  178 | #define _PyObject_CAST(op) _Py_CAST(PyObject*, (op))
      |                            ^~~~~~~~
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/object.h:207:31: note: in expansion of macro ‘_PyObject_CAST’
  207 | #  define Py_TYPE(ob) Py_TYPE(_PyObject_CAST(ob))
      |                               ^~~~~~~~~~~~~~
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/longobject.h:13:29: note: in expansion of macro ‘Py_TYPE’
   13 |         PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LONG_SUBCLASS)
      |                             ^~~~~~~
/usr/local/google/home/rwgk/usr_local_like/cpython_git_bisect/include/python3.12/cpython/longintrepr.h:118:12: note: in expansion of macro PyLong_Check’
  118 |     assert(PyLong_Check(op));
      |            ^~~~~~~~~~~~
cc1plus: all warnings being treated as errors

@encukou
Copy link
Member

encukou commented May 22, 2023

Ugh. Looks like 3.12 adds a bunch of functions that try to be const-correct, but fail to compile if you tell the compiler to actually enforce const-correctness (-Wcast-qual).

@scoder
Copy link
Contributor

scoder commented May 22, 2023

Looks like 3.12 adds a bunch of functions that try to be const-correct, but fail to compile if you tell the compiler to actually enforce const-correctness (-Wcast-qual).

Since it's probably difficult to change the existing cast/check macros to respect const, I think the quick way forward is to remove the const qualifiers from the function arguments for now.

Also note that it's only asserts that fails here, so they could be removed. But it seems sensible to have them, especially since these are user facing functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants