Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyType_GetModuleByDef family for binary functions performance #117578

Closed
neonene opened this issue Apr 6, 2024 · 9 comments
Closed

PyType_GetModuleByDef family for binary functions performance #117578

neonene opened this issue Apr 6, 2024 · 9 comments
Labels
performance Performance or resource usage topic-C-API type-feature A feature request or enhancement

Comments

@neonene
Copy link
Contributor

neonene commented Apr 6, 2024

Feature or enhancement

Proposal:

When implementing an extension module with a module state enabled,
binary tp slot functions (e.g. nb_add) need two PyType_GetModuleByDef() to be used like the following to compare the given types with a heap-type object in the module state, which can be slow:

static PyObject *
foo_add(PyObject *left, PyObject *right)
{
    ...
    PyObject *module = PyType_GetModuleByDef(Py_TYPE(left), &module_def);
    if (module == NULL) {
        PyErr_Clear();
        module = PyType_GetModuleByDef(Py_TYPE(right), &module_def);
    }
    ...
}
  • 3.13.0 alpha5 _decimal (module state ver.)
from timeit import timeit
f = lambda s: timeit(s, 'import _decimal; d = _decimal.Decimal(1)')
a = f('d + 1')
b = f('1 + d')
print('d + 1:', a)
print('1 + d:', b,  b / a)
d + 1: 0.11202071857405826
1 + d: 0.49533294743326095 4.421797625818569

The difference mainly comes from a TypeError emission from PyType_GetModuleByDef(), so it would be nice if we could have a new function which receives two types and emits an error after checking them.

cc @encukou

Related issue:

Linked PRs

@neonene neonene added the type-feature A feature request or enhancement label Apr 6, 2024
@aisk aisk added the performance Performance or resource usage label Apr 6, 2024
@encukou
Copy link
Member

encukou commented Apr 8, 2024

Make it private at first, then perhaps expose it in another issue.

@gvanrossum
Copy link
Member

@neonene Are you mainly interested in fixing the specific case for _decimal? Other stdlib modules? 3rd party modules? Limited API? I agree with @encukou that if you want this mainly for _decimal, a private function would be best, and we should start there.

@neonene
Copy link
Contributor Author

neonene commented Apr 8, 2024

It seems that extension modules such as _decimal and _ctypes cannot use any private extern functions, so my proposal (a private function) might be related to built-in modules. I'm thinking I should take back if not so beneficial.

@gvanrossum
Copy link
Member

It seems that extension modules such as _decimal and _ctypes cannot use any private extern functions

Isn't this a recent side effect caused by a desire to use only the Limited API? @vstinner should know more. I think the slowdown you've observed is rather significant, so maybe we need to make an exception that restriction (if that's what's causing your observation).

In modules not restricted to the Limited API, private APIs can be used as long as they use the PyAPI_FUNC() macro.

@neonene
Copy link
Contributor Author

neonene commented Apr 9, 2024

Tested _decimal with #117661, where the private function was temporarily exposed:

d + 1: 0.1184351708673473
1 + d: 0.1197695422950866 1.0112666821685414

@vstinner
Copy link
Member

vstinner commented Apr 9, 2024

Make it private at first, then perhaps expose it in another issue.

Yep, I agree. Also, there is maybe no need to expose it in public right now :-)

It seems that extension modules such as _decimal and _ctypes cannot use any private extern functions

Python has multiple stdlib extensions built as shared extensions: you should export internal functions using PyAPI_FUNC() for them.

Example in Include/internal/pycore_object.h:

// Export for 'math' shared extension
PyAPI_FUNC(PyObject*) _PyObject_LookupSpecial(PyObject *, PyObject *);

Isn't this a recent side effect caused by a desire to use only the Limited API?

_decimal and _ctypes are not built with the limited C API. If tomorrow there is a great advantage to use the internal C API in a stdlib extension built with the limited C API, just remove the few lines around Py_LIMITED_API and that's it.

@neonene
Copy link
Contributor Author

neonene commented Apr 9, 2024

Thanks. I've been very confused around the PyAPI_FUNC() macro.

encukou added a commit that referenced this issue Apr 25, 2024
)


Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Co-authored-by: Petr Viktorin <encukou@gmail.com>
@encukou encukou closed this as completed Apr 25, 2024
@picnixz
Copy link
Contributor

picnixz commented Aug 17, 2024

Re-opening until #123100 is merged (just for tracking purposes).

@neonene
Copy link
Contributor Author

neonene commented Sep 26, 2024

Superseded by PyType_GetBaseByToken(): d7248cd

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage topic-C-API type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

7 participants