Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a Module Def Slot for Supporting Multiple Interpreters #104108

Closed
ericsnowcurrently opened this issue May 2, 2023 · 12 comments
Closed

Add a Module Def Slot for Supporting Multiple Interpreters #104108

ericsnowcurrently opened this issue May 2, 2023 · 12 comments
Labels
3.12 bugs and security fixes docs Documentation in the Doc dir interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters type-feature A feature request or enhancement

Comments

@ericsnowcurrently
Copy link
Member

ericsnowcurrently commented May 2, 2023

PEP 489 is clear that extension modules implementing multi-phase init are expected to support use in multiple interpreters. However, there are two situations where that mandate isn't sufficient:

  1. PEP 684 introduces a per-interpreter GIL, where there is an additional thread-safety constraint beyond support for multiple interpreters
  2. the HPy project implies modules with multi-phase init but not necessarily supporting multiple interpreters

In both cases a new module def slot (the same one, in fact) is a correct solution.

For per-interpreter GIL, PEP 684 specifies that we must add a module def slot for opting in to supporting per-interpreter GIL.

For HPy, the situation was pointed out to me by @hodgestar during a conversation at PyCon.

CC @encukou


I propose the following solution:

  1. add a new Py_mod_multiple_interpreters module def slot
  2. interpret the value to indicate support for multiple interpreters and for per-interpreter GIL
  3. call _PyImport_CheckSubinterpIncompatibleExtensionAllowed() when appropriate in PyModule_FromDefAndSpec2()

The slot value may be one of the following:

  • 0 - does not support multiple interpreters (for HPy)
  • 1 - supports multiple interpreters (the default)
  • 2 - supports per-interpreter GIL

It would probably make sense to define a constant (macro) for each of those.

Linked PRs

@ericsnowcurrently ericsnowcurrently added type-feature A feature request or enhancement interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters 3.12 bugs and security fixes labels May 2, 2023
@ericsnowcurrently
Copy link
Member Author

@ericsnowcurrently
Copy link
Member Author

related: gh-98627

@hodgestar
Copy link

Thanks @ericsnowcurrently. This looks great.

To clarify for those who weren't present present in our discussion, all HPy modules use multi-phase init so without this there is no way for a module in HPy to specify whether it supports multiple interpreters or not.

@encukou
Copy link
Member

encukou commented May 3, 2023

Sounds reasonable, but let's get the exact meaning of 0 down.

When a module supports multiple interpreters, it usually also supports clean tear-down, repeated initialization, and multiple module instances loaded in a single interpreter. (The module isotlation howto has more Background info.) For Py_mod_multiple_interpreters=0, we'll need to explicitly say what we expect/guarantee.

Will it really be “can only be loaded in one interpreter” (to check it, CPython will need something like a process-wide mapping of PyModuleDef to the interpreter in which the module can be loaded), or rather “can only be loaded once per process” (CPython would need something like a process-wide set of PyModuleDef)?
Or should the slot have more values to choose from? HPy might need something else than Cython or than a least-effort port of NumPy.

@encukou
Copy link
Member

encukou commented May 3, 2023

(If it turns out that each project needs slightly different semantics, they could also do the opt-out themselves -- though maybe CPython could provide better ways to do different checks.)

@ericsnowcurrently
Copy link
Member Author

When a module supports multiple interpreters, it usually also supports clean tear-down, repeated initialization, and multiple module instances loaded in a single interpreter. (The module isotlation howto has more Background info.) For Py_mod_multiple_interpreters=0, we'll need to explicitly say what we expect/guarantee.

Good point. I was certainly thinking of PEP 630 when talking about "supports multiple interpreters". The 0 value would match what the HPy folks are interested in.

I'll make sure the docs are clear about the meaning.

Will it really be “can only be loaded in one interpreter” (to check it, CPython will need something like a process-wide mapping of PyModuleDef to the interpreter in which the module can be loaded), or rather “can only be loaded once per process” (CPython would need something like a process-wide set of PyModuleDef)?

I was planning to start with "can only be loaded in the main interpreter" since it's simpler. However, it should probably be "can only be loaded once in the process", since we don't know how well the module may support being loaded more than once. I suppose that "once" could technically be in any interpreter, but I don't see a big advantage to supporting that immediately.

As to knowing if it was imported already, we can make use of _PyRuntime.imports.extensions.dict. We could probably also take advantage of the fact that we never dlclose() the extension module's file.

Or should the slot have more values to choose from? HPy might need something else than Cython or than a least-effort port of NumPy.

I'd like to keep things focused for now. We can circle back to a more detailed slot if the need arises. I have some ideas on that.

@encukou
Copy link
Member

encukou commented May 4, 2023

OK, “once in the process, and only in the main interpreter” makes sense. If it's easy to implement just “once per process” I'd go with that, so we don't have another place where the concept of “main interpreter” leaks to user code.

“Once per process” is safe/restrictive enough for modules that use C statics, and useful for many use cases -- especially the “MVP” (just starting out, will read up on details later) case. So adding it to C API makes sense.
It's probably not exactly what HPy needs: HPyGlobal is per-interpreter, so HPY probably really wants “once per interpreter”. But that's very specific to HPyGlobal.

I'd like to keep things focused for now. We can circle back to a more detailed slot if the need arises. I have some ideas on that.

I do too: I'd like to make it easy for people to do their own checking when they outgrow Py_mod_multiple_interpreters=0 but don't reach full isolation. IMO, HPy should do that too, and we shouldn't say Py_mod_multiple_interpreters=0 is “what HPy needs”.

@steve-s
Copy link
Contributor

steve-s commented May 4, 2023

Chiming in to clarify some of the HPy specifics:

tl;dr: I believe that HPy has a reasonable workaround, so the issue is not as pressing as we originally thought, but I personally think that explicit API with good docs is better than implicit assumptions inferred from (non) presence of multi-phase init.

It's probably not exactly what HPy needs: HPyGlobal is per-interpreter, so HPy probably really wants “once per interpreter”. But that's very specific to HPyGlobal

This is indeed very HPy specific. The main motivation for requesting the flag was not HPyGlobal, but as @hodgestar wrote the fact that HPy only supports multi-phase init. For the interested: the interplay of HPyGlobal and loading of one module twice in one interpreter is discussed in hpyproject/hpy#431.

The main motivation for HPy was that we'd like to have a way to say: although this extension uses multiphase init, please treat it like single phase init. Main use case for that that I see is migration of legacy extensions: the user may not be ready or willing yet to asses all the implications of what it means to support multiple interpreters, but they still want to migrate to HPy for other reasons. So for that they want to stick to the original semantics (whatever it was). Maybe one can think of similar situation on CPython? Can multi-phase init be useful on its own without subinterpreters support?

There are few possible workarounds for HPy:

  1. cumbersome IMO: HPy would emulate its multiphase init on top of single phase init. Some features of multiphase init will not be available in that case.
  2. as @encukou pointed out we can implement a runtime check that would fail if such module is loaded for second time. I did not think about this before and it sounds like a reasonable workaround to me.

In HPy we still plan to have something like the flags you introduce here. In general, I think that some explicit flags that tell the interpreter what expectations the extension has is a good thing from usability and documentation point of view. Such flags may not only be related to subinterpreters, but in the future to nogil, JIT compilation, ... Maybe they should be a bitset to avoid potential explosion and allow fine grained configuration, or just multiple "flag slots" (adding few things to module spec should not really make significant memory footprint increase).

@ericsnowcurrently
Copy link
Member Author

Such flags may not only be related to subinterpreters, but in the future to nogil, JIT compilation, ... Maybe they should be a bitset to avoid potential explosion and allow fine grained configuration, or just multiple "flag slots" (adding few things to module spec should not really make significant memory footprint increase).

+1

FWIW, I took a stab at something like that yesterday for this issue, but backed off when it started to get more complex than I can justify at the moment. 🙂

@encukou
Copy link
Member

encukou commented May 4, 2023

Maybe they should be a bitset to avoid potential explosion and allow fine grained configuration

IMO, a bitset would itself be an explosion. I wouldn't want to encourage people to check for the different aspects of isolation individually. And I wouldn't want to write tests to check CPython does the right checks.

I think the 3 levels, plus the ability to fail at runtime in more nuanced cases, are good.

ericsnowcurrently added a commit that referenced this issue May 5, 2023
…04148)

I'll be adding a value to indicate support for per-interpreter GIL in gh-99114.
@ericsnowcurrently
Copy link
Member Author

ericsnowcurrently commented May 5, 2023

FYI, I've merged my PR but I'm open to further tweaks.

I'll also get back to adding docs in the next few weeks.

carljm added a commit to carljm/cpython that referenced this issue May 5, 2023
* main:
  pythongh-99113: Add PyInterpreterConfig.own_gil (pythongh-104204)
  pythongh-104146: Remove unused var 'parser_body_declarations' from clinic.py (python#104214)
  pythongh-99113: Add Py_MOD_PER_INTERPRETER_GIL_SUPPORTED (pythongh-104205)
  pythongh-104108: Add the Py_mod_multiple_interpreters Module Def Slot (pythongh-104148)
  pythongh-99113: Share the GIL via PyInterpreterState.ceval.gil (pythongh-104203)
  pythonGH-100479: Add `pathlib.PurePath.with_segments()` (pythonGH-103975)
  pythongh-69152: Add _proxy_response_headers attribute to HTTPConnection (python#26152)
  pythongh-103533: Use PEP 669 APIs for cprofile (pythonGH-103534)
  pythonGH-96803: Add three C-API functions to make _PyInterpreterFrame less opaque for users of PEP 523. (pythonGH-96849)
jbower-fb pushed a commit to jbower-fb/cpython-jbowerfb that referenced this issue May 8, 2023
…pythongh-104148)

I'll be adding a value to indicate support for per-interpreter GIL in pythongh-99114.
@erlend-aasland erlend-aasland added the docs Documentation in the Doc dir label Jan 5, 2024
@Eclips4
Copy link
Member

Eclips4 commented Jan 5, 2024

Docs was added in #107403, so It's done :)

@Eclips4 Eclips4 closed this as completed Jan 5, 2024
@github-project-automation github-project-automation bot moved this from Todo to Done in Subinterpreters Jan 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes docs Documentation in the Doc dir interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-subinterpreters type-feature A feature request or enhancement
Projects
Status: Done
Development

No branches or pull requests

6 participants