Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port syslog module to use module state #99127

Closed
corona10 opened this issue Nov 5, 2022 · 10 comments
Closed

Port syslog module to use module state #99127

corona10 opened this issue Nov 5, 2022 · 10 comments
Assignees
Labels
topic-subinterpreters type-feature A feature request or enhancement

Comments

@corona10
Copy link
Member

corona10 commented Nov 5, 2022

While I check the @ericsnowcurrently 's checklist: https://github.com/ericsnowcurrently/multi-core-python/wiki/0-The-Plan
I found that the syslog module still uses the global variable for the following variables.

  • S_ident_o
  • S_log_open

Both variables are declared as global variables since the original author assumes that in only one instance, only one syslog will exist.(one process one interpreter)

So I think that it's okay to move them into the module state if we consider the per-interpreter model.

@corona10 corona10 added the type-feature A feature request or enhancement label Nov 5, 2022
@corona10 corona10 self-assigned this Nov 5, 2022
corona10 added a commit to corona10/cpython that referenced this issue Nov 5, 2022
corona10 added a commit to corona10/cpython that referenced this issue Nov 5, 2022
corona10 added a commit to corona10/cpython that referenced this issue Nov 5, 2022
@corona10
Copy link
Member Author

corona10 commented Nov 5, 2022

Hmm, I think that I miss understood about syslog, I close ths issue as invalid.

@corona10 corona10 closed this as completed Nov 5, 2022
@ronaldoussoren
Copy link
Contributor

I've added some review comments to the PR, it seems to be on right path with some minor issues because the syslog extension changes process global state in the syslog library. Those issues IMHO can be easily fixed.

@corona10 corona10 reopened this Nov 5, 2022
@corona10
Copy link
Member Author

corona10 commented Nov 5, 2022

I've added some review comments to the PR, it seems to be on right path with some minor issues because the syslog extension changes process global state in the syslog library. Those issues IMHO can be easily fixed.

Thanks I will take a look.

@ericsnowcurrently
Copy link
Member

Note that currently the syslog module already has the following characteristics:

  • an app that embeds CPython may call openlog() independently of any use of the syslog module, and the two may interfere with each other
  • two interpreters using the syslog module may interfere with each other in the same way

We can't do much about the embedding case because the posix syslog API doesn't provide a way to get the current configuration (if any) such that you could restore it.

For multiple interpreters, either we leave the current behavior or we try to make each interpreter independent. To do that, we have two options:

  • swap in each interpreter's configuration around each syslog.syslog() call
    • call openlog() before and closelog() after
    • call setlogmask() before and after
    • syslog.openlog() and syslog.setlogmask() would only store the values on the module state
  • do similar, but be smarter about the single-interpreter case and cases where swapping is not necessary (i.e. same syslog config)

The "smarter" approach would require that the syslog module keep process-global state, along with a lock to protect that state. That would require something similar to what is discussed in https://discuss.python.org/t/20668 (and https://discuss.python.org/t/20663).


For reference:

@ericsnowcurrently
Copy link
Member

FWIW, for an implementation of mutli-phase init for a module, I would expect it to either deal with the sort of weird behavior syslog would have, or disallow use in multiple interpreters (at least in problematic cases).

@encukou
Copy link
Member

encukou commented Nov 8, 2022

We can't do much about the embedding case because the posix syslog API doesn't provide a way to get the current configuration (if any) such that you could restore it.

This is a typical example of a library with process-wide state. It's meant to be used by “the” application, rather than “a” library. I don't think Python's stdlib should go out of this way to adapt the API to multiple libraries that aren't aware of each other. (On the other hand, Python's C API provide the tools to allow third-parties to do that.)

So I'd do straightforward locking that's useful for applications, but if misused (used from a library) you get errors rather than unintended behavior. Specifically:

Calls to openlog, closelog and setlogmask can only be done in one module. After one of them is called, all will fail when called from a different module object. The restriction is reset when the module is garbage-collected.
(Typically, you only get different module objects in different interpreters – unless you do del sys.modules['syslog'], which is documented to have weird effects. So, user-facing docs can explain things in terms of interpreters rather than module objects.)

That can always be relaxed if we find “obviously good” semantics later. With more complicated solutions like swapping the state around, we could paint ourselves in a corner. Unlike a third-party library, we can't easily drop compatibility and allow people to pin <2.0.

(Anyway, even for this “simple” locking, to support multiple GILs we'd need a process-global lock.)

@ericsnowcurrently
Copy link
Member

This is a typical example of a library with process-wide state. It's meant to be used by “the” application, rather than “a” library.

My position is that process-wide state belongs to the main interpreter. That's where the app lives.

I don't think Python's stdlib should go out of this way to adapt the API to multiple libraries that aren't aware of each other.

+1

With more complicated solutions like swapping the state around, we could paint ourselves in a corner.

Agreed.

Furthermore, now that I think about it, the simplest solution is to disallow syslog.openlog() and syslog.closelog() (and syslog.setlogmask()) in subinterpreters (ergo, only allowed in the main interpreter). Then there's no need for process-wide state or a global lock. As you said, "that can always be relaxed if we find “obviously good” semantics later."

@corona10
Copy link
Member Author

@ericsnowcurrently
Do we have a public API to check whether the current interpreter is main interpreter?
Currently, we can check the status if the module is a core module but syslog module is not core module, so such APIs will be needed for 3rd party authors.

@ericsnowcurrently
Copy link
Member

Do we have a public API to check whether the current interpreter is main interpreter?

We have PyInterpreterState_Main(), so you can do current_interp == PyInterpreterState_Main(). We also have an internal _Py_IsMainInterpreter(interp), in case that would be worth making public.

@erlend-aasland
Copy link
Contributor

We might need to do a similar trick for the readline module.

carljm added a commit to carljm/cpython that referenced this issue Dec 1, 2022
* main: (112 commits)
  pythongh-99894: Ensure the local names don't collide with the test file in traceback suggestion error checking (python#99895)
  pythongh-99612: Fix PyUnicode_DecodeUTF8Stateful() for ASCII-only data (pythonGH-99613)
  Doc: Add summary line to isolation_level & autocommit sqlite3.connect params (python#99917)
  pythonGH-98906 ```re``` module: ```search() vs. match()``` section should mention ```fullmatch()``` (pythonGH-98916)
  pythongh-89189: More compact range iterator (pythonGH-27986)
  bpo-47220: Document the optional callback parameter of weakref.WeakMethod (pythonGH-25491)
  pythonGH-99905: Fix output of misses in summarize_stats.py execution counts (pythonGH-99906)
  pythongh-99845: PEP 670: Convert PyObject macros to functions (python#99850)
  pythongh-99845: Use size_t type in __sizeof__() methods (python#99846)
  pythonGH-99877)
  Fix typo in exception message in `multiprocessing.pool` (python#99900)
  pythongh-87092: move all localsplus preparation into separate function called from assembler stage (pythonGH-99869)
  pythongh-99891: Fix infinite recursion in the tokenizer when showing warnings (pythonGH-99893)
  pythongh-99824: Document that sqlite3.connect implicitly open a transaction if autocommit=False (python#99825)
  pythonGH-81057: remove static state from suggestions.c (python#99411)
  Improve zip64 limit error message (python#95892)
  pythongh-98253: Break potential reference cycles in external code worsened by typing.py lru_cache (python#98591)
  pythongh-99127: Allow some features of syslog to the main interpreter only (pythongh-99128)
  pythongh-82836: fix private network check (python#97733)
  Docs: improve accuracy of socketserver reference (python#24767)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-subinterpreters type-feature A feature request or enhancement
Projects
Status: Done
Development

No branches or pull requests

5 participants