-
-
Notifications
You must be signed in to change notification settings - Fork 30.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-111965: Using critical sections to make io.StringIO
thread safe.
#112116
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aisk.
I think we want critical sections for stringio_closed
and stringio_closed
for the accesses to closed
because the object might be concurrent closed. You'll need to be careful because the CHECK_
macros may return
and it's not safe to return from the middle of a critical section. It may be easiest to split those functions in two.
Similarly for stringio_newlines
we want the critical section to cover at least the CHECK_INITIALIZED
and CHECK_CLOSED
.
Some basic performance measurements would not be a bad idea, but don't go overboard. They'll be more important after other changes to I/O objects.
We don't need critical sections in |
Updated, and made a small performance test on my local machine, and not seen any noticeable performance difference before and after the commits. I just realized that in the default build (no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aisk. What's the performance impact on --disable-gil
builds using your benchmark?
I suggested a different structure for the three getter functions that I think will make them more consistent with the rest of the file (in particular, not having to inline CHECK_INITIALIZED
/CHECK_CLOSED
).
I found that it's really common for the getters and setters to add the critical section guard, and makes the work repeatability and error-prone. Can we add some code generation process, like the argument clinic stuff, or just introduce a C macro to reduce the repeatability, like:
|
@aisk, Yeah I think adding support for getters/setters to Argument Clinic is the way to do it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks great!
@corona10, would you please review this as well?
Modules/_io/stringio.c
Outdated
@@ -1,6 +1,7 @@ | |||
#include "Python.h" | |||
#include <stddef.h> // offsetof() | |||
#include "pycore_object.h" | |||
#include "pycore_critical_section.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto: #112193 (comment)
Modules/_io/stringio.c
Outdated
|
||
state: object | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
state: object | |
state: object | |
/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@colesbury
Out of curiosity:
Should user explicitly care about thread-safe when implementing dunder methods of object for each case?
Or Will it be handled from interpreter side?
Modules/_io/clinic/stringio.c.h
Outdated
"\n"); | ||
|
||
#define _IO_STRINGIO___SETSTATE___METHODDEF \ | ||
{"__setstate__", _PyCFunction_CAST(_io_StringIO___setstate__), METH_FASTCALL|METH_KEYWORDS, _io_StringIO___setstate____doc__}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__setstate__
should be METH_O
not METH_FASTCALL|METH_KEYWORDS
See: https://github.com/python/cpython/pull/112116/files#r1398121900
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realized the difference, thanks a lot for point out
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Modules/_io/stringio.c
Outdated
#include "pycore_object.h" | ||
#include "pycore_critical_section.h" // Py_BEGIN_CRITICAL_SECTION() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we update AC tool, we don't have to add a header manually from now on.
#112251
Please rebase the PR and run make clinic
one more time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
Co-authored-by: Donghee Na <donghee.na@python.org>
da13ca2
to
0fbf608
Compare
Added the "critical sections" tags for most methods of the
StringIO
class, except the__new__
and__init__
. The reference implementation from colesbury/nogil-3.12@6323ca60f9 dosen't do it also. I think it's because with the__new__
and__init__
process, other threads will not touch the newly created instance, so it does not need lock.And for the getter closed and linebuffering, there are only a few read-only field access for conditioning (and will raise exception if failed), so it's not need to protect.
I don't know if I'm right for these two questions. And do we need some performance benchmark for the change? If so, I can provide some help later.
--disable-gil
builds) #111965