-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
__sizeof__
incorrectly ignores the size 1 array in PyVarObjects (bool, int, tuple) when ob_size is 0
#100637
Comments
@corona10 The issue focuses on
|
Line 184 in 98308db
This code will explain why the intention is the same as int. |
@corona10 Yes, Also, |
Yeah, since the semantics of |
We could fix this for What we can't do is fix it by messing with the value of the The other thing we can't do is fix this for general There are other aspects that
Can you elaborate on what you're doing here? This may not be a task that |
See also #47940 for previous discussion. |
Given that people have been discovering and re-filing issues about this for well over a decade, maybe it's worth a remark in the source? Say, in |
This was essentially what I was going for. I agree changes on ob_size would be impossible.
This is true, but on the other hand, is there a reason 0, False, and empty tuples are even allocated and use a 1 length array? The code paths I've seen never touch the item array when ob_size is 0. So it seems it doesn't need to exist at all? C99 supports declaring the respective structs with digit ob_digit[]; |
+1 on this as well, and also regarding the Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results |
For |
Regarding tuples: it doesn't seem worth worrying too much about the empty tuple, since it's a singleton, so we're only taking about an extra 8 bytes per interpreter instance. For tuple subclasses, I'm not seeing the behaviour you describe. On main, after instrumenting calls to |
This leads me to think that the ob_digit[1] of [0] is not considered just an "extra allocation" but actually used by the int / bool structs. Such it seems that this should definitely be reported in Despite int(0) being cached, user subclasses are still affected by the class UserInt(int):
...
zero = 0
uzero = UserInt()
print(zero.__sizeof__())
>> 24
print(uzero.__sizeof__())
>> 24 In addition, since 24 falls within a 8 byte allocation boundary, users are mislead to think the allocated size is 24 when it is actually 32 due to the requested being 28 |
Would you be interested in working on a PR for this? |
Just curious: Is that new or common? I thought it's 16 (at least usually), so 24 or 28 wouldn't make a difference, both actually costing 32. |
Sorry, yes; I think you're right: cpython/Include/internal/pycore_obmalloc.h Line 132 in e83f88a
|
Following up on this, @ionite34 It looks to me as though tuples should be out of scope for this issue. If there's behaviour that you think should be changed for tuple or tuple subclasses, I'd suggest opening a new issue for that, so that it can be dealt with separately. |
Yes, sure. Though it'd be my first contribution, not too sure about the process but if I'm reading the guide correctly I can just make the changes on a branch and do a PR? And so the current scope of a fix would be changing Implementation-wise I think it would be simplest just to take |
Yep. If you ping me when it's done, I'll review.
Yes, preferably along with a unit test or two. (I haven't checked the current tests, but I'd hope that making the suggested change would trigger a test failure, so that it's obvious which part of the tests need to be updated.)
SGTM (with |
Last word on tuples: it turns out that there is an under-reporting issue for |
…lement ob_digit array for 0 and False (#100663) Fixes behaviour where int (and subtypes like bool) __sizeof__ under-reports true size as it did not take into account the size 1 `ob_digit` array for the zero int. Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
…he 1 element ob_digit array for 0 and False (pythonGH-100663) Fixes behaviour where int (and subtypes like bool) __sizeof__ under-reports true size as it did not take into account the size 1 `ob_digit` array for the zero int. (cherry picked from commit d7e7f79) Co-authored-by: Ionite <dev@ionite.io> Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
…the 1 element ob_digit array for 0 and False (GH-100663) (#100717) gh-100637: Fix int and bool __sizeof__ calculation to include the 1 element ob_digit array for 0 and False (GH-100663) Fixes behaviour where int (and subtypes like bool) __sizeof__ under-reports true size as it did not take into account the size 1 `ob_digit` array for the zero int. (cherry picked from commit d7e7f79) Co-authored-by: Ionite <dev@ionite.io> Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Bug report
tuple.__sizeof__
,int.__sizeof__
, andbool.__sizeof__
incorrectly returns a size using an algorithm that assumes theob_item
/ob_digit
array is size 0 whenob_size
is 0, when this is not true.This result of
__sizeof__
suggests that int(0) has anob_digit
array size of 0, and thus 4 less bytes (considering array dtype of c_uint32). However, this is not correct.https://github.com/python/cpython/blob/3.11/Include/cpython/longintrepr.h#L79-L82
Code paths for the creation of
int(0)
and other PyVarObject types all initialize with an array of size 1. Such the struct ofint(0)
holds a c_uint32 array of size 1, and element [0] of 0.The result
24
of(0).__sizeof__()
suggests that this array does not exist for sizing purposes. This seems to be misleading for performance calculations on memory size, and creates unsafe scenarios when moving memory.Implementations of
__sizeof__
(int) (bool also inherits this)
https://github.com/python/cpython/blob/main/Objects/longobject.c#L5876-L5884
(tuple)
https://github.com/python/cpython/blob/main/Objects/typeobject.c#L5929-L5942
Your environment
Linked PRs
The text was updated successfully, but these errors were encountered: