sys.getsizeof wrong for Py3k bool objects #47940

Closed
schuppenies mannequin opened this issue Aug 26, 2008 · 9 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@schuppenies
Mannequin

schuppenies mannequin commented Aug 26, 2008

BPO 3690
Nosy @loewis, @mdickinson
Files
  • bool_sizeof.patch: Patch against py3k branch, revision 66040
  • smallints_sizeof.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2010-03-19.09:20:41.529>
    created_at = <Date 2008-08-26.20:52:02.419>
    labels = ['interpreter-core', 'type-bug']
    title = 'sys.getsizeof wrong for Py3k bool objects'
    updated_at = <Date 2010-03-19.09:20:41.528>
    user = 'https://bugs.python.org/schuppenies'

    bugs.python.org fields:

    activity = <Date 2010-03-19.09:20:41.528>
    actor = 'loewis'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-03-19.09:20:41.529>
    closer = 'loewis'
    components = ['Interpreter Core']
    creation = <Date 2008-08-26.20:52:02.419>
    creator = 'schuppenies'
    dependencies = []
    files = ['11264', '11568']
    hgrepos = []
    issue_num = 3690
    keywords = ['patch']
    message_count = 9.0
    messages = ['71996', '72742', '73208', '73228', '73250', '73252', '73634', '101310', '101313']
    nosy_count = 4.0
    nosy_names = ['loewis', 'mark.dickinson', 'schuppenies', 'Alexander.Belopolsky']
    pr_nums = []
    priority = 'normal'
    resolution = 'wont fix'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue3690'
    versions = ['Python 3.0', 'Python 3.1']

    @schuppenies
    Mannequin Author

    schuppenies mannequin commented Aug 26, 2008

    sys.getsizeof returns wrong results for bool objects in Python 3000.
    Although bool objects use the same datatype as long objects, they are
    allocated differently. Thus, the inherited long_sizeof implementation is
    incorrect. The applied patch addresses this issue.
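For readers who want to see the numbers under discussion, here is a minimal sketch. It only collects what `sys.getsizeof` reports on the running interpreter; the values depend on the CPython version and platform, so none are quoted here. The helper name `reported_sizes` is hypothetical and not part of the attached patch.

```python
import sys


def reported_sizes():
    """Return what sys.getsizeof reports for the two bools and for 0 and 1."""
    # bool is a subclass of int, so unless the interpreter overrides it,
    # a bool answers with the size computation inherited from int.
    return {repr(obj): sys.getsizeof(obj) for obj in (False, True, 0, 1)}


if __name__ == "__main__":
    for name, size in reported_sizes().items():
        print(f"{name:>5}: {size} bytes")
```

Comparing the `True`/`False` rows with the `0`/`1` rows shows whether the bool singletons are reported with the same formula as ordinary integers.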

    @schuppenies schuppenies mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Aug 26, 2008
    @loewis
    Mannequin

    loewis mannequin commented Sep 7, 2008

    I'm not sure this is a bug. sys.getsizeof doesn't take padding in the
    malloc implementation into account either, so a long object that
    amounts to 22 bytes (such as the number 1) also uses at least 24
    bytes. In any case, I also think this doesn't matter much either way.

    @schuppenies
    Mannequin Author

    schuppenies mannequin commented Sep 14, 2008

    As I understood the long object allocation, it is implemented as
    "PyObject_MALLOC(sizeof(PyVarObject) + size*sizeof(digit))" to avoid
    this allocation of 2 extra bytes. So from my understanding, the number 0
    allocates memory for the reference count, type, and ob_size, whereas any
    other number allocates this plus the additional memory required by the
    number of digits.

    Looking at bool objects in Py3k, aren't they fixed-size memory-wise,
    always allocating the padded size of _longobject?

    > In any case, I also think this doesn't matter much either way.

    Why do you think so?
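The allocation formula quoted above implies that the reported size grows by one `sizeof(digit)` for each extra internal digit. The following is a rough sketch of how that could be checked from Python, assuming a CPython build that exposes `sys.int_info`. The helper names `digit_count` and `size_step` are made up for illustration, and the sketch deliberately avoids the number 0, whose treatment is the point in dispute.

```python
import sys


def digit_count(n):
    """Internal digits needed for abs(n), assuming CPython's digit width."""
    bits = sys.int_info.bits_per_digit
    return (abs(n).bit_length() + bits - 1) // bits


def size_step(n_digits):
    """Growth in reported size when going from n_digits to n_digits + 1."""
    bits = sys.int_info.bits_per_digit
    smaller = 1 << (bits * (n_digits - 1))  # needs exactly n_digits digits
    larger = 1 << (bits * n_digits)         # needs exactly n_digits + 1 digits
    return sys.getsizeof(larger) - sys.getsizeof(smaller)


if __name__ == "__main__":
    print("sizeof(digit):", sys.int_info.sizeof_digit)
    for k in (1, 2, 3):
        print(f"{k} -> {k + 1} digits adds {size_step(k)} bytes")
```

If each step equals `sys.int_info.sizeof_digit`, the reported size follows the header-plus-digits formula for non-zero values.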

    @loewis
    Mannequin

    loewis mannequin commented Sep 14, 2008

    > In any case, I also think this doesn't matter much either way.
    Why do you think so?

    What's the actual difference that this change makes? At most 8
    bytes per object, right? And for two objects in total. So if somebody
    would compute memory consumption, they might be off by not more
    than 14 bytes, in total. Compared to all the other errors that memory
    computation makes (e.g. malloc headers, rounding up to multiples of
    8 in obmalloc), which aren't accounted for in sys.getsizeof, this
    difference is negligible.

    What's more, the small_ints aren't dynamically allocated, either,
    but instead, each small_int takes a complete PyLongObject. If
    that was also considered in long_sizeof, the computation would happen
    to be completely correct for bool also.
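The rounding mentioned in this thread (22 requested bytes occupying at least 24, and obmalloc rounding up to multiples of 8) can be illustrated with a small sketch. This is only the arithmetic, not the allocator: the function name `round_up` is hypothetical, the default alignment of 8 is the figure given in the thread and may differ on other builds, and malloc headers are not modeled at all.

```python
def round_up(nbytes, alignment=8):
    """Round nbytes up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Adding alignment - 1 and masking off the low bits rounds upward.
    return (nbytes + alignment - 1) & ~(alignment - 1)


if __name__ == "__main__":
    for requested in (22, 24, 25):
        print(requested, "->", round_up(requested))
```

Under that assumption, a request of 22 bytes lands in a 24-byte slot, which is the kind of discrepancy that `sys.getsizeof` does not report.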

    @schuppenies
    Mannequin Author

    schuppenies mannequin commented Sep 15, 2008

    > What's the actual difference that this change makes?

    It would provide more accurate results, even in the light of being not
    perfect.

    > [..] each small_int takes a complete PyLongObject. If that was also
    > considered in long_sizeof, the computation would happen to be
    > completely correct for bool also.

    So how should this bug report be handled? Provide a patch to handle
    getsizeof correctly for small_ints? 'wont fix' because there are issues
    anyway? I would prefer the former and try to come up with a patch if you
    think it is worthwhile.

    @loewis
    Mannequin

    loewis mannequin commented Sep 15, 2008

    > So how should this bug report be handled? Provide a patch to handle
    > getsizeof correctly for small_ints? 'wont fix' because there are issues
    > anyway? I would prefer the former and try to come up with a patch if you
    > think it is worthwhile.

    Fixing it for small_ints would be fine with me - there is specialized
    code for long sizeof already. It's the explosion of boolobject that I
    dislike.

    @schuppenies
    Mannequin Author

    schuppenies mannequin commented Sep 23, 2008

    Attached is a patch which takes the preallocation of small_ints into
    account. What do you think?

    @mdickinson
    Member

    I don't think there's anything worth fixing here. It's true that getsizeof is sometimes going to return results that are too small, because there are a good few places in the longobject internals where it's not predictable in advance exactly how much space is needed, so memory is overallocated.

    The case of the small int 0 is one example of this, but it's far from the only one. For example, if you multiply a 2-limb long by another 2-limb long the code will always allocate 4 limbs for the result, even though it'll often turn out that the result fits in 3 limbs. Should sys.getsizeof return base_size + 4 * sizeof_limb in that case, instead of base_size + 3 * sizeof_limb? That would be difficult to achieve, since long objects don't currently know how much space was actually allocated to hold them.
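The 2-limb multiplication example can be made concrete with a short sketch. It only counts how many limbs a value needs, derived from `bit_length` and `sys.int_info.bits_per_digit` on a CPython build; it cannot observe how many limbs were actually allocated, which is exactly the information the comment says long objects do not keep. The names `limbs` and `product_limbs` are hypothetical.

```python
import sys

BITS = sys.int_info.bits_per_digit


def limbs(n):
    """Number of limbs (internal digits) needed to hold abs(n)."""
    return (abs(n).bit_length() + BITS - 1) // BITS


def product_limbs(a, b):
    """Return (limbs of a, limbs of b, limbs needed by a * b)."""
    return limbs(a), limbs(b), limbs(a * b)


if __name__ == "__main__":
    small = 1 << BITS               # smallest 2-limb value
    large = (1 << (2 * BITS)) - 1   # largest 2-limb value
    print("small * small:", product_limbs(small, small))
    print("large * large:", product_limbs(large, large))
    print("reported sizes:",
          sys.getsizeof(small * small), sys.getsizeof(large * large))
```

Both operands have two limbs in each case, yet one product needs three limbs and the other four, so a size computed from the digit count alone differs between them regardless of what the multiplication routine reserved up front.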

    @loewis
    Mannequin

    loewis mannequin commented Mar 19, 2010

    Closing this as "won't fix", then.
