Traces with size > 2048MB overflow #2

Open
MacroPower opened this issue May 28, 2021 · 1 comment
Labels
bug Something isn't working

Comments

@MacroPower

I was testing some things with a ~3GB list and noticed that my profile was showing a negative amount of memory allocated. I was also sometimes getting an OverflowError at:

scale = 1.0 / (1.0 - math.exp(-avg_size / self.sample_rate))
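One plausible way this line can raise is if avg_size has already gone negative: the exponent then becomes large and positive, and math.exp raises OverflowError once its argument passes roughly 709. The sketch below isolates that expression. The function name sampling_scale and the sample rate of 512 KiB are assumptions made for illustration, not values taken from the project.

```python
import math

# Hypothetical sample rate (bytes between samples); an assumption for this sketch.
SAMPLE_RATE = 512 * 1024


def sampling_scale(avg_size: float, sample_rate: float) -> float:
    """Standalone copy of the expression quoted in the issue."""
    return 1.0 / (1.0 - math.exp(-avg_size / sample_rate))


# A large positive size is harmless: exp() underflows to 0.0 and the scale is 1.0.
positive_scale = sampling_scale(3 * 1024**3, SAMPLE_RATE)

# A size that has wrapped negative makes the exponent +2048, which exp() rejects.
try:
    sampling_scale(-(1024**3), SAMPLE_RATE)
    negative_raised = False
except OverflowError:
    negative_raised = True
```

If this reading is right, the OverflowError is a symptom of the negative size rather than a separate bug.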

I took a look at tcmalloc and it explains the problem and just clamps:

// Very large values of interval overflow ssize_t. If we happen to
// hit such improbable condition, we simply cheat and clamp interval
// to largest supported value.
return static_cast<ssize_t>(std::min<double>(interval, MAX_SSIZE));

However, that clamp is applied to a double, whereas the 2048MB threshold makes me think it's a 32-bit int overflowing here, not a double. Regardless, I can do this at __init__.py:479 and __init__.py:501:

size, trace_traceback = trace
if size < 0:
    size = 2147483647

And that "fixes" the problem. I think you could even do size = 4294967296 + size (that is, add 2^32) to recover the real value and raise the max size before it wraps around again. But at the very least I'm thinking size should really be a double, because that's much more reasonable to clamp given how rarely a size would actually be that large.
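To make the wraparound arithmetic concrete, here is a small sketch that assumes the size passes through a signed 32-bit integer somewhere, which the thread has not confirmed. The helper names wrap_to_int32 and unwrap_once are hypothetical. A value that wrapped exactly once differs from the true size by 2**32, so adding 2**32 to a negative reading recovers it.

```python
INT32_MAX = 2**31 - 1   # 2147483647, the clamp value used in the workaround
UINT32_RANGE = 2**32    # distance between a wrapped value and the true one


def wrap_to_int32(size: int) -> int:
    """Simulate storing a size in a signed 32-bit integer (two's complement)."""
    return (size + 2**31) % UINT32_RANGE - 2**31


def unwrap_once(size: int) -> int:
    """Undo a single wraparound; negative readings are shifted up by 2**32."""
    return size + UINT32_RANGE if size < 0 else size


true_size = 3 * 1024**3               # ~3GB, as in the report
wrapped = wrap_to_int32(true_size)    # negative after wrapping
recovered = unwrap_once(wrapped)      # equal to true_size again

# Limitation: sizes of 4GB or more wrap back to a positive value,
# so the sign check cannot detect them.
wrapped_5gb = wrap_to_int32(5 * 1024**3)
```

This only extends the usable range from 2GB to 4GB; widening the integer type at the source is what removes the limit.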

I can take a shot doing that but have never touched C before so any help/guidance would be appreciated.

timpalpant added the bug label Jul 4, 2021
timpalpant self-assigned this Jul 4, 2021
@timpalpant
Owner

Hi @MacroPower, do you happen to have a minimal example that reproduces this issue? Also, what OS and architecture are you running on (e.g. x86 or ARM, 32-bit or 64-bit)? It does look like there is an undesirable overflow here. Here are a few places where I could conceivably see it happening:

  1. When intercepting a Python calloc call, we multiply size_t(nelem) * size_t(elsize), which could overflow. This looks to be present in Python's built-in tracemalloc as well, so we could try to repro with it to help narrow down the issue.
  2. When returning results to Python, we may need to use a different format here.
  3. When aggregating result statistics, we could be overflowing, although I think that Python usually extends integer sizes automatically.
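For the first point, a minimal check against Python's built-in tracemalloc might look like the sketch below. The helper name traced_size_of_allocation is hypothetical, and the use of bytes(n) as the large allocation is an assumption; the original report used a list. It returns how many bytes tracemalloc attributes to a single allocation.

```python
import tracemalloc


def traced_size_of_allocation(n_bytes: int) -> int:
    """Allocate one block of n_bytes and return the growth tracemalloc reports."""
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        block = bytes(n_bytes)  # one large zero-filled allocation
        after, _peak = tracemalloc.get_traced_memory()
        del block
    finally:
        tracemalloc.stop()
    return after - before


# Small sanity check that does not need gigabytes of RAM.
one_mib = 1024 * 1024
reported = traced_size_of_allocation(one_mib)
```

Calling traced_size_of_allocation(3 * 1024**3) on the affected machine and checking whether the result is negative or far below 3GB would show whether the built-in tracer has the same problem.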

timpalpant added a commit that referenced this issue Jul 4, 2021
2 participants