Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arithmetic Overflow on large HTTP response #13987

Closed
amini-allight opened this issue Nov 15, 2023 · 6 comments · Fixed by #13989
Closed

Arithmetic Overflow on large HTTP response #13987

amini-allight opened this issue Nov 15, 2023 · 6 comments · Fixed by #13989

Comments

@amini-allight
Copy link

Apologies if this has already been reported before, I wasn't able to find it.

I was using the following code to download a large file (1.1 GB).

require "http/client"

response = HTTP::Client.get "https://some.url.that.returns.a.large.file/"

Unfortunately I can't include the real URL because it's not publicly accessible but it's fairly easy to replicate. I got an Arithmetic Overflow exception with the following backtrace:

Unhandled exception: Arithmetic overflow (OverflowError)
  from /usr/lib/crystal/int.cr:464:31 in 'next_power_of_two'
  from /usr/lib/crystal/math/math.cr:742:5 in 'pw2ceil'
  from /usr/lib/crystal/string/builder.cr:46:26 in 'write'
  from /usr/lib/crystal/io.cr:568:11 in 'gets_to_end'
  from /usr/lib/crystal/http/client/response.cr:81:17 in 'consume_body_io'
  from /usr/lib/crystal/http/client/response.cr:105:11 in 'from_io?'
  from /usr/lib/crystal/http/client.cr:610:5 in 'exec_internal_single'
  from /usr/lib/crystal/http/client.cr:592:18 in 'exec_internal'
  from /usr/lib/crystal/http/client.cr:585:7 in 'exec'
  from /usr/lib/crystal/http/client.cr:721:5 in 'exec'
  from /usr/lib/crystal/http/client.cr:753:7 in 'exec'
  from /usr/lib/crystal/http/client.cr:410:3 in 'get'

Looking through the referenced code it appears the issue is HTTP::Client tries to resize the internal buffer it's using to store the response to the next largest power of 2. 1.1 GB is slightly more than 2³⁰ bytes so the next one up is 2³¹ or 2,147,483,648. That's 1 beyond the max of the signed 32 bit integer being used to store the size, resulting in an overflow and crash.

Is this intended behavior? Is there any way to switch Crystal to use 64 bit integers for memory/container sizes or any plans to change the default? 2³⁰ bytes isn't that high of a limit for some applications.

This occurred on Crystal 1.10.1.

@Blacksmoke16
Copy link
Member

FWIW in this case you'd prob be better off streaming the file instead of loading it all into memory. Something like:

File.open "large.file", "w" do |file|
  HTTP::Client.get "https://some.url.that.returns.a.large.file/" do |resp|
    IO.copy resp.body_io, file
  end
end

@amini-allight
Copy link
Author

That's a good idea, I was trying to avoid that for speed reasons but I might have to just do it and pair it with a ramfs to compensate

@straight-shoota
Copy link
Member

As a workaround without storing to disk, you could also copy the file into a byte buffer as long as you ensure it's big enough.

io = IO::Memory.new(Int32::MAX) # or chose smaller, fitting size if you know it in advance
HTTP::Client.get "https://some.url.that.returns.a.large.file/" do |resp|
  IO.copy resp.body_io, io
end

Maybe you don't need all data in memory though and you can just start consuming it on the go?

@straight-shoota
Copy link
Member

This is a very interesting problem with different layers.

I think the generic, dynamically growing byte buffer should clamp to fit the maximum size (i.e. Int32::MAX) and only fail if that is definitely not enough space. The current strategy to grow capacity to the next power of two wastes quite a lot of the theoretically avilable capacity. 1 << 31 is out of range, but (1 << 31) - 1 would fit. Yet the maximum allocatable size is actually 1 << 30.

Another issue is specific to the HTTP client: Not sure if it's the case with this particular HTTP resource, but it would be quite likely for the HTTP response to contain a Content-Length header. The client could use that information to allocate an aptly sized byte buffer from the start, without the need for dynamic growth.

@amini-allight
Copy link
Author

As a workaround without storing to disk, you could also copy the file into a byte buffer as long as you ensure it's big enough.

io = IO::Memory.new(Int32::MAX) # or chose smaller, fitting size if you know it in advance
HTTP::Client.get "https://some.url.that.returns.a.large.file/" do |resp|
  IO.copy resp.body_io, io
end

Maybe you don't need all data in memory though and you can just start consuming it on the go?

This seems to be working for me. Although when using the content length (which I do know in advance) to initialize the IO::Memory object I need to add one additional byte or the same resize crash still occurs. I'm not sure why this is. I assume the buffer must be resizing on fill rather than resizing on the first attempt to write past the end but the math around /usr/lib/crystal/io/memory.cr:96 doesn't seem to support this idea.

@straight-shoota
Copy link
Member

Not sure what exactly is happing there, but if a String::Builder is involved, I have a pretty good idea. It appends an byte for the final null character that every Crystal string has for compatibility with C strings. But it doesn't properly take this additional size requirement into account when allocating.
So basically if you allocate for the body size and copy all the bytes into the IO, it's full (as you would expect). Adding a final null character is another write_byte(0) call which then needs to allocate more memory and boom.
This should be fixed by #13989. Maybe you could try that patch with your sized workaround to see if it's still an issue?

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants