Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix String::Buffer and IO::Memory capacity to grow beyond 1GB #13989

Merged

Conversation

straight-shoota
Copy link
Member

@straight-shoota straight-shoota commented Nov 16, 2023

The byte buffer for IO::Memory and String::Buffer grows exponentially by the next power of two bigger then the currently requested size. So for example, if space is requested for 78 bytes, it allocates for a capacity of 128 bytes (1 << 7).

The maximum size of a buffer is Int32::MAX, which is just one below a power of tw ((1 << 31) - 1). So if space is requested for more than 1 << 30 bytes, the next power of two would be 1 << 31 which is out of range and raises ArithmeticError.
This means, the sizes between (1 << 30) + 1 and (2 << 31) - 1 would all be valid but cannot be reached by the dynamically growing buffer. If the actually required bytesize does not exeed (2 << 31) - 1, it should be possible to allocate that much.

This patch clamps the maximum cacpacity to Int32::MAX. That enables sizes in that range and there will only be an error if that is definitely not enough space.

Resolves #13987

@straight-shoota
Copy link
Member Author

straight-shoota commented Nov 18, 2023

I have no idea what's going on in wasm32 CI (https://github.com/crystal-lang/crystal/actions/runs/6895836942/job/18764260201?pr=13989). The error seems definitely triggered by the changes in this PR (more precisely d227e8b). But I don't see any direct connection to the failing code, which is in some time function in the spec framework (which runs after the example is completed).
No other platform shows any memory issues and I've double checked the code, so I'm quite certain the changes are legit.
Maybe it triggers a bug in wasm?

Error: failed to run main module `wasm32_std_spec.wasm`

Caused by:
    0: failed to invoke command default
    1: In func wasi_snapshot_preview1::clock_time_get at convert Clockid: Int conversion error: TryFromIntError(())
       wasm backtrace:
           0: 0xbe35d8 - <unknown>!__wasi_clock_time_get
           1: 0xbe3231 - <unknown>!__clock_gettime
           2: 0x633ce5 - <unknown>!*Crystal::System::Time::monotonic:Tuple(Int64, Int32)
           3: 0x5808aa - <unknown>!*Time::monotonic:Time::Span
           4: 0x6d1ebf - <unknown>!*Spec::Example#internal_run<Time::Span, Proc(Nil)>:(Array(Spec::Result) | Nil)
           5: 0x6d1c6c - <unknown>!*Spec::Example#run:Nil
           6: 0x6d065c - <unknown>!*Spec::ExampleGroup@Spec::Context#internal_run:Nil
           7: 0x6d0302 - <unknown>!*Spec::ExampleGroup#run:Nil
           8: 0x63edf4 - <unknown>!*Spec::RootContext@Spec::Context#internal_run:Nil
           9: 0x63eb08 - <unknown>!*Spec::RootContext#run:Nil
          10: 0xccbfb - <unknown>!~procProc(Int32, (Exception | Nil), Nil)@src/spec/dsl.cr:208
          11: 0x628fa0 - <unknown>!*Crystal::AtExitHandlers::run<Int32, (Exception+ | Nil)>:Int32
          12: 0xbe0092 - <unknown>!*Crystal::exit<Int32, (Exception+ | Nil)>:Int32
          13: 0xbe0033 - <unknown>!*Crystal::main<Int32, Pointer(Pointer(UInt8))>:Int32
          14: 0xcbb78 - <unknown>!main
          15: 0xcbbfa - <unknown>!main
          16: 0xbe3522 - <unknown>!__main_void
          17: 0xcbba9 - <unknown>!_start

@straight-shoota straight-shoota force-pushed the fix/grow-byte-buffer-max branch 3 times, most recently from 1c97581 to abc9a02 Compare November 24, 2023 10:26
@straight-shoota
Copy link
Member Author

This needs further investigation into the wasm runtime.
I'm skipping the >1 GB spec for String::Builder on wasm32 for now.

@straight-shoota straight-shoota marked this pull request as ready for review November 24, 2023 10:46
@ysbaddaden
Copy link
Contributor

ysbaddaden commented Dec 15, 2023

About the WASM32 issue: Strings are always terminated with a null byte (\0), so shouldn't the maximum size be Int32::MAX - 1? In fact the String class is special and it actually allocates bytesize + HEADER_SIZE + 1 (for the null byte), so maybe the maximum capacity in String::Builder should be Int32::MAX - String::HEADER_SIZE - 1?

@straight-shoota
Copy link
Member Author

Yeah, that's correct. This extra capacity for the final null byte should be taken care of in all capactity calculations, though. String::Builder#initialize(capacity) already adds String::HEADER_SIZE + 1 to the capacity parameter, so maximum capacity should correctly be Int32::MAX because all extra requirements are already counted in.
And the spec works correctly in compiled mode building a string with a length of Int32::MAX - String::HEADER_SIZE - 1 bytes.

@ysbaddaden
Copy link
Contributor

The initial buffer allocation accounts for HEADER_SIZE + 1 but the reallocations don't, though you likely account for it in the bytesize when you check for IO::EOFError.

BTW: a String bytesize can be exactly Int32::MAX + 1 for the null byte:

slice = Slice(UInt8).new(Int32::MAX)
slice[-1] = 97

str = String.new(slice.to_unsafe, Int32::MAX)
p str[-1] # => 'a' (97)

I tried to make String::Builder always reallocate to bytesize + header size + 1, but I get crashes.

@ysbaddaden
Copy link
Contributor

I reproduce the wasm32 error on my local with your branch. I wrote a patch that always allocates bytesize + header size + 1 and the error is gone. That really sounds like a buffer overflow.

I'm trying to apply it on your branch. So far I'm having spec errors because I can allocate up to Int32::MAX bytesize (yes, it's accepted in String.new for example) when your branch only allowed up to Int32::MAX - 1 (to account for the null byte in the bytesize).

@ysbaddaden
Copy link
Contributor

ysbaddaden commented Dec 17, 2023

The spec passes until we try to build the actual string object, which doesn't allocate anything, but merely writes a few things on the buffer (the trailing null byte and the string header).

The spec still succeeds, but calling Time.monotonic in src/spec/example.cr:51 will always crash wasmtime. I tried to skip the eventual writes to the buffer, or doing anything, but it still crashes on the same call to Time.mononotic 😭

EDIT: I tried wasmtime v15.0.1, just in case, but I get the same result.

@straight-shoota
Copy link
Member Author

straight-shoota commented Jan 2, 2024

@ysbaddaden Do you think this we can merge this?

@ysbaddaden
Copy link
Contributor

Yes, the problem isn't your code change, but allocating that much memory in wasm seems to trigger an issue with the wasm libs.

The only detail related to the patch is that we can allocate one more byte (see above comment) but what's 1 byte compared to 1GB 😂

# range (1 << 30) < new_bytesize < Int32::MAX
return Int32::MAX if new_bytesize > 1 << 30

Math.pw2ceil(new_bytesize)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated, but I wonder if we shouldn't change the formula like we did for array

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the converse is also true; that we should cap the max of array too

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe time for a Growable module...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=> #14173

@straight-shoota straight-shoota added this to the 1.11.0 milestone Jan 4, 2024
@straight-shoota straight-shoota merged commit b8a24e7 into crystal-lang:master Jan 5, 2024
54 of 55 checks passed
@straight-shoota straight-shoota deleted the fix/grow-byte-buffer-max branch January 5, 2024 11:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Arithmetic Overflow on large HTTP response
3 participants