Support 32-bit Architectures / Replace usize with u64 #5351

Closed
Tracked by #6818
CarlKCarlK opened this issue Jan 30, 2024 · 7 comments · Fixed by #6961
Labels
enhancement Any new improvement worthy of a entry in the changelog

Comments

@CarlKCarlK

[First, sorry for the flurry of issues and thank you for your responsiveness. Second, I apologize that this issue will be vague and without a repro case.]

Rust's file seek for local files uses u64, not usize. This allows even a 32-bit OS to access regions of files beyond 4GB.

object_store's get_range and many related methods use usize. This works fine on a 64-bit OS, but on a 32-bit OS (including WASM32) using HTTP, I think it limits one to the first 4GB of any file.
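
A minimal sketch of the contrast described above, not taken from object_store itself. The helper names `read_at` and `fits_in_32_bit_usize` are hypothetical, and `u32` stands in for a 32-bit `usize` so that the check behaves the same on any host:

```rust
use std::io::{Read, Seek, SeekFrom};

/// Read `len` bytes starting at a 64-bit `offset` from any seekable reader.
/// `SeekFrom::Start` takes a `u64`, so the offset is independent of the
/// platform's pointer width. (Hypothetical helper, for illustration only.)
pub fn read_at<R: Read + Seek>(
    reader: &mut R,
    offset: u64,
    len: usize,
) -> std::io::Result<Vec<u8>> {
    reader.seek(SeekFrom::Start(offset))?;
    let mut buf = vec![0u8; len];
    reader.read_exact(&mut buf)?;
    Ok(buf)
}

/// Returns true if `offset` could be stored in a 32-bit `usize`.
/// Assumption: `u32` models `usize` on a 32-bit target such as wasm32.
pub fn fits_in_32_bit_usize(offset: u64) -> bool {
    u32::try_from(offset).is_ok()
}
```

Any offset past `u32::MAX` is a valid argument to `read_at`, but cannot be expressed as a range bound once the API takes a 32-bit `usize`.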

Possible fixes:

  • Limit object_store to 64-bit systems
  • Change the range related methods to u64.

Thanks,
Carl

@CarlKCarlK CarlKCarlK added the bug label Jan 30, 2024
@tustvold
Contributor

tustvold commented Jan 31, 2024

I think changing to use u64 would make sense as part of a broader initiative to support wasm32. However, given the crate currently doesn't support anything other than in memory for wasm32, I think it would be a pretty tough sell given the downstream impact of such a change. Not to mention quite hard to test.

@tustvold tustvold added enhancement Any new improvement worthy of a entry in the changelog and removed bug labels Jan 31, 2024
@alamb alamb changed the title Can object_store on 32-bit systems read ranges in 4GB+ files? Can object_store on 32-bit systems read ranges in 4GB+ files? (Should we use u64 vs usize) Jul 26, 2024
@flokli

flokli commented Aug 22, 2024

I ran into this today as well. The axum-range crate returns a Range that uses u64 for its bounds, and because get_ranges in object_store only accepts usize, the conversion is a bit uglier than necessary.
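
The kind of conversion this implies can be sketched as follows. `narrow_range` is a hypothetical helper, not part of axum-range or object_store; it is generic over the target integer type so the failure case can be shown with `u32` on any host:

```rust
use std::ops::Range;

/// Convert a `Range<u64>` into a range over a narrower integer type,
/// returning `None` if either bound does not fit.
/// (Hypothetical helper: with `T = usize` this is the checked conversion
/// a caller needs before handing the range to a usize-based API.)
pub fn narrow_range<T: TryFrom<u64>>(range: Range<u64>) -> Option<Range<T>> {
    let start = T::try_from(range.start).ok()?;
    let end = T::try_from(range.end).ok()?;
    Some(start..end)
}
```

With `T = usize` on a 64-bit target the conversion always succeeds; with a 32-bit target type, any bound above `u32::MAX` yields `None`, and the caller has to decide how to report it.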

@tustvold tustvold changed the title Can object_store on 32-bit systems read ranges in 4GB+ files? (Should we use u64 vs usize) Support WASM32 Nov 29, 2024
@tustvold tustvold changed the title Support WASM32 Support 32-bit Architectures / Replace usize with u64 Nov 29, 2024
@XiangpengHao
Contributor

> doesn't support anything other than in memory for wasm32

Now with OpenDAL's awesome object_store_opendal, it can support many other backends, including http, on wasm32. (cc @Xuanwo who might be interested)

I think it's worthwhile to improve wasm32 support at this point.

@tustvold
Contributor

tustvold commented Jan 9, 2025

The next release is a major one (#6903), so now is the time if anyone wants to pick this up.

@alamb
Contributor

alamb commented Jan 10, 2025

I understand the problem theoretically, but can someone tell me what doesn't work practically?

I ask as I was trying to verify @XiangpengHao's nice PR to make this change:

But I found I could run the tests / build for object_store on wasm32 just fine 🤔

@XiangpengHao
Contributor

XiangpengHao commented Jan 10, 2025

> what doesn't work practically?

My use case is that when trying to load hits.parquet (15GB) into the browser-based parquet-viewer, it fails to read the correct file range, because the file size is larger than what usize on wasm32 (essentially u32) can represent. To allow ranges beyond 4GB, we need to use u64 instead of usize to represent ranges.

For example: range 0xffff_ffff_ffff_0000 - 0xffff_ffff_ffff_ffff will essentially become 0xffff_0000 - 0xffff_ffff on wasm32
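
The truncation in this example can be reproduced on any host with a small sketch. `truncate_to_32_bits` is a hypothetical helper, and the `as u32` cast is an assumption standing in for `as usize` on wasm32, where `usize` is 32 bits wide:

```rust
/// Keep only the low 32 bits of a 64-bit offset, as an `as usize` cast
/// would on a 32-bit target. (Hypothetical helper; `u32` models the
/// 32-bit `usize`.)
pub fn truncate_to_32_bits(offset: u64) -> u32 {
    // `as` casts between integer types silently drop the high bits.
    offset as u32
}
```

Both bounds of the range lose their upper 32 bits, so the request silently targets a different region of the file instead of failing.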

> tests / build for object_store on wasm32 just fine

I think this is because we don't have test cases for large ranges.

@alamb
Contributor

alamb commented Jan 11, 2025

> For example: range 0xffff_ffff_ffff_0000 - 0xffff_ffff_ffff_ffff will essentially become 0xffff_0000 - 0xffff_ffff on wasm32

Got it -- makes sense. Thank you
