high peak memory consumption of WaveletMatrix::from_ints #44

Closed
KonradHoeffner opened this issue Nov 21, 2022 · 7 comments · Fixed by #45

@KonradHoeffner

Building a 205 MB wavelet matrix using WaveletMatrix::from_ints temporarily consumes 3.9 GB of memory, which may be too much for small VMs.
Is there a way to reduce memory consumption while constructing a wavelet matrix?

[Image: Screenshot from 2022-11-21 10-02-05]

@kampersanda
Owner

kampersanda commented Nov 22, 2022

@KonradHoeffner

In v0.5.0, there is no way to reduce the memory consumption. The high memory usage arises because WaveletMatrixBuilder always keeps the input values in a Vec<usize>. Using sucds::CompactVector instead of Vec<usize> would reduce memory usage when the input alphabet is small.

The large memory consumption is inconvenient, so we'll improve the implementation in v0.5.1. Thank you for the report!
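To illustrate why a fixed-width packed vector helps, here is a minimal sketch in plain Rust. `PackedInts` is a hypothetical type written only for this illustration; it is not the sucds::CompactVector implementation, and its layout is an assumption. It stores each value in `width` bits instead of a full 64-bit slot.

```rust
/// Hypothetical fixed-width packed integer vector (illustration only,
/// NOT the sucds implementation).
struct PackedInts {
    words: Vec<u64>, // backing storage, 64 bits per word
    width: usize,    // bits per element, in 1..=64
    len: usize,      // number of stored elements
}

impl PackedInts {
    fn new(width: usize) -> Self {
        assert!((1..=64).contains(&width), "width must be in 1..=64");
        Self { words: Vec::new(), width, len: 0 }
    }

    /// Appends `val`; panics if it does not fit in `width` bits.
    fn push(&mut self, val: u64) {
        assert!(self.width == 64 || val >> self.width == 0, "value does not fit");
        let bit = self.len * self.width;
        let (word, shift) = (bit / 64, bit % 64);
        if word >= self.words.len() {
            self.words.push(0);
        }
        self.words[word] |= val << shift;
        // The value straddles a word boundary: spill the high bits.
        if shift + self.width > 64 {
            self.words.push(val >> (64 - shift));
        }
        self.len += 1;
    }

    /// Returns the `i`-th value.
    fn get(&self, i: usize) -> u64 {
        assert!(i < self.len, "index out of bounds");
        let bit = i * self.width;
        let (word, shift) = (bit / 64, bit % 64);
        let mut v = self.words[word] >> shift;
        if shift + self.width > 64 {
            v |= self.words[word + 1] << (64 - shift);
        }
        if self.width < 64 { v & ((1u64 << self.width) - 1) } else { v }
    }

    /// Bytes used by the packed payload (excluding Vec overhead).
    fn size_in_bytes(&self) -> usize {
        self.words.len() * 8
    }
}
```

With 1,000 values of 10 bits each, this sketch uses 157 words (1,256 bytes), whereas 1,000 64-bit slots would occupy 8,000 bytes.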

@kampersanda
Owner

@hirosassa I'd like to do this. No problem? Or, have you already started coding?

@hirosassa
Collaborator

@kampersanda Not yet. Please go ahead!

@kampersanda kampersanda linked a pull request Nov 23, 2022 that will close this issue
@kampersanda kampersanda self-assigned this Nov 23, 2022
@kampersanda
Owner

kampersanda commented Nov 23, 2022

@KonradHoeffner #45 added an option to specify the minimum number of bits to store an input integer in wavelet matrix construction. From v0.6.0, you will be able to reduce memory consumption through WaveletMatrixBuilder::with_width() (see the example usage). For example, this experiment shows memory reduction to 25%.
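As a rough sketch of how a caller might pick a width, the helpers below compute the minimum number of bits for the largest input value and estimate the packed payload size. Both `needed_bits` and `packed_bytes` are hypothetical names for this illustration and are not taken from the sucds API; the estimate also ignores any bookkeeping the builder keeps internally.

```rust
/// Minimum number of bits needed to represent `max_value` (at least 1).
/// Hypothetical helper for illustration.
fn needed_bits(max_value: u64) -> usize {
    std::cmp::max(1, (u64::BITS - max_value.leading_zeros()) as usize)
}

/// Rough payload size in bytes for `n` values packed at `width` bits each.
/// Hypothetical helper for illustration.
fn packed_bytes(n: usize, width: usize) -> usize {
    (n * width + 7) / 8
}
```

For instance, if every input fits in 16 bits, `packed_bytes(n, 16)` is a quarter of the `n * 8` bytes needed for 64-bit slots. The thread does not state which width the linked experiment used, so this is only one way a reduction to 25% could arise.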

@KonradHoeffner
Author

Wow, thanks for the quick implementation!
I am trying it out right now, but I noticed that WaveletMatrixBuilder::push panics if the value doesn't fit instead of producing a Result, is that intended for purposes of faster execution?

@KonradHoeffner
Author

Tested it, works perfectly, thanks!

@kampersanda
Owner

@KonradHoeffner

> I am trying it out right now, but I noticed that WaveletMatrixBuilder::push panics if the value doesn't fit instead of producing a Result, is that intended for purposes of faster execution?

Oops! I treated it like other primitive operations such as vec[i] and naturally defined it to panic; however, it should return a Result. I'll modify the API. Thank you for the meaningful question!
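A fallible push along these lines could look like the sketch below. `Builder` and `ValueTooWide` are hypothetical types written for this illustration; the actual signature and error type chosen for WaveletMatrixBuilder::push are not shown in this thread.

```rust
use std::fmt;

/// Hypothetical error returned when a value needs more than `width` bits.
#[derive(Debug, PartialEq)]
struct ValueTooWide {
    value: u64,
    width: usize,
}

impl fmt::Display for ValueTooWide {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "value {} does not fit in {} bits", self.value, self.width)
    }
}

impl std::error::Error for ValueTooWide {}

/// Hypothetical builder (illustration only, not the sucds type).
struct Builder {
    width: usize,
    vals: Vec<u64>,
}

impl Builder {
    fn with_width(width: usize) -> Self {
        assert!((1..=64).contains(&width), "width must be in 1..=64");
        Self { width, vals: Vec::new() }
    }

    /// Returns an error instead of panicking when `value` is too wide.
    fn push(&mut self, value: u64) -> Result<(), ValueTooWide> {
        if self.width < 64 && value >> self.width != 0 {
            return Err(ValueTooWide { value, width: self.width });
        }
        self.vals.push(value);
        Ok(())
    }
}
```

The caller can then propagate the error with `?` or handle it explicitly, and a rejected value leaves the builder unchanged.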
