
make reading fasta less memory hungry #1458

Closed
antonkulaga opened this issue Mar 25, 2017 · 6 comments

@antonkulaga
Contributor

When I download the latest human genome release and convert the FASTA to ADAM, it uses more than 22 GB of memory for a 3.2 GB file. It would be nice to reduce its memory consumption.

@heuermh
Member

heuermh commented Mar 31, 2017

Is this on a single node? What are you using for -fragment_length (the default is 10kb)?

@antonkulaga
Contributor Author

Yes, it is a single node with the default fragment_length.
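For context, a run like the one described above would have been launched through the ADAM CLI roughly as follows. This is an illustrative sketch, not a command taken from the thread: the input/output paths and Spark memory settings are hypothetical, and only the `-fragment_length` flag is referenced in the discussion (10000 shown here is its stated default).

```sh
# Hypothetical invocation of the FASTA-to-ADAM conversion.
# Spark arguments go before the "--" separator, ADAM arguments after it.
./bin/adam-submit \
  --master local[4] \
  --driver-memory 8g \
  -- \
  fasta2adam \
  -fragment_length 10000 \
  GRCh38.fa GRCh38.adam
```

A larger `-fragment_length` produces fewer, bigger fragment records, which trades per-record overhead against per-record memory; the memory blow-up reported here occurred with the default value.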

@antonkulaga
Contributor Author

Any progress on this?

@fnothaft
Member

Sorry, no major news here. We've been backlogged with the 0.23.0 release and the Python/R API additions. How are you using the output of the conversion? If you are using it as a broadcast object via the ReferenceFile API, you may want to consider the TwoBitFile implementation instead, which is much more memory-efficient.

@fnothaft
Member

I'll take a look at this soon, though; we added sequence dictionary loading code that may eliminate the need for some of the code and caching in the FASTA load path.

@heuermh
Member

heuermh commented Jun 29, 2019

Fixed by #2175

@heuermh heuermh closed this as completed Jun 29, 2019
@heuermh heuermh added this to the 0.28.0 milestone Jun 29, 2019