
make reading fasta less memory hungry #1458

Closed
antonkulaga opened this issue Mar 25, 2017 · 6 comments

@antonkulaga
Contributor

When I download the latest human genome release and convert the FASTA to ADAM, it uses more than 22 GB of memory for a 3.2 GB file. It would be nice to reduce its memory consumption.

@heuermh
Member

heuermh commented Mar 31, 2017

Is this on a single node? What are you using for -fragment_length (the default is 10kb)?

@antonkulaga
Contributor Author

Yes, it is a single node with the default fragment_length.
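For context, a run like the one described above would have been launched through the ADAM CLI roughly as follows. This is an illustrative sketch, not a command taken from the thread: the input/output paths and Spark memory settings are hypothetical, and only the `-fragment_length` flag is referenced in the discussion (10000 shown here is its stated default).

```sh
# Hypothetical invocation of the FASTA-to-ADAM conversion.
# Spark arguments go before the "--" separator, ADAM arguments after it.
./bin/adam-submit \
  --master local[4] \
  --driver-memory 8g \
  -- \
  fasta2adam \
  -fragment_length 10000 \
  GRCh38.fa GRCh38.adam
```

A larger `-fragment_length` produces fewer, bigger fragment records, which trades per-record overhead against per-record memory; the memory blow-up reported here occurred with the default value.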

@antonkulaga
Contributor Author

Any progress on this?

@fnothaft
Member

Sorry, no major news here. We've been backlogged with the 0.23.0 release and the Python/R API additions. How are you using the output of the conversion? If you are using it as a broadcast object via the ReferenceFile API, you may want to consider the TwoBitFile implementation instead, which is much more memory-efficient.

@fnothaft
Member

I'll take a look at this soon, though; we added sequence dictionary loading code that may eliminate the need for some of the code and caching in the FASTA load path.

@heuermh
Member

heuermh commented Jun 29, 2019

Fixed by #2175

@heuermh heuermh closed this as completed Jun 29, 2019
@heuermh heuermh added this to the 0.28.0 milestone Jun 29, 2019