
never ending run #27

Open
romaingroux opened this issue Nov 8, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@romaingroux

Hi @PengNi,

I recently tried to run ccsmeth and hit an unexpected issue: the run never ends. I run ccsmeth from a Docker image on an entire human genome (the T2T v1.0 release), with a 42X Sequel II coverage dataset as input.

Here is the command:

ccsmeth call_mods --tseed 20221101 --input all.ccs.chm13v1_0.bam --ref chm13.draft_v1.0.fasta --model_file /ccsmeth/models/model_ccsmeth_5mCpG_call_mods_attbigru2s_b21.v1.ckpt --output all.ccs.chm13v1_0.CpG-5mC.call_mods --threads 30 --threads_call 3 --model_type attbigru2s --rm_per_readsite --mode align

The genome file is of regular size, 2.9 GB. The BAM file is 353 GB, which corresponds to a coverage of ~40X. It is sorted and indexed, so normally nothing is wrong on this end.

What happens:

The program starts and I see 30 processes spawning, as expected. I also see the creation of an all.ccs.chm13v1_0.CpG-5mC.call_mods.per_readsite.tsv file, which is actively written to and keeps getting bigger. Memory consumption grows steadily until it reaches a huge amount, then some reallocation seems to happen: the memory used by the processes drops suddenly and the Docker container enters a "brain death" state. Hardly anything happens in there anymore, and the run never resumes nor finishes (I waited up to one week).
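(For context on how I observe the memory behaviour: I watch container-level usage with generic commands like the ones below; nothing ccsmeth-specific, and the container/pod names are placeholders.)

docker stats <container_id>   # live per-container CPU and memory usage
kubectl top pod <pod_name>    # equivalent snapshot when running inside a Kubernetes pod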

What I expect:

To get an error message, or to have the run complete, instead of this "death" state.

I have tried sub-sampling heavily from the BAM file, down to a final 1X coverage, in order to run a test. In that case the run finishes and I get a modbam file. Is this simply a matter of input size? Should we run ccsmeth on separate chromosomes, or avoid big datasets altogether?
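In case it matters, the subsampling and a per-chromosome split can be done with plain samtools, e.g. (the fraction and chromosome name are only examples, nothing ccsmeth-specific):

samtools view -b -s 0.025 all.ccs.chm13v1_0.bam > all.ccs.1x.bam   # keep ~2.5% of reads (~1X out of ~40X)
samtools index all.ccs.1x.bam
samtools view -b all.ccs.chm13v1_0.bam chr1 > all.ccs.chr1.bam     # extract one chromosome from the sorted, indexed BAM
samtools index all.ccs.chr1.bam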

Thank you in advance

@PengNi
Owner

PengNi commented Nov 8, 2022

Hi @romaingroux, thank you very much for using ccsmeth and for reporting this issue. I haven't encountered this issue before; I will try to replicate it and fix it. Also, could you show me the log of this run, so I can check where it got stuck?

In the meantime, I guess it may be related to the RAM size. How much RAM and how many CPU cores does your machine have? Maybe fewer threads, e.g. --threads 20, will work. Also, I'd suggest running ccsmeth with a GPU if one is available: using only CPUs may be 100 times slower (though maybe there will be a faster, lightweight ccsmeth model in the future). If a GPU is not available, PacBio's primrose seems a better option, because it is super fast.
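For example, your command with fewer worker threads would simply be (same flags as in your report, only --threads lowered; 20 is just a guess, adjust to your machine):

ccsmeth call_mods --tseed 20221101 --input all.ccs.chm13v1_0.bam --ref chm13.draft_v1.0.fasta --model_file /ccsmeth/models/model_ccsmeth_5mCpG_call_mods_attbigru2s_b21.v1.ckpt --output all.ccs.chm13v1_0.CpG-5mC.call_mods --threads 20 --threads_call 3 --model_type attbigru2s --rm_per_readsite --mode align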

Best,
Peng

@romaingroux
Author

Thank you for this very rapid answer.

Regarding the machine specs, I actually run jobs in pods on a Kubernetes cluster. The pod was allocated 30 CPUs and 260 GB of RAM; the maximum RAM allocation I can request is 386 GB. So, as you proposed, I'll try increasing the RAM allocation and decreasing the number of "threads". I assume the parallelization is based on Python processes, am I right?
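(To check that assumption, I can look inside the running pod with generic Linux commands, e.g.:)

ps aux | grep "[c]csmeth"     # list the ccsmeth worker processes and their memory use (RSS column)
top -b -n 1 | head -n 40      # one-shot snapshot of per-process CPU and memory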

For the log, I don't have any. I'll try to produce one.

Finally, it's a bit unfortunate, but I don't have access to a GPU machine. Thanks for pointing this out anyway.

@PengNi added the bug label Nov 15, 2022