Fast5s failing - no mods called #21

Closed
acread opened this issue May 11, 2022 · 2 comments

acread commented May 11, 2022

I'm very excited to analyze some ONT data with DeepSignal-Plant!

I'm having an issue that looks similar to the one described in issue #8 -- I'm basecalling with Guppy, converting to single Fast5s, using Tombo to annotate the Fast5s with the fastq data, resquiggling, and then running deepsignal-plant...

I am able to process the example Arabidopsis data and generate the final outputs, but things seem to get stuck with my data. I'm not sure if it's a problem in Tombo or deepsignal-plant.
Is there a way to check my Tombo output to confirm that it is compatible with deepsignal? I'm not getting any errors with Tombo but I also don't use this package much.

Log output:
`# ===============================================

parameters:

input_path:
/scratch.global/read0094/Nanopore/ME034vRun/SingleFast5s/0/
model_path:
model.dp2.CNN.arabnrice2-1_120m_R9.4plus_tem.bn13_sn16.both_bilstm.epoch6.ckpt
model_type:
both_bilstm
seq_len:
13
signal_len:
16
layernum1:
3
layernum2:
1
class_num:
2
dropout_rate:
0
n_vocab:
16
n_embed:
4
is_base:
yes
is_signallen:
yes
batch_size:
512
hid_rnn:
256
result_file:
fast5s.testC.call_mods.tsv
recursively:
yes
corrected_group:
RawGenomeCorrected_000
basecall_subgroup:
BaseCalled_template
reference_path:
/home/springer/read0094/SetariaStuff/ME034V/ME034_v0.4.Chr1_9.fa
is_dna:
yes
normalize_method:
mad
motifs:
C
mod_loc:
0
f5_batch_size:
10
region:
None
positions:
None
nproc:
30
nproc_gpu:
6

===============================================

[main] call_mods starts..
cuda availability: True
4000 fast5 files in total..
parse the motifs string..
read genome reference file..
read_fast5 process-1089541 starts
read_fast5 process-1089541 ending, proceed 180 fast5s
read_fast5 process-1089546 starts
read_fast5 process-1089546 ending, proceed 190 fast5s
read_fast5 process-1089533 starts
read_fast5 process-1089533 ending, proceed 90 fast5s
read_fast5 process-1089550 starts
read_fast5 process-1089550 ending, proceed 180 fast5s
read_fast5 process-1089549 starts
read_fast5 process-1089549 ending, proceed 200 fast5s
read_fast5 process-1089540 starts
read_fast5 process-1089540 ending, proceed 180 fast5s
read_fast5 process-1089543 starts
read_fast5 process-1089543 ending, proceed 180 fast5s
read_fast5 process-1089536 starts
read_fast5 process-1089536 ending, proceed 150 fast5s
read_fast5 process-1089542 starts
read_fast5 process-1089542 ending, proceed 190 fast5s
read_fast5 process-1089551 starts
read_fast5 process-1089551 ending, proceed 190 fast5s
read_fast5 process-1089548 starts
read_fast5 process-1089548 ending, proceed 190 fast5s
read_fast5 process-1089552 starts
read_fast5 process-1089552 ending, proceed 190 fast5s
read_fast5 process-1089530 starts
read_fast5 process-1089530 ending, proceed 200 fast5s
read_fast5 process-1089539 starts
read_fast5 process-1089539 ending, proceed 190 fast5s
read_fast5 process-1089534 starts
read_fast5 process-1089534 ending, proceed 130 fast5s
read_fast5 process-1089537 starts
read_fast5 process-1089537 ending, proceed 200 fast5s
read_fast5 process-1089531 starts
read_fast5 process-1089531 ending, proceed 150 fast5s
read_fast5 process-1089535 starts
read_fast5 process-1089535 ending, proceed 160 fast5s
read_fast5 process-1089545 starts
read_fast5 process-1089545 ending, proceed 190 fast5s
read_fast5 process-1089544 starts
read_fast5 process-1089544 ending, proceed 180 fast5s
read_fast5 process-1089538 starts
read_fast5 process-1089538 ending, proceed 200 fast5s
read_fast5 process-1089547 starts
read_fast5 process-1089547 ending, proceed 190 fast5s
read_fast5 process-1089532 starts
read_fast5 process-1089532 ending, proceed 100 fast5s
call_mods process-1089555 starts
call_mods process-1089555 ending, proceed 0 batches
call_mods process-1089557 starts
call_mods process-1089557 ending, proceed 0 batches
call_mods process-1089556 starts
call_mods process-1089556 ending, proceed 0 batches
call_mods process-1089553 starts
call_mods process-1089553 ending, proceed 0 batches
call_mods process-1089558 starts
call_mods process-1089558 ending, proceed 0 batches
call_mods process-1089554 starts
call_mods process-1089554 ending, proceed 0 batches
write_process-1089559 starts
write_process-1089559 finished
4000 of 4000 fast5 files failed..
[main] call_mods costs 96.07 seconds..`

Owner

PengNi commented May 12, 2022

Hi @acread, thanks for using deepsignal-plant. There are two likely reasons for the failure:

(1) Please check the log of tombo resquiggle to see whether most of the reads were successfully resquiggled. At the end of the log, tombo reports how many reads failed.

(2) It may be related to the VBZ compression issue. In this case, deepsignal-plant cannot read data from the FAST5s without an HDF5 plugin. Adding the plugin should solve the issue:

# download ont-vbz-hdf-plugin-1.0.1-Linux-x86_64.tar.gz (or newer version) and set HDF5_PLUGIN_PATH
# https://github.com/nanoporetech/vbz_compression/releases
wget https://github.com/nanoporetech/vbz_compression/releases/download/v1.0.1/ont-vbz-hdf-plugin-1.0.1-Linux-x86_64.tar.gz
tar zxvf ont-vbz-hdf-plugin-1.0.1-Linux-x86_64.tar.gz
export HDF5_PLUGIN_PATH=/absolute/path/to/ont-vbz-hdf-plugin-1.0.1-Linux/usr/local/hdf5/lib/plugin
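
Before rerunning deepsignal-plant, it can help to confirm that the exported variable actually points at the plugin. The following is a minimal sketch, not part of deepsignal-plant: the function name `check_vbz_plugin` is hypothetical, it assumes `HDF5_PLUGIN_PATH` holds a single directory, and it assumes the plugin library file name starts with `libvbz_hdf_plugin` (adjust the pattern if your release names it differently).

```shell
#!/usr/bin/env bash
# Sketch: verify that HDF5_PLUGIN_PATH points at a directory containing the VBZ plugin.
# Assumptions (not from the thread): HDF5_PLUGIN_PATH is a single directory, and the
# plugin library file name starts with "libvbz_hdf_plugin".

check_vbz_plugin() {
    local dir="${HDF5_PLUGIN_PATH:-}"
    if [ -z "$dir" ]; then
        echo "HDF5_PLUGIN_PATH is not set"
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "HDF5_PLUGIN_PATH is not a directory: $dir"
        return 1
    fi
    # Look for the plugin library directly inside that directory.
    local f
    for f in "$dir"/libvbz_hdf_plugin*; do
        if [ -e "$f" ]; then
            echo "VBZ plugin found: $f"
            return 0
        fi
    done
    echo "no libvbz_hdf_plugin* file in $dir"
    return 1
}
```

Run `check_vbz_plugin` in the same shell (or job script) that launches deepsignal-plant, since an `export` made in one session is not visible to jobs started elsewhere.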

Best,
Peng

Author

acread commented May 12, 2022

Thank you for the quick reply!

In my case it looks like Tombo was running well (most reads were resquiggled, based on the log); however, the HDF5 plugin was not being found by deepsignal-plant. I used the export command above to point it to the correct location, and it looks like it is working.
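
For anyone who wants a quick check that a rerun is no longer failing, the summary line from the log above ("4000 of 4000 fast5 files failed..") can be parsed from a saved log. This is only a sketch: the function name `fast5_failure_rate` and the idea of saving the log to a file are illustrative, and it assumes the summary line keeps the same format as in the log posted in this issue.

```shell
#!/usr/bin/env bash
# Sketch: report the failure rate from a saved deepsignal-plant call_mods log.
# Relies on the summary line format seen in this issue's log:
#   "4000 of 4000 fast5 files failed.."
# The function name and the log file argument are hypothetical.

fast5_failure_rate() {
    local log="$1"
    # Take the last matching summary line in the log.
    local line
    line="$(grep -E '^[0-9]+ of [0-9]+ fast5 files failed' "$log" | tail -n 1)"
    if [ -z "$line" ]; then
        echo "no failure summary found in $log"
        return 1
    fi
    # Field 1 is the number of failed files, field 3 is the total.
    local failed total
    failed="$(echo "$line" | awk '{print $1}')"
    total="$(echo "$line" | awk '{print $3}')"
    if [ "$total" -eq 0 ]; then
        echo "log reports 0 fast5 files in total"
        return 1
    fi
    # An integer percentage is enough for a sanity check.
    echo "$failed of $total fast5 files failed ($(( 100 * failed / total ))%)"
}
```

Usage would look like `fast5_failure_rate call_mods.log`, where `call_mods.log` is whatever file the run's output was redirected to. A rate near 100% points to a reading problem such as the missing plugin rather than a few bad reads.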

PengNi closed this as completed May 15, 2022