The output csv file is empty #12

weir12 · 2020-06-03T02:04:28Z

Hi @adnaniazi,
Thank you for your contribution to this project. I am trying to apply tailfindr to my RNA data.
However, I got an abnormal result: every column in the output CSV file is empty except file_path.

read_id,tail_start,tail_end,samples_per_nt,tail_length,file_path
,,,,,/home/weir/tair_rawdata/fixed_rawdata/basecalled/col0/run1/workspace/0/GXB01159_20180404_FAH71487_GA30000_mux_scan_20180404_RDS03_YF3_57044_read_10_ch_171_strand.fast5
,,,,,/home/weir/tair_rawdata/fixed_rawdata/basecalled/col0/run1/workspace/0/GXB01159_20180404_FAH71487_GA30000_mux_scan_20180404_RDS03_YF3_57044_read_10_ch_168_strand.fast5

One of the input fast5 files may help you find the cause of the problem:
GXB01159_20180404_FAH71487_GA30000_mux_scan_20180404_RDS03_YF3_57044_read_10_ch_171_strand.zip
Here is my find_tails call with its parameters:

df <- find_tails(fast5_dir = paste(basedir,sample,paste('run',batch,sep=''),'workspace',sep='/'),
                 save_dir = save_dir,
                 csv_filename = 'rna_tails.csv',
                 num_cores = 16,
                 save_plots = TRUE,
                 basecall_group ='Basecall_1D_001',
                 plot_debug_traces = TRUE,
                 plotting_library = 'rbokeh')

And here are the parameters I passed to Guppy during basecalling:

Guppy Basecalling Software, (C) Oxford Nanopore Technologies, Limited. Version 3.5.2+5b7a51b, client-server API version 1.1.0

--flowcell FLO-MIN106 --kit SQK-RNA001 --recursive \
--num_callers 8 --cpu_threads_per_caller 2 --records_per_fastq 0 --compress_fastq --fast5_out --qscore_filtering --min_qscore 7

Next is the tailfindr log file:

── Started tailfindr (version 0.1.0) ───────────────────────────────────────────
────────────────────────────────────────────────────────────────────────────────
☰ You have configured tailfindr as following:
❯ fast5_dir:         /home/weir/tair_rawdata/fixed_rawdata/basecalled/col0/run1/workspace
❯ save_dir:          /home/weir/output/polya/tailfinder_res/col0/1
❯ csv_filename:      rna_tails.csv
❯ num_cores:         16
❯ basecall_group:    Basecall_1D_001
❯ save_plots:        TRUE
❯ plot_debug_traces: TRUE
❯ plotting_library:  rbokeh
── Processing started at 2020-06-03 00:51:09 ───────────────────────────────────
● Creating a sub-directory to save the plots in.
  Done! All plots will be saved in the following direcotry:
  /home/weir/output/polya/tailfinder_res/col0/1/plots
● Searching for all Fast5 files...
  Done! Found 988103 Fast5 files.
● Analyzing a single Fast5 file to assess if your data 
  is in an acceptable format...
  ✔ The data has been basecalled using Guppy.
  ✔ Flipflop model was used during basecalling.
  ✔ Every read is in a single fast5 file of its own.
  ✔ The experiment type is RNA, so we will search
    for poly(A) tails.
  ✔ The reads are 1D reads.
● Starting a parallel compute cluster...
  Done!
● Searching for Poly(A) tails...
(omitted processing)
● Formatting the tail data...
  Done!
● Saving the data in the CSV file...
  Done! Below is the path of the CSV file:
  /home/weir/output/polya/tailfinder_res/col0/1/rna_tails.csv
● A logfile containing all this information has been saved in this path: 
  /home/weir/output/polya/tailfinder_res/col0/1/2020-06-03_00-50-35_tailfinder.log
── Processing ended at 2020-06-03 06:09:24 ─────────────────────────────────────
✔ tailfindr finished successfully!

The log file does not seem to show any problem.
I noticed that turning off enabling_trimming is required for DNA samples.
Perhaps RNA samples need something similar.
I would really appreciate it if you could help me.
Thank you
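
The symptom reported above (rows where every column except file_path is empty) can be confirmed quickly before digging into the log. The following is a minimal shell sketch, not part of tailfindr: the function names `count_empty_read_ids` and `count_data_rows` are made up for illustration, and it assumes a comma-separated file with a single header line, as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of tailfindr) for inspecting the output CSV.

# Count data rows whose first column (read_id) is empty.
# Assumes comma-separated values and exactly one header line.
count_empty_read_ids() {
    awk -F',' 'NR > 1 && $1 == "" { n++ } END { print n + 0 }' "$1"
}

# Count all data rows (every line after the header).
count_data_rows() {
    awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

# Example usage (path taken from the log in this report):
#   count_empty_read_ids /home/weir/output/polya/tailfinder_res/col0/1/rna_tails.csv
#   count_data_rows      /home/weir/output/polya/tailfinder_res/col0/1/rna_tails.csv
```

If the two counts are equal, no read produced any tail information at all, which points to a problem reading the fast5 files rather than to individual reads lacking a tail.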


adnaniazi · 2020-06-03T07:51:48Z

Hi,

Thank you for reporting the issue in detail. Can you please provide me with 5 of your fast5 files? I tried to debug the issue using the one fast5 file that you provided, but it is not enough.

Thank you.

Adnan

weir12 · 2020-06-03T09:56:55Z

Thank you!
fast5_files.zip
If you need any other data, do not hesitate to contact me :)

adnaniazi · 2020-06-03T10:22:55Z

Great. Thank you!

I have now fixed the issue. Please uninstall tailfindr, and then install it again from the GitHub repo. Hopefully, it will work this time. If there is any problem again, please feel free to report it.

One more thing: Only generate the plots for a subset of your reads, and not the whole dataset -- unless you absolutely need to. This is because generating the plots takes a lot of time. But it is your choice.

Wish you all the best!

Adnan
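
The advice above about plotting only a subset of reads could be followed by copying a small sample of fast5 files into a separate directory and pointing `fast5_dir` at it. This is a minimal sketch under stated assumptions: the function name `copy_fast5_subset` is hypothetical, and it assumes file names contain no newline characters.

```shell
#!/bin/sh
# Hypothetical helper (not part of tailfindr).
# Copy up to N .fast5 files found under SRC into DEST (created if missing).
# Files are taken in sorted path order so the selection is reproducible.
copy_fast5_subset() {
    src=$1
    dest=$2
    n=$3
    mkdir -p "$dest" || return 1
    find "$src" -type f -name '*.fast5' | sort | head -n "$n" |
    while IFS= read -r f; do
        cp "$f" "$dest/"
    done
}

# Example usage (the subset directory name is illustrative):
#   copy_fast5_subset /path/to/workspace /path/to/workspace_subset 100
```

One way to use this is to run find_tails with `save_plots = TRUE` on the subset directory only, and to run the full dataset with `save_plots = FALSE`.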

weir12 · 2020-06-04T08:16:40Z

Sorry for the late reply; I was waiting for the process to finish so that I could evaluate the result.
This seems to be a CPU-intensive task that needs more threads and time.
But that is no problem. I can afford to be patient.

Thank you for your wise advice & your timely assistance.

jon-xu · 2022-03-31T00:16:46Z

Hi Adnaniazi,

I have the same issue as weir12 with the newest version.
Could you please give me some advice?

Please download the fast5 file example I used as input:
https://cloudstor.aarnet.edu.au/plus/s/WlxHDt1lVmsRO64

Thanks,
Jon

adnaniazi · 2022-03-31T06:39:24Z

Hi Jon,

Your FAST5 file seems to be okay. Can you please try running tailfindr on the data that comes with tailfindr to see if it works? Use this path for fast5_dir in your tailfindr command:
fast5_dir = system.file('extdata', 'rna', package = 'tailfindr')
See if the CSV file is empty for the internal dataset as well.

Adnan

jon-xu · 2022-03-31T08:45:04Z

Hi Adnan,

Just tried and with the built-in data it did output some results (attached).

Thanks,
Jon
example.csv

adnaniazi · 2022-03-31T09:02:37Z

This means that you have not installed the VBZ plugin or have not configured it properly.

Please download the VBZ plugin for your OS from this link:
https://github.com/nanoporetech/vbz_compression/releases

Then extract it somewhere and make it discoverable by the HDF5 library by exporting the path like this (edit the path to match your extracted vbz folder):
export HDF5_PLUGIN_PATH=/bla/bla/bla_path/vbz/ont-vbz-hdf-plugin-1.0.1-Linux/usr/local/hdf5/lib/plugin

That's it. Tailfindr should now work on your data.
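
Before rerunning tailfindr, it may help to confirm that the exported directory really contains the plugin. The sketch below is a hedged sanity check, not an official tool: the function name `has_vbz_plugin` is hypothetical, it assumes HDF5_PLUGIN_PATH holds a single directory, and it assumes the plugin library file name contains `vbz_hdf_plugin` (for example `libvbz_hdf_plugin.so` on Linux). Adjust the pattern if the release you downloaded names the file differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of tailfindr or the VBZ release).
# Succeeds if DIR exists and contains a file whose name includes
# "vbz_hdf_plugin" (assumed naming; verify against your extracted folder).
has_vbz_plugin() {
    dir=$1
    [ -d "$dir" ] || return 1
    for f in "$dir"/*vbz_hdf_plugin*; do
        [ -f "$f" ] && return 0
    done
    return 1
}

# Example usage:
#   if has_vbz_plugin "$HDF5_PLUGIN_PATH"; then
#       echo "VBZ plugin found in $HDF5_PLUGIN_PATH"
#   else
#       echo "No VBZ plugin found; check HDF5_PLUGIN_PATH"
#   fi
```

The variable also has to be visible to the R process that runs tailfindr, so export it in the same shell session that starts R. Inside R, `Sys.getenv("HDF5_PLUGIN_PATH")` shows the value the session actually sees.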

jon-xu · 2022-03-31T09:16:52Z

I see, many thanks Adnan!

adnaniazi closed this as completed Jun 15, 2020

xcarmengrandix mentioned this issue Jul 26, 2022

windows installation nanoporetech/vbz_compression#20

Open

nemitheasura mentioned this issue Aug 25, 2022

Using plugin from within R/RStudio on Windows nanoporetech/vbz_compression#21

Open
