'dict_ref_structure' is not defined #63

ScatF · 2019-07-11T14:57:55Z

Hi,

I'm using human transcriptome ONT data and want to simulate reads. I first got a 'DivisionByZero' error, so I'm currently running the 2.3-beta version of NanoSim. I reached the simulation stage and ran into this error:

  File "/NanoSim/src/simulator.py", line 1198, in <module>
    main()
  File "/NanoSim/src/simulator.py", line 1192, in main
    simulation(args.mode, out, dna_type, perfect, kmer_bias, max_len, min_len, None, None, model_ir)
  File "/NanoSim/src/simulator.py", line 722, in simulation
    new_read, new_read_name = extract_read(dna_type, middle_ref)
  File "/NanoSim/src/simulator.py", line 772, in extract_read
    key = random.choice(seq_len.keys())
  File "/miniconda3/lib/python3.7/random.py", line 262, in choice
    return seq[i]
TypeError: 'dict_keys' object is not subscriptable

I identified the problem: in Python 3 (3.7 here), seq_len.keys() returns a dict_keys view object, not a list, so it cannot be indexed. It can be fixed easily by changing
key = random.choice(seq_len.keys())
into
key = random.choice(list(seq_len.keys()))
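
The fix above can be illustrated outside NanoSim with a minimal sketch. The `seq_len` contents and the `pick_reference` helper are hypothetical stand-ins, not code from simulator.py; only `random.choice` and the dict view behavior are standard Python 3.

```python
import random

# Hypothetical stand-in for the seq_len mapping (reference name -> length).
seq_len = {"ref_a": 1000, "ref_b": 2500, "ref_c": 400}


def pick_reference(lengths):
    """Return one randomly chosen key from a dict on Python 3."""
    # dict.keys() returns a view that cannot be indexed, and random.choice
    # indexes its argument, so materialize the keys as a list first.
    return random.choice(list(lengths))


key = pick_reference(seq_len)
print(key)
```

Iterating a dict yields its keys, so `list(lengths)` and `list(lengths.keys())` are equivalent here.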

And now I get this error :

  File "./simulator.py", line 1198, in <module>
    main()
  File "./simulator.py", line 1192, in main
    simulation(args.mode, out, dna_type, perfect, kmer_bias, max_len, min_len, None, None, model_ir)
  File "./simulator.py", line 754, in simulation
    simulation_aligned_transcriptome(model_ir, out_reads, out_error, kmer_bias, per)
  File "./simulator.py", line 499, in simulation_aligned_transcriptome
    if ref_trx in dict_ref_structure:
NameError: name 'dict_ref_structure' is not defined

And I don't understand why. 'dict_ref_structure' appears to be properly defined earlier: it is declared as a global and defined at line 321.
Can you help me?


SaberHQ · 2019-07-11T20:23:22Z

Hi @StanislasF

Thanks for bringing this up. I will update the script to handle seq_len.keys() as you suggested, so that it is compatible with Python 3.7.

As for your second question, as you said, that dictionary is defined earlier in the script. Would you mind posting the exact commands you are running, with all the input arguments you use? Thanks.

ScatF · 2019-07-12T07:45:20Z

Yes, sure. I'm running with Python 3.7.3, HTSeq 0.11.2, numpy 1.16.4, pybedtools 0.8.0, pysam 0.15.2, scipy 1.3.0, scikit-learn 0.21.2, and genometools 1.2.1 (you didn't mention genometools in your requirements file, but I needed it). All my references come from the Ensembl database.

I ran it step by step:

cd /MYPATH/localLib/NanoSim-2.3-beta/src/

./read_analysis.py transcriptome -i /MYPATH/Data/ONTseq/fasta/raw/T7_3moins_10102018.clean.fa -rt /MYPATH/data/Homo_sapiens.GRCh38.cdna.all.fa  -annot /MYPATH/data/Homo_sapiens.GRCh38.97.chr.gtf -o /MYPATH/workspace/NanoSim/NanoSim_$temps/training -t 4 --no_intron_retention

./read_analysis.py quantify -i /MYPATH/Data/ONTseq/fasta/raw/T7_3moins_10102018.clean.fa -rt /MYPATH/data/Homo_sapiens.GRCh38.cdna.all.fa  -o /MYPATH/workspace/NanoSim/NanoSim_$temps/expression -t 4  

./simulator.py transcriptome -rt /MYPATH/data/Homo_sapiens.GRCh38.cdna.all.fa -e /MYPATH/workspace/NanoSim/NanoSim_$temps/expression_abundance.tsv -c /MYPATH/workspace/NanoSim/NanoSim_$temps/training -o /MYPATH/workspace/NanoSim/NanoSim_$temps/simulated -max 10000 --no_model_ir -rg /MYPATH/data/Homo_sapiens.GRCh38.dna.primary_assembly.fa

All the outputs are fine except for the simulation, which gives me only unaligned reads. It breaks after 'start simulation of random reads' and gives me the 'dict_keys' error.

SaberHQ · 2019-07-12T23:17:32Z

Dear @StanislasF

Thanks for providing more info. It turned out to be a bug in the input requirements, and I have fixed it now. The reference genome and the annotation file are not necessary unless you also want to model intron retention events. The bug was that the dictionary you mentioned was not created when the --no_model_ir option was used.
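
For readers hitting a similar NameError, here is a minimal sketch of how a global that is only bound on one branch produces this symptom. All names (`load_buggy`, `structure_buggy`, and so on) are hypothetical, and the layout is an assumption about the general pattern, not NanoSim's actual code.

```python
def load_buggy(model_ir):
    """Bind the global only when intron retention is modeled."""
    global structure_buggy
    if model_ir:
        structure_buggy = {"tx1": ["exon"]}  # never runs when model_ir is False


def load_fixed(model_ir):
    """Bind the global on every code path, filling it only when needed."""
    global structure_fixed
    structure_fixed = {}
    if model_ir:
        structure_fixed["tx1"] = ["exon"]


def lookup_buggy(ref_trx):
    return ref_trx in structure_buggy


def lookup_fixed(ref_trx):
    return ref_trx in structure_fixed


# The global statement alone does not create the name, so the lookup fails.
load_buggy(model_ir=False)
try:
    lookup_buggy("tx1")
    buggy_error = None
except NameError as exc:
    buggy_error = str(exc)

# With the dictionary always bound, the membership test simply returns False.
load_fixed(model_ir=False)
fixed_result = lookup_fixed("tx1")

print(buggy_error)
print(fixed_result)
```

Binding the dictionary unconditionally keeps later membership tests valid whether or not the optional inputs were supplied.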

It is now fixed. I also improved the speed a lot and added a new option as well. So please check the new pre-release here: https://github.com/bcgsc/NanoSim/releases/tag/v2.4-beta

Let me know if I can provide more help. Please feel free to contact me if you have any questions.

ScatF · 2019-07-15T14:48:11Z

Dear @SaberHQ

You did a great job with the new pre-release. I tried it with and without the --no_model_ir option and found only one small issue: for the --uracile option in simulator.py you use the maketrans function. In Python 3, this function is no longer in the string module; it is a static method of the built-in str type. So you cannot import it, but you can call it directly as str.maketrans.
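
A minimal sketch of the Python 3 form, assuming the option converts T to U in the output sequence. The `to_rna` helper and the table name are hypothetical, not taken from simulator.py.

```python
# Python 3: maketrans is a static method of str, so no import is needed.
DNA_TO_RNA = str.maketrans("Tt", "Uu")


def to_rna(seq):
    """Replace thymine with uracil, preserving case."""
    return seq.translate(DNA_TO_RNA)


rna = to_rna("ACGTacgt")
print(rna)
```

Building the table once at module level avoids recreating it for every read.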

Thank you for your help !

ScatF closed this as completed Jul 15, 2019
