seq_filter_by_id StopIteration bug #39

Closed
neoformit opened this issue Nov 29, 2023 · 5 comments

Comments

@neoformit
Contributor

We are seeing this issue on Galaxy Australia:

seq_filter_by_id.py", line 340, in fastq_filter
 for title, seq, qual in FastqGeneralIterator(handle):
RuntimeError: generator raised StopIteration

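This relates to an old issue resulting from the deprecation of StopIteration use inside generators (PEP 479): since Python 3.7, a StopIteration that escapes a generator body is re-raised as RuntimeError.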

This has been fixed in BioPython (>=1.71 judging by the merge date on the fix) but the Galaxy tool seq_filter_by_id.xml (and probably other tools) still sits on BioPython 1.67. I guess this should be bumped to >=1.71 (available from conda-forge)?
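For anyone hitting the same traceback, here is a minimal sketch of the failure mode and the usual fix. It is not Biopython's actual code; `records_legacy` and `records_fixed` are hypothetical names, and the two-line record format is a simplification used only for illustration.

```python
def records_legacy(lines):
    """Old style: relies on next() raising StopIteration to end the generator."""
    it = iter(lines)
    while True:
        title = next(it)  # StopIteration escapes here once the input is exhausted
        seq = next(it)
        yield title, seq


def records_fixed(lines):
    """PEP 479 compatible: catch StopIteration and return explicitly."""
    it = iter(lines)
    while True:
        try:
            title = next(it)
        except StopIteration:
            return  # clean end of the generator
        seq = next(it, "")  # tolerate a truncated final record
        yield title, seq


data = ["@r1", "ACGT", "@r2", "TTGA"]

print(list(records_fixed(data)))

try:
    list(records_legacy(data))
except RuntimeError as err:
    # On Python 3.7+ the escaping StopIteration is converted to RuntimeError
    print("legacy generator failed:", err)
```

On Python 3.7 or later the legacy version fails with the same "generator raised StopIteration" message as in the traceback above, while the fixed version ends cleanly.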

@neoformit
Contributor Author

@mthang

@peterjc
Owner

peterjc commented Nov 30, 2023

Yes, this is a Python 2 to 3 change and we need to bump the Biopython dependency. By my reading of the dates we need at least Biopython 1.73 though.
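As a rough illustration of the version requirement, a script could check the installed Biopython against a minimum before running. This is only a sketch: `version_tuple`, `is_new_enough` and `MIN_BIOPYTHON` are hypothetical names, the 1.73 threshold is simply the figure mentioned in this comment, and the check takes a plain string so it runs without Biopython installed. In a real script the string would presumably come from `Bio.__version__`.

```python
import re

# Assumed minimum, taken from the discussion in this thread
MIN_BIOPYTHON = (1, 73)


def version_tuple(version):
    """Return the leading major/minor numbers of a version string as a tuple."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def is_new_enough(version, minimum=MIN_BIOPYTHON):
    """True if the given version string is at least the minimum."""
    return version_tuple(version) >= minimum


for candidate in ("1.67", "1.71", "1.73", "1.80.dev0"):
    print(candidate, is_new_enough(candidate))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes that lexical or floating-point comparison of version numbers can introduce.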

@peterjc
Owner

peterjc commented Nov 30, 2023

Touch wood, the update now on the Tool Shed resolves this (and likewise for four other tools using FastqGeneralIterator).

However, these have not been updated in a while and were using the legacy tool_dependencies.xml system, so given CI is currently broken on this repository I'm not 100% confident.

Please report back...

@neoformit
Contributor Author

neoformit commented Nov 30, 2023

Yep, this passes planemo test locally now, but only after I rearranged things a bit to make the test-data accessible. The dir structure pushed to the toolshed causes tests to fail:

├── README.rst
├── test-data
│   ├── empty_file.dat
│   ├── k12_hypothetical_alt.tabular
│   ├── k12_hypothetical.fasta
│   ├── k12_hypothetical.tabular
│   ├── k12_ten_proteins.fasta
│   ├── sanger-pairs-mixed.fastq
│   ├── sanger-pairs-names.tabular
│   └── sanger-sample.fastq
└── tools
    └── seq_filter_by_id
        ├── seq_filter_by_id.py
        └── seq_filter_by_id.xml

If the tool files are in the root dir then planemo can make sense of it.
I guess this could be done by modifying .shed.yml like so, but I can't verify this because testtoolshed is broken:

include:
  - strip_components: 2
    source:
      - ../../test-data/empty_file.dat
      - ../../test-data/k12_hypothetical.fasta
      - ../../test-data/k12_hypothetical.tabular
      - ../../test-data/k12_hypothetical_alt.tabular
      - ../../test-data/k12_ten_proteins.fasta
      - ../../test-data/sanger-pairs-mixed.fastq
      - ../../test-data/sanger-pairs-names.tabular
      - ../../test-data/sanger-sample.fastq
  - strip_components: 4
    source:
      - ../../tools/seq_filter_by_id/README.rst
      - ../../tools/seq_filter_by_id/seq_filter_by_id.py
      - ../../tools/seq_filter_by_id/seq_filter_by_id.xml
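
To show the layout this config is aiming for, here is a small sketch that assumes strip_components behaves like tar's option of the same name, dropping that many leading path components from each source path. That is an assumption about planemo's behaviour, not something verified here, and `strip_components` below is a hypothetical helper, not a planemo function.

```python
from pathlib import PurePosixPath


def strip_components(source, count):
    """Drop the first `count` path components (assumed tar-like semantics)."""
    parts = PurePosixPath(source).parts
    return str(PurePosixPath(*parts[count:]))


# A few of the source paths from the proposed .shed.yml above
test_data = ["../../test-data/empty_file.dat", "../../test-data/sanger-sample.fastq"]
tool_files = [
    "../../tools/seq_filter_by_id/README.rst",
    "../../tools/seq_filter_by_id/seq_filter_by_id.xml",
]

for path in test_data:
    print(path, "->", strip_components(path, 2))
for path in tool_files:
    print(path, "->", strip_components(path, 4))
```

Under that assumption the test files would stay under test-data/ while the tool files would land in the root dir, which is the arrangement planemo handled locally.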

But for the time being, we can force install this. Thanks for the fix.

@peterjc
Owner

peterjc commented Nov 30, 2023

Thanks for confirming, I'm going to close this issue now.

I wonder if that is a planemo regression; this used to be fine, although I recognize the IUC settled on a pattern of a test-data folder for each tool (unlike what I was using, which was a common test-data folder shared by multiple tools). I see nothing obvious in the open issues at https://github.com/galaxyproject/planemo/issues.

I guess when I get round to #40 this will have to be reviewed...

@peterjc peterjc closed this as completed Nov 30, 2023