revisit InvalidDNA error? #3104

Closed
ctb opened this issue Apr 1, 2024 · 2 comments

ctb commented Apr 1, 2024

see apetkau/genomics-data-index#39 (linked to #137) which reports:

sourmash.exceptions.Panic: sourmash panicked: \
thread 'unnamed' panicked with 'called `Result::unwrap()` on an `Err` value: \
InvalidDNA { message: "ATTGCCGAAGTTGATGGTAACGATCCGCTCN" }' at src/core/src/signature.rs:481

does this still happen?

ctb changed the title from "revisit force_sequence error?" to "revisit InvalidDNA error?" on Apr 1, 2024

ctb commented Apr 1, 2024

Not at the command line, where sketching a protein file as DNA now reports a regular error instead of panicking:

% sourmash sketch dna tests/test-data/shewanella.faa --check-sequence -f

== This is sourmash version 4.8.7. ==
== Please cite Brown and Irber (2016), doi:10.21105/joss.00027. ==

computing signatures for files: tests/test-data/shewanella.faa
Computing a total of 1 signature(s) for each input.
... reading sequences from tests/test-data/shewanella.faa
ERROR when reading from 'tests/test-data/shewanella.faa' - 
invalid DNA character in input k-mer: MCGIVGAVAQRDVAEILVEGLRRLEYRGYDS


ctb commented Apr 1, 2024

and not with the Python API either, where it surfaces as a ValueError. The following code:

import sourmash, screed
mh = sourmash.MinHash(n=0, ksize=21, scaled=1000)

record = next(iter(screed.open('tests/test-data/shewanella.faa')))
mh.add_sequence(record.sequence)

yields:

Traceback (most recent call last):
  File "/Users/t/dev/sourmash/bad.py", line 5, in <module>
    mh.add_sequence(record.sequence)
  File "/Users/t/dev/sourmash/src/sourmash/minhash.py", line 350, in add_sequence
    self._methodcall(lib.kmerminhash_add_sequence, to_bytes(sequence), force)
  File "/Users/t/dev/sourmash/src/sourmash/utils.py", line 25, in _methodcall
    return rustcall(func, self._get_objptr(), *args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/t/dev/sourmash/src/sourmash/utils.py", line 78, in rustcall
    raise exc
ValueError: invalid DNA character in input k-mer: MCGIVGAVAQRDVAEILVEGL

🎉
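
For anyone who wants to pre-screen input before calling add_sequence, here is a minimal pure-Python sketch of the kind of check that produces the message above. It is not sourmash's actual Rust implementation: the helper names (first_invalid_kmer, check_dna) are hypothetical, and it assumes that only A, C, G and T count as valid DNA and that case is ignored.

```python
# Illustrative stand-in only; not sourmash code. Assumes ACGT are the only
# valid DNA characters and that input is compared case-insensitively.
VALID_DNA = frozenset("ACGT")


def first_invalid_kmer(sequence, ksize):
    """Return the first k-mer containing a non-ACGT character, or None."""
    seq = sequence.upper()
    for start in range(len(seq) - ksize + 1):
        kmer = seq[start:start + ksize]
        if not VALID_DNA.issuperset(kmer):
            return kmer
    return None


def check_dna(sequence, ksize):
    """Raise ValueError, mirroring the message format seen above, on bad input."""
    bad = first_invalid_kmer(sequence, ksize)
    if bad is not None:
        raise ValueError(f"invalid DNA character in input k-mer: {bad}")


if __name__ == "__main__":
    try:
        check_dna("MCGIVGAVAQRDVAEILVEGLRRLEYRGYDS", ksize=21)
    except ValueError as exc:
        print(exc)
```

Calling check_dna on a record before handing it to a MinHash lets a script skip or log bad records up front; whether that is preferable to catching the ValueError from add_sequence directly depends on the workflow.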
