new de tag with samtools #309

Closed
nikosdarzentas opened this issue Jan 11, 2019 · 6 comments

@nikosdarzentas

Hello,
and it's an understatement to say thank you for your work...

While testing the latest minimap2 (2.15-r905) with fgbio, fgbio's SAM parser complained that it expected a float in the new de tag you added. The offending value was:
de:f:-inf
Sorry in advance, I could only recover some of the log output from my terminal, hope it still helps:

Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM fi[...]
        at htsjdk.samtools.SAMLineParser.reportErrorParsingLine(SAMLineParser.java:468)
        at htsjdk.samtools.SAMLineParser.parseTag(SAMLineParser.java:423)
        at htsjdk.samtools.SAMLineParser.parseLine(SAMLineParser.java:346)
@lh3
Owner

lh3 commented Jan 11, 2019

Could you show me the offending line? Thanks.

@lh3 lh3 added the bug label Jan 11, 2019
@lh3
Owner

lh3 commented Jan 11, 2019

PS: I haven't found a de:f:-inf in my local SAM/PAF files, so this is probably a rare bug. In this case, could you send me the query sequence? If you can't share the sequence publicly, please send it to me via email. Thank you!

@nikosdarzentas
Author

Apologies, I had moved on by then and had to go back and recreate the analysis.
So, this is an alignment made with minimap2 -xsr of the sequence shown below (noisy, I know) against Ensembl's hg38 primary assembly genome, then fed to fgbio, which produces the error. I hope that's adequate, but please get back to me if you need more.

[...]
[2019/01/11 21:41:35 | FgBioMain | Info] Executing SortBam from fgbio version 0.7.0 as nikos@core on JRE 1.8.0_121-b13 with snappy
[2019/01/11 21:41:36 | FgBioMain | Info] SortBam failed. Elapsed time: 0.14 minutes.
Exception in thread "main" htsjdk.samtools.SAMFormatException: Error parsing text SAM file. Tag of type f should have single-precision floating point value; File /dev/stdin; Line 87038
Line: NB501229:189:HCHKVAFXY:2:11105:26796:6432	163	18	68537384	60	150M	=	68537661	427	GTTTTTATACTGTATCATTTAGGGAATAATNATGNNNNNAAGNNNATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNANNNNNNANNNNAGAACCCACAGGTAAAAATGACCAACTGTATTTGC	AAAAAEEEEEEEEEEEEEEEEEEEEEEEEE#EEE#####EEE###EE########################################################E######/####EAAAE6<AEAAAAEEEA<AA/<<AAE<AEEA<EE/	NM:i:75	ms:i:75	AS:i:225	nn:i:75	tp:A:P	cm:i:6	s1:i:178	s2:i:0	de:f:-inf	RX:Z:AATGGCAGTA
	at htsjdk.samtools.SAMLineParser.reportErrorParsingLine(SAMLineParser.java:468)
	at htsjdk.samtools.SAMLineParser.parseTag(SAMLineParser.java:423)
	at htsjdk.samtools.SAMLineParser.parseLine(SAMLineParser.java:346)
	at htsjdk.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:268)
	at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:255)
	at htsjdk.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:228)
	at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:576)
	at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:548)
[...]
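
To locate records like the one in this log before they reach a strict parser, something along these lines may help. It is only a minimal sketch, not part of minimap2 or fgbio: the function name `nonfinite_float_tags` and the read name `read1` are made up for illustration, and the example record is abbreviated (SEQ and QUAL replaced by `*`). It relies on the SAM layout of 11 mandatory columns followed by optional `TAG:TYPE:VALUE` fields.

```python
import math

def nonfinite_float_tags(sam_line):
    """Return (tag, value) pairs for type-f optional fields that are not finite numbers."""
    fields = sam_line.rstrip("\n").split("\t")
    offending = []
    # The first 11 columns of a SAM record are mandatory; optional fields follow.
    for field in fields[11:]:
        tag, typ, value = field.split(":", 2)
        if typ != "f":
            continue
        try:
            number = float(value)  # Python's float() accepts "-inf", "inf" and "nan"
        except ValueError:
            offending.append((tag, value))
            continue
        if not math.isfinite(number):
            offending.append((tag, value))
    return offending

# Hypothetical, abbreviated record modelled on the failing line in the log.
record = (
    "read1\t163\t18\t68537384\t60\t150M\t=\t68537661\t427\t*\t*\t"
    "NM:i:75\tAS:i:225\tde:f:-inf\tRX:Z:AATGGCAGTA"
)
print(nonfinite_float_tags(record))
```

Note that Python's own `float()` parses `-inf` without complaint, so the explicit `math.isfinite` check is what flags the tag here.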

@lh3 lh3 closed this as completed in 48e230f Jan 12, 2019
@lh3
Owner

lh3 commented Jan 12, 2019

Thanks for the example. I have fixed this bug. Nonetheless, fgbio is not doing the right thing, either. "-inf" is a valid floating point number.

@nikosdarzentas
Author

Great, many thanks! And should I tell them? It'll hurt their feelings... :-)

@jmarshall
Contributor

-inf is not a valid floating point number in SAM: someone named lh3 wrote in the specification that type f tag values must match the regular expression [-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?.

There is something to be said for relaxing this to allow for infinities and NaNs. If you would like to see this happen, please raise it as an issue in hts-specs.
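
The regular expression quoted above can be tried directly against a few values. This is a rough sketch rather than how htsjdk or any other SAM library actually validates tags; the names `SAM_FLOAT` and `is_spec_float` are invented for the example.

```python
import re

# Pattern for type-f tag values, copied from the comment above.
SAM_FLOAT = re.compile(r"[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?")

def is_spec_float(value):
    """True if the whole string matches the pattern quoted from the SAM specification."""
    return SAM_FLOAT.fullmatch(value) is not None

for value in ["0.0133", "-1", "1e-5", "-inf", "nan"]:
    print(value, is_spec_float(value))
```

Under this pattern the ordinary decimal and exponent forms pass, while `-inf` and `nan` do not, which is the distinction being drawn between "valid floating point number" in general and valid in SAM.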
