I lied to PATRIC & it liked it #1962

aswarren · 2018-04-19T19:45:13Z

Minhash supports fastq files. I told PATRIC that my fastq file was a contigs file so I could submit it to similar genome finder. It returned the correct result.

We should add type "reads" to allowable inputs in the similar genome finder.

mshukla1 · 2018-04-25T20:54:22Z

...or just call you a liar next time! ;)

olsonanl · 2018-05-16T21:15:51Z

This is a problem across the board. We have had a number of problems where uploaded data does not match the declared type.

I have working fastq and fasta validators that we can bolt into place. One issue, however, is that "reads" does not mean fastq; it can mean bam as well:

SPAdes takes as input paired-end reads, mate-pairs and single (unpaired) reads in FASTA and FASTQ. For IonTorrent data SPAdes also supports unpaired reads in unmapped BAM format (like the one produced by Torrent Server). However, in order to run read error correction, reads should be in FASTQ or BAM format. Sanger, Oxford Nanopore and PacBio CLR reads can be provided in both formats since SPAdes does not run error correction for these types of data.

There is a distinction between the file format and type of file as we are using it.
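
To illustrate the format-versus-type distinction, here is a minimal sketch of a content sniffer. It is not the validators mentioned above: `sniff_format` is a hypothetical helper that only looks at the first bytes of a file, on the assumption that FASTA starts with `>`, FASTQ starts with `@`, and BAM is a gzip-compatible (BGZF) stream whose decompressed data begins with `BAM\1`.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip/BGZF stream


def sniff_format(path):
    """Guess 'fasta', 'fastq', 'bam', or 'unknown' from the first bytes of a file.

    Hypothetical helper: a cheap hint, not a full validator.
    """
    with open(path, "rb") as fh:
        head = fh.read(2)
    # Transparently look inside gzip-compressed files (BAM is BGZF, a gzip variant).
    opener = gzip.open if head == GZIP_MAGIC else open
    with opener(path, "rb") as fh:
        start = fh.read(4)
    if start == b"BAM\x01":
        return "bam"
    if start[:1] == b">":
        return "fasta"
    if start[:1] == b"@":
        return "fastq"
    return "unknown"
```

A check like this would have flagged a fastq upload declared as contigs, but it says nothing about whether the rest of the file is well formed; a real validator would need to parse the records.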

mshukla1 · 2018-07-12T16:45:53Z

Not sure if there is any short-term action item here, or whether we bite the bullet, fix the file type / file format checking, and make it consistently available across all services.

olsonanl · 2018-07-12T16:54:30Z

There is no good quick-hack fix for this.

aswarren · 2018-07-12T18:53:45Z

Regardless of what we do to clean up the filetype space, we will still have filetypes; apps will still be authorized to work with certain filetypes; this one should be authorized to work with fastq since it supports it. If you want to clean up filetypes, that is a separate issue/ticket.

aswarren · 2018-07-12T19:09:03Z

Hmm I think I need to walk that back. I forgot BAM, that is my bad. I will try to think about this.

aswarren · 2018-07-12T20:40:19Z

After looking at this further it looks like BAM is its own type right now.
https://github.com/PATRIC3/Workspace/blob/master/typeslist.txt

So that isn't exactly an issue for this service.
More broadly, exploding types like "reads" and "contigs" into their constituent formats, and using those formats as our input types, makes sense to me.
I think if we did that then we could support the old "types" (reads, contigs) as we transition to no longer creating them.
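
A rough sketch of what that transition could look like, assuming a simple lookup table. The mapping below is illustrative only (the actual type list is in typeslist.txt), and `expand_type` and `app_accepts` are hypothetical names, not existing Workspace functions.

```python
# Hypothetical mapping from legacy "types" to the formats they may contain.
# BAM is left out because it is already its own type.
LEGACY_TYPES = {
    "reads": {"fastq", "fasta"},
    "contigs": {"fasta"},
}


def expand_type(declared):
    """Return the set of formats behind a declared type.

    A legacy type expands to its constituent formats; anything else is
    treated as a format name already and returned as a one-element set.
    """
    return set(LEGACY_TYPES.get(declared, {declared}))


def app_accepts(app_formats, declared):
    """True if any format behind the declared type is accepted by the app."""
    return bool(expand_type(declared) & set(app_formats))
```

With this, an app that lists `{"fasta", "fastq"}` keeps accepting files still labeled "reads" or "contigs" while new uploads are labeled by format. Accepting on any overlap is a lenient choice; a stricter policy would have to inspect the file contents.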

I will try to create a proposed solution for this in a couple of slides so we can talk about it next week.

aswarren · 2018-07-12T21:51:05Z

Since BAM is not currently a "reads" type, this is now enabled for Similar Genome Finder here:
PATRIC3/p3_web#773

aswarren · 2018-07-26T17:50:09Z

It looks like fastq only works when the file is small, and compressed fastq files don't seem to work even though MASH supports them. The invocation of MASH may need to be reworked for fastq:
marbl/Mash#32

olsonanl · 2018-07-26T18:05:46Z

How does it fail?

aswarren · 2018-07-26T18:40:23Z

@olsonanl Scratch that. It succeeded. We just were not setting the MASH distance threshold to be permissive enough. This seems to be common enough (along with the need to search all public genomes) that I'm wondering if we should be hiding the "Advanced" parameters here by default. They don't seem very "Advanced" and most people would want to know that they are searching only "Reference & Representative" without having to dig.
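
For reference, a minimal sketch of how the distance threshold enters a Mash command line. How the service actually invokes Mash is not shown in this thread, so the helper names, default values, and file names below are assumptions; the flags (`-r`, `-m`, `-o` for `mash sketch`, `-d` for `mash dist`) are taken from Mash's command-line help.

```python
def mash_sketch_cmd(reads_path, out_prefix, min_copies=2):
    """Build a `mash sketch` command for a read set (hypothetical helper).

    -r marks the input as reads, -m sets the minimum k-mer copy count used
    as a noise filter, and -o sets the output prefix (Mash appends .msh).
    """
    return ["mash", "sketch", "-r", "-m", str(min_copies),
            "-o", out_prefix, reads_path]


def mash_dist_cmd(reference_msh, query_msh, max_distance=1.0):
    """Build a `mash dist` command; -d is the maximum distance to report."""
    return ["mash", "dist", "-d", str(max_distance), reference_msh, query_msh]
```

The lists can be passed to `subprocess.run`. A small `max_distance` silently drops more distant hits, which from the user's side looks the same as a search that found nothing.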

mshukla1 assigned hyoo and olsonanl Jul 12, 2018

aswarren closed this as completed Jul 12, 2018

aswarren reopened this Jul 26, 2018

aswarren closed this as completed Jul 26, 2018

aswarren mentioned this issue Aug 16, 2018

Similar Genome Finder Advanced Parameters #2087

Closed
