I lied to PATRIC & it liked it #1962
Comments
...or just call you a liar next time! ;)
This is a problem across the board. We have had a number of problems where uploaded data does not match the declared type. I have working fastq and fasta validators that we can bolt into place. One issue, however, is that "reads" does not mean fastq; it can mean bam as well:
There is a distinction between the file format and the file type as we use it.
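A minimal Python sketch of the kind of content check described above. It is not the validators mentioned in this comment; the function name `sniff_format` and the 4096-byte window are assumptions made for illustration. It guesses the format from the leading bytes only, and handles gzip-compressed input (including BAM, which is stored in gzip-compatible blocks) by decompressing just the start of the data.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of any gzip stream
BAM_MAGIC = b"BAM\x01"     # first four bytes of a decompressed BAM file
WINDOW = 4096              # assumption: how many bytes to inspect


def sniff_format(data: bytes) -> str:
    """Guess 'fasta', 'fastq', 'bam', or 'unknown' from leading bytes."""
    compressed = data[:2] == GZIP_MAGIC
    if compressed:
        try:
            # wbits=31 tells zlib to expect a gzip header; max_length
            # limits output, so truncated input is not an error here.
            head = zlib.decompressobj(wbits=31).decompress(data, WINDOW)
        except zlib.error:
            return "unknown"
    else:
        head = data[:WINDOW]

    if compressed and head.startswith(BAM_MAGIC):
        return "bam"

    text = head.lstrip()
    if text.startswith(b">"):
        return "fasta"
    if text.startswith(b"@"):
        # A fastq record is four lines: @id, sequence, +, qualities.
        lines = text.split(b"\n")
        if len(lines) >= 3 and lines[2].startswith(b"+"):
            return "fastq"
    return "unknown"


# Example inputs built in memory for illustration.
fastq_record = b"@r1\nACGT\n+\nIIII\n"
gzipped_fastq = gzip.compress(fastq_record)
```

This only inspects the first record, so it separates "declared contigs but actually fastq" from a real contigs file; it does not validate the whole upload.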
Not sure whether there is a short-term action item here, or whether we bite the bullet, fix the file type / file format checking, and make it consistently available across all services.
There is no good quick hack to fix this.
Regardless of what we do to clean up the filetype space, we will still have filetypes; apps will still be authorized to work with certain filetypes; this one should be authorized to work with fastq since it supports it. If you want to clean up filetypes, that is a separate issue/ticket.
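To make the idea of per-app authorization concrete, here is a small Python sketch. The app key `similar_genome_finder`, the table `ALLOWED_INPUT_TYPES`, and the helper `is_authorized` are hypothetical names chosen for illustration; they are not taken from the PATRIC code base. The table shows the state this ticket asks for, with "reads" added next to "contigs".

```python
# Hypothetical table: which declared file types each app accepts.
ALLOWED_INPUT_TYPES = {
    # "reads" is the addition requested in this ticket.
    "similar_genome_finder": {"contigs", "reads"},
}


def is_authorized(app: str, file_type: str) -> bool:
    """Return True if the app accepts files of the declared type."""
    return file_type in ALLOWED_INPUT_TYPES.get(app, set())
```

With such a table, allowing fastq for this service is a one-line change and stays independent of any later cleanup of the filetype space.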
Hmm, I think I need to walk that back. I forgot BAM; that is my bad. I will try to think about this.
After looking at this further, it looks like BAM is its own type right now, so that isn't exactly an issue for this service. I will try to create a proposed solution for this in a couple of slides so we can talk about it next week.
Since BAM is not currently a "reads" type.
It looks like fastq only works when the file is small, and compressed fastq files don't seem to work even though MASH supports them. The invocation of MASH may need to be reworked for fastq:
How does it fail?
@olsonanl Scratch that. It succeeded. We just were not setting the MASH distance threshold permissively enough. This seems common enough (along with the need to search all public genomes) that I'm wondering whether we should be hiding the "Advanced" parameters here by default. They don't seem very "Advanced", and most people would want to know that they are searching only "Reference & Representative" without having to dig.
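For reference, a hedged Python sketch of how a `mash dist` command line could be assembled with an explicit distance threshold and read-set handling. This is not the service's actual invocation, which is not shown in this thread. The helper name `build_mash_dist_cmd` and its parameters are made up for illustration, and the flags (`-d` for the maximum distance to report, `-r` to treat the input as a read set, `-m` for the minimum k-mer copy count) reflect my reading of the Mash help text and should be checked against `mash dist -h` for the installed version.

```python
def build_mash_dist_cmd(reference_sketch, query, *, max_distance,
                        query_is_reads=False, min_kmer_copies=2):
    """Build an argv list for `mash dist`; does not run anything."""
    # -d: maximum distance to report (assumed flag, see lead-in).
    cmd = ["mash", "dist", "-d", str(max_distance)]
    if query_is_reads:
        # -r: input is a read set; -m: minimum copies of each k-mer
        # needed to pass the noise filter (assumed flags).
        cmd += ["-r", "-m", str(min_kmer_copies)]
    cmd += [reference_sketch, query]
    return cmd


# Hypothetical file names, for illustration only.
example_cmd = build_mash_dist_cmd(
    "reference.msh", "sample.fastq", max_distance=0.5, query_is_reads=True
)
```

Building the list separately from running it makes the threshold visible in logs, which is the parameter that turned out to matter here.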
Minhash supports fastq files. I told PATRIC that my fastq file was a contigs file so I could submit it to the similar genome finder. It returned the correct result.
We should add type "reads" to allowable inputs in the similar genome finder.