release new GTDB databases for R07-RS207 #1941
Comments
the full genomic ones are available on farm at along with updated genbank databases. @luizirber, it is legit to make the |
It is the preferable default (instead of https://ipfs.io) per ipfs/ipfs-companion#939. But we can also point in docs to check https://ipfs.github.io/public-gateway-checker/ if |
note, also need to build/provide the taxonomy spreadsheets for both genbank and GTDB. |
new taxonomy spreadsheets built for GTDB! Paths on farm:
|
I love me some picklists!
running now, results will be in |
and I will confirm inclusion with:
|
databases built!
etc. |
☝️ https://greyhound.sourmash.bio/ is now running with |
Oh, and IPFS hashes: |
Here's how I'm contemplating building new database releases - https://github.com/sourmash-bio/database-releases/ The idea is we have a very small repo that contains just the Snakefile and config stuff for each release version, and then every time we do a release of databases we cut a new release here => Zenodo DOI, etc. I'll flesh that out more clearly but would love any hot takes you might have :) |
no past decision goes unpunished. the SIGH. |
Full databases (.zip, .sbt.zip, .lca.json.gz) now available for all GTDB:
and for just the genomic representatives:
Genbank .zip databases from end of March 2022 are here:
I'll work on collating the tax spreadsheets etc and putting them in a single canonical place on farm. (Still need to build tax spreadsheets for genbank.) |
I made a release on database-releases here, https://github.com/sourmash-bio/database-examples/releases/tag/v0.1 |
all (?) GTDB databases linked under still have to update, copy, and/or link in taxonomy DBs, among other things... |
random question: should I use the code in https://github.com/dib-lab/2018-ncbi-lineages to build new Genbank lineages, or is there a better procedure? No problem updating code etc etc if needed, was just wondering if somewhere in our collection of issues/PRs there is a new, improved genbank lineage construction script. |
I don't know of any new and improved methods |
🎶 hey, ho, away we go 🎶 (envision picture of dwarf heading off to code mines with a pickaxe) |
actually I like how I set myself up for success in that github repo with a Snakefile and everything. yay past me! |
...easy conversion over to assembly_summary files as inputs: https://github.com/ctb/2022-assembly-summary-to-lineages |
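For anyone following along, here is a minimal sketch of the first half of that conversion: pulling accession/taxid pairs out of an NCBI `assembly_summary` file. This is not the code from the repo linked above. The function name `read_assembly_summary` is hypothetical, the example row is made up, and the sketch assumes the usual tab-separated layout with a `#`-prefixed header line naming the `assembly_accession` and `taxid` columns.

```python
import io


def read_assembly_summary(handle):
    """Yield (accession, taxid) pairs from an assembly_summary-style file.

    Assumes tab-separated rows and a '#'-prefixed header line that
    contains the column names (hypothetical helper, for illustration).
    """
    header = None
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Comment lines: keep the one that names the columns.
            fields = line.lstrip("#").strip().split("\t")
            if "assembly_accession" in fields:
                header = fields
            continue
        if header is None:
            raise ValueError("no header line with column names found")
        row = dict(zip(header, line.split("\t")))
        yield row["assembly_accession"], int(row["taxid"])


# Made-up two-line example, not real data.
example = (
    "#   See README for a description of the columns in this file.\n"
    "# assembly_accession\tbioproject\tbiosample\twgs_master\t"
    "refseq_category\ttaxid\tspecies_taxid\torganism_name\n"
    "GCA_000000001.1\tPRJNA1\tSAMN1\t\tna\t562\t562\tEscherichia coli\n"
)
pairs = list(read_assembly_summary(io.StringIO(example)))
print(pairs)
```

Turning each taxid into a full lineage still needs the NCBI taxonomy dump, which is not shown here.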
New genbank lineages file on farm:
|
and now in our google drive folder, https://drive.google.com/drive/folders/1Jk5z4fQtsyqyJWCcNmtn4WyE2jZsejrZ. I think it's time to update the docs, yah? |
I think rs207 needs to be added to the osf first? also genbank was plugged into here, https://osf.io/wxf9z/, which is labelled as sourmash GTDB databases |
they're all available via other links at this point - google drive and/or IPFS. |
per https://twitter.com/ace_gtdb/status/1512789050452692996