Skip to content

Upload of metagenome and metatranscriptome assemblies to ENA

License

Notifications You must be signed in to change notification settings

EBI-Metagenomics/assembly_uploader

Repository files navigation

Public ENA Assembly uploader

Upload of metagenome and metatranscriptome assemblies to the European Nucleotide Archive (ENA)

Pre-requisites:

  • CSV metadata file. One per study. See test/fixtures/test_metadata for an example
  • Compressed assembly fasta files in the locations defined in the metadata file

Set the following environmental variables with your webin details:

ENA_WEBIN

export ENA_WEBIN=Webin-0000

ENA_WEBIN_PASSWORD

export ENA_WEBIN_PASSWORD=password

Installation

Install the package:

pip install assembly_uploader

Register study and generate pre-upload files

If you already have a registered study accession for your assembly files skip to step 3.

Step 1

This step will generate a folder STUDY_upload and a project XML and submission XML within it:

study_xmls
  --study STUDY         raw reads study ID
  --library LIBRARY     metagenome or metatranscriptome
  --center CENTER       center for upload e.g. EMG
  --hold HOLD           hold date (private) if it should be different from the provided study in format dd-mm-yyyy. Will inherit the release date of the raw read study if not
                        provided.
  --tpa                 is the study a third party assembly. Default True
  --publication PUBLICATION
                        pubmed ID for connected publication if available

Step 2

This step submit the XML to ENA and generate a new assembly study accession. Keep note of the newly generated study accession:

submit_study
  --study STUDY         raw reads study ID
  --test                run test submission only

Step 3

This step will generate manifest files in the folder STUDY_UPLOAD for runs specified in the metadata file:

assembly_manifest
  --study STUDY         raw reads study ID
  --data DATA           metadata CSV - run_id, coverage, assembler, version, filepath
  --assembly_study ASSEMBLY_STUDY
                        pre-existing study ID to submit to if available. Must exist in the webin account
  --force               overwrite all existing manifests

Upload assemblies

Once manifest files are generated, it is necessary to use ENA's webin-cli resource to upload genomes.

To test your submission add the -test argument.

A live execution example within this repo is the following:

ena-webin-cli \
  -context=genome \
  -manifest=SRR12240187.manifest \
  -userName=$ENA_WEBIN \
  -password=$ENA_WEBIN_PASSWORD \
  -submit

More information on ENA's webin-cli can be found here.

About

Upload of metagenome and metatranscriptome assemblies to ENA

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages