update descriptions and channels
ktmeaton committed May 6, 2020
1 parent d713d24 commit d1567af
Showing 2 changed files with 20 additions and 9 deletions.
23 changes: 16 additions & 7 deletions docs/process/process_data_download.rst
Expand Up @@ -4,37 +4,46 @@ Data Download
SQLite Import
------------------

Import assembly FTP url from database, also retrieve file names for web get.
Import assembly FTP url from database, retrieve file names for web get, prepare TSV input of SRA metadata for EAGER pipeline.

========================================= =========================== ===========================
Input Type Description
========================================= =========================== ===========================
ch_sqlite sqlite NCBImeta SQLite database from process ncbimeta_db_update or params.sqlite
ch_sqlite sqlite NCBImeta SQLite database from params.sqlite or process :ref:`ncbimeta_db_update<NCBImeta DB Update>`
========================================= =========================== ===========================

========================================= =========================== ===========================
Output Type Description
========================================= =========================== ===========================
ch_assembly_for_download_ftp url FTP url for process assembly_download.
ch_assembly_for_download_ftp text FTP url for process assembly_download :ref:`assembly_download<Assembly Download>`.
ch_sra_tsv_for_eager tsv TSV metadata input for process eager.
========================================= =========================== ===========================

========================================= =========================== ===========================
Publish Type Description
========================================= =========================== ===========================
file_assembly_for_download_ftp text List of FTP urls for genomic assembly download.
${params.eager_tsv} tsv TSV metadata input for EAGER pipeline.
========================================= =========================== ===========================

**Shell script**::

sqlite3 ${sqlite} ${params.sqlite_select_command} | grep . | head -n ${params.max_datasets} | sed 's/ /\\n/g' | while read line;
# Select the Genbank Assemblies
sqlite3 ${sqlite} ${params.sqlite_select_command_asm} | grep . | head -n ${params.max_datasets} | sed -E -e 's/ |;/\\n/g' | while read line;
do
if [[ ! -z \$line ]]; then
asm_url=\$line;
asm_fasta=`echo \$line | cut -d "/" -f 10 | awk -v suffix=${params.genbank_asm_gz_suffix} '{print \$0 suffix}'`;
asm_ftp=\${asm_url}/\${asm_fasta};
asm_ftp=`echo \$line | \
awk -F "/" -v suffix=${params.genbank_assembly_gz_suffix} '{print \$0 FS \$NF suffix}'`;
echo \$asm_ftp >> ${params.file_assembly_for_download_ftp}
fi;
done;
# Extract SRA Metadata for EAGER tsv
${params.scriptdir}/sqlite_EAGER_tsv.py \
--database ${sqlite} \
--query ${params.sqlite_select_command_sra} \
--organism ${params.eager_organism} \
--max-datasets ${params.max_datasets} \
--output ${params.eager_tsv}
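
As a rough illustration of the URL construction in the script above, the sketch below runs both the removed ``cut``-based form and the new ``awk`` one-liner on a single line. The FTP path and the suffix value are placeholder assumptions for demonstration only; the real values come from the SQLite query and from ``params.genbank_assembly_gz_suffix``, neither of which is shown in this commit.

```shell
#!/usr/bin/env bash
# Illustrative values only (assumptions, not taken from the pipeline config).
suffix="_genomic.fna.gz"
line="ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/009/065/GCA_000009065.1_ASM906v1"

# Removed form: take field 10 of the "/"-separated path, append the suffix,
# then join it back onto the directory URL.
asm_fasta=$(echo "$line" | cut -d "/" -f 10 | awk -v suffix="$suffix" '{print $0 suffix}')
old_ftp="${line}/${asm_fasta}"

# New form: split on "/" so that $NF is the last path component, then print
# the whole URL, the separator (FS), the last component again, and the suffix.
new_ftp=$(echo "$line" | awk -F "/" -v suffix="$suffix" '{print $0 FS $NF suffix}')

echo "$old_ftp"
echo "$new_ftp"
```

For this sample path the two forms yield the same string. The ``$NF`` version does not rely on the assembly directory sitting at exactly the tenth field, so it should also hold for URLs with a different path depth.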

------------

6 changes: 4 additions & 2 deletions pipeline.nf
Expand Up @@ -228,16 +228,18 @@ if( (params.sqlite || ( params.ncbimeta_update && params.ncbimeta_annot) ) && !p

process sqlite_import{
/*
Import assembly FTP url from database, also retrieve file names for web get.
Import assembly FTP url from database, retrieve file names for web get, prepare TSV input of SRA metadata for EAGER pipeline.
Input:
ch_sqlite (sqlite): NCBImeta SQLite database from process ncbimeta_db_update or params.sqlite
Output:
ch_assembly_for_download_ftp (url): FTP url for process assembly_download.
ch_assembly_for_download_ftp (text): FTP url for process assembly_download.
ch_sra_tsv_for_eager (tsv): TSV metadata input for process eager.
Publish:
file_assembly_for_download_ftp (text): List of FTP urls for genomic assembly download.
${params.eager_tsv} (tsv): TSV metadata input for EAGER pipeline.
*/
// Other variables and config
tag "$sqlite"
