sugarcaneMapperIDs

This app takes a DNA or protein sequence from sugarcane, maps it to a desired database (from sequences in Conekt Grasses), and returns the corresponding identifier used to compute expression profiles or co-expression networks.

Required software

mkdir sugarcaneMapperIDs
cd sugarcaneMapperIDs
python3 -m venv .venv
. .venv/bin/activate
pip install flask
pip install peewee
pip install biopython
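
After installing, it can help to confirm that the three packages are visible from inside the virtual environment. The following is a minimal sketch and not part of the repository; the helper name `check_dependencies` is hypothetical. Note that the `biopython` package is imported under the module name `Bio`.

```python
import importlib.util

# pip package name -> importable module name (biopython is imported as "Bio")
REQUIRED = {"flask": "flask", "peewee": "peewee", "biopython": "Bio"}


def check_dependencies(required=REQUIRED):
    """Return {pip_name: bool} telling whether each module can be located."""
    return {
        pip_name: importlib.util.find_spec(module) is not None
        for pip_name, module in required.items()
    }


if __name__ == "__main__":
    for pip_name, found in check_dependencies().items():
        print(f"{pip_name}: {'ok' if found else 'MISSING'}")
```

Run it with the virtual environment activated; any package reported as MISSING can be installed with the matching `pip install` line above.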

Installing DB schema

The schema is available as SQL and can be used to initialize the database as follows:

cat db/schema.sql | sqlite3 sugarcaneSequences.db
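
If the `sqlite3` command-line tool is not installed, the same step can be approximated with Python's standard `sqlite3` module. This is a sketch under that assumption, not code from the repository; the function names `init_db_from_sql` and `list_tables` are hypothetical, and the default paths simply mirror the command above.

```python
import sqlite3
from pathlib import Path


def init_db_from_sql(schema_path="db/schema.sql", db_path="sugarcaneSequences.db"):
    """Execute every statement in the schema file against the SQLite database."""
    schema_sql = Path(schema_path).read_text(encoding="utf-8")
    conn = sqlite3.connect(db_path)
    try:
        # executescript runs multiple ';'-separated statements in one call
        conn.executescript(schema_sql)
        conn.commit()
    finally:
        conn.close()


def list_tables(db_path="sugarcaneSequences.db"):
    """Return the names of the tables in the database, sorted alphabetically."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling `init_db_from_sql()` from the project root and then `list_tables()` gives a quick check that the tables defined in `db/schema.sql` were created.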

The db/model.py script can also create the schema, and running it is the preferred way to initialize the database:

python3 db/model.py

Populating DB

python3 utils/populate.py \
  --cds FILES_PANTRANSCRIPTOME/all_CDS_idsok.fasta \
  --proteins FILES_PANTRANSCRIPTOME/PanTranscriptome_2023.proteins \
  --transcripts FILES_PANTRANSCRIPTOME/all_transcripts_checked_with_cds.fasta \
  --orthogroup_members FILES_PANTRANSCRIPTOME/Orthogroups_panTABLE.tsv \
  --orthogroup_representative FILES_PANTRANSCRIPTOME/transcripts_of_longest_cds_per_OG.ids
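
The internals of `utils/populate.py` are not described here, so the following is only a rough illustration of what loading one FASTA file into SQLite can look like. Everything in it is an assumption: the `sequence` table, its columns, and the function names are hypothetical, and a small hand-written FASTA reader stands in for Biopython so the sketch has no third-party dependencies.

```python
import sqlite3


def read_fasta(handle):
    """Yield (identifier, sequence) pairs from an open FASTA file handle."""
    seq_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # The identifier is the first whitespace-delimited token after ">"
            header = line[1:].split()
            seq_id, chunks = (header[0] if header else ""), []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def load_fasta(conn, fasta_path, seq_type):
    """Insert all records of one FASTA file into a hypothetical `sequence` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequence ("
        "id TEXT, seq_type TEXT, seq TEXT, PRIMARY KEY (id, seq_type))"
    )
    with open(fasta_path, encoding="utf-8") as handle:
        rows = [(seq_id, seq_type, seq) for seq_id, seq in read_fasta(handle)]
    conn.executemany(
        "INSERT OR REPLACE INTO sequence (id, seq_type, seq) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)
```

For example, `load_fasta(sqlite3.connect("scratch.db"), "some.fasta", "cds")` would return the number of records stored; `scratch.db` and `some.fasta` are placeholder names, and the real database should be populated with the command above.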
