Update scripts to accommodate re-organized pokay #142

Open
1 of 3 tasks
anwarMZ opened this issue Oct 6, 2023 · 1 comment
Assignees
Labels
enhancement New feature or request High-priotity

Comments

@anwarMZ
Member

anwarMZ commented Oct 6, 2023

Is your feature request related to a problem? Please describe.
Pokay has been updated to host other priority pathogens in addition to SARS-CoV-2 data. We need to update the script functional_annotation.py, which digests the Pokay literature and fills in the functional annotation template.

Currently, the script takes two arguments:

    import argparse

    parser = argparse.ArgumentParser(
        description='This script produces a TSV file from TXT files '
                    'in POKAY  '
                    'https://github.com/nodrogluap/pokay/tree/master'
                    '/data for annotating SARS-COV-2 mutations')
    parser.add_argument('--inputdir', type=str, default=None,
                        help='directory path for input files')
    parser.add_argument('--outputfile', type=str, default=None,
                        help='output file (.TSV) format')
 

Describe the solution you'd like
The following adjustments can help:

  • Add an argument to specify the accession ID, e.g., --accession NC_045512 for SARS-CoV-2 and --accession NC_063383 for mpox. If --accession is None (the default), this would require another CSV file with custom functions. When an accession is provided, the workflow will use the current versioned functional annotation.
  • We need a dictionary to connect protein products with genes and with the filename prefixes used in the Pokay literature.
  • Add a protein_product column to the functional annotation file to provide a direct link for the app.
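
The adjustments listed above could look roughly like the sketch below. It is only an illustration under stated assumptions, not the script's actual implementation. The helper names (`build_parser`, `annotation_source`, `product_for_file`), the `--customfile` flag, the idea that the prefix is the part of the filename before the first underscore, and the entries in `PREFIX_TO_PRODUCT` are all hypothetical.

```python
import argparse

# Hypothetical mapping: filename prefix -> (gene, protein_product).
# The two entries are illustrative; a real table would cover every
# prefix used in the Pokay literature files for each pathogen.
PREFIX_TO_PRODUCT = {
    "S": ("S", "surface glycoprotein"),
    "N": ("N", "nucleocapsid phosphoprotein"),
}


def build_parser():
    """Existing two arguments plus the proposed --accession option."""
    parser = argparse.ArgumentParser(
        description="Produce a TSV of functional annotations from Pokay TXT files")
    parser.add_argument("--inputdir", type=str, default=None,
                        help="directory path for input files")
    parser.add_argument("--outputfile", type=str, default=None,
                        help="output file (.TSV) format")
    parser.add_argument("--accession", type=str, default=None,
                        help="reference accession, e.g. NC_045512 or NC_063383")
    # Assumption: the custom-functions CSV is passed with its own flag.
    parser.add_argument("--customfile", type=str, default=None,
                        help="CSV of custom functions, used when --accession is not given")
    return parser


def annotation_source(args):
    """Pick the versioned annotation if an accession is given, else the custom CSV."""
    if args.accession is not None:
        return ("versioned", args.accession)
    if args.customfile is None:
        raise ValueError("either --accession or --customfile is required")
    return ("custom", args.customfile)


def product_for_file(filename):
    """Look up (gene, protein_product) from the filename prefix, or None if unknown."""
    prefix = filename.split("_", 1)[0]
    return PREFIX_TO_PRODUCT.get(prefix)
```

With a lookup like this, the value returned by `product_for_file` could fill the proposed protein_product column for every record parsed from a given literature file.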
@anwarMZ anwarMZ added enhancement New feature or request High-priotity labels Oct 6, 2023
@miseminger
Collaborator

Only read CDS_mature_peptides from the JSON for the protein_product
