Export PC alignments #494

Closed
rbeinart opened this issue Apr 10, 2017 · 2 comments

@rbeinart

Hello,
As suggested on the Google Group, I am posting here to request that a function be added to export PC alignments from the pangenomics analysis pipeline.

Thanks!
-Roxanne-

meren added a commit that referenced this issue Apr 10, 2017
`get_AA_sequences_for_PCs` will return aligned or unaligned sequences,
depending on whether the user has aligned their sequences or wants
unaligned sequences anyway.
@meren
Member

meren commented Apr 10, 2017

Hi Roxanne,

Thanks for this. anvi-summarize for pan genomes now reports aligned sequences in PCs (unless the --skip-alignments flag was used with anvi-pan-genome, in which case it simply reports unaligned sequences). This will be in the next stable release after some more testing.

Now that we have the functionality in the super class, I will also implement an anvi-export-pc-alignments program so that aligned sequences for individual PCs can be retrieved quickly without running a full summary.

Best,

meren closed this as completed in b0b55f7 Apr 11, 2017
@meren
Member

meren commented Apr 11, 2017

This is done. We now have a new program to do it outside of summaries:

$ anvi-export-pc-alignments -h
usage: anvi-export-pc-alignments [-h] [-p PAN_DB] [-g GENOMES_STORAGE]
                                 [-o FILE_PATH] [--pc-id PROTEIN_CLUSTER_ID]
                                 [--pc-ids-file FILE_PATH]
                                 [-C COLLECTION_NAME] [-b BIN_NAME]
                                 [--list-collections] [--list-bins]

Export aligned sequences from anvi'o pan genomes

optional arguments:
  -h, --help            show this help message and exit

INPUT FILES:
  Input files from the pangenome analysis.

  -p PAN_DB, --pan-db PAN_DB
                        Anvi'o pan database
  -g GENOMES_STORAGE, --genomes-storage GENOMES_STORAGE
                        Anvi'o genomes storage file

OUTPUT FILE:
  You get to chose an output file name to report things. The default will be
  an ugly name. So, be explicit.

  -o FILE_PATH, --output-file FILE_PATH
                        File path to store results.

SELECTION:
  Which protein clusters should be exported. You can ask for a single PC, or
  multiple ones listed in a file, or you can use a collection and bin name
  to list PCs of interest. If you give nothing, this program will export
  alignments for every single PC found in the profile database (and this is
  called 'customer service').

  --pc-id PROTEIN_CLUSTER_ID
                        Protein cluster ID you are interested in.
  --pc-ids-file FILE_PATH
                        Text file for protein clusters (each line should
                        contain be a unique protein cluster id).
  -C COLLECTION_NAME, --collection-name COLLECTION_NAME
                        Collection name.
  -b BIN_NAME, --bin-id BIN_NAME
                        Bin name you are interested in.

OTHER STUFF:
  Yes. Stuff that are not like the ones above.

  --list-collections    Show available collections and exit.
  --list-bins           List available bins in a collection and exit.
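For anyone scripting this, here is a minimal Python sketch of how the `--pc-ids-file` selection could be prepared and the command line assembled. It uses only flags that appear in the help output above. The file names (`PAN.db`, `GENOMES.h5`, `alignments.txt`), the example PC ids, and both helper functions are hypothetical and not part of anvi'o. The sketch only prints the command; it does not run it.

```python
import shlex
import tempfile
from pathlib import Path


def write_pc_ids_file(pc_ids, path):
    """Write one protein cluster id per line, dropping repeats.

    The help text asks for a unique id on each line, so duplicates are
    removed while the first-seen order is preserved.
    """
    unique_ids = list(dict.fromkeys(pc_ids))
    Path(path).write_text("\n".join(unique_ids) + "\n")
    return unique_ids


def build_export_command(pan_db, genomes_storage, output_file, pc_ids_file=None):
    """Assemble an argv list from flags listed in the help output.

    If pc_ids_file is None, no selection flag is added; per the help text
    the program then exports alignments for every PC.
    """
    cmd = [
        "anvi-export-pc-alignments",
        "-p", str(pan_db),
        "-g", str(genomes_storage),
        "-o", str(output_file),
    ]
    if pc_ids_file is not None:
        cmd += ["--pc-ids-file", str(pc_ids_file)]
    return cmd


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ids_path = Path(tmp) / "pc_ids.txt"
        # Hypothetical PC ids; real ids come from your own pan database.
        write_pc_ids_file(["PC_00000001", "PC_00000002", "PC_00000001"], ids_path)
        cmd = build_export_command("PAN.db", "GENOMES.h5", "alignments.txt", ids_path)
        # Print the command instead of running it, since anvi'o may not be installed.
        print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would execute it on a machine where anvi'o is installed and the input files exist.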
