VarSome API Python client rewrite #7

Merged
merged 34 commits into from
Jan 15, 2018

Conversation

ckopanos
Member

Includes a new VCF annotator object
Several code fixes
Unit tests
Updated documentation

README.md Outdated

Without an API key you will not be able to perform batch requests as well.
The script should complete without errors and display aprox 6,700 lines of data from dann, dbnsfp, ensemble_transcripts, gerp, gnomad_exomes, gnomad_exomes_coverage, icgc_somatic, ncbi_clinvar2, pub_med_articles, refseq_transcripts, sanger_cosmic_public, uniprot_variants, wustl_civic etc.

It's ensembl_transcripts

Member Author

ok done



def annotate_variant(argv):
    parser = argparse.ArgumentParser(description='Sample Variant API calls')

description='Sample Varsome API calls'

Member Author

done

rsid = fields.ListField(items_types=(int,), help_text="RS ID")


class ClinVarDisease(models.Base):

ClinVar and ClinVarDisease are not needed anymore; we only have them for old annotations for the portal, right?

Member Author

Yes, correct.

sanger_cosmic = fields.ListField(required=False, items_types=(Cosmic,), help_text="Sanger Cosmic")
sanger_cosmic_public = fields.ListField(required=False, items_types=(CosmicPublic,), help_text="Cosmic")
sanger_cosmic_licensed = fields.ListField(required=False, items_types=(CosmicLicensed,), help_text="Cosmic")
ncbi_clinvar = fields.ListField(required=False, items_types=(ClinVar,), help_text="ClinVar")

ncbi_clinvar is not in the API anymore

Member Author

Well, the schema still returns them, and the available GET parameters do too. I will remove them from the JSON models. Maybe we should also patch the API and remove the parameters, but we should keep the serializers for the portal.

Member Author

Ok done, removed clinvar (1). I will prepare a patch for the api as well

README.md Outdated

varsome_api_annotate_vcf.py -g hg19 -k api_key -i input.vcf -o annotated_vcf.vcf -p add-all-data=1

Notice however that not all available annotations will be present in the annotated_vcf file. Only a subset
Collaborator

This is confusing. Why won't all annotations be present? Which ones will be there and which ones will be missing? How does the script "decide"? Is it random? Is it a specific set of annotations, the same each time?

README.md Outdated
# proceed with your code flow e.g.
print(e) # 404 (invalid reference genome)

### Example Usage
To view available request parameters (used in the params method parameter) refer to an example at [api.varsome.com](https://api.varsome.com)
Collaborator

If "params" is code, it should be `params` (wrapped in backticks). I can't tell here whether the method is parameter or params.

README.md Outdated
api = VarSomeAPIClient(api_key)
# fetch information about a variant into a dictionary
result = api.lookup('chr7-140453136-A-T', params={'add-source-databases': 'gnomad-exomes,refseq-transcripts'}, ref_genome='hg19')
annotated_variant = AnnotatedVariant(**result)
Collaborator

The ** here makes all subsequent lines bold. It looks like the code is not actually in a code block, so it is parsed as markdown.
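For readers unfamiliar with the `**result` idiom in the quoted README snippet: it unpacks a dictionary into keyword arguments. A minimal sketch follows, using a hypothetical stand-in class. `VariantSketch` and its field names are assumptions made for illustration and are not part of the client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariantSketch:
    """Hypothetical stand-in for a model built from an API response."""
    chromosome: str
    pos: int
    ref: str
    alt: str
    ref_genome: Optional[str] = None


# Hypothetical response dictionary; a real response would carry other keys.
result = {"chromosome": "chr7", "pos": 140453136, "ref": "A", "alt": "T"}

# **result expands to chromosome="chr7", pos=140453136, ref="A", alt="T".
variant = VariantSketch(**result)
```

Because `**` is also Markdown's bold marker, a snippet like this only renders correctly in a README when it sits inside a fenced code block.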

@@ -0,0 +1,112 @@
#!/usr/bin/env python
Collaborator

Shouldn't this (and all other shebang lines) be #!/usr/bin/env python3? env python will likely still resolve to python2 on many systems.

Member Author

Well, if you are on a system where python2 is still the default (which is none except Macs) and don't create a python3 virtual env, then yes, it will fail.
So leave it on python3. The setup.py script has a requirement for python3, but I guess it's ok to change that to python3.

README.md Outdated

for a list of available options
Please visit the [api documentation](https://api.varsome.com) to find out how to use the api and
what values does the api provide as a response to lookup requests
Collaborator

Change

what values does the api provide as a response to lookup requests

to

what values the api provides as a response to lookup requests

README.md Outdated
This client is still in beta but it is a good for playing around with the API.

### Python versions
Python version 3 is supported.
Contributor

Requires at least Python 3.5; you can download the latest version from www.python.org

Contributor

i.e., Python 3.4 doesn't work.
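One way to make the 3.5 minimum explicit at run time is a small version guard at the top of each script. This is a sketch, not something taken from the PR; the function names are hypothetical, and the minimum is the one stated in this thread.

```python
import sys

MIN_PYTHON = (3, 5)  # minimum version named in this review thread


def is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the interpreter is at least the minimum version."""
    return tuple(version_info[:2]) >= minimum


def require_supported_python():
    """Exit with a readable message on an unsupported interpreter."""
    if not is_supported():
        sys.exit("Python %d.%d or later is required." % MIN_PYTHON)
```

The message uses `%` formatting rather than an f-string so that the file still parses on the older interpreters it is meant to reject.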

README.md Outdated
within your code, or do
There are several ways to create a virtual environment, but you can refer to [pip installation](https://pip.pypa.io/en/stable/installing/) and
[virtualenv installation](https://virtualenv.pypa.io/en/stable/installation/) to first install these 2 tools if you don't
have them already installed via a package manager (Linux) or HomeBrew (MacOS), etc.
Contributor

Remember to use "sudo -H" when installing on Mac.

README.md Outdated
varsome_api_run.py -g hg19 -k api_key -i variants.txt -o annotations.txt -p add-all-data=1

The command above will read variants from `variants.txt` and dump the annotations to `annotations.txt`.
If you don't use the `-k` parameter, the script will do as many requests as there are variants in `variants.txt`,
Contributor

For any substantial number of variants you will need to register for an API key. You can try the software without the -k apiKey parameter but you will quickly bump into safeguard limits...

or something like that.
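For reference, the command line quoted above (`-g`, `-k`, `-i`, `-o`, `-p add-all-data=1`) could be parsed roughly as follows. This is a sketch under assumptions: the `dest` names, the optional `-k`, and the reading of `-p` as `key=value` request parameters are inferred from the README excerpt, and the real scripts may differ.

```python
import argparse


def parse_request_params(pairs):
    """Turn ['add-all-data=1', ...] into {'add-all-data': '1', ...}."""
    params = {}
    for pair in pairs or []:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError("expected key=value, got %r" % pair)
        params[key] = value
    return params


def build_parser():
    """Build a parser for the flags shown in the README excerpt."""
    parser = argparse.ArgumentParser(description="Sample Varsome API calls")
    # Assumption: the key is optional, matching "you can try without -k".
    parser.add_argument("-k", dest="api_key", default=None, help="API key")
    parser.add_argument("-g", dest="ref_genome", help="reference genome, e.g. hg19")
    parser.add_argument("-i", dest="input_file", help="input file")
    parser.add_argument("-o", dest="output_file", help="output file")
    # Assumption: -p takes zero or more key=value request parameters.
    parser.add_argument("-p", dest="params", nargs="*", default=[],
                        help="key=value request parameters")
    return parser
```

With this sketch, `build_parser().parse_args([...])` followed by `parse_request_params(args.params)` yields a dictionary that could be handed to the `params` argument of a lookup call.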

README.md Outdated
varsome_api_annotate_vcf.py -g hg19 -k api_key -i input.vcf -o annotated_vcf.vcf -p add-all-data=1

Notice, however, that not all available annotations will be present in the `annotated_vcf.vcf` file. Only a subset
of the returned annotations will be available when running this script. See the "Using the client in your code"
Contributor

What's missing? Why?

Member Author

I have added only a subset of annotations to the output.vcf. There are too many possible annotations to add them all, together with header info in the VCF file for each of them; that might take several more days with tests, changes, etc. Besides, you don't know what a user might need from all of these annotations, which is why the user is referred to the section on using the client, where they are instructed how to override the VCFAnnotator class with the annotations they want.
After all, the intended use is that end users take the Python class objects and develop an app of their own, rather than use the run.py and annotate_vcf scripts. Even if they want to do that, they can copy the code and use their overridden VCF annotator class.
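The override approach described above can be illustrated with plain subclassing. The sketch below does not use the client's real `VCFAnnotator` API, which is not shown in this thread; `AnnotatorSketch`, `annotate_record`, and every annotation key are hypothetical.

```python
class AnnotatorSketch:
    """Hypothetical base class; NOT the client's real VCFAnnotator API."""

    def annotate_record(self, info, annotation):
        """Default behaviour: copy only a small fixed subset of fields."""
        info = dict(info)  # work on a copy of the record's INFO fields
        if "gene" in annotation:  # hypothetical annotation key
            info["GENE"] = annotation["gene"]
        return info


class MyAnnotator(AnnotatorSketch):
    """Subclass that adds the extra field this user cares about."""

    def annotate_record(self, info, annotation):
        info = super().annotate_record(info, annotation)
        # Hypothetical key standing in for an allele frequency value.
        if "gnomad_exomes_af" in annotation:
            info["GNOMAD_AF"] = annotation["gnomad_exomes_af"]
        return info
```

The base class keeps its default subset, and each user decides which additional fields end up in the output by overriding a single method.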

README.md Outdated

You will also not be able to perform batch requests without an API key.

To obtain an API key please [contact us](mailto:support@saphetor.com)
Contributor

Move this to the top of this section.

README.md Outdated

### How to get an API key

You are generally not required to have an API key to use the API but, without one, the number of requests
Contributor

Change this to: "You can use the API without an API key, but performance and the number of queries will be limited in order to safeguard our platform's reliability."

setup.py Outdated
'Operating System :: OS Independent',
'License :: OSI Approved :: Apache License',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.2',
'Programming Language :: Python :: 3.3',
'Programming Language :: Python :: 3.4',
Contributor

You probably need to remove everything up to and including 3.4.
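A sketch of what the trimmed metadata might look like, assuming the 3.5 minimum mentioned elsewhere in this review. The values are illustrative rather than the ones actually merged; they would be passed to `setuptools.setup()` through its `python_requires` and `classifiers` keyword arguments.

```python
# Hypothetical metadata values, assuming a Python 3.5 minimum.
# Intended use: setup(..., python_requires=PYTHON_REQUIRES,
#                     classifiers=CLASSIFIERS)
PYTHON_REQUIRES = ">=3.5"

CLASSIFIERS = [
    "Operating System :: OS Independent",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3 :: Only",
    "Programming Language :: Python :: 3.5",
]
```

`python_requires` lets pip refuse to install the package on an older interpreter, whereas classifiers are only descriptive.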

@@ -0,0 +1,41 @@
#!/usr/bin/env python
Contributor

Many (all?) of these files are missing (c) 2018 Saphetor at the top.
We need to decide which license we are using for these.
And then include the appropriate header in all files.
Preferably something that means that:
1- We own the original code.
2- If somebody modifies the code they still have to include the copyright message.
3- Would be nice if they send bug-fixes back to us.
4- They can sell the code to 3rd-party users, but they must credit Saphetor.

Member Author

Not sure if we need all this info in all files. The repository contains a license file, which can cover all the code.

@RikMaxSpeed changed the title from "Rewritted python client" to "VarSome API Python client rewrite" on Jan 15, 2018
@RikMaxSpeed merged commit ff57060 into master on Jan 15, 2018
@ckopanos deleted the vcf-annotator branch on January 17, 2018 11:44
4 participants