Command-line tool for running netMHC with Apache Spark
git clone https://github.com/nmdp-bioinformatics/netMHC-spark
cd netMHC-spark
mvn package
netmhc-spark 1.0
Usage: spark-submit netmhc-spark-1.0-SNAPSHOT.jar [options]
-i, --input <value> input is the input path
-o, --output <value> output is the output path
-a, --alleles <value> alleles is the list of HLA alleles to use
-f, --format <value> format is the output format (default = parquet)
spark-submit --master yarn --deploy-mode client \
target/netmhc-spark-1.0-SNAPSHOT.jar \
--input src/test/resources/test_peptides.pep \
--alleles src/test/resources/allele_list.txt \
--output peptide_binding
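The invocation above can be wrapped in a small helper that checks the input files before submitting the job. This is only a sketch, not part of netMHC-spark: the function name run_netmhc_spark and the SPARK_MASTER and DRY_RUN environment variables are hypothetical, and the jar path assumes the default location produced by mvn package.

```shell
#!/bin/sh
# Sketch only: run_netmhc_spark, SPARK_MASTER and DRY_RUN are hypothetical
# names introduced for illustration; they are not provided by netMHC-spark.

# Assumed location of the jar after `mvn package`.
JAR="target/netmhc-spark-1.0-SNAPSHOT.jar"

run_netmhc_spark() {
  input="$1"
  alleles="$2"
  output="$3"

  # Fail early if the peptide file or the allele list is missing.
  for f in "$input" "$alleles"; do
    if [ ! -f "$f" ]; then
      echo "missing file: $f" >&2
      return 1
    fi
  done

  # Assemble the same command line as in the example above.
  set -- spark-submit --master "${SPARK_MASTER:-yarn}" --deploy-mode client \
    "$JAR" --input "$input" --alleles "$alleles" --output "$output"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "$*"
  else
    "$@"
  fi
}
```

With DRY_RUN=1 set, calling run_netmhc_spark src/test/resources/test_peptides.pep src/test/resources/allele_list.txt peptide_binding prints the assembled spark-submit command without running it, which is a cheap way to confirm the paths before a job is sent to the cluster.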
Massimo Andreatta, Morten Nielsen; Gapped sequence alignment using artificial neural networks: application to the MHC class I system, Bioinformatics, Volume 32, Issue 4, 15 February 2016, Pages 511–517, https://doi.org/10.1093/bioinformatics/btv639