The Microsoft Genomics service on Azure performs secondary analysis of genome sequencing data using a cloud implementation of the Burrows-Wheeler Aligner (BWA) and the Genome Analysis Toolkit (GATK). The pipeline accepts multiple FASTQ and BAM files as input and produces alignment and variant outputs. The msgen package provides an interface to the service from within R.
You can install the latest stable version from GitHub using the following command:
remotes::install_github("colbyford/msgen")
library(msgen)
submit_workflow(subscription_key = "04afabfc...",
                region = "eastus",
                process = "snapgatk",
                reference = "b37m1",
                description = "Submission from cford/msgen R package.",
                input_storage_account_name = "mygenomicsstorage",
                input_storage_account_key = "6GyBAbvgw5sqo2...",
                input_container_name = "myinputdata",
                blob_name_1 = "chr21_1.fq.gz",
                blob_name_2 = "chr21_2.fq.gz",
                output_container_name = "myoutputdata")
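The call above passes the subscription key and storage account key as string literals, which makes them easy to leak through version control. A minimal sketch of one alternative, assuming the secrets are exported as environment variables: the helper `require_env` and the variable names `MSGEN_SUBSCRIPTION_KEY` and `MSGEN_STORAGE_KEY` are hypothetical and not part of msgen.

```r
# Hypothetical helper (not part of msgen): read a required secret from an
# environment variable and fail with a clear message if it is unset or empty.
require_env <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable '", name, "' is not set.", call. = FALSE)
  }
  value
}

# Usage (variable names are assumptions; set them in ~/.Renviron or the shell):
# subscription_key <- require_env("MSGEN_SUBSCRIPTION_KEY")
# storage_key      <- require_env("MSGEN_STORAGE_KEY")
```

The resulting values can then be passed as `subscription_key` and `input_storage_account_key` in place of the literals.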
list_workflows(subscription_key = "04afabfc...",
               region = "eastus")
get_workflow_status(subscription_key = "04afabfc...",
                    region = "eastus",
                    workflow_id = "12g3c5a...")
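Because a workflow runs asynchronously, the status call usually needs to be repeated until the job finishes. The following is a rough sketch of a generic polling loop, not something msgen provides: `wait_until_done`, `fetch_status`, and `terminal` are hypothetical names, and the caller must supply both the function that extracts a status string and the list of statuses that count as finished, since neither is specified here.

```r
# Hypothetical polling helper (not part of msgen).
# fetch_status: zero-argument function returning the current status as a string.
# terminal:     character vector of statuses that end the wait (caller-supplied).
# interval:     seconds to sleep between polls.
# max_polls:    maximum number of polls before giving up.
wait_until_done <- function(fetch_status, terminal, interval = 60, max_polls = 120) {
  status <- NA_character_
  for (i in seq_len(max_polls)) {
    status <- fetch_status()
    if (status %in% terminal) {
      return(status)
    }
    Sys.sleep(interval)
  }
  stop("Gave up after ", max_polls, " polls; last status: ", status, call. = FALSE)
}
```

To use it with this package, `fetch_status` would wrap `get_workflow_status()` and pull the status field out of whatever that call returns; the shape of that return value is not described here, so the extraction step is left to the reader.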
cancel_workflow(subscription_key = "04afabfc...",
                region = "eastus",
                workflow_id = "12g3c5a...")
- Medium Blog Post on this Package
- msgen Python 2.7 command-line client
- Microsoft Genomics service on Azure
- Microsoft Genomics Documentation
This open-source R package is licensed under the Apache 2.0 License; see the LICENSE file for details.
Note: The Microsoft Genomics service, Azure, and the msgen Python command-line interface are all Copyright (c) Microsoft Corporation. All rights reserved.