Mahsa-Ehsanifard/cBioPortal-mutation-data

cBioPortal

cBio Cancer Genomics Portal

Online database

Gene mutation data for Cancer Genomics

https://www.cbioportal.org/

The cBioPortal for Cancer Genomics is a resource for interactive exploration of multidimensional cancer genomics data sets. The portal supports and stores non-synonymous mutations, DNA copy-number data, mRNA and microRNA expression data, protein-level and phosphoprotein level data (RPPA or mass spectrometry based), DNA methylation data, and de-identified clinical data.

R client

There are multiple ways to access the API using R.

One of the recommended R packages for accessing cBioPortal data is the cBioPortalData package.

cBioPortalData

cBioPortal/github

Bioconductor.cBioPortalData.package

Overview

The cBioPortalData R package accesses cancer datasets from the cBio Cancer Genomics Portal. The package provides cBioPortal datasets as MultiAssayExperiment objects in Bioconductor.

Thanks to waldronlab/cBioPortalData

MultiAssayExperiment

According to the Bioconductor MultiAssayExperiment documentation, MultiAssayExperiment harmonizes data management of multiple experimental assays performed on an overlapping set of specimens.

Installation

To install this package in R (version >= 4.3.0), use the BiocManager package:

if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("cBioPortalData")

Loading the package

library(cBioPortalData)

Getting information about the cBioPortal API:

The cBioPortal() function creates an API object. Printing it lists the available API operations, tags, and schemas used to query the studies and build MultiAssayExperiment representations.

cbio <- cBioPortal()
cbio
service: cBioPortal
tags(); use cbioportal$<tab completion>:
# A tibble: 65 x 3
   tag                 operation                            summary
   <chr>               <chr>                                <chr>  
 1 Cancer Types        getAllCancerTypesUsingGET            Get al~
 2 Cancer Types        getCancerTypeUsingGET                Get a ~
 3 Clinical Attributes fetchClinicalAttributesUsingPOST     Fetch ~
 4 Clinical Attributes getAllClinicalAttributesInStudyUsin~ Get al~
 5 Clinical Attributes getAllClinicalAttributesUsingGET     Get al~
 6 Clinical Attributes getClinicalAttributeInStudyUsingGET  Get sp~
 7 Clinical Data       fetchAllClinicalDataInStudyUsingPOST Fetch ~
 8 Clinical Data       fetchClinicalDataUsingPOST           Fetch ~
 9 Clinical Data       getAllClinicalDataInStudyUsingGET    Get al~
10 Clinical Data       getAllClinicalDataOfPatientInStudyU~ Get al~
# i 55 more rows
# i Use `print(n = ...)` to see more rows
tag values:
  Cancer Types, Clinical Attributes, Clinical Data, Copy
  Number Segments, Discrete Copy Number Alterations, Gene
  Panel Data, Gene Panels, Generic Assay Data, Generic
  Assays, Genes, Info, Molecular Data, Molecular Profiles,
  Mutations, Patients, Sample Lists, Samples, Server
  running status, Studies, Treatments
schemas():
  AlleleSpecificCopyNumber, AlterationFilter,
  AndedPatientTreatmentFilters,
  AndedSampleTreatmentFilters, CancerStudy
  # ... with 58 more elements

Retrieving the studies available through cbio as a table with full information about all API studies, including the study ID:

$Note$: Only the studies with permission=TRUE are represented.

study <- getStudies(cbio)
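
Once the study table is loaded, a study ID can be looked up by keyword. The following is a minimal sketch, assuming the table returned by getStudies() has `studyId` and `name` columns. `find_studies` is a hypothetical helper (not part of cBioPortalData), and the toy table with illustrative study names only stands in for the real one so the example runs offline.

```r
# Hypothetical helper: keep rows whose studyId or name matches a pattern.
find_studies <- function(studies, pattern) {
  hit <- grepl(pattern, studies$studyId, ignore.case = TRUE) |
    grepl(pattern, studies$name, ignore.case = TRUE)
  studies[hit, c("studyId", "name"), drop = FALSE]
}

# Toy table standing in for getStudies(cbio); the names are illustrative.
toy_studies <- data.frame(
  studyId = c("skcm_tcga", "acc_tcga"),
  name = c("Skin Cutaneous Melanoma (illustrative)",
           "Adrenocortical Carcinoma (illustrative)"),
  stringsAsFactors = FALSE
)

melanoma <- find_studies(toy_studies, "melanoma")
print(melanoma)

# With a live connection (assumption: same column names):
# find_studies(getStudies(cbio), "melanoma")
```

Matching on both columns is a convenience choice, since a disease name is often easier to remember than a study ID.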

Study ID selection

  • Choose a particular cancer study by its study ID, here a TCGA study (also available from the GDC portal). The sampleLists function returns the sample lists that cbio provides for the selected study. The SKCM TCGA study (skcm_tcga) is used as an example here.

  • The resulting table contains a sampleListId column alongside the study ID and description.

sample <- sampleLists(api = cbio, studyId = "skcm_tcga")
colnames(sample)
[1] "category"     "name"         "description"  "sampleListId"
[5] "studyId"  
table(sample$category)
#      all_cases_in_study 
#                                            1 
#                      all_cases_with_cna_data 
#                                            1 
#              all_cases_with_methylation_data 
#                                            2 
#              all_cases_with_mrna_rnaseq_data 
#                                            1 
# all_cases_with_mutation_and_cna_and_mrna_data 
#                                            1 
#         all_cases_with_mutation_and_cna_data 
#                                            1 
#                 all_cases_with_mutation_data 
#                                            1 
#                     all_cases_with_rppa_data

$Note$: The study ID has various sample categories, including categories with mutation data.
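
To go from a category to the sample list ID needed for a download, the table can be filtered on its `category` column. This is a rough sketch: `sample_list_ids` is a hypothetical helper, and the toy table uses placeholder IDs (not real cBioPortal identifiers) so that it runs without a connection. Only the column names and category labels come from the output above.

```r
# Hypothetical helper: return the sampleListId values for one category.
sample_list_ids <- function(sample_lists, category) {
  sample_lists$sampleListId[sample_lists$category == category]
}

# Toy table with the same column names as the sampleLists() result;
# the sampleListId values are placeholders for illustration only.
toy_lists <- data.frame(
  category = c("all_cases_in_study", "all_cases_with_mutation_data"),
  sampleListId = c("example_all", "example_sequenced"),
  stringsAsFactors = FALSE
)

mutation_ids <- sample_list_ids(toy_lists, "all_cases_with_mutation_data")
print(mutation_ids)

# With the real table from above:
# sample_list_ids(sample, "all_cases_with_mutation_data")
```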

Downloading a particular study

The cBioPortalData function allows users to download sections of the data by combining molecular profiles and gene panels within a study.

SKCM <- cBioPortalData(api = cbio, studyId = "skcm_tcga", by = "hugoGeneSymbol",
                       molecularProfileIds = c("skcm_tcga_mutations"),
                       sampleListId = "skcm_tcga_3way_complete",
                       genePanelId = "IMPACT341")
SKCM
#> A MultiAssayExperiment object of 2 listed
#> experiments with user-defined names and respective classes.
#> Containing an ExperimentList class object of length 2:
#> [1] skcm_tcga_mutations: RangedSummarizedExperiment with 3798 rows and 283 columns
#> [2] skcm_tcga_rna_seq_v2_mrna: SummarizedExperiment with 341 rows and 287 columns
#> Functionality:
#> experiments() - obtain the ExperimentList instance
#> colData() - the primary/phenotype DataFrame
#> sampleMap() - the sample coordination DataFrame
#> `$`, `[`, `[[` - extract colData columns, subset, or experiment
#> *Format() - convert into a long or wide DataFrame
#> assays() - convert ExperimentList to a SimpleList of matrices
#> exportClass() - save data to flat files
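
The molecular profile IDs passed to cBioPortalData can be looked up rather than typed from memory. The sketch below assumes that the table returned by molecularProfiles() has `molecularAlterationType` and `molecularProfileId` columns and that mutation profiles are labelled "MUTATION_EXTENDED". Both are assumptions about the API response, so check them against the actual table. `profile_ids` is a hypothetical helper, and the toy table reuses the two profile IDs shown in the output above.

```r
# Hypothetical helper: return the profile IDs of one alteration type.
profile_ids <- function(profiles, alteration_type) {
  profiles$molecularProfileId[
    profiles$molecularAlterationType == alteration_type
  ]
}

# Toy table standing in for molecularProfiles(cbio, studyId = "skcm_tcga").
# The alteration type labels are assumed, not taken from the source.
toy_profiles <- data.frame(
  molecularAlterationType = c("MUTATION_EXTENDED", "MRNA_EXPRESSION"),
  molecularProfileId = c("skcm_tcga_mutations", "skcm_tcga_rna_seq_v2_mrna"),
  stringsAsFactors = FALSE
)

mut_profiles <- profile_ids(toy_profiles, "MUTATION_EXTENDED")
print(mut_profiles)

# With a live connection (assumption: same column names and labels):
# profile_ids(molecularProfiles(cbio, studyId = "skcm_tcga"), "MUTATION_EXTENDED")
```

After a download, an individual assay can be pulled out of the MultiAssayExperiment with the `[[` operator listed above, for example SKCM[["skcm_tcga_mutations"]].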

About

Mutation data of particular genes in cancers using cBioPortal and TCGA
