Update mtp tables qc checks to include expression tpm tables #288

Merged
merged 30 commits into from
Dec 1, 2022

adilahiri

Purpose/implementation Section

What scientific question is your analysis addressing?

Update mtp tables qc checks to include expression tpm tables

What was your approach?

  1. Renamed the directory from mutation-frequencies-table-checks to mtp-tables-qc-checks.
  2. Added the V11 gene-wise and group-wise TPM files to this module's directory.
  3. Added the V10 gene-wise and group-wise TPM files to this module's directory.
  4. Created the script 02-tpm-tables-checks.Rmd, which adapts logic and code from the script 01-frequencies-tables-checks.Rmd. The results include:
    a) the number of samples in each cohort
    b) cancer groups represented in multiple cohorts
    c) a sorted subset of the top 50 records from a static cancer group (Neuroblastoma) that should not change
    d) changes in the columns with non-dynamic values that are common to both the gene-wise and group-wise TPM tables.
  5. Updated the bash script run_frequencies-tables-checks.sh to also run the new 02-tpm-tables-checks.Rmd.
  6. Updated the README.
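
For readers who want a concrete picture of checks (b) and (c), here is a minimal Python sketch. It is not the code in 02-tpm-tables-checks.Rmd (which is written in R and not shown here); the column names `gene`, `cancer_group`, `cohort`, and `tpm_mean`, the helper names, and the toy rows are all hypothetical.

```python
from collections import defaultdict

# Toy rows standing in for a TPM summary table. All column names and
# values are hypothetical and are not taken from the real tables.
rows = [
    {"gene": "A", "cancer_group": "Neuroblastoma", "cohort": "TARGET", "tpm_mean": 5.0},
    {"gene": "B", "cancer_group": "Neuroblastoma", "cohort": "TARGET", "tpm_mean": 9.0},
    {"gene": "A", "cancer_group": "Neuroblastoma", "cohort": "GMKF", "tpm_mean": 4.0},
    {"gene": "A", "cancer_group": "Ependymoma", "cohort": "PBTA", "tpm_mean": 2.0},
]


def groups_in_multiple_cohorts(rows):
    """Map each cancer group seen in more than one cohort to its sorted cohorts."""
    cohorts = defaultdict(set)
    for r in rows:
        cohorts[r["cancer_group"]].add(r["cohort"])
    return {g: sorted(c) for g, c in cohorts.items() if len(c) > 1}


def static_subset(rows, cancer_group, sort_keys, n=50):
    """Return the first n rows of one cancer group after a deterministic sort."""
    picked = [r for r in rows if r["cancer_group"] == cancer_group]
    picked.sort(key=lambda r: tuple(r[k] for k in sort_keys))
    return picked[:n]


multi = groups_in_multiple_cohorts(rows)
subset = static_subset(rows, "Neuroblastoma", ["gene", "cohort"], n=2)
```

Sorting on explicit key columns before taking the top records is what makes the subset comparable between releases; without a fixed sort order, the "top 50" could differ for reasons unrelated to the data.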

What GitHub issue does your pull request address?

Issue #446

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

Please review the results for the TPM tables:
long_n_tpm_mean_sd_quantile_gene_wise_zscore.xlsx

long_n_tpm_mean_sd_quantile_group_wise_zscore.xlsx

Please also closely review the code logic for finding the number of samples in each cohort, in lines 66 to 77 of the script 02-tpm-tables-checks.Rmd. Suggestions for improving the code logic, organization, and documentation are welcome as well.
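
As a reference point for that review, one plausible way to count samples per cohort is sketched below in Python. This is not the logic in lines 66 to 77 of the script, which is not reproduced here. It assumes, hypothetically, that every row carries `cohort`, `cancer_group`, and `n_samples` columns, and that `n_samples` is repeated on each gene row of a cohort and cancer group pair, so it has to be counted once per pair rather than once per row.

```python
from collections import defaultdict

# Hypothetical rows: column names and values are illustrative only.
rows = [
    {"gene": "A", "cohort": "TARGET", "cancer_group": "Neuroblastoma", "n_samples": 10},
    {"gene": "B", "cohort": "TARGET", "cancer_group": "Neuroblastoma", "n_samples": 10},
    {"gene": "A", "cohort": "TARGET", "cancer_group": "Osteosarcoma", "n_samples": 4},
    {"gene": "A", "cohort": "PBTA", "cancer_group": "Ependymoma", "n_samples": 7},
]


def samples_per_cohort(rows):
    """Sum n_samples per cohort, counting each (cohort, cancer_group) pair once."""
    per_pair = {}
    for r in rows:
        key = (r["cohort"], r["cancer_group"])
        # Under the stated assumption the count is identical on every gene row
        # of a pair; fail loudly if the table violates that assumption.
        if per_pair.setdefault(key, r["n_samples"]) != r["n_samples"]:
            raise ValueError(f"inconsistent n_samples for {key}")
    totals = defaultdict(int)
    for (cohort, _group), n in per_pair.items():
        totals[cohort] += n
    return dict(totals)


counts = samples_per_cohort(rows)
```

If the real tables allow `n_samples` to vary by gene within a pair, the consistency check above would need to be replaced, for example by taking the maximum per pair.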

Is there anything that you want to discuss further?

No

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes

Results

What types of results are included (e.g., table, figure)?

Table

What is your summary of the results?

For both the gene-wise and group-wise TPM tables, we generate a QC check table in Excel format in the results folder, and the script report is generated as an HTML file in the main directory of the module. For both the gene-wise and group-wise tables:

  1. The non-dynamic columns for the top 50 Neuroblastoma records remain consistent between V10 and V11; however, n_samples for each gene is lower in V11.
  2. In V11, the number of samples for GMKF and PBTA is lower than in V10, whereas the number of samples for TARGET is higher.
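
The kind of release-to-release comparison summarized above can be sketched in Python as follows. This is an illustration under assumptions, not the module's R code: the function name, the column names `gene`, `gene_id`, and `n_samples`, and the toy values are hypothetical, and the rows do not reproduce any real V10 or V11 data.

```python
def compare_static_columns(old_rows, new_rows, key_cols, static_cols):
    """Return (key, column, old, new) for every static value that changed."""
    old_by_key = {tuple(r[k] for k in key_cols): r for r in old_rows}
    diffs = []
    for r in new_rows:
        key = tuple(r[k] for k in key_cols)
        old = old_by_key.get(key)
        if old is None:
            continue  # record absent from the older release; not compared here
        for col in static_cols:
            if old[col] != r[col]:
                diffs.append((key, col, old[col], r[col]))
    return diffs


# Toy stand-ins for the same sorted subset taken from two releases.
old_rows = [
    {"gene": "A", "gene_id": "ID1", "n_samples": 12},
    {"gene": "B", "gene_id": "ID2", "n_samples": 12},
]
new_rows = [
    {"gene": "A", "gene_id": "ID1", "n_samples": 10},
    {"gene": "B", "gene_id": "ID2", "n_samples": 10},
]

# Only the static column is compared, so the n_samples change is not flagged.
diffs = compare_static_columns(old_rows, new_rows, ["gene"], ["gene_id"])
```

Keeping dynamic columns such as `n_samples` out of `static_cols` lets the check separate expected changes (sample counts moving between releases) from unexpected ones (identifiers or annotations changing).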

Reproducibility Checklist

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

Documentation Checklist

  • [X] This analysis module has a README and it is up to date.
  • This analysis is recorded in the table in analyses/README.md and the entry is up to date.
  • The analytical code is documented and contains comments.

@adilahiri adilahiri marked this pull request as ready for review November 11, 2022 21:42
@adilahiri adilahiri self-assigned this Nov 11, 2022
@ewafula ewafula self-requested a review November 14, 2022 20:01
@ewafula ewafula merged commit f9e9712 into dev Dec 1, 2022
@jharenza jharenza deleted the update-mtp-tables-qc-checks branch February 19, 2023 02:12
This pull request was closed.