v12 rnaseq expression summary stats #347

Merged
merged 4 commits into from
Apr 28, 2023


@ewafula commented Mar 25, 2023

Purpose/implementation Section

What scientific question is your analysis addressing?

Update rna-seq-expression-summary-stats for v12

What was your approach?

  • included TCGA RNA-Seq expression data
  • no longer using the R jsonlite package for JSON format conversion because of R memory allocation limits for large files
  • jsonlite errors out on the much larger v12 expression data matrices because R character strings are limited to 2^31-1 bytes
  • now using a bash script with command line utilities for JSON conversion
  • results tables are too large and are no longer uploaded to GitHub
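
The PR description does not show the conversion script itself, so the following is only a minimal sketch of how a streaming, line-by-line conversion can sidestep the single-string size limit described above. It assumes tab-separated input with a header row and emits JSON Lines (one object per row); the function name `tsv_to_jsonl`, the output layout, and the choice of `awk` are illustrative assumptions, not the module's actual implementation.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the PR): read a tab-separated table on
# stdin and write one JSON object per data row on stdout (JSON Lines).
# Assumptions: the first row is the header, every value is emitted as a JSON
# string, and fields contain no tabs, newlines, or other control characters.
tsv_to_jsonl() {
  awk -F'\t' '
    # Escape the two characters that would break a JSON string here.
    function esc(s) {
      gsub(/\\/, "&&", s)    # double each backslash (& is the matched text)
      gsub(/"/, "\\\"", s)   # put a backslash in front of each double quote
      return s
    }
    # Header row: remember the column names as JSON keys.
    NR == 1 {
      ncol = NF
      for (i = 1; i <= ncol; i++) key[i] = esc($i)
      next
    }
    # Data rows: build and print one object, so memory use stays per-row.
    {
      line = "{"
      sep = ""
      for (i = 1; i <= ncol; i++) {
        line = line sep "\"" key[i] "\":\"" esc($i) "\""
        sep = ","
      }
      print line "}"
    }
  '
}
```

A call such as `tsv_to_jsonl < input.tsv > output.jsonl` (file names hypothetical) processes one row at a time, so no step ever has to hold the whole table as one character string.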

What GitHub issue does your pull request address?

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

  • review code update to include TCGA RNA-Seq expression data
  • review the code update for JSON format conversion

Is there anything that you want to discuss further?

NA

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Results

What types of results are included (e.g., table, figure)?

tables

What is your summary of the results?

Regenerated the results in the module's results/ folder

Reproducibility Checklist

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

Documentation Checklist

  • This analysis module has a README and it is up to date.
  • This analysis is recorded in the table in analyses/README.md and the entry is up to date.
  • The analytical code is documented and contains comments.

Collaborator

@sangeetashukla sangeetashukla left a comment


I have reviewed the merge of the OPC and TCGA input files and the change to a bash script that creates the JSON files instead of the R script. I corroborated that R fails to handle the larger files when creating JSON, and I agree that this change is justified.


@zzgeng zzgeng left a comment


The code works fine. It includes RNA-seq data from TCGA, and the bash script generates the JSON files. Approving!

3 participants