Mapping to promoters for all genes on array #11

Open
katehoffshutta opened this issue Feb 17, 2025 · 0 comments
get_gene_level_methylation.r relies on the NDC function probeToMeanPromoterMethylation, which takes a list of genes of interest as input. This list must contain exactly one gene per line. The all_genes list built here: https://github.com/QuackenbushLab/tcga-data-nf/blob/main/bin/r/get_gene_level_methylation.r/#L51 includes some lines that hold multiple genes separated by semicolons, so probes will not map correctly to those genes.

To solve this, all_genes needs to be split into long form (one gene per line) before it is reduced to a unique gene list.
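
A minimal sketch of that split in base R, not the actual patch. The example vector is made up, and `genes_long` and `unique_genes` are hypothetical names; only `all_genes` comes from the script, and its real structure there may differ (for example, it may be a data frame column rather than a plain character vector).

```r
# Made-up example input: some entries hold several genes joined by semicolons.
all_genes <- c("TP53", "BRCA1;NBR2", "MYC", "NBR2; TP53")

# Split each entry on semicolons and flatten to one gene per element,
# trimming any whitespace left around the separators.
genes_long <- trimws(unlist(strsplit(all_genes, ";", fixed = TRUE)))

# Drop missing or empty entries before deduplicating.
genes_long <- genes_long[!is.na(genes_long) & nzchar(genes_long)]

# Only now reduce to the unique gene list (one gene per element).
unique_genes <- unique(genes_long)
```

Deduplicating after the split matters: a gene that appears both on its own and inside a semicolon-joined entry would otherwise be kept twice, once in each form.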

I believe this does not affect the current examples in the manuscript, which rely on the input TF list instead.

We should also add NDC documentation clarifying that the input gene list must contain one gene per line. I have added this to an outstanding issue in the NDC repo.

@katehoffshutta katehoffshutta self-assigned this Feb 17, 2025