Updated EFO code for Wilms tumor per EFO v45 in preparation for OPC v12 data and MTP compatibility #272
Hi @sangeetashukla and @ewafula , thanks for working to resolve this. I want to make some clarifications, since the current approach will not fix the primary issue that ChoP Wilms tumor evidence will not load into the MTP. I see two separate discussions here:
Hi @zdorman, thanks for the explanation. Question: given the findings with the DGD (CHOP P30 cohort), I think v11 won't be added to the MTP in any state, is that correct? We plan to fix this in v12, which will also include extensive updates of all current cohorts to GENCODE v39 and the addition of about 1000 more PBTA samples. Given this will be pushed to v12, we suggested your next release be around Feb/March, which will allow us time to integrate all of the new data and your team time to update the EFO/MONDO terms. Does that sound like a good plan?
@zdorman also, to add to this thread: we are not planning to update the backend db/API, which delivers the plots for MTP, with the v11 data release. We are planning to update it with the v12 release.
@zdorman, this PR is not for the update you are suggesting. This is a PR (not reviewed yet) for OPC v12 onwards.
Since the EFO change in this PR uses an obsolete code for Wilms tumor, this PR does not need to be merged. Refer to this ticket for other updates related to EFO code changes to MTP tables.
@ewafula & @sangeetashukla Thanks for the redirect to #433; I hadn't seen that one before commenting here. That ticket looks like it will address the requested changes to the Somatic Alterations files well. Thank you for blocking the merge of the obsolete term for future releases. One question: @chinwallaa mentioned that the backend db/API for the Gene Expression (TPM) plots will not be updated, but #433 mentions some changes to the long-tpm jsonl tables used for our QA. Will the long-tpm tables and the db/API remain consistent? Regardless of update status, we want the tables to be an accurate representation of the db.
@jharenza I see the disconnect now. We understand that the CHOP P30 DGD Panel-DNA is problematic, and will not release that on MTP before it is fixed in a future OPC release. We are hoping to get a "v11.1" (or any other naming scheme you prefer) containing our requested data fixes along with the other two panels (TARGET Panel-DNA & CHOP P30 DGD Panel-RNA-Fusion) for a near-term MTP release. We're currently evaluating the level of effort for updating MTP to the new OT build (which uses EFO v3.45.0), but I expect that it will be up by early 2023 when OPC v12 is ready.
@zdorman can you provide the filtered EFO v3.40.0 and EFO v3.45.0? You had indicated that OT does another layer of filtering based on the EFO version that it uses. We should sync up to the final filtered version that is being used in OT.
Re: the db/API for the minor v11 release: we are planning on providing the TPM files as requested. We can review the need and timelines for the API dev/prod for the v11 minor release (+ ~4-5 weeks) vs. the v12 release. The v11 minor release staged for MTP (tables/files) adds:
- PBTA data: 11 samples
- DGD fusion panel data: 870 samples
- TARGET DNA panel: 998 samples
- PBTA DNA panel: 2 samples
It will exclude the DGD DNA panels (929 samples).
The best way to ensure compatibility is to use the OT database directly, as previously suggested. Their versioned files are publicly accessible via FTP here: http://ftp.ebi.ac.uk/pub/databases/opentargets/platform/
OT 22.04 (current MTP using EFO v3.40.0): http://ftp.ebi.ac.uk/pub/databases/opentargets/platform/22.04/output/etl/json/diseases/
The folder contains multiple jsonl files (though labeled as json). Feel free to incorporate as best works for you, but one method of loading to a df using python pandas is:
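A minimal sketch of such a load, assuming the part files from the diseases folder have already been downloaded into a local directory. The `load_jsonl_dir` helper and the `diseases` directory name are hypothetical, not taken from the original comment:

```python
import glob
import os

import pandas as pd


def load_jsonl_dir(directory, pattern="*.json"):
    """Read every JSON Lines part file in `directory` into one DataFrame.

    Hypothetical helper: the files are assumed to hold one JSON object
    per line even though they carry a .json extension.
    """
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory!r}")
    # lines=True tells pandas to parse each line as a separate JSON record
    frames = [pd.read_json(path, lines=True) for path in paths]
    return pd.concat(frames, ignore_index=True)


# Example call, assuming the part files were saved under ./diseases:
# diseases = load_jsonl_dir("diseases")
```

Concatenating with `ignore_index=True` gives a single continuous row index across all part files, which keeps later lookups and joins simple.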
Purpose/implementation Section
The EFO code for cancer_group == 'Wilms tumor' needs to be updated with the latest value per EFO release version v45.
What scientific question is your analysis addressing?
The MTP uses a particular EFO release version, and the MTP tables that CHOP shares with FNL must be compatible with it. However, the EFO code for Wilms tumor in the v11 data that we shared is not currently compatible. With the expectation that the MTP will have updated to EFO release v45 by the time of the OPC v12 data release, this PR updates the 'Wilms tumor' EFO code to resolve the incompatibility.
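One way to spot this kind of incompatibility before tables are shared is to compare the codes in the mapping file against the disease IDs of the target EFO release. The sketch below assumes the mapping is a tab-separated file with a `cancer_group` column (named in this PR) and an `efo_code` column (an assumption about the file layout). The function name and the IDs in the example are made-up placeholders, not real EFO terms:

```python
import csv
import io


def find_incompatible_rows(mapping_file, valid_ids, code_column="efo_code"):
    """Return mapping rows whose code is absent from `valid_ids`.

    `code_column` is an assumed column name; adjust it to the real header.
    """
    reader = csv.DictReader(mapping_file, delimiter="\t")
    return [row for row in reader if row[code_column] not in valid_ids]


# Placeholder data only: neither the groups nor the codes are real.
example_tsv = io.StringIO(
    "cancer_group\tefo_code\n"
    "Group A\tEFO_PLACEHOLDER_1\n"
    "Group B\tEFO_PLACEHOLDER_2\n"
)
valid_ids = {"EFO_PLACEHOLDER_1"}

for row in find_incompatible_rows(example_tsv, valid_ids):
    print(row["cancer_group"], row["efo_code"])
```

In practice `valid_ids` would come from the disease table of the EFO-filtered release that the MTP actually uses, rather than from a hand-written set.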
What was your approach?
Manually update the EFO ID in the efo-mondo-mapping/results/efo-mondo-map.tsv file to EFO_1000056 as per EFO release v45.
What GitHub issue does your pull request address?
Issue 420
Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.
Which areas should receive a particularly close look?
Note: The module was re-run; I can confirm that QC passed and that no changes other than the EFO ID change are attached to this PR.
Is there anything that you want to discuss further?
No
Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?
Yes
Results
What types of results are included (e.g., table, figure)?
results/efo-mondo-map.tsv
Reproducibility Checklist
Documentation Checklist
This analysis module has a README and it is up to date.
This analysis is recorded in analyses/README.md and the entry is up to date.