Add BreastInvasiveCarcinoma dataset #7905
This dataset integrates the breast cancer (BRCA TCGA) dataset from cBioPortal (cbioportal.org) with a biological network from Pathway Commons (www.pathwaycommons.org), which supplies the node connections. It contains the gene features of each patient, and each patient's overall survival time (in months) serves as the label.
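To make the described layout concrete, here is a minimal sketch in plain Python of how per-patient gene features, a shared network, and survival labels could fit together. It is not the code from this PR: the `PatientGraph` class, the `build_patient_graphs` helper, and all toy values are hypothetical, and it assumes that each node is a gene and that every patient shares the same edge list.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class PatientGraph:
    """Hypothetical container for one patient (not the PR's actual class)."""
    x: List[List[float]]                     # one feature row per node (assumed: one node per gene)
    edge_index: Tuple[List[int], List[int]]  # (sources, targets), shared by all patients
    y: float                                 # overall survival time in months (the label)


def build_patient_graphs(
    features: Sequence[Sequence[float]],
    edges: Sequence[Tuple[int, int]],
    survival_months: Sequence[float],
) -> List[PatientGraph]:
    """Pair each patient's gene features with the shared network and a label."""
    if len(features) != len(survival_months):
        raise ValueError("need exactly one survival label per patient")
    sources = [s for s, _ in edges]
    targets = [t for _, t in edges]
    graphs = []
    for patient_features, months in zip(features, survival_months):
        graphs.append(
            PatientGraph(
                x=[[float(v)] for v in patient_features],  # scalar feature per gene
                edge_index=(sources, targets),
                y=float(months),
            )
        )
    return graphs


# Toy values for illustration only; they are not taken from the real dataset.
toy_features = [[0.1, 2.3, 1.7], [0.4, 1.9, 0.2]]  # 2 patients x 3 genes
toy_edges = [(0, 1), (1, 2)]                       # gene-gene connections
toy_survival = [34.5, 120.0]                       # months

graphs = build_patient_graphs(toy_features, toy_edges, toy_survival)
```

Under these assumptions, each patient becomes one graph-level regression sample: the node features vary per patient while the connectivity stays fixed.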
Codecov Report
@@ Coverage Diff @@
## master #7905 +/- ##
==========================================
- Coverage 90.25% 89.52% -0.73%
==========================================
Files 459 459
Lines 26954 26951 -3
==========================================
- Hits 24328 24129 -199
- Misses 2626 2822 +196
see 31 files with indirect coverage changes
Thank you for adding this dataset! I've just directly made some changes to your PR, but please have a look and feel free to revert them if necessary :)
r"""The breast cancer (BRCA TCGA) dataset from `cBioPortal
<https://www.cbioportal.org>`_ and the biological network for node
connections from `Pathway Commons <https://www.pathwaycommons.org>`_.
The dataset contains the gene features of each patient in graph_features
and the overall survival time (in months) of each patient,
which are the labels.
Could you check the docstring change? I've just removed some redundant sentences. Also, it'd be nice if you could describe what the nodes and edges represent, to help new users understand this dataset. (example: https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.datasets.KarateClub.html#torch_geometric.datasets.KarateClub)
Hi @akihironitta
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Matthias Fey <matthias.fey@tu-dortmund.de>