Issues: mmcdermott/MEDS_transforms
We should be able to convert between different ontological code vocabularies.
#204
opened Oct 11, 2024 by
mmcdermott
add_time_derived_measurements breaks if you use _script in the meds_transform_runner
bug
Something isn't working
priority:medium
A medium priority issue.
Runner
For things about the multi-stage, single-script Runner
#202
opened Sep 3, 2024 by
Oufattole
All stages must have unique names or an error should be thrown.
#201
opened Sep 2, 2024 by
mmcdermott
Stages that depend on code metadata having been recently computed (e.g., filter_measurements) should be better documented
documentation
Improvements or additions to documentation
priority:high
A high priority issue.
Usability / Interface
#200
opened Sep 2, 2024 by
mmcdermott
Should distribute / package typing information too
priority:low
A low priority issue.
#195
opened Aug 30, 2024 by
mmcdermott
Lock files should be pipeline ID specific in some way -- this will enable pipelines to flag when old run locks are present.
Computational Performance
Issues relating to efficient computational performance of MEDS_transforms pipelines
Pipeline Configuration and Stage Management
Issues relating to proper definition and usability of different stages in a pipeline
priority:medium
A medium priority issue.
#194
opened Aug 30, 2024 by
mmcdermott
Metadata extraction should log a warning if code-part column names are not uniformly either extracted or not extracted across metadata sources.
documentation
Improvements or additions to documentation
Logging
MEDS-Extract
Metadata Extraction
priority:low
A low priority issue.
#186
opened Aug 28, 2024 by
mmcdermott
Should pull the generic hydra resolvers (e.g., get_script_docstring) into a separate package
Code Cleanliness
For code style, cleanliness, reduction of technical debt, etc.
priority:low
A low priority issue.
#180
opened Aug 27, 2024 by
mmcdermott
We need a more robust interface for ways of (a) processing numerical and categorical values and (b) normalizing output data in light of those modes.
Blocking External Tools
For issues actively blocking external tools, such as ACES, MEDS-torch, MEDS-tab, etc.
MEDS-Transform
Issues for the data pre-processing transformations in MEDS_transforms
Needs Clarification
This issue needs further clarification before it can be operationalized
New Transformation
Requests for a new transformation function that can be used in MEDS pipelines
priority:high
A high priority issue.
Release Blocking
#177
opened Aug 25, 2024 by
mmcdermott
1 of 3 tasks
Error message when aggregate_code_metadata.py gets an aggregation that should be an object but is just a string should be clearer.
Code aggregations
Issues about building new aggregations over codes and values.
MEDS-Transform
Issues for the data pre-processing transformations in MEDS_transforms
priority:medium
A medium priority issue.
Usability / Interface
#164
opened Aug 14, 2024 by
mmcdermott
Pipeline Configuration Improvements
MEDS-Extract
MEDS-Transform
Issues for the data pre-processing transformations in MEDS_transforms
Needs Clarification
This issue needs further clarification before it can be operationalized
Pipeline Configuration and Stage Management
Issues relating to proper definition and usability of different stages in a pipeline
priority:low
A low priority issue.
Usability / Interface
#155
opened Aug 13, 2024 by
mmcdermott
2 tasks
reshard_to_split should (in a configurable manner) sub-shard the input rather than re-shard the input where possible.
Computational Performance
#153
opened Aug 13, 2024 by
mmcdermott
The dropping of nulls and making the dataframe unique could be done once and shared across all time dependent fntrs.
Code Cleanliness
For code style, cleanliness, reduction of technical debt, etc.
MEDS-Transform
Issues for the data pre-processing transformations in MEDS_transforms
priority:low
A low priority issue.
#152
opened Aug 13, 2024 by
mmcdermott
We need to be able to support joining on metadata based on partial code matches (e.g., no valueuom).
Blocking External Tools
For issues actively blocking external tools, such as ACES, MEDS-torch, MEDS-tab, etc.
MEDS-Extract
Metadata Extraction
Needs Clarification
This issue needs further clarification before it can be operationalized
priority:low
A low priority issue.
#148
opened Aug 12, 2024 by
mmcdermott
reshard stage code is very messy and really stretches the limits of this "MR" library's API.
Code Cleanliness
For code style, cleanliness, reduction of technical debt, etc.
priority:low
A low priority issue.
#145
opened Aug 11, 2024 by
mmcdermott
hydra_loguru_init only captures a portion of the logging that happens in the code.
bug
#142
opened Aug 11, 2024 by
mmcdermott
Logging strings should indicate what worker they belong to.
Logging
priority:medium
A medium priority issue.
#141
opened Aug 11, 2024 by
mmcdermott
Logging may be misconfigured for importing this package as a library.
Logging
Needs Clarification
This issue needs further clarification before it can be operationalized
priority:low
A low priority issue.
#140
opened Aug 11, 2024 by
mmcdermott
Make it such that external_splits specification can point to a patient_splits.parquet file or a prior splits.json file from MEDS-extract to match the cohort.
documentation
Improvements or additions to documentation
MEDS Formal Compatability
For efforts to ensure formal compatibility with the MEDS schema
MEDS-Extract
priority:medium
A medium priority issue.
Usability / Interface
#130
opened Aug 8, 2024 by
mmcdermott
3 tasks
Pipeline should throw a warning if there are deprecated column names and not current column names in event config
documentation
Improvements or additions to documentation
MEDS Formal Compatability
For efforts to ensure formal compatibility with the MEDS schema
MEDS-Extract
priority:low
A low priority issue.
Usability / Interface
#126
opened Aug 8, 2024 by
mmcdermott
Tokenization & Tensorization Updates
MEDS-Transform
Issues for the data pre-processing transformations in MEDS_transforms
priority:critical
A critical priority issue that should be solved and pushed to a new minor version release ASAP.
Release Blocking
Tokenization & Tensorization
For the process of taking a pre-processed MEDS dataset and converting it into DL friendly forms.
Usability / Interface
Add transformation for injecting time-interval codes based on config specification
MEDS-Transform
Issues for the data pre-processing transformations in MEDS_transforms
New Transformation
Requests for a new transformation function that can be used in MEDS pipelines
priority:high
A high priority issue.
Release Blocking
#120
opened Aug 5, 2024 by
prenc
Ensure that documentation specifies that all stages that rely on metadata/codes.parquet having all codes should run an explicit aggregation first.
documentation
Improvements or additions to documentation
Pipeline Configuration and Stage Management
Issues relating to proper definition and usability of different stages in a pipeline
priority:high
A high priority issue.
Release Blocking
Usability / Interface
#118
opened Aug 5, 2024 by
mmcdermott
Decide how to handle stages that require metadata to contain all codes in the (train set) of the dataset.
Code Cleanliness
For code style, cleanliness, reduction of technical debt, etc.
MEDS-Transform
Issues for the data pre-processing transformations in MEDS_transforms
Needs Clarification
This issue needs further clarification before it can be operationalized
priority:high
A high priority issue.
question
Further information is requested
#117
opened Aug 5, 2024 by
mmcdermott