Issues: mmcdermott/MEDS_transforms

Release 0.1 Tracker
#35 opened Jul 16, 2024 by mmcdermott
Open
Issues list

add_time_derived_measurements breaks if you use _script in the meds_transform_runner
Labels: bug (Something isn't working); priority:medium (A medium priority issue.); Runner (For things about the multi-stage, single-script Runner)
#202 opened Sep 3, 2024 by Oufattole
Lock files should be pipeline ID specific in some way -- this will enable pipelines to flag when old run locks are present.
Labels: Computational Performance (Issues relating to efficient computational performance of MEDS_transforms pipelines); Pipeline Configuration and Stage Management (Issues relating to proper definition and usability of different stages in a pipeline); priority:medium (A medium priority issue.)
#194 opened Aug 30, 2024 by mmcdermott
Should pull the generic hydra resolvers (e.g., get_script_docstring) into a separate package
Labels: Code Cleanliness (For code style, cleanliness, reduction of technical debt, etc.); priority:low (A low priority issue.)
#180 opened Aug 27, 2024 by mmcdermott
We need a more robust interface for ways of (a) processing numerical and categorical values and (b) normalizing output data in light of those modes.
Labels: Blocking External Tools (For issues actively blocking external tools, such as ACES, MEDS-torch, MEDS-tab, etc.); MEDS-Transform (Issues for the data pre-processing transformations in MEDS_transforms); Needs Clarification (This issue needs further clarification before it can be operationalized); New Transformation (Requests for a new transformation function that can be used in MEDS pipelines); priority:high (A high priority issue.); Release Blocking
#177 opened Aug 25, 2024 by mmcdermott
1 of 3 tasks
Error message when aggregate_code_metadata.py gets an aggregation that should be an object but is just a string should be clearer.
Labels: Code aggregations (Issues about building new aggregations over codes and values.); MEDS-Transform (Issues for the data pre-processing transformations in MEDS_transforms); priority:medium (A medium priority issue.); Usability / Interface
#164 opened Aug 14, 2024 by mmcdermott
Pipeline Configuration Improvements
Labels: MEDS-Extract; MEDS-Transform (Issues for the data pre-processing transformations in MEDS_transforms); Needs Clarification (This issue needs further clarification before it can be operationalized); Pipeline Configuration and Stage Management (Issues relating to proper definition and usability of different stages in a pipeline); priority:low (A low priority issue.); Usability / Interface
#155 opened Aug 13, 2024 by mmcdermott
2 tasks
reshard_to_split should (in a configurable manner) sub-shard the input rather than re-shard the input where possible.
Labels: Computational Performance (Issues relating to efficient computational performance of MEDS_transforms pipelines); MEDS-Transform (Issues for the data pre-processing transformations in MEDS_transforms); Needs Clarification (This issue needs further clarification before it can be operationalized); priority:medium (A medium priority issue.)
#153 opened Aug 13, 2024 by mmcdermott
The dropping of nulls and making the dataframe unique could be done once and shared across all time dependent fntrs.
Labels: Code Cleanliness (For code style, cleanliness, reduction of technical debt, etc.); MEDS-Transform (Issues for the data pre-processing transformations in MEDS_transforms); priority:low (A low priority issue.)
#152 opened Aug 13, 2024 by mmcdermott
We need to be able to support joining on metadata based on partial code matches (e.g., no valueuom).
Labels: Blocking External Tools (For issues actively blocking external tools, such as ACES, MEDS-torch, MEDS-tab, etc.); MEDS-Extract; Metadata Extraction; Needs Clarification (This issue needs further clarification before it can be operationalized); priority:low (A low priority issue.)
#148 opened Aug 12, 2024 by mmcdermott
reshard stage code is very messy and really stretches the limits of this "MR" library's API.
Labels: Code Cleanliness (For code style, cleanliness, reduction of technical debt, etc.); priority:low (A low priority issue.)
#145 opened Aug 11, 2024 by mmcdermott
Logging may be misconfigured for importing this package as a library.
Labels: Logging; Needs Clarification (This issue needs further clarification before it can be operationalized); priority:low (A low priority issue.)
#140 opened Aug 11, 2024 by mmcdermott
Pipeline should throw a warning if there are deprecated column names and not current column names in event config
Labels: documentation (Improvements or additions to documentation); MEDS Formal Compatability (For efforts to ensure formal compatibility with the MEDS schema); MEDS-Extract; priority:low (A low priority issue.); Usability / Interface
#126 opened Aug 8, 2024 by mmcdermott
Tokenization & Tensorization Updates
Labels: MEDS-Transform (Issues for the data pre-processing transformations in MEDS_transforms); priority:critical (A critical priority issue that should be solved and pushed to a new minor version release ASAP.); Release Blocking; Tokenization & Tensorization (For the process of taking a pre-processed MEDS dataset and converting it into DL friendly forms.); Usability / Interface
#122 opened Aug 6, 2024 by mmcdermott
1 of 2 tasks
Milestone: Release 0.1
Add transformation for injecting time-interval codes based on config specification
Labels: MEDS-Transform (Issues for the data pre-processing transformations in MEDS_transforms); New Transformation (Requests for a new transformation function that can be used in MEDS pipelines); priority:high (A high priority issue.); Release Blocking
#120 opened Aug 5, 2024 by prenc
Decide how to handle stages that require metadata to contain all codes in the (train set) of the dataset.
Labels: Code Cleanliness (For code style, cleanliness, reduction of technical debt, etc.); MEDS-Transform (Issues for the data pre-processing transformations in MEDS_transforms); Needs Clarification (This issue needs further clarification before it can be operationalized); priority:high (A high priority issue.); question (Further information is requested)
#117 opened Aug 5, 2024 by mmcdermott