We need to be able to support joining on metadata based on partial code matches (e.g., no valueuom). #148

Open
mmcdermott opened this issue Aug 12, 2024 · 2 comments
Labels
Blocking External Tools For issues actively blocking external tools, such as ACES, MEDS-torch, MEDS-tab, etc. MEDS-Extract Metadata Extraction Needs Clarification This issue needs further clarification before it can be operationalized priority:low A low priority issue.

Comments

@mmcdermott
Owner

# TODO: make this work even with missing valueuom

@mmcdermott mmcdermott added priority:high A high priority issue. MEDS-Extract Metadata Extraction Needs Clarification This issue needs further clarification before it can be operationalized Blocking External Tools For issues actively blocking external tools, such as ACES, MEDS-torch, MEDS-tab, etc. labels Aug 12, 2024
@mmcdermott
Owner Author

mmcdermott commented Aug 12, 2024

The solution here is to make it so that, in extract_code_metadata, if the metadata_config (e.g., https://github.com/mmcdermott/MEDS_transforms/blob/main/MIMIC-IV_Example/configs/event_configs.yaml#L217) sets a code part column to null, as in the example below:

meas_chartevents_main:
  description: ["omop_concept_name", "label"] # List of strings are columns to be collated
  itemid: "itemid (omop_source_code)"
  parent_codes: "{omop_vocabulary_id}/{omop_concept_code}"
  valueuom: null

then the system identifies, from the set of allowed codes, all codes that would match the code constructed for the surrounding event with valueuom set to null, and takes a cross-product join between the metadata rows and all matching codes.

This leaves it up to the user to identify, on a case-by-case basis, which parts of the code are subsidiary. It will make it trickier to work with a more expressive code parser language in the future, since this approach only works if we can deconstruct realized codes into code parts, but that's ok for now.
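As a rough illustration of the matching-and-join step described above, here is a minimal pure-Python sketch. It is not the `extract_code_metadata` implementation: the `"//"` separator, the helper names `code_matches` and `join_partial`, the row layout, and the sample codes and description are all assumptions made up for the example. A null (`None`) code part is treated as a wildcard, and each metadata row is joined against every allowed code that agrees on the remaining parts.

```python
from typing import Optional

CODE_SEP = "//"  # Assumption: realized codes join their parts with this separator.


def code_matches(code: str, pattern: list[Optional[str]]) -> bool:
    """Return True if `code` agrees with `pattern` on every non-null part."""
    parts = code.split(CODE_SEP)
    if len(parts) != len(pattern):
        return False
    # A None entry in the pattern acts as a wildcard for that code part.
    return all(want is None or want == got for want, got in zip(pattern, parts))


def join_partial(metadata_rows: list[dict], allowed_codes: list[str]) -> list[dict]:
    """Cross-product join: one output row per (metadata row, matching code) pair."""
    joined = []
    for row in metadata_rows:
        for code in allowed_codes:
            if code_matches(code, row["code_parts"]):
                joined.append({"code": code, **row["metadata"]})
    return joined


# Hypothetical data: the codes and the description are illustrative only.
allowed_codes = ["LAB//220045//bpm", "LAB//220045//UNK", "LAB//220050//mmHg"]
metadata_rows = [
    # valueuom is null in the metadata config, so the last part is a wildcard.
    {"code_parts": ["LAB", "220045", None], "metadata": {"description": "Heart Rate"}},
]
result = join_partial(metadata_rows, allowed_codes)
```

Here `result` contains one row for each of the two `LAB//220045//...` codes, both carrying the same description. Note that the sketch depends on being able to split a realized code back into its parts, which is exactly the limitation mentioned above. It also rejects codes with a different number of parts, so a realized code that omits the unit entirely would need extra handling.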

@mmcdermott mmcdermott added priority:low A low priority issue. and removed priority:high A high priority issue. labels Aug 29, 2024
@mmcdermott
Owner Author

This is not actively causing any issues given #156 has been resolved, so I've lowered the priority.
