Mark and I have run into an issue with some existing vocabulary IDs. For example, ICD03_Morphology is misspelled (a zero in place of the letter O) and should be ICDO3_Morphology, and SEER_SRC would be better named SEER_RPTSRC.
Some existing datasets use the old, "incorrect" vocabulary IDs, and existing algorithms use these same names and IDs.
Our current solution is to keep using these old vocabulary IDs in current and future ETLs in order to stay compatible, but this prevents us from fixing the issues or moving forward with improvements.
We have a few options:
1. Update old datasets to use the new vocabulary IDs, and update the algorithms to use the new IDs/names as well.
   - Old exports will not work in this situation if we ever need to recreate them.
2. Allow vocabulary operators to look for more than one vocabulary ID.
We actually support option 2 already through multiple_vocabularies.csv. We can probably just create vocabularies that override the existing vocabulary operator, so that an operator like "ICD03_Morphology" looks for both the "ICD03_Morphology" and "ICDO3_Morphology" vocabulary IDs. A similar "ICDO3_Morphology" operator could search both vocabularies as well.
Looks like this is supported in the above-mentioned CSV file, and this is what we should attempt to use first.
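To make the intended behavior concrete, here is a minimal Python sketch of an operator that matches more than one vocabulary ID. It is only an illustration of the idea: the `VOCABULARY_ALIASES` table, the `vocabulary_ids_for` and `select_rows` helpers, and the sample rows are all hypothetical, and the table is not the real layout of multiple_vocabularies.csv.

```python
# Hypothetical alias table: operator name -> vocabulary IDs it should match.
# This is NOT the actual multiple_vocabularies.csv format, just an illustration.
VOCABULARY_ALIASES = {
    "ICD03_Morphology": ["ICD03_Morphology", "ICDO3_Morphology"],
    "ICDO3_Morphology": ["ICD03_Morphology", "ICDO3_Morphology"],
    "SEER_SRC": ["SEER_SRC", "SEER_RPTSRC"],
    "SEER_RPTSRC": ["SEER_SRC", "SEER_RPTSRC"],
}


def vocabulary_ids_for(operator_name):
    """Return every vocabulary ID an operator should search.

    Operators without an alias entry search only their own ID.
    """
    return VOCABULARY_ALIASES.get(operator_name, [operator_name])


def select_rows(rows, operator_name, codes):
    """Keep rows whose vocabulary ID is any alias of the operator
    and whose code is one of the requested codes."""
    wanted_ids = set(vocabulary_ids_for(operator_name))
    wanted_codes = set(codes)
    return [
        row for row in rows
        if row["vocabulary_id"] in wanted_ids and row["code"] in wanted_codes
    ]


# Made-up sample rows: one from an old dataset, one from a new dataset.
rows = [
    {"vocabulary_id": "ICD03_Morphology", "code": "8140/3"},
    {"vocabulary_id": "ICDO3_Morphology", "code": "8140/3"},
    {"vocabulary_id": "SEER_SRC", "code": "1"},
]

matches = select_rows(rows, "ICDO3_Morphology", ["8140/3"])
```

With this shape, an algorithm written against either spelling finds records from both old and new datasets, so neither the datasets nor the algorithms need to be rewritten up front.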
The only concern I have: if ICD03_Morphology is already defined as an operator via Lexicon, will multiple_vocabularies.csv properly override that operator?