Value Set vs SSSOM global context - understanding the difference #144

Open
matentzn opened this issue Feb 25, 2022 · 4 comments

Comments

@matentzn
Collaborator

matentzn commented Feb 25, 2022

Note: all of the following is a bit speculative and an attempt to try and understand how value sets and sssom fit together.


Many or even most clinical mapping schemes, such as FHIR concept map (https://www.hl7.org/fhir/conceptmap.html), revolve around the notion of value sets. A value set is a selection of codes for use in a particular context. Let's unpack:

  • selection of codes: simply a set. For example [Female, Male].
  • for use: this implies that value sets are created for a specific use case, such as documentation, diagnosing etc (I assume that is what is meant). Conversely, this implies the mapping should not be used for a different use case.
  • in a particular context: this really is the meaty part. The way I read it, this constrains the mapping itself to a particular situation, while the goal in SSSOM is to provide mappings that hold universally. In value set parlance, this means that everything in SSSOM applies to the default value set.

A value set is usually associated with an element in a data model. In LinkML land, you can imagine, for example, that the value set is defined as an enum, which is associated with a slot. The element that holds the slot could be, to stay with the simple example above, patientencounter::patienthistory:gender. (Correct me if I am wrong, musing aloud here). Now we can map, for example,

Female in the patientencounter::patienthistory:gender element to, say, NCIT:Female.

The "contextual" part of the value set basically says: Female ---> NCIT:Female only in the context of patientencounter::patienthistory:gender. But just to make a point here: Female could be anything! Indeed, it could be a code of a terminology: OMOP:Female ---> NCIT:Female. Even here, the mapping only holds in the context of the value set delineated by patientencounter::patienthistory:gender.
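To make the "only in the context of" reading concrete, here is a minimal Python sketch. It is not SSSOM or FHIR API: the `ContextualMapping` type, the `resolve` helper and the second context name in the usage note are hypothetical, made up purely for illustration. It assumes a mapping is stored together with the data model element it was made for, and is returned only when asked about in that same context.

```python
from typing import NamedTuple, Optional


class ContextualMapping(NamedTuple):
    """Hypothetical record: a mapping that is only claimed to hold in one context."""
    context: str     # data model element the value set is bound to
    subject_id: str  # code being mapped
    object_id: str   # code it is mapped to


# The single example from the text above.
MAPPINGS = [
    ContextualMapping(
        context="patientencounter::patienthistory:gender",
        subject_id="OMOP:Female",
        object_id="NCIT:Female",
    ),
]


def resolve(subject_id: str, context: str) -> Optional[str]:
    """Return the mapped code, but only if a mapping exists for this exact context."""
    for mapping in MAPPINGS:
        if mapping.subject_id == subject_id and mapping.context == context:
            return mapping.object_id
    return None
```

Under this reading, `resolve("OMOP:Female", "patientencounter::patienthistory:gender")` finds the mapping, while the same subject asked about in any other context (say a made-up `labreport::patient:sex`) yields nothing, even though the mapping might well be true there too.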

To make it clear from the start: SSSOM does not have, and as far as I understand likely never will have, a notion of value sets. So the question here is:

Can the context established by the value set be modelled by the global context in SSSOM? Or, to say it in a less geeky way, can we somehow describe the context established by a value set as global metadata elements? An easy solution is to simply contextualise the ID itself, i.e. use patientencounter::patienthistory::gender#OMOP:Female as the subject id (see #43 for more details). Here we need help from clinical mapping experts to understand if that would be sufficient.

However, I find this quite unsatisfactory. In many cases OMOP:Female ---> NCIT:Female is very true all by itself. Publishing thousands of different patientencounter::patienthistory::gender#OMOP:Female ---> NCIT:Female mappings seems like a missed opportunity to generalise.
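A rough Python sketch of the trade-off, under loud assumptions: the `contextualise` and `strip_context` helpers are hypothetical, the second context name is invented, and `skos:exactMatch` is just a placeholder predicate that the discussion above does not specify. The rows use the SSSOM column names `subject_id`, `predicate_id` and `object_id`.

```python
def contextualise(context: str, curie: str) -> str:
    """Build a contextualised subject id of the form <context>#<curie>."""
    return f"{context}#{curie}"


def strip_context(subject_id: str) -> str:
    """Drop the <context># prefix again, leaving the bare code."""
    return subject_id.rsplit("#", 1)[-1]


contexts = [
    "patientencounter::patienthistory::gender",
    "labreport::patient::gender",  # hypothetical second data model element
]

# One contextualised mapping row per data model element.
rows = [
    {
        "subject_id": contextualise(context, "OMOP:Female"),
        "predicate_id": "skos:exactMatch",  # assumed predicate, for illustration only
        "object_id": "NCIT:Female",
    }
    for context in contexts
]

# Once the context is stripped, all rows collapse into one context-free mapping.
universal = {
    (strip_context(row["subject_id"]), row["predicate_id"], row["object_id"])
    for row in rows
}
```

Every additional data model element adds another row, yet `universal` still holds a single triple. That is the redundancy I mean: the contextualised ids carry the value set context, but they hide the fact that the underlying mapping is the same everywhere.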

I may simply not understand the value set idea in the context of mappings well enough, and be totally off-base here. Hopefully we can sort out my misconceptions with the help of @ShahimEssaid @mellybelly and @cgchute

@graybeal

graybeal commented Mar 6, 2022

Well, to respond with proper context, I must be a bit speculative myself. I worry we are in an 'angels on head of a pin' space similar to discussions in more esoteric fora, but as a relatively lay reader, I can claim no real knowledge of that space. That said, I claim you are overthinking this situation.

The tooling of semantics (W3C tooling, at any rate) describes the meanings of various properties related to mappings. Yet the first thing that happened with the release of the specifications (see sameAs in particular) is that people began misusing it in ways neither anticipated nor well supported by the standard. Just as tag-bombing can make a Twitter or other folksonomy tag worthless, this misuse shows that you can't control how properties are used, relative to "what they should mean", once they are released (aka 'in the wild'). Even a casual inspection of mappings done manually by experts and used by BioPortal to automatically establish similarity relations (n.b. UMLS CUIs) makes it obvious that it is all but impossible to rely on the precision and context sensitivity of mappings created by other entities in other contexts.

I believe this is not such a bad thing—not surprising, given the relaxed semantic functionality supported by the tools I help manage—as I see the power of these mappings even when applied in contexts well beyond where they were conceived. To the extent that SSSOM was designed as a way to simplify the creation and expression of mappings, we can safely assume that many more people will be able to access this kind of data, building mappings of their own of likely even less rigor than the current mappings built with formal semantic tooling (again, sameAs).

As with the use of ontologies 'in the wild', the ability to re-use mappings created with SSSOM will not be established or enhanced by the addition of global context descriptions restricting the meaning of the mapping, because (a) the mappings may not be that good even in that context, and (b) most re-use will want to use them in other contexts, on the "more likely than not" chance that they improve the overall knowledge construct that the re-users are trying to create. (I'm going to guess that this improvement succeeds partly because of syntactic effects, and partly because the conceptual relations often map to additional domains, at least to some degree.)

I appreciate that in the OBO Foundry context there is an attempt at rigor that makes the question itself a bit more rigorous and potentially meaningful. But here also, I think the setting of context happens largely outside of the mapping document, with the creators' stipulations about what they were trying to do, any additional reviewers' conclusions about what they succeeded in doing, and the existence of an independent tracker where you can look those things up. And, in the end, with the re-users' decisions about how they try to use the artifact. I could imagine standardizing a formalism for declaring the context beyond the 'raw' mapping content, but not the automation of that declaration as an inferred conclusion.

(Also, as a minor note, many people of my acquaintance use the term "value set" to mean explicitly "a bag of terms without definitions or other internal relations", while others use it to mean "a collection of codes or concepts in a drop down list", and I used it generically long before seeing it defined narrowly, so without a true community consensus I don't accept that it is a narrower 'thing' than all of the above.)

@matentzn
Collaborator Author

matentzn commented Mar 7, 2022

@graybeal thanks for your detailed response. You come from the same school of thought as me :) I hope you are right, but I wrote this issue in response to meetings with medical terminology people, and I think we at least need to clarify the difference in thinking so we can clearly communicate what is and what is not in scope for SSSOM.

@sharifX

sharifX commented Mar 7, 2022

Hi @matentzn @graybeal
Just a quick note.

Similar discussions are going on in the European Research Data Alliance (RDA) and FAIR communities. In particular, please take a look at I-ADOPT (https://www.rd-alliance.org/groups/interoperable-descriptions-observable-property-terminology-wg-i-adopt-wg) and the BiCIKL project (https://bicikl-project.eu/). Some of these initiatives come from the Environmental Sciences and Biodiversity domains, but the semantic mapping issues are the same. I hope there is some overlap between all these initiatives; otherwise, I agree with @graybeal that we are heading towards the "angels on head of a pin" situation.

@matentzn
Collaborator Author

matentzn commented Mar 7, 2022

Thank you @sharifX, very interesting projects. Will check them out!

As a non-native speaker, I just double-checked the meaning of angels on the head of a pin, and I agree, I do not want to be "wasting time debating topics of no practical value" (acc. to Wikipedia). I merely want to understand why all meetings with the clinical community sooner or later end up on this subject. I am happy to ignore it or reject it, but I would still like to be able to communicate clearly how a mapping system that is based around the notion of value sets, like FHIR ConceptMap, should relate to SSSOM, if at all. I just don't understand it yet.


3 participants