Primary Deliverable - MaterialSample definition #2
Comments
Additional related commentary in #3 (comment). |
From #3 (comment)
|
Also see #3 (comment):
A sample is not necessarily a material thing, social science samples are often not. I think specimens are always material things. |
|
Thanks for bringing that here! (And sorry for not thinking to do so myself). Just to be clear, though, I was not intending to propose a formal definition; but rather I tried to capture my own thinking of what a |
But it is good! |
a materialSample is an object separated from the material world, intended to be representative of some sampled feature. Samples are typically collected with the intention of making measurements/observations on the sample that will characterize the sampled feature. |
Maybe: "A sample might undergo some curation and accession process and become a specimen (as well as a sample)." |
I recall a previous comment about accession and asked our Registrar about the legal meaning, which is, in part, legal ownership. Some samples and specimens we can never own, i.e. fossils/archaeological remains from US Federal lands, other countries have similar laws, but we do reposit them. I suggest "accession or reposit". |
Related requests for new terms that we should not lose sight of: New Term - materialSampleType |
Wouldn't the controlled vocabulary examples listed for Or have I misunderstood the purpose & function of |
I would think so. |
If we are going to really flesh out a "Material" class in Darwin Core, the first step should be defining the class. We have MaterialSample to begin with, but I think we have agreed that the definition is not working for everyone. While some seemed opposed to it, I think the broadest possible definition for a Darwin Core "Material" class would be the Dublin Core PhysicalObject: Term Name: PhysicalObject
For me, this also removes the problems of human and machine observations (images, etc) from our discourse. The next question for me is are we only thinking about "curated" objects in Darwin Core? If that is true, then perhaps the best definition for MaterialSample might be:
The problem I see in this definition has to do with LivingSpecimen, which may not really be "extracted" from the natural environment. So how about
|
Is 'inanimate' a problem? |
I would think so - a gorilla in the zoo, a tree in the botanic garden. Thanks for pointing that out! So now what.....I need to have a weekend! |
We already have a "Material" class in DwC ( I think it's beyond the scope of DwC to be defining terms that apply to literally everything that is a physical object (atoms? galaxies?). I think what we're interested in is the subset of physical objects that we humans handle or maintain or process in some way. I think dc:PhysicalObject could be indicated as the superclass of As for wording, I would favor something like:
This encompasses objects, their derivatives, and aggregates, and also avoids potential ambiguities about "natural environment" (which might get a bit squirrelly if we want to accommodate other kinds of objects, like geological samples or cultural artefacts). We can probably remove some of the verbs (e.g., eliminate "analyzed", as it may be implied by "processed"?) |
I don't find that very helpful. It is also not very consistent with the various bits of discussion above.
The sentence quoted conflates these concerns in a rather confusing way. I believe the concern here is to recognize that, if 'sample' and 'specimen' are both roles, and are somewhat independent of each other, then we need to identify the parent class of 'material things', some of which are also samples, some of which are specimens, and some of which are both. http://purl.org/dc/dcmitype/PhysicalObject would be fine, except for the 'inanimate' qualifier :-( @tombaker do you know why |
I've raised an issue about 'inanimate' over on the DCMI issue tracker. |
Fair enough -- that sentence was written hastily -- which is why I was a bit more careful in the wording of the definition text:
So... replace "humans capture or care for in some way" with "collected, processed, analyzed, managed, or curated by humans". Not sure if that is any better, though.
I think we get way too hung up on the semantics of "specimen" (as a noun?) and "sample" (as a verb? noun?). Both of these terms have different meanings to different people, and different definitions in different contexts. Of the two (specimen and sample), my sense is that "sample" probably carries less misinterpretation-potential baggage. But maybe that's just me? In any case, the good news is that we don't need to define "specimen", and we don't need to define "sample", because neither of those terms, by themselves, is a DwC term. What we do need to do is define So... I agree... the phrase "humans capture or care for in some way" was unhelpful. But I'm curious: what do folks think of the actual wording I proposed for the definition of
I agree that "inanimate" is problematic, but I think a bigger problem is the scope. I do not think that |
Actually I think the verb 'to sample' is pretty clear, and helpful. My concern is exactly that your definition slides immediately over into the curation and handling aspect, which I understood to be associated with specimens, but not with all samples. That is confusing. If the 'inanimate' qualifier could be removed from the Dublin Core class, then dwc:MaterialSample rdfs:subClassOf dctype:PhysicalObject . We could also perhaps see an additional class dwc:AccessionedThing rdfs:subClassOf dctype:PhysicalObject . to support the collections folk more explicitly. my:Individual987 a dwc:MaterialSample , dwc:AccessionedThing . and implicitly also a |
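A minimal sketch of the entailment described above, assuming the 'inanimate' qualifier were removed so that the subclass statements could be made. Triples are modelled as plain Python tuples rather than with an RDF library, `dwc:AccessionedThing` is only the class floated in the comment (not an existing Darwin Core term), and the helper names `superclasses` and `inferred_types` are hypothetical.

```python
# Assumption: prefixed names are kept as plain strings; no RDF library is used.
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

triples = {
    ("dwc:MaterialSample", SUBCLASS_OF, "dctype:PhysicalObject"),
    # dwc:AccessionedThing is the hypothetical class suggested in the comment.
    ("dwc:AccessionedThing", SUBCLASS_OF, "dctype:PhysicalObject"),
    ("my:Individual987", TYPE, "dwc:MaterialSample"),
    ("my:Individual987", TYPE, "dwc:AccessionedThing"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def inferred_types(individual, triples):
    """Return the asserted types of individual plus all their superclasses."""
    asserted = {o for s, p, o in triples if s == individual and p == TYPE}
    inferred = set(asserted)
    for cls in asserted:
        inferred |= superclasses(cls, triples)
    return inferred


types = inferred_types("my:Individual987", triples)
```

Here `types` contains both asserted classes and, by inference only, `dctype:PhysicalObject`, which is the "implicitly also" part of the statement above.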
Ok, yes -- that sounds right to me. What are some examples of
? |
So I guess the way I see it is that things like
I think that depends on the meaning of "is an" in the quoted text above -- and it also underscores my long-standing uncertainty about the boundary between The way I understand it, the properties that apply to the Tiger as an instance of
Absolutely! Which is why I think
That's a key part of the question I've been asking for a long time now (spoiler alert: I don't have a good answer). I would say that the Organism does not exist until either a sperm fertilizes an egg, or an asexual organism splits into two, or whatever reproduction mode applies. But does that mean that, once created, the Organism continues to exist into all eternity from that moment forward? I don't think so. After the last molecule that had comprised the physical being of the
Sure (maybe?) But if that same fish is eaten by a shark, and some of its molecules are absorbed into the shark's body through digestion, and other molecules are excreted over time -- would you still call that dissociated set of molecules scattered over miles of reef and ocean water to collectively still be a fish? I'm guessing not. So... somewhere between the point at which it stopped living, and the point at which its molecules are dissociated and dispersed, I would say it stopped being an I could wax on about this for hours, but I think that wouldn't be helpful for the task at hand. The core task is to come up with a definition for I think @baskaufs has suggested (and I agree), that a more practical way to arrive at these definitions and distinctions is by figuring out which properties go with which class, and from those respective sets of properties the boundaries of the classes should emerge. I have a pretty clear idea which properties I would assign to each of these two classes, but I've already consumed too much bandwidth on this discussion, and I need to get some sleep before TDWG starts again (1am Hawaii time... ouch). So I'll end it here for now. |
Thanks for those clarifications, Rich. I think we agree. Not all organisms become material-samples, and not all (biodiversity) material-samples are (whole) organisms, so the one-to-one correspondence that can exist in some cases is not a class-subclass relationship. I also want to argue that organism and (biodiversity) material-samples should be recognized as distinct things because our samples infer the existence (or former existence) of organisms and their properties. Samples tell us about organisms, and by inference populations and taxa. |
Thanks, @stanblum - yes, we definitely agree! I apologize that my endless ramblings don't always capture my points clearly. But I would like to focus on this a bit more:
... because this gets to the heart of not only the definition of First of all, I should explain that in our implementation "Organism" is itself a subclass of something we call "Individual". The latter is broader in scope and includes all manner of non-biological things. So for us, the relationship between But even if we focus only on the biological/biodiversity subsets of these two classes [ Again, I don't have a clear answer, but I think we should explore this as a way to refine the definition of At the heart of this is your point that "...our samples infer the existence (or former existence) of organisms and their properties. Samples tell us about organisms, and by inference populations and taxa." I think there is some consensus that instances of
[Side note: I'm imagining that the example values above for materialSampleType are subtypes of Separately, we'd track each of the Organisms comprising the lot of specimens:
There are three examples of one-to-one correspondence between
Perhaps that's all we need in this example, because we can infer/derive the relationships between instances of
Similarly, there may need to be many-to-one
I'm not trying to divine an implementation data model; rather I'm trying to get at the nature of the relationships both among instances of |
Calling "whole organism" a subtype of PreservedSpecimen seems pretty darn confusing! |
Back on Oct 17, 2021 I mentioned that I think "Material Sample" entered the DwC discourse through the BioCollections Ontology (BCO). A change in BCO I wasn't aware of until yesterday is that BCO has now deprecated the "Material Sample" class (made it an obsolete class), and has instead adopted a term/class from a larger ontology, the Ontology for Biomedical Investigations (OBI)(!):
This combines several of the notions we've been discussing: a material entity that is the result of a material sampling process and has been taken (collected and understood) to represent some larger entity (thing, population, community) in further study or analysis. Also deprecated in BCO were the subclasses of Material Sample, including: preserved-, living-, and fossil-specimen. I thought it was noteworthy that having taken materialSample from BCO to create a superclass for all the different kinds of things we manage in the biocollections community, the DwC is now (still) using "material sample," while the BCO is now using "obi:specimen." Should we follow? Would it be appropriate for us to 1) incorporate the obi:specimen term in DwC, or 2) mint our own specimen term, dwc:specimen, and paraphrase their definition while including a "crosslink", like:
The argument for the second option being that DwC is currently a bag of terms and doesn't support reasoning, which OBI (an OWL ontology) does. In other words, they aren't the same kinds of standards, so incorporating an OBI term in DwC isn't the right thing to do. The better practice might be just to reference obi:specimen in some appropriate way. I'll defer to others with more experience. Or, given that we also want to include environmental samples in DwC (for metagenomic analysis), should we just retain the term "material sample", because most people wouldn't think of an environmental sample as a "specimen." |
Looks like OBI still has 'material sample', defined as 'A material entity that has the material sample role', which is a subclass of specimen, 'A material entity that has the specimen role'. I don't see anything about deprecation (Last uploaded: January 10, 2022). You'd be hard pressed to distinguish specimen from material sample given their definitions, so I can see why they'd get rid of one of them. |
Thanks, @stanblum! Does OBI define the scope of “specimen role”? And what other kinds of material entities (in the sense of OBI) are outside that scope? I’m not in favor of changing dwc:MaterialSample to dwc:Specimen if they have essentially the same definition (for reasons articulated by @baskaufs at an earlier zoom meeting). |
I want to add a bit of historical perspective on the relationship between In the end, the adopted class was defined to be a subclass of http://purl.obolibrary.org/obo/OBI_0100051, which I believe at the time had the label "material sample", but whose label has now been changed to "specimen". Declaring a TDWG term by its relationship to a non-TDWG term other than those in Dublin Core was a new thing to TDWG. Eventually, the decision was made and codified in Section 4.4.2.2 of the Vocabulary Maintenance Specification (SDS) that assertions that generate machine-computable entailments should not be included in the core metadata about a term, but rather in an "extension term list" layered on top of the basic "bag of terms" layer. As a result, the subclass declaration for There are two important issues that are raised by Stan's comment. The first is the importance of differentiating between term labels and the terms themselves. There is no such thing as The second issue, which is currently very relevant, is the mechanism by which we make connections between TDWG terms and terms defined outside of TDWG. This has been a topic of discussion for years, without resolution. Some suggestions, like using the SKOS relationship terms like |
Term change
Current Term definition: https://dwc.tdwg.org/list/#dwc_MaterialSample Proposed attributes of the new term version (Please put actual changes to be implemented in bold and
|
Sorry I missed the last session on this. One question and one comment: Question: In the proposed new definition, is there a difference between "physical object" and "physical entity"? Comment: The proposed examples seem a little animal-centric -- maybe a plant and a bacterium example would be good to add? Also, maybe a better/more intuitive example of an "undetached" instance of MS would be a fossil aggregate represented by a single physical rock with multiple embedded organisms. Also... I hadn't considered the "undetached" potential within a single organism in MaterialSample examples. Certainly I had considered examples of multiple organisms represented as a single collective object (aforementioned fossil; hermit crab+shell+anemone; etc.). But I'd not considered the possibility of branding undetached subcomponents of the same individual Organism as distinct MS instances. I guess that means that any given MS instance of a single organism could have near-infinite potential child instances, without any disarticulation action happening to the whole. I don't have a problem with this, but the non-normative documentation should probably explain this a bit more, with an explanation that MS instances are minted when there is an informatic need to do so, and also including examples where there is an informatic need to track undetached subcomponents of an object (e.g., the undetached leg of a dog). |
I understand that the notion of "representing" in the proposed definition is to convey the notion that an object that is the subject of collecting or observing (e.g., a goose, a swarm of geese, a fossil bearing rock small enough to be lifted, a small twig from a tree) subsequently to its collecting or observing, often is used to drive inferences about a larger whole that it is part of or relates to in a specific way (the Swedish population of geese, the mountain range the rock originates from, the entirety of shrubs belonging to the same species). I think the use of two different terms ("physical object" - "physical entity") can be defended to reflect that a subject of collection or observation will as a matter of necessity be spatially more confined than physical entities in general (the latter including, for example, all geese in the world, the taiga, Earth's atmosphere). It would be good though, to clarify this in the documentation around the definition. Apart from this I agree with @deepreef's conclusions about instances of |
Add the draft definition and non-normative documentation for dwc:MaterialSample as discussed in #2.
OK, thanks! That makes sense to me. But it wasn't immediately obvious (to me, at least) from the wording of the definition. I think the wording of the definition can stay as it is, as long as the non-normative explanatory comments help folks understand the implications of the distinction (physical object vs. physical entity) -- as you suggest.
Yes! Definitely. I think it's clear that aggregates of multiple disconnected MS items can be collectively bundled into a single umbrella/parent MS instance. Whether or not those individual component items came from the same instance of Organism, or multiple Organism instances, shouldn't make any difference.
I guess so... but this almost sounds like advocation for accepting hypothetical/inferred physical objects within scope. I don't immediately have anything against that, but I worry that it might be flirting with the edges of the MS scope a bit. I'm thinking of the type specimen for Nessiteras rhombopteryx. But in that case, the physical object is not hypothetical -- it's just that it might be an Organism, and it might (probably) be a rock. |
In October 2020, we had a meeting of the Paleo "Happy Hour" on the topic of clusters/fossils on a slab and otherwise instances of what I refer to as "loanable objects" (cannot loan one without loaning all objects in or on a "container"). The list we came up with may be useful. We sub-divided the list into Natural Accumulations and Artificial or Anthropogenic Accumulations: Natural Accumulations:
Artificial or Anthropogenic Accumulations: The antithesis of these examples for anthropogenic modification is serial thin sections or peels of an individual fossil, which would fall under the current definition of MaterialSample. What we discussed briefly are examples of display fossils that are composites of several individuals, usually vertebrate fossils, sometimes invertebrate or plant fossils. The point is that a wide variety of natural and anthropogenic objects are possible. This list was assembled from the Invertebrate Paleontology and Paleobotany collections at one museum. |
I left these comments last night after two long weeks of interviewing Curator candidates for the collection I am CM for. I left these as examples of paleontology collection objects and the difficulties of fitting our collections into existing CMS and DwC models. Paleontology objects are rarely "as found"; they routinely require some effort at initial preparation to further expose fossilized biological objects and then some means to preserve not only the object but the relationship between it and other associated objects found within the original sample (i.e. dwc:associatedOrganisms). However, consider for example a palynology sample: a quantity of rock is collected from a locality and transported to a lab; the rock is broken into smaller pieces and placed in a jar for reserve, and a subset is separated, further reduced to a coffee-ground consistency, and stored; a subset of the ground sample is then placed in an acid-resistant beaker and processed with HF for a day or more, washed, and centrifuged. The processed "residue" is then stored in a vial, and a subset of the now processed residue is further cleaned and pipetted onto a microscope slide cover slip, dried, and flipped onto a microscope slide. When the slide is examined, you finally can see the objects collected, along with hundreds to thousands of other co-occurring objects (dwc:associatedOrganisms). What is the MaterialSample: the original collected rock, the ground residue, the acid-prepared residue, the microscope slide itself, or the pollen grain on the slide (that has a Linnean name and coordinates from an England Finder)? The original collected sample may have other fossil forms embedded within that would be destroyed by HF processing. Subsets may be processed by other acids (Formic, dilute HCl, Acetic, etc.) or reduced and hand-picked under a binocular microscope to produce Conodonts, Foraminifera, Calcareous Nanofossils, or megafossils, etc.
Derivative samples may also be processed for non-biological data such as isotopes of strontium, carbon, oxygen, boron, etc. and/or radiometric dating. As such, I would see the original collected sample as the parentMaterialSample to maintain at least some relationship to all of the derivative biological and non-biological entities, with the processed samples as (what?) and the (identified and named) biological object as the MaterialSample. It seems a lot of processed derivative samples either carry the same designation as the original parentMaterialSample, or are absorbed into dwc:preparations (where they do not really fit, as they are not a preservation method), making the link between the parent and child somewhat muddy. This is also very true of commercial CMS and why I use a database I created to keep these relationships and results discoverable. Mapping to DwC has always been a challenge (need more coffee). Sorry for the long comments. |
@RogerBurkhalter This kind of detailed use case is immensely useful. It highlights both the value of the concept of a parentMaterialSample (see tdwg/dwc#344) and its limitations. By limitations, I mean, "What does it mean to be the parent?" I suspect we need a much richer way to relate materials, with something at the level of a ResourceRelationship where the nature of the relationship can be specified. In the Diversifying the GBIF Data Model work, the model anticipates the relationships "part of" and "derived from" as well as a separate mechanism to establish membership in a material group that was developed for the OBIS Community Measurement use case, but that would also work for other purposes. |
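To make the "richer way to relate materials" concrete, here is a small sketch that records the palynology chain from the comment above as explicit relationship records using the two relationship names mentioned ("part of" and "derived from"). Everything else is an assumption for illustration: the record layout, the identifiers, the choice of which relationship applies at each step, and the helper `trace_to_origin`. It is not the GBIF model or the DwC ResourceRelationship class.

```python
from collections import namedtuple

# Hypothetical record layout: subject --relationship--> related material.
Relationship = namedtuple("Relationship", ["subject", "relationship", "related"])

# Assumed identifiers for the materials in the palynology example.
relationships = [
    Relationship("ground-sample", "derived from", "rock"),
    Relationship("residue", "derived from", "ground-sample"),
    Relationship("slide", "derived from", "residue"),
    Relationship("pollen-grain", "part of", "slide"),
    Relationship("isotope-subsample", "derived from", "rock"),
]


def trace_to_origin(item, relationships):
    """Follow relationships from item back to the material with no source.

    Assumes each subject appears in at most one record and that the
    records contain no cycles.
    """
    lookup = {r.subject: r for r in relationships}
    path = [item]
    while path[-1] in lookup:
        path.append(lookup[path[-1]].related)
    return path


pollen_path = trace_to_origin("pollen-grain", relationships)
isotope_path = trace_to_origin("isotope-subsample", relationships)
```

Because the kind of relationship is stored on each record, the pollen grain and the isotope subsample can both be traced back to the same rock without every intermediate step having to be called a "parent".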
My answer: The MaterialSamples (plural) are: whatever units of physical material(s) warrant identification and associated metadata from an informatics perspective. In other words, the decision of whether to mint a new Some use cases in my (non-fossil) world:
In this case, I would not bother assigning a separate MS instance to the fish before its fin clip was removed (or scale lost); and I would not bother assigning a separate MS instance to the lost scale, because I have no informatic need to track either of those separately from the two MS instances I do mint. Whole bird is collected, put in freezer, and accessioned/catalogued. Later, the skin is removed and prepared dry, the internal organs are preserved in alcohol, the skeleton is processed with the aid of dermestid beetles. Later still a subsection of tissue is removed from the preserved organs for DNA analysis.
In this case, I have an informatic need to track the whole organism prior to dissociation of parts (object that is accessioned and catalogued), so I do assign an MS instance to this. I do not bother assigning MS instances to the blood and other tissue that ended up in the waste basket, nor the tissue consumed & digested by the dermestid beetles, because I don't have an informatic need to track them. Because the tissue sample was subsequently removed from the alcohol-preserved organs, I treat it as a child of that MS (3), rather than a direct child of the whole (1). That way, the curation history of the tissue sample is more precisely/completely represented in the chain of preservation processes (e.g., in case the internal organs were first fixed in formalin, so I would then know that the derived tissue sample is not fit for purpose for DNA sequencing). These are pretty straightforward examples in my mind. Another straightforward example is if a feather is plucked from the skin and used for some purpose/preparation/whatever, in which case I would mint: But here's where it gets interesting, relative to the earlier discussion on "undetached" MS children. Suppose I photograph just the wing of the mounted skin. Would I have a need/desire to mint a new MS instance for the wing, even though it is still physically part of the whole skin? That way, I could make the subject of the image the wing alone, rather than the whole skin preparation. But is that really good practice? I honestly dunno. In any case, I think the same basic logic ("Do I have an informatic need to track properties or relationships of a particular aggregate/unit of physical material?") would apply in the example you gave for which bits get distinct instances of MS. |
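A rough sketch of the bird example, showing why hanging the tissue sample off the organs (3) rather than the whole bird (1) preserves the curation history. The class, field names, identifiers, and preparation labels are all hypothetical, and the formalin step is the "in case" scenario from the comment, not something stated to have happened.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MaterialSample:
    """Hypothetical minimal record: an ID, its own preparation steps, a parent."""
    sample_id: str
    description: str
    preparations: Tuple[str, ...] = ()
    parent: Optional["MaterialSample"] = None


def preparation_history(sample):
    """Return all preparation steps from the root ancestor down to sample."""
    chain = []
    node = sample
    while node is not None:
        chain.append(node)
        node = node.parent
    steps = []
    for node in reversed(chain):  # oldest ancestor first
        steps.extend(node.preparations)
    return steps


whole = MaterialSample("MS-1", "whole bird", ("frozen",))
skin = MaterialSample("MS-2", "skin", ("dry",), parent=whole)
# Assumed scenario: organs fixed in formalin before going into alcohol.
organs = MaterialSample("MS-3", "internal organs", ("formalin", "alcohol"), parent=whole)
tissue = MaterialSample("MS-5", "tissue subsample", parent=organs)

tissue_history = preparation_history(tissue)
fit_for_dna = "formalin" not in tissue_history
```

Had the tissue been recorded as a direct child of `whole`, its history would show only the freezing step and the formalin exposure would be lost.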
I've come around to applying that logic to all relationships within DwC. In other words, whenever there is an xxxID term/property within a DwC Class, I'm leaning towards representing those values not as direct properties of the root instance, but as instances of I think of this as a "semi-serialized" approach. That is, literal values are treated as direct properties of DwC class instances (e.g., property "fields" to the class "tables"), but all "foreign key" property values are captured as an "octuple store" (eight terms organized in I have no idea whether this quasi-hybrid relational model/serialized model is practical or sensical, but it feels like a potentially practical middle-ground between the two different ways of representing data (i.e., tables & fields vs. triple-store). |
Attendance at the 2022 working session included a lot of people who are not members of the Task Group and their primary concern was with the baggage that might be associated with "sample".
It was clear to me that people were looking for something that could encompass any physical material whether it was a "sample" or not if we hope to allow collections to use DarwinCore to share their objects. There was also discussion about the use of the term sample when associated with human remains. As I have first-hand experience attempting to remove "specimen" from everything in a CMS, I completely understand the concern. Notes from the session include this: “Sample” is problematic, consider “catalogueRecord”, “object”, “entity”, “unit”
It was also discussed that a "material" class should start with a "High-level distinction between material and information artefact" as this would mesh with the LatimerCore baseTypeOfCollection So - should we really be starting with the class MaterialEntity? Would this be equivalent to the Dublin Core PhysicalResource?
I know this feels like a step backward.
BUT as LatimerCore is currently in expert review and they cover a lot of things that crossover into material, I think we need to think deeply about this. |
Changes as suggested in #37 added to review package - https://github.com/tdwg/material-sample/blob/main/review%20package/MaterialSample.md |
Term Change submitted - tdwg/dwc#451 |
change complete - tdwg/dwc#451 |
Current Definition
http://rs.tdwg.org/dwc/terms/MaterialSample
Please suggest changes/improvements in this issue.
See also https://github.com/tdwg/material-sample/blob/main/primary_deliverable/MaterialSample.markdown
See also MaterialSample terms Google Sheet