Explore the pairing of units, data types and formats #2

Open
Public-Health-Bioinformatics opened this issue Oct 30, 2017 · 1 comment


Public-Health-Bioinformatics commented Oct 30, 2017

A flexible data entry/reporting system would allow a given date or numeric datum to have:

  • a choice of more than one entry unit, e.g. celsius/fahrenheit/kelvin, or year/month/week/day/hour/minute. For example, an "age" field might be measured in year, month, week, day, hour or even second buckets.
  • a preferred data storage unit, presumably the smallest of the given units if they lie on a single precision scale, or explicitly stated if they come from different scales (celsius vs fahrenheit).
  • its own numeric range constraint for each unit, e.g. human age tops out at 130 years.
  • a precision that differs from the precision it is stored in, or from the precision of other comparable data, e.g. 2005/01/01 to the nearest week.
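The requirements above can be sketched as a small data structure. This is only an illustration, not GEEM code: the `FieldSpec` class, its attribute names, and the month/week/day limits are assumptions made for the example (only the 130-year limit comes from the text).

```python
from dataclasses import dataclass


@dataclass
class FieldSpec:
    """Hypothetical field specification: several entry units, one storage unit."""
    name: str
    storage_unit: str
    # Maps each permitted entry unit to its own inclusive (min, max) range.
    ranges: dict

    def validate(self, value, unit):
        """Check a value against the range of the unit it was entered in."""
        if unit not in self.ranges:
            raise ValueError(f"{unit!r} is not an entry unit for {self.name}")
        low, high = self.ranges[unit]
        if not low <= value <= high:
            raise ValueError(f"{value} {unit} is outside {low}..{high}")
        # Keep the entered unit with the value so its precision is not lost.
        return {"value": value, "unit": unit}


# Illustrative limits; only the 130-year figure is taken from the discussion.
AGE = FieldSpec(
    name="age",
    storage_unit="day",
    ranges={"year": (0, 130), "month": (0, 1560), "week": (0, 6783), "day": (0, 47482)},
)

entry = AGE.validate(42, "year")
```

Keeping the entered unit next to the value is one way to record that "42 years" was only known to the nearest year, even if a converted value is also stored.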

Currently, to enable GEEM to display different units for a specification (or form) field, we have them set up in the ontology via the OBO Foundry 'has measurement unit label' relation:

[entity] 'has measurement unit label' max 1 (day or week or month or year)

And a second subClassOf axiom defines the datatype using GenEpiO's 'has primitive data type' which allows association of a datum with an XML numeric, date or other type, as well as some range constraints (which are translated from OWL into a JSON representation):

'has primitive data type' exactly 1 xsd:nonNegativeInteger[< "130"^^xsd:nonNegativeInteger]
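As a rough illustration of turning such an axiom into a JSON-like structure, the sketch below parses the datatype restriction with a regular expression. The output keys (`datatype`, `operator`, `limit`) and the function name are assumptions for this example and are not GEEM's actual JSON format; the pattern also handles only a single integer facet.

```python
import re

# Matches e.g.  xsd:nonNegativeInteger[< "130"^^xsd:nonNegativeInteger]
FACET = re.compile(
    r'(?P<type>xsd:\w+)\[(?P<op><=|>=|<|>)\s*"(?P<limit>[^"]+)"\^\^xsd:\w+\]'
)


def parse_restriction(axiom):
    """Extract datatype, comparison operator and limit from a restriction string."""
    match = FACET.search(axiom)
    if match is None:
        raise ValueError("no datatype restriction found")
    return {
        "datatype": match["type"],
        "operator": match["op"],
        "limit": int(match["limit"]),  # assumes an integer-valued facet
    }


axiom = "'has primitive data type' exactly 1 xsd:nonNegativeInteger[< \"130\"^^xsd:nonNegativeInteger]"
constraint = parse_restriction(axiom)
```

Note that the `<` facet is exclusive, so as written the axiom rejects a value of exactly 130.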

A first question is whether such a specification implies that any value should be stored in the most granular unit. This is problematic for time ranges insofar as 1 month is a little over 4 weeks; for some scales this approach would require averaging the months in a year, accounting for leap years, etc.
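The month/week mismatch can be made concrete with a few lines of arithmetic. The sketch assumes the Gregorian average year of 365.2425 days as the basis for an "average month"; that choice is itself one of the conventions the question is about.

```python
import calendar

DAYS_PER_WEEK = 7
DAYS_PER_YEAR = 365.2425  # Gregorian 400-year average, leap years included

# An "average month" expressed in weeks: 365.2425 / 12 / 7
weeks_per_average_month = DAYS_PER_YEAR / 12 / DAYS_PER_WEEK


def weeks_in_month(year, month):
    """Length of one specific calendar month, in weeks."""
    days = calendar.monthrange(year, month)[1]  # (first weekday, number of days)
    return days / DAYS_PER_WEEK


february = weeks_in_month(2017, 2)   # 28 days
october = weeks_in_month(2017, 10)   # 31 days
```

A real month spans anywhere from 4 weeks to about 4.43 weeks, so a value entered in months cannot be converted to weeks or days without either knowing the calendar dates involved or accepting an averaged factor.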

A second question is whether the range constraint should always be given in the smallest possible unit, so that it can be calculated automatically for the other scales.
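A minimal sketch of that derivation, assuming days as the smallest unit and averaged Gregorian factors for month and year. The table, the function name, and the 47482-day limit (roughly 130 years) are illustrative assumptions.

```python
import math

# Size of each unit in days; month and year are Gregorian averages (an assumption).
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30.436875, "year": 365.2425}


def derive_max(max_in_days, unit):
    """Largest whole number of `unit` that still fits within the day-based limit."""
    return math.floor(max_in_days / DAYS_PER_UNIT[unit])


MAX_AGE_DAYS = 47482  # about 130 years, rounded up to a whole day

derived = {unit: derive_max(MAX_AGE_DAYS, unit) for unit in DAYS_PER_UNIT}
```

Because the month and year factors are averages, the derived limits are approximate, and rounding down keeps every derived limit inside the original day-based one.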

A test case is "Drug MIC", which has two units of measure, "mg/L" and its variant "ug/mL" (which, however, entail different precision), as well as "mm" (millimetre).

'has measurement unit label' exactly 1 (millimeter or 'milligram per liter' or 'microgram per milliliter')

One possibility: allow each datum to be accompanied by a 'has value specification' which contains both unit and numeric/string datatype constraints. This case may highlight an underlying difference between the types of measurement (diameter vs solution density) that needs to be separated out for easier data analysis.
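One way to picture such a value specification is a per-unit record that also names the kind of quantity being measured. Everything here (the `ValueSpec` class, the `quantity` labels, the datatypes and minimums) is a hypothetical sketch for the "Drug MIC" case, not an existing GenEpiO or OBI structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueSpec:
    """Hypothetical value specification: unit plus datatype and range constraints."""
    quantity: str     # kind of measurement, kept separate for later analysis
    unit: str
    datatypes: tuple  # Python types accepted for the numeric value
    minimum: float


# Illustrative specs for the three units mentioned for "Drug MIC".
DRUG_MIC_SPECS = {
    "mg/L": ValueSpec("concentration", "mg/L", (int, float), 0),
    "ug/mL": ValueSpec("concentration", "ug/mL", (int, float), 0),
    "mm": ValueSpec("diameter", "mm", (int,), 0),
}


def check(value, unit):
    """Validate a value against its unit's spec and report the quantity kind."""
    spec = DRUG_MIC_SPECS[unit]
    if not isinstance(value, spec.datatypes):
        raise TypeError(f"{value!r} has the wrong datatype for {unit}")
    if value < spec.minimum:
        raise ValueError(f"{value} {unit} is below {spec.minimum}")
    return {"quantity": spec.quantity, "value": value, "unit": unit}


concentration = check(0.5, "mg/L")
zone = check(20, "mm")
```

Tagging each value with its quantity kind lets downstream analysis separate concentration readings from diameter readings even though both arrive through the same field.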

ddooley commented Nov 27, 2019

Circling back to this now, in light of the new OBI value specification data structure. The tricky part is how to specify numeric ranges when the user can select different units. We probably have to key the numeric range to just ONE unit, and calculate range and precision based on that.
