Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

complexValue required type fixed #89

Merged
merged 3 commits into from
Jun 19, 2023
Merged

complexValue required type fixed #89

merged 3 commits into from
Jun 19, 2023

Conversation

costero-e
Copy link
Collaborator

This is the fix for the issue opened by @redmitry here #87 . Please @mbaudis and @redmitry review it to move this forward. Thank you.

Copy link
Collaborator

@redmitry redmitry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... The "required" should by on the "typedQuantities" property.

"properties": {
        "typedQuantities": {
            "description": "List of quantities required to fully describe the complex value",
            "items": {
                "$ref": "#/definitions/TypedQuantity"
            },
            "type": "array"
        },
        "required": [ "typedQuantities" ]
    },

the removed part was ok:

"required": [
                "quantityType",
                "quantity"
            ]

@costero-e
Copy link
Collaborator Author

Thank you @redmitry to spot that! I added another commit to fix the position of the required. Can you review it again and confirm it's correct now? Thanks.

@redmitry
Copy link
Collaborator

The last piece is to return back:

            "required": [
                "quantityType",
                "quantity"
            ]

😄

@costero-e
Copy link
Collaborator Author

ok, done @redmitry , I am starting to feel the boomerang effect with this PR jaja, can you review it again? thank you!

@redmitry
Copy link
Collaborator

It looks correct now.
There is also an error in the the original CINECA Individuals data.

"measurementValue" : {                                                                                                                                                                                          
    "quantity" : {                                                                                                                                                                                                  
        "unit" : {                                                                                                                                                                                                      
            "id" : "NCIT:C49671",                                                                                                                                                                                        
            "label" : "Kilogram per Square Meter"                                                                                                                                                                     
        },                                                                                                                                                                                                           
        "value" : 26.63838307                                                                                                                                                                                     
    }
}

while correct one must be like:

"measurementValue": {
    "unit": {
        "id": "NCIT:C49671",
        "label": "Kilogram per Square Meter"
    },
    "value": 26.63838307
}

I have a fixed version:
https://gitlab.bsc.es/inb/ga4gh/beacon-v2-docker-demo/-/tree/master/beacon-data

@mrueda
Copy link
Collaborator

mrueda commented Jun 16, 2023

Just a note, if we want to be Phenopacket friendly, I suggest changing our schema to align with theirs. When in trouble, I follow theirs...

See: https://phenopacket-schema.readthedocs.io/en/latest/value.html

and:

EGA-archive/beacon2-ri-tools#3

Cheers,

m

@costero-e
Copy link
Collaborator Author

Ok, let's close this PR and discuss the individuals schema question in issues. @mbaudis can you please review and accept this PR? Thanks.

Copy link
Member

@mbaudis mbaudis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes are good but there is an example (both yaml & json) which says "quentity"... I wonder who the original committer was ...

@mbaudis
Copy link
Member

mbaudis commented Jun 16, 2023

@mrueda

Just a note, if we want to be Phenopacket friendly, I suggest changing our schema to align with theirs. When in trouble, I follow theirs...

Tru dat. But PXF is inconsistent itself, at least when looking at the examples (and hampered by the "let's write this in YAML starting from a protobuf definition" documentation...):

Value:

value:
    quantity:
        unit:
            id: "UO:0000316"
            label: "cells per microliter"
        value: 24000.0
        referenceRange:
            unit:
                id: "UO:0000316"
                label: "cells per microliter"
            low: 150000.0
            high: 450000.0

But example in Measurement:

measurement:
    assay:
        id: "LOINC:26515-7"
        label: "Platelets [#/volume] in Blood"
    value:
        quantity:
            unit:
                id: "UO:0000316"
                label: "cells per microliter"
            value: 24000.0
    referenceRange:
        unit:
            id: "UO:0000316"
            label: "cells per microliter"
        low: 150000.0
        high: 450000.0

... places the referenceRange wrong.

@costero-e
Copy link
Collaborator Author

we need @redmitry approval here to close this PR

Copy link
Collaborator

@redmitry redmitry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remember to validate individuals over fixed schema.

@costero-e
Copy link
Collaborator Author

yes @redmitry , I still have to take a look carefully at individuals issue before validating it.

@costero-e costero-e requested a review from redmitry June 19, 2023 10:38
@costero-e costero-e merged commit 647a167 into main Jun 19, 2023
1 check passed
@costero-e
Copy link
Collaborator Author

costero-e commented Jun 19, 2023

@redmitry I just checked the CINETICA individuals schema. Correct me if I'm wrong but I think the correction you made is because the value is not complex? Also, even if the value was complex, quantity should be an array not an object?
@mrueda said that phenopackets (https://phenopacket-schema.readthedocs.io/en/latest/covid19-example.html) are written like this:
Screenshot 2023-06-19 at 16 49 33
Here I see the example value is not complex and written with quantity.
I can correct the individuals schema but wouldn't be good for beacon to respect the phenopackets schema? I'm not saying I'm not going to do it, I just want to understand know why we are not following the same logic as for phenopackets schema. Thank you @redmitry !

@redmitry
Copy link
Collaborator

can correct the individuals schema but wouldn't be good for beacon to respect the phenopackets schema?
IMHO phenopackets is not a schema, but a format that can be represented in json.

Beacon's schema != phenopackets. We may create an entry type "phenopackets" and provide a schema for it.

Correct me if I'm wrong but I think the correction you made is because the value is not complex?
The correction to the CINECA data was done because it didn't comply with beacon v2 schema.

Cheers,

D.

@mrueda
Copy link
Collaborator

mrueda commented Jun 19, 2023

In my view, the difference between Beacon v2 and Phenopackets v2 schemas, specifically in the individuals entry type, likely stems from the reverse engineering of Phenopackets due to its Protobuff schema. If a JSON schema for Phenopackets v2 was available, this discrepancy wouldn't exist. As a developer working with BFF/PXF, these minor differences in defining identical terms/properties are impractical, especially since Beacon v2 was derived from Phenopackets and the goal is future interoperability.

The ideal scenario would be Phenopackets folks serializing their schema to JSON (if anybody from Phenopackets is reading this, please listen to us!! 😄). However, if this isn't feasible, I propose forming a small working group of two/three people (@mbaudis) to identify and gradually address these differences. I would be eager to participate, given the impact on my work.

Cheers,

m

@costero-e
Copy link
Collaborator Author

Ok, then at the moment we will stick to specifications and give the CINECA dataset the schema suggested by @redmitry for measurementValues until we find a solution for the adaptability to Phenopackets.

@mrueda
Copy link
Collaborator

mrueda commented Jun 21, 2023

@costero-e, IMO the key consideration here is whether the exclusion of the hierarchy 'quantity' from the Phenopacket schema was intentional or simply due to the challenges associated with reverse engineering the Protobuf schema. It should be noted that as of 2021, there were not many Phenopacket schemas available for 'Value'.

https://phenopacket-schema.readthedocs.io/en/master/value.html

If the downstream of the object 'Value' in Beacon v2 is identical to that in Phenopackets, it would be more reasonable to maintain consistency upstream as well. Instead of attempting to modify all replicas of CINECA's 'individuals.json', which is likely one of the few, if not the only, real examples that people have of Beacon v2 Models 😅 , it would be preferable to align the schema upstream and move on. What do you think @mbaudis?

Cheers,

Manu

@redmitry
Copy link
Collaborator

Ok, then at the moment we will stick to specifications and give the CINECA dataset the schema suggested by @redmitry for measurementValues until we find a solution for the adaptability to Phenopackets.

The problem is that "phenopackets-schema" is not a JSON schema.
Phenopackets lib provides json serialization/deserialization, but not with defined (via schema) format.

The question is if it is feasible to develop "phenopackets-like" json schema, because in this case phenopackets would need to provide serializer/deserializer for it.

The choices are:

  1. Leave it as is - Beacon uses it's own "phenopacket-like" format.
  2. Beacon use phenopackes serialization format and would need to craft JSON schema that is in agreement with phenpackets format.
  3. Develop complete JSON schema for phenopackets (incompatible with current phenopackets serialization).
    3.1 use traditional JSON schema pattern with oneOf()
    3.2 use VRS pattern where the type is explicitly set (e.g. "type": "Age") for each object

Cheers,

Dmitry

@costero-e
Copy link
Collaborator Author

Hi @mrueda and @redmitry. I agree with both of you and as we need to find a solution for CINECA individuals schema respecting both specifications and future phenopackets interoperability I would propose two solutions (if I am wrong or is not feasible what I propose, correct me, please):

  1. Leave CINECA schema like it is and make specifications more flexible. Maybe we could specify that measurementValues can adopt both solutions so it also accepts the "phenopackets-like format" for a quantity measurement.
  2. Attach to @redmitry's solution but adding quantity elsewhere in the schema (and also in the specifications), that the value refers to a quantitative measurement. Maybe like adding a "measurementType" or something like that.
    Have nice afternoon!

@costero-e costero-e deleted the hotfix_complexValue branch June 21, 2023 15:08
@costero-e costero-e added this to the 2.0.1 milestone Jul 16, 2024
@costero-e costero-e modified the milestones: 2.0.1, 2.1 Jul 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants