-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
One-dimensionalization of data models #613
Conversation
Dear @BrapiCoordinatorSelby, @VivianBass and I finished the editing of the schemas. Like we talked about, we added the `relationshipType` to every occurence of the attributes `additionalInfo` and `externalReferences` in every model. Also we flattend all models, so there shouldn`t be a nested model left or rather all models should be 1-dimensional. (As in Wittenberg, we proceeded here using separate models and associations). In some models there were location specifications (such as ImageLocation in Image.json). These attributes were effectively just a copy of GeoJSON, although GeoJSON exists as an independent model. Accordingly, we have changed the attributes so that they form an association to GeoJSON: GeoJSON ```json "GeoJSON": { "title": "GeoJSON", "type": "object", "description": "One geometry as defined by GeoJSON (RFC 7946). All coordinates are decimal values on the WGS84 geographic coordinate reference system.\n\nCopied from RFC 7946 Section 3.1.1\n\nA position is an array of numbers. There MUST be two or more elements. The first two elements are longitude and latitude, or\neasting and northing, precisely in that order and using decimal numbers. Altitude or elevation MAY be included as an optional third element.", "properties": { "geometryDbId": { "description": "Unique identifier for the geometry", "type": [ "null", "string" ] }, "geometry": { "description": "A geometry as defined by GeoJSON (RFC 7946). In this context, only Point or Polygon geometry are allowed.", "$ref": "GeoJSONGeometry.json#/$defs/GeoJSONGeometry" }, "type": { "type": "string", "default": "Feature", "example": "Feature", "description": "The literal string \"Feature\"" }, "image": { "description": "Geometry associated with an image", "$ref": "../BrAPI-Phenotyping/Image.json#/$defs/Image", "relationshipType": "many-to-one", "referencedAttribute": "imageLocation" }, "observation": { "description": "Geometry associated with an image", "$ref": "../BrAPI-Phenotyping/Observation.json#/$defs/Observation", "relationshipType": "many-to-one", "referencedAttribute": "geoCoordinates" }, "observationUnit": { "description": "Geometry associated with an image", "$ref": "../BrAPI-Phenotyping/ObservationUnit.json#/$defs/ObservationUnitPosition", "relationshipType": "many-to-one", "referencedAttribute": "geoCoordinates" }, "germplasmOrigin": { "description": "Geometry associated with an image", "$ref": "../BrAPI-Germplasm/Germplasm.json#/$defs/GermplasmOrigin", "relationshipType": "many-to-one", "referencedAttribute": "coordinates" } }, "required": [ "geometryDbId" ] } ``` New `ImageLocation` from `Image.json`: ```json "imageLocation": { "description": "One geometry as defined by GeoJSON (RFC 7946). All coordinates are decimal values on the WGS84 geographic coordinate reference system.\n\nCopied from RFC 7946 Section 3.1.1\n\nA position is an array of numbers. There MUST be two or more elements. The first two elements are longitude and latitude, or\neasting and northing, precisely in that order and using decimal numbers. Altitude or elevation MAY be included as an optional third element.", "relationshipType": "one-to-many", "referencedAttribute": "image", "items": { "$ref": "../BrAPI-Common/GeoJSON.json#/$defs/GeoJSON", "description": "A geometry as defined by GeoJSON (RFC 7946). In this context, only Point or Polygon geometry are allowed." }, "title": "GeoJSON", "type": [ "null", "array" ] } ``` Could you please check or changes, if there is something wrong or if you like us to change other things? Thank you and best regards! :)
Thanks @LzLang, I will check it out and run it through the schema validator. I have a suggestion about the additions (not the re-organization part). Would if be possible to put this in a different branch, since these change the model? So keep the data-model-separation for v2.1? Alternatively we could merge the data model in to 2.1 branch and then create a new branch for 2.2 model changes? What do you think @BrapiCoordinatorSelby? |
@LzLang I got the following validation errors: I will make the corrections and commit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice a few model changes from single to array and use of Plural when Singular is appropriate.
"$defs": { | ||
"Species": { | ||
"properties": { | ||
"specieDbId": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be speciesDbId
@@ -4,7 +4,15 @@ | |||
"properties": { | |||
"additionalInfo": { | |||
"description": "A free space containing any additional information related to a particular object. A data source may provide any JSON object, unrestricted by the BrAPI specification.", | |||
"$ref": "../BrAPI-Common/AdditionalInfo.json#/$defs/AdditionalInfo" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes the model, from a single to an array of AdditionalInfo, can this be put in v2.2? See in same changes in other entities.
"referencedAttribute": "study", | ||
"items": { | ||
"$ref": "../BrAPI-Core/Study.json#/$defs/ExperimentalDesign", | ||
"description": "ExperimentalDesign" | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Model change from a single ExperimentalDesign to an array of ExperimentalDesign. Not sure this is appropriate.
"type": [ | ||
"null", | ||
"object" | ||
"array" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Model change from a single grow facility to array of growth facilities!
"type": [ | ||
"null", | ||
"object" | ||
"array" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Model change from single lastUpdate to array of lastUpdates!
}, | ||
"title": "GeoJSON", | ||
"type": [ | ||
"null", | ||
"object" | ||
"array" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Model change from single to array
}, | ||
"title": "GeoJSON", | ||
"type": [ | ||
"null", | ||
"object" | ||
"array" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Model change from single to array
} | ||
}, | ||
"type": "object" | ||
"$ref": "../BrAPI-Phenotyping/OntologyReference.json#/$defs/DocumentationLinks", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be DocumentationLink
"relationshipType": "one-to-many", | ||
"referencedAttribute": "scale", | ||
"items": { | ||
"$ref": "../BrAPI-Phenotyping/Scale.json#/$defs/ValidValues", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be ValidValue
"primaryModel": false | ||
} | ||
}, | ||
"Categories": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be Category
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some more comments to previous review.
}, | ||
"Progeny": { | ||
"properties": { | ||
"parentDbId": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there should be an ID associated with a 'Progeny', since it is a Value Type and not an Entity. In any case it should be progenyDbId and not parentDbId.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it to progenyDbId
Zendro is using the models for a data warehouse.
The problem we have is, that we need a primary key, which is why I added it.
If @asishallab don't mind I could remove it and add the primary key code side.
}, | ||
"Parents": { | ||
"properties": { | ||
"parentDbId": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there should be an ID associated with a 'Parent', since it is a Value Type and not an Entity.
"parentType" | ||
], | ||
"type": "object" | ||
"$ref": "../BrAPI-Germplasm/PedigreeNode.json#/$defs/Progeny", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the team GermplasmChild, since it is more specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you prefere, that we rename the whole "Progeny" Model to "GermplasmChild"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes please change :)
} | ||
}, | ||
"required": [ | ||
"crossDbId" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Listed as required but is not a property of CrosssAttribute
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it accordingly to crossAttributeDbId
Thank you very much @daveneti for your review and corrections!
Personally I don't see a problem with putting it in a different branch instaed of merching it. I have now noticed something else: {
"$defs": {
"TraitDataType": {
"description": "<p>Class of the scale, entries can be</p>\n<p>\"Code\" - This scale class is exceptionally used to express complex traits. Code is a nominal scale that combines the expressions of the different traits composing the complex trait. For example a severity trait might be expressed by a 2 digit and 2 character code. The first 2 digits are the percentage of the plant covered by a fungus and the 2 characters refer to the delay in development, e.g. \"75VD\" means \"75 %\" of the plant is infected and the plant is very delayed.</p>\n<p>\"Date\" - The date class is for events expressed in a time format, See ISO 8601</p>\n<p>\"Duration\" - The Duration class is for time elapsed between two events expressed in a time format, e.g. days, hours, months</p>\n<p>\"Nominal\" - Categorical scale that can take one of a limited and fixed number of categories. There is no intrinsic ordering to the categories</p>\n<p>\"Numerical\" - Numerical scales express the trait with real numbers. The numerical scale defines the unit e.g. centimeter, ton per hectare, branches</p>\n<p>\"Ordinal\" - Ordinal scales are scales composed of ordered categories</p>\n<p>\"Text\" - A free text is used to express the trait.</p>",
"enum": [
"Code",
"Date",
"Duration",
"Nominal",
"Numerical",
"Ordinal",
"Text"
],
"type": "string",
"example": "Numerical"
}
},
"$id": "https://brapi.org/Specification/BrAPI-Schema/Components/Common/AdditionalInfo.json",
"$schema": "http://json-schema.org/draft/2020-12/schema"
} Is there a specific reason why these have been removed? Once again, thank you very much for your review. |
Hi @LzLang Thanks for your comment and changes. Yes, creating a new branch for v2.2 is a great idea, perhaps that can be done with the other 2.2 changes @BrapiCoordinatorSelby will add. I am not actually sure if there any many model changes. I think I might have been confused. If you can review my suggestions and make any changes, then we can review again to be sure. The comment you made about properties I will double check when get back to my computer. |
Hi @LzLang, back at my computer now. For AdditionalInfo I did not see any properties on the original definition, only addtionalProperties, which is still there. Can you explain further your comment on theses please? I can see in the example the id is incorrect. That looks Iike C&P error, I will correct that. It should be "$id": "https://brapi.org/Specification/BrAPI-Schema/Components/Common/TraitDataType.json" I will add a check in the validator that looks for those mistakes. |
Hey @daveneti,
Sorry, for the confusion! This is a mistake on my part. "additionalInfo": {
"description": "A free space containing any additional information related to a particular object. A data source may provide any JSON object, unrestricted by the BrAPI specification.",
"$ref": "../BrAPI-Common/AdditionalInfo.json#/$defs/AdditionalInfo"
"relationshipType": "one-to-many",
"items": {
"$ref": "../BrAPI-Common/AdditionalInfo.json#/$defs/AdditionalInfo",
"description": "AdditionalInfo"
},
"type": [
"null",
"array"
]
} If @asishallab and @VivianBass have no objections, I think we can adopt the changes without further problems and use them as a basis for v2.2 and our own project. |
Thanks @LzLang In the above 'additionalInfo' example, be careful you have the $ref to two places. I don't think it should be in the 'items', and the 'type' should be removed. Also I think the relationship will be one-to-one. E.g.
Can you make all the changes in this branch, then I can review again, before merging into 'data-model-separation'. I will need to check again with the validator and then I can merge this modified 2.1 version into our extended versions. I don't think there are any model changes so this would be still be 2.1. |
Thank you @daveneti for your advice, this simple slipped my eyes.
Certainly!
There are some model changes, because there were some nested models left, which we one-dimensionalized. Edit: I changed the models accordingly to your comments. I will check the models and make necessary changes (e.g. after renaming the model to it's singular form, I need to correct associations to this model). |
So far I changed the models accordingly to our conversation. For me, there are more problems with plural associations. "dataLinks": {
"description": "List of links to extra data files associated with this study. Extra data could include notes, images, and reference data.",
"items": {
"$ref": "DataLink.json#/$defs/DataLink"
},
"type": [
"null",
"array"
]
}
"parent1": {
"$ref": "CrossParent.json#/$defs/CrossParent",
"description": "the first parent used in the cross"
},
"parent2": {
"$ref": "CrossParent.json#/$defs/CrossParent",
"description": "the second parent used in the cross"
}
"potentialParents": {
"description": "A list of all the potential parents in the crossing block, available in the crossing project\n<br/> If the parameter 'includePotentialParents' is false, the array 'potentialParents' should be empty, null, or excluded from the response object.",
"items": {
"$ref": "CrossParent.json#/$defs/CrossParent"
},
"type": [
"null",
"array"
]
}
"parent1": {
"$ref": "CrossParent.json#/$defs/CrossParent",
"description": "the first parent used in the cross"
},
"parent2": {
"$ref": "CrossParent.json#/$defs/CrossParent",
"description": "the second parent used in the cross"
} What relationshipType would you prefer for those associations? Another thing I noticed while revising: "crosses": {
"title": "crosses",
"description": "crosses",
"referencedAttribute": "plannedCross",
"relationshipType": "one-to-many",
"items": {
"$ref": "Cross.json#/$defs/Cross",
"description": "Cross"
},
"type": [
"null",
"array"
]
}
"siblings": {
"description": "A list of sibling germplasm references in the pedigree tree for this germplasm. These represent edges in the tree, connecting to other nodes.\n<br/> Siblings share at least one parent with the given germplasm. \n<br/> If the parameter 'includeSiblings' is set to false, then this array should be empty, null, or not present in the response.",
"items": {
"$ref": "Germplasm.json#/$defs/Germplasm",
"description": "The ID which uniquely identifies a parent germplasm",
"referencedAttribute": "siblingPedigreeNodes",
"relationshipType": "many-to-one"
},
"type": [
"null",
"array"
]
} The first definition seems like the old one. |
@LzLang Thanks for all your work, there is a lot to unpack there and consider. On the question of which relationship type: You should add the relationship between CrossParent and Germplasm or ObservationUnit as 'one-to-many Well spotted on the non-uniform placement of the referencedAttribute and relationshipType. I had not noticed these! I propose that these are associated with the property and not the items in a array. So the example for crosses in PlannedCross. The schema validator will only look for these in the property and not the items. I will now review the PR. |
@LzLang Did you commit your changes? I did not see a new commit from you. Can you commit the changes to the branch and then I will review again. If there are any model changes, we can put these in a separate branch. Putting the models in a separate top level definition if it does not change the cardinality of the relationship, is not a model change, so I think there are actually none or few model changes. Are there any additional properties or changes from one object to array or vice versa? I don't think there are any. I can do a final check on the PR once you have committed the changes. |
@daveneti I made the changes we talked about and commited it-
No, I don't think that we changed models in that way, or rather I reversed those changes according to your review. So far I have made the changes we discussed. I will of course fix the errors/problems you find in the review as soon as possible. We still have one question regarding GeoJSONGeometry: Sidenote: I'm currentyl helping my parents to move and I'm moving myself too, which is why my work on the models has been delayed somewhat. |
@LzLang Thanks for the changes, I will review now For GeoJSONGeometry I don't have a suggestion. Perhaps @BrapiCoordinatorSelby has one? In the Schema tools generator for GraphQL I added a TypeResolver, that simply returned the first type to get the schema to generate. Ideally, we need to use a discriminator, which is the 'type'property. Since I was not generating a working API, it did not matter to me, which was default.
|
PS, @LzLang it would be great soon to compare the GraphQL schema you have developed and the one the Schema tools generates. I will add soon a CI build to generate it each time the model changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LzLang @BrapiCoordinatorSelby I finished the review. I think there is a little bit more work to do. Mainly because of the addition of DbIds in some 'Value Types', which changes the model. Also the GermplasmParent, Progeny/GermplasmChild needs some discussion.
@LzLang I finished the latest review. There are some minor changes that can be fixed easily and some more major questions that needs some input from @BrapiCoordinatorSelby that where in the original JSON Schema. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LzLang I think there is only two remaining issue. Sorry it has taken so long!
- The issues are: Link from study to growth facility should be one-to-many, so the link should not be array.
- I think we can remove SourceGermplasm
I will make the changes in the branch directly, you can merge when you want. I will also run the validator.
@@ -2,13 +2,6 @@ | |||
"$defs": { | |||
"SourceGermplasm": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this link I would propose to drop SourceGermplasm directly. So in germplasm you would have in Reference and Reference Set
"sourceGermplasm": {
"description": "All known corresponding Germplasm",
"relationshipType": "many-to-many",
"referencedAttribute": "reference",
"items": {
"$ref": "../BrAPI-Germplasm/Germplasm.json#/$defs/Germplasm",
},
"title": "SourceGermplasm",
"type": [
"null",
"array"
]
Then you could add the back links in Germplasm
"referencedAttribute": "study", | ||
"$ref": "../BrAPI-Core/Study.json#/$defs/GrowthFacility" | ||
"items": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correct before
"$ref": "../BrAPI-Core/Study.json#/$defs/GrowthFacility"
it was just the relationship type that was incorrect, which you corrected to 'many-to-one'
@@ -462,17 +455,17 @@ | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure now, I think it is ok now :)
Dear @BrapiCoordinatorSelby,
@VivianBass and I finished the editing of the schemas. Like we talked about, we added the
relationshipType
to every occurence of the attributesadditionalInfo
andexternalReferences
in every model. Also we flattend all models, so there shouldn`t be a nested model left or rather all models should be 1-dimensional. (As in Wittenberg, we proceeded here using separate models and associations).In some models there were location specifications (such as ImageLocation in Image.json). These attributes were effectively just a copy of GeoJSON, although GeoJSON exists as an independent model. Accordingly, we have changed the attributes so that they form an association to GeoJSON: GeoJSON
New
ImageLocation
fromImage.json
:Could you please check or changes, if there is something wrong or if you like us to change other things?
Thank you and best regards! :)