Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust handling of unknown properties #456

Open
wants to merge 12 commits into
base: develop
Choose a base branch
from

Conversation

rartino
Copy link
Contributor

@rartino rartino commented Feb 20, 2023

In discussions related to issue #406 we've found that the current handling of unknown properties in OPTIMADE is not optimal for interoperability, as queries using a new property will generate errors in databases that implement an older version of the standard.

This PR is a suggestion for fixing this issue for providers implementing 1.2 and forward. (However, queries to databases that remain on 1.1 will still generate errors)."

@rartino rartino requested a review from ml-evs February 20, 2023 12:11
optimade.rst Outdated Show resolved Hide resolved
Co-authored-by: Antanas Vaitkus <antanas.vaitkus90@gmail.com>
Copy link
Member

@ml-evs ml-evs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @rartino, thanks for writing this up.

I have one outstanding question before we merge: we focus here on filtering, but I think response_fields has the same problem.

If I ask for structure_origin from my own implementation (100% standards compliant, I'm sure...) at the moment (https://optimade.odbx.science/v1/structures?response_fields=structure_origin) then I get an error. Should I instead return null for the field (with a warning) after this PR?

@rartino
Copy link
Contributor Author

rartino commented Feb 27, 2023

@ml-evs Excellent point. Actually, I don't see anything in the specification right now about returning or not returning errors for unknown response_fields. After looking around, I think this segment of text is the right place for that.

But, yes, should we force returning null + a warning? Or should we allow to not include it in the response even when requested? Both?

@ml-evs
Copy link
Member

ml-evs commented Feb 27, 2023

@ml-evs Excellent point. Actually, I don't see anything in the specification right now about returning or not returning errors for unknown response_fields. After looking around, I think this segment of text is the right place for that.

But, yes, should we force returning null + a warning? Or should we allow to not include it in the response even when requested? Both?

I think returning null + warning would be fine, and maps to what we have already for known-not-implemented fields. The current wording for another-database-provider-field means that if I request _random_field (e.g., https://optimade.odbx.science/v1/structures?response_fields=_random_field&page_limit=1) then I have to return null but without a warning right? Do we want a warning in all cases for consistency?

@rartino
Copy link
Contributor Author

rartino commented Feb 27, 2023

The current wording for another-database-provider-field means that if I request _random_field (e.g., https://optimade.odbx.science/v1/structures?response_fields=_random_field&page_limit=1) then I have to return null but without a warning right? Do we want a warning in all cases for consistency?

I think the thought with suppressing that warning was that it is a somewhat expected situation even with databases all at the latest version, and thus shouldn't trigger warnings. This exception wasn't written by me and I don't feel strongly either way. But, can we remove it now? Strictly speaking this is actually a backwards-breaking change in that a clients written to the older specification do not expect to get a warning in these cases, and may now start breaking.

@rartino
Copy link
Contributor Author

rartino commented Feb 27, 2023

@ml-evs Ok, I've updated the text to handle response_fields. Does this seem ok?

ml-evs
ml-evs previously approved these changes Feb 27, 2023
Copy link
Member

@ml-evs ml-evs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rartino, this looks clear to me now!

@rartino
Copy link
Contributor Author

rartino commented Feb 28, 2023

@ml-evs Sorry for re-editing this after approval, but I have given this some more thought.

First, I think you are right that we should (and can) remove the differences in handling between "recognized" and "unrecognized" prefixes as they complicate the specification, implementation, and the difference may be confusing for users. I also really didn't like the sentence referring to putting a note in "diagnostic output" since we do not have a place for such output, and it is not good if there is no information whatsoever to the client that an unrecognized property has been given a null value. So, first I started looking at introducing a notifications list alongside warnings, but then, after reading the text carefully, I realized that the formulation that no warning is to be given for unrecognized properties in ~ "others' recognized prefixes" is subjective enough to be meaningless, i.e., existing standards-conformant clients have to already accept warnings here since they cannot know what others' prefixes the implementation has chosen to recognize.

I also realized that to be able to handle the future-proofing properly, we ought to strengthen the interaction between future properties and structure_features. So, I have now explicitly clarified how these mechanisms are meant to work together so new properties can be introduced without affecting old data.

Finally, since I'm already editing this quite heavily and "Unknown properties" is a term we use for other things, I've systematically changed the term for this to "unrecognized property names", including renaming the whole section.

optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
Copy link
Member

@ml-evs ml-evs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the important clarifications @rartino! A few minor comments:

optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
@JPBergsma JPBergsma added this to the v1.2 milestone May 3, 2023
@rartino
Copy link
Contributor Author

rartino commented May 4, 2023

@ml-evs Is there anything still blocking the approval? I looked over your final set of comments again, but I don't see anything to act on. You argue that the features flag should be named the same as the field; but to reiterate the argument from above, the point is that a "feature" doesn't need to map cleanly to a single field; it could possibly even involve using keys or values in otherwise defined fields with unusual meaning.

A client that understands what _exmpl_strangeness is will know to look for and interpret all the, say, 20 different fields related to describing all aspects of the strangeness. All other clients will just se the black box _exmpl_strangeness in stucture_features and know to dismiss those structures.

optimade.rst Outdated Show resolved Hide resolved
Co-authored-by: Antanas Vaitkus <antanas.vaitkus90@gmail.com>
Copy link
Member

@ml-evs ml-evs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Purely formatting changes -- I think these are uncontroversial so will commit them and then add an additional commit that adds :warning-code: to the registered tags. actually went with "field-val" for the warning code, I think this is appropriate but will leave it for @rartino to confirm.

optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
optimade.rst Outdated Show resolved Hide resolved
Co-authored-by: Matthew Evans <7916000+ml-evs@users.noreply.github.com>
optimade.rst Outdated Show resolved Hide resolved
@rartino rartino requested review from ml-evs and JPBergsma May 17, 2023 13:54
@rartino rartino added the PR/ready-for-review Add this flag if you are the author of the PR and you want it to be reviewed. Remove it when editing label May 17, 2023
@JPBergsma
Copy link
Contributor

I am sorry, but I still do not see why we need to add special values to structures_features.
structures_featuresis now used to indicate which properties(in the general sense) a structure has. But I would not consider more metadata like fields (for example, the method with which the structure was determined) as features of the structure itself.
Adding a query parameter to test whether a property is set to null, i.e. explicitly defined as unknown, seems a more natural solution to me.

Could you perhaps remove lines 2858, 2859, 435 and 436 And place those in a separate PR?
Those are the controversial lines for me.
The rest of the proposal I could accept straight away.


When an implementation receives a request with a query filter that refers to an unknown property name it is handled differently depending on the database-specific prefix:
When an implementation receives a request with a query filter or other mechanism (e.g., :query-param:`response_fields`) that refers to a property name that is not recognized, it MUST NOT treat this as an error but rather MUST issue a warning with :field:`code` set to :field-val:`unrecognized_property`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to understand this better, does "unrecognized property" now mean unsupported OPTIMADE properties too? I recall that prior this PR "unknown" stood for properties with database-specific prefixes. So now for example an implementation can choose to "not recognize" structures' chemical_formula_anonymous?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this change is the most vital part of this PR to improve forwards compatibility in OPTIMADE.

The point is, when we add, e.g., spacegroup to the standard, and you send a query to two databases asking to include spacegroup among the response fields (or referencing it in the filter), you do not want an error; you want it treated as null.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so the client will have to inspect the warnings to make sure that the empty return list means the database does not implement spacegroup, not that it does not contain any structures with the queried space group.

optimade.rst Outdated Show resolved Hide resolved
@rartino rartino requested a review from merkys June 8, 2023 22:01
@rartino
Copy link
Contributor Author

rartino commented Jun 9, 2023

Workshop: no agreement if this is the right protocol for futureproofing requestes.

An argument made is that this can be handled on the client side by rewriting filters and requested fields dynamically for the version of each server the client communicates with.

@rartino rartino added status/no-consensus There is no consensus on this issue, it needs to be further discussed and developed. and removed PR/ready-for-review Add this flag if you are the author of the PR and you want it to be reviewed. Remove it when editing labels Jun 11, 2023
@rartino
Copy link
Contributor Author

rartino commented Jun 11, 2023

@ml-evs: I suggest we reiterate this discussion at the release web meet? We need to do something to improve forward compatibility. Perhaps some general "tags" field in all endpoints to do something like what you suggested for structure_features.

@rartino
Copy link
Contributor Author

rartino commented Jan 17, 2024

After some discussions: lets remove all parts except the way for old servers to handle references to unknown fields from newer clients and see if we can get consensus for that.

The handling with structure_features for older clients to handle newer servers (with fields that alter the data interpretation) can instead use the major version number.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR/requires-discussion status/no-consensus There is no consensus on this issue, it needs to be further discussed and developed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants