Add validation capabilities to PySTAC #118

Closed
lossyrob opened this issue Jul 23, 2020 · 0 comments · Fixed by #139
@lossyrob
Member

When using PySTAC it would be convenient to be able to validate STAC objects. This is true for incoming STAC objects - for example, you might want to validate the JSON of a STAC object against the schema for its own, older STAC version before reading it in with PySTAC (and thereby migrating it to the latest STAC version). It is also true of STAC you construct with PySTAC - though the structure of PySTAC tends towards producing valid STAC objects, a user can set a property in a way that breaks the specification (e.g. setting a string instead of an array of strings), and it would be convenient to be able to check validity on demand to catch this.

I believe that validation should not happen by default in the core package. For one, this would add a dependency to the library, which aims to minimize dependencies. Also, validation takes time - and when dealing with many STAC objects, it should be up to the user whether to pay the cost of that validation or avoid it altogether.

An approach to solving this is as follows:

Validate STACObjects

  • Create a new package, pystac_validator. This will expand on the SchemaValidator already used in the unit tests and be based on jsonschema validation. It will also define an abstract class so that other validators can easily be plugged in by extending that class.
  • Allow the registering of a validator in the core library. This allows flexibility for users to implement their own validation logic.
  • In the core library, create a .validate method on STACObject. This method will:
    • Check if there is a validator registered with PySTAC. If so, use that to validate the STACObject.
    • If no validator is registered, check to see if the pystac_validator library is importable. If so, use that validator.
    • If no validator is registered and pystac_validator cannot be imported, raise an exception.

This way validation can happen on-demand for any STACObject by calling its validate method.
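
The lookup order described in the list above could be sketched roughly as follows. This is only an illustration of the idea, not an existing PySTAC API: `STACValidator`, `register_validator`, `get_validator`, `STACObjectSketch` and the `default_validator()` factory on pystac_validator are all hypothetical names, and the real design may well differ.

```python
import importlib
from abc import ABC, abstractmethod
from typing import Optional


class STACValidator(ABC):
    """Hypothetical base class; custom validators would extend it."""

    @abstractmethod
    def validate(self, stac_dict: dict) -> None:
        """Raise an exception if stac_dict is not valid."""


# Module-level slot holding the user-registered validator, if any.
_registered_validator: Optional[STACValidator] = None


def register_validator(validator: STACValidator) -> None:
    """Register the validator that .validate() calls should use."""
    global _registered_validator
    _registered_validator = validator


def get_validator(fallback_module: str = "pystac_validator") -> STACValidator:
    """Resolve a validator using the three-step lookup from the list above."""
    # 1. Prefer an explicitly registered validator.
    if _registered_validator is not None:
        return _registered_validator
    # 2. Otherwise fall back to the optional package, if it can be imported.
    try:
        module = importlib.import_module(fallback_module)
    except ImportError as err:
        # 3. Nothing registered and nothing importable: fail loudly.
        raise RuntimeError(
            f"No validator registered and {fallback_module} is not installed"
        ) from err
    # Assumption: the package exposes a default_validator() factory.
    return module.default_validator()


class STACObjectSketch:
    """Stand-in for STACObject, only to show where validate() would live."""

    def __init__(self, data: dict) -> None:
        self._data = data

    def to_dict(self) -> dict:
        return self._data

    def validate(self) -> None:
        # Delegate to whichever validator the lookup resolves.
        get_validator().validate(self.to_dict())
```

Because the import of the optional package happens lazily inside `get_validator`, the core library would not need jsonschema installed unless a user actually calls `validate`.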

The validate method can also be written so that calling it on a catalog validates the whole catalog, including everything beneath it, in a single call.
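
One way a single call might cover a whole catalog is to walk the tree and validate each object in turn. The sketch below assumes a simple tree of nodes; `Node`, `walk` and `validate_all` are hypothetical stand-ins rather than PySTAC classes, and the per-object check is passed in as a plain callable.

```python
from typing import Callable, Iterator, List, Optional


class Node:
    """Hypothetical stand-in for a catalog, collection, or item."""

    def __init__(self, id: str, children: Optional[List["Node"]] = None) -> None:
        self.id = id
        self.children = children or []

    def to_dict(self) -> dict:
        return {"id": self.id}

    def walk(self) -> Iterator["Node"]:
        """Yield this node, then every descendant, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def validate_all(root: Node, validate_one: Callable[[dict], None]) -> int:
    """Validate root and everything beneath it; return how many were checked.

    validate_one is expected to raise on the first invalid object,
    which stops the walk.
    """
    count = 0
    for node in root.walk():
        validate_one(node.to_dict())
        count += 1
    return count
```

Stopping at the first failure keeps the sketch short; collecting all errors and reporting them together would be an equally reasonable choice.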

Validate JSON

In addition to validating STACObjects, users may want to validate incoming JSON before it is read in as PySTAC objects. This could be done with a pystac.validate(d) function that validates a dict based on the type and extensions identified by pystac.serialization.identify.
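
As a rough illustration of validating a raw dict, the sketch below first identifies the object type and then applies a check for that type. Everything here is an assumption made for the example: `identify_type` is a crude heuristic and not the logic in pystac.serialization.identify, `REQUIRED_KEYS` is a deliberately simplified subset of the fields the STAC spec requires, and a real implementation would run full JSON Schema validation (including extension schemas) instead of a key check.

```python
from typing import Dict, Tuple

# Simplified, illustrative subset of required top-level keys per object type.
REQUIRED_KEYS: Dict[str, Tuple[str, ...]] = {
    "CATALOG": ("id", "description", "links"),
    "COLLECTION": ("id", "description", "links", "license", "extent"),
    "ITEM": ("id", "type", "geometry", "properties", "links", "assets"),
}


def identify_type(d: dict) -> str:
    """Crude stand-in for object identification (hypothetical heuristic)."""
    if d.get("type") == "Feature":
        return "ITEM"
    if "extent" in d:
        return "COLLECTION"
    return "CATALOG"


def validate(d: dict) -> str:
    """Validate a raw STAC dict and return the type it was validated as."""
    object_type = identify_type(d)
    missing = [key for key in REQUIRED_KEYS[object_type] if key not in d]
    if missing:
        raise ValueError(f"{object_type} is missing required keys: {missing}")
    return object_type
```

Returning the identified type lets a caller confirm that the dict was checked against the schema they expected.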
