Permissively deserialize invalid temporal extents #1221

Closed
giswqs opened this issue Sep 22, 2023 · 3 comments · Fixed by #1222

@giswqs
Contributor

giswqs commented Sep 22, 2023

I have been using the following code to process the Maxar Open Data catalog for several months. It worked until today; now it raises ValueError: ISO string too short. I am not sure whether this is a pystac issue or an isoparser issue.

from pystac import Catalog
url = "https://maxar-opendata.s3.amazonaws.com/events/catalog.json"
root_catalog = Catalog.from_file(url)
collections = root_catalog.get_collections()
collections = [collection.id for collection in collections]


@gadomski
Member

Looks like (at least) one of their collections has invalid temporal extents, and needs to be corrected on their side:

$ curl -s https://maxar-opendata.s3.amazonaws.com/events/BayofBengal-Cyclone-Mocha-May-23/collection.json | jq .extent.temporal.interval
[
  "2023-01-03 04:30:17Z",
  "2023-05-22 04:35:25Z"
]

.extent.temporal.interval should be a list of lists: https://github.com/radiantearth/stac-spec/blob/master/collection-spec/collection-spec.md#temporal-extent-object
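To make the shape difference concrete, here is a minimal sketch of a structural check. The helper name `is_valid_interval` is hypothetical and not part of pystac; it only checks nesting (a list of two-element lists whose entries are strings or null), not whether the timestamps themselves are well-formed.

```python
from typing import Any


def is_valid_interval(interval: Any) -> bool:
    """Hypothetical helper: True if `interval` is a non-empty list of [start, end] pairs."""
    if not isinstance(interval, list) or not interval:
        return False
    return all(
        isinstance(pair, list)
        and len(pair) == 2
        # Each bound is a timestamp string, or None for an open-ended interval.
        and all(bound is None or isinstance(bound, str) for bound in pair)
        for pair in interval
    )


# The shape returned by the Maxar collection above: a flat list of strings.
flat = ["2023-01-03 04:30:17Z", "2023-05-22 04:35:25Z"]
# The same two values in the nesting the spec expects: a list of lists.
nested = [["2023-01-03 04:30:17Z", "2023-05-22 04:35:25Z"]]

print(is_valid_interval(flat))
print(is_valid_interval(nested))
```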

That being said, this is a common problem. On deserialization, we should probably permissively correct the problem with a warning. Leaving this open to track that need.
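As a rough illustration of what "permissively correct with a warning" could look like, here is a sketch that wraps a flat interval in an outer list. The function name `normalize_interval` and the warning text are assumptions made for this example; this is not necessarily how pystac implements the fix.

```python
import warnings
from typing import Any, List


def normalize_interval(interval: List[Any]) -> List[Any]:
    """Hypothetical sketch: wrap a flat [start, end] interval in an outer list."""
    # A valid interval is a list of lists; a flat one holds strings/None directly.
    if interval and not any(isinstance(item, list) for item in interval):
        warnings.warn(
            "temporal interval is a flat list, wrapping it in a list; "
            "the source collection is invalid STAC",
            UserWarning,
            stacklevel=2,
        )
        return [interval]
    return interval


fixed = normalize_interval(["2023-01-03 04:30:17Z", "2023-05-22 04:35:25Z"])
```

Warning rather than raising lets a catalog with one bad collection still load, while still telling the user that the upstream data needs fixing.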

@gadomski changed the title from "Maxar Open Data Catalog issue - ISO string too short" to "Permissively deserialize invalid temporal extents" on Sep 22, 2023
@gadomski
Member

@giswqs I just checked your test script against #1222, and it looks like that fixes it:

$ cat > test.py
from pystac import Catalog
url = "https://maxar-opendata.s3.amazonaws.com/events/catalog.json"
root_catalog = Catalog.from_file(url)
collections = root_catalog.get_collections()
collections = [collection.id for collection in collections]
$ python test.py 
/Users/gadomski/Code/stac-utils/pystac/pystac/collection.py:264: UserWarning: A collection's temporal extent should be a list of lists, but is instead a list of strings. pystac is fixing this issue and continuing deserialization, but note that the source collection is invalid STAC.
  warnings.warn(
$

So you can work from that branch until we're able to release an update.

@giswqs
Contributor Author

giswqs commented Sep 22, 2023

@gadomski Awesome! Thank you very much for the quick fix.
