Metadata #72

Merged
merged 14 commits into from
Oct 7, 2024

zacdezgeo
Collaborator

@zacdezgeo zacdezgeo commented Oct 1, 2024

What's changed

  • Metadata additions: Created a STAC catalog from the Parquet source file.
  • New metadata files: Added STAC metadata files, including catalog.json, sources.json, and space2stats.json.
  • Notebook updates: Added a notebook (metadata.ipynb) to handle metadata generation for Space2Stats.

How to test it

  1. Open the metadata.ipynb notebook and run the cells to generate the STAC catalog.
  2. Validate the metadata files in the stac folder for completeness and accuracy.
  3. Run stac-check on the generated STAC catalog to verify metadata compliance.
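Before running stac-check, a quick structural pass can catch obviously incomplete files. The sketch below is a minimal, stdlib-only check, assuming the files are STAC Catalog or Collection objects; the helper names `missing_required_fields` and `check_file` are hypothetical and not part of this PR. It does not replace stac-check, which validates far more than field presence.

```python
import json
from pathlib import Path

# Required top-level fields for each object type, per the STAC 1.0.0 spec.
REQUIRED_FIELDS = {
    "Catalog": ["type", "stac_version", "id", "description", "links"],
    "Collection": [
        "type", "stac_version", "id", "description",
        "license", "extent", "links",
    ],
}


def missing_required_fields(doc: dict) -> list[str]:
    """Return the required STAC fields absent from a Catalog or Collection.

    Other object types are only checked for the presence of "type".
    """
    required = REQUIRED_FIELDS.get(doc.get("type"), ["type"])
    return [field for field in required if field not in doc]


def check_file(path: str) -> list[str]:
    """Load one STAC JSON file and report its missing required fields."""
    return missing_required_fields(json.loads(Path(path).read_text()))
```

Calling `check_file` on each file in the stac folder and confirming that every result is an empty list gives a cheap sanity check before the full validation in step 3.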

Other Notes

@zacdezgeo
Collaborator Author

@andresfchamorro, feel free to update the description or add comments here.

@zacdezgeo zacdezgeo added documentation Improvements or additions to documentation enhancement New feature or request labels Oct 1, 2024
Collaborator Author

@zacdezgeo zacdezgeo left a comment

This is looking great @andresfchamorro! The only change requested is moving the folder to a better location.

I see three points where the metadata could come into play:

  1. Letting a user browse the metadata and understand the dataset better. This could be accomplished by serving a copy to the STAC browser, which would require that the copy be updated regularly. I tried using the GitHub repo file link directly and got a CORS error.
  2. Retrieving fields. After discussing this with @alukach, I propose we continue relying on the database directly to retrieve the list of available fields.
  3. Ensuring consistency between metadata and data upon ingestion (#73, "Add checks to ensure data and metadata are in sync upon ingestion"). We should add checks to avoid any drift between the metadata and the underlying data in the database. This would make point 2 more sensible.
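The drift check described in the third point could look roughly like the sketch below. It assumes the field names documented in the STAC metadata and the column names in the database table have already been extracted into plain lists; `find_drift`, `assert_in_sync`, and the sample field names are hypothetical and not taken from this repository.

```python
def find_drift(metadata_fields, table_columns):
    """Compare field names in the metadata against the table's columns."""
    documented = set(metadata_fields)
    actual = set(table_columns)
    return {
        # Documented in the metadata but absent from the data.
        "missing_from_data": sorted(documented - actual),
        # Present in the data but never documented.
        "missing_from_metadata": sorted(actual - documented),
    }


def assert_in_sync(metadata_fields, table_columns):
    """Raise ValueError if metadata and data disagree on field names."""
    drift = find_drift(metadata_fields, table_columns)
    if drift["missing_from_data"] or drift["missing_from_metadata"]:
        raise ValueError(f"Metadata and data are out of sync: {drift}")


# Hypothetical field names, for illustration only.
report = find_drift(["hex_id", "sum_pop_2020"], ["hex_id", "sum_pop_2025"])
```

Running `assert_in_sync` as part of ingestion would make the load fail loudly instead of letting the metadata and the database diverge silently.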

notebooks/METADATA/stac/catalog.json (outdated, resolved)
notebooks/METADATA/stac/catalog.json (outdated, resolved)
notebooks/METADATA/stac/space2stats/space2stats.json (outdated, resolved)
@zacdezgeo
Collaborator Author

@andresfchamorro @alukach; just need a ✅ to merge

Collaborator

@andresfchamorro andresfchamorro left a comment

@zacharyDez Did you make these changes manually, or can you update the notebook so that we don't recreate it in the future without these changes?

Collaborator

@andresfchamorro andresfchamorro left a comment

Somehow the content on catalog description and purpose got trimmed too...

@andresfchamorro
Collaborator

ok i think i fixed what I mentioned in the comments so I will merge now :) also copied it to gh-pages so we can view it here: https://radiantearth.github.io/stac-browser/#/external/worldbank.github.io/DECAT_Space2Stats/stac/catalog.json?.language=en

@andresfchamorro andresfchamorro merged commit 7232c49 into main Oct 7, 2024
2 checks passed
@zacdezgeo zacdezgeo deleted the metadata branch October 7, 2024 15:43
Development

Successfully merging this pull request may close these issues.

Metadata needs research
3 participants