Several popular cloud-computing platforms (e.g., AWS, Google Earth Engine, Microsoft Planetary Computer, NASA's Common Metadata Repository (CMR)) host many publicly available geospatial datasets. This repo compiles the list of geospatial datasets on these platforms as tab-separated values (TSV) files and as JSON files, making the datasets easier to find and use programmatically. The list is updated daily.
This repo provides the list of geospatial datasets in two formats, tab-separated values (TSV) and JSON:
- AWS Open Data: aws_open_datasets.tsv
- AWS Open Geospatial Data: aws_geo_datasets.tsv
- AWS Open Geospatial Data with STAC endpoint: aws_stac_catalogs.tsv
- STAC Index Catalogs: stac_catalogs.tsv
- Earth Engine Catalog: gee_catalog.tsv
- Planetary Computer Catalog: pc_catalog.tsv
- NASA CMR STAC Catalog: nasa_cmr_catalog.tsv
- AWS Open Data: aws_open_datasets.json
- AWS Open Geospatial Data: aws_geo_datasets.json
- AWS Open Geospatial Data with STAC endpoint: aws_stac_catalogs.json
- STAC Index Catalogs: stac_catalogs.json
- Earth Engine Catalog: gee_catalog.json
- Planetary Computer Catalog: pc_catalog.json
- NASA CMR STAC Catalog: nasa_cmr_catalog.json
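Since every catalog is published under the same file stem in both formats, a download URL can be assembled from the stem and the extension. The sketch below is one way to do that. `catalog_url` is a hypothetical helper, not something this repo ships, and it assumes the JSON files sit at the same raw path as the TSV file used in the pandas example in this README.

```python
# Base URL taken from the pandas example in this README.
BASE_URL = "https://github.com/giswqs/geospatial-data-catalogs/raw/master"

# File stems from the list above (without extension).
CATALOGS = (
    "aws_open_datasets",
    "aws_geo_datasets",
    "aws_stac_catalogs",
    "stac_catalogs",
    "gee_catalog",
    "pc_catalog",
    "nasa_cmr_catalog",
)


def catalog_url(name: str, fmt: str = "tsv") -> str:
    """Build the download URL for one catalog file (hypothetical helper).

    Assumes the .json files live next to the .tsv files on the same branch.
    """
    if name not in CATALOGS:
        raise ValueError(f"unknown catalog: {name!r}")
    if fmt not in ("tsv", "json"):
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{BASE_URL}/{name}.{fmt}"
```

For example, `catalog_url("gee_catalog", "json")` would give the address of the Earth Engine catalog in JSON form, under the path assumption stated above.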
Any of the TSV files can be read into a pandas DataFrame using the following code:
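import pandas as pd
# Raw URL of one catalog file; swap the file name for any TSV file listed above
url = 'https://github.com/giswqs/geospatial-data-catalogs/raw/master/aws_geo_datasets.tsv'
# The files are tab-separated, so set the separator explicitly
df = pd.read_csv(url, sep='\t')
# Preview the first five rows
df.head()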
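Once a catalog is in a DataFrame, a common next step is to look for datasets by keyword. The column names differ between catalogs and are not documented here, so the minimal sketch below makes no assumption about them and searches every column. `search_catalog` is a hypothetical helper name, not part of this repo.

```python
import pandas as pd


def search_catalog(df: pd.DataFrame, keyword: str) -> pd.DataFrame:
    """Return the rows in which any column contains `keyword`.

    The match is a case-insensitive literal substring test (no regex).
    Missing values never match.
    """
    mask = pd.Series(False, index=df.index)
    for col in df.columns:
        # Cast to pandas' string dtype so numeric columns can be searched too.
        values = df[col].astype("string")
        hits = values.str.contains(keyword, case=False, regex=False, na=False)
        mask = mask | hits
    return df[mask]
```

With the `df` loaded above, a call such as `search_catalog(df, "landsat")` would return only the rows that mention that keyword somewhere. Searching all columns is slower than targeting one, but it keeps the helper usable across catalogs whose schemas differ.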
Related projects:
- A list of open datasets on AWS: aws-open-data
- A list of open geospatial datasets on AWS: aws-open-data-geo
- A list of open geospatial datasets on AWS with a STAC endpoint: aws-open-data-stac
- A list of STAC endpoints from stacindex.org: stac-index-catalogs
- A list of geospatial datasets on Microsoft Planetary Computer: Planetary-Computer-Catalog
- A list of geospatial datasets on Google Earth Engine: Earth-Engine-Catalog
- A list of geospatial datasets on NASA's Common Metadata Repository (CMR): NASA-CMR-STAC
- A list of geospatial data catalogs: geospatial-data-catalogs
- The Maxar Open Data STAC Catalog: maxar-open-data