Imagery Dataset Naming Conventions

Imagery dataset titles and S3 paths are constructed from metadata about each imagery survey so that they will be consistent and human readable. Imagery is stored according to the main region that each dataset covers. Datasets that cover two or more full regions may instead be stored under new-zealand.

Imagery Dataset Titles

The imagery dataset title is constructed from metadata that is entered when an imagery dataset is processed.

[<geographic_description>|<region>] <gsd>m [<survey_number>|<geospatial_category>] (<start_year>[-<end_year>?])[ - <lifecycle>?]

which can be broken down as:

  • if it exists, the <geographic_description> is used, if not, <region> is used instead (this would be the case where the imagery dataset contains full coverage of the region)
  • then <gsd> (which is always in metres)
  • if it exists, the <survey_number> is used, if not, <geospatial_category> is used instead
  • then <start_year> (using all four digits to indicate the year)
  • if the imagery dataset was captured over multiple years, include a hyphen and the <end_year> (using all four digits to indicate the year)
  • if the imagery dataset has been processed as a QC preview, "- Preview" is added at the end of the title; if it only represents partial capture, "- Draft" is added instead (both are derived from the dataset <lifecycle>)
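The rules above can be sketched as a small function. This is only an illustration, not the production implementation: the function name `build_title` and its parameter names are hypothetical and simply mirror the placeholders in the template, and the lifecycle-to-suffix mapping is taken from the lifecycle component description later in this document.

```python
from typing import Optional


def build_title(
    gsd: str,
    start_year: int,
    region: str,
    geospatial_category: str,
    geographic_description: Optional[str] = None,
    survey_number: Optional[str] = None,
    end_year: Optional[int] = None,
    lifecycle: Optional[str] = None,
) -> str:
    """Assemble a dataset title from its metadata (hypothetical helper)."""
    # Prefer the geographic description; fall back to the region name.
    name = geographic_description or region
    # Prefer the survey number; fall back to the geospatial category.
    kind = survey_number or geospatial_category
    # Only show a year range when capture spanned more than one year.
    years = str(start_year)
    if end_year is not None and end_year != start_year:
        years = f"{start_year}-{end_year}"
    title = f"{name} {gsd}m {kind} ({years})"
    # Any lifecycle value other than these two adds nothing to the title.
    suffixes = {"preview": " - Preview", "ongoing": " - Draft"}
    return title + suffixes.get(lifecycle or "", "")


print(build_title("0.05", 2023, "Waikato", "Urban Aerial Photos",
                  geographic_description="Hamilton"))
```

Called with the Hamilton metadata from the Examples section, this reproduces the title listed there.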

Imagery Dataset S3 Paths

The imagery dataset S3 path is also constructed from similar metadata.

<region>/
  [<geographic_description>|<region>][_<survey_number>?]_<start_year>[-<end_year>?]_<gsd>m/
    <product>/
      <crs>/

which can be broken down as:

  • the main <region> that the dataset covers
  • then if it exists, the <geographic_description> is used, if not, <region> is repeated instead (this would be the case where the imagery dataset contains full coverage of the region)
  • if it exists, the <survey_number> is used
  • then <start_year> (using all four digits to indicate the year)
  • if the imagery dataset was captured over multiple years, include a hyphen and the <end_year> (using all four digits to indicate the year)
  • then <gsd> (which is always in metres)
  • then <product> as multiple products may be created from the same imagery survey
  • then <crs> as we may store the data in different coordinate reference systems for different purposes
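As a rough sketch of how these components fit together, the function below joins them in the order given above. The name `build_path` and its parameters are hypothetical, and it assumes every text component has already been converted to the allowed path characters described under S3 Path Restrictions. The defaults for `product` and `crs` reflect the values this document describes as the most common ones.

```python
from typing import Optional


def build_path(
    region: str,
    start_year: int,
    gsd: str,
    geographic_description: Optional[str] = None,
    survey_number: Optional[str] = None,
    end_year: Optional[int] = None,
    product: str = "rgb",
    crs: int = 2193,
) -> str:
    """Assemble the dataset path below the bucket (hypothetical helper)."""
    # The dataset folder starts with the description, or the region repeated.
    parts = [geographic_description or region]
    # The survey number is only present for some datasets.
    if survey_number:
        parts.append(survey_number)
    # Only show a year range when capture spanned more than one year.
    if end_year is not None and end_year != start_year:
        parts.append(f"{start_year}-{end_year}")
    else:
        parts.append(str(start_year))
    parts.append(f"{gsd}m")
    return f"{region}/{'_'.join(parts)}/{product}/{crs}/"


print(build_path("nelson", 2022, "0.15",
                 geographic_description="top-of-the-south-flood"))
```

Prefixing the result with the bucket (`s3://nz-imagery/` in the Examples section) and appending `collection.json` gives the full location of the collection file.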

S3 Path Restrictions

The path is restricted to a limited set of characters with no whitespace: lowercase "a" through "z", numbers "0" through "9", hyphen ("-"), and underscore ("_"). When generating a dataset S3 path, the system passes these characters through unchanged and transforms many others into allowed characters; see the subsections below for details. Any character not mentioned in this section or its subsections will result in an error.

Uppercase characters

Uppercase characters are changed to lowercase. For example, "Wellington" is changed to "wellington".

Diacritics

Characters with diacritics, such as macrons ("ā", "ē", etc.), are transliterated to the corresponding unaccented Latin letter. For example, a dataset with "Ōmāpere" in the title would have "omapere" in the path.

Spaces, commas, and slashes

These characters are replaced with a hyphen. For example, "Tikitapu/Blue Lake" is changed to "tikitapu-blue-lake".

Apostrophes

These are removed, so "Hawke's Bay" is changed to "hawkes-bay".

Ampersands

These are replaced with "-and-", so "Gore A&P Showgrounds" is changed to "gore-a-and-p-showgrounds".

Other characters

"ø" is transliterated to "o", so "Mount Brøgger" is changed to "mount-brogger".
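The transformations in this section could be combined along the following lines. This is a minimal sketch, not the actual implementation: `slugify_component` is a hypothetical name, it is meant for a single free-text component such as the geographic description, and it strips diacritics with Unicode NFD decomposition, which is an assumption about how the transliteration might be done. The document does not say whether repeated hyphens are collapsed, so the sketch leaves them as they are.

```python
import unicodedata

ALLOWED = set("abcdefghijklmnopqrstuvwxyz0123456789-_")


def slugify_component(value: str) -> str:
    """Convert free text to allowed path characters (hypothetical helper)."""
    # Uppercase characters are changed to lowercase.
    value = value.lower()
    # "ø" has no Unicode decomposition, so it is handled explicitly.
    value = value.replace("ø", "o")
    # Split letters from their diacritics, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", value)
    value = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Apostrophes are removed; ampersands become "-and-".
    value = value.replace("'", "").replace("&", "-and-")
    # Spaces, commas, and slashes are replaced with a hyphen.
    for ch in " ,/":
        value = value.replace(ch, "-")
    # Anything still outside the allowed set is an error.
    rejected = sorted(set(value) - ALLOWED)
    if rejected:
        raise ValueError(f"unsupported characters in path component: {rejected}")
    return value


for text in ["Tikitapu/Blue Lake", "Hawke's Bay", "Gore A&P Showgrounds",
             "Ōmāpere", "Mount Brøgger"]:
    print(text, "->", slugify_component(text))
```

The loop runs the function over the inputs used as examples in the subsections above.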

Title and S3 Path Components

crs

EPSG Code for the coordinate reference system of the imagery. Generally this is 2193 as it is the primary projection for most of LINZ's imagery.

geographic_description

This is free text and at the imagery maintainer's discretion. A specific city or sub-region or event name may be used to help describe the imagery capture area. The Gazetteer is referenced to ensure official names with correct spelling are used. If the region has full coverage, then the geographic description can be empty and the region will be used instead.

geospatial_category

A general categorisation of imagery held within our archive that includes the following possible values:

  • Aerial Photos aerial-photos
  • Rural Aerial Photos rural-aerial-photos
  • Scanned Aerial Photos scanned-aerial-photos
  • Satellite Imagery satellite-imagery
  • Urban Aerial Photos urban-aerial-photos

gsd

The GSD or spatial resolution is the area covered on the ground by a single pixel. This is stored in metadata in metres with no trailing zeros after the decimal point.
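One way to produce a value with no trailing zeros is sketched below. The helper name `format_gsd` is hypothetical, and dropping the decimal point entirely for whole-metre values (for example "1" rather than "1.0") is an assumption, since the document does not show such a case.

```python
from decimal import Decimal
from typing import Union


def format_gsd(gsd_metres: Union[str, float, int]) -> str:
    """Format a GSD in metres without trailing zeros (hypothetical helper)."""
    # Going through str() avoids binary floating point noise such as 0.1 + 0.2.
    text = format(Decimal(str(gsd_metres)), "f")
    # Only strip zeros that come after a decimal point, never from "10".
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text


print(format_gsd("0.050") + "m")
```

The result is the bare number; the "m" suffix used in titles and paths is added separately, as in the `print` call.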

lifecycle

If lifecycle = preview, then "- Preview" is appended to the end of the imagery dataset title. If lifecycle = ongoing, then "- Draft" is appended to the end of the imagery dataset title. For any other lifecycle value, nothing is appended.

product

The imagery product type. Generally this is rgb, as it is the primary imagery type for LINZ. Black and white imagery scanned from the Crown Aerial Film Archive is still stored as RGB, given that WEBP Lossless compression is more effective on 3-band RGB than other compression options that support single-band TIFFs.

region

The region is taken from the following list:

  • Antarctica antarctica
  • Auckland auckland
  • Bay of Plenty bay-of-plenty
  • Canterbury canterbury
  • Gisborne gisborne
  • Global global
  • Hawke's Bay hawkes-bay
  • Manawatū-Whanganui manawatu-whanganui
  • Marlborough marlborough
  • Nelson nelson
  • New Zealand new-zealand
  • Northland northland
  • Otago otago
  • Pacific Islands pacific-islands
  • Southland southland
  • Taranaki taranaki
  • Tasman tasman
  • Waikato waikato
  • Wellington wellington
  • West Coast west-coast

start_year and end_year

In both cases, the full four digits should be used. If the imagery dataset was entirely captured within one year, then only a start_year is provided.

As imagery can be updated after it is "named" for initial processing, the end_year or lack of an end_year may be incorrect in the S3 Path. It is best to use this as a rough guideline and then use the STAC Collection for a more precise capture timeframe.
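As an illustration of reading the capture timeframe from a collection, the sketch below assumes the collection.json follows the standard STAC Collection layout, where `extent.temporal.interval` holds `[start, end]` pairs of RFC 3339 timestamps (either end may be null for an open interval). The function name `capture_interval` and the sample values are hypothetical and do not come from a real dataset.

```python
from datetime import datetime
from typing import Optional, Tuple


def capture_interval(collection: dict) -> Tuple[Optional[datetime], Optional[datetime]]:
    """Return the overall (start, end) of a STAC Collection (hypothetical helper)."""
    # The first interval in a STAC Collection describes the overall extent.
    start_raw, end_raw = collection["extent"]["temporal"]["interval"][0]

    def parse(value: Optional[str]) -> Optional[datetime]:
        if value is None:
            return None
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))

    return parse(start_raw), parse(end_raw)


# Made-up sample shaped like a STAC Collection, for illustration only.
sample = {
    "extent": {
        "temporal": {
            "interval": [["2016-01-15T00:00:00Z", "2019-03-02T00:00:00Z"]]
        }
    }
}
start, end = capture_interval(sample)
print(start, end)
```

In practice the dictionary would come from parsing the dataset's collection.json, for example with `json.load`.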

survey_number

A survey number reference from the Crown Aerial Film Archive. This will only be included if the imagery dataset is georeferenced historical aerial imagery from the Archive. If it is available, it will replace the geospatial_category in the imagery dataset title, and optionally be included in the imagery dataset path.

Examples

5cm Aerial RGB imagery covering Hamilton within the Waikato region captured in 2023

Title: Hamilton 0.05m Urban Aerial Photos (2023)
Path: s3://nz-imagery/waikato/hamilton_2023_0.05m/rgb/2193/collection.json

30cm Aerial RGB imagery covering the Waikato region captured in 2016-2019

Title: Waikato 0.3m Rural Aerial Photos (2016-2019)
Path: s3://nz-imagery/waikato/waikato_2016-2019_0.3m/rgb/2193/collection.json

10cm Aerial RGB imagery covering Ōtorohanga in the Waikato region captured in 2021 (macron removed in imagery dataset path)

Title: Ōtorohanga 0.1m Urban Aerial Photos (2021)
Path: s3://nz-imagery/waikato/otorohanga_2021_0.1m/rgb/2193/collection.json

37.5cm Aerial RGB imagery covering Waikato (primarily) and the Bay of Plenty regions captured in 1981-1982 (scanned from the Crown Aerial Film Archive, recorded as SN5944)

Title: Waikato / Bay of Plenty 0.375m SN5944 (1981-1982)
Path: s3://nz-imagery/waikato/waikato_bay-of-plenty_sn5944_1981-1982_0.375m/rgb/2193/collection.json

15cm Aerial RGB imagery covering Nelson (primarily) captured in 2022

Title: Top of the South Flood 0.15m Aerial Photos (2022)
Path: s3://nz-imagery/nelson/top-of-the-south-flood_2022_0.15m/rgb/2193/collection.json

50cm Satellite RGB imagery covering a large part of the North Island captured in 2023

Title: Cyclone Gabrielle North Island 0.5m Satellite Imagery (2023)
Path: s3://nz-imagery/new-zealand/cyclone-gabrielle-north-island_2023_0.5m/rgb/2193/collection.json