From d3777f0138a7fa7ec1be6a468d473ddd8164c972 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Sat, 6 May 2023 16:21:54 +0200 Subject: [PATCH 01/24] docs(spec): add first draft of more detailed v3 spec --- spec/v3/spec.md | 465 ++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 372 insertions(+), 93 deletions(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index c0e76822..b7a3d28a 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -1,93 +1,372 @@ -# PMTiles version 3 - -## File structure - -A PMTiles archive is a single-file archive of square tiles with five main sections: - -1. A fixed-size, 127-byte **Header** starting with `PMTiles` and then the spec version - currently `3` - that contains offsets to the next sections. -2. A root **Directory**, described below. The Header and Root combined must be less than 16,384 bytes. -3. JSON metadata. -4. Optionally, a section of **Leaf Directories**, encoded the same way as the root. -5. The tile data. - -## Entries - -A Directory is a list of `Entries`, in ascending order by `TileId`: - - Entry = (TileId uint64, Offset uint64, Length uint32, RunLength uint32) - -* `TileId` starts at 0 and corresponds to a cumulative position on the series of square Hilbert curves starting at z=0. -* `Offset` is the position of the tile in the file relative to the start of the data section. -* `Length` is the size of the tile in bytes. -* `RunLength` is how many times this tile is repeated: the `TileId=5,RunLength=2` means that tile is present at IDs 5 and 6. -* If `RunLength=0`, the offset/length points to a Leaf Directory where `TileId` is the first entry. - -# Directory Serialization - -Entries are stored in memory as integers, but serialized to disk using these compression steps: - -1. A little-endian varint indicating the # of entries -2. Delta encoding of `TileId` -3. Zeroing of `Offset`: - * `0` if it is equal to the `Offset` + `Length` of the previous entry - * `Offset+1` otherwise -4. 
Varint encoding of all numbers -5. Columnar ordering: all `TileId`s, all `RunLength`s, all `Length`s, then all `Offset`s -6. Finally, general purpose compression as described by the `Header`'s `InternalCompression` field - -# Directory Hierarchy - -* The number of entries in the root directory and leaf directories is up to the implementation. -* However, the compressed size of the header plus root directory is required in v3 to be under **16,384 bytes**. This is to allow latency-optimized clients to prefetch the root directory and guarantee it is complete. A sophisticated writer might need several attempts to optimize this. -* Root size, leaf sizes and depth should be configurable by the user to optimize for different trade-offs: cost, bandwidth, latency. - -# Header Design - -*Certain fields belonging to metadata in v2 are promoted to fixed-size header fields. This allows a map container to be initialized to the desired extent or center without blocking on the JSON metadata, and allows proxies to return well-defined HTTP headers.* - -The `Header` is 127 bytes, with little-endian integer values: - -| offset | description | width | -| --- | --- | --- | -| 0 | magic number `PMTiles` | 7 | -| 7 | spec version, currently `3` | 1 | -| 8 | offset of root directory | 8 | -| 16 | length of root directory | 8 | -| 24 | offset of JSON metadata, possibly compressed by `InternalCompression` | 8 | -| 32 | length of JSON metadata | 8 | -| 40 | offset of leaf directories | 8 | -| 48 | length of leaf directories | 8 | -| 56 | offset of tile data | 8 | -| 64 | length of tile data | 8 | -| 72 | # of addressed tiles, 0 if unknown | 8 | -| 80 | # of tile entries, 0 if unknown | 8 | -| 88 | # of tile contents, 0 if unknown | 8 | -| 96 | boolean clustered flag, `0x1` if true | 1 | -| 97 | `InternalCompression` enum (0 = Unknown, 1 = None, 2 = Gzip, 3 = Brotli, 4 = Zstd) | 1 | -| 98 | `TileCompression` enum | 1 | -| 99 | tile type enum (0 = Unknown/Other, 1 = MVT (PBF Vector Tile), 2 = 
PNG, 3 = JPEG, 4 = WEBP | 1 | -| 100 | min zoom | 1 | -| 101 | max zoom | 1 | -| 102 | min longitude (signed 32-bit integer: longitude * 10,000,000) | 4 | -| 106 | min latitude | 4 | -| 110 | max longitude | 4 | -| 114 | max latitude | 4 | -| 118 | center zoom | 1 | -| 119 | center longitude | 4 | -| 123 | center latitude | 4 | - -### Notes - -* **# of addressed tiles**: the total number of tiles before run-length encoding, i.e. `Sum(RunLength)` over all entries. -* **# of tile entries**: the total number of entries across all directories where `RunLength > 0`. -* **# # of tile contents**: the number of referenced blobs in the tile section, or the unique # of offsets. If the archive is completely deduplicated, this is equal to the # of unique tile contents. If there is no deduplication, this is equal to the number of tile entries above. -* **boolean clustered flag**: if true, blobs in the data section are ordered by Hilbert `TileId`. When writing with deduplication, this means that offsets are either contiguous with the previous offset+length, or refer to a lesser offset. -* **compression enum**: Mandatory, tells the client how to decompress contents as well as provide correct `Content-Encoding` headers to browsers. -* **tile type**: A hint as to the tile contents. Clients and proxies may use this to: - * Automatically determine a visualization method - * provide a conventional MIME type `Content-Type` HTTP header - * Enforce a canonical extension e.g. `.mvt`, `png`, `jpeg`, `.webp` to prevent duplication in caches - -### Organization - -In most cases, the archive should be in the order `Header`, Root Directory, JSON Metadata, Leaf Directories, Tile Data. It is possible to relocate sections other than `Header` arbitrarily, but no current writers/readers take advantage of this. A future design may allow for reverse-ordered archives to enable single-pass writing. 
+# PMTiles Version 3 Specification + +## 1 Introduction + +PMTiles is a single-file archive of square tiles. + +## 2 Overview + +A archive consist of five main sections: + +1. A fixed-size 127-byte header (described in [chapter 3](#3-header)) +1. A root directory (described in [chapter 4](#4-directories)) +1. Optional JSON meta data (described in [chapter 5](#5-json-metadata)) +1. Optional leaf directories (described in [chapter 4](#4-directories)) +1. The actual tile data. + +These sections are normally in the same order as in the list above, but theoretically it is possible to relocate all sections other than the header arbitrarily. +The only restriction is that the root directory must be contained in the first 16,384 bytes (16 KB) of the archive so that latency-optimized clients can retrieve the root directory in advance and ensure that it is complete. + +## 3 Header + +The Header has a length of 127 bytes and is always at the start of the archive. It includes the most important meta data and everything needed to decode the rest of the PMTiles archive properly. 
+ +### 3.1 Overview +``` +Offset 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F + +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ +000000 | Magic Number | V | Root Directory Offset | + +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ +000010 | Root Directory Length | Meta Data Offset | + +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ +000020 | Meta Data Length | Leaf Directories Offset | + +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ +000030 | Leaf Directories Length | Tile Data Offset | + +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ +000040 | Tile Data Length | Num of Addressed Tiles | + +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ +000050 | Number of Tile Entries | Number of Tile Contents | + +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ +000060 | C | IC | TC | TT |MinZ|MaxZ| Min Position | Max + +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ +000070 Position |CenZ| Center Position | + +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ +``` + +### 3.2 Fields +#### Magic Number + +The magic number is a fixed 7-byte field whose value is always `PMTiles` in UTF-8 encoding (`0x50 0x4D 0x54 0x69 0x6C 0x65 0x73`) + +#### Version (V) + +The version is a fixed 1-byte field whose value is always 3 (`0x03`). + +#### Root Directory Offset + +The Root Directory Offset is a 8-byte field whose value gives the offset of the first byte of the root directory. This address offset is relative to the first byte of the archive. + +This field is encoded as an little-endian 64-bit unsigned integer. + +#### Root Directory Length + +The Root Directory Length is a 8-byte field specifying the number of bytes in the root directory. 
+ +This field is encoded as an little-endian 64-bit unsigned integer. + +#### Meta Data Offset + +The Meta Data Offset is a 8-byte field whose value gives the offset of the first byte of the meta data. This address offset is relative to the first byte of the archive. + +This field is encoded as an little-endian 64-bit unsigned integer. + +#### Meta Data Length + +The Meta Data Length is a 8-byte field specifying the number of bytes reserved for the meta data. A value `0` indicates that there is no meta data included in this PMTiles archive. + +This field is encoded as an little-endian 64-bit unsigned integer. + +#### Leaf Directories Offset + +The Leaf Directories Offset is a 8-byte field whose value gives the offset of the first byte of the leaf directories. This address offset is relative to the first byte of the archive. + +This field is encoded as an little-endian 64-bit unsigned integer. + +#### Leaf Directories Length + +The Leaf Directories Length is a 8-byte field specifying the number of bytes reserved for leaf directories. A value `0` indicates that there are no leaf directories included in this PMTiles archive. + +This field is encoded as an little-endian 64-bit unsigned integer. + +#### Tile Data Offset + +The Tile Data Offset is a 8-byte field whose value gives the offset of the first byte of the tile data. This address offset is relative to the first byte of the archive. + +This field is encoded as an little-endian 64-bit unsigned integer. + +#### Tile Data Length + +The Tile Data Length is a 8-byte field specifying the number of bytes reserved for the tile data. + +This field is encoded as an little-endian 64-bit unsigned integer. + +#### Number of Addressed Tiles + +The Number of Addressed Tiles is a 8-byte field specifying the total number of tiles, which are addressable in the PMTiles archive. + +A value of `0` indicates that the number is unknown. + +This field is encoded as an little-endian 64-bit unsigned integer. 
+ +#### Number of Tile Entries + +The Number of Tile Entries is a 8-byte field specifying the total number of tile-entries (_Run-Length_ is greater 0). + +A value of `0` indicates that the number is unknown. + +This field is encoded as an little-endian 64-bit unsigned integer. + +#### Number of Tile Contents + +The Number of Tile Contents is a 8-byte field specifying the total number of blobs in the tile data section. + +A value of `0` indicates that the number is unknown. + +This field is encoded as an little-endian 64-bit unsigned integer. + +#### Clustered (C) + +Clustered is a 1-byte field specifying if the data of the individual tiles in the data section are order by their Tile-ID (clustered) or not (not clustered). + +Clustered means, that offsets are either contiguous with the previous offset+length, or refer to a lesser offset, when writing with deduplication. + +The field can be one of the following values: + +| Value | Meaning | +| :----- | :------------ | +| `0x00` | Not clustered | +| `0x01` | Clustered | + +#### Internal Compression (IC) + +The Internal Compression is a 1-byte field specifying the compression of the root directory, meta data as well as all leaf directories. + +The encoding of this field is described in [chapter 3.3](#33-compression). + +#### Tile Compression (TC) + +The Tile Compression is a 1-byte field specifying the compression of all tiles. + +The encoding of this field is described in [chapter 3.3](#33-compression). + +#### Tile Type (TT) + +The Tile Type is a 1-byte field specifying the type of tiles. + +The field can be one of the following values: + +| Value | Meaning | +| :----- | :----------------- | +| `0x00` | Unknown / Other | +| `0x01` | Mapbox Vector Tile | +| `0x02` | PNG | +| `0x03` | JPEG | +| `0x04` | WebP | + +#### Min Zoom (MinZ) + +The Min Zoom is a 1-byte field specifying minimum zoom (LOD) of the tiles. + +This field is encoded as an 8-bit unsigned integer. 
+ +#### Max Zoom (MaxZ) + +The Max Zoom is a 1-byte field specifying maximum zoom (LOD) of the tiles. + +This field is encoded as an 8-bit unsigned integer. + +#### Min Position + +The Min Position is a 8-byte field including the minimum latitude and minimum longitude of the bounds. + +The encoding of this field is described in [chapter 3.4](#34-position). + +#### Max Position + +The Max Position is a 8-byte field including the maximum latitude and maximum longitude of the bounds. + +The encoding of this field is described in [chapter 3.4](#34-position). + +#### Center Zoom (CZ) + +The Center Zoom is a 1-byte field specifying center zoom (LOD) of the tiles. A reader may use this as the initial zoom, when displaying tiles from the PMTiles archive. + +This field is encoded as an 8-bit unsigned integer. + +#### Center Position + +The Center Position is a 8-byte field including the latitude and longitude of the center position. A reader may use this as the initial center position, when displaying tiles from the PMTiles archive. + +The encoding of this field is described in [chapter 3.4](#34-position). + +### 3.3 Compression + +Compression is a enum with the following values: + +| Value | Meaning | +| :----- | :------ | +| `0x00` | Unknown | +| `0x01` | None | +| `0x02` | GZip | +| `0x03` | Brotli | +| `0x04` | ZStd | + +### 3.4 Position + +A Position is encoded into 8 bytes. Bytes 0 through 3 (first 4 bytes) represent the latitude and byte 4 through 7 (last 4 bytes) represent the longitude. + +#### Encoding + +To encode a latitude or a longitude into 4 bytes use the following method: + +1. Multiply value by 10,000,000 +1. Convert result into little-endian 32-bit signed integer + +#### Decoding + +To decode a latitude or a longitude from 4 bytes use the following method: + +1. Read bytes as a little-endian 32-bit signed integer +1. Divide read value by 10,000,000 + +## 4 Directories + +A directory is simply a list of entries. 
Each entry describes either where a specific tile can be found in the _tile data section_, or where a leaf directory can be found in the _leaf directories section_. + +The number of entries in the root directory and in the leaf directories is left to the implementation and can vary drastically depending on what the writer has optimized for (cost, bandwidth, latency etc.). +However, the size of the header plus the compressed size of the root directory must not exceed 16384 bytes to allow latency-optimized clients to retrieve the root directory in its entirety. Therefore, the **maximum compressed size of the root directory is 16257 bytes** (16384 bytes - 127 bytes). A sophisticated writer might need several attempts to optimize this. + +### 4.1 Directory Entries + +Each directory entry consists of the following properties: +- Tile ID +- Offset +- Length +- Run-Length + +#### Tile-ID +Specifies the ID of the tile / the first tile in the leaf directory. + +The Tile-ID corresponds to a cumulative position on the series of [Hilbert curves](https://wikipedia.org/wiki/Hilbert_curve) starting at zoom level 0. + +|Z|X|Y|TileID| +|--:|--:|--:|--:| +|0|0|0|0| +|1|0|0|1| +|1|0|1|2| +|1|1|0|3| +|1|1|1|4| +|2|0|0|5| +|... +|12|3423|1763|19078479| + +#### Offset +Specifies the offset of the first byte of the tile or leaf directory. This address offset is relative to the first byte of the _tile data section_ for tile-entries, and relative to the first byte of the _leaf directories section_ for leaf-directory-entries. + +#### Length +Specifies the number of bytes reserved for this tile or leaf directory. This size always indicates the compressed size, if the tile or leaf directory is compressed. + +#### Run-Length +Specifies the number of tiles for which this entry is valid. A run length of `0` means that this entry is for a leaf directory and not for a tile. 
+ +#### Examples +|Tile-ID|Offset|Length|Run-Length|Description| +|--:|--:|--:|--:|:--| +|`5`|`1337`|`42`|`1`|Tile 5 is located at bytes 1337-1378 of the _tile data section_| +|`5`|`1337`|`42`|`3`|Tile 5, 6 and 7 are located at bytes 1337-1378 of the _tile data section_| +|`5`|`1337`|`42`|`0`|A leaf directory whose first tile has an ID of 5 is located at byte 1337-1378 of the _leaf directories section_| + +### 4.2 Encoding +A directory can only be encoded in its entirety. It is not possible to encode a single directory entry by itself. + +[Appendix A.1](#a1-encode-a-directory) includes a pseudo code implementation of encoding a directory. + +An encoded directory consists of five parts in the following order: +1. The number of entries contained in the directory +1. Tile-IDs of all entries +1. Run-Lengths of all entries +1. Lengths of all entries +1. Offsets of all entries + +#### Number of entries +The number of entries included in this directory. + +This field is encoded as an little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). + +#### Tile IDs +The Tile-IDs are delta-encoded, i.e. the number to be written is the difference to the last Tile-ID. + +For example the Tile-IDs `5`, `42` and `69` would be encoded as `5` (_5 - 0_) `37` (_42 - 5_) and `27` (_69 - 42_). + +Each delta-encoded Tile-ID is encoded as an little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). + +#### Run-Lengths + +The Run-Lengths are simply encoded as is, each as an little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). + +#### Lengths + +The lengths are simply encoded as is, each as an little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). + +#### Offsets +Offsets are encoded either as `Offset + 1` or `0`, if it is equal to the sum of offset and length of the previous entry (tile blobs are contiguous). 
+ +Each offset is encoded as an little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). + +#### Compression +After encoding, each directory is compressed according to the internal compression field of the header. Leaf directories are compressed separately and not as a whole section. + +### 4.3 Decoding + +_TODO_ + +## 5 JSON Metadata + +_TODO_ + +--- + +## A Pseudo Codes + +### A.1 Encode a directory + +#### Functions + +``` +write_var_int(x, y) = write 'y' as a little-endian variable-width integer to 'x' +compress(x) = compress 'x' according to internal compression +``` + +#### Pseudo Code + +```rs +entries = list of entries in this directory +buffer = the output byte-buffer + +last_id = 0 +for entry in entries { + write_var_int(buffer, entry.tile_id - last_id) + last_id = entry.tile_id +} + +for entry in entries { + write_var_int(buffer, entry.run_length) +} + +for entry in entries { + write_var_int(buffer, entry.length) +} + +next_byte = 0 +for (index, entry) in entries { + if entry.offset == next_byte { + write_var_int(buffer, 0) + } else { + write_var_int(buffer, entry.offset + 1) + } + + next_byte = entry.offset + entry.length +} + +compress(buffer) +``` \ No newline at end of file From b73d7bc1e6c6b69cffcae868294d395bbfeb2c16 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Sat, 6 May 2023 16:25:24 +0200 Subject: [PATCH 02/24] docs(spec): remove space in metadata --- spec/v3/spec.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index b7a3d28a..17ff1937 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -10,7 +10,7 @@ A archive consist of five main sections: 1. A fixed-size 127-byte header (described in [chapter 3](#3-header)) 1. A root directory (described in [chapter 4](#4-directories)) -1. Optional JSON meta data (described in [chapter 5](#5-json-metadata)) +1. Optional JSON metadata (described in [chapter 5](#5-json-metadata)) 1. 
Optional leaf directories (described in [chapter 4](#4-directories)) 1. The actual tile data. @@ -19,7 +19,7 @@ The only restriction is that the root directory must be contained in the first 1 ## 3 Header -The Header has a length of 127 bytes and is always at the start of the archive. It includes the most important meta data and everything needed to decode the rest of the PMTiles archive properly. +The Header has a length of 127 bytes and is always at the start of the archive. It includes the most important metadata and everything needed to decode the rest of the PMTiles archive properly. ### 3.1 Overview ``` @@ -27,9 +27,9 @@ Offset 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ 000000 | Magic Number | V | Root Directory Offset | +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ -000010 | Root Directory Length | Meta Data Offset | +000010 | Root Directory Length | Metadata Offset | +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ -000020 | Meta Data Length | Leaf Directories Offset | +000020 | Metadata Length | Leaf Directories Offset | +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ 000030 | Leaf Directories Length | Tile Data Offset | +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+ @@ -64,15 +64,15 @@ The Root Directory Length is a 8-byte field specifying the number of bytes in th This field is encoded as an little-endian 64-bit unsigned integer. -#### Meta Data Offset +#### Metadata Offset -The Meta Data Offset is a 8-byte field whose value gives the offset of the first byte of the meta data. This address offset is relative to the first byte of the archive. +The Metadata Offset is a 8-byte field whose value gives the offset of the first byte of the metadata. This address offset is relative to the first byte of the archive. 
This field is encoded as an little-endian 64-bit unsigned integer. -#### Meta Data Length +#### Metadata Length -The Meta Data Length is a 8-byte field specifying the number of bytes reserved for the meta data. A value `0` indicates that there is no meta data included in this PMTiles archive. +The Metadata Length is a 8-byte field specifying the number of bytes reserved for the metadata. A value `0` indicates that there is no metadata included in this PMTiles archive. This field is encoded as an little-endian 64-bit unsigned integer. @@ -139,7 +139,7 @@ The field can be one of the following values: #### Internal Compression (IC) -The Internal Compression is a 1-byte field specifying the compression of the root directory, meta data as well as all leaf directories. +The Internal Compression is a 1-byte field specifying the compression of the root directory, metadata as well as all leaf directories. The encoding of this field is described in [chapter 3.3](#33-compression). From 4baeb7b7f7068b7c9b0a1022a384177b18253009 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Sat, 6 May 2023 16:28:25 +0200 Subject: [PATCH 03/24] docs(spec): add clarify number of addressed tiles --- spec/v3/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 17ff1937..0cf273ba 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -102,7 +102,7 @@ This field is encoded as an little-endian 64-bit unsigned integer. #### Number of Addressed Tiles -The Number of Addressed Tiles is a 8-byte field specifying the total number of tiles, which are addressable in the PMTiles archive. +The Number of Addressed Tiles is a 8-byte field specifying the total number of tiles, which are addressable in the PMTiles archive (before Run-Length encoding). A value of `0` indicates that the number is unknown. 
From 709b6ace5f8fbf16422deb80cd637ccf75b43745 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Sat, 6 May 2023 16:32:12 +0200 Subject: [PATCH 04/24] docs(spec): clarify restrinctions for reordering sections --- spec/v3/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 0cf273ba..0ad3d8b1 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -15,7 +15,7 @@ A archive consist of five main sections: 1. The actual tile data. These sections are normally in the same order as in the list above, but theoretically it is possible to relocate all sections other than the header arbitrarily. -The only restriction is that the root directory must be contained in the first 16,384 bytes (16 KB) of the archive so that latency-optimized clients can retrieve the root directory in advance and ensure that it is complete. +The only two restrictions are that the header is at the start of the archive and the root directory must be contained in the first 16,384 bytes (16 KB) of the archive so that latency-optimized clients can retrieve the root directory in advance and ensure that it is complete. ## 3 Header From e0dbf43af2efda84ba984b1c5f2cbcbbbf4cfe6d Mon Sep 17 00:00:00 2001 From: DerZade Date: Thu, 25 May 2023 22:45:00 +0200 Subject: [PATCH 05/24] docs(spec): mark meta data section as required --- spec/v3/spec.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 0ad3d8b1..228300be 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -10,7 +10,7 @@ A archive consist of five main sections: 1. A fixed-size 127-byte header (described in [chapter 3](#3-header)) 1. A root directory (described in [chapter 4](#4-directories)) -1. Optional JSON metadata (described in [chapter 5](#5-json-metadata)) +1. JSON metadata (described in [chapter 5](#5-json-metadata)) 1. Optional leaf directories (described in [chapter 4](#4-directories)) 1. The actual tile data. 
@@ -72,7 +72,7 @@ This field is encoded as an little-endian 64-bit unsigned integer. #### Metadata Length -The Metadata Length is a 8-byte field specifying the number of bytes reserved for the metadata. A value `0` indicates that there is no metadata included in this PMTiles archive. +The Metadata Length is a 8-byte field specifying the number of bytes reserved for the metadata. This field is encoded as an little-endian 64-bit unsigned integer. From 7b2d5d5988c00fbe8aeda65e6fae9deba5024126 Mon Sep 17 00:00:00 2001 From: DerZade Date: Thu, 25 May 2023 23:05:15 +0200 Subject: [PATCH 06/24] docs(spec): json meta data section --- spec/v3/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 228300be..9b0f7a16 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -322,7 +322,7 @@ _TODO_ ## 5 JSON Metadata -_TODO_ +The meta data section may include additional meta data related to the tileset, which is not already covered in the header section. This section allows for any valid JSON, thereby enabling flexibility in its structure. As a result, readers should refrain from assuming a predetermined format. 
--- From 1cf98e64abafa5c0ecce473a0436358e5932fbff Mon Sep 17 00:00:00 2001 From: DerZade Date: Sun, 4 Jun 2023 12:05:23 +0200 Subject: [PATCH 07/24] docs(spec): add pseudo code to decode a directory --- spec/v3/spec.md | 51 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 9b0f7a16..8ba875fd 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -369,4 +369,55 @@ for (index, entry) in entries { } compress(buffer) +``` + +### A.2 Decode a directory + +#### Functions + +``` +read_var_int(x) = read little-endian variable-width integer from 'x' +decompress(x) = decompress 'x' according to internal compression +``` + +#### Pseudo Code + +```rs +input_buffer = the input byte-buffer + +buffer = decompress(buffer) + +num_entries = read_var_int(buffer) + +entries = empty list of entries + +last_id = 0 +for i in num_entries { + value = read_var_int(buffer) + last_id = last_id + value + + entries[i] = Entry { tile_id: last_id } +} + +for i in num_entries { + entries[i].run_length = read_var_int(buffer) +} + +for i in num_entries { + entries[i].length = read_var_int(buffer) +} + +for i in num_entries { + value = read_var_int(buffer) + + if value == 0 && i > 0 { + // offset = 0 -> entry is directly after previous entry + prev_entry = entries[i - 1]; + + entries[i].offset = prev_entry.offset + prev_entry.length; + } else { + entries[i].offset = value - 1; + } +} + ``` \ No newline at end of file From f3b59ac5a4aaa19096b540b011bdd8ce0644f867 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Tue, 13 Jun 2023 11:27:05 +0200 Subject: [PATCH 08/24] docs(spec): fix bug in encode dir pseudo code --- spec/v3/spec.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 8ba875fd..1df5a7b1 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -359,7 +359,7 @@ for entry in entries { next_byte = 0 for (index, entry) in entries { - if entry.offset == 
next_byte { + if index > 0 && entry.offset == next_byte { write_var_int(buffer, 0) } else { write_var_int(buffer, entry.offset + 1) @@ -420,4 +420,4 @@ for i in num_entries { } } -``` \ No newline at end of file +``` From 3698f951dcefcd03c1ad9375fed129454a226f27 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Wed, 12 Jul 2023 08:29:53 +0200 Subject: [PATCH 09/24] docs(spec): minor tweaks to spec --- spec/v3/spec.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 1df5a7b1..ecf51d4b 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -126,9 +126,8 @@ This field is encoded as an little-endian 64-bit unsigned integer. #### Clustered (C) -Clustered is a 1-byte field specifying if the data of the individual tiles in the data section are order by their Tile-ID (clustered) or not (not clustered). - -Clustered means, that offsets are either contiguous with the previous offset+length, or refer to a lesser offset, when writing with deduplication. +Clustered is a 1-byte field specifying if the data of the individual tiles in the data section are order by their Tile-ID (clustered) or not (not clustered). +Therfore Clustered means, that offsets are either contiguous with the previous offset+length, or refer to a lesser offset, when writing with deduplication. 
The field can be one of the following values: @@ -385,7 +384,7 @@ decompress(x) = decompress 'x' according to internal compression ```rs input_buffer = the input byte-buffer -buffer = decompress(buffer) +buffer = decompress(input_buffer) num_entries = read_var_int(buffer) From bb8198014d8a341fb64349c61112f5437dd9c7c0 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Wed, 12 Jul 2023 08:56:11 +0200 Subject: [PATCH 10/24] docs(spec): add decoding directory section --- spec/v3/spec.md | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index ecf51d4b..40009096 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -317,7 +317,15 @@ After encoding, each directory is compressed according to the internal compressi ### 4.3 Decoding -_TODO_ +Decoding a directory works similar to encoding but in reverse. [Appendix A.2](#a2-decode-a-directory) includes a pseudo code implementation of decoding a directory. The basic steps for are the following: +1. Decompress the data according to the internal compression +1. Read a [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints) indicating how many entries are included in the directory (let's call this `n`) +1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the the delta-encoded Tile IDs of all entries _¹_ +1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the the Run-Lenghts of all entries +1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the the Lenghts of all entries +1. 
Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the the Offsets of all entries _¹_ + +_¹ Please refer to [Section 4.2](#42-encoding) for details on how Tile ID and Offset are encoded_ ## 5 JSON Metadata From cdf746b5bceb1d5a846087060592f3e91450ff18 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Wed, 12 Jul 2023 09:52:00 +0200 Subject: [PATCH 11/24] docs(spec): add introduction --- spec/v3/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 40009096..af916ee7 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -2,7 +2,7 @@ ## 1 Introduction -PMTiles is a single-file archive of square tiles. +PMTiles is a single-file archive format for tiled data. It enables low-cost, zero-maintenance map applications for "serverless" environments, without having to rely on a custom tile backend or a thrid pary provider. This is achieved by packing all tiles of a tileset into an archive so that all tiles can be accessed easily and without much overhead via HTTP range requests. By combining all the tiles into one archive, hosting costs are kept low, as it is usually a lot cheaper to update one large file than to update thousands or even millions of small files. ## 2 Overview From 1ac5ad1d9523d82246e254f6f93bcbf43dfd9a2f Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Wed, 12 Jul 2023 09:54:09 +0200 Subject: [PATCH 12/24] docs(spec): fix typos --- spec/v3/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index af916ee7..440e087e 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -2,7 +2,7 @@ ## 1 Introduction -PMTiles is a single-file archive format for tiled data. It enables low-cost, zero-maintenance map applications for "serverless" environments, without having to rely on a custom tile backend or a thrid pary provider. 
This is achieved by packing all tiles of a tileset into an archive so that all tiles can be accessed easily and without much overhead via HTTP range requests. By combining all the tiles into one archive, hosting costs are kept low, as it is usually a lot cheaper to update one large file than to update thousands or even millions of small files. +PMTiles is a single-file archive format for tiled data. It enables low-cost, zero-maintenance map applications for "serverless" environments without having to rely on a custom tile backend or a third-party provider. This is achieved by packing all tiles of a tileset into an archive so that all tiles can be accessed easily and without much overhead via HTTP range requests. By combining all the tiles into one archive, hosting costs are kept low, as it is usually a lot cheaper to update one large file than to update thousands or even millions of small files. ## 2 Overview From 7d20f95d083da6737cec0a4a0ef11ad843f6cab2 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Wed, 12 Jul 2023 10:16:32 +0200 Subject: [PATCH 13/24] docs(spec): fix more typos --- spec/v3/spec.md | 124 ++++++++++++++++++++++++------------------------ 1 file changed, 62 insertions(+), 62 deletions(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 440e087e..ed54eb69 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -6,15 +6,15 @@ PMTiles is a single-file archive format for tiled data. It enables low-cost, zer ## 2 Overview -A archive consist of five main sections: +An archive consists of five main sections: 1. A fixed-size 127-byte header (described in [chapter 3](#3-header)) 1. A root directory (described in [chapter 4](#4-directories)) 1. JSON metadata (described in [chapter 5](#5-json-metadata)) 1. Optional leaf directories (described in [chapter 4](#4-directories)) -1. The actual tile data. +1. 
The actual tile data -These sections are normally in the same order as in the list above, but theoretically it is possible to relocate all sections other than the header arbitrarily. +These sections are normally in the same order as in the list above, but theoretically, it is possible to relocate all sections other than the header arbitrarily. The only two restrictions are that the header is at the start of the archive and the root directory must be contained in the first 16,384 bytes (16 KB) of the archive so that latency-optimized clients can retrieve the root directory in advance and ensure that it is complete. ## 3 Header @@ -54,80 +54,80 @@ The version is a fixed 1-byte field whose value is always 3 (`0x03`). #### Root Directory Offset -The Root Directory Offset is a 8-byte field whose value gives the offset of the first byte of the root directory. This address offset is relative to the first byte of the archive. +The Root Directory Offset is an 8-byte field whose value gives the offset of the first byte of the root directory. This address offset is relative to the first byte of the archive. -This field is encoded as an little-endian 64-bit unsigned integer. +This field is encoded as a little-endian 64-bit unsigned integer. #### Root Directory Length -The Root Directory Length is a 8-byte field specifying the number of bytes in the root directory. +The Root Directory Length is an 8-byte field specifying the number of bytes in the root directory. -This field is encoded as an little-endian 64-bit unsigned integer. +This field is encoded as a little-endian 64-bit unsigned integer. #### Metadata Offset -The Metadata Offset is a 8-byte field whose value gives the offset of the first byte of the metadata. This address offset is relative to the first byte of the archive. +The Metadata Offset is an 8-byte field whose value gives the offset of the first byte of the metadata. This address offset is relative to the first byte of the archive. 
-This field is encoded as an little-endian 64-bit unsigned integer. +This field is encoded as a little-endian 64-bit unsigned integer. #### Metadata Length -The Metadata Length is a 8-byte field specifying the number of bytes reserved for the metadata. +The Metadata Length is an 8-byte field specifying the number of bytes reserved for the metadata. -This field is encoded as an little-endian 64-bit unsigned integer. +This field is encoded as a little-endian 64-bit unsigned integer. #### Leaf Directories Offset -The Leaf Directories Offset is a 8-byte field whose value gives the offset of the first byte of the leaf directories. This address offset is relative to the first byte of the archive. +The Leaf Directories Offset is an 8-byte field whose value gives the offset of the first byte of the leaf directories. This address offset is relative to the first byte of the archive. -This field is encoded as an little-endian 64-bit unsigned integer. +This field is encoded as a little-endian 64-bit unsigned integer. #### Leaf Directories Length -The Leaf Directories Length is a 8-byte field specifying the number of bytes reserved for leaf directories. A value `0` indicates that there are no leaf directories included in this PMTiles archive. +The Leaf Directories Length is an 8-byte field specifying the number of bytes reserved for leaf directories. A value of `0` indicates that there are no leaf directories included in this PMTiles archive. -This field is encoded as an little-endian 64-bit unsigned integer. +This field is encoded as a little-endian 64-bit unsigned integer. #### Tile Data Offset -The Tile Data Offset is a 8-byte field whose value gives the offset of the first byte of the tile data. This address offset is relative to the first byte of the archive. +The Tile Data Offset is an 8-byte field whose value gives the offset of the first byte of the tile data. This address offset is relative to the first byte of the archive. 
-This field is encoded as an little-endian 64-bit unsigned integer. +This field is encoded as a little-endian 64-bit unsigned integer. #### Tile Data Length -The Tile Data Length is a 8-byte field specifying the number of bytes reserved for the tile data. +The Tile Data Length is an 8-byte field specifying the number of bytes reserved for the tile data. -This field is encoded as an little-endian 64-bit unsigned integer. +This field is encoded as a little-endian 64-bit unsigned integer. #### Number of Addressed Tiles -The Number of Addressed Tiles is a 8-byte field specifying the total number of tiles, which are addressable in the PMTiles archive (before Run-Length encoding). +The Number of Addressed Tiles is an 8-byte field specifying the total number of tiles that are addressable in the PMTiles archive (before Run-Length encoding). A value of `0` indicates that the number is unknown. -This field is encoded as an little-endian 64-bit unsigned integer. +This field is encoded as a little-endian 64-bit unsigned integer. #### Number of Tile Entries -The Number of Tile Entries is a 8-byte field specifying the total number of tile-entries (_Run-Length_ is greater 0). +The Number of Tile Entries is an 8-byte field specifying the total number of tile entries (_Run-Length_ is greater than 0). A value of `0` indicates that the number is unknown. -This field is encoded as an little-endian 64-bit unsigned integer. +This field is encoded as a little-endian 64-bit unsigned integer. #### Number of Tile Contents -The Number of Tile Contents is a 8-byte field specifying the total number of blobs in the tile data section. +The Number of Tile Contents is an 8-byte field specifying the total number of blobs in the tile data section. A value of `0` indicates that the number is unknown. -This field is encoded as an little-endian 64-bit unsigned integer. +This field is encoded as a little-endian 64-bit unsigned integer. 
#### Clustered (C) -Clustered is a 1-byte field specifying if the data of the individual tiles in the data section are order by their Tile-ID (clustered) or not (not clustered). -Therfore Clustered means, that offsets are either contiguous with the previous offset+length, or refer to a lesser offset, when writing with deduplication. +Clustered is a 1-byte field specifying if the data of the individual tiles in the data section is ordered by their Tile-ID (clustered) or not (not clustered). +Therfore, Clustered means that offsets are either contiguous with the previous offset+length, or refer to a lesser offset when writing with deduplication. The field can be one of the following values: @@ -138,7 +138,7 @@ The field can be one of the following values: #### Internal Compression (IC) -The Internal Compression is a 1-byte field specifying the compression of the root directory, metadata as well as all leaf directories. +The Internal Compression is a 1-byte field specifying the compression of the root directory, metadata, and all leaf directories. The encoding of this field is described in [chapter 3.3](#33-compression). @@ -164,43 +164,43 @@ The field can be one of the following values: #### Min Zoom (MinZ) -The Min Zoom is a 1-byte field specifying minimum zoom (LOD) of the tiles. +The Min Zoom is a 1-byte field specifying the minimum zoom (LOD) of the tiles. This field is encoded as an 8-bit unsigned integer. #### Max Zoom (MaxZ) -The Max Zoom is a 1-byte field specifying maximum zoom (LOD) of the tiles. +The Max Zoom is a 1-byte field specifying the maximum zoom (LOD) of the tiles. This field is encoded as an 8-bit unsigned integer. #### Min Position -The Min Position is a 8-byte field including the minimum latitude and minimum longitude of the bounds. +The Min Position is an 8-byte field that includes the minimum latitude and minimum longitude of the bounds. The encoding of this field is described in [chapter 3.4](#34-position). 
#### Max Position -The Max Position is a 8-byte field including the maximum latitude and maximum longitude of the bounds. +The Max Position is an 8-byte field including the maximum latitude and maximum longitude of the bounds. The encoding of this field is described in [chapter 3.4](#34-position). #### Center Zoom (CZ) -The Center Zoom is a 1-byte field specifying center zoom (LOD) of the tiles. A reader may use this as the initial zoom, when displaying tiles from the PMTiles archive. +The Center Zoom is a 1-byte field specifying the center zoom (LOD) of the tiles. A reader may use this as the initial zoom when displaying tiles from the PMTiles archive. This field is encoded as an 8-bit unsigned integer. #### Center Position -The Center Position is a 8-byte field including the latitude and longitude of the center position. A reader may use this as the initial center position, when displaying tiles from the PMTiles archive. +The Center Position is an 8-byte field that includes the latitude and longitude of the center position. A reader may use this as the initial center position when displaying tiles from the PMTiles archive. The encoding of this field is described in [chapter 3.4](#34-position). ### 3.3 Compression -Compression is a enum with the following values: +Compression is an enum with the following values: | Value | Meaning | | :----- | :------ | @@ -212,27 +212,27 @@ Compression is a enum with the following values: ### 3.4 Position -A Position is encoded into 8 bytes. Bytes 0 through 3 (first 4 bytes) represent the latitude and byte 4 through 7 (last 4 bytes) represent the longitude. +A Position is encoded into 8 bytes. Bytes 0 through 3 (the first 4 bytes) represent the latitude, and bytes 4 through 7 (the last 4 bytes) represent the longitude. #### Encoding -To encode a latitude or a longitude into 4 bytes use the following method: +To encode a latitude or a longitude into 4 bytes, use the following method: -1. Multiply value by 10,000,000 -1. 
Convert result into little-endian 32-bit signed integer +1. Multiply the value by 10,000,000. +1. Convert the result into a little-endian 32-bit signed integer. #### Decoding -To decode a latitude or a longitude from 4 bytes use the following method: +To decode a latitude or a longitude from 4 bytes, use the following method: -1. Read bytes as a little-endian 32-bit signed integer -1. Divide read value by 10,000,000 +1. Read bytes as a little-endian 32-bit signed integer. +1. Divide the read value by 10,000,000. ## 4 Directories -A directory is simply a list of entries. Each entry describes either where a specific tile can be found in the _tile data section_, or where a leaf directory can be found in the _leaf directories section_. +A directory is simply a list of entries. Each entry describes either where a specific tile can be found in the _tile data section_ or where a leaf directory can be found in the _leaf directories section_. -The number of entries in the root directory and in the leaf directories is left to the implementation and can vary drastically depending on what the writer has optimized for (cost, bandwidth, latency etc.). +The number of entries in the root directory and in the leaf directories is left to the implementation and can vary drastically depending on what the writer has optimized for (cost, bandwidth, latency, etc.). However, the size of the header plus the compressed size of the root directory must not exceed 16384 bytes to allow latency-optimized clients to retrieve the root directory in its entirety. Therefore, the **maximum compressed size of the root directory is 16257 bytes** (16384 bytes - 127 bytes). A sophisticated writer might need several attempts to optimize this. ### 4.1 Directory Entries @@ -260,7 +260,7 @@ The Tile-ID corresponds to a cumulative position on the series of [Hilbert curve |12|3423|1763|19078479| #### Offset -Specifies the offset of the first byte of the tile or leaf directory. 
This address offset is relative to the first byte of the _tile data section_ for tile-entries, and relative to the first byte of the _leaf directories section_ for leaf-directory-entries. +Specifies the offset of the first byte of the tile or leaf directory. This address offset is relative to the first byte of the _tile data section_ for tile-entries and relative to the first byte of the _leaf directories section_ for leaf-directory-entries. #### Length Specifies the number of bytes reserved for this tile or leaf directory. This size always indicates the compressed size, if the tile or leaf directory is compressed. @@ -290,46 +290,46 @@ An encoded directory consists of five parts in the following order: #### Number of entries The number of entries included in this directory. -This field is encoded as an little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). +This field is encoded as a little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). #### Tile IDs -The Tile-IDs are delta-encoded, i.e. the number to be written is the difference to the last Tile-ID. +The Tile-IDs are delta-encoded, i.e., the number to be written is the difference to the last Tile-ID. -For example the Tile-IDs `5`, `42` and `69` would be encoded as `5` (_5 - 0_) `37` (_42 - 5_) and `27` (_69 - 42_). +For example, the Tile-IDs `5`, `42`, and `69` would be encoded as `5` (_5 - 0_), `37` (_42 - 5_), and `27` (_69 - 42_). -Each delta-encoded Tile-ID is encoded as an little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). +Each delta-encoded Tile-ID is encoded as a little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). #### Run-Lengths -The Run-Lengths are simply encoded as is, each as an little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). 
+The Run-Lengths are simply encoded as is, each as a little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). #### Lengths -The lengths are simply encoded as is, each as an little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). +The lengths are simply encoded as is, each as a little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). #### Offsets -Offsets are encoded either as `Offset + 1` or `0`, if it is equal to the sum of offset and length of the previous entry (tile blobs are contiguous). +Offsets are encoded either as `Offset + 1` or `0`, if they are equal to the sum of offset and length of the previous entry (tile blobs are contiguous). -Each offset is encoded as an little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). +Each offset is encoded as a little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). #### Compression After encoding, each directory is compressed according to the internal compression field of the header. Leaf directories are compressed separately and not as a whole section. ### 4.3 Decoding -Decoding a directory works similar to encoding but in reverse. [Appendix A.2](#a2-decode-a-directory) includes a pseudo code implementation of decoding a directory. The basic steps for are the following: -1. Decompress the data according to the internal compression -1. Read a [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints) indicating how many entries are included in the directory (let's call this `n`) -1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the the delta-encoded Tile IDs of all entries _¹_ -1. 
Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the the Run-Lenghts of all entries -1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the the Lenghts of all entries -1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the the Offsets of all entries _¹_ +Decoding a directory works similarly to encoding, but in reverse. [Appendix A.2](#a2-decode-a-directory) includes a pseudo code implementation of decoding a directory. The basic steps are the following: +1. Decompress the data according to the internal compression. +1. Read a [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints) indicating how many entries are included in the directory (let's call this `n`). +1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the delta-encoded Tile IDs of all entries. _¹_ +1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the Run-Lenghts of all entries. +1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the Lenghts of all entries. +1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the Offsets of all entries. _¹_ -_¹ Please refer to [Section 4.2](#42-encoding) for details on how Tile ID and Offset are encoded_ +_¹ Please refer to [Section 4.2](#42-encoding) for details on how Tile ID and Offset are encoded._ ## 5 JSON Metadata -The meta data section may include additional meta data related to the tileset, which is not already covered in the header section. This section allows for any valid JSON, thereby enabling flexibility in its structure. 
As a result, readers should refrain from assuming a predetermined format. +The meta data section may include additional meta data related to the tileset that is not already covered in the header section. This section allows for any valid JSON, thereby enabling flexibility in its structure. As a result, readers should refrain from assuming a predetermined format. --- From d20cf7cd6e85635324c3e3a54abf6b79cfa3204a Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Wed, 12 Jul 2023 10:43:00 +0200 Subject: [PATCH 14/24] docs(spec): fix more typos --- spec/v3/spec.md | 50 ++++++++++++++++++++++++------------------------- 1 file changed, 25 insertions(+), 25 deletions(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index ed54eb69..c706b9bf 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -8,10 +8,10 @@ PMTiles is a single-file archive format for tiled data. It enables low-cost, zer An archive consists of five main sections: -1. A fixed-size 127-byte header (described in [chapter 3](#3-header)) -1. A root directory (described in [chapter 4](#4-directories)) -1. JSON metadata (described in [chapter 5](#5-json-metadata)) -1. Optional leaf directories (described in [chapter 4](#4-directories)) +1. A fixed-size 127-byte header (described in [Chapter 3](#3-header)) +1. A root directory (described in [Chapter 4](#4-directories)) +1. JSON metadata (described in [Chapter 5](#5-json-metadata)) +1. Optional leaf directories (described in [Chapter 4](#4-directories)) 1. The actual tile data These sections are normally in the same order as in the list above, but theoretically, it is possible to relocate all sections other than the header arbitrarily. @@ -127,9 +127,9 @@ This field is encoded as a little-endian 64-bit unsigned integer. #### Clustered (C) Clustered is a 1-byte field specifying if the data of the individual tiles in the data section is ordered by their Tile-ID (clustered) or not (not clustered). 
-Therfore, Clustered means that offsets are either contiguous with the previous offset+length, or refer to a lesser offset when writing with deduplication. +Therefore, Clustered means that offsets are either contiguous with the previous offset+length, or refer to a lesser offset when writing with deduplication. -The field can be one of the following values: +The field can have one of the following values: | Value | Meaning | | :----- | :------------ | @@ -140,19 +140,19 @@ The field can be one of the following values: The Internal Compression is a 1-byte field specifying the compression of the root directory, metadata, and all leaf directories. -The encoding of this field is described in [chapter 3.3](#33-compression). +The encoding of this field is described in [Chapter 3.3](#33-compression). #### Tile Compression (TC) The Tile Compression is a 1-byte field specifying the compression of all tiles. -The encoding of this field is described in [chapter 3.3](#33-compression). +The encoding of this field is described in [Chapter 3.3](#33-compression). #### Tile Type (TT) The Tile Type is a 1-byte field specifying the type of tiles. -The field can be one of the following values: +The field can have one of the following values: | Value | Meaning | | :----- | :----------------- | @@ -178,13 +178,13 @@ This field is encoded as an 8-bit unsigned integer. The Min Position is an 8-byte field that includes the minimum latitude and minimum longitude of the bounds. -The encoding of this field is described in [chapter 3.4](#34-position). +The encoding of this field is described in [Chapter 3.4](#34-position). #### Max Position The Max Position is an 8-byte field including the maximum latitude and maximum longitude of the bounds. -The encoding of this field is described in [chapter 3.4](#34-position). +The encoding of this field is described in [Chapter 3.4](#34-position). #### Center Zoom (CZ) @@ -196,7 +196,7 @@ This field is encoded as an 8-bit unsigned integer. 
The Center Position is an 8-byte field that includes the latitude and longitude of the center position. A reader may use this as the initial center position when displaying tiles from the PMTiles archive. -The encoding of this field is described in [chapter 3.4](#34-position). +The encoding of this field is described in [Chapter 3.4](#34-position). ### 3.3 Compression @@ -244,7 +244,7 @@ Each directory entry consists of the following properties: - Run-Length #### Tile-ID -Specifies the ID of the tile / the first tile in the leaf directory. +Specifies the ID of the tile or the first tile in the leaf directory. The Tile-ID corresponds to a cumulative position on the series of [Hilbert curves](https://wikipedia.org/wiki/Hilbert_curve) starting at zoom level 0. @@ -271,14 +271,14 @@ Specifies the number of tiles for which this entry is valid. A run length of `0` #### Examples |Tile-ID|Offset|Length|Run-Length|Description| |--:|--:|--:|--:|:--| -|`5`|`1337`|`42`|`1`|Tile 5 is located at bytes 1337-1378 of the _tile data section_| -|`5`|`1337`|`42`|`3`|Tile 5, 6 and 7 are located at bytes 1337-1378 of the _tile data section_| -|`5`|`1337`|`42`|`0`|A leaf directory whose first tile has an ID of 5 is located at byte 1337-1378 of the _leaf directories section_| +|`5`|`1337`|`42`|`1`|Tile 5 is located at bytes 1337–1378 of the _tile data section_.| +|`5`|`1337`|`42`|`3`|Tiles 5, 6, and 7 are located at bytes 1337–1378 of the _tile data section_.| +|`5`|`1337`|`42`|`0`|A leaf directory whose first tile has an ID of 5 is located at byte 1337–1378 of the _leaf directories section_.| ### 4.2 Encoding A directory can only be encoded in its entirety. It is not possible to encode a single directory entry by itself. -[Appendix A.1](#a1-encode-a-directory) includes a pseudo code implementation of encoding a directory. +[Appendix A.1](#a1-encode-a-directory) includes a pseudocode implementation of encoding a directory. 
An encoded directory consists of five parts in the following order: 1. The number of entries contained in the directory @@ -317,13 +317,13 @@ After encoding, each directory is compressed according to the internal compressi ### 4.3 Decoding -Decoding a directory works similarly to encoding, but in reverse. [Appendix A.2](#a2-decode-a-directory) includes a pseudo code implementation of decoding a directory. The basic steps are the following: +Decoding a directory works similarly to encoding, but in reverse. [Appendix A.2](#a2-decode-a-directory) includes a pseudocode implementation of decoding a directory. The basic steps are the following: 1. Decompress the data according to the internal compression. 1. Read a [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints) indicating how many entries are included in the directory (let's call this `n`). -1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the delta-encoded Tile IDs of all entries. _¹_ -1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the Run-Lenghts of all entries. -1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the Lenghts of all entries. -1. Read `n` amount of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the Offsets of all entries. _¹_ +1. Read `n` number of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the delta-encoded Tile IDs of all entries. _¹_ +1. Read `n` number of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the Run-Lenghts of all entries. +1. Read `n` number of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the Lenghts of all entries. +1. 
Read `n` number of [variable-width integers](https://protobuf.dev/programming-guides/encoding/#varints), which are the Offsets of all entries. _¹_ _¹ Please refer to [Section 4.2](#42-encoding) for details on how Tile ID and Offset are encoded._ @@ -333,7 +333,7 @@ The meta data section may include additional meta data related to the tileset th --- -## A Pseudo Codes +## A Pseudocodes ### A.1 Encode a directory @@ -344,7 +344,7 @@ write_var_int(x, y) = write 'y' as a little-endian variable-width integer to 'x' compress(x) = compress 'x' according to internal compression ``` -#### Pseudo Code +#### Pseudocode ```rs entries = list of entries in this directory @@ -387,7 +387,7 @@ read_var_int(x) = read little-endian variable-width integer from 'x' decompress(x) = decompress 'x' according to internal compression ``` -#### Pseudo Code +#### Pseudocode ```rs input_buffer = the input byte-buffer From 08388babe07513a603af34c5f7389c4d6d17cc3a Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Mon, 17 Jul 2023 13:58:59 +0200 Subject: [PATCH 15/24] docs(spec): added AVIF to tile type --- spec/v3/spec.md | 1 + 1 file changed, 1 insertion(+) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index c706b9bf..a9b338d4 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -161,6 +161,7 @@ The field can have one of the following values: | `0x02` | PNG | | `0x03` | JPEG | | `0x04` | WebP | +| `0x05` | AVIF | #### Min Zoom (MinZ) From 44538a7a2ce6a559cae0e575c076b9ba58292349 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Mon, 17 Jul 2023 14:05:10 +0200 Subject: [PATCH 16/24] docs(spec): backport changes from 3.1 --- spec/v3/spec.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index a9b338d4..a702cac9 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -282,7 +282,7 @@ A directory can only be encoded in its entirety. 
It is not possible to encode a [Appendix A.1](#a1-encode-a-directory) includes a pseudocode implementation of encoding a directory. An encoded directory consists of five parts in the following order: -1. The number of entries contained in the directory +1. The number of entries contained in the directory (must be greater than 0) 1. Tile-IDs of all entries 1. Run-Lengths of all entries 1. Lengths of all entries @@ -306,7 +306,7 @@ The Run-Lengths are simply encoded as is, each as a little-endian [variable-widt #### Lengths -The lengths are simply encoded as is, each as a little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). +The lengths are simply encoded as is, each as a little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). Each length must be greater than 0. #### Offsets Offsets are encoded either as `Offset + 1` or `0`, if they are equal to the sum of offset and length of the previous entry (tile blobs are contiguous). From 60cb7259e94aedff54d28dc786b6db948454089c Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Mon, 17 Jul 2023 15:11:27 +0200 Subject: [PATCH 17/24] docs(spec): update metadata section --- spec/v3/spec.md | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index a702cac9..9adcb7a5 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -330,7 +330,21 @@ _¹ Please refer to [Section 4.2](#42-encoding) for details on how Tile ID and O ## 5 JSON Metadata -The meta data section may include additional meta data related to the tileset that is not already covered in the header section. This section allows for any valid JSON, thereby enabling flexibility in its structure. As a result, readers should refrain from assuming a predetermined format. 
+The meta data section must contain a valid JSON object encoded in UTF-8, which may include additional meta data related to the tileset that is not already covered in the header section.
+
+If the [Tile Type](#tile-type-tt) in the header has a value of _Mapbox Vector Tile_, the object should contain a key of `vector_layers` as described in the [TileJSON 3.0 specification](https://github.com/mapbox/tilejson-spec/blob/22f5f91e643e8980ef2656674bef84c2869fbe76/3.0.0/README.md#33-vector_layers).
+
+Additionally, this specification defines the following keys, which may be included in the object:
+
+|Key|Description|Type|
+|--:|--|--|
+|`name`|A name describing the tileset|string|
+|`description`|A text description of the tileset|string|
+|`attribution`|An attribution to be displayed when the map is shown to a user. Implementations may decide to treat this as HTML or literal text. |string|
+|`type`|The type of the tileset |a string with a value of either `overlay` or `baselayer`|
+|`version`|The version number of the tileset|a string containing a valid version according to [Semantic Versioning 2.0.0](https://semver.org/spec/v2.0.0.html) |
+
+The JSON object may also include any other keys with an arbitrary value. This specification recommends nesting all application-specific data in an object under a semi-unique key to avoid overlap with other application-specific data or keys that may be defined in future versions of this specification. For example, instead of including the custom fields `author` and `companyId` directly in the top level of the metadata object, they should be nested in another object under a key with your project or organization name.
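As a rough illustration of the metadata rules in this hunk, the following Python sketch builds a metadata object and serializes it as UTF-8 JSON. The tileset values, the `acme_corp` namespace key and the `build_metadata` helper are invented for this example and are not defined by the specification.

```python
import json


def build_metadata(name: str, app_data: dict, namespace: str) -> bytes:
    """Return a UTF-8 encoded JSON object for the metadata section.

    `namespace` is a hypothetical project or organization key. The
    specification only recommends nesting custom data under some
    semi-unique key; it does not define a particular one.
    """
    metadata = {
        # Keys defined by the specification (all of them optional).
        "name": name,
        "description": "Example tileset",
        "attribution": "Example attribution",
        "type": "baselayer",  # either "overlay" or "baselayer"
        "version": "1.0.0",   # Semantic Versioning 2.0.0
        # Application-specific data is nested under one key instead of
        # being placed directly at the top level.
        namespace: app_data,
    }
    return json.dumps(metadata).encode("utf-8")


encoded = build_metadata(
    "example", {"author": "Jane Doe", "companyId": 42}, "acme_corp"
)
decoded = json.loads(encoded.decode("utf-8"))
```

Keeping `author` and `companyId` under a single namespaced key means a later version of the specification could define a top-level `author` key without clashing with existing archives.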
--- From 7d92b036987be47f3210fd9d7dd764641309af60 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Mon, 17 Jul 2023 15:26:56 +0200 Subject: [PATCH 18/24] docs(spec): add changelog --- spec/v3/spec.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 9adcb7a5..1b335170 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -1,5 +1,20 @@ # PMTiles Version 3 Specification +## Changelog + +
+Version 3.2 +Complete rewrite to clarify many ambiguous wordings. +
+ +
+Version 3.1 + +- added `metadata` details about `vector_layers`. +- Clarified that directory entry lengths must be nonzero, and directories must be non-empty. +- add AVIF to TileTypes. +
+ ## 1 Introduction PMTiles is a single-file archive format for tiled data. It enables low-cost, zero-maintenance map applications for "serverless" environments without having to rely on a custom tile backend or a third-party provider. This is achieved by packing all tiles of a tileset into an archive so that all tiles can be accessed easily and without much overhead via HTTP range requests. By combining all the tiles into one archive, hosting costs are kept low, as it is usually a lot cheaper to update one large file than to update thousands or even millions of small files. From 132319fc0600dc917d34ea75374517b3905b888a Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Tue, 18 Jul 2023 12:29:04 +0200 Subject: [PATCH 19/24] docs(spec): add RFC 2119 remartk --- spec/v3/spec.md | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 1b335170..8014184b 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -1,5 +1,7 @@ # PMTiles Version 3 Specification +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). + ## Changelog
@@ -30,7 +32,7 @@ An archive consists of five main sections: 1. The actual tile data These sections are normally in the same order as in the list above, but theoretically, it is possible to relocate all sections other than the header arbitrarily. -The only two restrictions are that the header is at the start of the archive and the root directory must be contained in the first 16,384 bytes (16 KB) of the archive so that latency-optimized clients can retrieve the root directory in advance and ensure that it is complete. +The only two restrictions are that the header is at the start of the archive and the root directory MUST be contained in the first 16,384 bytes (16 KB) of the archive so that latency-optimized clients can retrieve the root directory in advance and ensure that it is complete. ## 3 Header @@ -204,13 +206,13 @@ The encoding of this field is described in [Chapter 3.4](#34-position). #### Center Zoom (CZ) -The Center Zoom is a 1-byte field specifying the center zoom (LOD) of the tiles. A reader may use this as the initial zoom when displaying tiles from the PMTiles archive. +The Center Zoom is a 1-byte field specifying the center zoom (LOD) of the tiles. A reader MAY use this as the initial zoom when displaying tiles from the PMTiles archive. This field is encoded as an 8-bit unsigned integer. #### Center Position -The Center Position is an 8-byte field that includes the latitude and longitude of the center position. A reader may use this as the initial center position when displaying tiles from the PMTiles archive. +The Center Position is an 8-byte field that includes the latitude and longitude of the center position. A reader MAY use this as the initial center position when displaying tiles from the PMTiles archive. The encoding of this field is described in [Chapter 3.4](#34-position). @@ -249,7 +251,7 @@ To decode a latitude or a longitude from 4 bytes, use the following method: A directory is simply a list of entries. 
Each entry describes either where a specific tile can be found in the _tile data section_ or where a leaf directory can be found in the _leaf directories section_. The number of entries in the root directory and in the leaf directories is left to the implementation and can vary drastically depending on what the writer has optimized for (cost, bandwidth, latency, etc.). -However, the size of the header plus the compressed size of the root directory must not exceed 16384 bytes to allow latency-optimized clients to retrieve the root directory in its entirety. Therefore, the **maximum compressed size of the root directory is 16257 bytes** (16384 bytes - 127 bytes). A sophisticated writer might need several attempts to optimize this. +However, the size of the header plus the compressed size of the root directory MUST NOT exceed 16384 bytes to allow latency-optimized clients to retrieve the root directory in its entirety. Therefore, the **maximum compressed size of the root directory is 16257 bytes** (16384 bytes - 127 bytes). A sophisticated writer might need several attempts to optimize this. ### 4.1 Directory Entries @@ -297,7 +299,7 @@ A directory can only be encoded in its entirety. It is not possible to encode a [Appendix A.1](#a1-encode-a-directory) includes a pseudocode implementation of encoding a directory. An encoded directory consists of five parts in the following order: -1. The number of entries contained in the directory (must be greater than 0) +1. The number of entries contained in the directory (MUST be greater than 0) 1. Tile-IDs of all entries 1. Run-Lengths of all entries 1. Lengths of all entries @@ -321,7 +323,7 @@ The Run-Lengths are simply encoded as is, each as a little-endian [variable-widt #### Lengths -The lengths are simply encoded as is, each as a little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). Each length must be greater than 0. 
+The lengths are simply encoded as is, each as a little-endian [variable-width integer](https://protobuf.dev/programming-guides/encoding/#varints). Each length MUST be greater than 0. #### Offsets Offsets are encoded either as `Offset + 1` or `0`, if they are equal to the sum of offset and length of the previous entry (tile blobs are contiguous). @@ -345,21 +347,21 @@ _¹ Please refer to [Section 4.2](#42-encoding) for details on how Tile ID and O ## 5 JSON Metadata -The meta data section must contain a valid JSON object encoded in UTF-8, which may include additional meta data related to the tileset that is not already covered in the header section. +The meta data section MUST contain a valid JSON object encoded in UTF-8, which MAY include additional meta data related to the tileset that is not already covered in the header section. -If the [Tile Type](#tile-type-tt) in the header has a value of _Mapbox Vector Tile_, the object should contain a key of `vector_layers` as described in the [TileJSON 3.0 specification](https://github.com/mapbox/tilejson-spec/blob/22f5f91e643e8980ef2656674bef84c2869fbe76/3.0.0/README.md#33-vector_layers). +If the [Tile Type](#tile-type-tt) in the header has a value of _Mapbox Vector Tile_, the object SHOULD contain a key of `vector_layers` as described in the [TileJSON 3.0 specification](https://github.com/mapbox/tilejson-spec/blob/22f5f91e643e8980ef2656674bef84c2869fbe76/3.0.0/README.md#33-vector_layers). -Additionally, this specification defines the following keys, which may be included in the object: +Additionally, this specification defines the following keys, which MAY be included in the object: |Key|Description|Type| |--:|--|--| |`name`|A name describing the tileset|string| |`description`|A text description of the tileset|string| -|`attribution`|An attribution to be displayed when the map is shown to a user. Implementations may decide to treat this as HTML or literal text. 
|string| +|`attribution`|An attribution to be displayed when the map is shown to a user. Implementations MAY decide to treat this as HTML or literal text. |string| |`type`|The type of the tileset |a string with a value of either `overlay` or `baselayer`| |`version`|The version number of the tileset|a string containing a valid version according to [Semantic Versioning 2.0.0](https://semver.org/spec/v2.0.0.html) | -The JSON object may also include any other keys with an arbitrary value. This specification recommends nesting all application-specific data in an object under a semi-unique key to avoid overlap with other application-specific data or keys that may be defined in future versions of this specification. For example, instead of including the custom fields `author` and `companyId` directly in the top level of the metadata object, they should be nested in another object under a key with your project or organization name. +The JSON object MAY also include any other keys with an arbitrary value. This specification recommends nesting all application-specific data in an object under a semi-unique key to avoid overlap with other application-specific data or keys that may be defined in future versions of this specification. For example, instead of including the custom fields `author` and `companyId` directly in the top level of the metadata object, they SHOULD be nested in another object under a key with your project or organization name. --- From cea413064720679743010ee5916f15c4ca2a05da Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Tue, 18 Jul 2023 12:29:40 +0200 Subject: [PATCH 20/24] docs(spec): rename introduction to abstract --- spec/v3/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 8014184b..e4086744 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -17,7 +17,7 @@ Complete rewrite to clarify many ambiguous wordings. - add AVIF to TileTypes.
-## 1 Introduction +## 1 Abstract PMTiles is a single-file archive format for tiled data. It enables low-cost, zero-maintenance map applications for "serverless" environments without having to rely on a custom tile backend or a third-party provider. This is achieved by packing all tiles of a tileset into an archive so that all tiles can be accessed easily and without much overhead via HTTP range requests. By combining all the tiles into one archive, hosting costs are kept low, as it is usually a lot cheaper to update one large file than to update thousands or even millions of small files. From 70ef05bf7811dce3c021923a196be9bc44a69c73 Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Tue, 18 Jul 2023 12:34:54 +0200 Subject: [PATCH 21/24] docs(spec): move change log to own file --- spec/v3/CHANGELOG.md | 9 +++++++++ spec/v3/spec.md | 14 ++------------ 2 files changed, 11 insertions(+), 12 deletions(-) create mode 100644 spec/v3/CHANGELOG.md diff --git a/spec/v3/CHANGELOG.md b/spec/v3/CHANGELOG.md new file mode 100644 index 00000000..e569165c --- /dev/null +++ b/spec/v3/CHANGELOG.md @@ -0,0 +1,9 @@ +# Changelog + +## Version 3.2 +Complete rewrite to clarify many ambiguous wordings. + +## Version 3.1 +- added `metadata` details about `vector_layers`. +- Clarified that directory entry lengths must be nonzero, and directories must be non-empty. +- add AVIF to TileTypes. \ No newline at end of file diff --git a/spec/v3/spec.md b/spec/v3/spec.md index e4086744..c1a59939 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -2,20 +2,10 @@ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). -## Changelog - -
-Version 3.2 -Complete rewrite to clarify many ambiguous wordings. -
+--- -
-Version 3.1 +Please refer to the [change log](./CHANGELOG.md) for a documentation of changes to this specification. -- added `metadata` details about `vector_layers`. -- Clarified that directory entry lengths must be nonzero, and directories must be non-empty. -- add AVIF to TileTypes. -
 ## 1 Abstract From c3a6c6ffe19310e983f886eb788d4a40ced5001d Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Tue, 18 Jul 2023 12:40:05 +0200 Subject: [PATCH 22/24] docs(spec): remove "reserved" --- spec/v3/spec.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index c1a59939..2bf7b23f 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -79,7 +79,7 @@ This field is encoded as a little-endian 64-bit unsigned integer. #### Metadata Length -The Metadata Length is an 8-byte field specifying the number of bytes reserved for the metadata. +The Metadata Length is an 8-byte field specifying the number of bytes of the metadata. This field is encoded as a little-endian 64-bit unsigned integer. @@ -91,7 +91,7 @@ This field is encoded as a little-endian 64-bit unsigned integer. #### Leaf Directories Length -The Leaf Directories Length is an 8-byte field specifying the number of bytes reserved for leaf directories. A value of `0` indicates that there are no leaf directories included in this PMTiles archive. +The Leaf Directories Length is an 8-byte field specifying the accumulated size (in bytes) of all leaf directories. A value of `0` indicates that there are no leaf directories included in this PMTiles archive. This field is encoded as a little-endian 64-bit unsigned integer. @@ -103,7 +103,7 @@ This field is encoded as a little-endian 64-bit unsigned integer. #### Tile Data Length -The Tile Data Length is an 8-byte field specifying the number of bytes reserved for the tile data. +The Tile Data Length is an 8-byte field specifying the accumulated size (in bytes) of all tiles in the tile data section. This field is encoded as a little-endian 64-bit unsigned integer. @@ -271,7 +271,7 @@ The Tile-ID corresponds to a cumulative position on the series of [Hilbert curve Specifies the offset of the first byte of the tile or leaf directory.
This address offset is relative to the first byte of the _tile data section_ for tile-entries and relative to the first byte of the _leaf directories section_ for leaf-directory-entries. #### Length -Specifies the number of bytes reserved for this tile or leaf directory. This size always indicates the compressed size, if the tile or leaf directory is compressed. +Specifies the number of bytes of this tile or leaf directory. This size always indicates the compressed size, if the tile or leaf directory is compressed. #### Run-Length Specifies the number of tiles for which this entry is valid. A run length of `0` means that this entry is for a leaf directory and not for a tile. From 82175a3f226193cca362ce9e106d525b167561fb Mon Sep 17 00:00:00 2001 From: Jonas Schade Date: Wed, 19 Jul 2023 15:51:01 +0200 Subject: [PATCH 23/24] docs(spec): add ascii diagram for overview --- spec/v3/spec.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/spec/v3/spec.md b/spec/v3/spec.md index 2bf7b23f..1c1c0d7f 100644 --- a/spec/v3/spec.md +++ b/spec/v3/spec.md @@ -24,6 +24,20 @@ An archive consists of five main sections: These sections are normally in the same order as in the list above, but theoretically, it is possible to relocate all sections other than the header arbitrarily. The only two restrictions are that the header is at the start of the archive and the root directory MUST be contained in the first 16,384 bytes (16 KB) of the archive so that latency-optimized clients can retrieve the root directory in advance and ensure that it is complete. 
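The size restriction on the root directory quoted in the context line above can be checked with a few lines of Python. This is only a sketch: `root_directory_fits` is a hypothetical helper, and its two arguments stand for the root directory offset and length fields read from the header.

```python
HEADER_LENGTH = 127          # fixed header size in bytes
INITIAL_FETCH_SIZE = 16_384  # the first 16 KB of the archive


def root_directory_fits(root_offset: int, root_length: int) -> bool:
    """Check that the root directory lies within the first 16,384 bytes.

    The header occupies the first 127 bytes, so the root directory has
    to start at byte 127 or later and end at byte 16,384 or earlier.
    """
    starts_after_header = root_offset >= HEADER_LENGTH
    ends_in_first_chunk = root_offset + root_length <= INITIAL_FETCH_SIZE
    return starts_after_header and ends_in_first_chunk


# A root directory placed directly after the header may use at most
# 16,384 - 127 = 16,257 bytes.
largest_ok = root_directory_fits(127, 16_257)
too_large = root_directory_fits(127, 16_258)
```

A writer could run such a check after compressing the root directory and move entries into leaf directories when it fails.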
+```
+  Header   Root Directory   Meta Data   Leaf Directories   Tile Data
+  Length       Length        Length          Length         Length
+ <------> <--------------> <---------> <----------------> <--------->
++--------+----------------+-----------+------------------+-----------+
+|        |                |           |                  |           |
+| Header | Root Directory | Meta Data | Leaf Directories | Tile Data |
+|        |                |           |                  |           |
++--------+----------------+-----------+------------------+-----------+
+         ^                ^           ^                  ^
+     Root Dir         Meta Data   Leaf Dirs          Tile Data
+      Offset           Offset      Offset             Offset
+```
+
 ## 3 Header
 
 The Header has a length of 127 bytes and is always at the start of the archive. It includes the most important metadata and everything needed to decode the rest of the PMTiles archive properly.

From 2ae0d6194f897e04d35a4c384370cc963106b974 Mon Sep 17 00:00:00 2001
From: Jonas Schade
Date: Wed, 19 Jul 2023 16:31:09 +0200
Subject: [PATCH 24/24] docs(spec): remove header length

---
 spec/v3/spec.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/spec/v3/spec.md b/spec/v3/spec.md
index 1c1c0d7f..d1c796b7 100644
--- a/spec/v3/spec.md
+++ b/spec/v3/spec.md
@@ -25,9 +25,9 @@ These sections are normally in the same order as in the list above, but theoreti
 The only two restrictions are that the header is at the start of the archive and the root directory MUST be contained in the first 16,384 bytes (16 KB) of the archive so that latency-optimized clients can retrieve the root directory in advance and ensure that it is complete.
 
 ```
-  Header   Root Directory   Meta Data   Leaf Directories   Tile Data
-  Length       Length        Length          Length         Length
- <------> <--------------> <---------> <----------------> <--------->
+           Root Directory   Meta Data   Leaf Directories   Tile Data
+               Length        Length          Length         Length
+          <--------------> <---------> <----------------> <--------->
 +--------+----------------+-----------+------------------+-----------+
 |        |                |           |                  |           |
 | Header | Root Directory | Meta Data | Leaf Directories | Tile Data |