Revisit CRS code assigning logic #296

olsen232 · 2020-11-03T01:50:16Z

In datasets V2, a CRS is stored at a path, eg:
crs/EPSG:2193.wkt
and in the working copy, a CRS is stored under an ID number, eg:
{"srs_id": 2193, "definition": ...}

This is fine for a standard CRS, which always has an authority. Things get trickier when the user provides a custom CRS, which may or may not have an authority. Whether or not the CRS has an authority, we still need to generate a probably-unique filename and ID.

I wanted to revisit how these numbers are generated. For every WKT definition, we need a filename to store the definition in V2 datasets, and a unique code-number to store the definition in the working copy.

Right now, we have the following rules for generating the filename:

Use $AUTHORITY_NAME:$AUTHORITY_CODE.wkt if both are present - eg EPSG:2193.wkt
Use either $AUTHORITY_NAME.wkt or $AUTHORITY_CODE.wkt if only one is present eg 2193.wkt
Use CUSTOM:$ID if no authority is set. Eg CUSTOM:13572468
(where ID is generated using the rule below)
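The three filename rules above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function name `crs_filename` and its parameters are hypothetical, and the ID is passed in rather than computed. The issue's example for the no-authority case shows no `.wkt` suffix, so the sketch does not add one.

```python
from typing import Optional


def crs_filename(
    authority_name: Optional[str],
    authority_code: Optional[str],
    generated_id: int,
) -> str:
    """Pick a filename for a CRS definition (hypothetical helper)."""
    if authority_name and authority_code:
        # Both parts present, eg EPSG:2193.wkt
        return f"{authority_name}:{authority_code}.wkt"
    if authority_name or authority_code:
        # Only one part present, eg 2193.wkt
        return f"{authority_name or authority_code}.wkt"
    # No authority at all: fall back to the generated ID, eg CUSTOM:13572468
    return f"CUSTOM:{generated_id}"
```

For example, `crs_filename("EPSG", "2193", 0)` gives `EPSG:2193.wkt`, while `crs_filename(None, None, 13572468)` gives `CUSTOM:13572468`.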

The rules for generating the unique ID number:

Use $AUTHORITY_CODE if it is present - eg 2193
Otherwise use a stable hash of the normalised WKT (where normalising just means normalising whitespace), mapped into the range 1,000,000 to 269,435,455, which is (0x0 to 0xFFFFFFF) plus a million
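A minimal sketch of the ID rules, under some assumptions: the issue does not say which hash function is used, so SHA-256 is a stand-in here, and the names `normalise_wkt` and `crs_id` are hypothetical. Non-integer authority codes are treated as absent, which matches the behaviour described further down for codes like UTM-12N-ME27.

```python
import hashlib
from typing import Optional

ID_OFFSET = 1_000_000  # generated IDs start at one million
ID_MASK = 0xFFFFFFF    # keep 28 bits: 0 to 268,435,455


def normalise_wkt(wkt: str) -> str:
    """Collapse every run of whitespace to a single space."""
    return " ".join(wkt.split())


def crs_id(wkt: str, authority_code: Optional[str] = None) -> int:
    """Return the working-copy ID for a CRS definition (hypothetical helper)."""
    if authority_code and authority_code.isascii() and authority_code.isdigit():
        # An integer authority code is used directly, eg 2193
        return int(authority_code)
    # Assumption: SHA-256 stands in for whatever stable hash is really used
    digest = hashlib.sha256(normalise_wkt(wkt).encode("utf-8")).digest()
    return (int.from_bytes(digest[:4], "big") & ID_MASK) + ID_OFFSET
```

Because the hash covers the whole normalised definition, two definitions that differ only in whitespace get the same ID, but any other edit to the WKT produces a different one.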

There are a few things that could be improved -

more recognisable filenames: if a CRS doesn't have an authority, but does have a name, that could be a good filename. For instance, one client uses Zone 12N (80 W TO 72 W) as the CRS name. This would be much more recognisable to someone familiar with that CRS than the filename we do generate (ie, CUSTOM:13572468)
more recognisable CRS IDs: the same client uses a null authority name, but an authority code of UTM-12N-ME27. Since this is not an integer, we ignore it and just generate our own. But we could use the integer parts of it - 101227 might be a more recognisable ID than the random one we generate (eg 13572468). However, the user probably doesn't care too much. If they did, they should probably have included an authority.
more stable IDs: filenames that contain the generated ID currently change if the definition changes. That is, the CRS definition moves from one file to another, and the change shows up as a delete and an add. And if the CRS definition is stored in the working copy with a generated ID, then that will move too if the definition changes. This feels different to CRSs which have a proper authority including an integer ID - those stay in the same place, and so changes to them show up as edits, as they should. If we were better at giving CRSs names or IDs that depended on their name or authority, instead of the entire definition, then editing the definition would be more likely to show up as an edit, and not a move / rename (as long as you didn't actually rename the definition by changing its name / authority).


olsen232 · 2020-11-03T02:07:47Z

I just learned that custom codes should officially be in the range 200000 - 209199 - this should be fixed regardless - and I guess it means the most recognisable code we can generate for UTM-12N-ME27 is probably 201227
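One possible way to get from UTM-12N-ME27 to 201227 is to add the digits of the authority code to the start of the custom range. The sketch below is purely illustrative: `custom_crs_id` is a hypothetical name, the hash fallback (SHA-256 modulo the range size) is an assumption, and nothing here is confirmed as what was actually implemented.

```python
import hashlib
import re
from typing import Optional

CUSTOM_MIN = 200_000
CUSTOM_MAX = 209_199


def custom_crs_id(wkt: str, authority_code: Optional[str] = None) -> int:
    """Map a custom CRS into the 200000 - 209199 range (hypothetical helper)."""
    span = CUSTOM_MAX - CUSTOM_MIN + 1  # 9,200 codes available
    if authority_code:
        # Join the integer parts, eg UTM-12N-ME27 -> "1227"
        digits = "".join(re.findall(r"\d+", authority_code))
        if digits and int(digits) < span:
            return CUSTOM_MIN + int(digits)
    # Assumed fallback: stable hash of the whitespace-normalised WKT
    normalised = " ".join(wkt.split())
    digest = hashlib.sha256(normalised.encode("utf-8")).digest()
    return CUSTOM_MIN + int.from_bytes(digest[:8], "big") % span
```

With only 9,200 codes in the range, hashed IDs would collide far more often than in the current 28-bit range, so a real implementation would probably need some collision handling.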

olsen232 self-assigned this Nov 3, 2020

olsen232 closed this as completed Nov 12, 2020
