Revisit CRS code assigning logic #296

Closed
olsen232 opened this issue Nov 3, 2020 · 1 comment
Collaborator

olsen232 commented Nov 3, 2020

In datasets V2, a CRS is stored at a path, eg:
crs/EPSG:2193.wkt
and in the working copy, a CRS is stored under an ID number, eg:
{"srs_id": 2193, "definition": ...}

This is fine for standard CRS which always have an authority. Things get trickier when the user provides a custom CRS which may or may not have an authority. Whether or not the CRS has an authority, we still need to generate a probably-unique filename and ID.

I wanted to revisit how these numbers are generated. For every WKT definition, we need a filename to store the definition in V2 datasets, and a unique code-number to store the definition in the working copy.

Right now, we have the following rules for generating the filename:

  • Use $AUTHORITY_NAME:$AUTHORITY_CODE.wkt if both are present - eg EPSG:2193.wkt
  • Use either $AUTHORITY_NAME.wkt or $AUTHORITY_CODE.wkt if only one is present eg 2193.wkt
  • Use CUSTOM:$ID if no authority is set. Eg CUSTOM:13572468
    (where ID is generated using the rule below)
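
A rough sketch of the filename rules above, in Python. The function and parameter names are hypothetical and this is not the project's actual code; it follows the rules exactly as listed, including the CUSTOM: form with no .wkt suffix.

```python
from typing import Optional


def crs_filename(
    authority_name: Optional[str],
    authority_code: Optional[str],
    generated_id: int,
) -> str:
    """Pick a filename for a CRS definition (hypothetical helper)."""
    if authority_name and authority_code:
        # Both present, eg EPSG:2193.wkt
        return f"{authority_name}:{authority_code}.wkt"
    if authority_name or authority_code:
        # Only one present, eg 2193.wkt
        return f"{authority_name or authority_code}.wkt"
    # No authority at all: fall back to the generated ID, eg CUSTOM:13572468
    return f"CUSTOM:{generated_id}"
```

For example, crs_filename("EPSG", "2193", 0) gives EPSG:2193.wkt, and crs_filename(None, None, 13572468) gives CUSTOM:13572468.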

The rules for generating the unique ID number:

  • Use $AUTHORITY_CODE if it is present - eg 2193
  • Otherwise use a stable hash of the normalised WKT (where normalising just means normalising whitespace), mapped into the range 1,000,000 to 269,435,455, ie (0x0 to 0xFFFFFFF) + a million
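
The ID rules could look roughly like this. This is a sketch only: the issue doesn't say which hash is used, so SHA-256 here is an assumption, and the function names are hypothetical.

```python
import hashlib
from typing import Optional

ID_OFFSET = 1_000_000
ID_MASK = 0xFFFFFFF  # 28 bits, so generated IDs top out at 269,435,455


def normalise_wkt(wkt: str) -> str:
    """Collapse every run of whitespace to a single space."""
    return " ".join(wkt.split())


def generate_crs_id(wkt: str, authority_code: Optional[str] = None) -> int:
    """Return the authority code if it is an integer, else a hashed ID."""
    if authority_code is not None:
        code = authority_code.strip()
        if code.isascii() and code.isdigit():
            return int(code)
    # Assumption: SHA-256 stands in for whatever stable hash is really used.
    digest = hashlib.sha256(normalise_wkt(wkt).encode("utf-8")).digest()
    hashed = int.from_bytes(digest[:4], "big") & ID_MASK
    return ID_OFFSET + hashed
```

Because only whitespace is normalised, any other change to the definition produces a different ID.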

There are a few things that could be improved:

  • more recognisable filenames: if a CRS doesn't have an authority, but does have a name, that could be a good filename. For instance, one client uses Zone 12N (80 W TO 72 W) as the CRS name. This would be much more recognisable to someone familiar with that CRS than the filename we do generate (ie, CUSTOM:13572468).

  • more recognisable CRS IDs: the same client uses a null authority name, but an authority code of UTM-12N-ME27. Since this is not an integer, we ignore it and just generate our own. But we could use the integer parts of it: 101227 might be a more recognisable ID than the random one we generate (eg 13572468). However, the user probably doesn't care too much; if they did, they should probably have included an authority.

  • stabler IDs: filenames that contain the generated ID currently change whenever the definition changes. That is, the CRS definition moves from one file to another, and the change shows up as a delete and an add. If the CRS definition is stored in the working copy with a generated ID, that moves too when the definition changes. This feels different to CRSs which have a proper authority including an integer ID: those stay in the same place, so changes to them show up as edits, as they should. If we were better at giving CRSs names or IDs that depended on their name or authority, instead of the entire definition, then editing the definition would be more likely to show up as an edit, and not a move / rename (as long as you didn't actually rename the CRS by changing its name / authority).

olsen232 self-assigned this Nov 3, 2020
Collaborator Author

olsen232 commented Nov 3, 2020

I just learned that custom codes should officially be in the range 200000 to 209199. This should be fixed regardless, and I guess it means the most recognisable code we can generate for UTM-12N-ME27 is probably 201227.
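
One way to get from UTM-12N-ME27 to 201227 is to take the digits of the authority code and offset them into the custom range. A minimal sketch, assuming that reading is right; the function name is hypothetical, and what to do when the digits don't fit the range is left to the caller.

```python
import re
from typing import Optional

CUSTOM_MIN = 200_000
CUSTOM_MAX = 209_199


def recognisable_custom_id(authority_code: str) -> Optional[int]:
    """Offset the digits of a non-integer code into the custom range.

    Returns None when there are no digits or the result would fall
    outside 200000 to 209199, so the caller can fall back to another
    scheme (eg a hash mapped into the same range).
    """
    digits = "".join(re.findall(r"[0-9]+", authority_code))
    if not digits:
        return None
    candidate = CUSTOM_MIN + int(digits)
    return candidate if candidate <= CUSTOM_MAX else None
```

With this, recognisable_custom_id("UTM-12N-ME27") gives 201227, while a code with too many digits to fit returns None.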
