Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Degree Sign on input #2791

Merged
merged 8 commits into from
Aug 15, 2021

Conversation

direvus
Copy link
Contributor

@direvus direvus commented Jul 30, 2021

This PR adds support in dsmtor() for a Degree Sign (U+00B0), encoded
as UTF-8 (\xc2\xb0), as an alternative symbol to D/d to designate
the degree unit.

No encodings other than UTF-8 are supported.

I had previously suggested in #2712 that I was willing to take a swing at this, for UTF-8 only, and nobody jumped in to dissuade me, so here goes.

I also correct a typo in a code comment and some fix some rogue whitespace on the way past.

I have added a test to test/unit/gie_self_test, but I wasn't able to run a clean red/green on the new test, due to an unrelated failure earlier in the same test block (line 411, currently failing on 'master'). My new test does pass if I comment out those failing tests. As an aside, it would probably be a good idea to break this block up into smaller units that can run independently of each other.

As for the documentation, I have tweaked one of the examples in the cs2cs docs to demonstrate use of the degrees sign, but I didn't see anywhere in the docs that we give a general description of allowed DMS formats. If I've missed it, or there is anywhere else in the docs that should be updated for this change, please let me know. It's my first time offering a PR to this project so please do be gentle.

Cheers,
Brendan

Fixes #2712.

Brendan Jurd added 2 commits July 30, 2021 14:19
This commit adds support in dsmtor() for a Degree Sign (U+00B0), encoded
as UTF-8 (`\xc2\xb0`), as an alternative symbol to `D`/`d` to designate
the degree unit.

Fixes OSGeo#2712.
src/dmstor.cpp Outdated Show resolved Hide resolved
src/dmstor.cpp Outdated Show resolved Hide resolved
src/dmstor.cpp Outdated Show resolved Hide resolved
direvus added 2 commits August 6, 2021 22:48
In response to PR feedback from @rouault on pull OSGeo#2791, there was an
unintentional fallthrough in the switch statement case for the degree
sign.

Replacing the 'switch' altogether removes the fallthrough risk, allows us
to check for both bytes in the degree sign in the same test, and avoids
the need to duplicate the code from the 'default' case in multiple
branches.
Also remove the cast from the raw unsigned numeric byte value to 'char',
instead just initialise the constexprs for the bytes as 'char'.
src/dmstor.cpp Outdated
double
dmstor(const char *is, char **rs) {
return dmstor_ctx( pj_get_default_ctx(), is, rs );
}

double
dmstor_ctx(PJ_CONTEXT *ctx, const char *is, char **rs) {
int n, nl;
int n, nl, adv;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be better to move the declaration of the variable with its initialization at line 68

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, thanks @rouault I've updated my branch today with this change.

@mwtoews
Copy link
Member

mwtoews commented Aug 11, 2021

Another consideration is to also accept the single-byte encoding for the degree symbol '\xb0', used in a wide range of encodings. Demo in Python:

deg = "\N{DEGREE SIGN}"
print(deg)                   # °
print(deg.encode("utf-8"))   # b'\xc2\xb0'
print(deg.encode("cp1252"))  # b'\xb0'

This single byte sequence has the same meaning for: cp1250, cp1251, cp1252, cp1253, cp1254, cp1255, cp1256, cp1257, cp1258, iso8859_1, iso8859_2, iso8859_3, iso8859_7, iso8859_8, iso8859_9, iso8859_10, iso8859_13, iso8859_15, iso8859_16, koi8_t, kz1048, latin_1, mbcs, ptcp154, and others. There are other byte sequences for the degree symbol using other encodings, but they are less common or obsolete.

@direvus
Copy link
Contributor Author

direvus commented Aug 11, 2021

Thanks @mwtoews I am comfortable to include it, given how many encodings we can cover with \xb0 it seems like a useful addition.

This byte is the degree symbol in many widely used single-byte
encodings, including ISO 8859 parts 1-4, 7-10, 13, 15, 16, and several
Windows code pages.

With this change, there are now two accepted encodings for the degree
symbol in dmstor:

- `\xc2\xb0` (UTF-8)
- `\xb0` (various single-byte encodings)

Per PR feedback on pull OSGeo#2791.
@rouault rouault added this to the 8.2.0 milestone Aug 15, 2021
@rouault rouault merged commit 17f83c6 into OSGeo:master Aug 15, 2021
sthagen added a commit to sthagen/OSGeo-PROJ that referenced this pull request Aug 15, 2021
Add support for Degree Sign on input (OSGeo#2791)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Be ready for U+00B0 DEGREE SIGN "°" on input
5 participants