Pick a hex literal standard and use it consistently #11212

Open
mikkelfj opened this issue May 9, 2015 · 21 comments
Assignees
Labels
design Design of APIs or of the language itself help wanted Indicates that a maintainer wants help on an issue or pull request parser Language parsing and surface syntax

Comments

@mikkelfj

mikkelfj commented May 9, 2015

The hexadecimal integer notation normally allows for both 0x and 0X (C, Python, IEEE 754-2008).
Julia only supports the lower case variant.

Octal and binary constants have the same issue. Julia is probably alone in writing octal as 0o777, and an uppercase 0O would be hard to read, but accepting only lowercase is still inconsistent with the common practice of case-insensitive numeric constants.

Hexadecimal floating point (HFP) constants are affected by the same issue. Julia follows the C99 specification with the exception of 0X upper case prefix.

However, the HFP standard is a bit inconsistent in not allowing the power suffix to be omitted when there is a (hexa)decimal point. Python makes this optional. It is not very elegant to have to type 0x1p0 instead of 0x1.0.

The question is whether to adhere strictly to the standard or to follow Python's more intuitive approach. Python's documented syntax is not exact: the runtime also allows upper case everywhere in HFP, and an absent integer part, as Julia and C99 do.
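As a quick check of the Python behaviour described here, the following is a minimal sketch using `float.fromhex` from the standard library. The sample strings are picked for illustration only; they suggest that the exponent and the `0x` prefix are optional and that case is ignored.

```python
# float.fromhex accepts: [sign] ['0x'] integer ['.' fraction] ['p' exponent]
# Each key is an input string, each value is the float we expect back.
samples = {
    "0x1.0": 1.0,     # exponent omitted
    "0x1p0": 1.0,     # C99-style spelling with an explicit exponent
    "0X1.8P1": 3.0,   # upper-case prefix and exponent marker
    "1.8p1": 3.0,     # '0x' prefix omitted
    "-0xA.8": -10.5,  # upper-case hex digit, negative sign
}

parsed = {text: float.fromhex(text) for text in samples}

for text, value in parsed.items():
    print(f"{text!r:>10} -> {value}")
```

All of these are run-time string conversions, not source-level literals.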

Additionally, a binary floating point notation has already been requested in #9371.
This would add to consistency, but is not supported by C99. The main motivation of HFP notation is probably to avoid loss of precision between machine and printed representation, which binary notation would not improve upon. Binary would, however, make certain binary polynomials more pleasant to deal with, most likely for assigning to other datatypes with larger precision than float.

Python
[sign] ['0x'] integer ['.' fraction] ['p' exponent]
https://docs.python.org/2/library/stdtypes.html

C99, section 6.4.4.2
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

http://www.exploringbinary.com/hexadecimal-floating-point-constants/

@pao pao changed the title Inconsistent numeric constants Pick a hex literal standard and use it consistently May 9, 2015
@pao pao added needs decision A decision on this change is needed parser Language parsing and surface syntax labels May 9, 2015
@simonbyrne
Contributor

Note that Python doesn't support hex float literals at the parser level, only from strings via float.fromhex. Our analogous usage would be parse(Float64,"0x1") which does all that Python does (well, it does on my OS X machine: I think we punt this to the system strtod).
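To illustrate the distinction between literal syntax and string conversion, here is a small sketch. The helper name `is_valid_python_expression` is made up for this example; it only asks the Python compiler whether a piece of source text parses as an expression.

```python
def is_valid_python_expression(source: str) -> bool:
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


hex_int_ok = is_valid_python_expression("0x10")       # ordinary hex integer literal
hex_float_ok = is_valid_python_expression("0x1.8p1")  # hex float written as source text
from_string = float.fromhex("0x1.8p1")                # the same text parsed at run time

print(hex_int_ok, hex_float_ok, from_string)
```

The hex integer compiles, the hex float does not, and the string form is still accepted by `float.fromhex`.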

@simonbyrne
Contributor

Backlink to mailing list discussion:
https://groups.google.com/d/topic/julia-dev/wUlhmwZNeMk/discussion

@vtjnash
Member

vtjnash commented May 9, 2015

or drop them altogether, in favor of reinterpret(Float32, 0xFF001122)? (#6349 (comment))

@simonbyrne
Contributor

or drop them altogether, in favor of reinterpret(Float32, 0xFF001122)

Fun fact I learnt the other day: technically the endianness of integers and floating point values doesn't have to be the same, which means this isn't guaranteed to work on all platforms. (the author, Stephen Canon, is responsible for a lot of Apple's libm & Accelerate functions).
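One way to sidestep the endianness concern is to serialize the integer and read back the float with the same explicit byte order. The sketch below does this with Python's `struct` module; the helper `bits_to_float32` is a hypothetical name, and it assumes the bit pattern is meant as IEEE 754 binary32.

```python
import struct


def bits_to_float32(bits: int, byteorder: str = "<") -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 binary32 value.

    The integer is packed and the float unpacked with the same explicit
    byte order, so the result does not depend on the host's native layout.
    """
    return struct.unpack(byteorder + "f", struct.pack(byteorder + "I", bits))[0]


one = bits_to_float32(0x3F800000)        # sign 0, biased exponent 127, fraction 0
neg_two = bits_to_float32(0xC0000000)    # sign 1, biased exponent 128, fraction 0
same_big_endian = bits_to_float32(0x3F800000, ">")

print(one, neg_two, same_big_endian)
```

A native reinterpret, by contrast, relies on the platform storing integers and floats with the same byte order.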

@ScottPJones
Contributor

That's easy to deal with compared to C on the PDP-11!
At least one C compiler (Plauger's?) had mixed-endian integers...
int was 16 bits, long was 32; the byte layout was 01 for int and 2301 for long.

@mikkelfj
Author

mikkelfj commented May 9, 2015

GCM crypto also mixed byte order because it looked nice on paper, never mind it was supposed to be fast.

@mikkelfj
Author

I discovered the following:

00.10 is a valid decimal floating point value in C. In other words, a future octal floating point value in C is not possible without a dedicated exponent part. This could explain why the hexadecimal floating point value requires the p | P notation in the IEEE-754 standard.

I still suggest the hexadecimal exponent be optional, in part because it is user friendly, in part because several languages have conversion functions that support it, and in part because hexadecimal floating point notation does not have the ambiguity seen with octal numbers in C. But I am not entirely convinced due to the portability issue.

For printing functions, the default behaviour should include the exponent for portability.

While octal floating points are purely of academic interest, I think they should not be supported specifically because they would be very confusing in the equivalent C notation.

Binary floating points, on the other hand, seem more relevant, but perhaps that decision should be delayed until a valid use case presents itself.

@simonbyrne
Contributor

I still suggest the hexadecimal exponent be optional [...]

@mikkelfj As I stated above, it already is optional if you use x = parse(Float64,"0x1.0") (assuming it is supported by the platform strtod function), just not as a literal (x = 0x1.0). I don't know of any language that allows hex float literals without exponents.

@mikkelfj
Author

I agree, that is my point: several languages have a conversion function that works that way, including Julia. But it is not supported as a literal constant in Julia 0.3, unless I am mistaken.

There is a parsing ambiguity with hex literals without exponents, but it is not significant:
If you lex 0x100.01 you get the hex integer 0x100 followed by the float .01. But since that token sequence would be a parse error anyway, it should be possible to lex it as a single hex float instead. In conversion functions there is no ambiguity.
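The lexing behaviour can be sketched with a toy longest-match tokenizer. Everything here is an assumption for illustration: the rule names, the regular expressions, and the `tokenize` helper are hypothetical and do not reflect Julia's actual parser.

```python
import re

HEX_INT = r"0x[0-9a-f]+"
DEC_FLOAT = r"[0-9]*\.[0-9]+"
# Hypothetical rule: a hex float whose 'p' exponent is optional.
HEX_FLOAT_OPT_EXP = r"0x[0-9a-f]+\.[0-9a-f]+(?:p[+-]?[0-9]+)?"


def tokenize(text, rules):
    """Split `text` into (kind, lexeme) pairs, taking the longest match at each position."""
    tokens, pos = [], 0
    while pos < len(text):
        best = None
        for kind, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m and (best is None or m.end() > best[1].end()):
                best = (kind, m)
        if best is None:
            raise ValueError(f"cannot lex {text[pos:]!r}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens


base_rules = [("hex_int", HEX_INT), ("dec_float", DEC_FLOAT)]

# Without a hex float rule the input splits into two adjacent number tokens.
without_rule = tokenize("0x100.01", base_rules)

# With the optional-exponent rule, longest match yields a single token.
with_rule = tokenize("0x100.01", base_rules + [("hex_float", HEX_FLOAT_OPT_EXP)])

print(without_rule)
print(with_rule)
```

Under these assumed rules the single-token reading wins whenever it is available, which is why the two-token reading only matters if it could ever be valid syntax.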

@simonbyrne
Contributor

Oh okay. I personally don't really see the need for it (adding a p0 to the end isn't that hard), but as it does appear to be unambiguous, I'm not opposed to it either.

@mikkelfj
Author

I would still like upper case support for 0b, 0o and 0x, since the lowercase-only rule gives me some special cases to explain in the lexer I am implementing, but it is not a huge issue.

As for the mandatory hex exponent, I just discovered that there is an ambiguity in C due to the float suffix f. Depending on how you interpret Julia juxtaposition, this ambiguity is either absent here or much worse. Thus, the exponent had better stay mandatory:

https://gcc.gnu.org/onlinedocs/gcc/Hex-Floats.html
quote
"Unlike for floating-point numbers in the decimal notation the exponent is always required in the hexadecimal notation. Otherwise the compiler would not be able to resolve the ambiguity of, e.g., 0x1.f. This could mean 1.0f or 1.9375 since ‘f’ is also the extension for floating-point constants of type float."
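The two readings in the quoted passage can be checked numerically. This is a small sketch using Python's `float.fromhex`, with the exponent written out explicitly so that each string has only one interpretation.

```python
# Reading 1: 'f' is a hex digit in the fraction, so the value is 1 + 15/16.
as_hex_fraction = float.fromhex("0x1.fp0")

# Reading 2: 'f' is a C float suffix applied to 0x1., so the value is 1.
as_float_suffix = float.fromhex("0x1p0")

print(as_hex_fraction, as_float_suffix)
```

With a mandatory `p` exponent, the digits before `p` can only be mantissa digits, and any `f` after the exponent can only be a suffix.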

@simonbyrne
Contributor

The main challenge here is hacking it into the parser. The other remaining question is how to handle Float32s.

@StefanKarpinski StefanKarpinski added design Design of APIs or of the language itself and removed needs decision A decision on this change is needed labels Sep 13, 2016
@vchuravy
Member

For Float32, C/C++ supports 0x1.0p+0f; the equivalent in Julia currently gives:

julia> typeof(0x1.0p+0f0)
Float64

@tkelman tkelman modified the milestones: 1.0, 0.6.0 Jan 5, 2017
@tkelman
Contributor

tkelman commented Jan 5, 2017

not changing for 0.6

@StefanKarpinski
Member

StefanKarpinski commented Jul 20, 2017

We should consistently remove support for uppercase variants of numeric literals, standards be damned. We should also remove hex float literals and they can be implemented in a package as string macros.

@StefanKarpinski StefanKarpinski changed the title Pick a hex literal standard and use it consistently remove uppercase E in float literals; remove hex float literals Jul 20, 2017
@StefanKarpinski StefanKarpinski changed the title remove uppercase E in float literals; remove hex float literals remove uppercase E in float literals, hex floats Jul 20, 2017
@Keno Keno added the help wanted Indicates that a maintainer wants help on an issue or pull request label Jul 20, 2017
@StefanKarpinski StefanKarpinski changed the title remove uppercase E in float literals, hex floats remove uppercase E in float literals, hex floats Jul 20, 2017
@StefanKarpinski StefanKarpinski changed the title remove uppercase E in float literals, hex floats Pick a hex literal standard and use it consistently Jul 20, 2017
@StefanKarpinski
Member

Nobody really seems to mind that we don't allow 0X for hex literals, so it's hard to see why we should do this just to be consistent with something. We could add support for hexadecimal floating point constants at any time since they're currently a syntax error.

@mikkelfj
Author

I'm also more in favor of strict lower case now. https://rustbyexample.com/primitives/literals.html

@Keno
Member

Keno commented Aug 17, 2017

There have been some pleas to keep hex float literals, and people seemed to be fine with that, though we should in that case probably disallow juxtaposition with hex literals in general (#23304), so people don't run into this by accident (since most people don't know about hexfloat syntax).

@StefanKarpinski
Member

0x1.0 is currently a syntax error so it could be allowed in the future.

@StefanKarpinski StefanKarpinski removed this from the 1.0 milestone Aug 31, 2017
@KristofferC
Member

Are these supposed to work:

julia> 0x1.0p1f
2.0

julia> 0x1.0p1f0
2.0

It seems a bit inconsistent:

julia> 0x1.0p1e
ERROR: syntax: invalid numeric constant "0x1.0p1e"

julia> 0x1.0p1f1
ERROR: syntax: invalid numeric constant "0x1.0p1f1"
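For comparison, a strict rule would reject all four of these inputs. The sketch below is not Julia's parser; the grammar is an assumption (lowercase prefix, mandatory `p` exponent, nothing allowed after the exponent) and `is_hex_float_literal` is a hypothetical helper.

```python
import re

# Assumed grammar: lowercase '0x', hex digits on at least one side of an
# optional point, a mandatory 'p' exponent in decimal, and nothing after it.
HEX_FLOAT_LITERAL = re.compile(
    r"0x(?:[0-9a-f]+\.?[0-9a-f]*|\.[0-9a-f]+)p[+-]?[0-9]+"
)


def is_hex_float_literal(text: str) -> bool:
    """Return True only if the whole string is one hex float literal."""
    return HEX_FLOAT_LITERAL.fullmatch(text) is not None


candidates = ["0x1.0p1", "0x1.0p1f", "0x1.0p1f0", "0x1.0p1e", "0x1.0p1f1"]
verdicts = {text: is_hex_float_literal(text) for text in candidates}

for text, ok in verdicts.items():
    print(f"{text:>10}: {'accepted' if ok else 'rejected'}")
```

Under this rule only the first candidate is accepted, so trailing `f`, `f0`, `e`, and `f1` are all treated the same way.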

@simonbyrne
Contributor

No, I think that is a bug.

10 participants