Arithmetic conversions can result in LLVM undefined values #993

Closed
Praetonus opened this issue Jul 1, 2016 · 23 comments

@Praetonus
Member

Currently, integer to floating point and floating point to integer conversions compile down to the LLVM uitofp, sitofp, fptoui and fptosi instructions. On overflow of the destination type, these instructions produce undefined values. I don't think this is something we want in Pony, and we need to address it while impacting the existing API as little as possible.

In my opinion, the best solution would be to make the relevant functions partial and to error if over/underflow would occur. We'd also have to split the _ArithmeticConvertible trait into more specific versions so that functions are made partial only when necessary (e.g. F32.u16() should be partial but U8.u16() should not). Numeric unions (Number, Signed, etc.) should still work correctly, since a non-partial function is a subtype of a partial one and we could still have a supertype for all of this.

While the LLVM issue is with conversions to/from floating point, I feel this is more general at the language level and should also apply to conversions between integers: first, for consistency with the above fix for floating points, and second, because the only reason 1178 in a U16 maps to 154 in a U8 is that our hardware truncates it as a binary number (so the behaviour here doesn't come from the language semantics).

As a side note, if we do make the integer functions partial, the old truncating behaviour will still be available through modular arithmetic (which LLVM optimises to truncation or bitwise operations when the modulus corresponds to a machine word size).

To sum up the general idea: for every arithmetic conversion function from type A to type B, if some values of A don't fit in B, make the A.b() function partial and check for overflow.
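
To make the shape of the proposal concrete, here is a hypothetical sketch of a caller, assuming F32.u16() becomes partial while U8.u16() stays total (neither signature exists yet; this is just the proposed shape):

actor Main
  new create(env: Env) =>
    let f: F32 = 70000.0
    // Partial under the proposal: 70000 doesn't fit in a U16, so this errors.
    let a: U16 = try f.u16() else 0 end
    // Total under the proposal: every U8 fits in a U16, so no try is needed.
    let b: U8 = 42
    let c: U16 = b.u16()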

@jemc
Member

jemc commented Jul 1, 2016

Definitely agreed that having the result of these operations be formally undefined is a big problem. However, given that the effect many of us (myself included) have been depending on is truncation, I'm currently thinking it might make more sense for the language semantics to formally specify that the conversion truncates when the source type is wider than the target type.

Under your plan, how would it look if I had a U16 and wanted to truncate to a U8? You say that truncation can be done with %, but the result of a % operation on a U16 would still be a U16, so you'd still have to call u8() on it, which would still be partial - what would I put in the else that guards the partial?

@Praetonus
Member Author

I don't really like the idea of the default behaviour being errors for floating point conversions but base-2 truncation for integer conversions. I feel conversion and truncation should be separate functions, maybe a fun truncate(digits: USize, base: USize) for integers.
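
One possible shape for that function (a purely hypothetical sketch, written as a standalone helper since this is just an idea at this point):

primitive Trunc
  // Keep the low `digits` digits of `n` in the given `base`, so base 2
  // recovers today's bit truncation and base 10 truncates decimal digits.
  // Note that `m` can itself overflow for large arguments; a real
  // implementation would have to define that case.
  fun apply(n: U64, digits: U64, base: U64): U64 =>
    var m: U64 = 1
    var i: U64 = 0
    while i < digits do
      m = m * base
      i = i + 1
    end
    n % m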

After trying various things with LLVM and optimisations, it seems that % isn't the best candidate here. Bit masking with and always compiles down to a single trunc instruction and can even optimise out the overflow check (and the error handling code). For example:

primitive USize
  fun u8(): U8 ? =>
    compile_intrinsic // Check for overflow and use LLVM trunc instruction

actor Main
  new create(env: Env) =>
    let x: USize = env.args.size() and 0xFF
    let y: U8 = try
      x.u8()
    else
      0 // Unreachable
    end

In a program like this, y would be assigned directly from a truncation of env.args.size(). The masking with 0xFF and the conditional in USize.u8() would be removed. You'd still have to write the else for the try, but implementing the truncate function above would remove the need to do this trick manually.

This integer conversion/truncation overhaul requires an RFC; I'll write one.

Do we agree that the fix for floating point conversions is partial functions with an overflow check?

@jemc
Member

jemc commented Jul 2, 2016

Right now, I don't really think any of the conversion functions should be partial - they should just all have defined semantics.

@jemc
Member

jemc commented Jul 2, 2016

It would be good to hear @sylvanc's perspective on this, though.

@Praetonus
Member Author

The big issue is that the only alternative to partial functions to fix the floating point issue would be to return an arbitrary value if the input can't be represented by the integer type. That includes trying to convert NaN, and I don't see any sane default value for this (and it's undefined behaviour in C).

@sylvanc
Contributor

sylvanc commented Jul 5, 2016

Integer truncation and extension should remain as they are. Pony integers are not abstract numbers: they are real machine words of a fixed bit size that perform modular arithmetic :) As a result, U8(256) is 0, and by the same token, U16(256).u8() should be zero. Truncation is simply a fast way to perform the required modular arithmetic.

Think of it in pure functional terms: generate a U8 with an imaginary successor function succ, calling it 256 times with an initial value of 0. The result is of course 0.
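
A small illustration of that modular view (values worked out by hand against the current truncating behaviour):

actor Main
  new create(env: Env) =>
    // Truncation to U8 is just arithmetic modulo 2^8.
    env.out.print(U16(300).u8().string()) // 300 mod 256 = 44
    env.out.print(U16(256).u8().string()) // 256 mod 256 = 0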

That leaves the core problem that @Praetonus found, which I do think is important: fptoui and fptosi. I'm not concerned about uitofp or sitofp, as they produce reasonable results even when they are inexact; since most floating point operations are inexact, I think that's fine. But fptoui and fptosi can produce undef, and that's not fine :)

Using a partial function for float to int conversion is problematic, I think. While in one sense there is "no value in the codomain" (i.e. a partial function) for some float to int conversions, in another sense most values in the codomain are approximations anyway. Should F32(1.3).u32() be partial? No, it should return 1, and we're all comfortable with that.

So I lean towards saturation being the right way to go, from a "fully defined values" point of view. However, I'm concerned about performance. Float to int conversion is a real performance problem in games, video/audio codecs, etc, and C programmers spend a non-trivial amount of time optimising it to do what Pony currently does (i.e. use SSE registers and pipeline conversions, not worrying about overflow).

Interestingly, for any value that isn't a compile time constant, Pony won't generate an LLVM undef, and uses cvttsd2si on x64, which returns 0x80000000 on overflow:

http://www.felixcloutier.com/x86/CVTTSD2SI.html

However, this can't be relied on for a consistent result, because the optimiser can do unexpected things in intermediate code.

In summary: saturation seems correct, but introducing runtime bounds checking could make it necessary to provide alternate functions that do not perform runtime bounds checking.
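
For reference, the saturating semantics could be written out in plain Pony along these lines (a hypothetical sketch, not a proposed stdlib API; the branches are exactly the runtime cost in question):

primitive Saturate
  fun f64_to_u32(x: F64): U32 =>
    if x != x then
      0 // NaN: pick some defined result instead of undef
    elseif x <= 0 then
      0 // saturate at the lower bound
    elseif x >= U32.max_value().f64() then
      U32.max_value() // saturate at the upper bound
    else
      x.u32() // in range, so the hardware conversion is well-defined
    end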

@sylvanc
Contributor

sylvanc commented Sep 28, 2016

Discussed in a sync call. undef is dangerous to leave around, but runtime bounds checking is too expensive, so we'll special-case compile-time constant fptoui and fptosi to saturate, and leave the non-compile-time-constant case as it is now.

@jemc
Member

jemc commented Sep 28, 2016

Discussed on the sync call: @sylvanc proposed a plan in which we treat compile-time constants as a special case and make sure the value is never undef in LLVM, but we can't make the overflow semantics for floating point to integer conversion well-defined in a cross-platform way and still be fast.

So after setting up the special-case logic, we can document that the resulting value is undefined (it may vary across platforms), but is guaranteed not to leak information from a security standpoint (as it has the potential to do now).

We can also consider an RFC that adds a separate set of float-to-integer conversion methods that are slower, but have defined result semantics for the entire domain of input.

@Praetonus
Member Author

Special-casing compile-time constants won't be possible, because the optimiser can propagate and synthesize constants.

Also, it turns out there are some inconsistencies with unsafe maths.

  • Integer arithmetic doesn't use nuw/nsw flags and division by 0 is checked. Safe.
  • Bitwise shifting doesn't check the shift amount; shifting by at least the bit width is undefined behaviour in LLVM. Unsafe.
  • clz/ctz use safe LLVM intrinsics. Safe.
  • All floating point operations use fast math flags by default. Unsafe.
  • As discussed in this issue, floating point conversions are unsafe.

We're currently trading potentially undefined results for performance in most cases. To me, that doesn't align with Pony's "correctness first" philosophy. I'm tempted to recategorise the issue as a bug and to do the following.

  • Add runtime checks on every UB-inducing math operation and remove fast math flags on floating points.
  • Add a set of unsafe functions for performance. That would cover
    • Integer operations with nsw/nuw, unchecked division
    • Unchecked bitwise shifting
    • ctz/clz with unsafe LLVM intrinsics
    • Floating point operations with fast math flags
    • Unchecked floating point conversions

That way, users would get correctness by default, but could still open the unsafe escape hatch for performance-critical operations.

@jemc @sylvanc Thoughts?

@jemc
Member

jemc commented Oct 13, 2016

I'm tempted to recategorise the issue as a bug

Yes, I agree - recategorizing as a bug and marking with needs discussion during sync, though if we manage to iron out the details before the sync, we can remove the latter label.


I'm also inclined to agree with your suggestion of fixing all of the default methods for correctness and adding "fast"/"unsafe" variants as needed for performance-critical arithmetic. I think this will be a good approach, provided that we have clear documentation for the semantics of each operation, including what happens on overflow. We need to make it very clear what the user is "giving up" when choosing the "fast"/"unsafe" variants over the default ones, so that they can determine whether it is acceptable (or even relevant) for their use case.

As part of our discussion here, I think it would be useful to continue to hammer out and clarify these details (the precise semantics of overflow in the safe variants, the limits of safety in the unsafe variants, etc).


As an aside, I'm not wild about using unsafe_ as a name / prefix for these methods, as I think it implies a lack of memory safety or type safety. Using fast_ comes to mind as a possibility, though there may be other good options.

@SeanTAllen
Member

@jemc I generally agree with you on the unsafe_ naming.

@Praetonus
Member Author

Here's what I'm thinking, based on LLVM semantics.

Safe variants

  • Integer arithmetic is modular with two's complement. Division by 0 yields 0. No change
  • Bitwise shifting
    • Left shift and logical right shift would yield 0 when shifting by at least the number's bit width (see the sketch after this list)
    • Arithmetic right shift would behave the same way, but filling with the sign bit
  • Count leading/trailing zero yields the expected result for every possible value. No change
  • Floating point operations would conform strictly to IEEE 754. This is the current behaviour when compiling with --ieee-math.
  • Floating point to integer conversions
    • Converting NaN would yield 0
    • On overflow/underflow, do saturation
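
A minimal sketch of what the safe shift variants could mean, written as library code for U32 (hypothetical; a real implementation would be a compiler intrinsic, not stdlib code):

primitive SafeShift
  fun shl(x: U32, y: U32): U32 =>
    // Shifting by the bit width or more yields 0 instead of an undefined
    // LLVM result.
    if y >= 32 then 0 else x << y end

  fun shr(x: U32, y: U32): U32 =>
    // Same rule for the logical right shift.
    if y >= 32 then 0 else x >> y end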

Fast/unsafe variants

  • Integer arithmetic operations with nuw/nsw flags. If the operation overflows (respectively unsigned and signed overflow), the result is undefined. Division by 0 would be unchecked and the result would be undefined.
  • Bitwise shifting
    • Left shift with nuw/nsw. If the operation overflows, the result is undefined
    • Logical and arithmetic right shift with exact semantics. If non-zero bits are shifted out, the result is undefined
    • In addition, for every shift type, shifting by at least the number's bit width yields an undefined result
  • Count leading/trailing zero. If the operand is 0, the result is undefined.
  • Floating point operations with fast math flags. The optimiser can assume infinity/NaN never occurs and can perform algebraically equivalent transformations that may not make sense for floating point numbers. Current behaviour
  • Floating point to integer conversions. On overflow, the result is undefined. Current behaviour.

@Praetonus
Member Author

Also, I think we should incorporate the fast/unsafe variants into the language trust boundary and only allow them in safe packages.

@jemc
Member

jemc commented Oct 13, 2016

I think we should incorporate the fast/unsafe variants into the language trust boundary and only allow them in safe packages.

I'm not sure how I feel about that - can you explain your reasoning more? Are they unsafe in any sense other than possibly giving back a wrong answer to a calculation? Is it because the undefined values could be used on some platforms in an attack to return uninitialized memory that could contain sensitive data?

Also, another question to ask is - if I have a package that I want to allow to do "fast math", is it always the case that I also want to allow it to do FFI? Is it possibly better to think of it as a distinct and separate trust boundary?

@jemc
Member

jemc commented Oct 13, 2016

Also, thank you for providing the thorough and explicit writeup on the safe vs unsafe semantics in your previous comment!

@Praetonus
Member Author

I'm not sure how I feel about that - can you explain your reasoning more? Are they unsafe in any sense other than possibly giving back a wrong answer to a calculation? Is it because the undefined values could be used on some platforms in an attack to return uninitialized memory that could contain sensitive data?

My main concern is hardware trap values and exceptions. For example, the unsafe division by 0 would raise a hardware exception on both x86 and ARM, which is very likely to halt the program.

Also, another question to ask is - if I have a package that I want to allow to do "fast math", is it always the case that I also want to allow it to do FFI? Is it possibly better to think of it as a distinct and separate trust boundary?

That's a good point. Maybe that would be better.

@jemc
Member

jemc commented Oct 13, 2016

My main concern is hardware trap values and exceptions. For example, the unsafe division by 0 would raise a hardware exception on both x86 and ARM, which is very likely to halt the program.

Fair enough - I think I'm on board with calling these operations "unsafe" now 🔥

@Praetonus
Member Author

Praetonus commented Oct 15, 2016

On a somewhat related matter, should rotl and rotr be defined for signed numbers? These functions are implemented with bitwise shifts, which are arithmetic shifts for signed numbers, so the result is wrong in many cases. Also, the functions don't guarantee associativity.

@DemiMarie

Note that the rotate semantics you are proposing require a runtime check on x86 that could be costly. I don't know the behavior on ARM.

A potentially better-performing approach would be to define shift and rotation as masking off the excess high-order bits from the shift amount. That doesn't need any branches on any processor.

@Praetonus
Member Author

Is that check done in the hardware instruction itself or in the potential boilerplate required around it? For reference, here is the implementation of rotl in Pony (rotr is symmetrical).

fun rotl(y: A): A => (this << y) or (this >> (bitwidth() - y))

This implementation isn't currently safe, so rotl will need safe and unsafe variants if we keep this base implementation.
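
For comparison, a branch-free safe rotl along the lines @DemiMarie suggests could look like this (a hypothetical sketch, fixed at 32 bits):

primitive Rot
  fun rotl32(x: U32, y: U32): U32 =>
    // Masking both shift amounts to 0..31 keeps them inside the bit
    // width, and unsigned subtraction wraps, so every y (including 0
    // and values >= 32) is well-defined with no runtime check.
    (x << (y and 31)) or (x >> ((32 - y) and 31))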

@Praetonus
Member Author

Praetonus commented Oct 16, 2016

Also, I've been thinking about defining real Pony semantics for the "undefined results" here.

First, we need to define the transformations permitted by the compiler on the program. I'm going to mostly draw from the C++ as-if rule here.

The Pony compiler is permitted to transform the program in any way, as long as the following remains true for every behaviour in the program:

  • At the end of a behaviour, the associated actor is in exactly the same state it would be in if the program were executed as written.
  • If an object has a finaliser, that finaliser will eventually be called exactly once.
  • Messages are sent as if the behaviour was executed as written, in the same order and with the same contents.
  • FFI calls to well-defined C functions are executed as written, in the same order and with the same arguments. The runtime functions pony_ctx, pony_alloc, pony_alloc_small, pony_alloc_large, pony_alloc_final and pony_realloc are exempt from this rule and from the following one.
  • Message sends and FFI calls are not reordered with respect to each other.
  • Object and reference capability security are maintained.

Undefined results (placeholder name):

If an expression with undefined results is evaluated (see the example after this list):

  • Every subsequent expression depending on undefined results also has undefined results.
  • If the constructor of an object with a finaliser has undefined results, that finaliser isn't guaranteed to be called.
  • If a message send or an FFI call has an undefined result, it is free from the as-if rule.
  • The implementation is allowed to trap or to generate hardware exceptions (n.b. I'm not sure if this is adequate in a "formal specification". Maybe we should only refer to the "Pony abstract machine", but I don't know what the wording could be in that case.)
  • Capability security is always maintained.
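
For instance, using a hypothetical add_unsafe method as a stand-in (the actual spelling of the unsafe operations isn't settled yet):

actor Main
  new create(env: Env) =>
    let a: U32 = U32.max_value().add_unsafe(1) // overflows: undefined results
    let b: U32 = a * 2 // depends on a, so b also has undefined results
    // This send has an undefined argument, so it is freed from the as-if
    // rule: anything (or nothing) may be printed. Capability security is
    // still maintained.
    env.out.print(b.string())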

Trust boundary:

  • If any behaviour in a Pony program makes an FFI call to a C function that violates capability security, then the capability security guarantees are void for every behaviour in the program.
  • If any behaviour in a Pony program makes an FFI call to a C function with undefined behaviour (as defined by the ISO C standard), then every behaviour in the program is free from the as-if rule, including the capability security guarantees.

If we keep that, it will need some refinement to work with distributed Pony.

Thoughts?

@Praetonus
Member Author

We discussed this during the sync call. The above semantics are fine; we'll probably end up with a similar version somewhere in the documentation.

We also talked about adding new operators for the unsafe operations. I'll open an RFC with a few possibilities for the operator syntax.

@Praetonus
Member Author

This was resolved in #1395.
