
[stdlib] Floating-point random-number improvements #33455

Closed · wants to merge 36 commits

Conversation

@NevinBR (Contributor) commented Aug 13, 2020

Overview

This patch resolves multiple issues with generating random floating-point numbers.

The existing random methods on BinaryFloatingPoint will crash for some valid ranges (SR-8798), cannot produce all values in some ranges (SR-12765), and do not follow the proposed and documented semantics. This patch solves these problems:

  • SR-8798: BinaryFloatingPoint.random(in:) crashes on some valid ranges
  • SR-12765: BinaryFloatingPoint.random(in:) cannot produce all values in range

Summary of changes

i) Finite ranges whose width (the distance between the bounds) exceeds greatestFiniteMagnitude previously caused a trap at run time. With this patch they are handled correctly and produce a uniformly distributed value in the range.
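
For example (a sketch reconstructing the failure mode, not code from the patch), sampling across nearly the full finite Double range previously trapped because the computed width of the range overflows to infinity:

// Width of the range is 2 * greatestFiniteMagnitude, which is not finite.
// Before this patch: runtime trap. After this patch: a uniformly distributed finite Double.
let x = Double.random(in: -Double.greatestFiniteMagnitude ..< Double.greatestFiniteMagnitude)
print(x)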

ii) Generating a random floating-point value in -1..<1 always left the low significand bit 0. If the magnitude was less than 1/2, the lowest 2 bits were 0; if less than 1/4, the lowest 3 bits; and so on. Other ranges spanning multiple binades had similar gaps. Now every representable value in the input range is produced with the correct probability.
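
As a quick illustrative check of this (a sketch, not part of the PR's test suite), one can sample in -1 ..< 1 and inspect the least significant significand bit, which the old implementation never sets:

var sawOddSignificand = false
for _ in 0 ..< 100_000 {
  let x = Double.random(in: -1 ..< 1)
  // The old implementation only produced multiples of 2^-52 here, so this
  // bit was always 0 and half the representable values could never appear.
  if x.significandBitPattern & 1 == 1 { sawOddSignificand = true }
}
print(sawOddSignificand)  // false before the patch; true (with overwhelming probability) after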

iii) The proposal that added random number generation to Swift (SE-0202: Random Unification) said little about floating-point semantics. During the review thread, however, the intended semantics came up in comments like this:

I think it makes sense to have the floating-point implementation behave as though a real number were chosen uniformly from the appropriate interval and then rounded.

Agreed. I would consider any implementation that does not have this behavior (within an ulp or two) to be incorrect.

To which the author of the proposal responded:

Yes, these are the semantics that I’m trying to achieve here (those being uniform in real numbers).

Concordantly, the documentation comments for the floating-point random methods state:

  /// The `random(in:using:)` static method chooses a random value from a
  /// continuous uniform distribution in `range`, and then converts that value
  /// to the nearest representable value in this type.

However, prior to this patch, the implementation did not match that behavior, and many representable values could not be produced in certain ranges.

Mathematical details

In order to achieve the desired semantics, it is necessary to define precisely what “converts that value to the nearest representable value” should mean. This patch takes the following axiomatic approach:

Range

Single-value ranges: random(in: x ..< x.nextUp) always produces x.

Adjacent intervals: If x < y < z, then random(in: x ..< z) is equivalent to generating random(in: x ..< y) with probability (y-x)/(z-x), and otherwise random(in: y ..< z) with the remaining probability (z-y)/(z-x).

In order to satisfy these two principles, random(in: x ..< y) must behave as if a real number r were generated uniformly in [x, y), then rounded down to the nearest representable value. Note that the rounding must be downward, as any other choice would violate one of the two principles.
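
A minimal sketch of the adjacent-intervals principle (the helper name is hypothetical, this is not the patch's implementation, and it assumes an underlying half-open sampler that already has the semantics above):

func splitSample(_ x: Double, _ y: Double, _ z: Double) -> Double {
  precondition(x < y && y < z)
  // Choose the lower subinterval [x, y) with probability (y - x)/(z - x),
  // otherwise the upper subinterval [y, z). This naive arithmetic can lose
  // precision or overflow for very wide ranges, which is part of what the
  // actual patch is careful to avoid.
  if Double.random(in: 0 ..< (z - x)) < (y - x) {
    return Double.random(in: x ..< y)
  } else {
    return Double.random(in: y ..< z)
  }
}

Under the principle, splitSample(x, y, z) must be indistinguishable in distribution from Double.random(in: x ..< z).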

ClosedRange

Subintervals: If x <= y < z, then repeatedly generating random(in: x ..< z) until the result lands in x ... y is equivalent to generating random(in: x ... y).

This rule ensures that the Range and ClosedRange versions of random(in:) produce consistent results. It also guarantees that partitioning a closed interval into disjoint closed subintervals behaves consistently.

In order to satisfy this principle, random(in: x ... y) must be equivalent to random(in: x ..< y.nextUp) if the latter is finite.

In the edge case where y == .greatestFiniteMagnitude, we apply the adjacent-intervals principle to [x, y) and [y, y + y.ulp). Although the latter endpoint is not representable as a finite floating-point value, the conceptual idea still holds, and the probability of producing y is proportional to y.ulp, just as it is for every other value in the same binade.
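
A minimal sketch of the non-edge-case reduction just described (hypothetical helper, assuming the half-open sampler already has the Range semantics above):

func closedSample(_ x: Double, _ y: Double) -> Double {
  // For y below greatestFiniteMagnitude, the closed range x ... y is simply
  // the half-open range extended upward by one ulp. The edge case where
  // y.nextUp is infinite needs the separate treatment described above.
  precondition(x <= y && y < .greatestFiniteMagnitude)
  return Double.random(in: x ..< y.nextUp)
}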

This patch implements those semantics.

Similarity with random integer methods

It is interesting to note that the strategy of generating a uniform real number in a half-open interval and then rounding down is equivalent to how the random methods already work for integer types. That is, T.random(in: x ..< y) behaves as if choosing a real number r uniformly in [x, y) and then rounding down to the next representable value, regardless of whether T is an integer or (with this patch) a floating-point type.

Similarly for closed ranges, T.random(in: x ... y) behaves as if the interval were extended to a half-open interval bounded above by the next representable value greater than y (or, if that value would overflow, by the real number it would be), a random value were generated in that extended range, and the result rounded down.
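
A concrete illustration of the shared model (plain arithmetic, not new API): Int.random(in: 0 ..< 3) behaves as if a real number were drawn uniformly from [0, 3) and rounded down, and with this patch the floating-point methods follow the same description:

let r = Double.random(in: 0 ..< 3)  // conceptually, a uniform real in [0, 3)
let i = Int(r.rounded(.down))       // 0, 1, or 2, each with probability 1/3,
                                    // matching Int.random(in: 0 ..< 3)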

@Azoy (Contributor) commented Aug 13, 2020

One trick people have been using is submitting the benchmark changes in a separate PR, because if they’re in this PR we can’t test them against the old implementation.

@airspeedswift (Member)

I wouldn't describe that as a trick so much as What You Should Do.

@airspeedswift (Member) left a comment

Thanks for the work here!

I think this PR needs breaking up in a few ways.

  • The benchmarks should be landed first so we can compare performance
  • Reorganization like moving functions around should be landed as an NFC PR rather than done together, particularly to ease reviewing preservation of ABI stability.
  • WIP commits should be squashed together, so separate commits should group logical changes rather than just when the work was done.
  • If there are new tests that fail on the current compiler, consider putting them in as XFAILs and then un-xfailing them when the fix lands.

(Review thread on test/stdlib/FloatingPointRandom.swift.gyb, resolved)
@airspeedswift (Member)

I've not reviewed the code itself (I'll leave that to @stephentyrone), but the quantity of new code to be inlined into the caller gives me a little bit of pause. Maybe it's not materially different, though; hopefully the benchmarks will give us an idea.

@NevinBR (Contributor, Author) commented Aug 13, 2020

@airspeedswift

I think this PR needs breaking up in a few ways.

  • The benchmarks should be landed first so we can compare performance

Okay, I created #33462 with just the benchmark change.

  • Reorganization like moving functions around should be landed as an NFC PR rather than done together, particularly to ease reviewing preservation of ABI stability.

Okay, I created #33463, which just moves the existing random methods from FloatingPoint.swift to FloatingPointRandom.swift.

  • WIP commits should be squashed together, so separate commits should group logical changes rather than just when the work was done.

I wish I knew how to do this.

  • If there are new tests that fail on the current compiler, consider putting them in as XFAILs and then un-xfailing them when the fix lands.

Yes, the last 3 tests (smallRange, fullRange, and lowBit) fail on the current implementation. But the tests and the new implementation are part of the same PR (this one) so they should land at the same time, right?

@airspeedswift (Member) commented Aug 13, 2020

WIP commits should be squashed together, so separate commits should group logical changes rather than just when the work was done.

I wish I knew how to do this.

https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History

@NevinBR marked this pull request as draft August 14, 2020 14:29
@karwa (Contributor) commented Aug 14, 2020

Have you tried using @_specialize as an alternative to inlining?

@NevinBR (Contributor, Author) commented Aug 15, 2020

Have you tried using @_specialize as an alternative to inlining?

My understanding (as Ben explained to me on the forums) is that @_alwaysEmitIntoClient is necessary for ABI stability here.

While @_specialize would probably suffice for the standard-library types, the code still needs to work (and be well-optimizable) for third-party types in an app that is compiled against the new implementation and then back-deployed to an OS with an older version of the standard library.
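
For illustration only (a hedged sketch; the function name and body here are invented and are not the patch's algorithm), @_alwaysEmitIntoClient causes the annotated body to be compiled into the client module, so an app built against the new SDK carries the new implementation even when it runs on an OS whose standard library predates the change:

extension BinaryFloatingPoint {
  @_alwaysEmitIntoClient
  public static func _unitSample<G: RandomNumberGenerator>(using g: inout G) -> Self {
    // Placeholder body: 53 random bits scaled into [0, 1).
    // Illustrative for Double-sized significands only.
    return Self(g.next() >> 11) * 0x1p-53
  }
}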

@karwa (Contributor) commented Aug 17, 2020

Oh. I'm not an ABI expert, but my understanding was that something like:

extension BinaryFloatingPoint {
  func random(...) -> Self {
    if #available(iOS 11, macOS ...) {
      return newRandom()
    } else {
      return oldRandom()
    }
  }

  @available(iOS 11, macOS ...)
  func newRandom() -> Self {
    // ...
  }
}

That should perform the check at runtime (as long as your deployment target allows OSes that don't pass the check; otherwise it just gets removed). @_alwaysEmitIntoClient is used if you never want to revert to the old behaviour (and hence need to ship the new implementation in your app).

As for @_specialize - it's just a suggestion due to Ben's reservations about the amount of inlined code. This isn't going to be constant-folded 😄, so the only gain from inlining is specialisation. It's worth considering how often we expect people to be generating lots of random values of their custom floating-point types, and how much gain they would even get from specialisation - is the performance perhaps dominated by the RNG or other operations?

@airspeedswift (Member)

As for @_specialize - it's just a suggestion due to Ben's reservations about the amount of inlined code...It's worth considering how often we expect people to be generating lots of random values of their custom floating-point types

In the standard library, we definitely need to serve the needs of custom numeric types (like those you might find in swift-numerics), so @_specialize won't work here (though once it can serve up pre-specialized versions as ABI, that'll definitely be worth it and might mitigate the binary size impact).

@NevinBR (Contributor, Author) commented Aug 20, 2020

Now that #33463 has been merged (creating the file FloatingPointRandom.swift with the existing random implementations) I’ve gone ahead and opened a new PR to supersede this one: #33560

@airspeedswift will be happy to know the new PR has only 2 commits: one to update the implementation, and one to add the tests.

The new benchmarks remain in #33462, which has not yet been merged at the time of this writing.

@shahmishal (Member)

Please update the base branch to main by Oct 5th; otherwise the pull request will be closed automatically.

  • How to change the base branch: (Link)
  • More detail about the branch update: (Link)

@NevinBR (Contributor, Author) commented Oct 1, 2020

Closed as superseded by #33560

@NevinBR closed this Oct 1, 2020