[stdlib] Floating-point random-number improvements #33560

NevinBR · 2020-08-20T02:09:07Z

Overview

This patch resolves multiple issues with generating random floating-point numbers.

The existing random methods on BinaryFloatingPoint will crash for some valid ranges (SR-8798), cannot produce all values in some ranges (SR-12765), and do not follow the proposed and documented semantics. This patch solves these problems:

SR-8798: BinaryFloatingPoint.random(in:) crashes on some valid ranges
SR-12765: BinaryFloatingPoint.random(in:) cannot produce all values in range

Summary of changes

i) Finite ranges whose bounds are farther apart than greatestFiniteMagnitude previously trapped at run time. With this patch they are handled correctly and produce a uniformly distributed value in the range.
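To illustrate why such ranges are awkward, note that both bounds can be finite while their distance is not representable. A minimal sketch using only standard `Double` properties (the variable names are illustrative, not taken from the patch):

```swift
// The widest finite range: both bounds are finite values.
let lower = -Double.greatestFiniteMagnitude
let upper = Double.greatestFiniteMagnitude

// The subtraction overflows to +infinity, so any approach that starts by
// computing `upper - lower` cannot proceed normally for this range.
let width = upper - lower
print(lower.isFinite, upper.isFinite, width.isInfinite)  // prints "true true true"
```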

ii) Generating a random floating-point value in -1..<1 always left the lowest bit 0. If the magnitude was less than 1/2, the lowest 2 bits were 0; if less than 1/4, the lowest 3 bits; and so forth. Other ranges spanning multiple binades had similar issues. Now every value in the input range is produced with the correct probability.
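The low-bit effect can be reproduced with a scale-and-shift sketch. This assumes the earlier approach had the general shape "53-bit unit value in [0, 1), multiplied by the range width, plus the lower bound"; it is not a copy of the stdlib source, and `scaleAndShift` is a hypothetical helper:

```swift
/// Hypothetical helper: maps a 53-bit integer `r` to a value in -1..<1
/// by scaling a unit value in [0, 1) and shifting it. Illustration only.
func scaleAndShift(_ r: UInt64) -> Double {
    precondition(r < (1 << 53), "r must fit in 53 bits")
    let unit = Double(r) * 0x1p-53   // a multiple of 2^-53 in [0, 1)
    return unit * 2.0 - 1.0          // a multiple of 2^-52 in [-1, 1)
}

// Every result is a multiple of 2^-52, but Double values with magnitude
// below 1 are spaced 2^-53 apart or finer, so the lowest significand bit
// of each result is 0.
let samples: [UInt64] = [0, 1, 12345, (1 << 52) + 1, (1 << 53) - 1]
let lowBits = samples.map { scaleAndShift($0).significandBitPattern & 1 }
print(lowBits)  // prints "[0, 0, 0, 0, 0]"
```

No choice of `r` can reach the values that lie between grid points, which is the behavior reported in SR-12765.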

iii) The proposal (SE-0202: Random Unification) which added random number generation to Swift was quite sparse in its mention of floating-point semantics. However, during the review thread, discussion about the intended semantics arose with comments like this:

I think it makes sense to have the floating-point implementation behave as though a real number were chosen uniformly from the appropriate interval and then rounded.

Agreed. I would consider any implementation that does not have this behavior (within an ulp or two) to be incorrect.

To which the author of the proposal responded:

Yes, these are the semantics that I’m trying achieve here (those being uniform in real numbers).

Concordantly, the documentation comments for the floating-point random methods state:

  /// The `random(in:using:)` static method chooses a random value from a
  /// continuous uniform distribution in `range`, and then converts that value
  /// to the nearest representable value in this type.

However, prior to this patch, the implementation did not match that behavior, and many representable values could not be produced in certain ranges.

Mathematical details

In order to achieve the desired semantics, it is necessary to define precisely what “converts that value to the nearest representable value” should mean. This patch takes the following axiomatic approach:

`Range`

Single-value ranges: random(in: x ..< x.nextUp) always produces x.

Adjacent intervals: If x < y < z, then random(in: x ..< z) is equivalent to generating random(in: x ..< y) with probability (y-x)/(z-x), and otherwise random(in: y ..< z) with the remaining probability (z-y)/(z-x).

In order to satisfy these two principles, random(in: x ..< y) must behave as if a real number r were generated uniformly in [x, y), then rounded down to the nearest representable value. Note that the rounding must be downward, as any other choice would violate one of the two principles.
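One way to read the round-down rule is as a probability for each representable value. The sketch below is an illustration of the stated semantics, not code from the patch; `roundDownProbability` is a hypothetical helper, and it assumes finite bounds with a finite difference and ignores signed zero:

```swift
/// Hypothetical helper: the probability that a real number drawn uniformly
/// from [lowerBound, upperBound) rounds down to `value`.
func roundDownProbability(of value: Double, in range: Range<Double>) -> Double {
    guard range.contains(value) else { return 0 }
    // Exactly the reals in [value, value.nextUp) round down to `value`.
    return (value.nextUp - value) / (range.upperBound - range.lowerBound)
}

// Single-value range: 1.0 is the only possible result.
let one: Double = 1
print(roundDownProbability(of: one, in: one ..< one.nextUp))  // prints "1.0"

// Across binades: values in [2, 4) are spaced twice as far apart as
// values in [1, 2), so each one is twice as likely.
let p1 = roundDownProbability(of: 1.0, in: 1.0 ..< 4.0)
let p2 = roundDownProbability(of: 2.0, in: 1.0 ..< 4.0)
print(p2 == 2 * p1)  // prints "true"
```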

`ClosedRange`

Subintervals: If x <= y < z, then repeatedly generating random(in: x ..< z) until the result lands in x ... y is equivalent to generating random(in: x ... y).

This rule ensures consistency of results produced by the Range and ClosedRange versions of random(in:). As a result, it also guarantees that partitioning a closed interval into disjoint closed subintervals is consistent as well.

In order to satisfy this principle, random(in: x ... y) must be equivalent to random(in: x ..< y.nextUp) if the latter is finite.
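The equivalence can be written down directly. This is a sketch of the rule rather than the patch's implementation; `randomClosed` is a hypothetical helper, and it assumes a finite lower bound and an upper bound below greatestFiniteMagnitude so that the half-open range stays finite:

```swift
/// Hypothetical helper illustrating the ClosedRange rule: sample the
/// half-open range that ends at the next representable value above the
/// upper bound.
func randomClosed<G: RandomNumberGenerator>(
    in range: ClosedRange<Double>, using generator: inout G
) -> Double {
    precondition(range.upperBound < .greatestFiniteMagnitude,
                 "the overflow edge case is not handled in this sketch")
    return Double.random(in: range.lowerBound ..< range.upperBound.nextUp,
                         using: &generator)
}

var generator = SystemRandomNumberGenerator()
let value = randomClosed(in: 0.0 ... 1.0, using: &generator)
print((0.0 ... 1.0).contains(value))  // prints "true"
```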

In the edge-case that y == .greatestFiniteMagnitude, we utilize the adjacent intervals principle on [x, y) and [y, y + y.ulp). Although the latter endpoint is not representable as a finite floating-point value, the conceptual idea still holds, and the probability of producing y is proportional to y.ulp just as it is for all other values in the same binade.

This patch implements those semantics.

Similarity with random integer methods

It is interesting to note that the strategy of generating a uniform real number in a half-open interval and then rounding down is equivalent to how the random methods work for integer types. That is, T.random(in: x ..< y) behaves as if choosing a real number r uniformly in [x, y) and then rounding down to the nearest representable value, regardless of whether T is an integer or (with this patch) a floating-point type.

Similarly for closed ranges, T.random(in: x ... y) behaves as if the range were extended to a half-open interval bounded above by the next representable value larger than y (or, if that value would overflow, by where it would lie as a real number), a random value were generated in the new range, and the result were rounded down.
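The integer analogy can be checked informally. In this sketch a `Double` stands in for the conceptual real number, which is an approximation made only for illustration:

```swift
// Draw a sample from [3, 7) and round it down. The result is always one
// of 3, 4, 5, 6, the same set of values Int.random(in: 3 ..< 7) returns.
let real = Double.random(in: 3 ..< 7)
let floored = Int(real.rounded(.down))
print((3 ..< 7).contains(floored))  // prints "true"

// A closed integer range covers the same values as the half-open range
// that ends one past its upper bound.
let closed = Int.random(in: 3 ... 6)
print((3 ..< 7).contains(closed))  // prints "true"
```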

shahmishal · 2020-10-01T06:17:57Z

Please update the base branch to main by Oct 5th otherwise the pull request will be closed automatically.

How to change the base branch: (Link)
More detail about the branch update: (Link)

NevinBR · 2021-01-15T19:07:34Z

@stephentyrone, have you had a chance to look at this?

Is there anything I can do to make it easier to review?

NevinBR · 2021-09-07T15:37:57Z

stdlib/public/core/FloatingPointRandom.swift

+  // If section numbers used 64 bits, then for ranges like `-1.0...1.0`, the
+  // `Int64.random(in:using:)` call in the general case would need to call
+  // `next()` twice on average. Each bit smaller than that halves the
+  // probability of a second `next()` call.
+  //
+  // The tradeoff is wider sections, which means an increased probability of
+  // landing in a section which spans more than one representable value and
+  // thus requires a second random integer.
+  //
+  // We optimize for `Double` by using 60 bits. This gives worst-case ranges
+  // like `-1.0...64.0` a 3% chance of needing a second random integer.
+  @_transparent
+  @_alwaysEmitIntoClient
+  internal static var _sectionBitCount: Int { UInt64.bitWidth - 4 }


This will be unnecessary after #39143 “An optimal algorithm for bounded random integers” lands.

stephentyrone · Sep 7, 2021

May still be beneficial, let's measure; I have some further ideas for improving floating-point generation as well, though =)

NevinBR · Sep 7, 2021

I look forward to hearing your ideas!

Artoria2e5 · 2021-09-13T05:59:28Z

I think @goualard-f is an authority in this stuff after his survey of random-float-intervals in programming languages. I am not sure about how to best contact him -- maybe this mention will do.

NevinBR · 2021-09-13T19:24:51Z

> I think @goualard-f is an authority in this stuff after his survey of random-float-intervals in programming languages. I am not sure about how to best contact him -- maybe this mention will do.

Thanks for the link. I skimmed through the article—will have to go back and read it more thoroughly later—but my initial impression is that its proposed algorithm has the goal of selecting equally spaced values in an interval.

My understanding of the proposed and documented semantics for Swift’s floating-point random numbers is that the goal is to behave as if a real number were chosen uniformly within the interval and then rounded to a representable value. These are distinct operations, and my PR here achieves the latter.
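A rough way to see how far apart the two goals are is to count values in [0, 1). The grid step of 2^-53 below is an assumed example of an equally spaced scheme, not a description of the article's algorithm:

```swift
// Number of distinct Doubles in [0, 1): for non-negative values,
// consecutive Doubles have consecutive bit patterns.
let representableCount = Double(1).bitPattern - Double(0).bitPattern  // 1023 * 2^52

// An equally spaced grid with step 2^-53 contains only 2^53 values.
let gridCount: UInt64 = 1 << 53

// Integer division: roughly 511 representable values per grid point.
print(representableCount / gridCount)  // prints "511"
```

Under the rounded-real semantics every one of those representable values has a nonzero probability, whereas an equally spaced grid reaches only a small fraction of them.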

NevinBR added 2 commits August 19, 2020 21:51

FloatingPointRandom implementation

0451195

FloatingPointRandom tests

d0caa82

NevinBR mentioned this pull request Aug 20, 2020

[stdlib] Floating-point random-number improvements #33455

Closed

NevinBR changed the base branch from master to main October 1, 2020 13:26

Artoria2e5 mentioned this pull request Sep 13, 2021

Rebased & reviewed translations for zh_CN & zh_TW abh/ntppool#136

Closed

Artoria2e5 mentioned this pull request May 11, 2024

Not just [0,1), but also random int tc39/proposal-seeded-random#22

Closed
