Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove max sample value #1374

Merged
merged 12 commits into from
Apr 18, 2023
Merged

Remove max sample value #1374

merged 12 commits into from
Apr 18, 2023

Conversation

StephenButtolph
Copy link
Contributor

Why this should be merged

  • We shouldn't rely on the std lib for exact consistency.
  • We have a number of odd edge cases because we can't sample a number in the range (MaxInt64, MaxUint64].

How this works

Extends the rng function to support generating all possible values in the range [0, MaxUint64]

How this was tested

  • Existing tests
  • Manual edge case testing
  • Fuzzing against the prior implementation for values [0, MaxInt64) (note, these must produce the exact same numbers when n < MaxInt64 or else this could result in an unexpected change in the proposervm calculation.

@StephenButtolph StephenButtolph added the cleanup Code quality improvement label Apr 18, 2023
@StephenButtolph StephenButtolph added this to the v1.10.1 milestone Apr 18, 2023
@StephenButtolph StephenButtolph self-assigned this Apr 18, 2023
@@ -82,7 +76,7 @@ func (s *uniformReplacer) Next() (uint64, error) {
return 0, errOutOfRange
}

draw := uint64(s.rng.Int63n(int64(s.length-s.drawsCount))) + s.drawsCount
draw := s.rng.Uint64n(s.length-1-s.drawsCount) + s.drawsCount
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the -1 here now because Uint64n returns values in the range [0, n] whereas Int63n returned values in the range [0, n)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be simpler to just have Uint64n return a number in [0, n)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we should distinguish between these 2 random modes? Seems confusing that *n function now behaves differently (especially if it is different than the golang rand pkg: https://pkg.go.dev/math/rand#Int31n)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be simpler to just have Uint64n return a number in [0, n)?

This would mean that we would need to panic if n == 0 which I was trying to avoid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed Uint64n to Uint64Inclusive to be very clear

@@ -70,7 +64,7 @@ func (s *uniformResample) Next() (uint64, error) {
}

for {
draw := uint64(s.rng.Int63n(int64(s.length)))
draw := s.rng.Uint64n(s.length - 1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the -1 here now because Uint64n returns values in the range [0, n] whereas Int63n returned values in the range [0, n)

samples := []uint64(nil)
if totalWeight > 0 {
samples = make([]uint64, s.benchmarkIterations)
for i := range samples {
samples[i] = uint64(globalRNG.Int63n(int64(totalWeight)))
samples[i] = globalRNG.Uint64n(totalWeight - 1)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the -1 here now because Uint64n returns values in the range [0, n] whereas Int63n returned values in the range [0, n)

Copy link
Contributor

@darioush darioush left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

@patrick-ogrady patrick-ogrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct to me and all "novel" code is pulled from math/rand.

utils/sampler/rand.go Outdated Show resolved Hide resolved
utils/sampler/weighted_benchmark_test.go Outdated Show resolved Hide resolved
utils/sampler/weighted_benchmark_test.go Outdated Show resolved Hide resolved
utils/sampler/weighted_benchmark_test.go Outdated Show resolved Hide resolved
@@ -82,7 +76,7 @@ func (s *uniformReplacer) Next() (uint64, error) {
return 0, errOutOfRange
}

draw := uint64(s.rng.Int63n(int64(s.length-s.drawsCount))) + s.drawsCount
draw := s.rng.Uint64n(s.length-1-s.drawsCount) + s.drawsCount

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be simpler to just have Uint64n return a number in [0, n)?

utils/sampler/rand.go Show resolved Hide resolved
s.rng = globalRNG
s.seededRNG = newRNG()
s.length = length
s.drawn = make(defaultMap)
s.drawsCount = 0
return nil
}

func (s *uniformReplacer) Sample(count int) ([]uint64, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will panic if count = -1

s := uniformReplacer{}
s.Initialize(1)
s.Sample(-1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like panicking is pretty reasonable when requesting a negative number of values

@StephenButtolph StephenButtolph merged commit dbdd478 into dev Apr 18, 2023
@StephenButtolph StephenButtolph deleted the remove-max-sample-value branch April 18, 2023 21:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cleanup Code quality improvement
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants