Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a way to reset multiple global random seeds #2578

Merged
merged 1 commit into from
Aug 22, 2023
Merged

Conversation

fingolfin
Copy link
Member

@fingolfin fingolfin commented Jul 20, 2023

Progress towards resolving issue #100, by adding a simple helper to reset the global Julia and GAP RNGs to a fixed seed. More global RNGs should be added.

Goal: make it easier to create reproducible tests involving multiple cornerstones. So that you can e.g. write loops like this:

for seed in 1:10000
  println("No trying seed $seed") 
  Oscar.seed_global!(seed)
  run_test_that_only_sometimes_fails()
end

and then hopefully you end up with a 100% reproducible test case for the given seed (of course that doesn't help with e.g. GC issues that depend on memory pressure / memory layout, but with many others it does help)

TODOs:

  • add a way to reset Polymake RNGs, if applicable (CC @benlorenz @lkastner)
  • add a way to reset Singular RNGs, if applicable (CC @hannes14)
  • add more RNGs we use, if there are any (maybe FLINT has one?)
  • hook the new docstring up into the documentation (where?)

A future extension might also allow saving and restoring the global states of the RNGs, but this seems far less important to me right now, I'd like to focus on this simply API for now.

CC @fieker @thofma

@thofma
Copy link
Collaborator

thofma commented Jul 20, 2023

Nemo.randseed! sets the seed for the flint RNG of the current thread (every thread has its own RNG).

@fingolfin
Copy link
Member Author

Thanks, will add Nemo.randseed! then.

I'll also clarify the docstring that right now, this method does not do anything specific for threads, and so some RNGs may only be reset on the current thread.

As soon as multiple threads are involved, reproducibility is much more complicated anyway :-/

@benlorenz
Copy link
Member

benlorenz commented Jul 20, 2023

I would assume that there is no guarantee that no other package we are depending on (or code in the julia core) uses the default julia rng as well? Unless that is the case I am not sure if it makes much sense to seed the julia rng.

Please also set the rng_seed further up in the same file, and it might make sense to integrate / merge seed_global! with set_seed!?

The idea with the other functions in rand.jl was that developers would fetch a (seeded) RNG object in whatever algorithm they are writing or in the test-file and use that throughout their code. This way it should be unaffected by other uses of the global rng objects.

Regarding polymake:
We don't use a global rng that but instead functions that depend on random numbers have a seed keyword argument that will be used for the random source in that algorithm (this should also be exposed in Oscar).

@fingolfin
Copy link
Member Author

I would assume that there is no guarantee that no other package we are depending on (or code in the julia core) uses the default julia rng as well?

Of course not.

Unless that is the case I am not sure if it makes much sense to seed the julia rng.

I don't see what one has to do with the other: the benefit of seeding the default Julia rng is that a lot of our code uses it (just grep for rand).

And of course some still uses different RNGs and hence won't be "handled" by the code in this PR -- but that's OK, this is not about a perfect, solution, just a pragmatic one; even if it doesn't cover all cases, it's still very useful in a bunch of situations.

Please also set the rng_seed further up in the same file,

I deliberately didn't do it in my first iteration because based on the description it did seem to be orthogonal (it is for CI tests which are written to deliberately use get_seeded_rng()).

But if you think it should be done, sure, I can do it, no problem.

and it might make sense to integrate / merge seed_global! with set_seed!?

I don't think that makes sense, because they are for different applications. seed_global! is more for interactive use, e.g. when trying to make a sporadically failing test fail reliably (as in the example in the PR description). In contrast set_seed! only affects code using get_seeded_rng() which according to its docstring should be mostly things in the test suite?

The idea with the other functions in rand.jl was that developers would fetch a (seeded) RNG object in whatever algorithm they are writing or in the test-file and use that throughout their code. This way it should be unaffected by other uses of the global rng objects.

Ah, I was not aware that it is also intended for regular code, not just tests. But I am somewhat sceptical about that:

julia> rng=Oscar.get_seeded_rng();  rand(rng, 1:1000)
219

julia> rng=Oscar.get_seeded_rng();  rand(rng, 1:1000)
219

julia> rng=Oscar.get_seeded_rng();  rand(rng, 1:1000)
219

So if I write code which uses Oscar.get_seeded_rng() it will always get the same sequence of random numbers, unless something outside calls Oscar.set_seed! with a different seed. But that's not desirable for many Monte Carlo or Las Vegas algorithms. In fact it would be a disaster for many such algorithms in matrix group recognition, for example.

Regarding polymake: We don't use a global rng that but instead functions that depend on random numbers have a seed keyword argument that will be used for the random source in that algorithm (this should also be exposed in Oscar).

OK.

@fingolfin fingolfin marked this pull request as ready for review August 16, 2023 09:41
Copy link
Member

@benlorenz benlorenz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assume that there is no guarantee that no other package we are depending on (or code in the julia core) uses the default julia rng as well?

Of course not.

Unless that is the case I am not sure if it makes much sense to seed the julia rng.

I don't see what one has to do with the other: the benefit of seeding the default Julia rng is that a lot of our code uses it (just grep for rand).

I was thinking that there might be various julia-internal uses of the default-rng which would mutate the state and thus lead to rather unpredictable output. But they seem to use a separate rng for most of the internal stuff so I guess this should be fine.

Please also set the rng_seed further up in the same file,

I deliberately didn't do it in my first iteration because based on the description it did seem to be orthogonal (it is for CI tests which are written to deliberately use get_seeded_rng()).

But if you think it should be done, sure, I can do it, no problem.

I see, and yes it is probably better to keep those separate.

The idea with the other functions in rand.jl was that developers would fetch a (seeded) RNG object in whatever algorithm they are writing or in the test-file and use that throughout their code. This way it should be unaffected by other uses of the global rng objects.

Ah, I was not aware that it is also intended for regular code, not just tests. But I am somewhat sceptical about that:

julia> rng=Oscar.get_seeded_rng();  rand(rng, 1:1000)
219

julia> rng=Oscar.get_seeded_rng();  rand(rng, 1:1000)
219

julia> rng=Oscar.get_seeded_rng();  rand(rng, 1:1000)
219

So if I write code which uses Oscar.get_seeded_rng() it will always get the same sequence of random numbers, unless something outside calls Oscar.set_seed! with a different seed. But that's not desirable for many Monte Carlo or Las Vegas algorithms. In fact it would be a disaster for many such algorithms in matrix group recognition, for example.

That depends on the use-case, if I want to write a randomized algorithm that will run in a deterministic way for identical input the following would be fine:

function algo(input; rng=Oscar.get_seeded_rng())
  while (something) do
    r = rand(rng, 1:1000)
    something = other_algo(input, r)
  end
end

If other_algo would work in the same way the rand calls of those two algorithms wont influence each other even if one of the algorithms is changed.

Currently we have many algorithms that all use the same common sequence and some do randomly fail. Suppose we seed the global state at the beginning then adding one random call somewhere in one algorithm will change the sequence for all subsequent code which might run into annoying random errors.

But this is somewhat orthogonal to how this code is supposed to work so this does not really matter for now.

Thanks for the clarification in the docs regarding how this is supposed to be used.

@fingolfin fingolfin merged commit eb7c362 into master Aug 22, 2023
15 of 38 checks passed
@fingolfin fingolfin deleted the mh/seed_global branch August 22, 2023 07:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants