
Add Model.rng for SPEC-7 compliant numpy random number generation #2352

Merged: 14 commits into projectmesa:main on Oct 16, 2024

Conversation

quaquel
Member

@quaquel quaquel commented Oct 14, 2024

This PR adds a SPEC 7 compliant numpy random number generator to the Model class.

Implementation
I have chosen a straightforward approach. I have added an additional optional keyword argument rng that can be used to seed the numpy.random.Generator instance when instantiating a model. I have also added two new attributes to Model: rng and _rng. The first is the generator itself, which can be used analogously to model.random. The second, _rng, is the dictionary describing the state of the numpy.random.Generator instance; knowing this dict allows for reproducibility.

The code itself and e.g., the typing behavior is lifted straight from SPEC 7.

At the moment, a user has to pass either seed or rng, but not both; passing both raises a ValueError. The value passed for the one is also used for the other. If that value is invalid for the other generator, a valid integer-based seed is generated instead. Practically, this means: if you pass seed, we start by seeding random.Random and then try to use seed for numpy.random.default_rng as well. If this raises a TypeError because seed is not valid there (e.g., seed is a float), we instead draw a valid integer via the seeded random.Random. It works the same way if you only pass rng: we create a Generator and try to use rng for random.Random; if this fails because it is, e.g., an ArrayLike sequence of ints, we draw a random int from the Generator to use for random.Random.
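
The behavior described above can be sketched roughly as follows (an illustration only, not the exact code merged in this PR; the helper name `_init_generators` is made up here):

```python
import random

import numpy as np


def _init_generators(seed=None, rng=None):
    """Return a (random.Random, numpy.random.Generator) pair seeded consistently."""
    if seed is not None and rng is not None:
        raise ValueError("pass either seed or rng, not both")

    if seed is not None:
        # seed takes the lead: seed random.Random first, then try the same value for numpy
        py_random = random.Random(seed)
        try:
            np_rng = np.random.default_rng(seed)
        except TypeError:
            # seed is not valid for default_rng (e.g., a float): draw a valid integer instead
            np_rng = np.random.default_rng(py_random.getrandbits(128))
    else:
        # rng takes the lead (rng=None simply gives a freshly seeded Generator)
        np_rng = np.random.default_rng(rng)
        try:
            py_random = random.Random(rng)
        except TypeError:
            # rng is not valid for random.Random (e.g., an ArrayLike of ints): draw an int instead
            py_random = random.Random(int(np_rng.integers(2**63)))

    return py_random, np_rng
```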

What's next?
At present, I have not touched the existing random throughout the code base. SPEC 7 is only about numpy and does not imply anything for python.random. It is for me an open question whether we should completely deprecate python.random throughout mesa in favor of numpy.random. I am inclined to be in favor of making this change because the quality of the random number generators in numpy is superior (long story about the Mersenne Twister's "bad" behavior when seeded with non-32-bit numbers or with adjacent numbers in the case of seed analysis). However, this requires a substantial effort throughout the code base and might come with a performance penalty. What do others think and should we do this before MESA 3.0 becomes stable?

@quaquel added the feature label on Oct 14, 2024
@quaquel requested review from rht and EwoutH on October 14, 2024, 06:57

Performance benchmarks:

| Model | Size | Init time [95% CI] | Run time [95% CI] |
| --- | --- | --- | --- |
| BoltzmannWealth | small | 🔴 +23.5% [+21.6%, +25.3%] | 🔵 +0.6% [+0.3%, +0.9%] |
| BoltzmannWealth | large | 🔵 +0.6% [+0.0%, +1.2%] | 🔵 -0.8% [-2.9%, +1.3%] |
| Schelling | small | 🔵 -0.5% [-1.1%, +0.0%] | 🔵 -2.7% [-2.9%, -2.5%] |
| Schelling | large | 🔵 -2.5% [-4.0%, -1.0%] | 🟢 -11.8% [-13.7%, -10.3%] |
| WolfSheep | small | 🔵 +0.8% [+0.7%, +1.0%] | 🔵 -1.3% [-1.5%, -1.1%] |
| WolfSheep | large | 🟢 -4.3% [-5.5%, -3.2%] | 🟢 -12.0% [-14.5%, -9.7%] |
| BoidFlockers | small | 🔵 +0.8% [+0.1%, +1.6%] | 🔵 +0.6% [-0.1%, +1.4%] |
| BoidFlockers | large | 🔵 +1.5% [+0.5%, +2.5%] | 🔵 +1.0% [+0.4%, +1.6%] |

@EwoutH
Member

EwoutH commented Oct 14, 2024

Implementation looks clean and good!

SPEC 7 is only about numpy and does not imply anything for python.random. It is for me an open question whether we should completely deprecate python.random throughout mesa in favor of numpy.random.

I think we should have one consistent recommended way to pass seeds. In my view, the random number generator should be reset directly in Model init with a certain seed, to make model runs also reproducible when run in (for example) a Jupyter notebook multiple times, or in (custom) batch scripts.

How it's implemented internally doesn't matter that much in my opinion. If that seed number is passed to both python.random and numpy.random and that provides consistently reproducible results, that's perfectly fine. If it uses numpy.random for everything, that's also great.

It's something we should test on a very minimal model I guess.

What do others think and should we do this before Mesa 3.0 becomes stable?

I'm willing to advocate and enable an aggressive timeline for this.

@quaquel
Member Author

quaquel commented Oct 14, 2024

I think we should have one consistent recommended way to pass seeds. In my view, the random number generator should be reset directly in Model init with a certain seed, to make model runs also reproducible when run in (for example) a Jupyter notebook multiple times, or in (custom) batch scripts.

I started out this way, but it's not trivial. For example, python.random.Random does accept a float, while numpy.random.default_rng does not. Also, what defines the state of the rng differs between the two. In the case of python.random.Random, all you need is the initial value for seed. In the case of numpy.random.default_rng, the state is a dict with several fields (the class of the generator, another state dict with state and inc as fields, and some other fields).
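
For concreteness, the state of the default numpy Generator (backed by PCG64) looks roughly like this; the sketch below omits the actual integers:

```python
import numpy as np

rng = np.random.default_rng(42)
state = rng.bit_generator.state
# roughly: {'bit_generator': 'PCG64',
#           'state': {'state': <128-bit int>, 'inc': <128-bit int>},
#           'has_uint32': 0, 'uinteger': 0}
# whereas for random.Random(42), the single value 42 is all you need to recreate the stream
```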

If there is a broader consensus on moving aggressively on this, I would probably approach it a bit differently. Now, I just added rng next to random. In case of an aggressive timeline, I would switch to rng as the basis and use it to seed python.random.Random. It would also imply raising at least a deprecation warning if both seed and rng are passed to Model.__init__, as advocated by SPEC 7, and deprecation warnings from Model.random and Agent.random.

@EwoutH
Member

EwoutH commented Oct 14, 2024

For example, python.random.Random does accept a float, while numpy.random.default_rng does not. Also, what defines the state of the rng differs between the two.

Can we solve this by requiring the user to input the seed as a certain dtype and doing the rest behind the scenes? In this case, require inputting an int and constructing the remainder of the dict in the Model init?

@rht
Contributor

rht commented Oct 14, 2024

In general, there is no need to rush, as SciPy (one of the endorsers of the spec and the one behind the spec) itself has yet to make a move scipy/scipy#14322 (because their user-facing surface with random state is so huge). No project hosted on GH has completely adopted the spec yet (https://github.com/topics/spec-7).

The restriction of not being able to use a float for the seed will be a huge one. What should be the recommended conversion for users' existing seeds? If we move aggressively with the rng today, then existing users' results can no longer be reproduced in Mesa 3.0. If we still have both random and rng, then they can migrate to Mesa 3.0 while keeping their results intact.

@quaquel
Member Author

quaquel commented Oct 14, 2024

In general, there is no need to rush,

There is a rush in another way. If we want to make breaking changes to how random number generation is handled in MESA, it would be good to make these as part of the shift to MESA 3.0. My current idea (based on the thoughts below) is to make numpy.random.default_rng the default and preferred random number generator. The user has to pass either seed or rng, but not both (consistent with SPEC 7). If seed/rng is not valid for random.Random, we raise a warning and create a random but deterministic seed for random.Random from model.rng. I think this gives us the proper foundation from which to make subsequent changes. The old behavior is still there and can be used. Models remain reproducible. We have in place the foundations to move to numpy.random.

The restriction of not being able to use a float for the seed will be a huge one. What should be the recommended conversion for users' existing seeds? If we move aggressively with the rng today, then existing users' results can no longer be reproduced in Mesa 3.0. If we still have both random and rng, then they can migrate to Mesa 3.0 while keeping their results intact.

I deliberately left random in place for now to make migration easier (and to have this conversation). In the case of seed=None, we can easily instantiate rng and use it to draw the seed for random (instead of using random.random as is currently done).

If we want to use seed/rng to seed both python.random.Random and numpy.random.default_rng, we would have to check what the intersection of valid arguments to both is and allow only those. Or, alternatively, make one or the other the priority and use the resulting seeded rng to create a valid seed for the other.

So, for the float case, we could no longer allow it because numpy.random.default_rng takes precedence. Alternatively, if python.random.Random takes precedence, you could seed it as normal, draw a random integer with it, and use that as a valid seed for numpy.random.default_rng. Personally, I have never used a float for seeding but always some very large integer, so I don't know how big of an issue it would be if we no longer allowed floats.

There are various other changes in MESA 3.0 that make me wonder whether perfect numerical reproducibility is not already impossible when switching from MESA 2.x to MESA 3.

@rht
Contributor

rht commented Oct 14, 2024

There are various other changes in MESA 3.0 that make me wonder whether perfect numerical reproducibility is not already impossible when switching from MESA 2.x to MESA 3.

If it is already at the point of no return, then it is fine to do the breaking change. (scikit-learn/scikit-learn#16988 (comment) (from the creator of https://github.com/pyutils/line_profiler) has an extensive read on why a generator construct eliminates a lot of headaches in scikit-learn. I haven't read the lengthy explanation version.)

I haven't said this elsewhere, but [rng.random() for i in range(100)] is a lot slower with NumPy than with stdlib random. The performance benchmark might show this outcome later.

@quaquel
Member Author

quaquel commented Oct 14, 2024

If it is already at the point of no return, then it is fine to do the breaking change.

Ok, I'll update the PR based on some of the other points discussed but make sure we keep random in there while adding spec-7 compliant rng.

I haven't said this elsewhere, but [rng.random() for i in range(100)] is a lot slower with NumPy than with stdlib random. The performance benchmark might show this outcome later.

This is an important point and a possible argument for keeping both stdlib random and rng if the benchmarks further down the line indicate that the overhead is massive.
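
For reference, a quick (and machine-dependent) way to check this overhead would be something like the following sketch; the numbers it prints are not from this PR's benchmarks:

```python
import random
import timeit

import numpy as np

py_random = random.Random(42)
np_rng = np.random.default_rng(42)

# drawing single floats one at a time, as typical model code does
print(timeit.timeit(lambda: [py_random.random() for _ in range(100)], number=10_000))
print(timeit.timeit(lambda: [np_rng.random() for _ in range(100)], number=10_000))

# numpy mainly pays off when draws are vectorised
print(timeit.timeit(lambda: np_rng.random(100), number=10_000))
```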

@quaquel
Member Author

quaquel commented Oct 14, 2024

Ok, I have updated the PR. Users cannot pass both seed and rng; I currently raise a ValueError if both are provided. This is in line with SPEC 7.

If both seed and rng are None, I create numpy.random.default_rng and use this to create a random integer with which to seed random.Random().

If either is passed, we seed the associated random number generator. We try to also use this number for the other. If this raises a type error, we draw a random integer from the seeded random number generator and use this to seed the other.

If either is defined, the other takes the same value. The only drawback is that some of the arguments that are valid for random.Random are not valid for numpy.random.default_rng and vice versa. There is no simple solution for this. The current implementation limits seed and rng to values that are valid for both; based on the documentation, the only argument valid for both is an int. I am open to ideas on how to approach this better. One option is to create random.Random(seed) if seed is passed or numpy.random.default_rng(rng) if rng is passed, and then use this seeded random number generator to create a valid seed for the other. This would allow for a much broader range of valid arguments (the union of both, in fact).
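
Under the behavior described above, user-facing usage would look roughly like this (an illustrative sketch, not taken from the PR's tests):

```python
from mesa import Model

m1 = Model(seed=42)  # seeds model.random; the same value is also used for model.rng
m2 = Model(rng=42)   # seeds model.rng; the same value is also used for model.random
m3 = Model()         # neither passed: a fresh Generator is created and used to seed model.random

try:
    Model(seed=42, rng=42)  # passing both raises, in line with SPEC 7
except ValueError:
    print("cannot pass both seed and rng")
```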

Member

@EwoutH left a comment


I don’t really understand all the little details and nuances of this implementation and how it will interact with everything, but in general it looks clean and doesn’t break anything, and I agree on a conceptual level that a SPEC 7 implementation in Mesa is useful.

So I’m approving this, but requesting that the PR remain open for about a day so that other maintainers have a chance to take a look at it.

@EwoutH
Member

EwoutH commented Oct 15, 2024

@projectmesa/maintainers do any of you have additional insights / an opinion on this?

@quaquel merged commit 7ad5826 into projectmesa:main on Oct 16, 2024
9 of 12 checks passed
@EwoutH
Member

EwoutH commented Oct 16, 2024

@quaquel Unfortunately, this PR broke pre-commit.

@quaquel
Member Author

quaquel commented Oct 16, 2024

So, what does that mean? I don't know what is covered by pre-commit.

@EwoutH
Member

EwoutH commented Oct 16, 2024

The .pre-commit-config.yaml defines what's checked with each commit. Ruff is part of that, and in this case it's the check that fails:

[screenshot: Ruff pre-commit check failing]

@jackiekazil
Member

Was this not noticed before the merge?
I wonder how this accidentally got through, so we can ensure it doesn't happen in the future.

@EwoutH
Member

EwoutH commented Oct 17, 2024

A while ago we un-required some checks, because sometimes they fail either erroneously or because of issues unrelated to that specific PR, which then blocked the PR unnecessarily. Yesterday I re-required pre-commit and one of the build jobs again.

It’s a bit about finding a balance, but in general I’m a fan of educating ourselves about the systems and checks we use and interpreting them as guidelines, rather than making everything a strict rule and being held hostage to automated processes.

These things are really minor and easily fixed, so it's also really not a big deal. As we say in Dutch: “Waar gehakt wordt, vallen spaanders” (where wood is chopped, splinters fall).

@Corvince
Contributor

Yesterday I re-required pre-commit and one of the build jobs again.

The problem was that when you have a PR that doesn't change code, for example a pure documentation update, then the build runs are not triggered at all. But they are still required so you run into a situation where you have to use the big red button, which isn't available to all maintainers.

I just wish there would be an extra step involved when you want to merge something while some checks are failing. Right now it's way too easy to miss failed checks. But making everything required would be too strict; sometimes it's okay when you know you have a dependency that will fix the error soon.

But let's see how it works out now with the required workflows

@quaquel
Member Author

quaquel commented Oct 17, 2024

In addition to @Corvince's point, I was just assuming pre-commit etc. was required and did not check (I always check unit tests). So @EwoutH, did you post somewhere that you changed the required checks, and did I miss it? If not, a simple post in Element would be useful as a heads up.

@EwoutH
Member

EwoutH commented Oct 17, 2024

The problem was that when you have a PR that doesn't change code, for example a pure documentation update, then the build runs are not triggered at all.

You’re right, that’s why we removed that one. I removed it again as a required check.

I just wish there would be an extra step involved when you want to merge something

Agreed, that would be useful. I don’t think GitHub offers such a feature unfortunately.

@quaquel Tom did it a while back before I had admin permissions. I will communicate clearly if I change anything in the future.

Currently only pre-commit is a required check, since it always runs. It’s a good habit as a maintainer to always check whether all checks in GitHub are green before merging, and if something isn’t, to find the root cause. If it’s due to the PR, fix it within the PR; otherwise, post it somewhere relevant.

@EwoutH
Member

EwoutH commented Nov 9, 2024

Found this FIXME in the Agent code; with this merged, could it be resolved?

) # FIXME see issue 1981, how to get the central rng from model

@quaquel
Member Author

quaquel commented Nov 9, 2024

The issue still exists for both AgentSet and the various new-style discrete spaces.

At the moment, they all create a new Random instance if random is None, which is convenient but also makes it easy to create models that are not reproducible. In particular, for the cell spaces, this has bitten me already a few times.

Given spec-7, we cannot really go the global route, and as discussed in #1980, there are additional reasons why that is undesirable. We might consider raising a user warning?
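
A minimal sketch of what such a warning could look like (hypothetical; the constructor here only loosely follows mesa's AgentSet signature):

```python
import warnings
from random import Random


class AgentSet:  # stand-in for mesa's AgentSet / the new-style discrete spaces
    def __init__(self, agents, random=None):
        if random is None:
            warnings.warn(
                "No random number generator provided; creating a new, unseeded Random. "
                "The model run may not be reproducible.",
                UserWarning,
                stacklevel=2,
            )
            random = Random()
        self.random = random
        self._agents = list(agents)
```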

@EwoutH
Member

EwoutH commented Nov 9, 2024

I love sensible user warnings, sounds like a good idea!
