AgentSet: Allow selecting a fraction of agents in the AgentSet #2253

EwoutH · 2024-08-28T10:02:44Z

This PR updates the select method in the AgentSet class by replacing the n parameter with a more versatile at_most parameter. The at_most parameter allows for selecting either a specific number of agents or a fraction of the total agents when provided as an integer or a float, respectively. Additionally, backward compatibility is maintained by supporting the deprecated n parameter, which will trigger a warning when used.

Motive

Previously, the select method only allowed users to specify a fixed number of agents (n) to be selected. The new at_most parameter extends this functionality by enabling the selection of agents based on a proportion of the total set, which is particularly useful in scenarios where relative selection is desired over absolute selection.

Implementation

at_most Parameter:
- Accepts either an integer (to select a fixed number of agents) or a float between 0.0 and 1.0 (to select a fraction of the total agents).
- at_most=1 selects one agent, while at_most=1.0 selects all agents.
- If a float is provided, it determines the maximum fraction of agents to be selected from the total set. It rounds down to the nearest number of whole agents.
Backward Compatibility:
- The deprecated n parameter is still supported, but it now serves as a fallback for at_most and triggers a deprecation warning.
Behavior Notes:
- at_most serves as an upper limit on the number of selected agents. If additional filtering criteria are provided, the final selection may include fewer agents.
- For random sampling, users should shuffle the AgentSet before applying at_most.
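
The rules above can be sketched in isolation as a small helper that turns an at_most value into a maximum agent count. This is only an illustration of the described behavior, not Mesa's actual implementation; the name resolve_at_most is hypothetical.

```python
import math
from typing import Union


def resolve_at_most(at_most: Union[int, float], total: int) -> int:
    """Translate an at_most value into a maximum number of agents.

    Hypothetical helper for illustration only; not part of Mesa.
    """
    if isinstance(at_most, float):
        if not 0.0 <= at_most <= 1.0:
            raise ValueError("a float at_most must lie between 0.0 and 1.0")
        # A float is a fraction of the total, rounded down to whole agents.
        return math.floor(total * at_most)
    # An integer is an absolute upper limit.
    return at_most


print(resolve_at_most(1, 10))    # 1  (one agent)
print(resolve_at_most(1.0, 10))  # 10 (all agents)
print(resolve_at_most(0.2, 14))  # 2  (2.8 rounded down)
```

Dispatching on the type rather than the value is what lets 1 and 1.0 mean different things.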

Usage Examples

# Select the first 5 agents from the AgentSet
selected_agents = agents.select(at_most=5)
selected_agents = agents.select(n=5)  # Still works but throws a deprecation warning

# Select the first 20% of agents from the AgentSet (as currently sorted, rounded down)
selected_agents = agents.select(at_most=0.2)

To randomly select a fraction, add a shuffle():

# Select 20% of agents randomly from the AgentSet
random_agents = agents.shuffle().select(at_most=0.2)

Combining with sorting:

# Select the 20% of agents with the lowest wealth
selected_agents = agents.sort("wealth", ascending=True).select(at_most=0.2)

The most powerful feature is that you can combine at_most with additional criteria:

# Select agents with "wealth" less than 5, and at most 20% of the total
selected_agents = agents.select(lambda agent: agent.wealth < 5, at_most=0.2)

# First filter agents, then select 20% of those remaining
filtered_agents = agents.select(lambda agent: agent.wealth < 5).select(at_most=0.2)

You can also use it with chaining:

# Randomly select 40% of the agents from the AgentSet and set a value
model.agents.shuffle().select(at_most=0.4).set('has_license', True)

github-actions · 2024-08-28T10:08:04Z

Performance benchmarks:

quaquel · 2024-08-28T10:14:32Z

what is the motivation for adding this to the agentset?

Corvince · 2024-08-28T10:18:28Z

That seems useful, thanks!

The only worry I have is how this behaves if a user specifies both n and p. That probably should raise an error?

Or maybe there is a good name that could incorporate both p and n? So if it is between 0 and 1 use a fraction and if it is a whole number above 1 use that number?

EwoutH · 2024-08-28T10:27:39Z

> what is the motivation for adding this to the agentset?

Sorry, was still working on other features (and my actual model), wrote it up.

> That seems useful, thanks!
>
> The only worry I have is how this behaves if a user specifies both n and p. That probably should raise an error?

Yeah I was thinking about that. Maybe just don't do that (and we mention it in the docstring)?

If you just want to select a fraction of n, you can do n=round(n*p), so having both doesn't make sense.

> Or maybe there is a good name that could incorporate both p and n? So if it is between 0 and 1 use a fraction and if it is a whole number above 1 use that number?

Very interesting idea, but maybe in this case explicit is better than implicit. Except if you can come up with a killer name.

quaquel · 2024-08-28T10:53:06Z

I like the clarity of p. So my suggestion would be to raise a value error if both n and p are passed
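
A minimal sketch of such a check, assuming both arguments default to None (the defaults, the function name check_exclusive and the error message are assumptions made for illustration, not taken from the PR):

```python
def check_exclusive(n=None, p=None):
    """Reject calls that specify both an absolute count and a fraction.

    Hypothetical standalone check; argument names follow the discussion.
    """
    if n is not None and p is not None:
        raise ValueError("Specify either n or p, not both.")


check_exclusive(n=5)    # fine
check_exclusive(p=0.2)  # fine
try:
    check_exclusive(n=5, p=0.2)
except ValueError as err:
    print(err)  # Specify either n or p, not both.
```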

quaquel · 2024-08-28T11:32:11Z

see the few minor comments and once unit tests are added, this is good to go.

EwoutH · 2024-08-28T18:22:12Z

Okay, I:

- Changed p to fraction
- Used the ValueError
- Updated the other docstring, including notes
- Added tests
- Updated the examples

However, I noticed that there's an important difference between n and fraction. n is always fixed; it's just an upper limit. For fraction, it matters when you apply it: before or after the rest of the selection.

Currently fraction is interpreted as a fraction of the input AgentSet. When writing the usage examples that felt really counterintuitive. It would be more logical if you could apply it afterwards, such that a fraction of the selected AgentSet is returned.

Why? Because if you take these two use cases:

- Select the agents with "wealth" less than 5 but at most 20% of total agents
- Select the agents with "wealth" less than 5, and then 20% of those agents

The latter is used way more than the former. And it will be way more logical if you select by type.

So I would suggest applying fraction afterwards, on the selected AgentSet after all other operations are done. Then you could still do both:

# Select the agents with "wealth" less than 5, and at most 20% of total agents
agents.select(fraction=0.2).select(lambda agent: agent.wealth < 5)

# Select the agents with "wealth" less than 5, and then 20% of those agents
agents.select(lambda agent: agent.wealth < 5, fraction=0.2)
# or, equivalently:
agents.select(lambda agent: agent.wealth < 5).select(fraction=0.2)

But now the one that's more used and more intuitive will go well by default.
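
The two use cases can be compared on plain Python lists standing in for an AgentSet. This is only a sketch with made-up wealth values; it does not use Mesa's API.

```python
import math

# Ten stand-in "agents", represented only by their wealth (made-up values).
wealth = [1, 8, 3, 9, 2, 7, 4, 6, 0, 5]
fraction = 0.4

# Use case 1: the fraction is a cap relative to the ORIGINAL set.
cap = math.floor(len(wealth) * fraction)          # 4
capped = [w for w in wealth if w < 5][:cap]       # first 4 matches

# Use case 2: the fraction is applied AFTER filtering.
poor = [w for w in wealth if w < 5]               # 5 matches
after = poor[: math.floor(len(poor) * fraction)]  # 40% of 5 -> 2

print(capped)  # [1, 3, 2, 4]
print(after)   # [1, 3]
```

The same fraction yields four agents in one reading and two in the other, which is why the order of application has to be pinned down.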

Completely different options could be:

- Don't allow fraction and/or n with other functions, but enforce chaining
- Introduce a new method, like sample, that gives a sample of n or a sample of fraction.

rht · 2024-08-28T20:17:57Z

> what is the motivation for adding this to the agentset?

@EwoutH I'm also wondering about this. Not saying that this shouldn't be in the library, but a concrete example could give some illustration. Is this used in your project?

EwoutH · 2024-08-28T20:41:45Z

This was the thing I wanted to do:

# Randomly select 40% of the agents from the AgentSet and give them a license
model.agents.shuffle().select(fraction=0.4).set('has_license', True)

I needed to do this:

n_license = round(len(model.agents) * license_chance)
model.agents.shuffle().select(n=n_license).do(lambda agent: setattr(agent, 'has_license', True))

With #2254 it got simplified to:

n_license = round(len(model.agents) * license_chance)
model.agents.shuffle().select(n=n_license).set('has_license', True)

It's not a huge use case, but it's nice. Especially that you don't need to break the chain.

Combine it with a function and it gets really powerful though. Assume I want to distribute some cars around (I know a certain percentage of all people has a car), but only to agents with licenses.

agents.select(lambda a: a.has_license, fraction=car_chance).set('has_car', True)

Without the fraction, this would have been:

n_car = round(len(model.agents) * car_chance)
model.agents.select(lambda a: a.has_license, n=n_car).set('has_car', True)

So yeah, it's not a huge use case. Maybe it adds some complexity.

There's a unique application for fraction as an upper limit (cap), as currently implemented, and a unique application for doing it afterwards. I need to think about this a bit longer.

EwoutH · 2024-08-28T21:13:23Z

Right, n=0 has a special status. With a small fraction or small agentset, n can become 0, returning all agents.
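
The pitfall can be reproduced with a small stand-in function. The sketch assumes, as described above, that n=0 is treated as "no limit"; first_n is a hypothetical helper, not Mesa's code.

```python
agents = ["a", "b", "c"]  # stand-ins for three agents
fraction = 0.1

n = round(len(agents) * fraction)  # round(0.3) == 0


def first_n(items, n=0):
    """Return the first n items; n=0 is treated as 'no limit'.

    Mimics the special meaning of n=0 described above (assumption).
    """
    return list(items) if n == 0 else list(items)[:n]


print(first_n(agents, n))  # ['a', 'b', 'c'] : all agents instead of none
```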

Corvince · 2024-08-29T04:16:28Z

> Right, n=0 has a special status. With a small fraction or small agentset, n can become 0, returning all agents.

Good catch!

I see two possibilities now. Either just change the special meaning from 0 to -1. I don't know if there was a good use case for 0, but it's rather strange for 0 to indicate all agents.

The more holistic approach would be to split select into a filter function and a sample function. This would also simplify the logic and solve the "before or after" question (which was present but unconsidered before fraction was introduced)

EwoutH · 2024-08-29T05:34:50Z

The brain is so interesting: after a night's sleep you look at it again, think "oh", and it all clicks together.

Now I just have to write it up and rewrite the code, tests and examples.

Can't wait for 2026/2027, when a bot does that automatically from a voice message.

Long story short: There’s a special use case for when filtering, you want a certain number or fraction at most. Especially the fraction should happen right there in the function, because after the function is done, you don’t know how large the

For all other cases (before, after) a sample method would be perfect (and can be implemented pretty fast I think). sample could also draw a random sample, where select selects the first n/fraction.

> Or maybe there is a good name that could incorporate both p and n? So if it is between 0 and 1 use a fraction and if it is a whole number above 1 use that number?

Obviously the way to go. I was thinking max, limit, ceiling or at_most.

Corvince · 2024-08-29T10:02:12Z

Agreed on the performance aspect. One way to solve this but keep the chainable approach would be to use generator functions to return iterators instead of the complete AgentSet. But maybe as you said this is all mainly catered towards nice semantics and there are other ways already available for performance critical operations.

That's an interesting idea worth exploring at some point (but not this PR). Basically, what if we have a generator interface to an AgentSet? And can we make a chainable API work with generators?

I think having an __iter__ method is kind of enough, so

(agent for agent in agentset)

should already give you an iterator over the agentset. Definitely worth exploring that more, but certainly way out of scope for this PR

//Edit
Ah, sorry, didn't think this through. Definitely needs more thought on the possibility to make this chainable. This of course only iterates over the agents themselves.
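
As a rough illustration of the generator idea (explicitly out of scope for this PR), a lazy filter-then-limit pipeline over any iterable can be written with generator expressions and itertools.islice. The function names lazy_filter and lazy_at_most are hypothetical, and plain numbers stand in for agents.

```python
from itertools import islice


def lazy_filter(agents, predicate):
    """Yield matching agents one at a time instead of building a new set."""
    return (agent for agent in agents if predicate(agent))


def lazy_at_most(agents, limit):
    """Stop iterating as soon as `limit` agents have been yielded."""
    return islice(agents, limit)


wealth = [1, 8, 3, 9, 2, 7, 4]  # stand-in agents (made-up values)
pipeline = lazy_at_most(lazy_filter(wealth, lambda w: w < 5), 2)
print(list(pipeline))  # [1, 3]
```

Nothing is evaluated until the pipeline is consumed, so the filter stops after the second match. Making this work as chained methods on an AgentSet is the open question raised above.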

EwoutH · 2024-08-29T15:03:22Z

I updated this PR to replace n with max.

max (int | float, optional): The maximum number of agents to select. Defaults to infinity.
- If an integer of 1 or larger, the first n matching agents are selected.
- If a float between 0 and 1, at most that fraction of the original agents is selected.

Some details:

- max=1 will give one agent, max=1.0 gives all agents.
- A fallback for n was added, which does max = n and throws a warning.
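
The fallback can be sketched as a stub that only handles the arguments. The parameter is called max at this point in the thread; the sketch uses the later name at_most to avoid shadowing the built-in. The warning text and the stub itself are assumptions for illustration, not Mesa's code.

```python
import warnings


def select(at_most=float("inf"), n=None):
    """Hypothetical stub showing only the argument handling."""
    if n is not None:
        warnings.warn(
            "The parameter 'n' is deprecated. Use 'at_most' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        at_most = n  # the deprecated argument simply feeds the new one
    return at_most


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    limit = select(n=5)

print(limit)                        # 5
print(caught[0].category.__name__)  # DeprecationWarning
```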

Tests are updated. Please double check the internal agent_generator function.

If we decide this is the way to go, I will update the PR description.

I plan on adding a separate sample() function that implements max in the same way, including with a shuffle=True option. Fun fact: sample(n, shuffle=True) will be equivalent to NetLogo's up-to-n-of. @quaquel I know you hate NetLogo with all your heart, but sometimes you can learn a lot from them ;).

But that would be a separate PR.
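
A possible shape for such a function, written as a standalone sketch over plain lists. Everything here is hypothetical: the thread only proposes the idea, and the signature (including the rng argument) is an assumption.

```python
import math
import random


def sample(agents, n, shuffle=False, rng=None):
    """Return up to `n` agents, optionally in random order.

    `n` is an absolute count (int) or a fraction (float), rounded down.
    Hypothetical standalone function; not part of Mesa.
    """
    pool = list(agents)
    if shuffle:
        # Use the supplied random.Random instance, else the module-level RNG.
        (rng or random).shuffle(pool)
    if isinstance(n, float):
        n = math.floor(len(pool) * n)
    return pool[:n]


agents = list(range(10))
print(sample(agents, 3))  # [0, 1, 2]
print(len(sample(agents, 0.4, shuffle=True, rng=random.Random(42))))  # 4
```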

quaquel · 2024-08-30T05:47:33Z

I am unsure about using a single keyword for both the number and the percentage, but I won't object to it either. I would change the name, however. max shadows the name of a built-in.

It would be nice to see a quick overview of what the API is now becoming just for clarity.

> sample(n, shuffle=True) will be equivalent to NetLogo's up-to-n-of. @quaquel I know you hate NetLogo with all your heart, but sometimes you can learn a lot from them ;).

I hate the language, but, yes, we can pick up useful ideas and give them a better name. sample is much better than that weird construct with hyphens in the name 😉.

EwoutH · 2024-08-30T06:01:18Z

> I was thinking max, limit, ceiling or at_most.

Any suggestions (either these or another)?

Corvince · 2024-08-30T07:01:52Z

I like at_most the best. It conveys that "n" can be arbitrarily large, but that the number of returned agents need not match it. It also sort of implies that you first apply a filter and then take a sample. And it also makes the rounding clear for fractions. So 1/3 of 5 (1.67) will be 1 agent, otherwise it would be more than 1/3.

EwoutH · 2024-08-30T07:14:26Z

> So 1/3 of 5 (1.67) will be 1 agent, otherwise it would be more than 1/3.

Currently it does round, do you think it shouldn't?

Corvince · 2024-08-30T07:20:49Z

If it's an upper limit I think it should always round down/floor.

EwoutH · 2024-08-30T07:32:19Z

Difficult one. Because if you describe it as "selecting a fraction" I would expect it to select the closest match.

I think in many practical scenarios the closest selection to the fraction you wanted is most logical.

quaquel · 2024-08-30T07:51:23Z

If we go with at_most, it should round down in the case of fractions. Otherwise, the name and behavior don't match.

Corvince · 2024-08-30T07:51:31Z

Valid argument for "selecting a fraction", but for selecting "at most" 33% I would not expect it to select 40%
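
The arithmetic behind this exchange, worked through for 5 agents and a fraction of 1/3:

```python
import math

n_agents = 5
fraction = 1 / 3

exact = n_agents * fraction         # about 1.67
nearest = round(exact)              # 2 agents -> 40% of the set
floored = math.floor(exact)         # 1 agent  -> 20% of the set

print(nearest, nearest / n_agents)  # 2 0.4
print(floored, floored / n_agents)  # 1 0.2
```

Rounding to the nearest integer overshoots the requested third, while flooring never exceeds it.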

Corvince · 2024-08-30T07:53:06Z

> If we go with at_most, it should round down in the case of fractions. Otherwise, the name and behavior don't match.

Exactly. That's why I think it's a good name (if we floor), because people will always have different expectations for "selecting a fraction" with respect to rounding.

EwoutH · 2024-08-30T08:12:28Z

I renamed max to at_most, made sure it rounded down, and updated the tests.

EwoutH · 2024-08-30T08:22:09Z

PR description is updated, including the usage examples

EwoutH · 2024-08-30T10:54:43Z

@projectmesa/maintainers ready to go? (would like to merge myself)

EwoutH · 2024-08-30T12:52:04Z

(keeping the branch in case of regressions)

EwoutH added the enhancement label Aug 28, 2024

EwoutH mentioned this pull request Aug 28, 2024

AgentSet: Add set method #2254

Merged

quaquel reviewed Aug 28, 2024

quaquel reviewed Aug 28, 2024

quaquel reviewed Aug 28, 2024

EwoutH and others added 3 commits August 28, 2024 19:30

AgentSet: Allow selecting a fraction of agents in the AgentSet

69ee552

Allow selecting a fraction of agents in the AgentSet.

Agentset.select(): n and p can't be set simultaneously

fadb496

[pre-commit.ci] auto fixes from pre-commit.com hooks

e096bc1

for more information, see https://pre-commit.ci

EwoutH force-pushed the select_fraction branch from 7966884 to e096bc1 Compare August 28, 2024 17:30

EwoutH and others added 6 commits August 28, 2024 19:45

Rename p argument to fraction

38920b0

Also add the Raises ValueError and Note about not shuffling by default.

[pre-commit.ci] auto fixes from pre-commit.com hooks

eeec30a

for more information, see https://pre-commit.ci

Another p --> fraction

a06e00e

Add tests for select fraction

75c67d9

[pre-commit.ci] auto fixes from pre-commit.com hooks

6651426

for more information, see https://pre-commit.ci

Update fraction notes

af8f076

EwoutH added 3 commits August 28, 2024 22:52

Fix tests

ce76c80

Fix codespell

7897d25

Really fix tests

4f12317

AgentSet.select(): Rename max to at_most

4a0f8c3

EwoutH added the breaking and deprecation labels and removed the breaking label Aug 30, 2024

Corvince reviewed Aug 30, 2024

Corvince reviewed Aug 30, 2024

Docs update

a5d1d09

Corvince approved these changes Aug 30, 2024

quaquel approved these changes Aug 30, 2024

Clean note

37a3619

EwoutH merged commit efa51cd into main Aug 30, 2024
9 of 10 checks passed

EwoutH deleted the select_fraction branch September 20, 2024 09:09

EwoutH mentioned this pull request Sep 26, 2024

tests: Resolve warnings by removing scheduler and updating arguments #2329

Merged
