-
Notifications
You must be signed in to change notification settings - Fork 923
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GroupBy: Add count
and agg
methods
#2290
Conversation
Performance benchmarks:
|
|
@@ -606,6 +606,31 @@ def do(self, method: Callable | str, *args, **kwargs) -> GroupBy: | |||
|
|||
return self | |||
|
|||
def count(self) -> dict[Any, int]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although it can be expressed with agg
, I am in favour of having a separate method for count
. It just makes code more readable and easier to write.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be dict[Hashable, int]
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, missed this one.
Aren't all dicts by design required to have Hashable
as a key? Seems a bit redundant to define that every time.
Like, the int
part adds information. The Hashable
doesn't (over Any
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please "Unresolve conversation" if you have additional comments by the way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Hashable
adds information to the type checker (in an IDE) to constraint user code. Arguably, any dict[Any, *]
should be dict[Hashable, *]
. I don't have a citation on why this is better. ChatGPT (the fallback version after my quota of GPT-4o has run out) seems to agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that sounds about right.
A smart type checker should know you can’t use non-hashable types as dict keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried Mypy:
from typing import Hashable, Any
a: dict[Hashable, int] = {[1]: 1}
a: dict[Any, int] = {[1]: 1}
It caught the first one, but doesn't consider the 2nd one to be a type error.
Added two new methods to the `GroupBy` class to enhance aggregation and group operations: - `count`: Returns the count of agents in each group. - `agg`: Performs aggregation on a specific attribute across groups, applying a function like `sum`, `min`, `max`, or `mean`. These methods improve flexibility in applying both group-level and attribute-specific operations.
- Extend test_agentset_groupby() function - Include tests for count() method to verify group sizes - Add tests for agg() method with sum, max, min, and custom functions - Ensure proper handling of grouped data and attribute aggregation
Change return type annotation of GroupBy.agg() from dict[Any, Any] to dict[Hashable, Any]
d8891c0
to
817b532
Compare
for more information, see https://pre-commit.ci
Added tests and cleaned up commits, ready for final review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
Added two new methods to the `GroupBy` class to enhance aggregation and group operations: - `count`: Returns the count of agents in each group. - `agg`: Performs aggregation on a specific attribute across groups, applying a function like `sum`, `min`, `max`, or `mean`. These methods improve flexibility in applying both group-level and attribute-specific operations.
Added two new methods to the
GroupBy
class:count
: Returns the count of agents in each group.agg
: Performs aggregation on a specific attribute across groups, applying a function likesum
,min
,max
, ormean
.Usage examples
count()
:agg(attr_name, func)
:sum
,min
,max
,mean
).I'm a bit worried about our AgentSet and now GroupBy exploding our code base.
(obviously tests are still needed)