-
Notifications
You must be signed in to change notification settings - Fork 153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Node and Edge Filtering #886
Conversation
Pull Request Test Coverage Report for Build 5226533962
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking like a good start, thanks for doing this! I left a small suggestion inline on how you can remove an unnecessary map()
call. Besides that can you add some tests to validate it works as expected. Also make sure to add a test that raises an exception when filter_function
is run to validate the exception gets raised back to python correctly. Besides that if you could add a release note documenting the addition of this new feature (you can see the details on this here: https://github.com/Qiskit/rustworkx/blob/main/CONTRIBUTING.md#release-notes )
…ted tests for filter_edges and filter_nodes for both PyGraph and PyDiGraph. Created release notes for the functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Danielle, your additions seem to work well with rustworkx
on python. I have personally tested it with both PyGraph
and PyDiGraph
. Everything LGTM!
However, there are some changes to be made to your release notes, as they're not passing the required tests. Because of the way the release notes are generated, the examples you provided will not run unless formatted properly. Consider the examples I have left below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great now, thanks for the fast updates adding the tests and release notes. I just had a few small comments inline mostly with some small documentation improvements we should make.
I also left a question inline about memory allocation, it's not a blocker as one way is not clearly better than the other.
src/digraph.rs
Outdated
res.extract(py) | ||
}; | ||
|
||
let mut n = Vec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A potential performance improvement we can make here is to do:
let mut n = Vec::new(); | |
let mut n = Vec::with_capacity(self.graph.node_count()); |
This basically allocates up front a Vec
with enough space for the number of nodes in the graph. THis avoids having to do multiple memory allocations as the Vec
grows (which are slow). The tradeoff here being that we are likely over allocating memory in the return object (unless filter_function
returns True
for every node. I'm thinking this tradeoff is ok in this case though because each element of the output array is just a 64bit integer so except for pathological edge cases where we have millions of nodes and the filter only has a few matches nobody will really notice if we've allocated and extra space for a few integers.
But, I'm curious what your thoughts are on this tradeoff and what you think we should do here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, this makes sense. I think over allocating is worth it in the long run, especially if reallocating is expensive. I'll make that change!
src/digraph.rs
Outdated
res.extract(py) | ||
}; | ||
|
||
let mut e = Vec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above with filter_nodes
we can do:
let mut e = Vec::new(); | |
let mut e = Vec::with_capacity(self.graph.edge_count()); |
@@ -2889,6 +2889,48 @@ impl PyDiGraph { | |||
} | |||
} | |||
|
|||
/// Filters a graph's nodes by some criteria conditioned on a node's data payload and returns those nodes' indices | |||
/// | |||
/// :param filter_function: Function with which to filter nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs a little more detail on how this is used. I think we should include what the format is for the filter function (what it gets passed and the required return type). Maybe having an example in the docs here would be useful too since it might be unclear even with an English description of the behavior.
This comment applies to all the functions being added in both graph and digraph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the example goes a long way for clarity thanks for adding that. But I still think we should explicitly document the inputs and required output from filter_function
so that it's clear to users that it will be passed a node weight and is required to return a boolean value (this applies to all the docstrings for the other 3 methods too).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for updating the docstring
Aims to resolve #800
Adds filter_nodes() and filter_edges() methods to the Graph and DiGraph classes
:)