-
-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider dropping Vertex ID #131
Comments
The lack of feedback appears to indicate the Vertex ID will not be missed much :-) I'll look into removing this and see how it works out. |
For the reference: Dropping the vertex ID also helps avoiding performance issues when adding a huge number of vertices, see #136 for some more background. |
here is my feedback: I personally use and depend on the ID field a whole lot. The graph uniqueness enforcement comes in handy sometimes. Cloning a vertex within a graph is rather important. However, there should be a "flag" or a mechanism by which we one can choose to enforce uniqueness. As long as that is the case, I don't think anyone will miss the ID field in particular. |
This ticket hasn't seen any updates in quite a while and I suppose it's about time for a progress update at least: I'm still planning to remove the ID attribute, but will postpone this to a later release simply because a release has been outstanding for quite some time and I would rather not introduce too many BC breaks in a single release. This should also give people plenty of time to plan ahead and be aware that they might be relying on a deprecated feature that will be removed with the following release. |
I strongly oppose this change. Vertex identity is a key concept in basically every graph algorithm. Even the basics like BFS and DFS rely on identity to know when cycles are encountered. How else do you suggest doing vertex comparison in way that is portable between graph instances? Suppose we have an algorithm that decomposes a graph into N subgraphs before doing further processing. Simple cycles (graphp/algorithms#33) for example. A subgraph with all new vertex instances and no $id would only be identifiable by its object instance, which would be a different object to the equivalent vertex in the original graph. Sure, you would be able to get the strongly connected components, and then the simple cycles within those strongly connected components, but there would be no way to relating the resulting vertices to the original ones. Pretty much defeats the point of the exercise, no? |
Identity is a core concept. Any other attributes are labels. We can't remove identity.
What's the point of having an identical clone of a vertex in a graph? That would just be redundant. This is very different to having a new vertex with new identity and a matching set of in/out edges.
Yes, this is the whole point of identity. |
@phyrwork Thank you for your input, I think you're bringing up some very good points here! 👍
I think we can agree that identity is indeed a core concept and that all other attributes are indeed just arbitrary attributes. I think we can discuss the idea of whether identity needs to be a core concept in this library and I would argue that we can indeed remove IDs altogether.
Admittedly, I agree that relying on the vertex ID is a convenient feature that is used extensively in our algorithms. As an alternative, we can also rely on PHP's object instance identity using with the $graph = new Graph();
$a = $graph->createVertex();
$b = $graph->createVertex();
assert($a !== $b); Similarly, many algorithms rely on marking vertices as "visited" and currently rely on the vertex ID for this. This can similarly be achieved by either using $graph = new Graph();
$a = $graph->createVertex();
$b = $graph->createVertex();
$visited = [
spl_object_hash($a) => true
];
assert(isset($visited[spl_object_hash($a)]));
assert(!isset($visited[spl_object_hash($b)])); Both examples have the advantage that this is entirely built-in and nothing we have to worry about. Additionally, this actually ensures identity of an instance, not just identity within some graph. This is an important distinction because a number of places rely on identity when it comes to multiple independent (or cloned) graphs like this: $graph = new Graph();
$a = $graph->createVertex();
$graph = new Graph();
$b = $graph->createVertex();
// these are never the same, even if they may have the same vertex ID
assert($a !== $b);
Again a very good point. And arguably one that I suppose may be a bit more tricky to address. If possible, it may be best to avoid this altogether (depending on the algorithm of course). As an alternative, one option would be to map from the original vertex instance to the cloned vertex instance relying on vertex positions: $graph = new Graph();
$graph->createVertex();
$graph->createVertex();
$clone = $graph->createGraphClone();
$original = $graph->getVertices();
$new = $clone->getVertices();
assert($original[0] !== $new[0]);
assert($original[1] !== $new[1]); Similarly, one can easily build a map using the instance identifiers as in the previous example. Or as another alternative, we may even attach some arbitrary identifier attributes to the original graph before cloning and then use this to match from the cloned instances. Admittedly, this may sound like a non-trivial change and one that still has to be evaluated to see if it's reasonable to change our algorithms like this and how much of a big deal this actually is. My gut feeling is that this sounds more challenging than it actually is, as not too many algorithms should rely on this in the first place. Again, your input is very much appreciated! The suggested change is still under discussion and nothing is set in stone yet. Keep it coming, any input is welcome! |
Yes, this is what I was getting at. But you're suggesting that vertex objects can be shared between graphs. And how would this vertex ID be represented in a user friendly way when exporting to X format? Not very user-friendly to export a huge string when you could just export 1,2,3,4... Singleton vertex objects complicates a whole bunch of stuff.
Essentially you are suggesting maintaining a linked list of objects to their original objects in place of simply retaining vertex ID like every other graph library in existence does. Vertex ID is fundamental in graph theory. Look at every diagram that explains graph theory ever. Nodes are displayed with numbers representing their origin. Traversing a list/maps to maps to maps to... (you get the idea) is wasteful and overly complicated. I advise against this move. It seems like a simplification but the ramifications are significant. |
@phyrwork I think we agree that identity is an important concept in graph theory. I still think that it makes sense to rely on language-level identity (object identity) instead of applying arbitrary identifiers on top of the existing objects. I've just merged #185 and have prepared related changes for our algorithms and graph rendering and the changeset looks quite reasonable (imho). I'll file these PRs as soon as possible and will link against this ticket to keep this discussion going. If we feel that having explicit identifiers in here makes sense in the future (again), we can still add this later on without introducing a BC break. Thanks for the discussion! 👍 |
Some thoughts:
Vertex
andEdge
classes, such as weight, balance etc. and will use general purposeAttributes
instead.Attributes
. Arguably, this might cause confusion and limits how this library can be used.As an alternative, we should consider if dropping the Vertex ID altogether could be an option.
Attributes
insteadThe text was updated successfully, but these errors were encountered: