Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better behaviour for giant component calculation of null graph #649

Closed
szhorvat opened this issue Mar 14, 2023 · 1 comment
Closed

Better behaviour for giant component calculation of null graph #649

szhorvat opened this issue Mar 14, 2023 · 1 comment

Comments

@szhorvat
Copy link
Member

szhorvat commented Mar 14, 2023

When trying to take the giant component of the null graph, this happens:

In [16]: ig.Graph().connected_components().giant()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[16], line 1
----> 1 ig.Graph().connected_components().giant()

File ~/opt/miniconda3/envs/net/lib/python3.9/site-packages/igraph/clustering.py:429, in VertexClustering.giant(self)
    414 """Returns the largest cluster of the clustered graph.
    415 
    416 The largest cluster is a cluster for which no larger cluster exists in
   (...)
    426 @return: a copy of the largest cluster.
    427 """
    428 ss = self.sizes()
--> 429 max_size = max(ss)
    430 return self.subgraph(ss.index(max_size))

ValueError: max() arg is an empty sequence

The behaviour could be more user-friendly. Here are two options:

  1. Either throw an error which clearly states that "the null graph has no giant component" (instead of "max() arg is an empty sequence"). This makes sense since the null graph has zero connected components, thus it has no largest connected component.
  2. Or generalize the idea and explicitly define the giant component of the null graph to be a null graph as well. This is a useful generalization to the quirky edge case of the null graph, where it's impossible to keep everything consistent anyway. The advantage is that that more operations would succeed on all graphs, including the null graph. For example, consider computing a percolation curve, i.e. what is the size of the giant component after removing a given number of vertices? Now we can go to zero. The result of .giant() is still a subgraph of the original graph—this is fine.

I have a slight preference towards (2) for its practicality. This is Design Principle 2 from https://github.com/igraph/igraph/wiki/Design-principles-for-functions This is the choice I made in the Mathematica interface.


Update: Here's a further argument for why (2) is also mathematically sensible.

We can think of (1) as: Partition the vertex set according to connected components, then take a largest partition. Since in the case of the null graph, there are no partitions, this makes no sense.

We can think of (2) as: Take the set of vertices that belong to a largest connected component, and produce the associated induced subgraph. Since there is no largest connected component, the corresponding set of vertices is empty. Then the subgraph induced by them is the null graph.

At this point I'm firmly in favour of (2).

@ntamas ntamas closed this as completed in 79a7829 Mar 14, 2023
@ntamas
Copy link
Member

ntamas commented Mar 14, 2023

Implemented (2), thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants