Better behaviour for giant component calculation of null graph #649

szhorvat · 2023-03-14T14:01:13Z

When trying to take the giant component of the null graph, this happens:

In [16]: ig.Graph().connected_components().giant()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[16], line 1
----> 1 ig.Graph().connected_components().giant()

File ~/opt/miniconda3/envs/net/lib/python3.9/site-packages/igraph/clustering.py:429, in VertexClustering.giant(self)
    414 """Returns the largest cluster of the clustered graph.
    415 
    416 The largest cluster is a cluster for which no larger cluster exists in
   (...)
    426 @return: a copy of the largest cluster.
    427 """
    428 ss = self.sizes()
--> 429 max_size = max(ss)
    430 return self.subgraph(ss.index(max_size))

ValueError: max() arg is an empty sequence

The behaviour could be more user-friendly. Here are two options:

Either throw an error which clearly states that "the null graph has no giant component" (instead of "max() arg is an empty sequence"). This makes sense since the null graph has zero connected components, thus it has no largest connected component.
Or generalize the idea and explicitly define the giant component of the null graph to be a null graph as well. This is a useful generalization to the quirky edge case of the null graph, where it's impossible to keep everything consistent anyway. The advantage is that that more operations would succeed on all graphs, including the null graph. For example, consider computing a percolation curve, i.e. what is the size of the giant component after removing a given number of vertices? Now we can go to zero. The result of .giant() is still a subgraph of the original graph—this is fine.

I have a slight preference towards (2) for its practicality. This is Design Principle 2 from https://github.com/igraph/igraph/wiki/Design-principles-for-functions This is the choice I made in the Mathematica interface.

Update: Here's a further argument for why (2) is also mathematically sensible.

We can think of (1) as: Partition the vertex set according to connected components, then take a largest partition. Since in the case of the null graph, there are no partitions, this makes no sense.

We can think of (2) as: Take the set of vertices that belong to a largest connected component, and produce the associated induced subgraph. Since there is no largest connected component, the corresponding set of vertices is empty. Then the subgraph induced by them is the null graph.

At this point I'm firmly in favour of (2).

The text was updated successfully, but these errors were encountered:

ntamas · 2023-03-14T22:14:59Z

Implemented (2), thanks.

ntamas closed this as completed in 79a7829 Mar 14, 2023

szhorvat mentioned this issue Apr 20, 2023

feat: New largest_component() returns the largest connected component igraph/rigraph#786

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Better behaviour for giant component calculation of null graph #649

Better behaviour for giant component calculation of null graph #649

szhorvat commented Mar 14, 2023 •

edited

Loading

ntamas commented Mar 14, 2023

Better behaviour for giant component calculation of null graph #649

Better behaviour for giant component calculation of null graph #649

Comments

szhorvat commented Mar 14, 2023 • edited Loading

ntamas commented Mar 14, 2023

szhorvat commented Mar 14, 2023 •

edited

Loading