-
Notifications
You must be signed in to change notification settings - Fork 62
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
7c0a222
commit 4d8c63d
Showing
26 changed files
with
61,708 additions
and
0 deletions.
There are no files selected for viewing
Binary file not shown.
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,241 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Inferring the Modular Structure of Networks with Weighted Stochastic\n", | ||
"\n", | ||
"Blockmodeling\n", | ||
"\n", | ||
"Simone Santoni \n", | ||
"2024-11-26\n", | ||
"\n", | ||
"This notebook shows how to infer the modular structure of weighted\n", | ||
"networks using the Weighted Stochastic Blockmodeling (WSBM) approach.\n", | ||
"WSBM extends the concept of the Stochastic Blockmodel (SBM) to\n", | ||
"incorporate edge weights into the modeling process, allowing for a more\n", | ||
"detailed representation of the network’s structure. We will use a\n", | ||
"synthetic network dataset to illustrate the key steps involved in\n", | ||
"inferring the modular structure of a weighted network using the WSBM\n", | ||
"approach.\n", | ||
"\n", | ||
"# Overview: inferring the modular structure of networks\n", | ||
"\n", | ||
"The notebook\n", | ||
"[`community_detection.qmd`](https://github.com/simoneSantoni/net-analysis-smm638/blob/master/tutorials/networkModules/community_detection.qmd)\n", | ||
"shows how to infer the modular structure of weighted networks using\n", | ||
"Louvain’s community-detection algorithm. The current notebook focuses on\n", | ||
"a different approach to the same problem: the weighted stochastic\n", | ||
"blockmodeling (WSBM).\n", | ||
"\n", | ||
"WSBM[1] approach is an extension of the stochastic blockmodel (SBM) that\n", | ||
"incorporates edge weights into the modeling process.[2] SBMs are\n", | ||
"probabilistic models used to detect community structures in networks by\n", | ||
"partitioning nodes into blocks or communities, where the probability of\n", | ||
"an edge existing between any two nodes depends only on the blocks to\n", | ||
"which the nodes belong.\n", | ||
"\n", | ||
"WSBM extends this concept by considering not just the presence or\n", | ||
"absence of edges, but also the weights of the edges, which can represent\n", | ||
"the strength or capacity of connections between nodes. This allows for a\n", | ||
"more nuanced understanding of the network’s structure, capturing\n", | ||
"variations in connection strengths within and between communities.\n", | ||
"\n", | ||
"The WSBM approach involves defining a likelihood function that accounts\n", | ||
"for the observed edge weights and optimizing this function to find the\n", | ||
"most likely partition of nodes into communities. This is typically done\n", | ||
"using techniques such as the Expectation-Maximization (EM) algorithm or\n", | ||
"variational inference.\n", | ||
"\n", | ||
"By incorporating edge weights, WSBM provides a more detailed and\n", | ||
"accurate representation of the network, making it particularly useful\n", | ||
"for analyzing weighted networks such as social networks with varying\n", | ||
"interaction strengths, biological networks with different interaction\n", | ||
"intensities, and transportation networks with different capacities. This\n", | ||
"approach enhances the ability to uncover meaningful community structures\n", | ||
"and understand the underlying processes governing the network.\n", | ||
"\n", | ||
"In this notebook, we will learn how to implement the WSBM approach using\n", | ||
"the `graph-tool` library in Python. We will use a synthetic network\n", | ||
"dataset to illustrate the key steps involved in inferring the modular\n", | ||
"structure of a weighted network using the WSBM.\n", | ||
"\n", | ||
"# Notebook setup\n", | ||
"\n", | ||
"We will use the `graph-tool` library for network analysis and\n", | ||
"visualization. If you haven’t already installed `graph-tool`, you can do\n", | ||
"so using the following command:\n", | ||
"\n", | ||
"``` {bash}\n", | ||
"conda activate %YOUR_ENVIRONMENT%\n", | ||
"conda install -c conda-forge graph-tool\n", | ||
"```\n", | ||
"\n", | ||
"Then, we can import the necessary libraries and set up the notebook\n", | ||
"environment:\n", | ||
"\n", | ||
"[1] Tiago P. Peixoto, “Nonparametric weighted stochastic block models”,\n", | ||
"Phys. Rev. E 97, 012306 (2018), DOI: 10.1103/PhysRevE.97.012306, arXiv:\n", | ||
"1708.01432\n", | ||
"\n", | ||
"[2] Paul W. Holland, Kathryn Blackmond Laskey, Samuel Leinhardt,\n", | ||
"“Stochastic blockmodels: First steps”, Social Networks Volume 5, Issue\n", | ||
"2, Pages 109-137 (1983). DOI: 10.1016/0378-8733(83)90021-7" | ||
], | ||
"id": "88ebbed3-99e0-4f8a-8724-df2e5749c1eb" | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from IPython.display import Image\n", | ||
"import numpy as np\n", | ||
"import matplotlib\n", | ||
"from graph_tool.all import *" | ||
], | ||
"id": "98f31231" | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Toy Dataset: ‘Food Web’ network\n", | ||
"\n", | ||
"To illustrate the WSBM approach, we will use a synthetic network dataset\n", | ||
"representing a food web. The food web network consists of different\n", | ||
"species (nodes) connected by interactions (edges) representing\n", | ||
"predator-prey relationships. The edge weights in the network represent\n", | ||
"the strength of these interactions, with higher weights indicating\n", | ||
"stronger relationships.[1] It is worth noticing that the edge values are\n", | ||
"positive $[0, \\infty]$.\n", | ||
"\n", | ||
"[1] Robert E. Ulanowicz, and Donald L. DeAngelis. “Network analysis of\n", | ||
"trophic dynamics in south florida ecosystems.” US Geological Survey\n", | ||
"Program on the South Florida Ecosystem 114 (2005)." | ||
], | ||
"id": "af7fc3a0-9558-49a4-8541-bae99c9dff3a" | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"metadata": { | ||
"fig-width": 300 | ||
}, | ||
"outputs": [ | ||
{ | ||
"output_type": "display_data", | ||
"metadata": {}, | ||
"data": {} | ||
} | ||
], | ||
"source": [ | ||
"g = collection.ns[\"foodweb_baywet\"]\n", | ||
"graph_draw(g, pos=g.vp._pos, output_size=(300, 300), output=\"foodweb.png\")\n", | ||
"Image(filename=\"foodweb.png\")" | ||
], | ||
"id": "cell-fig-foodweb" | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<a href=\"#fig-foodweb\" class=\"quarto-xref\">Figure 1</a> does not show a\n", | ||
"clear pattern. Let us see if we can uncover the underlying modular\n", | ||
"structure of this network using the WSBM approach.\n", | ||
"\n", | ||
"# Weighted Stochastic Blockmodeling (WSBM)\n", | ||
"\n", | ||
"The WSBM approach involves defining a likelihood function that accounts\n", | ||
"for the observed edge weights and optimizing this function to find the\n", | ||
"most likely partition of nodes into communities. As the following code\n", | ||
"cell shows, we can use the `graph-tool` library to implement the WSBM\n", | ||
"approach for inferring the modular structure of the food web network.\n", | ||
"Specifically, we will use the `minimize_nested_blockmodel_dl` function\n", | ||
"to fit the WSBM model to the network data. This function takes the\n", | ||
"following arguments in our example:\n", | ||
"\n", | ||
"- `g`: the input graph object representing the food web network\n", | ||
"- `state_args`: a dictionary containing the edge weights (`recs`) and\n", | ||
" their types (`rec_types`) for the model fitting process\n", | ||
"- `rec_types`: the type of edge weights, in this case,\n", | ||
" real-exponential – remember that the edge values are positive!!" | ||
], | ||
"id": "298a7461-9070-44d3-b31e-aa59fd055a6e" | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"metadata": { | ||
"fig-width": 300 | ||
}, | ||
"outputs": [ | ||
{ | ||
"output_type": "display_data", | ||
"metadata": {}, | ||
"data": {} | ||
} | ||
], | ||
"source": [ | ||
"# model fit\n", | ||
"state = minimize_nested_blockmodel_dl(\n", | ||
" g, state_args=dict(recs=[g.ep.weight], rec_types=[\"real-exponential\"])\n", | ||
")\n", | ||
"# improve solution with merge-split\n", | ||
"for i in range(100):\n", | ||
" ret = state.multiflip_mcmc_sweep(niter=10, beta=np.inf)\n", | ||
"state.draw(\n", | ||
" edge_color=prop_to_size(g.ep.weight, power=1, log=True),\n", | ||
" ecmap=(matplotlib.cm.inferno, 0.6),\n", | ||
" eorder=g.ep.weight,\n", | ||
" edge_pen_width=prop_to_size(g.ep.weight, 1, 4, power=1, log=True),\n", | ||
" edge_gradient=[],\n", | ||
" output_size=(300, 300),\n", | ||
" output=\"foodweb-wsbm.png\",\n", | ||
")\n", | ||
"# show the plot\n", | ||
"Image(filename=\"foodweb-wsbm.png\")" | ||
], | ||
"id": "cell-fig-foodweb-wsbm" | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<a href=\"#fig-foodweb-wsbm\" class=\"quarto-xref\">Figure 2</a> shows the\n", | ||
"modular structure of the food web network inferred using the WSBM\n", | ||
"approach. Quite evidently, this is a complex visualization:\n", | ||
"\n", | ||
"- The nodes are colored according to the community they belong to,\n", | ||
" with nodes of the same color representing the same community (that\n", | ||
" is, species dependent on the same resources)\n", | ||
"- The size of nodes is proportional to their degree, with larger nodes\n", | ||
" having more connections\n", | ||
"- The layout of the network is determined by the WSBM algorithm, which\n", | ||
" arranges nodes based on their community assignments and interaction\n", | ||
" strengths\n", | ||
"- The color gradient represents the edge weights, with lighter colors\n", | ||
" indicating higher weights\n", | ||
"- The set of square nodes in the center of the network represents the\n", | ||
" reduced form (i.e., simplified) version of the network\n", | ||
"- The square nodes are arranged in a hierarchical structure, with each\n", | ||
" level representing a different level of community organization. In\n", | ||
" other words the square nodes in the outer layer are nested in the\n", | ||
" square nodes in the inner layer." | ||
], | ||
"id": "bfe8cf1d-9391-4e83-9239-f5c88af26a9b" | ||
} | ||
], | ||
"nbformat": 4, | ||
"nbformat_minor": 5, | ||
"metadata": { | ||
"kernelspec": { | ||
"name": "python3", | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"path": "/home/simone/miniconda3/envs/smm638/share/jupyter/kernels/python3" | ||
} | ||
} | ||
} |
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
--- | ||
title: Inferring the Modular Structure of Networks with Weighted Stochastic Blockmodeling | ||
author: Simone Santoni | ||
date: last-modified | ||
abstract-title: Synopsis | ||
abstract: This notebook shows how to infer the modular structure of weighted networks using the Weighted Stochastic Blockmodeling (WSBM) approach. WSBM extends the concept of the Stochastic Blockmodel (SBM) to incorporate edge weights into the modeling process, allowing for a more detailed representation of the network's structure. We will use a synthetic network dataset to illustrate the key steps involved in inferring the modular structure of a weighted network using the WSBM approach. | ||
warning: false | ||
fig-cap-location: top | ||
format: | ||
html: | ||
code-fold: true | ||
code-tools: true | ||
toc: true | ||
toc-title: Table of Contents | ||
toc-depth: 2 | ||
toc-location: right | ||
number-sections: true | ||
citations-hover: false | ||
footnotes-hover: false | ||
crossrefs-hover: false | ||
theme: journal | ||
fig-width: 9 | ||
fig-height: 6 | ||
ipynb: default | ||
docx: default | ||
typst: | ||
number-sections: true | ||
df-print: paged | ||
#pdf: | ||
# documentclass: scrartcl | ||
# papersize: letter | ||
# papersize: letter | ||
--- | ||
|
||
# Overview: inferring the modular structure of networks | ||
|
||
The notebook [`community_detection.qmd`](https://github.com/simoneSantoni/net-analysis-smm638/blob/master/tutorials/networkModules/community_detection.qmd) shows how to infer the modular structure of weighted networks using Louvain's community-detection algorithm. The current notebook focuses on a different approach to the same problem: the weighted stochastic blockmodeling (WSBM). | ||
|
||
WSBM[^1] approach is an extension of the stochastic blockmodel (SBM) that incorporates edge weights into the modeling process.[^2] SBMs are probabilistic models used to detect community structures in networks by partitioning nodes into blocks or communities, where the probability of an edge existing between any two nodes depends only on the blocks to which the nodes belong. | ||
|
||
[^1]: Tiago P. Peixoto, “Nonparametric weighted stochastic block models”, Phys. Rev. E 97, 012306 (2018), DOI: 10.1103/PhysRevE.97.012306, arXiv: 1708.01432 | ||
|
||
[^2]: Paul W. Holland, Kathryn Blackmond Laskey, Samuel Leinhardt, “Stochastic blockmodels: First steps”, Social Networks Volume 5, Issue 2, Pages 109-137 (1983). DOI: 10.1016/0378-8733(83)90021-7 | ||
|
||
WSBM extends this concept by considering not just the presence or absence of edges, but also the weights of the edges, which can represent the strength or capacity of connections between nodes. This allows for a more nuanced understanding of the network's structure, capturing variations in connection strengths within and between communities. | ||
|
||
The WSBM approach involves defining a likelihood function that accounts for the observed edge weights and optimizing this function to find the most likely partition of nodes into communities. This is typically done using techniques such as the Expectation-Maximization (EM) algorithm or variational inference. | ||
|
||
By incorporating edge weights, WSBM provides a more detailed and accurate representation of the network, making it particularly useful for analyzing weighted networks such as social networks with varying interaction strengths, biological networks with different interaction intensities, and transportation networks with different capacities. This approach enhances the ability to uncover meaningful community structures and understand the underlying processes governing the network. | ||
|
||
In this notebook, we will learn how to implement the WSBM approach using the `graph-tool` library in Python. We will use a synthetic network dataset to illustrate the key steps involved in inferring the modular structure of a weighted network using the WSBM. | ||
|
||
# Notebook setup | ||
|
||
We will use the `graph-tool` library for network analysis and visualization. If you haven't already installed `graph-tool`, you can do so using the following command: | ||
|
||
```{bash} | ||
conda activate %YOUR_ENVIRONMENT% | ||
conda install -c conda-forge graph-tool | ||
``` | ||
|
||
Then, we can import the necessary libraries and set up the notebook environment: | ||
|
||
```{python} | ||
from IPython.display import Image | ||
import numpy as np | ||
import matplotlib | ||
from graph_tool.all import * | ||
``` | ||
|
||
# Toy Dataset: 'Food Web' network | ||
|
||
To illustrate the WSBM approach, we will use a synthetic network dataset representing a food web. The food web network consists of different species (nodes) connected by interactions (edges) representing predator-prey relationships. The edge weights in the network represent the strength of these interactions, with higher weights indicating stronger relationships.[^3] It is worth noticing that the edge values are positive $[0, \infty]$. | ||
|
||
[^3]: Robert E. Ulanowicz, and Donald L. DeAngelis. “Network analysis of trophic dynamics in south florida ecosystems.” US Geological Survey Program on the South Florida Ecosystem 114 (2005). | ||
|
||
```{python} | ||
#| fig-cap: Food Web Network | ||
#| fig-width: 300 | ||
#| fig-cap-location: margin | ||
#| label: fig-foodweb | ||
g = collection.ns["foodweb_baywet"] | ||
graph_draw(g, pos=g.vp._pos, output_size=(300, 300), output="foodweb.png") | ||
Image(filename="foodweb.png") | ||
``` | ||
|
||
@fig-foodweb does not show a clear pattern. Let us see if we can uncover the underlying modular structure of this network using the WSBM approach. | ||
|
||
# Weighted Stochastic Blockmodeling (WSBM) | ||
|
||
The WSBM approach involves defining a likelihood function that accounts for the observed edge weights and optimizing this function to find the most likely partition of nodes into communities. As the following code cell shows, we can use the `graph-tool` library to implement the WSBM approach for inferring the modular structure of the food web network. Specifically, we will use the `minimize_nested_blockmodel_dl` function to fit the WSBM model to the network data. This function takes the following arguments in our example: | ||
|
||
+ `g`: the input graph object representing the food web network | ||
+ `state_args`: a dictionary containing the edge weights (`recs`) and their types (`rec_types`) for the model fitting process | ||
+ `rec_types`: the type of edge weights, in this case, real-exponential -- remember that the edge values are positive!! | ||
|
||
```{python} | ||
#| fig-cap: Modular Structure of the Food Web Network Inferred Using WSBM | ||
#| fig-width: 300 | ||
#| fig-cap-location: margin | ||
#| label: fig-foodweb-wsbm | ||
# model fit | ||
state = minimize_nested_blockmodel_dl( | ||
g, state_args=dict(recs=[g.ep.weight], rec_types=["real-exponential"]) | ||
) | ||
# improve solution with merge-split | ||
for i in range(100): | ||
ret = state.multiflip_mcmc_sweep(niter=10, beta=np.inf) | ||
state.draw( | ||
edge_color=prop_to_size(g.ep.weight, power=1, log=True), | ||
ecmap=(matplotlib.cm.inferno, 0.6), | ||
eorder=g.ep.weight, | ||
edge_pen_width=prop_to_size(g.ep.weight, 1, 4, power=1, log=True), | ||
edge_gradient=[], | ||
output_size=(300, 300), | ||
output="foodweb-wsbm.png", | ||
) | ||
# show the plot | ||
Image(filename="foodweb-wsbm.png") | ||
``` | ||
|
||
@fig-foodweb-wsbm shows the modular structure of the food web network inferred using the WSBM approach. Quite evidently, this is a complex visualization: | ||
|
||
+ The nodes are colored according to the community they belong to, with nodes of the same color representing the same community (that is, species dependent on the same resources) | ||
+ The size of nodes is proportional to their degree, with larger nodes having more connections | ||
+ The layout of the network is determined by the WSBM algorithm, which arranges nodes based on their community assignments and interaction strengths | ||
+ The color gradient represents the edge weights, with lighter colors indicating higher weights | ||
+ The set of square nodes in the center of the network represents the reduced form (i.e., simplified) version of the network | ||
+ The square nodes are arranged in a hierarchical structure, with each level representing a different level of community organization. In other words the square nodes in the outer layer are nested in the square nodes in the inner layer. |
Binary file added
BIN
+286 KB
tutorials/networkModules/blockmodels_files/figure-html/fig-foodweb-output-1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added
BIN
+253 KB
...ials/networkModules/blockmodels_files/figure-html/fig-foodweb-wsbm-output-1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
12 changes: 12 additions & 0 deletions
12
...dules/blockmodels_files/libs/bootstrap/bootstrap-45eab9f33756b4204ced05762ced8738.min.css
Large diffs are not rendered by default.
Oops, something went wrong.
Oops, something went wrong.