Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Move Louvain clustering from prototypes to core #3111

Merged
merged 14 commits into from
Aug 2, 2018
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
Louvain Clustering
=======
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to extend the signs to encompass at minimum the entire title.


Groups items using the Louvain clustering algorithm.

Inputs
Data
input dataset

Outputs
Data
dataset with cluster index as a class attribute
Graph (with the Network addon)
the weighted k-nearest neighbor graph


The widget first converts the input data into a k-nearest neighbor graph. To preserve the notions of distance, the Jaccard index for the number of shared neighbors is used to weight the edges. Finally, a `modularity optimization <https://en.wikipedia.org/wiki/Louvain_Modularity>`_ community detection algorithm is applied to the graph to retrieve clusters of highly interconnected nodes. The widget outputs a new dataset in which the cluster index is used as a meta attribute.


.. figure:: images/Louvain-stamped.png

1. PCA processing is typically be applied to the original data to remove noise.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typically be applied

2. The distance metric is used for finding specified number of nearest
neighbors.
3. The number of nearest neighbors to use to form the KNN graph.
4. Resolution is a parameter for the Louvain community detection algorithm that
affects the size of the recovered clusters. Smaller resolutions recover
smaller, and therefore a larger number of clusters, and conversely, larger
values recover clusters containing more data points.
5. When *Apply Automatically* is ticked, the widget will automatically
communicate all changes. Alternatively, click *Apply*.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the bottom it is great to provide a minimum workflow for the widget. Or a cool way how to use it, preferably on a simple data set that exists in Orange.

References
----------

Blondel, Vincent D., et al. "Fast unfolding of communities in large networks." Journal of statistical mechanics: theory and experiment 2008.10 (2008): P10008.

Lambiotte, Renaud, J-C. Delvenne, and Mauricio Barahona. "Laplacian dynamics and multiscale modular structure in networks." arXiv preprint arXiv:0812.1770 (2008).