Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Engine: Stochastic Block Model (sbm) #189

Open
wants to merge 10 commits into
base: dev
Choose a base branch
from
Open

Conversation

sadrasabouri
Copy link
Collaborator

What does this implement/fix? Explain your changes.

This PR added a new engine: stochastic block model.
This generation model partitions n vertices into k set: ${C_1, C_2, \cdots, C_k}$, then a matrix $P_{k \times k}$ would assign the probability of existence of an edge between $v_i$ and $v_j$ with probability $P_{ij}$.

Also some minor edits for prior engines have been applied.

Any other comments?

Two points that need to be clarified:

  1. Due to the technical debt from the prior inputting system implementation I couldn't add validation for block_sizes and probability_matrix. Let me know what we should do.
  2. I implemented self-loop for this engine but we should note that the probability of self-loop is $P_{ii}$ for $v_i$ as well. If you think that's not appropriate we can remove it.

Copy link
Owner

@sepandhaghighi sepandhaghighi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sadrasabouri Thanks for your efforts 💯
I believe that obtaining probability_matrix and block_sizes directly via the CLI is not advisable due to size issue! Instead, we can prompt the user for parameters such as intra_probability, inter_probability, and number_of_blocks. Based on these inputs, we can generate the probability_matrix and block_sizes, restricting custom edits to the config mode.

Copy link

codecov bot commented Dec 14, 2024

Codecov Report

Attention: Patch coverage is 78.78788% with 14 lines in your changes missing coverage. Please review.

Project coverage is 96.87%. Comparing base (44afa5f) to head (5872946).
Report is 72 commits behind head on dev.

Files with missing lines Patch % Lines
pyrgg/functions.py 36.37% 12 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #189      +/-   ##
==========================================
- Coverage   98.33%   96.87%   -1.46%     
==========================================
  Files           4        8       +4     
  Lines         419      637     +218     
  Branches       68       91      +23     
==========================================
+ Hits          412      617     +205     
- Misses          4       15      +11     
- Partials        3        5       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sadrasabouri
Copy link
Collaborator Author

Hey @sepandhaghighi, I applied the changes. This PR needed an immediate post-merge PR which takes into consideration the following points:

  • Add new tests for warnings to increase the coverage
  • Modify save/load config to make it more engine-specific. We should only save engine-related information in the config file, maybe change the extension to represent the engine, and check for the formatting. An example of a config file now generated for SBM model is presented bellow.
{
  "file_name": "asd",
  "number_of_files": 1,
  "vertices": 10,
  "max_weight": 1,
  "min_weight": 1,
  "min_edge": 0,
  "max_edge": 0,
  "edge_number": 0,
  "sign": true,
  "output_format": "JSON",
  "weight": true,
  "engine": "sbm",
  "direct": false,
  "self_loop": false,
  "multigraph": false,
  "config": true,
  "probability": 0.5,
  "blocks": 3,
  "inter_probability": 0.1,
  "intra_probability": 0.8,
  "block_sizes": [
    3,
    3,
    4
  ],
  "probability_matrix": [
    [
      0.9,
      0,
      0
    ],
    [
      0,
      0.9,
      0
    ],
    [
      0,
      0,
      0.9
    ]
  ],
  "pyrgg_version": "1.6"
}

Copy link
Owner

@sepandhaghighi sepandhaghighi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sadrasabouri Thanks for your efforts 💯
I believe we can eliminate self_loop and direct from the inputs. These parameters can both be managed through the probability_matrix.

pyrgg/params.py Outdated
@@ -142,6 +154,10 @@

PYRGG_CONFIG_SAVE_ERROR_MESSAGE = "[Error] Failed to save config file!"

PYRGG_UNDIVISIBLE_WARNING_MESSAGE = "Warning : Vertices are not divisible by blocks. The last block will have the remaining vertices."
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Warning] Vertices ...

pyrgg/params.py Outdated
@@ -142,6 +154,10 @@

PYRGG_CONFIG_SAVE_ERROR_MESSAGE = "[Error] Failed to save config file!"

PYRGG_UNDIVISIBLE_WARNING_MESSAGE = "Warning : Vertices are not divisible by blocks. The last block will have the remaining vertices."

PYRGG_SBM_WARNING_MESSAGE = "[Warning] - In CLI mode, Stochastic Block Model gets the number of blocks and inter/intra probabilities. To get more detailed configuration, please save and edit the config file."
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Remove In CLI mode,
  2. [Warning] Stochastic ...

@sadrasabouri
Copy link
Collaborator Author

@sadrasabouri Thanks for your efforts 💯 I believe we can eliminate self_loop and direct from the inputs. These parameters can both be managed through the probability_matrix.

I agree with removing direct, but that's not the case for the self_loop. probability_matrix identifies the probability of blocks, not nodes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants