DBSTREAM: multiple issues #1324

Closed
donny741 opened this issue Sep 24, 2023 · 5 comments
@donny741
Contributor

Hi, I'm new to Python and data science, and I'm currently exploring clustering algorithms. When comparing the DBSTREAM implementation with the original paper, I noticed some inconsistencies:

  • It seems the minus signs (-) before the fading_factor are missing here and here, which prevents clusters with low weight from being removed.
  • The length of the micro_clusters list is used as the key when creating a new cluster (here). This can overwrite existing clusters, because the length decreases when clusters are removed from the list (here).
  • The clustering_is_up_to_date variable introduced with this PR is never set to True.
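
The first two points can be reproduced in isolation. The snippet below is only a rough sketch and is not taken from the library's code: the faded_weight helper, its negate flag, and the toy dict of micro clusters are hypothetical names made up for illustration. It assumes the weight decays as 2 ** (-fading_factor * elapsed), as in the paper, and that micro clusters are stored under integer keys.

```python
def faded_weight(weight, fading_factor, elapsed, negate=True):
    """Fade a weight over `elapsed` time steps (hypothetical helper).

    With negate=True the exponent is -fading_factor * elapsed, so the
    weight shrinks. With negate=False the minus sign is missing and the
    weight grows instead of fading.
    """
    exponent = fading_factor * elapsed
    if negate:
        exponent = -exponent
    return weight * 2 ** exponent


# With the minus sign, an untouched cluster loses weight over time.
decayed = faded_weight(1.0, fading_factor=0.5, elapsed=2)

# Without it, the same cluster gains weight, so it never drops below
# a removal threshold.
grown = faded_weight(1.0, fading_factor=0.5, elapsed=2, negate=False)

# Key collision: using the current length as the key for a new cluster.
micro_clusters = {0: "a", 1: "b", 2: "c"}  # toy stand-ins for clusters
del micro_clusters[0]                      # a weak cluster is removed
new_key = len(micro_clusters)              # length is now 2, a key still in use
micro_clusters[new_key] = "d"              # silently overwrites cluster "c"
```

After this runs, decayed is smaller than the starting weight, grown is larger, and the cluster stored under key 2 has been replaced rather than a new entry being added.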
@MaxHalford
Member

Hey @hoanganhngo610, do you have time to take a look at this issue? :)

@hoanganhngo610
Contributor

@MaxHalford I will have a look at the issues by the end of this week. Also, thank you @donny741 for raising the issues!

@hoanganhngo610
Contributor

@donny741 @MaxHalford I have just opened a PR to fix the points that @donny741 raised in this issue. I hope the changes are satisfactory.

If you have some time, please do not hesitate to have a look through the PR and leave comments wherever necessary!

@MaxHalford
Member

MaxHalford commented Oct 9, 2023

Good job! Just merged the PR.

@donny741 I'll let you close the issue if you're satisfied

MaxHalford pushed a commit that referenced this issue Oct 9, 2023
* Add negative signs before fading_factor in the steps of Algorithm 2 of the paper by Hahsler and Bolaños, so that clusters with low weight can be removed.

* Set clustering_is_up_to_date to True every time the recluster function is called.

* Initialize each new micro cluster with a key based on the maximum key of the existing micro clusters, or 0 if there are no micro clusters yet.

* Add description to the UNRELEASED.md file
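
For readers following along, the key assignment and the flag update described in this commit message could look roughly like the sketch below. It is not the merged code: TinyClusterer and next_cluster_key are hypothetical names, and "based on the maximum key" is assumed here to mean the maximum key plus one.

```python
def next_cluster_key(micro_clusters):
    """Return a key that no existing micro cluster uses (hypothetical helper).

    Assumption: the new key is max(existing keys) + 1, or 0 when empty.
    """
    return max(micro_clusters) + 1 if micro_clusters else 0


class TinyClusterer:
    """Hypothetical stand-in for illustration, not the library's DBSTREAM class."""

    def __init__(self):
        self.micro_clusters = {}
        self.clustering_is_up_to_date = False

    def add(self, center):
        key = next_cluster_key(self.micro_clusters)
        self.micro_clusters[key] = center
        # New data makes any previous macro clustering stale.
        self.clustering_is_up_to_date = False
        return key

    def remove(self, key):
        del self.micro_clusters[key]
        self.clustering_is_up_to_date = False

    def recluster(self):
        # The actual reclustering work is omitted in this sketch.
        self.clustering_is_up_to_date = True


model = TinyClusterer()
keys = [model.add(c) for c in ("a", "b", "c")]  # keys 0, 1, 2
model.remove(0)                                  # a weak cluster is dropped
fresh_key = model.add("d")                       # max key 2, so the new key is 3
model.recluster()
```

Unlike a length-based key, max + 1 cannot collide with a surviving cluster, at the cost of keys no longer being contiguous after removals.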
@donny741
Contributor Author

donny741 commented Oct 9, 2023

Looks good! Thank you!

donny741 closed this as completed Oct 9, 2023