Skip to content
This repository has been archived by the owner on Sep 26, 2023. It is now read-only.

infinite loop occured while using kmeans on a dataset with 700 datapoints #3

Open
xmen4u opened this issue Sep 21, 2012 · 4 comments
Open

Comments

@xmen4u
Copy link

xmen4u commented Sep 21, 2012

I'd love to give more details . movement is always true... (inside the loop it ofcourse false, but then changes to true)...

not sure how to fix it, really problematic.
help is needed.
It might be because of a local minima point. I've added an iterations limit to the while, so while(movement && (iterations < max_iteratoins)){ ... }

this isn't a good solution obviously, but im not familiar with the algorithm
http://blog.endava.com/k-means-clustering-algorithm
" However, in this form, there is a risk to get stuck in a local minima. By local minima I mean the local minima of the cost function:"

@harthur
Copy link
Owner

harthur commented Sep 21, 2012

Yeah, it sounds like it's not converging. A max limit based on the number of items would be good. Also try using a different k.

@xmen4u
Copy link
Author

xmen4u commented Sep 21, 2012

Could you please elaborate more, with code concerning the "max limit based on the number of items"? what items? points? different k as in different k random points right? or different k as in different number of clusters?

tnx

P.S. if you'll notice in the code i've linked above, he has 2 guards, one uses epsilon , the other he iterates 100 the whole process , just not to fall into a minima.

@harthur
Copy link
Owner

harthur commented Feb 20, 2013

Sorry, just came across this again! I meant k as in number of clusters, should have specified. The limit would be on the number of iterations. In the k-means algorithm there are iterations, in each iteration, every point finds the best cluster for it, and this process stops when there's an iteration with no movement.

@irony
Copy link

irony commented May 13, 2014

I've found that if any of the points are undefined, the process stops in an endless loop, maybe this was what happened here. It could be solved by doing a check for undefined in distance.js:

       total += Math.pow(v2[i] || 0 - v1[i] || 0 , 2);      

instead of

   total += Math.pow(v2[i] - v1[i] , 2);      

and the same for manhattan and max.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants