Speed-up kmeans via improved distance calculation #469

Closed
Cdebus opened this issue Jan 30, 2020 · 1 comment
Labels: enhancement (New feature or request), high-level functions (High-level machine-learning algorithms)

Cdebus commented Jan 30, 2020

Related
Experiments have shown kmeans clustering to be rather slow. The main issue is the calculation of the distance matrix, which is currently done via dimension expansion and a 3D difference calculation. We suspect this causes cache misses and adds substantial overhead to the calculation, thus slowing it down.
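
For illustration, the expansion-based approach described above might look roughly like the following in plain PyTorch. This is only a sketch and not the project's actual code: the function name `distances_via_expansion` and the tensor sizes are assumptions made for the example. The point it shows is that broadcasting materializes an intermediate of shape (n, k, d) before the reduction.

```python
import torch


def distances_via_expansion(x: torch.Tensor, centroids: torch.Tensor) -> torch.Tensor:
    """Pairwise Euclidean distances via broadcasting (hypothetical helper)."""
    # (n, 1, d) - (1, k, d) broadcasts to an (n, k, d) intermediate tensor.
    diff = x.unsqueeze(1) - centroids.unsqueeze(0)
    # Reduce over the feature axis to obtain the (n, k) distance matrix.
    return diff.pow(2).sum(dim=2).sqrt()


torch.manual_seed(0)
x = torch.rand(1000, 8)          # n = 1000 samples, d = 8 features (assumed sizes)
centroids = torch.rand(4, 8)     # k = 4 cluster centers (assumed)

dist = distances_via_expansion(x, centroids)
labels = dist.argmin(dim=1)      # kmeans assignment step: nearest centroid per sample
```

The memory footprint of the intermediate grows with n * k * d, which is one plausible reason such an approach slows down for larger inputs.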

Feature functionality
Torch offers a cdist(X, Y) function to calculate pairwise distances between all samples (rows) of two matrices. There are also some alternative approaches being discussed.
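
As a rough sketch of what this could look like, the snippet below calls `torch.cdist` and compares it with one commonly used alternative, the expansion ||x - y||^2 = ||x||^2 - 2 x·y + ||y||^2, which replaces the 3D intermediate with a matrix product. The helper names and tensor sizes are assumptions for illustration; whether this is the variant adopted in the project is not stated here.

```python
import torch


def distances_via_cdist(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Pairwise Euclidean distances using torch.cdist (hypothetical wrapper)."""
    return torch.cdist(x, y, p=2.0)


def distances_via_quadratic_expansion(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Pairwise Euclidean distances via ||x||^2 - 2 x.y + ||y||^2 (hypothetical helper)."""
    x_sq = x.pow(2).sum(dim=1, keepdim=True)   # shape (n, 1)
    y_sq = y.pow(2).sum(dim=1).unsqueeze(0)    # shape (1, k)
    sq = x_sq - 2.0 * (x @ y.t()) + y_sq       # shape (n, k), no (n, k, d) intermediate
    # Rounding can produce tiny negative values, so clamp before the square root.
    return sq.clamp_min(0.0).sqrt()


torch.manual_seed(0)
x = torch.rand(1000, 8, dtype=torch.float64)   # assumed sample matrix
y = torch.rand(4, 8, dtype=torch.float64)      # assumed centroid matrix

d_cdist = distances_via_cdist(x, y)
d_quad = distances_via_quadratic_expansion(x, y)
same = torch.allclose(d_cdist, d_quad, atol=1e-8)
```

The quadratic expansion can lose precision when points are very close together, so the clamp is a defensive measure rather than a full fix.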

Additional context
pytorch/pytorch#15253
https://discuss.pytorch.org/t/efficient-distance-matrix-computation/9065

Cdebus self-assigned this Jan 30, 2020
Cdebus added the enhancement and high-level functions labels Jan 30, 2020
Cdebus mentioned this issue Jan 30, 2020
Cdebus commented Feb 11, 2020

This was taken care of with PR #470 and the extension #479.

Cdebus closed this as completed Feb 11, 2020