Features/469 kmeans rework #470

Merged

Contributor

@Cdebus Cdebus commented Jan 30, 2020

Description

Implementation of cdist function in heat/spatial/distances.py module for accelerated distance matrix computation
Rework of kmeans to speed up computation
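
As a rough illustration of the quadratic expansion mentioned in this PR, the sketch below computes pairwise Euclidean distances from the identity |x - y|^2 = |x|^2 - 2 x·y + |y|^2, next to a baseline that expands both inputs to 3D. It uses plain NumPy on a single process; the function names `cdist_quadratic` and `cdist_broadcast` are hypothetical and this is not the code in heat/spatial/distances.py.

```python
import numpy as np


def cdist_quadratic(x, y):
    """Pairwise Euclidean distances via |x|^2 - 2 x.y + |y|^2 (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    x_sq = np.sum(x * x, axis=1, keepdims=True)    # shape (n, 1)
    y_sq = np.sum(y * y, axis=1)[np.newaxis, :]    # shape (1, m)
    d2 = x_sq - 2.0 * (x @ y.T) + y_sq             # shape (n, m)
    # Round-off can leave tiny negative values; clamp before the square root.
    np.maximum(d2, 0.0, out=d2)
    return np.sqrt(d2)


def cdist_broadcast(x, y):
    """Baseline: expand to a 3D (n, m, f) difference tensor, then reduce."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    diff = x[:, np.newaxis, :] - y[np.newaxis, :, :]
    return np.sqrt(np.sum(diff * diff, axis=2))
```

The expansion replaces the (n, m, f) intermediate with one matrix product and two vectors of squared norms, which is usually where the speed-up comes from. The trade-off is some loss of precision from cancellation, hence the clamp.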

Issue/s resolved: #467 and #469

Changes proposed:

  • Module spatial/distances with function cdist
  • heat.cdist computes distances either with torch.cdist or via quadratic expansion (experimental!)
  • Kmeans uses cdist in the kmeans++ initialization and the Lloyd iterations
  • removed dimension expansion to 3D in kmeans (redundant)
  • still requires proper documentation
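
To show how a distance matrix can drive both the kmeans++ initialization and the Lloyd iterations without a 3D expansion, here is a minimal single-process NumPy sketch. All names (`pairwise_dist`, `kmeanspp_init`, `lloyd`) are hypothetical, and the code does not reproduce heat's KMeans class or its handling of split configurations.

```python
import numpy as np


def pairwise_dist(x, y):
    """(n, m) Euclidean distance matrix via quadratic expansion."""
    x_sq = np.sum(x * x, axis=1, keepdims=True)
    y_sq = np.sum(y * y, axis=1)[np.newaxis, :]
    return np.sqrt(np.maximum(x_sq - 2.0 * (x @ y.T) + y_sq, 0.0))


def kmeanspp_init(x, k, rng):
    """kmeans++ seeding: sample each new center proportionally to the
    squared distance to the nearest center chosen so far."""
    n = x.shape[0]
    centers = [x[rng.integers(n)]]
    for _ in range(1, k):
        d2 = pairwise_dist(x, np.stack(centers)).min(axis=1) ** 2
        total = d2.sum()
        if total == 0.0:
            # Every point coincides with a chosen center; fall back to uniform.
            idx = rng.integers(n)
        else:
            idx = rng.choice(n, p=d2 / total)
        centers.append(x[idx])
    return np.stack(centers)


def lloyd(x, k, n_iter=50, seed=0):
    """Lloyd iterations: assign each point to its nearest center, then
    move each center to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=np.float64)
    centers = kmeanspp_init(x, k, rng)
    for _ in range(n_iter):
        labels = pairwise_dist(x, centers).argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center for empty clusters
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Recompute labels so they match the returned centers.
    labels = pairwise_dist(x, centers).argmin(axis=1)
    return centers, labels
```

A call such as `centers, labels = lloyd(points, k=2)` returns the final centers and one label per row of `points`. The assignment step only needs the (n, k) distance matrix, which is why the 3D intermediate is not required.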

Type of change

  • Optimization

Due Diligence

  • All split configurations tested
  • Multiple dtypes tested in relevant functions
  • Documentation updated (if needed)
  • Updated changelog.md under the title "Pending Additions"

Does this change modify the behaviour of other functions? If so, which?

no

- needs further development for different split settings

Adjustment of kmeans to speed up distance matrix computation

codecov bot commented Feb 3, 2020

Codecov Report

❗ No coverage uploaded for pull request base (master@9d9ebba).
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master     #470   +/-   ##
=========================================
  Coverage          ?   96.71%           
=========================================
  Files             ?       60           
  Lines             ?    12255           
  Branches          ?        0           
=========================================
  Hits              ?    11853           
  Misses            ?      402           
  Partials          ?        0
Impacted Files                          Coverage Δ
heat/cluster/kmeans.py                  82.22% <ø> (ø)
heat/cluster/tests/test_kmeans.py       84.37% <100%> (ø)
heat/spatial/tests/test_distances.py    90.19% <100%> (ø)
heat/spatial/distances.py               100% <100%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9d9ebba...e6ad627.

Member

@coquelin77 coquelin77 left a comment

can you please update the changelog?

heat/spatial/tests/test_distances.py (review thread resolved)
@coquelin77 coquelin77 merged commit 9ff98ef into helmholtz-analytics:master Feb 6, 2020
@Cdebus Cdebus deleted the features/469-KmeansRework branch April 7, 2020 11:06
Development

Successfully merging this pull request may close these issues.

Move "cluster" submodule under heat
2 participants