
Materials for Computing distance (or similarity) for different data types workshop


nuitrcs/python_computing_distance

 
 

Northwestern Research Computing Services - Distance and similarity - (Python) Workshop

General information

Clustering samples correctly requires accurate measures of similarity. Depending on your data type, we will explore several ways to compute similarity (or distance). Then, we will use hierarchical clustering to group data points. At the end, you will make a clustering dendrogram.
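As a rough preview of that workflow, the sketch below computes distances for two kinds of data and then clusters the numeric samples. The toy arrays and variable names are made up for illustration and are not taken from the workshop notebook; the metric and linkage choices are assumptions, not the workshop's prescribed settings.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import hamming, jaccard, pdist, squareform

# Toy numeric data (illustrative only): 4 samples, 2 features.
samples = np.array([
    [0.0, 0.0],
    [0.0, 1.0],
    [5.0, 5.0],
    [5.0, 6.0],
])
labels = ["a", "b", "c", "d"]

# Pairwise Euclidean distances in condensed form, then as a square matrix.
condensed = pdist(samples, metric="euclidean")
distance_matrix = squareform(condensed)

# Binary data usually calls for a different metric than numeric data.
u = np.array([1, 0, 0, 1, 0], dtype=bool)
v = np.array([1, 1, 0, 0, 0], dtype=bool)
hamming_distance = hamming(u, v)  # fraction of positions that differ
jaccard_distance = jaccard(u, v)  # differing positions / positions where either is True

# Average-linkage hierarchical clustering on the numeric distances.
merge_tree = linkage(condensed, method="average")

# no_plot=True returns the dendrogram layout without drawing anything.
tree = dendrogram(merge_tree, labels=labels, no_plot=True)
print(tree["ivl"])  # leaf labels in dendrogram order
```

To draw the figure in a notebook, drop `no_plot=True` and call `matplotlib.pyplot.show()` afterwards.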

Preworkshop setup

Please have a way to run a Jupyter notebook.

Run on Google Colab

One option is to use Google Colab. If you have access, then you are done with the pre-work!

Run on your computer

Another option is to run the Jupyter notebook locally. Please install Anaconda or conda. The Python packages used in this demonstration are numpy, scipy, matplotlib, and jupyter.

You can also create a new environment named "dataanalysis" with the necessary Python packages. In your terminal:

conda create -n dataanalysis python=3.10 numpy scipy matplotlib jupyter

The terminal will show the creation of the environment, including the download of these Python packages. For more detailed information about conda environments, see the conda documentation.

To enter the environment:

conda activate dataanalysis

To exit the environment:

conda deactivate

The day-of

Run on Google Colab

Make a copy of the linked Colab notebook so that you can edit it!

Run on your computer

Please enter the environment and launch Jupyter Notebook. In the terminal:

conda activate dataanalysis
jupyter notebook
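Once the notebook is open, a first cell along these lines can confirm that the packages are visible from the active environment. This is only a suggested check, not part of the workshop materials; the `packages` list and `availability` dictionary are names chosen here for illustration.

```python
import importlib.util

# Packages the workshop notebook relies on (jupyter itself is already running).
packages = ["numpy", "scipy", "matplotlib"]

# find_spec returns None when a top-level package cannot be located.
availability = {name: importlib.util.find_spec(name) is not None for name in packages}

for name, found in availability.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If anything is reported as missing, the notebook was probably launched outside the "dataanalysis" environment.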

References

https://docs.scipy.org/doc/scipy/reference/spatial.distance.html
