Northwestern Research Computing Services - Distance and similarity - (Python) Workshop

General information

Clustering samples correctly requires an appropriate measure of similarity. We will explore several ways to compute similarity (or distance) and look at which ones suit which data types. Then, we will use hierarchical clustering to group data points. By the end, you will have made a clustering dendrogram.
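As a rough preview of that workflow, here is a minimal sketch, not the workshop notebook itself. The toy data, the labels, and the choice of Euclidean distance with average linkage are all assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster, dendrogram

# Hypothetical toy data: 6 samples with 2 features, forming two visible groups.
samples = np.array([
    [1.0, 1.0],
    [1.2, 0.9],
    [0.8, 1.1],
    [5.0, 5.0],
    [5.2, 4.9],
    [4.8, 5.1],
])
labels = ["A", "B", "C", "D", "E", "F"]

# Pairwise distances in condensed form (one value per unordered pair).
condensed = pdist(samples, metric="euclidean")

# The same distances as a square, symmetric matrix, which is easier to read.
square = squareform(condensed)

# Hierarchical (agglomerative) clustering on the condensed distances.
tree = linkage(condensed, method="average")

# Cut the tree into two flat groups.
groups = fcluster(tree, t=2, criterion="maxclust")

# Draw the dendrogram on a matplotlib axis.
fig, ax = plt.subplots()
result = dendrogram(tree, labels=labels, ax=ax)
ax.set_ylabel("distance")
```

In a Jupyter notebook the figure is displayed when the cell finishes; in a plain script you would call `plt.show()` or `fig.savefig(...)` afterwards.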

Preworkshop setup

Please have a way to run a Jupyter notebook.

Run on Google Colab

One option is to use Google Colab. If you have access, then you are done with the pre-work!

Run on your computer

Another option is to run the Jupyter notebook locally. Please install Anaconda or conda. The Python packages used in this demonstration are: numpy, scipy, matplotlib, and jupyter.

You can also create a new environment named "dataanalysis" with the necessary Python packages. In your terminal:

conda create -n dataanalysis python=3.10 numpy scipy matplotlib jupyter

The terminal will show the environment being created, including the download of these Python packages. For more detailed information, see the conda documentation on managing environments.

To enter the environment:

conda activate dataanalysis

To exit the environment:

conda deactivate

The day-of

Run on Google Colab

Make a copy of the linked Colab notebook so that you can edit it!

Run on your computer

Please enter the environment and launch Jupyter Notebook. In the terminal:

conda activate dataanalysis
jupyter notebook
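Once the notebook is open, one way to confirm the environment is set up is to run a cell like the following. This is only a suggested sanity check, not part of the workshop material; it imports the three analysis packages listed above and prints their versions.

```python
import importlib

# The analysis packages named in the setup instructions.
required = ["numpy", "scipy", "matplotlib"]

versions = {}
for name in required:
    # import_module raises ImportError if the package is missing.
    module = importlib.import_module(name)
    versions[name] = module.__version__
    print(f"{name} {module.__version__}")
```

If any import fails, check that the "dataanalysis" environment is active in the terminal that launched the notebook.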

References

https://docs.scipy.org/doc/scipy/reference/spatial.distance.html
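The module documented at the link above lets you switch distance measures through the `metric` argument of `pdist`. The sketch below shows how the choice can depend on data type; the arrays are made-up examples, and the pairing of metrics with data types is a common convention rather than a rule taken from the workshop.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Hypothetical continuous data: rows 0 and 1 point in the same direction,
# row 2 is perpendicular to row 0.
continuous = np.array([
    [3.0, 4.0],
    [6.0, 8.0],
    [4.0, -3.0],
])

# Straight-line distance.
euclid = squareform(pdist(continuous, metric="euclidean"))

# Sum of absolute coordinate differences.
city = squareform(pdist(continuous, metric="cityblock"))

# 1 - cosine similarity: depends on direction only, not on magnitude.
cosine = squareform(pdist(continuous, metric="cosine"))

# Hypothetical binary (presence/absence) data.
binary = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
], dtype=bool)

# Jaccard distance: fraction of positions that disagree among the
# positions where at least one of the two rows is True.
jacc = squareform(pdist(binary, metric="jaccard"))

print(euclid[0, 1], city[0, 1], cosine[0, 1], jacc[0, 1])
```

Rows 0 and 1 of the continuous data are far apart by Euclidean distance but have a cosine distance of approximately zero, which shows how two metrics can disagree about the same pair of samples.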
