A collection of useful scripts that show the potential of OpenAI's powerful multimodal model CLIP

Halvani/CLIP_Playground


CLIP Playground

A collection of useful Jupyter notebooks that show the potential of OpenAI's powerful multimodal model CLIP

CLIP-based image categorizer

Given a folder of unsorted images, the CLIP-based image categorizer Jupyter notebook categorizes all of them according to the labels you specify. For example, imagine you have a folder with a bunch of photos of 🐱 and 🐸. You first specify the path to this folder and then define the labels by which you want to categorize the photos (in this case, labels = ["cat", "frog"]). Depending on your hardware (a GPU is recommended) and the number of photos, the categorization process may take a while. The result looks similar to the following:

[Image: example categorization result]


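For orientation, the categorization step described above could be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: it assumes the Hugging Face `transformers` CLIP classes and the `openai/clip-vit-large-patch14` checkpoint, and the helper names (`list_images`, `categorize`), the prompt template, and the list of file suffixes are hypothetical choices made for this example.

```python
from pathlib import Path

# Assumption: which file suffixes count as images (not taken from the notebook).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def list_images(folder):
    """Return the image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def categorize(folder, labels, model_name="openai/clip-vit-large-patch14"):
    """Map each label to the file names of the images CLIP scores highest for it."""
    # Heavy dependencies are imported here so that list_images() stays usable
    # without torch, Pillow, or transformers installed.
    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = CLIPModel.from_pretrained(model_name).to(device).eval()
    processor = CLIPProcessor.from_pretrained(model_name)

    # Assumption: a simple English prompt template per label.
    prompts = [f"a photo of a {label}" for label in labels]
    result = {label: [] for label in labels}

    for path in list_images(folder):
        with Image.open(path) as img:
            image = img.convert("RGB")
        inputs = processor(
            text=prompts, images=image, return_tensors="pt", padding=True
        ).to(device)
        with torch.no_grad():
            # Image-text similarity scores, shape (1, len(labels)).
            logits = model(**inputs).logits_per_image
        best = int(logits.argmax(dim=-1))
        result[labels[best]].append(path.name)
    return result
```

Calling `categorize("path/to/photos", ["cat", "frog"])` (with a placeholder path) would return a dictionary mapping each label to a list of file names. The first call downloads the model weights, so expect a delay before the first image is processed.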
As impressive as the result of automatic categorization is, there are of course some limitations here:

  • The main limitation is that the quality of the categorization depends on the underlying model. Consequently, a weak model leads to poor results. Currently, the image categorizer uses the ViT-L/14 model, which has given the most reliable results in my experiments so far. You can find other models in OpenAI's 🤗 (Hugging Face) repo.

  • In addition, the image categorizer accepts only English text as input, since the underlying ViT-L/14 model has been trained only on such text. You can replace it with the multilingual clip-ViT-B-32 model, but this comes at the expense of performance.

  • Another limitation concerns the way you define the labels. If they do not adequately represent the images you provide, images can be categorized incorrectly, because every image is assigned to one of the given labels. In other words, there is currently no support for a garbage class (a catch-all for images that match none of the labels).
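One possible workaround for the missing garbage class, which is not part of the notebook, is to reject low-confidence predictions. The sketch below turns the per-label similarity scores for one image into probabilities with a softmax and falls back to a catch-all label when the best probability is below a threshold. The function names, the `min_prob` value, and the `"other"` label are hypothetical and would need tuning on your own images.

```python
import math


def softmax(scores):
    """Convert raw similarity scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def assign_label(scores, labels, min_prob=0.8, fallback="other"):
    """Return the best label, or `fallback` if the model is not confident enough."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best] if probs[best] >= min_prob else fallback


print(assign_label([25.0, 18.0], ["cat", "frog"]))  # clear winner
print(assign_label([20.0, 19.9], ["cat", "frog"]))  # nearly tied scores
```

Note that the softmax is computed only over the labels you supply, so this catches ambiguous images rather than all unrelated ones: a photo of something else entirely can still score much higher for one label than for the others. Adding an explicit catch-all prompt to the label list is another option worth trying.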

Happy categorizing!
