Inference of the CLIP model in pure C/C++

- Plain C/C++ implementation based on ggml
- OpenAI CLIP model support
  - openai-clip-vit-base-patch32
  - openai-clip-vit-large-patch14
  - openai-clip-vit-large-patch14-336
  - openai-clip-vit-base-patch16
- OFA-Sys Chinese-CLIP support
  - ofasys_chinese_clip_vit_huge_patch14
  - ofasys_chinese_clip_vit_large_patch14_336
  - ofasys_chinese_clip_vit_base_patch16
  - ofasys_chinese_clip_vit_large_patch14
- 16-bit and 32-bit float support
- 8-bit integer quantization support
- Accelerated, memory-efficient CPU inference
- AVX, AVX2 and AVX-512 support for x86 architectures
- Metal backend for GPU acceleration
- CUDA backend for GPU acceleration
- Supported platforms
  - Linux
  - macOS
Download the original weights (.bin or .safetensors) for the model you want to run; a fetch example follows the list.

- OpenAI CLIP
  - openai/clip-vit-base-patch32 from https://huggingface.co/openai/clip-vit-base-patch32/tree/main
  - openai/clip-vit-large-patch14 from https://huggingface.co/openai/clip-vit-large-patch14/tree/main
  - openai/clip-vit-large-patch14-336 from https://huggingface.co/openai/clip-vit-large-patch14-336/tree/main
  - openai/clip-vit-base-patch16 from https://huggingface.co/openai/clip-vit-base-patch16/tree/main
- OFA-Sys Chinese-CLIP
  - ofasys_chinese_clip_vit_huge_patch14 from https://huggingface.co/OFA-Sys/chinese-clip-vit-huge-patch14
  - ofasys_chinese_clip_vit_large_patch14_336 from https://huggingface.co/OFA-Sys/chinese-clip-vit-large-patch14-336px
  - ofasys_chinese_clip_vit_base_patch16 from https://huggingface.co/OFA-Sys/chinese-clip-vit-base-patch16
  - ofasys_chinese_clip_vit_large_patch14 from https://huggingface.co/OFA-Sys/chinese-clip-vit-large-patch14
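If you have git-lfs installed, one way to fetch a checkpoint is to clone the Hugging Face repository; the target directory below is chosen to match the path used by the conversion command later in this README:

```sh
git lfs install
# clone the weights into a local clip_model/ directory
git clone https://huggingface.co/openai/clip-vit-base-patch32 clip_model/openai-clip-vit-base-patch32
```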
Build with CMake. The three cmake invocations below are alternatives; pick the backend you need.

```sh
mkdir build
cd build

# Metal backend (GPU acceleration on Apple hardware)
cmake .. -DGGML_METAL=ON
make -j4

# or: CUDA backend (GPU acceleration on NVIDIA hardware)
cmake .. -DGGML_CUDA=ON
make -j4

# or: CPU only
cmake ..
make -j4
```
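Once built, the binary's options can be listed with the documented help flag (run from the repository root, as in the examples below):

```sh
./build/clip -h
```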
```
Usage: ./clip [options]

Options:
  -h, --help: Show this message and exit
  -m <path>, --model <path>: path to the model file
  --model_version: the CLIP model version, one of [openai_clip_vit_base_patch32, openai_clip_vit_large_patch14, openai_clip_vit_large_patch14_336, ...] (any model version listed above)
  --mode: must be one of [zeros_shot_image_classification, text_search_image]; default = text_search_image
  --model_type: weight type to use, chosen from [f32, f16, q8_0]; default = GGML_TYPE_Q8_0
  --image_path <path>: path to an image file for zeros_shot_image_classification
  --label_path <path>: txt file with candidate labels for zeros_shot_image_classification
  --text <text>: the query text for text_search_image
  --img_directory <path>: the image directory for text_search_image
```
Convert the downloaded weights to GGUF with the conversion script:

```sh
python3 convert_model/convert.py -model_dir ../clip_model/openai-clip-vit-base-patch32/ -convert_type f16 -output_dir ./
```
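The OFA-Sys checkpoints can presumably be converted the same way; the -model_dir below is an assumed local download path:

```sh
# -model_dir is an assumed path to the downloaded OFA-Sys weights
python3 convert_model/convert.py -model_dir ../clip_model/chinese-clip-vit-huge-patch14/ -convert_type f16 -output_dir ./
```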
Zero-shot image classification:

```sh
# for openai clip
./build/clip -m convert_model/patch32_clip_f16.gguf --model_version openai_clip_vit_base_patch32 --mode zeros_shot_image_classification --model_type q8_0 --image_path example/images/apple.jpeg --label_path example/label_english.txt

# for chinese clip
./build/clip -m chinese_clip_vit_huge_patch14.gguf --model_version ofasys_chinese_clip_vit_huge_patch14 --mode zeros_shot_image_classification --model_type q8_0 --image_path example/images/apple.jpeg --label_path example/label_chinese.txt
```

Example output of the Chinese-CLIP run (the labels read "a photo of an apple", "a photo of a pear", "a photo of a blue sky"):

```
label:example/images/apple.jpeg
苹果的照片 :0.311196
梨的照片 :0.219238
蓝色天空的照片 :0.207216
```
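Judging from the output above, the label file simply lists one candidate caption per line; example/label_chinese.txt presumably contains the three captions that were scored:

```
苹果的照片
梨的照片
蓝色天空的照片
```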
Text-to-image search over a directory:

```sh
# openai clip
./build/clip -m openai_clip_vit_base_patch32.gguf --model_version openai_clip_vit_base_patch32 --mode text_search_image --model_type q8_0 --img_directory example/images --text "a photo of apple"

# chinese clip
./build/clip -m chinese_clip_vit_large_patch14_336.gguf --model_version ofasys_chinese_clip_vit_large_patch14_336 --mode text_search_image --model_type q8_0 --img_directory example/images --text "男孩照片"
```

Example output of the Chinese-CLIP run (the query "男孩照片" means "photo of a boy"):

```
label:男孩照片
example/images/boy.jpeg :0.270299
example/images/cat.jpeg :0.256663
example/images/panda.jpg :0.246492
```