This repository contains the code supporting the BioCLIP base model for use with Autodistill.
BioCLIP is a CLIP model trained on TreeOfLife-10M, a dataset created by the researchers who made BioCLIP. The training dataset included more than 450,000 classes.
You can use BioCLIP to auto-label natural organisms (e.g. animals, plants) in images for use in training a classification model. You can combine this model with a grounded detection model to identify the exact region in which a given class is present in an image. Learn more about combining models with Autodistill.
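The detect-then-classify combination described above can be sketched in plain Python. This is only an illustration of the data flow, not Autodistill's API: `crop`, `detect_then_classify`, `fake_detect`, and `fake_classify` are hypothetical names, and the image is a toy grid of numbers standing in for real pixels. In practice the detector would be a grounded detection model and the classifier would be BioCLIP.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
Image = List[List[int]]          # toy image: rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Cut a rectangular region out of an image stored as rows of pixels."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]


def detect_then_classify(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Run a detector, then classify each detected region separately."""
    return [(box, classify(crop(image, box))) for box in detect(image)]


# Hypothetical stand-ins for the two models, so the sketch runs on its own.
def fake_detect(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 2, 4, 4)]


def fake_classify(region: Image) -> str:
    total = sum(sum(row) for row in region)
    return "arabica" if total > 10 else "robusta"


toy_image = [
    [9, 9, 0, 0],
    [9, 9, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

regions = detect_then_classify(toy_image, fake_detect, fake_classify)
for box, label in regions:
    print(box, label)
```

The detector decides where an organism is and the classifier decides what it is, so each model can be swapped without touching the other.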
Read the full Autodistill documentation.
Read the BioCLIP Autodistill documentation.
To use BioCLIP with Autodistill, install the following dependency:
pip3 install autodistill-bioclip
from autodistill_bioclip import BioCLIP
from autodistill.detection import CaptionOntology

# define an ontology to map class names to our BioCLIP prompts
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
classes = ["arabica", "robusta"]

# then, load the model; here each class name is used as its own caption
base_model = BioCLIP(
    ontology=CaptionOntology(
        {
            item: item for item in classes
        }
    )
)
results = base_model.predict("../arabica.jpeg")
# get_top_k returns a (class_ids, confidences) pair; take the first class ID
top = results.get_top_k(1)
top_class = classes[top[0][0]]
print(f"Predicted class: {top_class}")
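The example above uses each class name as its own caption. As a minimal sketch of the {caption: class} format with captions that differ from the saved labels, the snippet below uses plain Python only. The captions, the helper `label_for_top_prediction`, and the sample prediction values are hypothetical. It also assumes, as the example above does, that class IDs index the ontology entries in the order they were defined.

```python
# Hypothetical ontology: keys are captions (prompts) sent to the model,
# values are the labels saved in the generated annotations.
ontology = {
    "Coffea arabica plant": "arabica",
    "Coffea canephora plant": "robusta",
}

prompts = list(ontology.keys())
labels = list(ontology.values())


def label_for_top_prediction(top_k, labels):
    """Map a (class_ids, confidences) pair to a (label, confidence) pair."""
    class_ids, confidences = top_k
    return labels[int(class_ids[0])], float(confidences[0])


# Stand-in for results.get_top_k(1); these values are illustrative,
# not real model output.
example_top = ([1], [0.87])

label, confidence = label_for_top_prediction(example_top, labels)
print(f"Predicted class: {label} ({confidence:.2f})")
```

Keeping captions and labels separate lets you word the prompt for the model (for example, with a scientific name) while the annotations keep short, stable class names.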
This project is licensed under an MIT license.
The underlying BioCLIP model is also licensed under an MIT license.
We love your input! Please see the core Autodistill contributing guide to get started. Thank you 🙏 to all our contributors!