Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
Updated Aug 18, 2024 - Python
Compute the distance between people using a Stereolabs ZED stereo camera or an SVO file. People are detected with YOLOv3 on a TensorFlow backend and tracked with DeepSORT.
Subject-verb-object triplet extraction for the Russian language.