cseg

Context-aware open-vocabulary semantic segmentation (adapted from ov-seg)

Data Preparation

Evaluation on the KITTI Dataset with OVSeg Pretrained Weights (Swin-Base + CLIP-ViT-L/14):

Download cseg_verification.ipynb, or open the notebook directly in Colab.
Upload ovseg_swinbase_vitL14_ft_mpt.pth to an accessible folder, preferably a Google Drive folder so that you don't have to re-upload it every session. In the notebook, edit the model_weights field under Build Model to the path where ovseg_swinbase_vitL14_ft_mpt.pth is stored.
Place the KITTI test images in an easily accessible folder in Drive. In the notebook, edit the data_fldr field under Read in KITTI test data from drive to this path.
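
The two path edits above can be sanity-checked before running the rest of the notebook. The following is a minimal sketch, not part of the notebook itself. It assumes Drive is mounted at /content/drive. The folder names under MyDrive and the helper missing_paths are hypothetical, and only the names model_weights and data_fldr come from the notebook.

```python
from pathlib import Path

# In Colab, mount Drive first (not executed in this sketch):
#   from google.colab import drive
#   drive.mount("/content/drive")

# Hypothetical locations; replace with where you stored the files.
model_weights = "/content/drive/MyDrive/cseg/ovseg_swinbase_vitL14_ft_mpt.pth"
data_fldr = "/content/drive/MyDrive/cseg/kitti_test"


def missing_paths(weights_path, data_dir):
    """Return a list of problems found with the two notebook paths."""
    problems = []
    if not Path(weights_path).is_file():
        problems.append(f"weights file not found: {weights_path}")
    if not Path(data_dir).is_dir():
        problems.append(f"data folder not found: {data_dir}")
    return problems


# Print any problem so it can be fixed before the model is built.
for problem in missing_paths(model_weights, data_fldr):
    print(problem)
```

If nothing is printed, both paths resolve and the notebook cells that read them should at least find their inputs.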

Training on COCO or ADE20k dataset:

cseg_training.ipynb contains the basic training commands and setup. It assumes you have completed the data preparation required for the detectron2 datasets.
Because the datasets are large, the connection between Google Colab and Google Drive can time out.
The /sbatch/ folder contains the files required to run training in a Slurm environment. These files assume that the virtual environment has all required packages installed.
The code for full training is in the open_vocab_seg subfolder.
More information on training can be found here.
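
Before starting a long training run, it can help to confirm that the detectron2 data preparation mentioned above left the datasets where detectron2 will look for them. The sketch below is an illustration under assumptions. detectron2 reads the DETECTRON2_DATASETS environment variable and falls back to a local datasets directory. The helper names and the list of expected folders are hypothetical and should be adjusted to match your own data preparation.

```python
import os
from pathlib import Path


def dataset_root():
    """Directory where detectron2 looks for its built-in datasets."""
    # detectron2 uses DETECTRON2_DATASETS, defaulting to ./datasets.
    return Path(os.environ.get("DETECTRON2_DATASETS", "datasets"))


def missing_subdirs(root, expected):
    """Return the names in `expected` that are not directories under root."""
    return [name for name in expected if not (Path(root) / name).is_dir()]


# Hypothetical folder names; adjust to what your data preparation produced.
expected = ["coco", "ADEChallengeData2016"]
root = dataset_root()
for name in missing_subdirs(root, expected):
    print(f"missing under {root}: {name}")
```

The same check can be run at the top of a Slurm job so that a missing dataset fails early rather than partway through training.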

Fine-tuning the classification stage (CLIP) is adapted from OVSeg.
