Run Segment Anything Model 2 on a live video stream
- 20/08/2024: Fix management of `non_cond_frame_outputs` for better performance and add bbox prompt
pip install -e .
Then, we need to download a model checkpoint.
cd checkpoints
./download_ckpts.sh
SAM-2-online can then be used in a few lines for image, video, and camera prediction:
import cv2
import torch
from sam2.build_sam import build_sam2_camera_predictor

checkpoint = "./checkpoints/sam2_hiera_large.pt"
model_cfg = "sam2_hiera_l.yaml"
predictor = build_sam2_camera_predictor(model_cfg, checkpoint)

# Replace the placeholder with a video file path or a camera index.
cap = cv2.VideoCapture(<your video or camera>)

if_init = False

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        width, height = frame.shape[:2][::-1]

        if not if_init:
            # Initialise the predictor on the first frame, then add a prompt.
            predictor.load_first_frame(frame)
            if_init = True
            _, out_obj_ids, out_mask_logits = predictor.add_new_prompt(<your prompt>)
        else:
            # Propagate the prompted objects to every subsequent frame.
            out_obj_ids, out_mask_logits = predictor.track(frame)
            ...

cap.release()
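The loop above leaves the handling of `out_mask_logits` open. As a minimal sketch of one option, the helpers below threshold the logits at 0 and blend a colour per object onto the frame. This is an illustration, not part of this repository: `logits_to_masks`, `overlay_masks` and `PALETTE` are hypothetical names, and the logits are assumed to have shape (N, 1, H, W) or (N, H, W) with one entry per object id, as in the upstream SAM 2 video predictor.

```python
import numpy as np

# Hypothetical fixed palette in BGR order (OpenCV convention); any distinct colours work.
PALETTE = np.array(
    [[0, 0, 255], [0, 255, 0], [255, 0, 0], [0, 255, 255]], dtype=np.float32
)


def logits_to_masks(mask_logits, threshold=0.0):
    """Convert per-object logits to boolean masks of shape (N, H, W).

    Assumes the input has shape (N, 1, H, W) or (N, H, W). A torch tensor
    should first be converted with `.float().cpu().numpy()`.
    """
    logits = np.asarray(mask_logits, dtype=np.float32)
    if logits.ndim == 4:
        logits = logits[:, 0]  # drop the singleton channel axis
    return logits > threshold


def overlay_masks(frame, masks, obj_ids, alpha=0.5):
    """Alpha-blend one palette colour per object onto a uint8 BGR frame."""
    out = frame.astype(np.float32)
    for mask, obj_id in zip(masks, obj_ids):
        color = PALETTE[int(obj_id) % len(PALETTE)]
        out[mask] = (1.0 - alpha) * out[mask] + alpha * color
    return out.astype(np.uint8)
```

Inside the loop this could be called as `overlay_masks(frame, logits_to_masks(out_mask_logits.float().cpu().numpy()), out_obj_ids)`, with the result passed to `cv2.imshow` or a video writer.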
- SAM2 Repository: https://github.com/facebookresearch/segment-anything-2