Ref-AVS

The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024

Project Page

Dataset Download

>>> Introduction

In this paper, we propose a pixel-level segmentation task called Referring Audio-Visual Segmentation (Ref-AVS). The network must densely predict, for each pixel, whether it belongs to the object described by a given multimodal-cue expression, which may refer to dynamic audio-visual information.

The top-left of Fig. 1 highlights the distinctions between Ref-AVS and previous tasks.
Fig. 2 shows the proposed baseline model for processing multimodal cues.
Fig. 3 shows the statistics of the dataset.
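
To make the "dense per-pixel prediction" wording above concrete, here is a minimal sketch of how a per-pixel score map can be turned into a binary mask and compared with a ground-truth mask. It is an illustration only, not code from this repo: the function names `binarize` and `jaccard`, the 0.5 threshold, and the choice of the Jaccard index (intersection over union) as the comparison are all assumptions.

```python
from typing import List

Mask = List[List[int]]


def binarize(scores: List[List[float]], threshold: float = 0.5) -> Mask:
    """Turn per-pixel scores in [0, 1] into a binary mask (1 = referred object).

    The 0.5 threshold is an assumption made for this sketch.
    """
    return [[1 if s >= threshold else 0 for s in row] for row in scores]


def jaccard(pred: Mask, target: Mask) -> float:
    """Intersection over union of two equally sized binary masks."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += p & t
            union += p | t
    # Two empty masks are treated as a perfect match; this is a convention
    # chosen here, not something the README specifies.
    return inter / union if union else 1.0


# Toy 2x2 "frame": scores for each pixel and the ground-truth mask.
scores = [[0.9, 0.2], [0.7, 0.4]]
target = [[1, 0], [0, 0]]

pred = binarize(scores)
print(pred, jaccard(pred, target))
```

In the actual model the score map would come from the network conditioned on the audio, video, and text expression; the toy list above only stands in for that output.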

>>> Run

Run the training & evaluation:

cd Ref_AVS
sh run.sh  # Update the path configs for your machine first; see /configs/config.py for details.

You can download the checkpoint here.

Core dependencies:

transformers=4.30.2
towhee=1.1.3
towhee-models=1.1.3  # Towhee is used for extracting VGGish audio feature.
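
Before launching `run.sh`, it can help to confirm that the installed versions match the pins above. The following is a minimal sketch using only the standard library; the helper name `check_pins` is hypothetical, and the distribution names are taken as written in the list above (if a package is registered under a slightly different name in your environment, it will be reported as missing).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict

# Pins copied from the dependency list above.
PINS = {
    "transformers": "4.30.2",
    "towhee": "1.1.3",
    "towhee-models": "1.1.3",
}


def check_pins(pins: Dict[str, str]) -> Dict[str, str]:
    """Return a status string per package: 'ok', 'missing', or a mismatch note."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if found == wanted:
            report[name] = "ok"
        else:
            report[name] = f"found {found}, expected {wanted}"
    return report


if __name__ == "__main__":
    for name, status in check_pins(PINS).items():
        print(f"{name}: {status}")
```

The check only reads package metadata, so it does not import the libraries themselves and runs quickly even when nothing is installed.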

Citation

If you find this work useful, please consider citing it:

@article{wang2024refavs,
  title={Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes},
  author={Wang, Yaoting and Sun, Peiwen and Zhou, Dongzhan and Li, Guangyao and Zhang, Honggang and Hu, Di},
  journal={European Conference on Computer Vision (ECCV)},
  year={2024},
}

@inproceedings{wang2024prompting,
  title={Prompting segmentation with sound is generalizable audio-visual source localizer},
  author={Wang, Yaoting and Liu, Weisong and Li, Guangyao and Ding, Jian and Hu, Di and Li, Xi},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={6},
  pages={5669--5677},
  year={2024}
}
