A curated list of balanced multimodal learning methods.
[CVPR-2022]
Balanced Multimodal Learning via On-the-fly Gradient Modulation
Code
[ICASSP-2023]
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning
Code
[ICLR-2024]
Quantifying and Enhancing Multi-modal Robustness with Modality Preference
Code
[CVPR-2024]
Enhancing Multimodal Cooperation via Sample-level Modality Valuation
Code
[ICML-2024]
MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance
Code
[ECCV-2024]
Diagnosing and Re-learning for Balanced Multimodal Learning
Code
- What Makes Training Multi-Modal Classification Networks Hard? [CVPR 2020]
- Learning to Balance the Learning Rates between Various Modalities via Adaptive Tracking Factor [IEEE Signal Processing Letters 2021]
- Audiovisual SlowFast Networks for Video Recognition [arXiv 2020]
- Joint Audio-Visual Deepfake Detection [ICCV 2021]
- Delving into Deep Imbalanced Regression [ICML 2021]
- Characterizing and Overcoming the Greedy Nature of Learning in Multi-modal Deep Neural Networks [ICML 2022]
- MCL: A Contrastive Learning Method for Multimodal Data Fusion in Violence Detection [IEEE Signal Processing Letters 2022]
- Balanced MSE for Imbalanced Visual Regression [CVPR 2022]
- On Uni-Modal Feature Learning in Supervised Multi-Modal Learning [ICML 2023]
- Multimodal Pre-Training with Self-Distillation for Product Understanding in E-Commerce [WSDM 2023]
- PMR: Prototypical Modal Rebalance for Multimodal Learning [CVPR 2023]
- Graph Interactive Network with Adaptive Gradient for Multi-Modal Rumor Detection [ICMR 2023]
- Multimodal Imbalance-Aware Gradient Modulation for Weakly-supervised Audio-Visual Video Parsing [IEEE Transactions on Circuits and Systems for Video Technology 2023]
- Multimodal Temporal Attention in Sentiment Analysis [MuSe 2023]
- Utilizing Greedy Nature for Multimodal Conditional Image Synthesis in Transformers [TMM 2023]
- Variational Probabilistic Fusion Network for RGB-T Semantic Segmentation [arXiv 2023]
- Boosting Multi-modal Model Performance with Adaptive Gradient Modulation [ICCV 2023]
- Adaptive Mask Co-Optimization for Modal Dependence in Multimodal Learning [ICASSP 2024]
- Improving Multimodal Learning with Multi-Loss Gradient Modulation [arXiv 2024]
- Learning to Rebalance Multi-Modal Optimization by Adaptively Masking Subnetworks [arXiv 2024]
- Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing [CVPR 2024]
- Balanced Multi-modal Federated Learning via Cross-Modal Infiltration [arXiv 2024]
- Balancing Multimodal Learning via Online Logit Modulation [IJCAI 2024]
- Enhancing Unimodal Features Matters: A Multimodal Framework for Building Extraction [TGRS 2024]
- Multimodal Representation Learning by Alternating Unimodal Adaptation [CVPR 2024]
- Understanding Unimodal Bias in Multimodal Deep Linear Networks [ICML 2024]
- Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition [ACM MM 2024]
- Modality-Balanced Learning for Multimedia Recommendation [ACM MM 2024]
- ReconBoost: Boosting Can Achieve Modality Reconcilement [ICML 2024]