Multimodal classification of digits using image and audio modalities. One-shot learning with a Siamese network is used to predict whether a given image-audio pair belongs to the same class.
one-shot-learning meta-learning siamese-network multimodal-classification multimodal-meta-learning audio-image-classification
Updated Mar 25, 2023 - Python
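
The image-audio pair check in the description can be illustrated with a rough sketch. This is not the repository's code. It assumes each modality has its own linear encoder into a shared embedding space, a Euclidean distance between the two embeddings, and a contrastive loss for training. The names `embed`, `contrastive_loss`, `is_same_class`, the weight matrices, and the `margin` and `threshold` values are all hypothetical.

```python
import math


def embed(features, weights):
    """Project a feature vector into the shared space with one linear layer."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def euclidean(a, b):
    """Euclidean distance between two equal-length embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(distance, same_class, margin=1.0):
    """Pull matching pairs together; push mismatched pairs at least `margin` apart."""
    if same_class:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


def is_same_class(image_feats, audio_feats, w_image, w_audio, threshold=0.5):
    """Return (decision, distance) for one image-audio pair.

    The threshold is an assumed hyperparameter, not a value from the repository.
    """
    d = euclidean(embed(image_feats, w_image), embed(audio_feats, w_audio))
    return d < threshold, d


# Toy weights chosen by hand for illustration; a real model would learn them.
W_IMAGE = [[1.0, 0.0], [0.0, 1.0]]            # 2-d image features -> 2-d embedding
W_AUDIO = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # 3-d audio features -> 2-d embedding

match, match_dist = is_same_class([1.0, 0.0], [1.0, 0.0, 0.0], W_IMAGE, W_AUDIO)
mismatch, mismatch_dist = is_same_class([1.0, 0.0], [0.0, 1.0, 0.0], W_IMAGE, W_AUDIO)
```

In a one-shot setting, a query from one modality would be compared against a single labeled example per digit from the other modality, and the closest embedding would decide the class.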