=======================================================================
Our demo code is implemented in Keras (written in Python, with Theano as the backend).
Usage:
$python main_run.py
or run it in the background from a terminal:
$bash run.sh
Notice:
(1). To avoid Keras version mismatches, we include a fork of Keras version 1.2.2 in this project.
(2). We use the MATLAB version of BSS_eval to evaluate NSDR (Normalized Signal-to-Distortion Ratio).
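For readers unfamiliar with the metric, NSDR is commonly defined as the SDR of the separated estimate minus the SDR of the unprocessed mixture, both measured against the clean target. The sketch below illustrates only that subtraction. It is not the BSS_eval computation: BSS_eval decomposes the estimate into target, interference and artifact components, whereas this sketch assumes a plain energy-ratio SDR as a stand-in. The function names (energy, simple_sdr, nsdr) and the toy signals are hypothetical and are not part of this project.

```python
import math

EPS = 1e-12  # guards against log(0) and division by zero


def energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)


def simple_sdr(reference, estimate):
    """Simplified SDR in dB: 10*log10(||s||^2 / ||s - s_hat||^2).

    Assumption: a plain energy ratio, NOT the BSS_eval decomposition.
    """
    error = [r - e for r, e in zip(reference, estimate)]
    return 10.0 * math.log10((energy(reference) + EPS) / (energy(error) + EPS))


def nsdr(reference, estimate, mixture):
    """SDR improvement of the estimate over the raw mixture, in dB."""
    return simple_sdr(reference, estimate) - simple_sdr(reference, mixture)


# Toy signals (hypothetical): a target tone plus one interfering tone.
n = 4000
target = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
interferer = [math.sin(2 * math.pi * 13 * t / n) for t in range(n)]
mixture = [s + i for s, i in zip(target, interferer)]

# Pretend the separator removed 90% of the interferer's amplitude.
estimate = [s + 0.1 * i for s, i in zip(target, interferer)]

improvement = nsdr(target, estimate, mixture)
print(round(improvement, 3))
```

In this toy case the residual interference has one tenth of its original amplitude, so its energy drops by a factor of 100 and the improvement is about 20 dB. For numbers comparable to the reported results, use the MATLAB BSS_eval scripts as stated above.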
Figure 1: Two specific attention tasks for auditory selection in a three-speech mixture environment. One is top-down task-specific attention, and the other is bottom-up stimulus-driven attention.
Figure 2: An illustration of our Auditory Selection with Attention and Memory (ASAM). (a): The overall architecture of the proposed ASAM. (b): The life-long memory module, which memorizes prior knowledge. In the top-down attention scenario, the dashed boxes and arrow are used only in the training phase and are removed at evaluation time.
Figure 3: Effects of attention with different amounts of stimulus on one male-female mixture sample from WSJ0. (a) shows the SIR (Signal-to-Interference Ratio), SAR (Signal-to-Artifacts Ratio) and NSDR results, (b)-(d) are the auditory stimuli, with magnitudes divided by the maximum magnitude, (e) is the mixture input spectrogram, (i) is the target spectrogram, (f)-(h) are the attention maps based on the corresponding auditory stimuli, and (j)-(l) are the corresponding predictions with their NSDR scores.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.