PyTorch implementations of FSMN (Feedforward Sequential Memory Networks):
- sFSMNCell  - scalar FSMN
- vFSMNCell  - vectorized FSMN
- csFSMNCell - compact scalar FSMN
- cvFSMNCell - compact vectorized FSMN
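The core idea shared by all four cells is a learnable memory block: each hidden frame is augmented with a weighted sum of the previous `memory_size` frames. A minimal sketch (not the repo's code) of the scalar variant, where each lag gets a single learned scalar tap — the vectorized variant (vFSMN) learns one vector per lag instead:

```python
import torch

def scalar_fsmn_memory(h, a):
    """Scalar FSMN memory sketch.
    h: (batch, seq, hidden) hidden activations
    a: (memory_size,) one learned scalar coefficient per lag
    """
    out = h.clone()
    for i, coeff in enumerate(a, start=1):     # lag i = 1..memory_size
        out[:, i:, :] += coeff * h[:, :-i, :]  # add weighted frame from i steps back
    return out

h = torch.randn(2, 11, 10)
taps = torch.randn(3)                  # memory_size = 3
print(scalar_fsmn_memory(h, taps).shape)  # torch.Size([2, 11, 10])
```

Note that the output shape equals the input shape: the memory block only mixes information along the time axis, so it can be dropped between any two feedforward layers.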
See:
- Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency [arXiv]
- Compact Feedforward Sequential Memory Networks for Large Vocabulary Continuous Speech Recognition [PDF]
- Feedforward Sequential Memory Networks based Encoder-Decoder Model for Machine Translation [PDF] (http://www.apsipa.org/proceedings/2017/CONTENTS/papers2017/13DecWednesday/Poster%202/WP-P2.14.pdf).
- Deep-FSMN for Large Vocabulary Continuous Speech Recognition [arXiv]
- Deep Feed-Forward Sequential Memory Networks for Speech Synthesis [arXiv]
- A novel pyramidal-FSMN architecture with lattice-free MMI for speech recognition [arXiv]
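The "compact" variants (csFSMNCell, cvFSMNCell) follow the second reference above: the hidden layer is first projected to a low-dimensional vector, and the memory taps are applied to that projection, shrinking both parameters and computation. A hedged sketch of this idea (class and attribute names here are illustrative, not the repo's API):

```python
import torch
import torch.nn as nn

class CompactMemorySketch(nn.Module):
    """Low-rank projection followed by a vectorized memory block (cvFSMN-style)."""
    def __init__(self, hidden, proj, memory_size):
        super().__init__()
        self.proj = nn.Linear(hidden, proj)                       # low-rank projection
        self.taps = nn.Parameter(torch.randn(memory_size, proj))  # one vector per lag

    def forward(self, h):                            # h: (batch, seq, hidden)
        p = self.proj(h)                             # (batch, seq, proj)
        out = p.clone()
        for i in range(1, self.taps.shape[0] + 1):   # lag i = 1..memory_size
            out[:, i:, :] += self.taps[i - 1] * p[:, :-i, :]
        return out

m = CompactMemorySketch(hidden=10, proj=4, memory_size=3)
print(m(torch.randn(2, 11, 10)).shape)  # torch.Size([2, 11, 4])
```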
!git clone https://github.com/d5555/FSMN.git
import torch
import torch.nn.functional as F
from FSMN.FSMN import *
batch = 2
memory_size = 3
input_size = 5
hidden_size = 10
layer_output_size = 5
sequence_size = 11
n_layers = 3 # number of layers
ff_size = 20
bidirectional = True
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
torch.manual_seed(20)
src = torch.randn((batch, sequence_size, input_size)).to(device)
# memory_size, input_size, hidden_size, layer_output_size, n_layers, fsmn_class, ff_size, drop=0.1, activation=F.relu, bidirectional=False, device=None, dtype=torch.float32
#fsmn_class : sFSMNCell, csFSMNCell, vFSMNCell, cvFSMNCell
fsmn = FSMN(memory_size, input_size, hidden_size, layer_output_size, n_layers, cvFSMNCell, ff_size, drop=0.1, device=device, activation=F.relu, bidirectional=bidirectional).to(device)
# token id 13 serves as the padding index here; True marks real (non-pad) positions
src_pad_mask = (torch.tensor([[1,2,3,5,6,6,8,8,13,13,13], [1,2,3,5,6,6,13,13,13,13,13]]) != 13).to(device)
predict = fsmn(src, pad_mask=src_pad_mask)
print(predict.shape)
print(predict)
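The pad mask above hard-codes 13 as the pad id. When per-sequence lengths are known, the same mask can be built without a sentinel token; a small sketch (the helper name `length_mask` is illustrative):

```python
import torch

def length_mask(lengths, max_len):
    """True for real positions, False for padding.
    lengths: (batch,) number of real tokens per sequence
    """
    pos = torch.arange(max_len).unsqueeze(0)  # (1, max_len)
    return pos < lengths.unsqueeze(1)         # broadcast to (batch, max_len)

# first sequence above has 8 real tokens, the second 6, out of 11 positions
mask = length_mask(torch.tensor([8, 6]), 11)
print(mask.shape)  # torch.Size([2, 11])
```

This reproduces the `src_pad_mask` tensor used above.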