Inverse operations, wiener filter, softmask #5

faroit · 2019-02-18T13:35:01Z

Concerning additional operators, the most valuable at this point would be inverse operators. However, for this, we would have to wait until ISTFT is implemented.

But maybe it would be nice to implement some operations already, such as Wiener filtering, soft masking, and binary masking, preferably all for multichannel spectrograms.

I could add this, if you like the idea
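To make the proposed masking operations concrete, here is a minimal sketch in PyTorch. It is not code from this repository: the function names (`soft_mask`, `binary_mask`, `apply_mask`), the `power` and `eps` parameters, and the assumed `(channels, freq_bins, frames)` layout are all hypothetical. The soft mask is a per-bin ratio mask in the spirit of a single-channel Wiener filter, not a full multichannel Wiener filter with spatial covariances.

```python
import torch


def soft_mask(target_mag, residual_mag, power=2.0, eps=1e-10):
    """Per-bin ratio mask in [0, 1] (hypothetical helper, Wiener-like for power=2).

    Both inputs are magnitude estimates of identical shape, assumed here
    to be (channels, freq_bins, frames).
    """
    target_pow = target_mag.abs() ** power
    residual_pow = residual_mag.abs() ** power
    # eps avoids division by zero in bins where both estimates are silent.
    return target_pow / (target_pow + residual_pow + eps)


def binary_mask(target_mag, residual_mag):
    """1 where the target strictly dominates the residual, else 0."""
    return (target_mag.abs() > residual_mag.abs()).to(target_mag.dtype)


def apply_mask(mask, mixture_stft):
    """Element-wise masking of a (possibly complex) mixture spectrogram."""
    return mask * mixture_stft
```

Because these are plain element-wise tensor operations, they broadcast over any leading channel or batch dimensions and run on whichever device the inputs live on.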

dansuh17 · 2019-03-12T00:55:27Z

A Wiener filter would be incredible. I would love to have it.

hagenw · 2019-03-13T08:51:04Z

Do all the audio transforms have to work on torch.Tensor? So far I have implemented most of my audio transforms for numpy arrays and converted to torch.Tensor at the end.

For stft and istft, this makes it possible to use librosa and to work easily with signals of arbitrary dimensions, e.g. (note: some code is omitted to keep the example short):

import librosa
import numpy as np

def stft(signal, window_size, hop_size, window='hann', axis=-1):
    # Apply librosa.stft to every 1-D slice along `axis`.
    fft_config = dict(n_fft=window_size, hop_length=hop_size,
                      win_length=window_size, window=window)
    return np.apply_along_axis(librosa.stft, axis, signal, **fft_config)

def istft(spectrogram, window_size, hop_size, window='hann', axis=-2):
    ifft_config = dict(hop_length=hop_size, win_length=window_size,
                       window=window)
    # ... some reshaping code (omitted; it defines D, f and t used below)
    return np.apply_along_axis(_istft, axis, D, f, t, **ifft_config)

def _istft(spectrogram, frequency_bins, time_bins, **config):
    # Restore the 2-D (frequency, time) layout that librosa.istft expects.
    spectrogram = np.reshape(spectrogram, [frequency_bins, time_bins])
    return librosa.istft(spectrogram, **config)

faroit · 2019-03-13T09:26:13Z

Do all the audio transforms have to work on torch.Tensor

Yes, the aim is that all audio transforms (not to be confused with augmentation transforms) should be able to run on the GPU as part of the model.

In any case, we would need to wait for the istft.

hagenw · 2019-03-13T09:34:20Z

Ah, ok. I use all my audio transforms as augmentation transforms.

faroit · 2019-03-13T09:39:46Z

Ah, ok. I use all my audio transform as augmentation transforms.

Yes, currently I do the same. The dataset transforms are bound to the CPU, but maybe that will change. I think we would still benefit from a pure PyTorch implementation.
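As a rough illustration of what a pure PyTorch version of the numpy/librosa wrappers above could look like, here is a sketch that handles signals with arbitrary leading dimensions. It assumes a PyTorch release that provides `torch.stft` with `return_complex=True` and `torch.istft`, which was not yet the case when this thread was written. The wrapper names and default parameters are hypothetical.

```python
import torch


def stft(signal, n_fft=1024, hop_length=256):
    """STFT over the last axis of a tensor with any leading dimensions."""
    window = torch.hann_window(n_fft, device=signal.device)
    lead_shape = signal.shape[:-1]
    # torch.stft expects (time,) or (batch, time), so flatten leading dims.
    flat = signal.reshape(-1, signal.shape[-1])
    spec = torch.stft(flat, n_fft=n_fft, hop_length=hop_length,
                      window=window, return_complex=True)
    # Restore the leading dims: (..., freq_bins, frames).
    return spec.reshape(*lead_shape, *spec.shape[-2:])


def istft(spec, n_fft=1024, hop_length=256, length=None):
    """Inverse of `stft` above; `length` trims or pads to the original size."""
    window = torch.hann_window(n_fft, device=spec.device)
    lead_shape = spec.shape[:-2]
    flat = spec.reshape(-1, *spec.shape[-2:])
    signal = torch.istft(flat, n_fft=n_fft, hop_length=hop_length,
                         window=window, length=length)
    return signal.reshape(*lead_shape, signal.shape[-1])
```

Since the window is created on the input's device, the same functions should work on GPU tensors without changes, which is the property asked for in this thread.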

faroit mentioned this issue Mar 13, 2019

harmonic-percussive source separation #25

Open
