Antoine Liutkus, Fabian-Robert Stöter (Inria and LIRMM, University of Montpellier, France), antoine.liutkus@inria.fr
- is_blind: no
- additional_training_data: no
- Code: https://github.com/sigsep/sigsep-mus-oracle
- Demos: Not available
The Ideal Ratio Mask for magnitude spectrograms (IRM1) is also known as the generalized 1-Wiener filter.
We write
The IRM1 method has been popular for many years in source separation, where experience shows that using magnitude spectrograms often works better than using power spectrograms, as advocated by the classical wide-sense stationary theory. However, no theoretical understanding of it was available until recently, when it was shown to be the optimal way to process locally stationary $\alpha$-harmonizable processes. A description of this theory is given in:
Liutkus, Antoine, and Roland Badeau. "Generalized Wiener filtering with fractional power spectrograms." Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015.
Basically, this method boils down to assuming all the entries of
In this submission, we take $\alpha=1$.
Under this model, source estimates are computed very simply as: $\hat{y}_{ij}(f,t)=\frac{v_{ij}(f,t)}{\sum_{j'} v_{ij'}(f,t)} x(f,t),$
just as in classical Wiener filtering. This is often called the Ideal Ratio Mask, hence the name of this submission.
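As a minimal sketch of the estimate above, the following pure-Python function applies the ratio mask to one channel of the mixture. The function name `apply_ratio_mask`, the flat-list layout of the time-frequency bins, and the small `eps` guarding against division by zero are illustrative assumptions, not taken from the submission's code.

```python
def apply_ratio_mask(x, v, eps=1e-12):
    """Estimate each source as (v_j / sum_j' v_j') * x, bin by bin.

    x:   list of complex mixture STFT coefficients for one channel,
         flattened over (f, t).
    v:   dict mapping a source name to a list of non-negative
         parameters v_j(f, t), same length as x.
    eps: hypothetical regularizer that avoids division by zero.
    """
    n_bins = len(x)
    # Denominator of the mask: sum of all source parameters at each bin.
    total = [eps + sum(v_j[k] for v_j in v.values()) for k in range(n_bins)]
    # Each estimate is the mixture scaled by that source's share of the total.
    return {
        name: [v_j[k] / total[k] * x[k] for k in range(n_bins)]
        for name, v_j in v.items()
    }


# Toy example with two bins and two sources (made-up values).
x = [2 + 2j, 4 + 0j]
v = {"vocals": [1.0, 3.0], "drums": [3.0, 1.0]}
estimates = apply_ratio_mask(x, v)
```

Because the masks of all sources sum to one at every bin (up to `eps`), the estimates add back up to the mixture.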
This submission is an oracle, meaning that it knows the true sources to compute the optimal parameters
Given the true sources
As may be seen, this approach is extremely similar to classical Wiener filtering, except for the choice of magnitude spectrograms instead of power. Note that this theory works for any exponent $\alpha \in (0, 2]$.
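To illustrate the role of the exponent, the sketch below computes oracle masks from known sources with a configurable `alpha`. It assumes the parameters are taken as the true sources' magnitudes raised to the power `alpha`, so that `alpha=1` corresponds to magnitude spectrograms and `alpha=2` to power spectrograms. The function name `oracle_masks`, the flat-list layout, and `eps` are hypothetical.

```python
def oracle_masks(sources, alpha=1.0, eps=1e-12):
    """Compute ratio masks from true source STFT coefficients.

    sources: dict mapping a source name to a list of complex STFT
             coefficients for one channel, flattened over (f, t).
    alpha:   exponent applied to the magnitudes (assumed in (0, 2]).
    eps:     hypothetical regularizer that avoids division by zero.
    """
    names = list(sources)
    n_bins = len(sources[names[0]])
    # Assumed oracle parameters: |y_j(f, t)| ** alpha for each source.
    v = {n: [abs(c) ** alpha for c in sources[n]] for n in names}
    total = [eps + sum(v[n][k] for n in names) for k in range(n_bins)]
    return {n: [v[n][k] / total[k] for k in range(n_bins)] for n in names}


# Toy example: one bin where vocals have magnitude 3 and drums magnitude 1.
sources = {"vocals": [3 + 0j], "drums": [0 + 1j]}
masks_mag = oracle_masks(sources, alpha=1.0)  # magnitude-based mask
masks_pow = oracle_masks(sources, alpha=2.0)  # power-based mask
```

In this toy bin the dominant source receives a mask of about 0.75 with `alpha=1` and about 0.9 with `alpha=2`, which shows how a larger exponent makes the mask more selective.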
- A. Liutkus, F.-R. Stöter and N. Ito, The 2018 Signal Separation Evaluation Campaign, Proceedings of LVA/ICA, 2018
@inproceedings{sisec2018, title={The 2018 signal separation evaluation campaign}, author={A. Liutkus and F.-R. St{\"o}ter and N. Ito}, booktitle={International Conference on Latent Variable Analysis and Signal Separation}, year={2018}, }