This package is a Python implementation of the Common Fate Transform and Model for audio source separation, as described in the ICASSP 2016 paper "Common Fate Model for Unison source Separation".
The Common Fate Transform is based on a signal representation that divides a complex spectrogram into a grid of patches of arbitrary size. These complex patches are then processed by a two-dimensional discrete Fourier transform, forming a tensor representation which reveals spectral and temporal modulation textures.
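The patch-wise 2D DFT described above can be sketched in plain NumPy. This is a simplified illustration, not the package's implementation: it assumes non-overlapping patches (the package also takes a patch hop), and the helper name `patch_fft2` and the output axis order are hypothetical.

```python
import numpy as np


def patch_fft2(X, patch):
    """2D DFT of non-overlapping patches of a complex spectrogram.

    X has shape (frequency, time); patch is (patch_freq, patch_time).
    Hypothetical helper: the output axis order is chosen for illustration.
    """
    pf, pt = patch
    nf, nt = X.shape[0] // pf, X.shape[1] // pt
    # crop so the spectrogram holds a whole number of patches
    X = X[:nf * pf, :nt * pt]
    # split both axes into (patch index, offset within patch),
    # then move the two within-patch axes to the front
    patches = X.reshape(nf, pf, nt, pt).transpose(1, 3, 0, 2)
    # 2D DFT over the two within-patch axes
    return np.fft.fft2(patches, axes=(0, 1))


rng = np.random.default_rng(0)
X = rng.standard_normal((64, 96)) + 1j * rng.standard_normal((64, 96))
Z = patch_fft2(X, (32, 48))
print(Z.shape)  # (32, 48, 2, 2)
```

Each slice `Z[:, :, i, j]` holds the modulation spectrum of one patch, so the last two axes index the patch grid.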
An adapted factorization model, similar to the PARAFAC/CANDECOMP factorization, decomposes the Common Fate Transform tensor into different time-varying harmonic sources based on their particular common modulation profile, hence the name Common Fate Model.
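To illustrate the kind of factorization meant here, the sketch below builds a tensor from three factors with `numpy.einsum`. The factor shapes, the axis order, and the variable names are assumptions made for illustration; they are not taken from the package and may differ from what it actually stores.

```python
import numpy as np

rng = np.random.default_rng(0)
# hypothetical sizes: patch axes (a, b), patch-grid frequency f,
# patch-grid time t, channels c, components j
a, b, f, t, c, j = 4, 6, 5, 7, 2, 3

A = rng.random((a, b, f, j))  # modulation pattern of each component
H = rng.random((t, j))        # activation of each component over time
C = rng.random((c, j))        # channel gain of each component

# model approximation: sum the contribution of every component j
V_hat = np.einsum('abfj,tj,cj->abftc', A, H, C)

# contribution of a single component, here the first one
V_0 = np.einsum('abf,t,c->abftc', A[..., 0], H[:, 0], C[:, 0])
print(V_hat.shape, V_0.shape)
```

Because every component shares one modulation pattern across time, sources with different modulation (for example different vibrato) end up in different components under this assumed structure.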
See the full API documentation at http://aliutkus.github.io/commonfate.
import commonfate
# forward transform
# STFT Parameters
framelength = 1024
hopsize = 256
X = commonfate.transform.forward(signal, framelength, hopsize)
# Patch Parameters
W = (32, 48)
mhop = (16, 24)
Z = commonfate.transform.forward(X, W, mhop, real=False)
# inverse transform of cft
Y = commonfate.transform.inverse(
    Z, fdim=2, hop=mhop, shape=X.shape, real=False
)
# back to time domain
y = commonfate.transform.inverse(
    Y, fdim=1, hop=hopsize, shape=signal.shape
)
import commonfate
# initialise and fit the Common Fate Model
cfm = commonfate.model.CFM(z, nb_components=10, nb_iter=100).fit()
# get the fitted factors
(A, H, C) = cfm.factors
# return the approximation of z using the fitted factors
z_hat = cfm.approx()
commonfate has a built-in wrapper which computes the Common Fate Transform, fits the Common Fate Model, and returns the synthesised time-domain signal components obtained through Wiener / soft-mask filtering.
The following example requires pysoundfile to be installed.
import commonfate
import soundfile as sf
# loading signal
(audio, fs) = sf.read(filename, always_2d=True)
# decomposes the audio signal into
# (nb_components, nb_samples, nb_channels)
components = commonfate.decompose.process(
    audio,
    nb_iter=100,
    nb_components=10,
    n_fft=1024,
    n_hop=256,
    cft_patch=(32, 48),
    cft_hop=(16, 24)
)
# write out the third component to wave file
sf.write(
    "comp_3.wav",
    components[2, ...],
    fs
)
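The Wiener / soft-mask filtering step mentioned above can be sketched as follows. This is a generic ratio mask under stated assumptions, not the package's code: `P` stands for non-negative per-component estimates (for example from a fitted model), and the function name `soft_mask_separate` is hypothetical.

```python
import numpy as np


def soft_mask_separate(X, P, eps=1e-12):
    """Split a complex mixture X into components with a ratio mask.

    X: complex mixture of any shape.
    P: non-negative component estimates of shape (nb_components,) + X.shape.
    eps: small constant (an assumption) to avoid division by zero.
    """
    # each component gets the fraction of the mixture it is estimated to hold
    mask = P / (P.sum(axis=0, keepdims=True) + eps)
    # the mask is real-valued, so every component keeps the mixture phase
    return mask * X[np.newaxis, ...]


rng = np.random.default_rng(0)
X = rng.standard_normal((8, 10)) + 1j * rng.standard_normal((8, 10))
P = rng.random((3, 8, 10))
Y = soft_mask_separate(X, P)
print(Y.shape)
```

Since the masks sum to one (up to `eps`), the separated components add back up to the mixture.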
The current Common Fate Model implementation makes heavy use of Einstein notation. We use NumPy's einsum function, which can be slow on large tensors. To speed up computation, we recommend installing Daniel Smith's opt_einsum package.
pip install -e 'git+https://github.com/dgasmith/opt_einsum.git#egg=opt_einsum'
commonfate automatically detects if the package is installed.
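One common way to implement this kind of optional dependency is a guarded import with a NumPy fallback. The sketch below shows that pattern only; it is an assumption about how detection could work, not the package's actual code, and the `backend` variable is introduced here for illustration.

```python
import numpy as np

try:
    # opt_einsum.contract takes the same subscripts as numpy.einsum
    # but optimises the order of the pairwise contractions
    from opt_einsum import contract as einsum
    backend = "opt_einsum"
except ImportError:
    # fall back to plain NumPy when opt_einsum is not installed
    einsum = np.einsum
    backend = "numpy"

rng = np.random.default_rng(0)
A = rng.random((4, 6, 5, 3))
H = rng.random((7, 3))
C = rng.random((2, 3))

# the same call works with either backend
V = einsum('abfj,tj,cj->abftc', A, H, C)
print(backend, V.shape)
```

The rest of the code can then call `einsum` without caring which backend was picked.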
You can download and read the paper here. If you use this package, please cite the following publication:
@inproceedings{stoeter2016cfm,
TITLE = {{Common Fate Model for Unison source Separation}},
AUTHOR = {St{\"o}ter, Fabian-Robert and Liutkus, Antoine and Badeau, Roland and Edler, Bernd and Magron, Paul},
BOOKTITLE = {{41st International Conference on Acoustics, Speech and Signal Processing (ICASSP)}},
ADDRESS = {Shanghai, China},
PUBLISHER = {{IEEE}},
SERIES = {Proceedings of the 41st International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
YEAR = {2016},
KEYWORDS = {Non-Negative tensor factorization ; Sound source separation ; Common Fate Model},
}