first attempt of mono downmix in the magnitude domain #45

faroit · 2019-05-11T09:57:33Z

Downmixing in the magnitude domain is the recommended way to downmix multichannel audio, since it's energy-preserving. Let me know if the API makes sense here.
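
To illustrate the energy argument, here is a minimal sketch (not the code from this PR). It assumes the downmix raises magnitudes to `power`, averages over the channel dimension, and undoes the exponent. The `ch_dim` parameter and the function body are assumptions, and a plain `torch.fft.rfft` stands in for the STFT to keep the example short.

```python
import math
import torch


def spectral_downmix(mag_specgrams, power=1.0, ch_dim=0):
    """Hypothetical sketch: average channels in the magnitude domain.

    Magnitudes are raised to `power`, averaged over `ch_dim`, and mapped
    back with the inverse exponent (power=2.0 averages energies).
    """
    mixed = mag_specgrams.pow(power).mean(dim=ch_dim, keepdim=True)
    return mixed.pow(1.0 / power)


# Two channels that are exact polarity inversions of each other.
t = torch.arange(1024, dtype=torch.float32)
left = torch.sin(2 * math.pi * 8 * t / 1024)
waveforms = torch.stack([left, -left])           # (channel, time)

# Waveform-domain downmix: the two channels cancel to silence.
mono_wave = waveforms.mean(dim=0, keepdim=True)

# Magnitude-domain downmix: the per-channel energy survives.
mags = torch.fft.rfft(waveforms, dim=-1).abs()   # (channel, freq)
mono_mag = spectral_downmix(mags, power=2.0)     # (1, freq)
```

With phase-inverted channels the waveform mean is all zeros, while the magnitude-domain result matches the spectrum of either single channel.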

keunwoochoi · 2019-05-11T22:26:29Z

The specific implementation seems alright in general. But a slightly more high-level question: are we gonna have a waveform downmix, too?

Ugh, all of a sudden I wondered: what if we had data types WaveformTensor and SpectrumTensor, with which we could do more functional-style stuff, e.g.,

x = WaveformTensor(batch_audio)
x.spectrogram().downmix().amplitude_to_decibel()

Ok, none of this comment is relevant to this commit :P

keunwoochoi · 2019-05-13T05:36:07Z

Hey, in #49, I tried to have this beta_something.py. Maybe without a test, this approach could make sense here, too? (not necessarily though)

hagenw · 2019-05-24T09:54:23Z

torchaudio_contrib/functional.py

@@ -47,6 +47,10 @@ def stft(signal, fft_len, hop_len, window,
    return spect


+def spectral_downmix(tensor, power=1.0):


In general, I would prefer to add an option axis=-2 that defaults to the expected axis in the case of (batch, channel, time). But why restrict to that? It might happen that some user has (batch, other_stuff, channel, time).

faroit · 2019-05-24

Okay, yes. I agree.

hagenw · 2019-05-24

Note, this is a general comment. If we add the axis option here, a user would expect the same for all other functionals as well. But it should be fine if we start with spectral_downmix and later add it to the other functionals.
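
A small sketch of the axis idea, assuming a mean-based downmix. The function name `downmix` and its body are illustrative, not the PR's code. Because the axis is counted from the end, the same default covers both layouts mentioned above.

```python
import torch


def downmix(tensor, axis=-2):
    """Illustrative only: average over `axis`, keeping it with size 1."""
    return tensor.mean(dim=axis, keepdim=True)


batch = torch.rand(4, 2, 100)       # (batch, channel, time)
nested = torch.rand(4, 3, 2, 100)   # (batch, other_stuff, channel, time)

out_batch = downmix(batch)          # shape (4, 1, 100)
out_nested = downmix(nested)        # shape (4, 3, 1, 100)
```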

keunwoochoi · 2019-06-04T15:37:06Z

Oops? Have no idea why all the commits I made to a different branch became a part of here. But thanks! Here're some comments.

I understood we agreed that modifying the Spectrogram API (mono) is a separate issue -- and indeed it is, so it should be completely removed here.
On the name, I commented here and am still of the same opinion (hence SpectrumDownmix and WaveformDownmix). And there, because 'waveform' is a noun, I'd prefer Spectrum to Spectral.
For waveform downmix, given that the current torchaudio is going to change a lot and this type conversion doesn't seem right to me, it's probably better to implement our own functional here? Which would simply be:

return torch.mean(waveforms, ch_dim, True)

Also, for a batch tensor, let's use a plural form!
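
Wrapped into a function, the one-liner above might look like the following sketch. The signature follows the `waveform_downmix(waveforms, ch_dim=1)` line from the diff, and the sample values are made up for illustration. The third positional argument of `torch.mean` is `keepdim`, so the channel dimension is kept with size 1.

```python
import torch


def waveform_downmix(waveforms, ch_dim=1):
    """Downmix a batch of waveforms by averaging over the channel dimension."""
    return torch.mean(waveforms, ch_dim, True)  # True -> keepdim


# One stereo clip with three samples: (batch, channel, time) = (1, 2, 3)
waveforms = torch.tensor([[[1.0, 2.0, 3.0],
                           [3.0, 2.0, 1.0]]])
mono = waveform_downmix(waveforms)  # (1, 1, 3), every sample equals 2.0
```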

faroit · 2019-06-04T15:42:21Z

> Oops? Have no idea why all the commits I made to a different branch became a part of here. But thanks! Here're some comments.

Should be fine now. There was a misbehaved rebase going on after I pulled in the recent master.

Regarding your requests, I will update those.

faroit · 2019-06-04T15:48:33Z

DownmixSpectrum would also work on power_specgrams. Should I rename the input for this to just specgrams then?

keunwoochoi · 2019-06-04T15:52:54Z

I don't think so. By having a power= argument, we're expecting the input to be mag_specgrams.

keunwoochoi · 2019-06-04T15:54:45Z

(icymi, we didn't really put it on a vote, but DownmixSpectrum --> SpectrumDownmix? especially if you agree, which is kinda voting :)

faroit · 2019-06-04T15:58:06Z

> (icymi, we didn't really put it on a vote, but DownmixSpectrum --> SpectrumDownmix? especially if you agree, which is kinda voting :)

ha, yep, sorry... let's vote first ;-)

faroit · 2019-06-04T17:56:54Z

Are the tests added in master stable? In that case, I can provide some for the downmix functions as well.

keunwoochoi · 2019-06-04T17:57:47Z

Yes I think so.

keunwoochoi · 2019-06-05T20:21:51Z

#54 (comment) We got the names :) DownmixWaveform and DownmixSpectrum.

keunwoochoi · 2019-06-10T21:27:13Z

@faroit Hey, I'm a bit lost, but seems like we resolved all the naming issues as well as the implementations? Is it ready to be merged now?

faroit · 2019-06-10T21:29:29Z

Well, I want to add unit tests, but I'm now unsure whether we can stick with pytest or not.

keunwoochoi · 2019-06-10T21:37:25Z

I see, +1 for unit tests. I think we have to come back to unittest, at least for the 'core' features that we plan to make a PR for there, and this one would definitely be core.

f0k · 2019-06-14

Two small comments, plus we seem to have agreed to go from "noun+verb" to "verb+noun" naming!

f0k · 2019-06-14T15:29:51Z

torchaudio_contrib/functional.py

@@ -103,6 +103,30 @@ def stft(waveforms, fft_len, hop_len, window,
    return complex_specgrams


+def waveform_downmix(waveforms, ch_dim=1):


I'd just call it dim. There is only one dimension to specify, and this is consistent with PyTorch. Someone may even want to downmix across the batch dimension. Make sure it's included in the docstring, though. The docstring should also mention that it's downmixing by taking the mean.

f0k · 2019-06-14T15:33:27Z

torchaudio_contrib/layers.py

+    Wrap torchaudio_contrib.waveform_downmix in an nn.Module.
+    """
+
+    def __init__(self):


The dim option should be available (and documented) here as well.
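
A sketch of how the two review comments could be addressed together, assuming the `dim` rename is adopted. The class name `DownmixWaveform` follows the naming agreed earlier in this thread. The bodies and docstrings are illustrative, not the merged code.

```python
import torch
import torch.nn as nn


def waveform_downmix(waveforms, dim=1):
    """Downmix waveforms by taking the mean over `dim`.

    Args:
        waveforms (Tensor): e.g. (batch, channel, time).
        dim (int): dimension to average over; it is kept with size 1.
    """
    return torch.mean(waveforms, dim, keepdim=True)


class DownmixWaveform(nn.Module):
    """Wrap waveform_downmix in an nn.Module.

    Args:
        dim (int): dimension to average over (channel dimension by default).
    """

    def __init__(self, dim=1):
        super().__init__()
        self.dim = dim

    def forward(self, waveforms):
        return waveform_downmix(waveforms, dim=self.dim)


layer = DownmixWaveform(dim=1)
mono = layer(torch.rand(8, 2, 16000))  # (8, 2, 16000) -> (8, 1, 16000)
```

Storing `dim` on the module keeps the layer and the functional in sync, so a user who wants to average over another dimension only changes the constructor argument.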

faroit added the Feature label May 11, 2019

faroit requested review from ksanjeevan and keunwoochoi May 11, 2019 09:57

ksanjeevan reviewed May 11, 2019

keunwoochoi reviewed May 11, 2019

keunwoochoi mentioned this pull request May 12, 2019

Argument names for waveforms, 2d representations (real and complex) #46

Closed

keunwoochoi mentioned this pull request May 21, 2019

Merging plan from torchaudio-contrib pytorch/audio#110

Open

hagenw reviewed May 24, 2019

keunwoochoi mentioned this pull request May 26, 2019

Downmix - api design #54

Open

Fabian-Robert Stöter added 5 commits June 4, 2019 17:30

first attempt of mono downmix in the magnitude domain

774ec99

adjusting naming, replace spectral downmix with mean op

c2fb862

set mono default to false

27c8f4a

first attempt of mono downmix in the magnitude domain

6a345d6

adjusting naming, replace spectral downmix with mean op

fad5826

faroit force-pushed the feature_spectrogram_downmix branch from 157eb69 to fad5826 Compare June 4, 2019 15:36

Fabian-Robert Stöter added 2 commits June 4, 2019 17:51

update naming again (add plural form)

c6af8e7

remove mono option

26bc04d

revert according keunwoochoi#54

5912329

faroit marked this pull request as ready for review June 4, 2019 16:01

accidental removal

895764b

f0k requested changes Jun 14, 2019

jamarshon mentioned this pull request Jul 24, 2019

[BC] Standardization of Transforms/Functionals pytorch/audio#152

Merged
