
[Feature Request] Constant-Q Transform, custom FFT and perceptual frequency scales #30

TF3RDL opened this issue Apr 7, 2022 · 7 comments

TF3RDL commented Apr 7, 2022

Although FFTs are fine, they get a bit boring for me, so for octave-band analysis I prefer the constant-Q transform (actually the variable-Q transform) over the FFT. My CQT implementation (built from a bunch of Goertzel filters) is slow, though, and it would need a sliding DFT to do the CQT in real time.
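
For reference, each CQT/VQT bin in my approach is just one Goertzel filter with its own block length (longer blocks for lower frequencies). A minimal sketch (function name and normalization are mine, not from any particular library):

```js
// Sketch of one Goertzel bin: magnitude of `samples` at frequency `freq` (Hz).
function goertzelMagnitude(samples, freq, sampleRate) {
  const omega = 2 * Math.PI * freq / sampleRate;
  const coeff = 2 * Math.cos(omega);
  let s1 = 0, s2 = 0;
  for (let n = 0; n < samples.length; n++) {
    const s0 = samples[n] + coeff * s1 - s2;
    s2 = s1;
    s1 = s0;
  }
  // power of the bin, then normalize so a full-scale sine reads ~1
  const power = s1 * s1 + s2 * s2 - coeff * s1 * s2;
  return Math.sqrt(Math.max(power, 0)) / (samples.length / 2);
}
```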

I'm also aware that spectrum analyzers built on the Web Audio API don't need to use AnalyserNode.getByteFrequencyData; you can feed getFloatTimeDomainData into any FFT library instead, just like my sketch does. But beware: you need to apply a window function (Hann or similar) to the samples before running the FFT, see #3
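
Roughly like this (just a sketch; `runFFT` is a placeholder for whatever FFT library you use, so its signature here is only assumed):

```js
// Grab raw time-domain samples from the AnalyserNode.
const timeData = new Float32Array(analyser.fftSize);
analyser.getFloatTimeDomainData(timeData);

// Apply a Hann window: w[n] = 0.5 * (1 - cos(2*pi*n / (N - 1)))
for (let n = 0; n < timeData.length; n++) {
  timeData[n] *= 0.5 * (1 - Math.cos((2 * Math.PI * n) / (timeData.length - 1)));
}

// Hand the windowed frame to your FFT library of choice.
const spectrum = runFFT(timeData); // placeholder call
```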

I also think perceptual frequency scales like Mel and Bark should be added, because they show the bass frequencies less prominently than a logarithmic scale but more prominently than a linear one.
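
The usual Hz-to-scale conversions are simple; a sketch:

```js
// Mel (O'Shaughnessy): m = 2595 * log10(1 + f / 700)
function hertzToMel(f) {
  return 2595 * Math.log10(1 + f / 700);
}

// Bark (Traunmüller): z = 26.81 * f / (1960 + f) - 0.53
function hertzToBark(f) {
  return 26.81 * f / (1960 + f) - 0.53;
}

// Bar positions then come from interpolating between the converted
// minimum and maximum frequencies of the analyzer.
```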


hvianna commented May 7, 2022

Thank you for letting me know about these techniques! Looks like I have a lot to catch up on! 😅

Also, thank you for sharing your sketch! It made me realize that using linear values for the amplitude (instead of dB) makes a huge difference in visualization. I'll have this added as an option in the next release. Next, I think weighting filters would also be a good addition.
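
The conversion itself is the usual one, since getFloatFrequencyData() returns magnitudes in dB (sketch):

```js
// Linear amplitude from a dB magnitude: 10^(dB / 20)
function dbToLinear(db) {
  return Math.pow(10, db / 20);
}
```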

Can you recommend any good references for equations/algorithms of the CQT/variable-Q transform, perceptual scales and weighting filters?

Cheers!


TF3RDL commented May 18, 2022

The equation for the Bark scale is from Traunmüller's work, and A-weighting, as well as the other things, is already covered on Wikipedia
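
For example, the standard A-weighting curve (the formula on Wikipedia, roughly 0 dB at 1 kHz) looks like this as code (just a sketch):

```js
// A-weighting in dB for a frequency f (Hz).
function aWeighting(f) {
  const f2 = f * f;
  const ra = (12194 ** 2 * f2 * f2) /
    ((f2 + 20.6 ** 2) *
     Math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2)) *
     (f2 + 12194 ** 2));
  return 20 * Math.log10(ra) + 2.0;
}
```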

As for the constant-Q transform, I prefer the sliding DFT approach, which works best for real-time audio visualization, and there is even a paper about it
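
The per-bin update is very cheap: on each new sample, subtract the sample leaving the window, add the new one, and rotate by the bin's phase step. A sketch (class name and normalization are mine):

```js
class SlidingDFTBin {
  constructor(k, N) {
    this.N = N;
    this.buffer = new Float32Array(N); // circular buffer of the last N samples
    this.pos = 0;
    this.re = 0;
    this.im = 0;
    const w = (2 * Math.PI * k) / N;
    this.cos = Math.cos(w);
    this.sin = Math.sin(w);
  }
  update(x) {
    const old = this.buffer[this.pos]; // sample leaving the window
    this.buffer[this.pos] = x;
    this.pos = (this.pos + 1) % this.N;
    const re = this.re - old + x;
    const im = this.im;
    // complex multiply by e^(j*2*pi*k/N)
    this.re = re * this.cos - im * this.sin;
    this.im = re * this.sin + im * this.cos;
    return Math.hypot(this.re, this.im) / (this.N / 2); // magnitude
  }
}
```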


TF3RDL commented Dec 5, 2022

Here's the problem I realized before you implement the CQT: the Brown-Puckette method requires the real/imaginary parts of the spectrum, which AnalyserNode doesn't expose (getByteFrequencyData/getFloatFrequencyData only output logarithmic magnitude values). So it requires custom FFT functionality (which can be implemented with any FFT library, including ones like this that come bundled with FFT functions), and implementing the sliding CQT requires AudioWorklets, since it doesn't work well with getFloatTimeDomainData as the source of waveform data.
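
By AudioWorklets I mean roughly this kind of setup (sketch; the processor name and the CQT update call are placeholders):

```js
// sliding-cqt-processor.js
class SlidingCQTProcessor extends AudioWorkletProcessor {
  process(inputs, outputs, parameters) {
    const channel = inputs[0][0];
    if (channel) {
      for (let i = 0; i < channel.length; i++) {
        // feed each sample into the sliding CQT bins here, e.g.
        // updateSlidingCQT(channel[i]);
      }
      // bin magnitudes could be sent back via this.port.postMessage(...)
    }
    return true; // keep the processor alive
  }
}
registerProcessor('sliding-cqt', SlidingCQTProcessor);

// Main thread (sketch):
// await audioCtx.audioWorklet.addModule('sliding-cqt-processor.js');
// const node = new AudioWorkletNode(audioCtx, 'sliding-cqt');
// source.connect(node);
```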


hvianna commented Dec 10, 2022

@TF3RDL Thanks for following up on this!

For the next beta release, I've made some improvements to the linear amplitude mode and I'm finishing up the work on the weighting filters. I'll try to take a look at the perceptual scales next.


TF3RDL commented Dec 30, 2022

As for the custom FFT, this could allow non-power-of-two sizes, zero-padding, and the use of different FFT streams or even non-audio data as input (since a custom FFT doesn't depend on the Web Audio API), not just window functions, right?

I'm not sure about the performance impact of using a custom FFT instead of getByteFrequencyData/getFloatFrequencyData, but I do know that non-power-of-two FFTs are noticeably slower
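
Zero-padding is one way around that penalty: pad the windowed frame up to the next power of two before the FFT. A sketch:

```js
// Pad a frame with zeros up to the next power of two so a fast
// radix-2 FFT can still be used (at the cost of interpolated bins).
function zeroPadToPow2(frame) {
  const size = 2 ** Math.ceil(Math.log2(frame.length));
  const padded = new Float32Array(size); // initialized to zeros
  padded.set(frame);
  return padded;
}
```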


TF3RDL commented Apr 17, 2024

Of course, an analog-style analyzer mode (an IIR filter bank, no FFT required) might be better performance-wise, though I think it works best if you implement this type of non-FFT analyzer with a custom implementation (using AudioWorklets) rather than a bunch of BiquadFilterNodes each connected to its own AnalyserNode
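
One band of such a filter bank is just a bandpass biquad run per sample inside the worklet, followed by peak/RMS detection for the bar height. A sketch using the standard RBJ cookbook bandpass coefficients (names are mine):

```js
// Returns a per-sample bandpass filter centered at f0 (Hz) with quality Q.
function makeBandpass(f0, Q, sampleRate) {
  const w0 = (2 * Math.PI * f0) / sampleRate;
  const alpha = Math.sin(w0) / (2 * Q);
  const a0 = 1 + alpha;
  const b0 = alpha / a0, b1 = 0, b2 = -alpha / a0;
  const a1 = (-2 * Math.cos(w0)) / a0, a2 = (1 - alpha) / a0;
  let x1 = 0, x2 = 0, y1 = 0, y2 = 0;
  return function process(x) {
    const y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
    x2 = x1; x1 = x;
    y2 = y1; y1 = y;
    return y;
  };
}
```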


hvianna commented May 1, 2024

I need to work on making the rendering function more independent of WebAudio / FFT, but I worry that a generic solution might impact performance.

By the way, I really like the idea of fading peaks in your demo. I'll try adding these next!
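
Roughly what I have in mind for the peaks (just a sketch; field names and timing constants are made up):

```js
// Hold each bar's peak for a while, then fade it out.
function updatePeak(peak, barValue) {
  if (barValue >= peak.value) {
    peak.value = barValue;
    peak.hold = 30;  // frames to hold before fading
    peak.alpha = 1;  // fully opaque
  } else if (peak.hold > 0) {
    peak.hold--;
  } else {
    peak.alpha = Math.max(0, peak.alpha - 0.05); // fade out
    if (peak.alpha === 0) peak.value = 0;
  }
  return peak;
}
```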
