Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add feature to mute mixer channels with Inf/NaN output #7323

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

sakertooth
Copy link
Contributor

@sakertooth sakertooth commented Jun 16, 2024

Improves the handling of invalid output found within mixer channels. Invalid output is defined as any output inside the buffer that is an Inf or NaN.

This PR improves the performance of the sanitation itself by only considering if invalid output is found in the mixer channel and not in the effects individually, requiring only one loop across the buffer for sanitizing per period instead of by the number of effects in the chain. If an invalid output is found, the mute button will flash red, and a tooltip is added, explaining that the mixer channel was internally muted because of invalid output. And for those that do not want this behavior altogether for even more performance, an option is provided in the settings to disable it completely.

What makes this especially nice for investigating/debugging invalid output is that the user or developer can mute individual effects in the chain to see very clearly which one is the culprit. If all the effects are muted (or there are none in the chain), then that means that the invalid output was produced elsewhere in the code and fed into the mixer channel as input.

This PR also changes the nanhandler to be exposed and was renamed to muteinvalidoutput for more accuracy as to what the option does.

A demo of this is shown below. For testing purposes, I injected a NaN into Amplifier's output.

2024-06-24.14-16-47.mp4

@sakertooth sakertooth changed the title Remover mixer sanitation Remove mixer sanitation Jun 16, 2024
@sakertooth sakertooth force-pushed the remove-mixer-sanitation branch from 34cf3b5 to acfd40a Compare June 16, 2024 17:04
@zonkmachine
Copy link
Member

zonkmachine commented Jun 16, 2024

I believe it is worth trying to remove this because there is reason to believe a performance hit occurs when doing these operations.

Please, no. It's a good thing to try and optimize the code but this is not one of those cases. We still have cases where a delay/reverb is hit with a loud signal with a loud crash sound as a result as a reminder to what will come if we remove this. Complaints on loud noises will increase but that's not the worst. It's the users who can't figure out the various odd effects and just will drop out of using lmms, and perhaps music software completely. I've been that user who couldn't figure out why the mix went completely silent and just dropped out of using lmms for some two years.

The results are based on when I was playing the demo, not on idle. I also did not run into any odd experiences with Infs/NaNs. More testing may be necessary.

And the test results for the sanitizing disabled as is already possible? The setting is a hidden one but maybe we can make it a proper option in settings?

@sakertooth
Copy link
Contributor Author

sakertooth commented Jun 16, 2024

Please, no. It's a good thing to try and optimize the code but this is one of those cases. We still have cases where a delay/reverb is hit with a loud signal with a loud crash sound as a result as a reminder to what will come if we remove this. Complaints on loud noises will increase but that's not the worst. It's the users who can't figure out the various odd effects and just will drop out of using lmms, and perhaps music software completely. I've been that user who couldn't figure out why the mix went completely silent and just dropped out of using lmms for some two years.

I personally think if there's a issue with loud bangs or loud silence and it relates to NaNs popping up somewhere, we should fix it at the source, not put some wrapper function around it like a band-aid and move on. #7141 was a step in that direction. The right direction really. We should keep looking out for issues like these and continue fixing them when we can.

Another important thing to mention: Even with the sanitation in place, users STILL reported issues involving the mixer going silent. The EQ was muting output because of NaNs in the filters (reported at #7131). How do we fix that? We fix it at the root. This is what should've happened initially, and this sanitation really has no reason to be here if we fixed the root cause of the problem in the beginning.

(I just noticed that the issue I talked about where the EQ was muting output might've been the reason for why your mix was going slient. You should test this now. It might've been actually fixed this time around. The sanitation here would cause the mix to be muted if a NaN or Inf was found, which could have been the problem).

And the test results for the sanitizing disabled as is already possible? The setting is a hidden one but maybe we can make it a proper option in settings?

Not sure what you mean, but I removed this sanitation completely. It has no reason to be here if we fix what needs to be fixed.

@sakertooth
Copy link
Contributor Author

We still have cases where a delay/reverb is hit with a loud signal with a loud crash sound as a result as a reminder to what will come if we remove this.

Also important to note that in #7098, a user reports of a loud bang. So, we have this sanitation in place, and the intention was to have that functionality to avoid situations like this. But the problem exists even with this in place, so it makes you wonder what does this really fix?

@zonkmachine
Copy link
Member

But the problem exists even with this in place, so it makes you wonder what does this really fix?

Many of the instances of floating point errors will hand you a really large signal before reaching inf/NaN. This is why I opted to just nuke the whole buffer in these cases. It will handle most issues but not all. Also, some of the loud bangs aren't inf/NaN but just some other loud artifact, for these we have clipping of the signal to 1000.f.

else
{
	src[f][c] = std::clamp(src[f][c], -1000.0f, 1000.0f);
}

Why not allow both sanitizing the signal and then let the sound run free for the young and brave?

@sakertooth
Copy link
Contributor Author

sakertooth commented Jun 16, 2024

Many of the instances of floating point errors will hand you a really large signal before reaching inf/NaN. This is why I opted to just nuke the whole buffer in these cases. It will handle most issues but not all. Also, some of the loud bangs aren't inf/NaN but just some other loud artifact, for these we have clipping of the signal to 1000.f.

Okay, but I'm trying to tell you is that these aren't solutions. They are bandages covering the real problem. Setting the clipping of the signal to 1000.0f is a bandage (which is odd since for digital audio it goes from -1.0f to 1.0f). Nuking the whole buffer is a bandage (and also ended up being a problem itself). They simply cover up the real problem we need to address somewhere else entirely, even if its not because of an Inf or NaN.

Why not allow both sanitizing the signal and then let the sound run free for the young and brave?

Because sanitizing the buffer makes the processing slower, even if its a bit. If we can make this program's audio processing faster in any way, I see that as an immediate benefit.

@zonkmachine
Copy link
Member

Because sanitizing the buffer makes the processing slower, even if its a bit. I

It's quite a bit slower, yes. But my suggestion was, to instead make the nanhandler setting visible and known. Surely a single test is a reasonable trade off?

They are bandages

Right, I have heard those talking points before. They are not bandages but just a result of the final product meeting the real world. Things are simply messy out there.

They simply cover up the real problem we need to address somewhere else entirely, even if its not because of an Inf or NaN.

It's quite a lot of plugins that has glitches like that and this is the reason other hosts do sanitize the signal. Some plugins sanitize the signal too, especially delay based ones since these are the ones that allow the occasional large transients to loop around and grow to a bang.

which is odd since for digital audio it goes from -1.0f to 1.0f

I also found this odd but as it turns out, while the final sound file is mostly clipped to +/-1.0f, people like to break those numbers routinely. I think vst is mostly +/-1.0 but that may be wrong too.

Ps. This is related code that should be deleted if the nanhandler is nuked: https://github.com/LMMS/lmms/pull/4787/files

@zonkmachine
Copy link
Member

Let's ask @jasp00 . I think he's on your side by the way! :)

@zonkmachine
Copy link
Member

(I just noticed that the issue I talked about where the EQ was muting output might've been the reason for why your mix was going slient.

No. This was some twelve years ago but I think it was the Calf compressor that got me. It got me good though.

@zonkmachine
Copy link
Member

I tested this on some demos, results are shown below. The results are based on when I was playing the demo, not on idle. I also did not run into any odd experiences with Infs/NaNs. More testing may be necessary.

Demo Master: Effects % Master: Mixer % This PR: Effects % This PR: Mixer %
Alf42red - Mauiwowi 10%-11% 1% 4%-8% 1%
Greippi - Krem Kakkuja 3%-8% 10%-11% 2%-4% 5%-7%
unfa - Spoken 1% 3%-6% 1% 3%-6%
EsoXLB - CPU 0% 5%-6% 0% 3%-4%

Q - What method did you use to find out the load?
Q - Can you complement this list with Master: Effects % and Mixer % with nanhandler off? nanhandler="0"
The setting is somewhere in the top of your lmmsrc.xml. This is what the line looks like by default on my machine.

  <app displaydbfs="0" configured="1" nommpz="0" loopmarkermode="dual" nanhandler="1" language="en" disablebackup="0" 

This bypasses the sanitizer.

@tresf tresf mentioned this pull request Jun 18, 2024
@sakertooth
Copy link
Contributor Author

Q - What method did you use to find out the load?

I just used the CPU meter tooltip.

Can you complement this list with Master: Effects % and Mixer % with nanhandler off? nanhandler="0"
The setting is somewhere in the top of your lmmsrc.xml. This is what the line looks like by default on my machine.

Sure. I'll let you know when I'm done. Or, if you want to see for yourself, you can also try.

@softrabbit
Copy link
Member

How about making the sanitation faster? A little something I threw together over my morning coffee, but I think it might optimize better, and at the very least be faster for the common case, i.e. the buffer not containing any Inf/Nan.

finite = false
for(i=0; i<frames*channels; ++i) {
   finite &= std::isfinite(src[i];
   src[i] = std::clamp(src[i], -1000.0f, 1000.0f);
}
if(!finite) {
  // memset or whatever the buffer to 0 
 return true
}
return false

@sakertooth
Copy link
Contributor Author

sakertooth commented Jun 20, 2024

How about making the sanitation faster? A little something I threw together over my morning coffee, but I think it might optimize better, and at the very least be faster for the common case, i.e. the buffer not containing any Inf/Nan.

We can most certainly make it faster, but its even more faster to do nothing at all. I think doing nothing at all is the way to go in Release builds, even if we decide we don't want to remove it entirely, but only enable the checks for debugging (suggestion by @rdrpenguin04 on Discord).

Only enabling them debugging saves us the performance hit, but we also get the safety checks if we want to take a look more closely. Of course, if we intend on keeping sanitation, we should definitely optimize that function, no doubt.

Still, this suggestion still has some issues though. We still have the problem of checking if Infs and NaNs are being generated elsewhere in the code besides the mixing stage, so this sanitation wouldn't be able to cover those cases.

The better solution is to still remove sanitation completely, but use other tools out there to debug an Inf or NaN situation in the event that we suspect we are dealing with them. This will help to actually get to resolving the bug, no performance hit for regular users that did not ask for it, less code debt, etc.

There are tools such as MemorySanitizer and Valgrind. From what I've seen, most of the time we ran into an Inf or Nan was because we did not initialize our variables, which is what these two can help with. For example, in Valgrind's case, it will report issues involving unitialized variables and actually take us to the problem directly in the code through its stacktrace.

TLDR: Still remove sanitation, but use debugging tools to actually fix the root cause.

Edit: There is also UndefinedBehaviorSanitizer for more specific UB checks that can help find an Inf or NaN.

@sakertooth
Copy link
Contributor Author

Got around doing the nanhandler=0 test for master @zonkmachine:

Demo Master: Effects % Master: Mixing %
Greppi - Krem Kakkuja 2%-5% 6%-7%
Alf42red - Mauiwowi 4% 0%
unfa - Spoken 1% 4%
EsoXLB - CPU 0% 4%

So basically no difference from a performance standpoint.

@sakertooth
Copy link
Contributor Author

Changing this PR per this comment.

@sakertooth sakertooth marked this pull request as draft June 24, 2024 00:51
@tresf
Copy link
Member

tresf commented Jun 25, 2024

This PR also changes the nanhandler to be exposed and was renamed to muteinvalidoutput for more accuracy as to what the option does.

@sakertooth Can you help explain how this name is more accurate?

Existing behavior:

  • With nanhandler off, it implicitly and unobviously mutes the entire mixer due to math.
  • With nanhandler on, it silently sanitizes the values and plays properly, no warning or notice to the user (except, perhaps, in the logs) that the issue occurs.
  • The name is slightly ambiguous but accurate.

With the new behavior, does the muteinvalidoutput change the nanhandler behavior to mute the problematic plugin/mixer channel?

Edit: 1: The video does show the plugin getting muted, so I think that's what happens.

If so, that may not be desired. I also feel that this name may add further ambiguity as to the symptom being similar to the """"bug"""".

Edit: 2 I forgot to say "Thanks". Thanks!

@tresf

This comment was marked as outdated.

@sakertooth
Copy link
Contributor Author

sakertooth commented Jun 25, 2024

With nanhandler on, it silently sanitizes the values and plays properly, no warning or notice to the user (except, perhaps, in the logs) that the issue occurs.

Your choice of words don't sit right with me here. Nothing about playing/muting a corrupted buffer should be considered "the project is playing properly". In fact, the existing behavior I believe mutes the entire buffer when invalid output is found in it. So are you sure you aren't just hearing the rest of the song? That isn't it playing properly at all.

The name is slightly ambiguous but accurate.

Feel free to suggest a better name. I like it over nanhandler at least because it acts as a verb, while nanhandler is a noun. And I think its clear from its name what the option will do. I also don't see an issue with this option being separate from the old nanhandler option.

With the new behavior, does the muteinvalidoutput change the nanhandler behavior to mute the problematic plugin/mixer channel?

What happens here is that the mixer channel processes its audio normally, then when everything is said and done, it has the option (if the user permits) to look for invalid output. If invalid output is found, it mutes the buffer and emits a signal to the UI to flash its mute button red, saying that it was forcefully muted internally because of invalid output.

The current behavior achieves the same thing (besides the visual indicator), but with a lot more work. It checks the audio input first for invalid values, and checks the buffer after processing each effect. If at any point it finds an Inf or NaN, it mutes the buffer and continues processing. There is no reason to continue the processing after muting the buffer, because that effectively would mute the entire effect chain from the issue (whether the issue is a faulty plugin or some audio input that was fed into the mixer channel) onwards. Simply a waste of CPU time.

TLDR: They both mute the mixer channel, but the current behavior does so in a suboptimal way, and doesn't provide the visual indicator of course.

NB: I also want to clarify that this invalid output handling should not be considered a fix for anything, but a tool to help find and eventually fix Infs or NaNs appearing in the signal chain. It seems as if the current sanitation was implemented as a fix, when it very evidently isn't. Resolving the issue for why the Inf or NaN appears and removing it is a fix.

I forgot to say "Thanks". Thanks!

No problem. 👍

@tresf
Copy link
Member

tresf commented Jun 26, 2024

In fact, the existing behavior I believe mutes the entire buffer when invalid output is found in it

The code chooses a "sane" value when an invalid value is found. In testing, this sounds rather fine. These may be audible under certain circumstances, but with the zita-reverb plug-in, they sound fine enough for casual listening. The modified buffers are not noticable, at least not to my ear on first listen. The reverb is clearly audible and "working enough" for the ear to think it's working. This makes sense because the bug report was marked as resolved, consistent with my findings.

This PR, as I understand it, mutes the plugin, breaking the sound.

I don't want to minimize the importance of notifying a user when a plugin is misbehaving, I hope that's not misconstrued, but muting the plugin is a regression to the user -- the person -- without knowledge of how to fix this plugin ( because projects in older versions of LMMS could play and sound "normal" prior to this patch ) don't just magically patch old broken plugins because they're muted. Developers may choose to, but this has still proven to be a rabbit hole. In the case of zita-reverb, I'm not even certain where the upstream code is considered to live to file such a bug report.

I've been thinking how to represent this to the user interface and I believe there should be some type of notification and the plugin should go to an intermediate state (e.g. not green, not red, but something else perhaps purple or yellow) and the user can chose -- if they wish -- to remove, mute or replace the problematic plugin.

With regards to the name of the feature (verb, noun or otherwise) it will be better defined if we can agree on what it should be doing, which still seems to be in disagreement.

@sakertooth
Copy link
Contributor Author

The code chooses a "sane" value when an invalid value is found

                        if( std::isinf( src[f][c] ) || std::isnan( src[f][c] ) )
			{
				#ifdef LMMS_DEBUG
					// TODO don't use printf here
					printf("Bad data, clearing buffer. frame: ");
					printf("%d: value %f\n", f, src[f][c]);
				#endif
				for( int f = 0; f < frames; ++f )
				{
					for( int c = 0; c < 2; ++c )
					{
						src[f][c] = 0.0f;
					}
				}
				found = true;
				return found;
			}
			...

Does this not mute the entire buffer? Also, from @zonkmachine, the commit involving this code reads:

* If we find NaN/inf, we declare the whole buffer bad and set it to 0.0f. This
is because the noise leading up to, or coming from, an infinite or NaN value
is often very large and will create problems later in the sound chain. Especially
if it hits a delay based fx with feedback.

It's the same thing, the only difference is that I just mute the buffer at the very end if an Inf/NaN is found. I understand your concerns about a regression, but AFAICT, the behavior is the same. It might be helpful to test the PR yourself for the faulty project, I may have misunderstood something.

With regards to the name of the feature (verb, noun or otherwise) it will be better defined if we can agree on what it should be doing, which still seems to be in disagreement.

muteinvalidoutput mutes invalid output from a mixer channel.

@tresf
Copy link
Member

tresf commented Jun 26, 2024

Does this not mute the entire buffer?

You're right, I retested and it does. Sorry for any confusion.

What's particularly odd about this bug -- unfa-Dancefloor-04.mmpz.zip -- is that the kick gets silenced, not the snare, (both have the same reverb plugin applied) and it only occurs on a fresh load of LMMS. When reloading the project, it completely masks the bug, which is likely why it was hard for you to reproduce and why some of my tests sounded "fine".

TLDR: They both mute the mixer channel, but the current behavior does so in a suboptimal way, and doesn't provide the visual indicator of course.

I agree with this now.

This PR also changes the nanhandler to be exposed and was renamed to muteinvalidoutput for more accuracy as to what the option does.

With regards to an upgrade handler, I believe this nanhandler flag never landed on a stable release, so I won't fight for the upgrade routine. I'll mark that comment as resolved..

Edit: Found it, it seems to have landed on a stable release: b28d405. It's a hidden flag, so it's probably safe to rename without an upgrade routine but we would normally upgrade these regardless, especially if it's a 1:1 upgrade. I'm fine either way.

@sakertooth
Copy link
Contributor Author

I think I'm done here.

@tresf
Copy link
Member

tresf commented Jun 27, 2024

I'm basing my review solely from a usability perspective.

  • Upon retesting KILLER BUG: Inifinitely Loud Silence (ILS or rather NaN) #1048, the test fails. The master channel is muted and the track is unplayable.
    • This is still preferable to the old nanhandler=0 behavior because this PR introduces a visual indicator as to WHY the entire track goes silent, which is a huge improvement.
    • This is a regression to the old nanhandler=1 behavior, where the track would still play all other channels.
  • I cannot observe a difference between muteinvalidoutput=1 and muteinvalidoutput=0.
  • I cannot reproduce the described, expected behavior where the problematic plugin will become muted.

@sakertooth at one point, the problematic plugin was advertised as being muted, but this seems to have changed. My thoughts in previous conversations were based on this assumption. If the behavior changes, those comments become outdated.

@sakertooth sakertooth force-pushed the remove-mixer-sanitation branch from 56dc9cd to 5a6e20c Compare June 27, 2024 22:30
@sakertooth
Copy link
Contributor Author

@sakertooth at one point, the problematic plugin was advertised as being muted, but this seems to have changed. My thoughts in previous conversations were based on this assumption. If the behavior changes, those comments become outdated.

I restored the previous behavior of this PR, sorry for any confusion. I was still considering potentially other routes the implementation can go in. The current behavior is: if invalid output (Inf/NaN) is found in the mixer channel's buffer at the very end, the mixer channel is muted (this goes for any of the mixer channels, not just the master channel), and a visual indicator is presented. I'm looking to get this merged soon since its still an improvement over the old behavior.

@sakertooth sakertooth changed the title Improve handling of invalid output Add feature to mute mixer channels with Inf/NaN output Jun 27, 2024
@tresf
Copy link
Member

tresf commented Jul 16, 2024

@sakertooth thanks for all the work on this. This does exactly as described. I consider this an improvement over existing behavior.

An enhancement to this functionality would be to display some type of alert when the issue occurs, but the logs do show some helpful information, so I would consider this ready.

image

@tresf tresf self-requested a review July 16, 2024 15:27
Copy link
Member

@tresf tresf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved (testing and behavior).

@sakertooth
Copy link
Contributor Author

An enhancement to this functionality would be to display some type of alert when the issue occurs, but the logs do show some helpful information, so I would consider this ready.

It seems like I did away with the logging recently (not exactly sure why). I'll add it back since it is helpful and think of merging after if no one objects.

@sakertooth
Copy link
Contributor Author

There's an issue with the connections not applying to the right channels, causing some channels to report having invalid output when there isn't. I'll have to fix this issue before merge.

@sakertooth sakertooth marked this pull request as draft July 19, 2024 22:56
Since I realized that this can cause a lot of unwanted
spam (updates to the invalid output state can happen even when just moving mixer channels around), I am removing this again. Having some text indiciator in the GUI would be a better solution.
@sakertooth sakertooth marked this pull request as ready for review September 22, 2024 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants