ASoC: SOF: Fix suspend while paused corner case #5054

ujfalusi (Collaborator) commented Jun 11, 2024

Hi,

Summary: this PR fixes the suspend-while-paused corner case for both IPC3 and IPC4, which was locking up Intel platforms.

Suspending while a stream is paused has likely been broken for several years (?). I cannot pinpoint the exact commit, but several underlying issues contribute to the broken corner case:

  • the link DMA of the paused stream was stopped in the wrong order (set_hw_params_upon_resume did that)
  • IPC3 did not stop the host DMA on suspend if there was a paused stream.
  • IPC4 pipelines were not reset because they were in the paused state, so unbind/free failed and the whole stack got confused.

This PR replaces #5040 with a more complicated, in-SOF fix:

  • replace set_hw_params_upon_resume() with suspend_early() and call it before we tear down the remaining pipelines:
  • the stream is paused, so we need to follow the same sequence as when we suspend with active audio
  • reset the started/paused counts when we suspend the paused pipelines
  • stop the host DMA with IPC3 if we suspend while we have a paused stream.

With this PR the system no longer locks up, and the paused aplay/arecord can be PAUSE_RELEASEd after system resume.
Since we don't support the RESUME trigger, the stream is going to be restarted (as with active audio), but things now work correctly.

Fixes: #5035

ujfalusi added 8 commits June 11, 2024 10:29
…in dai_suspend

The TRIGGER_SUSPEND sequence is:
pre_trigger()
trigger()
post_trigger()
hda_link_dma_cleanup()

In hda_dai_suspend() we only call post_trigger() and the link cleanup;
do so in the same order as it would have been during a trigger.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
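
The ordering argument in this commit message can be illustrated with a small, self-contained model. This is only a sketch: `model_step`, `call_log` and the `model_*` functions are hypothetical stand-ins that record which step ran. They are not the real SOF/HDA functions and perform no hardware access.

```c
#include <stddef.h>

/* Hypothetical labels for the four steps named in the commit message. */
enum model_step {
	STEP_PRE_TRIGGER,
	STEP_TRIGGER,
	STEP_POST_TRIGGER,
	STEP_LINK_DMA_CLEANUP,
};

/* Records the order in which the modelled steps were executed. */
struct call_log {
	enum model_step steps[8];
	size_t count;
};

static void model_record(struct call_log *log, enum model_step step)
{
	if (log->count < sizeof(log->steps) / sizeof(log->steps[0]))
		log->steps[log->count++] = step;
}

/* Model of the full TRIGGER_SUSPEND sequence listed above. */
static void model_trigger_suspend(struct call_log *log)
{
	model_record(log, STEP_PRE_TRIGGER);
	model_record(log, STEP_TRIGGER);
	model_record(log, STEP_POST_TRIGGER);
	model_record(log, STEP_LINK_DMA_CLEANUP);
}

/*
 * Model of the DAI suspend path: only the last two steps run, in the
 * same relative order as in the full trigger sequence.
 */
static void model_dai_suspend(struct call_log *log)
{
	model_record(log, STEP_POST_TRIGGER);
	model_record(log, STEP_LINK_DMA_CLEANUP);
}
```

Comparing the two logs shows that the suspend path is meant to be the tail of the trigger path, with the post-trigger step first and the link DMA cleanup last.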
…/SUSPEND

The same rule applies to the paused_count as to the started_count: on
STOP/SUSPEND both need to be reset to 0.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
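
A minimal sketch of the rule stated in this commit message, assuming a simplified pipeline structure that carries only the two counters. The `model_trigger` enum and `model_post_trigger()` are hypothetical names for illustration; the real code acts on the SNDRV_PCM_TRIGGER_* commands and does considerably more work.

```c
/* Hypothetical trigger commands; only STOP and SUSPEND matter here. */
enum model_trigger {
	MODEL_TRIGGER_STOP,
	MODEL_TRIGGER_SUSPEND,
	MODEL_TRIGGER_OTHER,
};

/* Only the two counters discussed in the commit message. */
struct model_pipeline {
	int started_count;
	int paused_count;
};

static void model_post_trigger(struct model_pipeline *pipe,
			       enum model_trigger cmd)
{
	switch (cmd) {
	case MODEL_TRIGGER_STOP:
	case MODEL_TRIGGER_SUSPEND:
		/* Nothing is running or paused after STOP/SUSPEND. */
		pipe->started_count = 0;
		pipe->paused_count = 0;
		break;
	default:
		/* Other commands are out of scope for this sketch. */
		break;
	}
}
```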
The new suspend_early() can be used by platform code to prepare for the
system suspend and execute tasks needed to be ready to handle the suspend
call.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
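
How an optional early callback could slot in ahead of the pipeline teardown is sketched below. Everything here is an assumption made for illustration: the `model_ops` table, `model_dev` fields, `example_suspend_early()` and the error handling are hypothetical and do not reproduce the actual SOF ops structure or pm.c code.

```c
#include <stddef.h>

struct model_dev;

/* Hypothetical ops table; suspend_early is optional and may be NULL. */
struct model_ops {
	int (*suspend_early)(struct model_dev *dev);
	int (*suspend)(struct model_dev *dev);
};

struct model_dev {
	const struct model_ops *ops;
	int link_dma_stopped;         /* set by the early callback */
	int pipelines_torn_down;      /* set by the generic teardown step */
	int teardown_saw_dma_stopped; /* what the teardown step observed */
};

/* Example platform callback: stop the DMA of paused streams first. */
static int example_suspend_early(struct model_dev *dev)
{
	dev->link_dma_stopped = 1;
	return 0;
}

static void model_tear_down_pipelines(struct model_dev *dev)
{
	dev->teardown_saw_dma_stopped = dev->link_dma_stopped;
	dev->pipelines_torn_down = 1;
}

static int model_suspend(struct model_dev *dev)
{
	int ret;

	/* Give the platform a chance to prepare before the teardown. */
	if (dev->ops->suspend_early) {
		ret = dev->ops->suspend_early(dev);
		if (ret < 0)
			return ret;
	}

	model_tear_down_pipelines(dev);

	if (dev->ops->suspend)
		return dev->ops->suspend(dev);

	return 0;
}
```

The point of the sketch is the ordering: a platform that provides the callback gets its DMA stopped before the pipelines are torn down, while a platform that leaves it NULL sees no change in behaviour.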
…set_hw_params_upon_resume

The function 'hda_dsp_set_hw_params_upon_resume' has lost its purpose, as
we are not doing anything remotely resembling what its name implies.
The only thing we do is stop any paused streams (paused streams will
not receive the suspend trigger and would block the suspend in hardware).

The call sequence in pm.c is also wrong for the set_hw_params_upon_resume()
since we would need to make sure that the link DMA is stopped _before_ we
tear down the pipelines belonging to the paused stream.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
The set_hw_params_upon_resume() was used by Intel platforms. Over time its
purpose changed, and the Intel code has finally been converted to use the
suspend_early() callback instead.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
…spend

Introduce a new flag to mark a stream during system suspend and if it is
set, send the STOP trigger for the platform driver to stop the DMA.

This is needed in case the platform_stop_during_hw_free is not true (IPC3)
and the suspend happens when a stream is in paused state.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
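
The decision described in this commit message can be modelled as a single condition. This is a sketch under stated assumptions: `model_stream` and `model_stream_free()` are hypothetical, and the STOP trigger is represented by a call counter rather than a real platform call.

```c
#include <stdbool.h>

/* Hypothetical per-stream state for this sketch. */
struct model_stream {
	bool suspending;         /* marked during system suspend */
	int platform_stop_calls; /* how often the platform STOP was sent */
};

/*
 * Returns true if the platform STOP trigger was sent while freeing
 * the stream. The suspending flag forces the stop even when
 * platform_stop_during_hw_free is false (the IPC3 case above).
 */
static bool model_stream_free(struct model_stream *stream,
			      bool platform_stop_during_hw_free)
{
	if (platform_stop_during_hw_free || stream->suspending) {
		stream->platform_stop_calls++;
		return true;
	}

	return false;
}
```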
…on system suspend

The suspending flag will force the generic code to stop the platform DMA
of the paused stream.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
…ams correctly

During system suspend while we have paused streams we need to make sure
that the pipelines are properly reset.
Since the stream is in paused state, the pipelines are also in paused
state, which is correct for the RESET transition.
Reset the started/paused counters to indicate that the pipeline will need
to be initialized after resume and PAUSE_RELEASE.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
*/
swidget->spipe->started_count = 0;
swidget->spipe->paused_count = 0;
Member

what would be the behavior on resume then? The stream would restart? How can we keep the stream paused on resume?

Collaborator Author

we cannot keep the streams paused, we don't support RESUME trigger, so all streams would be restarted anyways, paused or not paused.
In practice: on resume the audio will remain paused (will not run), but when you PAUSE_RELEASE it, then we will restart it. We need to re-initialize everything to get working audio.

Member

How will the audio remain paused if the paused_count remains zero? I don't understand how this counter is used.

Collaborator Author

This counter is used internally by IPC4 (to track the individual pipeline states); ALSA keeps track of the PCM state. On suspend we need to stop everything so that we can power the DSP off.
So, when we suspend, all pipelines will be reset.

Collaborator

@ujfalusi this is too confusing. We increment the paused_count when we pause the stream. So let's assume this sequence:

  1. Start the stream: started_count = 1
  2. Pause the stream: paused_count = 1
  3. Suspend the system: the stream doesn't get the suspend trigger, so the started/paused counts remain at 1
  4. hda_dai_suspend() gets invoked, so both counts get reset to 0.
  5. Resume the system
  6. Pause_release the stream: this will decrement the paused_count to -1?

Am I understanding this correctly?

@@ -210,6 +210,16 @@ static int sof_suspend(struct device *dev, bool runtime_suspend)
if (runtime_suspend && !sof_ops(sdev)->runtime_suspend)
return 0;

/* Prepare the DSP for system suspend */
Member

What would be the difference with a prepare callback, as we do for SoundWire to go top down and make sure all parent/children are properly pm_runtime resumed before the system suspend?

Collaborator Author

The prepare callback happens before ALSA sends the suspend trigger (the trigger is also sent in a prepare callback); we use PM callbacks, not ASoC component callbacks.

Member

I was talking about the pm .prepare callback, which is called before suspend. I was not referring to prepare as a follow-up to hw_params; the naming is confusing.

Collaborator Author

Yes, that is exactly what I tried first. It does not work: it comes before the trigger suspend and breaks suspend with active audio.

Member

Fair enough. There should still be a more detailed description of what this callback should implement vs. what needs to remain in the suspend callback proper.
Last time, when we introduced probe_early, it was rather straightforward, as it included all the features that could not be done in a workqueue. A suspend_early isn't very clear.

pipeline->state, spipe->started_count, spipe->paused_count);

spipe->started_count = 0;
spipe->paused_count = 0;
Member

I am still not clear on when this paused_count would be restored when resuming back to a PAUSED state.

Collaborator Author

Right, so we have moved the pipelines to the PAUSED state, their started_count and paused_count are not 0, and then we suspend the system.
We need to move the pipelines to the RESET state to be able to UNBIND and then DELETE them. This is not optional, this is a must.
Once the DSP is powered down we are not going to be able to PAUSE_RELEASE (not that we can do that correctly by PAUSE PUSH/RELEASE alone), so we need to reset these counters to 0: nothing will be running or paused in firmware when the system resumes.

As for the PAUSE_RELEASE, ALSA will notice that we don't support RESUME, so it will re-start the stream.

Collaborator Author

I think this has always been semi-broken. We had errors on suspend while paused, then we had errors on pause release and more errors when user space gave up; then we rpm suspended and reset the counters, and on the next try things would work.

Member

My question was "when is paused_count increased again on resume" ?

Collaborator Author

If we receive PAUSE_PUSH. After system resume ALSA will not touch a paused PCM; from ALSA's point of view PAUSED == SUSPENDED, and a PAUSED stream can only be changed via PAUSE_RELEASE (even a stop needs an intermediate release).

Since we don't support RESUME (from suspend), when user sends the trigger PAUSE_RELEASE then ALSA will return with error and the stream is restarted.

The point is: when we resume from suspend we don't have any active pipelines in the DSP, so no pipeline can have a started/paused counter other than 0.

Member

So the pipeline is paused but not paused, depending on which state machine you're looking at, and you're relying on userspace to deal with errors?
Humm... this seems a bit weird if I am honest; we end up with a non-symmetrical state after resume.

Collaborator Author

The PCM stream (in ALSA terms) is paused, but the pipelines (in IPC4 terms) are stopped (started and paused count == 0).
If the PCM is paused and the system suspends, we do not get any triggers to react to, so we end up with stale counters. We don't have a PAUSED->SUSPENDED state transition handler (neither does ALSA), but:
RUNNING->SUSPENDED is started_count-- (paused_count is 0), and if both are 0 then we send the IPCs and stuff
PAUSED->RUNNING is paused_count-- (started_count is > 0)
so
PAUSED->SUSPENDED is rightfully started_count--, paused_count--, unless they were already 0. Hrm, what if we have multiple paused streams with common pipelines? The first one will clear the shared portion, the second will clear its own.
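
The counter bookkeeping reasoned about in this comment can be written down as a toy model. It is a sketch of the transitions as described here, assuming decrements that stop at 0; it is not the code in this PR (which resets both counters to 0 outright), and `model_pipe` and the `model_*` helpers are hypothetical names.

```c
/* Hypothetical pipeline state: just the two counters. */
struct model_pipe {
	int started_count;
	int paused_count;
};

/* Decrement a counter, but never below 0. */
static void model_dec(int *count)
{
	if (*count > 0)
		(*count)--;
}

/* RUNNING->SUSPENDED: one user of the pipeline stops running. */
static void model_running_to_suspended(struct model_pipe *pipe)
{
	model_dec(&pipe->started_count);
}

/* PAUSED->RUNNING: one user of the pipeline is released from pause. */
static void model_paused_to_running(struct model_pipe *pipe)
{
	model_dec(&pipe->paused_count);
}

/* PAUSED->SUSPENDED: drop both counts, unless they are already 0. */
static void model_paused_to_suspended(struct model_pipe *pipe)
{
	model_dec(&pipe->started_count);
	model_dec(&pipe->paused_count);
}
```

With a pipeline shared by two paused streams (both counters at 2), each stream's PAUSED->SUSPENDED transition lowers the counts by one, so the shared section only reaches 0 after the second stream has been handled.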

Collaborator Author

Right, multiple paused streams with common pipelines are indeed broken (hw:0,0 and hw:0,31 on an HDA machine).

Collaborator Author

But to be fair, without this PR the whole system just locks up, so this is a bit better than what we have at the moment.

Collaborator Author (ujfalusi, Jun 11, 2024)

So, back again to the same page: #5040 is fixing all of these issues without any side effect:

  • single paused stream
  • two paused streams with shared pipeline section

hext_stream,
cpu_dai);
if (ret < 0)
return ret;
Member

I can't recall if this was an editing mistake or intentional. @ranj063 would you happen to remember this?

Collaborator

@plbossart I don't think the order really matters in this case. All we do in the post-trigger op for suspend is clear the count for the pipeline.

sound/soc/sof/intel/hda-common-ops.c (review thread resolved)

ujfalusi, Collaborator Author

Closing, replaced by: #5058

Successfully merging this pull request may close these issues.

[HD-A] System does not wake up Playback/Capture-> pause -> suspend->resume scenario
3 participants