JSL:BUG: DSP boot failed after system resume due to memory alloc failed : -12 #5161

Open
Vamshigopal opened this issue Aug 30, 2024 · 10 comments


@Vamshigopal

Describe the bug
On a JSL Chromebook, when the system runs low on memory, a memory allocation fails (-12) and the DSP fails to boot after system resume.

To Reproduce

Boot the Chromebook
Restrict the system memory to 4 GB
Run memory-intensive workloads
Run suspend_stress_test in parallel

Reproduction rate

Very rare, but frequently reported from the field.

Environment
Kernel Branch: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/refs/tags/v6.1.105
Platform: JSL

Logs:
dmesg (6).txt

Screenshots or console output:

[ 136.436831] sof-audio-pci-intel-icl 0000:00:1f.3: error: memory alloc failed: -12
[ 136.436865] sof-audio-pci-intel-icl 0000:00:1f.3: error: dma prepare for fw loading failed
[ 136.436869] sof-audio-pci-intel-icl 0000:00:1f.3: ------------[ DSP dump start ]------------
[ 136.436871] sof-audio-pci-intel-icl 0000:00:1f.3: Failed to start DSP
[ 136.436874] sof-audio-pci-intel-icl 0000:00:1f.3: fw_state: SOF_FW_BOOT_IN_PROGRESS (3)
[ 136.436897] sof-audio-pci-intel-icl 0000:00:1f.3: 0xffffffff: unknown ROM status value
[ 136.436926] sof-audio-pci-intel-icl 0000:00:1f.3: extended rom status: 0xffffffff 0xffffffff 0xffffffff 0xffffffff 0xffffffff 0xffffffff 0xffffffff 0xffffffff
[ 136.436929] sof-audio-pci-intel-icl 0000:00:1f.3: ------------[ DSP dump end ]------------
[ 136.436932] sof-audio-pci-intel-icl 0000:00:1f.3: error: failed to boot DSP firmware after resume -12
[ 136.436937] sof-audio-pci-intel-icl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_pm_runtime_get on 0000:00:1f.3: -12
[ 136.436942] SSP1-Codec: ASoC: error at __soc_pcm_open on SSP1-Codec: -12
[ 136.436946] Speakers: ASoC: error at dpcm_be_dai_startup on Speakers: -12
[ 136.436950] Speakers: ASoC: error at dpcm_fe_dai_startup on Speakers: -12
[ 137.461411] sof-audio-pci-intel-icl 0000:00:1f.3: ASoC: error at snd_soc_pcm_component_pm_runtime_get on 0000:00:1f.3: -22
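The trailing numbers in the log above are negated Linux errno values. As a reading aid, here is a minimal sketch that pulls the trailing code from a log line and maps it to its symbolic name. The helper names `trailing_error` and `decode_errno` are illustrative only and are not part of the SOF driver or any kernel tool; the mapping relies on the host's Python errno table, which matches the kernel's numbering for these two codes on Linux.

```python
import errno
import os
import re

# A negative integer at the very end of a line, e.g. "... failed: -12".
TRAILING_CODE = re.compile(r"\s(-\d+)\s*$")


def trailing_error(line):
    """Return the trailing negative error code of a log line, or None."""
    match = TRAILING_CODE.search(line)
    return int(match.group(1)) if match else None


def decode_errno(code):
    """Map a (negated) errno value to (symbolic name, description)."""
    number = abs(code)
    return errno.errorcode.get(number, "UNKNOWN"), os.strerror(number)


# Two lines copied from the log above.
sample = [
    "sof-audio-pci-intel-icl 0000:00:1f.3: error: memory alloc failed: -12",
    "sof-audio-pci-intel-icl 0000:00:1f.3: ASoC: error at "
    "snd_soc_pcm_component_pm_runtime_get on 0000:00:1f.3: -22",
]

for line in sample:
    code = trailing_error(line)
    if code is not None:
        name, text = decode_errno(code)
        print(f"{code}: {name} ({text})")
```

Here -12 decodes to ENOMEM (the allocation failure itself) and -22 to EINVAL, which is what later open attempts report once the DSP has failed to boot.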

@Vamshigopal
Author

@plbossart @ujfalusi @kv2019i

@plbossart
Member

I am not sure what you would expect from the audio driver @Vamshigopal. The system has a problem or some sort of memory leak. There's not much we can do here.

IIRC JSL does not rely on the IMR boot, maybe that's part of the problem. The firmware would be too old to make use of enhanced capabilities.

@Vamshigopal
Author

Yes @plbossart, I understand there is not much we can do if there is no memory left. But once the DSP boot has failed, we cannot recover the DSP without rebooting the whole system, even after the system has sufficient memory again. This is a problematic experience for the user. The issue occurs more frequently on systems with 4 GB of RAM.

What the customer is asking of the audio driver: is there a way to upgrade the flags passed to the allocation so that it never fails? I understand that if we try too hard we hit the watchdog, while giving up breaks audio.

Yes, IMR is supported starting from the CAVS2_5-001-drop-stable branch, so we can't support it on the JSL platform.

@ujfalusi
Collaborator

This is 6.1 kernel, right? I think there were some fixes, improvements to make this allocation failure more rare.
It is also possible that some backported patch broke the allocation logic (afaik we had that in the past).

If we cannot allocate memory for the firmware download, then that is the least of the system's problems; it will fail in all sorts of ways humanly possible.

@kv2019i
Collaborator

kv2019i commented Aug 30, 2024

@Vamshigopal I think this is the same issue as we had with ADL-N -> #3915 and #3844

For the latter, we submitted

commit a61c7d88d38cf3b9c88cf667c4f8a389a57744d4
Author: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Date:   Fri Sep 23 18:35:01 2022 +0300

    ALSA: memalloc: use __GFP_RETRY_MAYFAIL for DMA mem allocs

to fix the case. There is a risk the issue can come back, as the solution of trying harder in the audio driver seems to trigger issues in other cases (in low-memory conditions).

@Vamshigopal
Author

> This is 6.1 kernel, right? I think there were some fixes, improvements to make this allocation failure more rare. It is also possible that some backported patch broke the allocation logic (afaik we had that in the past).

Yes, this is the 6.1 Chrome kernel (https://chromium.googlesource.com/chromiumos/third_party/kernel/+/refs/tags/v6.1.105). If you know of any fixes, please share them and I can check whether they are part of the kernel.

@ujfalusi
Collaborator

@Vamshigopal, do you have a timeline for when the reports started to come in, or has this been there since launch?

@Vamshigopal
Author

> @Vamshigopal, do you have a timeline for when the reports started to come in, or has this been there since launch?

According to the customer, they started seeing this issue once they migrated the kernel from v5.4 to v6.1.
v5.4 https://chromium.googlesource.com/chromiumos/third_party/kernel/+/refs/heads/chromeos-5.4
v6.1 https://chromium.googlesource.com/chromiumos/third_party/kernel/+/refs/heads/chromeos-6.1

They didn't see this issue on the v5.4 kernel, only on v6.1.

@Vamshigopal
Author

> @Vamshigopal I think this is the same issue as we had with ADL-N -> #3915 and #3844

Yes @kv2019i, we have similar issues on ADL-N and ADL, but one difference here is that JSL doesn't have IMR.

@kv2019i
Collaborator

kv2019i commented Aug 30, 2024

Ack @Vamshigopal, not using IMR makes this condition easier to hit. tiwai@a61c7d8 is worth a try on top of v6.1.
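To check whether a given kernel tree already carries the commit referenced in this thread, one option is to search the history by commit subject rather than by hash, since backports and cherry-picks usually get a new hash. The following is a rough sketch under that assumption; the helper names (`git_log_cmd`, `parse_oneline`, `find_commit`) are made up for illustration, and the repository path is whatever local checkout you point it at.

```python
import subprocess

# Subject line of the commit quoted earlier in this thread.
SUBJECT = "ALSA: memalloc: use __GFP_RETRY_MAYFAIL for DMA mem allocs"


def git_log_cmd(repo, subject, rev="HEAD"):
    """Build a `git log` command that searches commit messages literally."""
    return [
        "git", "-C", repo, "log", "--oneline",
        "--fixed-strings", f"--grep={subject}", rev,
    ]


def parse_oneline(output):
    """Split `git log --oneline` output into (abbreviated hash, subject) pairs."""
    hits = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        sha, _, subject = line.partition(" ")
        hits.append((sha, subject))
    return hits


def find_commit(repo, subject=SUBJECT, rev="HEAD"):
    """Return matching commits reachable from `rev` in the checkout at `repo`."""
    result = subprocess.run(
        git_log_cmd(repo, subject, rev),
        capture_output=True, text=True, check=True,
    )
    return parse_oneline(result.stdout)


# Example (path is a placeholder for a local kernel checkout):
#   for sha, subject in find_commit("/path/to/kernel"):
#       print(sha, subject)
```

Note that `--grep` matches anywhere in the commit message, so a later revert or a follow-up that quotes the subject would also show up; the list of hits still needs to be read by a human before concluding the change is present.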
