[WHL] Unable to boot linux to UI with SOF enabled for 4.19 kernel #250

Closed
emilchudzik opened this issue Nov 7, 2018 · 23 comments
Labels
bug Something isn't working P2 Critical bugs or normal features WHL Applies to WhiskeyLake platform

Comments

@emilchudzik

emilchudzik commented Nov 7, 2018

I built the kernel with the default config and can boot to the UI.
Steps:

git clone https://github.com/thesofproject/linux.git
cd linux
git status

branch: sof-dev
commit: af44aa9

make defconfig
make deb-pkg -j16 KCONFIG_CONFIG=.config

After installing this image, Linux boots to the UI.

When I use a config with SOF, HDA, CNL and the codec for WHL enabled (from RanderWang), there is an issue during boot.

branch: sof-dev
commit: af44aa9

cp ../Rander/hda_config ./.config
make menuconfig KCONFIG_CONFIG=.config -> check options: SOF, HDA, CNL, codec
make deb-pkg -j16 KCONFIG_CONFIG=.config

During boot I have this issue:
image

@lgirdwood
Member

@echudzi sorry, there is not much info related to audio in the screenshot. Can you attach the output from dmesg so @RanderWang can check?

@RanderWang

I tested this config on three WHL devices and saw no such issue. The kernel config depends on the hardware, so my config may not be suitable for yours. The boot failed before SOF was loaded, so I can't debug it.

@emilchudzik
Author

@lgirdwood there is no dmesg; I see only this screen after boot.
@RanderWang How did you generate your config? I know where to enable SOF/HDA/CNL and the codec, but I need a stable base config.

@emilchudzik
Author

emilchudzik commented Nov 7, 2018

I just ran the following on commit af44aa9 (sof-dev branch):

make defconfig

I can boot to the UI, but there is an issue in dmesg:

[    2.805079] snd_hda_intel 0000:00:1f.3: enabling device (0000 -> 0002)
...
[    2.809040] ALSA device list:
[    2.809040]   No soundcards found.
...
[    2.984838] snd_hda_intel 0000:00:1f.3: bound 0000:00:02.0 (ops 0xffffffff8a298d00)
[    3.003333] hdaudio hdaudioC0D0: Unable to bind the codec
[    3.003502] hdaudio hdaudioC0D2: Unable to bind the codec
...
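
When a full dmesg is available, it may help to attach only the audio-related lines. The following is a minimal sketch; `audio_dmesg` is a hypothetical helper name and the keyword list is an assumption, not an official SOF filter.

```shell
# Hypothetical helper: keep only kernel log lines that look audio-related.
# The keyword list is an assumption and may over-match (for example,
# "sof" also matches "software").
audio_dmesg() {
    grep -iE 'snd|sof|hdaudio|alsa|sound'
}

# Typical use on the target (dmesg may need root on some systems):
#   dmesg | audio_dmesg > dmesg-audio.txt
```

The function reads from standard input, so it can also be applied to a saved log file with `audio_dmesg < dmesg.txt`.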

@emilchudzik
Author

And this is the config file based on "make defconfig":
config_af44aa9_def-hdaconfig.txt

@RanderWang could you check what is still to be done in the config? Maybe you will need to add the missing settings to the Wiki?

@plbossart
Member

@RanderWang please submit a PR for the Kconfig repository (https://github.com/thesofproject/kconfig)
so that we have your setup maintained and reproduced. It's not good practice to maintain your own config shared privately.
we only want the deltas on top of defconfig, not the whole thing.
Thanks!
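
One possible way to extract "only the deltas on top of defconfig" is to compare the full config against the output of `make defconfig`. This is a sketch under assumptions: `config_delta` is a hypothetical helper, and both arguments are assumed to be ordinary kernel `.config` files. The kernel tree also ships `scripts/diffconfig`, which may be a better fit.

```shell
# config_delta BASE FULL
# Print the option lines of FULL that do not appear verbatim in BASE.
# "# CONFIG_X is not set" lines are kept because they record options
# that were switched off; other comments and blank lines are dropped.
config_delta() {
    grep -E '^(CONFIG_|# CONFIG_)' "$2" | grep -Fxv -f "$1"
}

# Example (file names are placeholders):
#   make defconfig && cp .config /tmp/defconfig.base
#   config_delta /tmp/defconfig.base my_full_config > my-fragment
```

The result is only a starting point for a fragment; options that Kconfig selects automatically as dependencies will also show up and may need pruning by hand.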

@plbossart
Member

@echudzi you may want to use ktest to simplify the kernel updates. There is no need to use the Debian packages. See https://github.com/thesofproject/sof-docs/blob/master/getting_started/setup/setup_ktest_environment.rst

@RanderWang

@RanderWang please submit a PR for the Kconfig repository (https://github.com/thesofproject/kconfig)
so that we have your setup maintained and reproduced. It's not good practice to maintain your own config shared privately.
we only want the deltas on top of defconfig, not the whole thing.
Thanks!

Yes, I will do it

@RanderWang

RanderWang commented Nov 8, 2018

@plbossart @echudzi

I tried make defconfig to create a default config, then built an image without any changes, and got a kernel crash. I tried your idea a few months ago on CNL, and it also failed.

crash

You may find HDA info in this picture, but the crash is not related to it. I disabled the audio driver and got another crash at a different point.

make defconfig doesn't work on WHL. Please see my wiki page for how to build the kernel image. Please try it: I first remove .config, run make menuconfig (it selects almost all the items, more than make defconfig), and then select SOF. It works well.

@emilchudzik
Author

@plbossart @RanderWang when I delete the .config file from the repository, "make menuconfig" creates a new .config based on the local machine where the kernel is built.
In my case this is a Linux dev machine (an off-the-shelf desktop). I don't think using this config for WHL is a good solution.
image

@emilchudzik
Author

But with that config (based on my off-the-shelf local dev machine) I can boot Linux to the UI with HDA and SOF enabled, and playback/capture work well.

We can close the issue, but to me this is not a good way to prepare a config for any platform.

@plbossart
Member

plbossart commented Nov 8, 2018

@emilchudzik please use the recommended path of doing "make defconfig" then use merge_config.sh to add the needed options. We cannot reproduce any of your problems if you use a locally generated config based on a kernel that is 4 generations older than the one you are testing.
There's a reason why I worked on these kconfig fragments... only use "make menuconfig" to alter a configuration, never to start from scratch. Thank you.

@emilchudzik
Author

@plbossart your method doesn't work on my WHL. I still can't boot to the UI when I use "make defconfig" and then follow the steps from https://github.com/thesofproject/kconfig.

When I use Rander's method I can boot to the UI, but I still think these steps are not a good idea:
rm .config
make menuconfig
It enables a lot of kernel options and the build takes a long time, but it works.
I have tried to explain this in this thread; it is still not clear to me.
If I am the only one who has hit this problem, we can close this issue.

@plbossart
Member

@xiulipan can you confirm whether the configs on https://github.com/thesofproject/kconfig work for WHL and, if not, work with @RanderWang to fix this? If @emilchudzik is right, we have a gap in our configs. Thanks!

@xiulipan

@plbossart
Yes, our configs need some on-device testing.
Right now a kernel built with https://github.com/thesofproject/kconfig cannot even boot on the APL UP2 board.
I am working on a fix so that a kernel built with kconfig runs on every platform we have.
So far most of the gaps are in the base-defconfig part. I will send the PR once I have finished all the patches and tests.

@plbossart
Member

@xiulipan I use all the kconfig stuff on Up2 and have no problem at all. It's been working for at least 6 months. Did you mean the WHL board?

@plbossart
Member

@emilchudzik I also just tried on a WHL device, things work just fine for me. I am able to boot to UI with make defconfig+ merge_config.sh .config base-defconfig. The only thing that doesn't work is the trackpad.

As for the "Unable to bind the codec" error, it's normal since we disable the legacy HDAudio in the base config to avoid conflicts with SOF or SST drivers. It's a feature, not a bug. I just pushed an additional kconfig fragment called hdaudio-legacy-defconfig that gives you legacy audio directly (incompatible and mutually exclusive with sof-defconfig and sst-defconfig of course).
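
To see which audio drivers a given `.config` actually enables, a small lookup helper can be used. This is a sketch: `config_state` is a hypothetical function, and `CONFIG_SND_HDA_INTEL` is only an example symbol. Which symbols the base fragment really touches is not stated in this thread.

```shell
# config_state CONFIG OPTION
# Print the value of OPTION (y, m, or another value) if it is set in CONFIG,
# "n" if CONFIG contains "# OPTION is not set", and "absent" otherwise.
config_state() {
    # If the option is defined more than once, the last definition is used.
    line=$(grep -E "^$2=" "$1" | tail -n 1 || true)
    if [ -n "$line" ]; then
        printf '%s\n' "${line#*=}"
    elif grep -Fqx "# $2 is not set" "$1"; then
        echo n
    else
        echo absent
    fi
}

# Example (the symbol is an assumption about what matters here):
#   config_state .config CONFIG_SND_HDA_INTEL
```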

@xiulipan

@plbossart
Sorry about that. This seems to be a CI issue. I rechecked the kernel binary built with kconfig on BYT, APL, CNL and WHL, and all of them boot to the UI fine.
I have no ICL platform. I will update the status later.

@emilchudzik
Author

emilchudzik commented Nov 14, 2018

@plbossart the new kconfig repo and the commands below resolve the issue; I can now boot to the UI.
# cd <kernel_path>
# make defconfig
# scripts/kconfig/merge_config.sh .config [kconfig_path]/kconfig/base-defconfig [kconfig_path]/kconfig/sof-defconfig [kconfig_path]/kconfig/hdaudio-codecs-defconfig
# make bindeb-pkg -j32

This is the config file created by the steps above:
_config.txt
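
After a merge like the one above, it can be worth checking that the fragment options really ended up in the final `.config`, since Kconfig silently drops options whose dependencies are not met. The following is a minimal sketch; `missing_from_config` is a hypothetical helper, and it only compares lines textually.

```shell
# missing_from_config CONFIG FRAGMENT...
# Print every "CONFIG_X=value" line from the fragments that is not present
# verbatim in CONFIG. Lines ending in "=n" are skipped, because a disabled
# option appears in .config as "# CONFIG_X is not set" instead.
# Returns 0 if nothing is missing, 1 otherwise.
missing_from_config() {
    config=$1
    shift
    missing=$(cat "$@" | grep -E '^CONFIG_[A-Za-z0-9_]+=' | grep -v '=n$' \
        | grep -Fxv -f "$config" || true)
    if [ -n "$missing" ]; then
        printf '%s\n' "$missing"
        return 1
    fi
    return 0
}

# Example, using the placeholder paths from the commands above:
#   missing_from_config .config [kconfig_path]/kconfig/sof-defconfig
```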

But loading the topology failed:
image

For HDA I need to use /lib/firmware/intel/sof-hda-generic.tplg, not sof-cnl.tplg as shown in dmesg.
@RanderWang could you look into why the driver forces loading sof-cnl.tplg?

@plbossart
Member

@emilchudzik you most likely have an ACPI ID that selects an I2S-based machine. You need to make sure your BIOS or ACPI initrd overrides don't have anything that prevents HDaudio from being selected. Can you run alsa-info.sh and paste the results somewhere?

@emilchudzik emilchudzik added the WHL Applies to WhiskeyLake platform label Nov 15, 2018
@emilchudzik
Author

Here is the alsa-info output when using Rander's method (config created from the local machine's config).
Status: playback and capture work.
alsa-info.txt.dN7IBDWdtd_usingRanderMethod.txt

Here is the alsa-info output when using kconfig to prepare the .config file.
Status: sof-cnl.tplg is loaded instead of sof-hda-generic.tplg.
alsa-info.txt.kz4taOuHFc_usingKconfig.txt

The BIOS settings are unchanged and follow Rander's wiki.

@plbossart
Member

plbossart commented Nov 15, 2018

@emilchudzik does the log really not include anything related to ACPI devices?
The relevant chunk of alsa-info.sh is this:

# Check for ACPI device status
if [ -d /sys/bus/acpi/devices ]; then
    for f in /sys/bus/acpi/devices/*/status; do
	ACPI_STATUS=$(cat $f 2>/dev/null);
	if [[ "$ACPI_STATUS" -ne 0 ]]; then
	    echo $f $'\t' $ACPI_STATUS >>$TEMPDIR/acpidevicestatus.tmp;
	fi
    done
fi

We need to figure out why you have an I2S-related ACPI ID. Chances are the config you use locally does not include anything but HDAudio support, while the generic one includes both HDaudio and I2S support for CNL-rt274, and the latter is likely selected. Keep in mind that HDaudio is selected as a fall-back when I2S devices are not detected.
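
If alsa-info.sh does not capture the ACPI section, the same information can be collected by hand. The sketch below is a standalone variant of the chunk quoted above; `acpi_present` is a hypothetical name, and the optional directory argument exists only so the function can be tried against a copy of the sysfs tree.

```shell
# acpi_present [SYSFS_ROOT]
# Print "<device><TAB><status>" for each ACPI device whose status is non-zero.
# SYSFS_ROOT defaults to /sys/bus/acpi/devices.
acpi_present() {
    root=${1:-/sys/bus/acpi/devices}
    for f in "$root"/*/status; do
        [ -r "$f" ] || continue
        status=$(cat "$f" 2>/dev/null) || continue
        [ -n "$status" ] || continue
        if [ "$status" -ne 0 ]; then
            printf '%s\t%s\n' "$(basename "$(dirname "$f")")" "$status"
        fi
    done
}

# Example on the target:
#   acpi_present > acpi-devices.txt
```

Any I2S codec ID that shows up with a non-zero status in this list would be a candidate for the entry that steers the driver away from HDaudio.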

@mengdonglin mengdonglin added bug Something isn't working P2 Critical bugs or normal features labels Nov 22, 2018
@mengdonglin
Collaborator

@emilchudzik Closing this issue since Ubuntu can boot to the UI now. Please file a new bug if you still see that the proper HDaudio machine driver cannot be selected on your WHL RVP.

bardliao pushed a commit to bardliao/linux that referenced this issue May 29, 2020
Don't call req->page_done() on each page as we finish filling it with
the data coming from the network.  Whilst this might speed up the
application a bit, it's a problem if there's a network failure and the
operation has to be reissued.

If this happens, an oops occurs because afs_readpages_page_done() clears
the pointer to each page it unlocks and when a retry happens, the
pointers to the pages it wants to fill are now NULL (and the pages have
been unlocked anyway).

Instead, wait till the operation completes successfully and only then
release all the pages after clearing any terminal gap (the server can
give us less data than we requested as we're allowed to ask for more
than is available).

KASAN produces a bug like the following, and even without KASAN, it can
oops and panic.

    BUG: KASAN: wild-memory-access in _copy_to_iter+0x323/0x5f4
    Write of size 1404 at addr 0005088000000000 by task md5sum/5235

    CPU: 0 PID: 5235 Comm: md5sum Not tainted 5.7.0-rc3-fscache+ thesofproject#250
    Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
    Call Trace:
     memcpy+0x39/0x58
     _copy_to_iter+0x323/0x5f4
     __skb_datagram_iter+0x89/0x2a6
     skb_copy_datagram_iter+0x129/0x135
     rxrpc_recvmsg_data.isra.0+0x615/0xd42
     rxrpc_kernel_recv_data+0x1e9/0x3ae
     afs_extract_data+0x139/0x33a
     yfs_deliver_fs_fetch_data64+0x47a/0x91b
     afs_deliver_to_call+0x304/0x709
     afs_wait_for_call_to_complete+0x1cc/0x4ad
     yfs_fs_fetch_data+0x279/0x288
     afs_fetch_data+0x1e1/0x38d
     afs_readpages+0x593/0x72e
     read_pages+0xf5/0x21e
     __do_page_cache_readahead+0x128/0x23f
     ondemand_readahead+0x36e/0x37f
     generic_file_buffered_read+0x234/0x680
     new_sync_read+0x109/0x17e
     vfs_read+0xe6/0x138
     ksys_read+0xd8/0x14d
     do_syscall_64+0x6e/0x8a
     entry_SYSCALL_64_after_hwframe+0x49/0xb3

Fixes: 196ee9c ("afs: Make afs_fs_fetch_data() take a list of pages")
Fixes: 30062bd ("afs: Implement YFS support in the fs client")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
naveen-manohar pushed a commit to naveen-manohar/linux that referenced this issue Jun 22, 2020
[ Upstream commit 9d1be4f ]

Signed-off-by: Sasha Levin <sashal@kernel.org>
aiChaoSONG pushed a commit to aiChaoSONG/linux that referenced this issue May 6, 2021
CI: save ~25% of time with pre-built `bindgen`