[BUG] Zephyr baseline update blocked #8818

Closed
kv2019i opened this issue Jan 30, 2024 · 13 comments
Labels: bug, P1, Zephyr

@kv2019i
Collaborator

kv2019i commented Jan 30, 2024

Describe the bug
Updating SOF main to Zephyr main shows issues in SOF CI tests, blocking merging.

Last good #8797
First failed attempt: #8804 (Zephyr zephyrproject-rtos/zephyr@6a0b1da )
Latest failed attempt: #8813

To Reproduce
Update Zephyr version in SOF west.yml and submit PR to SOF CI.

Reproduction Rate
100%

Expected behavior
The SOF test suite passes with the latest Zephyr.

Impact
Blocks updates of Zephyr in SOF main.

Environment
See above

Screenshots or console output

Notes
Discovery of this was delayed by an earlier regression spotted in #8747 and later fixed on the SOF side (by #8790).

Bisecting is slowed down by the interface change described in zephyrproject-rtos/zephyr#64755

kv2019i added the bug, P1, and Zephyr labels Jan 30, 2024
@kv2019i
Collaborator Author

kv2019i commented Jan 31, 2024

Status as of 31st Jan, we have two blocking issues:

@kv2019i
Collaborator Author

kv2019i commented Feb 14, 2024

The issues with the SMP interface change are still unresolved, but I took a quick peek at the latest Zephyr 3.6.0-rc2 and we have more problems:

  • -fno-lto added -> xtensa toolchain fails
  • various union initializers that break -> break Intel TGL builds with xtensa toolchain in SOF CI
  • the cache interface change (to sys_cache) -> requires SOF side changes
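
For readers unfamiliar with the union-initializer item: the thread does not show the failing code, so the following is only a sketch of one common form of this problem, assuming the breakage is a designated initializer that names a member of an anonymous union, which some older GCC-derived compilers reject. The `reg_desc` type and its fields are hypothetical and not taken from Zephyr or SOF. Naming the union and initializing through it is one portable workaround:

```c
/* Hypothetical descriptor type, for illustration only (not Zephyr/SOF code). */
struct reg_desc {
	int kind;
	union {
		unsigned int raw;
		struct {
			unsigned short lo;
			unsigned short hi;
		} parts;
	} u; /* the union is named so initializers can address it explicitly */
};

/*
 * With an anonymous union, the initializer would read
 *     { .kind = 1, .raw = 0x12345678u }
 * which is the form assumed here to trip older toolchains.
 * With the union named, the nested designator below is plain C99.
 */
static const struct reg_desc example = {
	.kind = 1,
	.u = { .raw = 0x12345678u },
};

/* Accessor that reads the value back through the named union member. */
unsigned int reg_desc_raw(const struct reg_desc *d)
{
	return d->u.raw;
}
```

Whether the actual Zephyr fix renamed unions or restructured the initializers differently is not stated in this thread.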

@marc-hb
Collaborator

marc-hb commented Feb 14, 2024

I'll look at the toolchain issues.

@marc-hb
Collaborator

marc-hb commented Feb 16, 2024

-fno-lto added -> xtensa toolchain fails

Fix submitted:

@marc-hb
Collaborator

marc-hb commented Feb 16, 2024

various union initializers that break -> break Intel TGL builds with xtensa toolchain in SOF CI

Bisected to big commit zephyrproject-rtos/zephyr@25173f71cda630d4fb (zephyrproject-rtos/zephyr#67424)

Déjà vu?

@marc-hb
Collaborator

marc-hb commented Feb 21, 2024

various union initializers that break -> break Intel TGL builds with xtensa toolchain in SOF CI

Fix submitted:

@kv2019i
Collaborator Author

kv2019i commented Feb 23, 2024

@LaurentiuM1234 @dbaluta Any preference how we handle this for SOF2.9? We are still struggling with the SMP interface change on Intel platforms (merged in Zephyr upstream), but the problems only affect multicore configurations. The 2.9 branch point is next week, so we have two options to proceed:
a) fix all the remaining issues and update to 3.6-rc for SOF2.9
b) branch SOF2.9 with the current version of Zephyr used by SOF and merge new Zephyr baseline ASAP after stable-v2.9 branch is created

Option (a) is obviously preferable, but do you see any showstopper with (b)? We can backport Zephyr changes to the stable-v2.9 branch, but of course if there are a lot of them, that's going to be hard.

FYI @marc-hb @lgirdwood .

@LaurentiuM1234
Contributor

Option (a) is obviously preferable, but do you see any showstopper with (b)? We can backport Zephyr changes to the stable-v2.9 branch, but of course if there are a lot of them, that's going to be hard.

As far as the NXP native switch is concerned, there shouldn't be an issue with option (b). Also, as far as I'm aware, the only changes to Zephyr (that can impact SOF) we've had lately are all native-related. @dbaluta and @iuliana-prodan can correct me if I'm wrong here.

@lgirdwood
Member

a) fix all the remaining issues and update to 3.6-rc for SOF2.9
b) branch SOF2.9 with the current version of Zephyr used by SOF and merge new Zephyr baseline ASAP after stable-v2.9 branch is created

Option (a) is obviously preferable, but do you see any showstopper with (b)? We can backport Zephyr changes to the stable-v2.9 branch, but of course if there are a lot of them, that's going to be hard.

Let's try a) and extend the code freeze by 1 week. If there are too many problems, b) will be the backup.

@kv2019i
Collaborator Author

kv2019i commented Mar 4, 2024

#8732 is getting close, but there are still issues (plus #8732 uses unmerged commits from Zephyr). So let's go with option (b) for SOF2.9. We can backport Zephyr upstream fixes to the stable branch used by SOF.

kv2019i removed this from the v2.9 milestone Mar 4, 2024
kv2019i added this to the v2.10 milestone Mar 4, 2024
kv2019i added a commit to kv2019i/sof that referenced this issue Mar 5, 2024
Update to Zephyr in sof/main-rebase-20240305 branch of SOF
project's clone of Zephyr upstream repository. Revert one
Zephyr commit "pm: Remove CURRENT_CPU macro" that is leading to
failed tests in SOF CI test suite.

The revert allows us to update Zephyr to a newer version and tackle the
SMP boot and cache interface changes in SOF. The latest Zephyr upstream
has further changes needed in SOF for platform configuration and these
will require separate changes.

Link: thesofproject#8818
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
@kv2019i
Collaborator Author

kv2019i commented Mar 5, 2024

Test results look good now in #8903. After code reviews, we can merge this.

We do need one revert. I filed the following bug in Zephyr to track it: zephyrproject-rtos/zephyr#69807

kv2019i added a commit to kv2019i/sof that referenced this issue Mar 5, 2024
Update to Zephyr in sof/main-rebase-20240305 branch of SOF
project's clone of Zephyr upstream repository. Revert one
Zephyr commit "pm: Remove CURRENT_CPU macro" that is leading to
failed tests in SOF CI test suite.

The revert allows us to update Zephyr to a newer version and tackle the
SMP boot and cache interface changes in SOF. The latest Zephyr upstream
has further changes needed in SOF for platform configuration and these
will require separate changes.

Link: thesofproject#8818
Link: zephyrproject-rtos/zephyr#69807
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
dbaluta pushed a commit that referenced this issue Mar 6, 2024
Update to Zephyr in sof/main-rebase-20240305 branch of SOF
project's clone of Zephyr upstream repository. Revert one
Zephyr commit "pm: Remove CURRENT_CPU macro" that is leading to
failed tests in SOF CI test suite.

The revert allows us to update Zephyr to a newer version and tackle the
SMP boot and cache interface changes in SOF. The latest Zephyr upstream
has further changes needed in SOF for platform configuration and these
will require separate changes.

Link: #8818
Link: zephyrproject-rtos/zephyr#69807
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
@kv2019i
Collaborator Author

kv2019i commented Mar 6, 2024

Baseline now updated via #8903 so we can close this. I filed a new bug #8908 to track the need for the revert.
