Zephyr Build info file #79118

Merged: 6 commits, Oct 8, 2024

@tejlmand tejlmand commented Sep 27, 2024

This PR introduces a common build info file for Zephyr.

Some tools peek into Zephyr's CMakeCache.txt file and rely on what are considered internal build variables.
This is bad for several reasons, especially because it places unknown expectations on the Zephyr build system.
It also makes it harder to clean up Zephyr's build system code.

Besides the lack of visibility into which variables are used externally / downstream, it is also hard for developers integrating with Zephyr to know which settings they can expect to remain stable.

Some settings are documented as inputs to the Zephyr build system, like EXTRA_CONF_FILE, EXTRA_DTS_OVERLAY_FILE, etc., but they are combined with internal logic for conf file selection, which makes it hard to work out which configuration files are actually used.

#78727 also presents a use case where internal build knowledge must be exported to Python.

To support these use cases, this PR proposes a common interface for storing build values in a common YAML file.
The YAML file is proposed to be named build_info.yml and placed at <build-dir>/build_info.yml.

A YAML schema is provided in https://github.com/zephyrproject-rtos/zephyr/blob/82dd714dc8dbcf0b1fa138cd521117ec49e3a2be/scripts/schemas/build-schema.yml.
This schema acts as the contract for the information provided by the build system.
For example, the schema contains:

  cmake:
    type: map
    mapping:
      ...
      kconfig:
        type: map
        mapping:
          files:
            type: seq
            sequence:
              - type: str

thus promising to place the Kconfig files in use under this key.
The build system is then free to keep this information in an internal variable and to rename that variable as it sees fit.

To facilitate populating the build info file, a new CMake function is created: build_info(<key> VALUE <value>)
Example:

  build_info(kconfig user-files VALUE ${CONF_FILE_AS_LIST})

This allows the code to call build_info() close to the final usage of the internal CMake variable.
Furthermore, having a build_info() function makes it easy to find the internal settings that are exported, and any CMake variables expanded in the <value> field can easily be renamed if build system internals change.

If build_info() is called with an invalid key, like build_info(does not exists VALUE ok), an error is raised.
This ensures that only keys described in the schema are allowed, and it also catches typos.

It also ensures that if a key is renamed / removed, any remaining use of that key in build_info() raises an error.
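
The key check described above can be sketched outside CMake. The snippet below is only an illustration in Python: the SCHEMA dict is a hypothetical, trimmed-down stand-in for build-schema.yml, the helper names are invented for this example, and the implicit "cmake" prefix is an assumption. It is not the CMake implementation from this PR.

```python
# Hypothetical, trimmed-down stand-in for the schema: nested dicts are maps,
# and a leaf names the value type ("str" or "seq").
SCHEMA = {
    "cmake": {
        "kconfig": {"files": "seq", "user-files": "seq"},
        "toolchain": {"name": "str", "path": "str"},
    }
}


def resolve_key(schema, *keys):
    """Return the leaf type for a key path, or raise ValueError if it is not in the schema."""
    node = schema
    for depth, key in enumerate(keys):
        if not isinstance(node, dict) or key not in node:
            path = " ".join(keys[: depth + 1])
            raise ValueError(f"build info key '{path}' is not described in the schema")
        node = node[key]
    if isinstance(node, dict):
        raise ValueError(f"build info key '{' '.join(keys)}' is a map, not a value")
    return node


def build_info(store, schema, *keys, value):
    """Record a value under a schema-approved key path (illustrative only)."""
    # Validate first, so a typo or a removed key never reaches the output.
    resolve_key(schema, "cmake", *keys)
    node = store.setdefault("cmake", {})
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


store = {}
build_info(store, SCHEMA, "kconfig", "user-files", value=["prj.conf"])
try:
    build_info(store, SCHEMA, "does", "not", "exists", value="ok")
except ValueError as err:
    print(err)
```

Because validation happens before anything is stored, renaming or removing a key in the schema turns every stale call site into an immediate error rather than a silently missing entry.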

To minimize the number of entries Zephyr modules need to add to the schema, and to support vendor-specific information, a free-form vendor-specific section is provided. This section is the responsibility of the vendors using it.

  vendor-specific:
    type: map
    mapping:
      regex;(.*):
        type: map
        mapping:
          regex;(.*):
            type: str

@pdgendt
Collaborator

pdgendt commented Sep 27, 2024

Any reason not to use a .yml/.yaml extension? Free syntax highlighting.

@pdgendt
Collaborator

pdgendt commented Sep 27, 2024

Some comments when running

$ west build -p always -b native_sim samples/hello_world
# output snippets
  kconfig:
    files:
     - /home/pdgendt/zephyrproject/zephyr/boards/native/native_sim/native_sim_defconfig
     - /home/pdgendt/zephyrproject/zephyr/samples/hello_world/prj.conf
    user-files:
     - /home/pdgendt/zephyrproject/zephyr/samples/hello_world/prj.conf
  toolchain:
    name: host
    path: 
  zephyr:
    version: 3.7.99
  • Duplicate entries in the kconfig file lists
  • The toolchain info isn't really useful
  • Use/add git_describe from git.cmake output for zephyr/application

@tejlmand
Collaborator Author

Any reason not to use a .yml/.yaml extension? Free syntax highlighting.

The only reason was to be somewhat aligned with the default zephyr.hex, zephyr.elf, zephyr.bin, ... files, which led to zephyr.build: the name indicates the purpose of the file at the expense of a file extension. (The alternative zephyr.yml doesn't say much about purpose.)

Naming the file build.yaml risks confusion with a "true" build file, such as build.ninja.
But I'm open to good naming suggestions.

@pdgendt
Collaborator

pdgendt commented Sep 27, 2024

Naming the file build.yaml risks confusion with a "true" build file, such as build.ninja. But I'm open to good naming suggestions.

build_info.yaml would be my suggestion, showing a direct link from the function to the output.

@tejlmand
Collaborator Author

Some comments when running

Thanks, appreciated 👍

  • Duplicate entries in the kconfig file lists

Correct. The reason is to provide a list of all files used for Kconfig, plus a dedicated list of the files that are considered user / application controlled and thus directly under the developer's control.
For example, a normal developer should be editing prj.conf, board/<board-target>.conf, files given by EXTRA_CONF_FILE, etc., but not files provided by the board (those should be updated by the board maintainer), files created by sysbuild, etc.

One may argue that an external tool could concatenate those lists, but the tricky part is that internally in CMake all files are appended to a common new list later, at which point we would need to split the lists / rework how they are constructed.
Thus I decided to keep a list of all files and a list of user-facing files.
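
With both lists available, a tool that wants only the files that are not user controlled can derive them itself. A minimal Python sketch, assuming the kconfig section has already been parsed into a dict (the function name is hypothetical; the paths are the ones from the native_sim output quoted earlier in this thread):

```python
def split_kconfig_files(kconfig):
    """Split the kconfig section into user-controlled and other files, keeping order."""
    user = list(kconfig.get("user-files", []))
    user_set = set(user)
    # Everything in "files" that is not user controlled (board defconfig, sysbuild files, ...).
    other = [f for f in kconfig.get("files", []) if f not in user_set]
    return user, other


kconfig = {
    "files": [
        "/home/pdgendt/zephyrproject/zephyr/boards/native/native_sim/native_sim_defconfig",
        "/home/pdgendt/zephyrproject/zephyr/samples/hello_world/prj.conf",
    ],
    "user-files": [
        "/home/pdgendt/zephyrproject/zephyr/samples/hello_world/prj.conf",
    ],
}
user, other = split_kconfig_files(kconfig)
```

Here `other` holds only the board defconfig, which is the part a normal developer is not expected to edit.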

  • The toolchain info isn't really useful

True for the host toolchain, but once you start using different toolchains, such as Zephyr SDK, LLVM for Arm, or Arm Compiler 6 (armclang), this info actually has value.
It is also useful for an IDE which may want to show which toolchain was selected.

  • Use/add git_describe from git.cmake output for zephyr/application

Doing so has other implications, so it should not be part of this PR. If we want this, a dedicated issue / PR should be opened.
In short, this should initially be left out because the build info is generated at CMake configure time, and if any information currently populated in the CMake section changes, CMake is automatically re-run.
However, a regular incremental build, where you change a source file such as main.c, commit the change, and then build again, does not trigger a CMake re-run. The git describe output would then not match what is actually built (the commit changed the SHA of the repo, but CMake did not re-run to update the build info).
See also: #42527 and #40167 as to why we should avoid re-running CMake on every git commit / incremental build.
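
The staleness problem can be shown with a toy model. The Python sketch below is purely illustrative: the FakeBuild class and the "git-describe" key are hypothetical and do not exist in Zephyr. It only models the rule that configure-time data is refreshed when a CMake input changes, not when a commit is made.

```python
class FakeBuild:
    """Toy model: configure-time data is refreshed only when CMake re-runs."""

    def __init__(self, repo_sha, cmake_inputs):
        self.repo_sha = repo_sha
        self.cmake_inputs = dict(cmake_inputs)
        self._seen_inputs = None
        self.build_info = {}

    def build(self):
        # CMake re-runs only when one of its tracked inputs changed.
        if self.cmake_inputs != self._seen_inputs:
            self._seen_inputs = dict(self.cmake_inputs)
            self.build_info = {"git-describe": self.repo_sha}
        # Compilation (not modelled) always uses the current sources.


b = FakeBuild("aaaa111", {"prj.conf": "v1"})
b.build()
b.repo_sha = "bbbb222"  # edit main.c and commit: no CMake input changed
b.build()
stale = b.build_info["git-describe"] != b.repo_sha
```

After the second build, `stale` is true: the binary reflects the new commit while the recorded value still names the old one.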

Secondly, for the git SHAs we have the build meta file for Zephyr which already generates an output file at build time containing the SHA of Zephyr as well as each Zephyr module, so that feature covers much more than just a git describe of Zephyr itself.
See:
https://docs.zephyrproject.org/latest/kconfig.html#CONFIG_BUILD_OUTPUT_META
https://docs.zephyrproject.org/latest/develop/west/zephyr-cmds.html#software-bill-of-materials-west-spdx

So there is no reason to duplicate in the build info something that we already produce, in a safer way, at build time.

Collaborator

@nordicjm nordicjm left a comment

Some thoughts:

  • Comments refer to build.info file but it's actually zephyr.build.
  • In a sysbuild project (smp_svr) it seems to have the same file in multiple places:
   kconfig:
    files:
     - /tmp/aa/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/build/_sysbuild/empty.conf
     - /tmp/aa/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/sysbuild.conf
    user-files:
     - /tmp/aa/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/sysbuild.conf

and

  kconfig:
    files:
     - /tmp/aa/zephyr/boards/nordic/nrf5340dk/nrf5340dk_nrf5340_cpuapp_defconfig
     - /tmp/aa/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/prj.conf
     - /tmp/aa/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/build/smp_svr/zephyr/.config.sysbuild
    user-files:
     - /tmp/aa/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/prj.conf

and

  devicetree:
    bindings-dirs:
     - /tmp/aa/zephyr/dts/bindings
    files:
     - /tmp/aa/zephyr/boards/nordic/nrf5340dk/nrf5340dk_nrf5340_cpuapp.dts
     - /tmp/aa/bootloader/mcuboot/boot/zephyr/app.overlay
    include-dirs:
     - /tmp/aa/bootloader/mcuboot/boot/zephyr/include
...
    user-files:
     - /tmp/aa/bootloader/mcuboot/boot/zephyr/app.overlay
  • Another oddity:
  kconfig:
    files:
     - /tmp/aa/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/_AA/_sysbuild/empty.conf
     - /tmp/aa/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/_AA/_sysbuild/empty.conf
     - /tmp/aa/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/sysbuild.conf
    user-files:
     - /tmp/aa/zephyr/samples/subsys/mgmt/mcumgr/smp_svr/sysbuild.conf
  • Would it make sense to also have the original zephyr commit as well as the version, so that it can be tracked too?
  • This might include the files that were used but does not mean a build can be reproduced e.g. if a user includes their own command line overrides or if they used FILE_SUFFIX

@tejlmand
Collaborator Author

@nordicjm thanks for the comments.

Comments refer to build.info file but it's actually zephyr.build.

You mean the commit messages?
I will correct that to build_info.yaml if no other names are proposed, see also: #79118 (comment)

In a sysbuild project (smp_svr) it seems to have the same file in multiple places:

See explanation here: #79118 (comment)

Another oddity:

Yes, this is a sysbuild / Kconfig reuse issue. The current kconfig.cmake implementation requires a given number of files (even if they are empty), so sysbuild simply provides references to empty files to make Kconfig happy.

This has nothing to do with build info but becomes more visible now.
Cleanup of kconfig.cmake in this regard is outside scope of build info PR and should be addressed separately.

Would it make sense to also have the original zephyr commit as well as the version, so that it can be tracked too?

See reply here #79118 (comment) why git commit or similar should be left out initially.

This might include the files that were used but does not mean a build can be reproduced e.g. if a user includes their own command line overrides or if they used FILE_SUFFIX

It's reproducible in the sense that you can see exactly what was used by the build system, but not in the sense that you have an explicit command that will produce exactly the same list of files, for example if extra Zephyr modules are available.

The purpose of this PR is to provide a solution which shows what was used, for example to be used by an IDE to present configuration files, or a downstream project which needs to use settings for own build invocation, see: #78727 for an example.

Providing a single command for reproducible builds is a different task, because a user might manually modify CMakeCache.txt after the first CMake invocation, and you have no way of knowing what changed or whether a given setting can simply be appended to the original CMake invocation.
For example, running cmake -DFOO=y and then modifying CMakeCache.txt to add BAR=y may not behave the same as cmake -DFOO=y -DBAR=y, because a cache setting BAZ could get its default in the first run based on the value of BAR, and if BAR is changed later then BAZ may be sticky (FORCE vs no FORCE when writing the cache value).
This is especially true in Kconfig with regard to stuck symbols: https://docs.zephyrproject.org/latest/build/kconfig/tips.html#stuck-symbols-in-menuconfig-and-guiconfig.
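
The FOO / BAR / BAZ example can be traced with a toy model of the cache. The Python sketch below is an assumption-laden simplification: a dict stands in for CMakeCache.txt, -D values always overwrite, and a default written without FORCE never replaces an existing entry. It is not CMake itself.

```python
def configure(cache, cli=None):
    """Toy CMake run: -D values overwrite the cache, defaults never do (no FORCE)."""
    cache.update(cli or {})
    cache.setdefault("BAR", "n")
    # BAZ defaults from BAR, but only when BAZ is not cached yet.
    cache.setdefault("BAZ", "from-bar-" + cache["BAR"])
    return cache


# Sequence A: cmake -DFOO=y, then hand-edit the cache to set BAR=y, then re-run.
a = configure({}, {"FOO": "y"})
a["BAR"] = "y"
configure(a)

# Sequence B: cmake -DFOO=y -DBAR=y in a pristine build directory.
b = configure({}, {"FOO": "y", "BAR": "y"})
```

Both caches end with FOO=y and BAR=y, yet BAZ differs between them, so replaying the visible settings on one command line does not reproduce sequence A.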

Please pay attention to the description of the PR:

This PR introduces a common build info file for Zephyr.

Some tools peek into Zephyr's CMakeCache.txt file and rely on what are considered internal build variables.
This is bad for several reasons, especially because it places unknown expectations on the Zephyr build system.
It also makes it harder to clean up Zephyr's build system code.

What is useful in this PR with regard to reproducible builds is that you now have a file with the internal info from a given build, which you can compare against a known expectation. If the file you generate doesn't match, you can see what the differences are, which is much easier than comparing two ELF files or two build.ninja files.
It might even be beneficial during a git bisect session, as you have this file after the CMake run, before the build stage.
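
Such a comparison could look like the Python sketch below. It assumes both build info files have already been parsed into nested dicts; the diff_info helper and the sample values are hypothetical and only illustrate the idea.

```python
def diff_info(expected, actual, prefix=""):
    """Yield (key path, expected value, actual value) for every leaf that differs."""
    for key in sorted(set(expected) | set(actual)):
        path = f"{prefix}/{key}" if prefix else key
        exp, act = expected.get(key), actual.get(key)
        if isinstance(exp, dict) and isinstance(act, dict):
            # Both sides are maps: descend and compare their children.
            yield from diff_info(exp, act, path)
        elif exp != act:
            yield path, exp, act


# Hypothetical excerpts of two build info files, already parsed into dicts.
expected = {"cmake": {"board": {"name": "qemu_x86", "qualifiers": "atom"},
                      "toolchain": {"name": "zephyr"}}}
actual = {"cmake": {"board": {"name": "qemu_x86", "qualifiers": "atom"},
                    "toolchain": {"name": "host"}}}

differences = list(diff_info(expected, actual))
for path, exp, act in differences:
    print(f"{path}: expected {exp!r}, got {act!r}")
```

A key missing on one side shows up as a difference against None, so added and removed entries are reported as well.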

And combined with build meta feature we already have:
https://docs.zephyrproject.org/latest/kconfig.html#CONFIG_BUILD_OUTPUT_META
https://docs.zephyrproject.org/latest/develop/west/zephyr-cmds.html#software-bill-of-materials-west-spdx

then we are one step closer to being able to provide real support for reproducible builds.

@tejlmand tejlmand force-pushed the build_info_file branch 2 times, most recently from 3832b26 to be99434 Compare October 1, 2024 07:01
@tejlmand
Collaborator Author

tejlmand commented Oct 1, 2024

build_info.yaml would be my suggestion, showing a direct link from the function to the output.

Done.

pillo79
pillo79 previously approved these changes Oct 3, 2024
@fabiobaltieri
Member

@tejlmand can you rebase please?

@nashif
Member

nashif commented Oct 5, 2024

this is showing some issues that we need to fix...

    bindings-dirs:
     - /home/nashif/zephyrproject/zephyr/dts/bindings
    files:
     - /home/nashif/zephyrproject/zephyr/boards/qemu/x86/qemu_x86_atom_nopae.dts
    include-dirs:
     - /home/nashif/zephyrproject/modules/hal/atmel/include
     - /home/nashif/zephyrproject/modules/hal/gigadevice/include
     - /home/nashif/zephyrproject/modules/hal/microchip/dts
     - /home/nashif/zephyrproject/modules/hal/nuvoton/dts
     - /home/nashif/zephyrproject/modules/hal/nxp/dts
     - /home/nashif/zephyrproject/modules/hal/stm32/dts
     - /home/nashif/zephyrproject/zephyr/include
     - /home/nashif/zephyrproject/zephyr/include/zephyr
     - /home/nashif/zephyrproject/zephyr/dts/common
     - /home/nashif/zephyrproject/zephyr/dts/x86
     - /home/nashif/zephyrproject/zephyr/dts/xtensa
     - /home/nashif/zephyrproject/zephyr/dts/sparc
     - /home/nashif/zephyrproject/zephyr/dts/riscv
     - /home/nashif/zephyrproject/zephyr/dts/posix
     - /home/nashif/zephyrproject/zephyr/dts/nios2
     - /home/nashif/zephyrproject/zephyr/dts/arm64
     - /home/nashif/zephyrproject/zephyr/dts/arm
     - /home/nashif/zephyrproject/zephyr/dts/arc
     - /home/nashif/zephyrproject/zephyr/dts

why are we getting includes for unrelated modules and hals? looks like those are just being added for each build?

@tejlmand
Collaborator Author

tejlmand commented Oct 7, 2024

this is showing some issues that we need to fix...

agree, but fixing this is outside the scope of build info.
Build info just reveals what is already there.

@nashif
Member

nashif commented Oct 7, 2024

agree, but fixing this is outside the scope of build info.
Build info just reveals what is already there.

right, it seems the dts part is coming from

dts_root: .

in some of the hals.

@nashif
Member

nashif commented Oct 7, 2024

cmake:
  application:
    configuration-dir: /home/nashif/zephyrproject/zephyr/samples/hello_world
    source-dir: /home/nashif/zephyrproject/zephyr/samples/hello_world
  board:
    name: qemu_x86
    qualifiers: /atom
    revision:
  devicetree:
    bindings-dirs:
     - /home/nashif/zephyrproject/zephyr/dts/bindings
     

Why do the qualifiers start with '/'? IMO this is wrong and misleading, and it conflicts with what we have in the docs: the "/" is how you connect the various elements, not part of the qualifiers themselves.

Other toolchains use <toolchain>_TOOLCHAIN_PATH; align the Zephyr SDK
by setting ZEPHYR_TOOLCHAIN_PATH to be identical to
ZEPHYR_SDK_INSTALL_DIR.

Signed-off-by: Torsten Rasmussen <Torsten.Rasmussen@nordicsemi.no>

Move Zephyr CMake script mode handling from package_helper.cmake into
extensions.cmake.

This ensures that all Zephyr CMake scripts which include
extensions.cmake have the same functions stubbed or mocked and thus
do not need to replicate this behavior.

Signed-off-by: Torsten Rasmussen <Torsten.Rasmussen@nordicsemi.no>

The build_info function provides a generic and stable way of dumping
build information to the <build>/build_info.yml file.

The build info file is in YAML format and the keys in the file are
intended to be stable, as to allow external tools to retrieve
information regarding the build.

The main differences to the CMakeCache.txt are:
- Settings in the CMakeCache.txt are user controlled, whereas the
  information in the build info file is intended to be those values
  which are used by the build system regardless if those are specified
  by the developer or picked up automatically.
- Internal build system variables are not present in the CMake cache
  and should not be, because their values are calculated when CMake
  runs.

This also has the benefit of decoupling CMake variable names from
build info keys. Several CMake variables have internal build system
names, and the build system is free to rename those at its own
discretion.

Having dedicated key names ensures a stable API that external tools can
rely upon.

Signed-off-by: Torsten Rasmussen <Torsten.Rasmussen@nordicsemi.no>

For pristine builds 'west build' will now create a build_info.yml file
containing the west build command including arguments.

This is done to help users and external tools recreate builds.

Signed-off-by: Torsten Rasmussen <Torsten.Rasmussen@nordicsemi.no>

Store information regarding the current Zephyr build.
The following information is stored during CMake configure:
- Board information
- Application source directory
- Application configuration directory
- Toolchain information
- Devicetree files
- Kconfig config files
- Zephyr version

Signed-off-by: Torsten Rasmussen <Torsten.Rasmussen@nordicsemi.no>

Save information regarding the SVD file in use in the vendor-specific
section of the build info file.

Information is stored under the Nordic section.

Signed-off-by: Torsten Rasmussen <Torsten.Rasmussen@nordicsemi.no>

@tejlmand
Collaborator Author

tejlmand commented Oct 7, 2024

why do the qualifiers start with '/'? IMO this is wrong and misleading and conflict with what we have in the docs,

Thanks for catching, fixed 👍

tejlmand added 6 commits to tejlmand/fw-nrfconnect-zephyr-1 that referenced this pull request on Oct 8, 2024. Each repeats one of the six commit messages above with the line "Upstream PR: zephyrproject-rtos/zephyr#79118" added.
@carlescufi carlescufi merged commit 46a3e61 into zephyrproject-rtos:main Oct 8, 2024
32 checks passed
@@ -114,3 +114,6 @@ set_ifndef(TOOLCHAIN_KCONFIG_DIR ${TOOLCHAIN_ROOT}/cmake/toolchain/${ZEPHYR_TOOL

set(HostTools_FOUND TRUE)
set(HOSTTOOLS_FOUND TRUE)
build_info(toolchain name VALUE ${ZEPHYR_TOOLCHAIN_VARIANT})
string(TOUPPER ${ZEPHYR_TOOLCHAIN_VARIANT} zephyr_toolchain_variant_upper)
build_info(toolchain path VALUE "${${zephyr_toolchain_variant_upper}_TOOLCHAIN_PATH}")
Collaborator

Causing old compilation errors as ???_TOOLCHAIN_PATH is invalid.

-- Found toolchain: gnuarmemb (C:\gnu_arm_embedded_10)
CMake Error at C:/Users/BillyBob/zephyrGoodX/zephyrNew/cmake/modules/yaml.cmake:313 (string):
string sub-command JSON failed parsing json string: * Line 1, Column 1

Bad escape sequence in string

See Line 1, Column 6 for detail.

I changed it to:

build_info(toolchain path VALUE "${TOOLCHAIN_ROOT}")
# build_info(toolchain path VALUE "${${zephyr_toolchain_variant_upper}_TOOLCHAIN_PATH}")

Member

@fabiobaltieri fabiobaltieri Oct 9, 2024

Hey @WilliamGFish, could you open a pull request with the fix? This is the contribution guideline documentation. Thanks!

Collaborator

OK, found out it's a Linux / Windows path issue.
I'll raise a 'fix' via a PR.
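
The "Bad escape sequence in string" error is consistent with a Windows path whose backslashes reach a JSON parser unescaped. That is an assumption drawn from the log above, not a confirmed root cause, and the eventual fix is not shown in this thread. The Python sketch below reproduces the general failure mode with the standard json module and shows two common ways around it.

```python
import json

path = r"C:\gnu_arm_embedded_10"

# Pasting the raw path between quotes yields invalid JSON: "\g" is not a JSON escape.
naive = '{"path": "' + path + '"}'
try:
    json.loads(naive)
    naive_ok = True
except json.JSONDecodeError:
    naive_ok = False

# Option 1: let a JSON encoder escape the backslashes.
escaped = json.dumps({"path": path})

# Option 2: normalise to forward slashes before storing the value.
normalised = '{"path": "' + path.replace("\\", "/") + '"}'
```

Option 1 preserves the original spelling of the path, while option 2 gives the same representation on every host; which of the two suits the build system is left open here.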
