OTA: tracking issue #9342

Closed
bergzand opened this issue Jun 13, 2018 · 55 comments
Labels
Area: OTA Area: Over-the-air updates Type: tracking The issue tracks and organizes the sub-tasks of a larger effort

Comments

@bergzand
Member

bergzand commented Jun 13, 2018

Tracking OTA progress

Tracking issue for the different parts required to enable over the air updates.

Overview

We are aiming at a simple OTA prototype in a first phase,
which can be extended in later phases.
The blueprint for the first prototype is the implementation we had working at the SUIT Hackathon in June 2018 (see slide 3 of this recap).
This combines a simplistic bootloader, 2 firmware slots and metadata that follows the SUIT specification.
A basic update module, embedded in each image, manages firmware transfer (via CoAP), verification (complying with SUIT) and storage in flash within the non-running image slot.

Work breakdown

[Figure: OTA work breakdown, v3]

Current state of development

@cladmi is addressing the tooling for the bootloader and metadata
@kYc0o is addressing the bootloader
@OYTIS is working on support for RIOT, including OTA, for KEA128LEDLIGHTRD board (#9655)

See the GitHub Software Updates project for more details.

System diagrams

Both of the diagrams reflect the content of the working OTA update branch. This contains working examples for both a metadata-based update, and a SUIT manifest-based update. The areas not relevant to the example addressed in each section are greyed out.

Non-SUIT metadata based update mechanism

A description of the metadata is given here.

[Figure: OTA component diagram, metadata-based updating, v1]

SUIT metadata based update mechanism

[Figure: OTA component diagram, SUIT-based updating, v1]

The process being followed is to merge the functionality required for OTA updates piecewise, instead of as a large monolithic PR.

Files for diagrams above

The diagrams were done using draw.io (https://www.draw.io/). Contact @danpetry for the project files.

@cladmi
Contributor

cladmi commented Jun 28, 2018

Linker offset support depends on #9451 to allow testing.

@danpetry
Contributor

danpetry commented Aug 7, 2018

State of development

Please update this list as required so that we all know what we're up to

Development branches and repos

Here's a list of active branches and repos where people are developing. Please add your own with a short description if you can get around to it.
https://github.com/kYc0o/RIOT/tree/hackathon_suit_2018 - Paco dev branch
https://github.com/kYc0o/riot-ota-server/tree/master/otaserver - OTA server (Paco fork)
https://github.com/aabadie/ota-server - OTA server (original)
https://github.com/danpetry/RIOT/tree/ota/sandbox - Dan dev branch. Works with samr21-xpro along with a gen_and_flash.sh script which builds/flashes/generates manifest for use with a CoAP client/server for firmware distribution
https://github.com/suit-wg/SUIT-repository-manager - CoAP client/server for firmware distribution
https://github.com/suit-wg - suit working group.
https://github.com/hannestschofenig/suit-manifest-generator - Manifest generator. fork of ARMmbed/suit-manifest-generator with a couple of additions.

Related issues not in the diagram above

Either because they are dependencies to the issues above, or some other reason. Basically here are some issues you might want to be aware of, and help along if you can.
#8902 - Original OTA PR
#9351 - PR to allow partial ROM flashing for cortex m cpus EDIT (cladmi): to allow generating firmwares in part of the ROM for cortex-m

Current work being undertaken

Dan - working on RAPstore project on integration, so architecture can be informed by situational concerns
Anton (OYTIS) - implementing periph/flasher module for a board (? which one)

@cladmi
Contributor

cladmi commented Aug 13, 2018

For binflash I would also have a dependency of FLASHFILE: #8838

And I am thinking, why do we even need to do binflash. Flashing a binary means having both a file and an address. And this could be done by generating an ihex file instead.

@kYc0o
Contributor

kYc0o commented Aug 13, 2018

And I am thinking, why do we even need to do binflash. Flashing a binary means having both a file and an address. And this could be done by generating an ihex file instead.

As far as my early tests went, it was very complicated to manage ihex files, especially when you want to flash bootloader + image (slot 1 or 2). Building a hex file with such information was very challenging at that time. To speed things up and make something more reliable, I used bin files flashed at the needed address.

Now we have all the information we need to know where to flash, so I'd say there's no problem using and flashing binary files while a bootloader takes the beginning of the flash.

@bergzand
Member Author

bergzand commented Aug 15, 2018

For the bootloader, the minimal set of required metadata is this:

/**                                                                              
 * @brief Structure to store firmware metadata                                   
 * @{                                                                            
 */                                                                              
typedef struct {                                                                 
    uint32_t magic_number;              /**< metadata magic_number (always "RIOT")  */
    uint32_t version;                   /**< Integer representing firmware version  */
    uint32_t start_addr;                /**< Start address in flash                 */
    uint32_t chksum;                    /**< checksum of metadata                   */
} firmware_metadata_t;                                                           
/** @} */                                                                        
  • magic_number: To indicate that a firmware metadata field is located here.
  • version: Version of this firmware. The bootloader will boot the firmware with the highest version number
  • start_addr: Start address of the firmware. This way the bootloader can easily work with variable metadata sizes.
  • chksum: Checksum of the metadata to check for a valid metadata (not firmware).
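
The selection rule described above (boot the valid slot with the highest version) can be sketched as follows. This is a minimal sketch and not the riotboot implementation: the checksum function (a plain sum of the three preceding fields), the byte order used for the "RIOT" magic value, and the names FW_MAGIC, metadata_checksum, metadata_is_valid and select_slot are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* "RIOT" read as a little-endian 32-bit word (byte order is an assumption). */
#define FW_MAGIC 0x544f4952u

typedef struct {
    uint32_t magic_number;  /* metadata magic number (always "RIOT") */
    uint32_t version;       /* integer representing the firmware version */
    uint32_t start_addr;    /* start address in flash */
    uint32_t chksum;        /* checksum of the metadata fields above */
} firmware_metadata_t;

/* Hypothetical checksum: sum of the three words that precede chksum. */
uint32_t metadata_checksum(const firmware_metadata_t *m)
{
    return m->magic_number + m->version + m->start_addr;
}

/* Metadata is usable only if the magic number and the checksum both match. */
int metadata_is_valid(const firmware_metadata_t *m)
{
    return m->magic_number == FW_MAGIC && m->chksum == metadata_checksum(m);
}

/* Return the index of the valid slot with the highest version, or -1 if
 * no slot carries valid metadata. */
int select_slot(const firmware_metadata_t *slots, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (!metadata_is_valid(&slots[i])) {
            continue;
        }
        if (best < 0 || slots[i].version > slots[best].version) {
            best = (int)i;
        }
    }
    return best;
}
```

A bootloader built on this would jump to slots[select_slot(...)].start_addr. Because the checksum only covers the metadata, a corrupted firmware body would not be detected by this check.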

@bergzand
Member Author

bergzand commented Aug 15, 2018

Documentation todo

  • Dependencies:
    • CPU peripherals required
    • metadata format and bootloader interaction
  • Metadata fields
  • CPU initialization (happens twice)

@cladmi
Contributor

cladmi commented Aug 16, 2018

Notes about flashbin:

edbg

edbg is already flashing binary files.
An offset from the start address can be given using -o|--offset. The PR currently uses this as an absolute address, which is wrong in general. So flashing a binary from the start address works.

Work is only required to disable erasing everything and to support flashing from an offset of the start address, which is only needed when flashing sub-parts of the ROM or invalidating a slot.

edbg.inc.mk: allow flashing with an offset in rom without erasing all ROM #9788

Openocd

Openocd supports flashing from binary as is, except that the flashing address must be given.
It takes an offset relative to what is in the firmware file: an offset from the link address for elf/hex files, and an offset from 0 for binary files.

http://openocd.org/doc/html/Flash-Commands.html flash write_image.

I plan to use the ROM_START_ADDR in the openocd script to make flashing a binary work by default without external configuration. And use OFFSET as an offset from this.

EDIT: The address cannot easily be extracted from the build system: openocd.inc.mk is included before cpu/.../Makefile.include, and having export ROM_START_ADDR would give it a default empty value, breaking the ROM_START_ADDR ?= lines.

I queried this address from openocd.

openocd.sh: allow flashing binary files without configuration #9787

This would give a common behavior between boards using edbg and openocd.

@emmanuelsearch
Member

Minutes from this week's OTA meeting (15.08.2018) available here

@waehlisch
Member

Which option is preferred to add comments/questions on the minutes? Should I comment via this issue, create a new issue, or ...?

@emmanuelsearch
Member

I'd say: comment here?

@kb2ma
Member

kb2ma commented Aug 20, 2018

I agree that it makes sense to use the existing Block1 functionality for now. Conceptually, some host sends firmware to the (RIOT) device, although I guess use of PUT/POST or GET depends on the higher-level application protocol. At any rate, the device is on the receiving end.

Also, "roadblock" is not a word I like to see associated with my name! I'm reviewing the RFC and Block2 PR (#8932), and plan to comment in the next day or so on moving forward.

@emmanuelsearch
Member

emmanuelsearch commented Aug 20, 2018

@kb2ma actually Block2 has advantages in terms of security characteristics (since it is pull-based).
If #8932 could be merged fast into master (before other CoAP reworks), it would improve the initial OTA solution we are going for, no doubt.
That said, we plan OTA enhancements/reworks anyways, after an initial (basic) OTA mechanism is merged in the master branch.
Hence, thanks a lot for your proposal to let us know about #8932, indeed ;-)
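
For readers unfamiliar with block-wise transfer: Block1 and Block2 share the same option layout from RFC 7959 (block number NUM, a "more" flag M, and a size exponent SZX, with block size 2^(SZX+4)). The following is a minimal sketch of that encoding only. The type and function names (block_opt_t, block_encode, block_decode, block_size, block_offset) are hypothetical and are not the RIOT CoAP API.

```c
#include <stdint.h>

/* Fields of a CoAP Block1/Block2 option value (RFC 7959). */
typedef struct {
    uint32_t num;   /* block number */
    int more;       /* M flag: non-zero if more blocks follow */
    uint8_t szx;    /* size exponent, 0..6 (7 is reserved) */
} block_opt_t;

/* Pack NUM, M and SZX into the option's unsigned integer value. */
uint32_t block_encode(block_opt_t b)
{
    return (b.num << 4) | ((b.more ? 1u : 0u) << 3) | (b.szx & 0x7u);
}

/* Unpack an option value into its three fields. */
block_opt_t block_decode(uint32_t value)
{
    block_opt_t b;

    b.num = value >> 4;
    b.more = (int)((value >> 3) & 0x1u);
    b.szx = (uint8_t)(value & 0x7u);
    return b;
}

/* Block size in bytes: 2^(SZX + 4), i.e. 16..1024 for SZX 0..6. */
uint32_t block_size(uint8_t szx)
{
    return 1u << (szx + 4);
}

/* Byte offset of this block within the firmware image. */
uint32_t block_offset(block_opt_t b)
{
    return b.num << (b.szx + 4);
}
```

With 64-byte blocks (SZX = 2), block number 3 covers image bytes 192 to 255, so a receiver can write each block straight to the matching offset in the non-running slot.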

@cladmi
Contributor

cladmi commented Sep 5, 2018

Both edbg and openocd binflash have been merged. The branch will need to be rebased with the changes. @kYc0o we can spend some time doing this and testing this afternoon if you want.

It currently still requires setting two different variables to handle both openocd and edbg to flash a binary file but will be addressed by #8838.

@cladmi
Contributor

cladmi commented Sep 5, 2018

Another dependency I remember as I am rebasing the branch is "process dependencies for CPU and move stm32_common periph_flashpage dependencies to Makefile.dep". I could easily provide a test for this one, even if the test does not need to be merged.

@kYc0o
Contributor

kYc0o commented Sep 14, 2018

Finally I got to test the current state with a rebased branch with most recent changes in master related to OTA. The most updated branch is at [1].

Two problems:

  • Compiling for slot 2 doesn't seem to work with the current verification for slot sizes:
    "Specified firmware size does not fit in ROM". However, by bypassing the assert in cortexm.ld it compiles and is actually linked correctly, but:
  • For edbg this line is not working EDBG_ARGS += $(addprefix --offset ,$(IMAGE_OFFSET)):
/Users/facosta/git/RIOT-OS/RIOT/dist/tools/edbg/edbg --offset $((0x1000 --offset + --offset 0x1f800)) -t atmel_cm0p -b -v -p -f /Users/facosta/git/RIOT-OS/RIOT/examples/suit_updater/bin/samr21-xpro/suit_updater-slot2.signed.bin
bash: 0x1000 --offset + --offset 0x1f800: syntax error in expression (error token is "--offset + --offset 0x1f800")
make: *** [/Users/facosta/git/RIOT-OS/RIOT/makefiles/riotboot.mk:112: riotboot/flash-slot2] Error 1

[1] https://github.com/kYc0o/RIOT/tree/wip/rebase/ota_work_branch

@cladmi
Contributor

cladmi commented Sep 17, 2018

For edbg, I did not think about values containing spaces: 'addprefix' prefixes every whitespace-separated word, so it does not work in that case.
Two solutions:

  • Set IMAGE_OFFSET=$((0x1000+0x1f800)), without spaces
  • Replace the EDBG_ARGS line with $(if $(IMAGE_OFFSET),--offset $(IMAGE_OFFSET))

The first one can be done directly on our side to make it work, and we can upstream the proper fix to RIOT afterwards.

@cladmi
Contributor

cladmi commented Sep 17, 2018

I found the issue for the offset on slot 2: the assert was right. In makefiles/riotboot.mk I wrote
FW_ROM_LEN=$(RIOTBOOT_FW_SLOT_SIZE)
but the slot size includes both the metadata and the firmware, so it should be
FW_ROM_LEN=$$(($(RIOTBOOT_FW_SLOT_SIZE) - $(RIOTBOOT_HDR_LEN)))

And then the ldscript changes can be reverted.
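
The arithmetic behind that fix can be sketched in C as follows. This is an illustration only: the function names fw_rom_len and fw_fits_in_slot are hypothetical, and the header length used in the note after the code is an assumed value, not one stated in this thread.

```c
#include <stdint.h>

/* ROM left for the firmware proper once the metadata header is subtracted
 * from the slot. Returns 0 if the header does not even fit in the slot. */
uint32_t fw_rom_len(uint32_t slot_size, uint32_t hdr_len)
{
    return (hdr_len < slot_size) ? slot_size - hdr_len : 0;
}

/* Non-zero if an image of fw_size bytes fits in the slot next to its header. */
int fw_fits_in_slot(uint32_t fw_size, uint32_t slot_size, uint32_t hdr_len)
{
    return fw_size <= fw_rom_len(slot_size, hdr_len);
}
```

For a slot of 0x1f800 bytes and an assumed header of 0x100 bytes, the firmware may use at most 0x1f700 bytes. Linking with the full 0x1f800 as the firmware length is what makes the "does not fit in ROM" assert fire.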

@cladmi
Contributor

cladmi commented Sep 17, 2018

We are not consistent with the names yet. If slot means metadata + firmware, the generated files should be adapted.

@cladmi
Contributor

cladmi commented Sep 17, 2018

I also have some issues when building the bootloader on my laptop: git am, which applies patches to packages, tries to use HOME to find the git configuration and extract the author name and email, but we do not pass HOME down to the build.

For me it is an issue that 'git am' tries to use local configuration to do the commit. I have wanted to fix this for some time but did not have real visible problems to show.

@waehlisch
Member

waehlisch commented Sep 21, 2018

@cladmi the previous dependency graph highlighted two building blocks, (1) Linker/Bootloader and (2) OTA, and each building block showed the same amount of detail. Furthermore, roughly half of the discussions in this issue relate to the bootloader. I really don't see why this issue is exclusively related to OTA.

Just to be clear: If there is a separate bootloader issue, fine with me. But then, #9342 should refer to this issue because the bootloader is a requirement to implement OTA; and any bootloader discussions should then be noted there.

@danpetry
Contributor

Would "OTA support" be a title that would satisfy the requirements discussed here? It seems to me that all parties are, essentially, arguing for something overarching which doesn't exclude bootloader implementation (or, indeed, anything else we need to implement). Would this definition therefore be appropriate?

@cladmi
Contributor

cladmi commented Sep 25, 2018

@waehlisch to be more precise: our goal is not really to write a full "bootloader". In fact, the bootloader implementation only does "find newest firmware, boot" in 30 lines, blank lines included (https://github.com/RIOT-OS/RIOT/pull/9969/files#diff-1dce55b2dc2ce6dc6fd5b018e55a1482R27), which for me does not really need a discussion of its own, except for BS integration and filesystem organization, but that will go in its PR.

What is related to the bootloader, however, and mixed into our description graph and discussions, is that both the OTA implementation and what the bootloader handles are based on "ROM slots".
The ROM is split in 3: a bootloader, and two equal-sized sections, each holding a firmware and a description of its version.
It is the part that is currently called "firmware" in the PR.
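
The three-way split can be sketched as a small layout computation. This is a sketch under assumptions: the names rom_layout_t and rom_layout are hypothetical, and the total ROM length of 0x40000 used in the note after the code is an assumed value, chosen because it reproduces the 0x1000 and 0x1f800 figures from the flashing command earlier in this thread.

```c
#include <stdint.h>

/* Addresses of the three ROM regions: bootloader, slot 1, slot 2. */
typedef struct {
    uint32_t bootloader_start;  /* bootloader sits at the start of ROM */
    uint32_t slot_size;         /* size of each slot (metadata + firmware) */
    uint32_t slot1_start;       /* first slot follows the bootloader */
    uint32_t slot2_start;       /* second slot follows the first */
} rom_layout_t;

/* Split the ROM into a bootloader and two equal-sized slots.
 * The caller must ensure bootloader_len <= rom_len. */
rom_layout_t rom_layout(uint32_t rom_start, uint32_t rom_len,
                        uint32_t bootloader_len)
{
    rom_layout_t l;

    l.bootloader_start = rom_start;
    l.slot_size = (rom_len - bootloader_len) / 2;
    l.slot1_start = rom_start + bootloader_len;
    l.slot2_start = l.slot1_start + l.slot_size;
    return l;
}
```

With rom_start = 0, rom_len = 0x40000 and bootloader_len = 0x1000, each slot is 0x1f800 bytes, slot 1 starts at 0x1000 and slot 2 at 0x20800.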

And all the work integrated since this issue started, which was in the "linker/bootloader" block, is 95% only there to help create and manage these ROM slots, without involving any radio transfer. Maybe naming it "bootloader" did not reflect our goal exactly for outsiders, but it was enough to split the work between the involved developers.

It was mixed into the "linker/bootloader" section because, for us, it made sense as an intermediate standalone goal that could be integrated in RIOT.

The reason we push toward adding a bootloader as an intermediate step, is that it is the simplest way of really testing these mechanisms and so merge them to master without any coap/suit dependencies, but it is never a goal on its own.
The tracking graph was only a technical split of the existing code in different standalone PRs.

Maybe the fact that it is "OTA using ROM slots" should be more emphasized, but not that we try to do a bootloader.

@cladmi cladmi added the Type: tracking The issue tracks and organizes the sub-tasks of a larger effort label Sep 28, 2018
@waehlisch
Member

@cladmi I'm confused. A boot loader basically calls the OS into memory. Boot loaders may have different capabilities (e.g., think about first-stage and second-stage boot loaders). That a boot loader decides between two different OS versions (or "firmwares") is not a conflict. And you can use this piece of software outside of OTA.

I think what confused me is that a link to #10065 (or any other issue that tracks the state of the boot loader) was missed, and that some discussions in #9342 related more to a boot loader instead of core OTA.

@kYc0o
Contributor

kYc0o commented Oct 2, 2018

I think what confused me is that a link to #10065 (or any other issue that tracks the state of the boot loader) was missed,

#10065 is pretty new and in the diagram is clearly the next step (it was also clear in the previous diagram) so it was missed for obvious reasons. Now that it exists we can link it here and discuss there the implementation and design. Here we can continue to discuss the next steps.

and that some discussions in #9342 related more to a boot loader instead of core OTA.

What do you call core OTA? In that regard, you might want to send firmwares, or whatever, Over The Air, but without a bootloader I don't think it will work flawlessly (ok you can boot the update from the running image, just don't dare to reboot the node).

Thus, to be clear about my position: OTA stands for the whole process of updating the firmware of a running RIOT node, which involves a bootloader, wired/wireless protocols to transfer the firmware and the tools to secure it, if it's wanted. IMHO no renaming is needed.

@waehlisch
Member

@kYc0o

#10065 is pretty new and in the diagram is clearly the next step (it was also clear in the previous diagram) so it was missed for obvious reasons.

I don't know why it was missed for obvious reasons but this missing piece of information was one reason why I was originally arguing that the caption is misleading. And, I don't care if you discuss bootloader and OTA in a single or in separate issues/PRs ;).

Thus, to be clear about my position: OTA stands for the whole process of updating the firmware of a running RIOT node, which involves a bootloader, wired/wireless protocols to transfer the firmware and the tools to secure it, if it's wanted. IMHO no renaming is needed.

I don't agree here: OTA is one functionality, boot loader is another. OTA needs a boot loader, but a boot loader doesn't need OTA.

For me, OTA includes everything that (securely) delivers the OS wirelessly to the target.

The boot loader ensures that the stored OS is executed, maybe also securely.

Side comment: There are other options to ship the OS. Think about a USB stick.

@emmanuelsearch
Member

@bergzand @cladmi @kYc0o @danpetry for the record, to summarize based on the last TF discussions, the immediate next steps are simplified such that we go for:
(please modify/complement if I missed something)

After that, a simple follow-up PR providing CoAP handlers for OTA compliant with SUIT (draft-01 version) would be close-at-hand, now that CoAP block2 support has been merged.

@emmanuelsearch
Member

emmanuelsearch commented Aug 15, 2019

Note: efforts have moved to #11818

@cladmi
Contributor

cladmi commented Aug 15, 2019

Issue with riotboot: "bootloaders|tests/riotboot: broken with BUILD_IN_DOCKER and wrong flashfile" #12003

@miri64
Member

miri64 commented Jul 2, 2020

I don't see any progress tracking in this tracking issue. So what is the progress?

@aabadie aabadie removed the Type: tracking The issue tracks and organizes the sub-tasks of a larger effort label May 20, 2021
@MrKevinWeiss MrKevinWeiss added this to the Release 2021.07 milestone Jun 22, 2021
@MrKevinWeiss MrKevinWeiss removed this from the Release 2021.07 milestone Jul 15, 2021
@stale

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Mar 2, 2022
@stale stale bot closed this as completed Apr 18, 2022
@miri64 miri64 added the Type: tracking The issue tracks and organizes the sub-tasks of a larger effort label Apr 18, 2022
@miri64 miri64 reopened this Apr 18, 2022
@stale stale bot removed the State: stale State: The issue / PR has no activity for >185 days label Apr 18, 2022
@maribu
Member

maribu commented Sep 19, 2022

Ping?

@danpetry danpetry removed their assignment Sep 20, 2022
@maribu
Member

maribu commented May 22, 2023

Let's see if anyone screams when I close this.

@maribu maribu closed this as completed May 22, 2023
@kYc0o
Contributor

kYc0o commented May 22, 2023

😱
