Core SPM programmer: request for comments #940

stefanrueger · 2022-04-18T19:55:01Z

stefanrueger
Apr 18, 2022
Maintainer

This is to start a discussion on avrdude support for bootloaders (ie, SPM programmers).

I propose modelling SPM programmers along existing bootloaders and along the capabilities of the SPM/LPM opcode and EEPROM access (or, in newer parts, along the capabilities of the NVM Controller). This is to complement avrdude's modelling of SPI programming, PDI, UPDI, HVPP, JTAG, debugWIRE etc.

Some of the peculiarities of SPM programmers: they

Work without intermediate programming device
Want to be really, really small more often than not
Hence, are unlikely to implement all possible r/w on memories
Do have a page erase command
Could perform a read-erase-modify-write page feature for sub-page modifications in flash (but more often than not don't)
Operate on byte addresses, even for flash (RAMPZ and Z are set to byte addresses for the SPM and LPM opcodes)
Cannot write fuses (in contrast to external programming)
Need a reset signal for the part to enter the bootloader code

In essence, SPM programmers allow a higher abstraction level than other modes of programming: the vagaries of the actual to be programmed part are mostly addressed in the bootloader, though avrdude benefits from knowing what the highest flash address is, the flash page size and a few parameters that depend on the bootloader but not on the part (for example, which memories the bootloader can read/write and at which granularity, whether the bootloader is a vector bootloader, which interrupt is used for that, how big the bootloader is so avrdude does not overwrite it etc).

One popular current avrdude SPM programmer is arduino, which is used for uploading/downloading applications and EEPROM contents via a bootloader on the part. It builds upon the STK500 protocol that models external programming and behaves as if the bootloader was an external programming device. Essentially, for all intents and purposes, the arduino programmer is an STK500 programmer, with one minor but important difference: it plucks the DTR line with a view to issuing a reset on the connected part, so that the part runs the bootloader code.

In theory, a bootloader could emulate an STK500 external programmer, but in practice they don't necessarily do this.

I can see the following clear disadvantages of this bootloader-as-an-STK500 approach: avrdude -c arduino

Requests SW/HW version from the SPM programmer, which costs code bytes in the bootloader (decode the request, send something back) with little benefit
Humours not-needed get parameter and set parameter requests; there is no intermediate device with parameters; decoding the requests and ignoring them costs unnecessary bootloader code
Expects a chip erase command to be honoured, although such a command is neither needed nor universally implemented in bootloaders (because SPM flash programming can erase a page before writing it)
Communicates 16-bit word addresses for both EEPROM and flash page reads/writes: the bootloader therefore needs to multiply the word address by two (4 bytes code), and for 128 kB flash parts put the carry into RAMPZ (more code bytes); bootloaders of larger parts require the decoding (a handful of code bytes gone) of a specific load extended-byte-of-a-word-address command and need to store that upper byte of a three-byte word address in a variable to be used later to compute the effective byte address. More code bytes gone. For bootloaders this is a clumsy way of learning the needed byte address. A small change of the protocol between avrdude and the bootloader would communicate the 16-bit byte address for parts with up to 64 kB flash, and the 24-bit byte address for parts with larger flash. Simples.
Does not particularly support vector bootloaders, which want the reset vector to point to them and want another (never otherwise used) vector in (or just behind) the vector table to point to the application; patching the vector table by the bootloader itself during avrdude upload/verify is costly (the optiboot source code reckons some 110 bytes), but this is a task that a dedicated avrdude SPM programmer could easily do, meaning it should cost the bootloader zero bytes of code
Identifies the to be programmed part by requesting signature bytes through a protocol command exchange that needs to be decoded and served; again, this puts some strain on the bootloader but there are other mechanisms costing less (or no) bootloader code
And then there is Issue #918 (Some input files are written incompletely to flash), where trailing 0xff-suppression prevents the file input from being properly programmed and verified
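
To illustrate the addressing point in the list above, here is a minimal sketch of the two schemes: the STK500-style 16-bit word address plus an extended address byte, against sending the byte address directly. The function names and the little-endian byte order are assumptions made for illustration only; they are not taken from avrdude or from any bootloader.

```python
def byte_addr_from_stk500(word_addr_16, ext_byte=0):
    """Byte address from a 16-bit word address and the extended (upper) byte."""
    if not 0 <= word_addr_16 <= 0xFFFF or not 0 <= ext_byte <= 0xFF:
        raise ValueError("address field out of range")
    # The doubling below is the work the bootloader has to do today
    return ((ext_byte << 16) | word_addr_16) << 1


def rampz_and_z(byte_addr):
    """Split a byte address into RAMPZ (bits 16 and up) and the 16-bit Z pointer."""
    return byte_addr >> 16, byte_addr & 0xFFFF


def encode_byte_addr(byte_addr, flash_size):
    """Suggested alternative: send the byte address as is (assumed little endian),
    two bytes for parts with up to 64 KiB flash, three bytes for larger parts."""
    n = 2 if flash_size <= 0x10000 else 3
    return byte_addr.to_bytes(n, "little")
```

With a byte address on the wire the bootloader can load Z (and RAMPZ) directly, without the shift and carry handling.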

So, I suggest writing a generic core SPM programmer (I am happy to do so, and have experience in doing so), which supports the following minimum capability of a bootloader

Paged flash write at page boundaries

and zero or more of the following

Buffer flash read at arbitrary byte positions (max size is page size or 256 whichever is greater)
Buffer EEPROM read and write at arbitrary byte positions (max 256 buffer size)
EEPROM erase
Flash erase (apart from obviously the bootloader itself)
Vector bootloader

Optiboot is an example of a bootloader that supports flash read/write as above, with some optiboot renderings also supporting EEPROM r/w (it's a compile-time option). The core SPM programmer would emulate the EEPROM erase and/or flash erase by writing 0xff to flash/EEPROM as a slower alternative if the bootloader does not provide this functionality.

I suggest using by and large the syntax of STK500 communications with small changes to the protocol to support 2/3-byte addresses and to support the bootloader telling avrdude early on which part it is and which subset of the 5 optional capabilities the bootloader has implemented. I suggest using two bytes for this, so this scheme can afford 11 bits for 2048 different parts; currently avrdude knows 313 parts (one of which AVR32), avr-libc 293 parts and Microchip's ATDF files know 323 AVR8 parts. The union of these three sources amounts to 374 parts though some of the older/smaller parts are clearly not suitable for bootloader programming at all (no RAM or no SPM).
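
A sketch of how two bytes could carry an 11-bit part number plus the five optional capabilities listed earlier. The bit layout, the byte order and the capability names are assumptions for illustration; the post does not fix any of them.

```python
# Hypothetical names for the five optional capabilities, one bit each
CAPS = ("flash_read", "eeprom_rw", "eeprom_erase", "flash_erase", "vector_bootloader")


def pack_id(part_id, caps):
    """Pack an 11-bit part number and a set of capability names into two bytes."""
    if not 0 <= part_id < 2048:
        raise ValueError("part number needs to fit into 11 bits")
    bits = 0
    for i, name in enumerate(CAPS):
        if name in caps:
            bits |= 1 << i
    # Assumed layout: capabilities in the top 5 bits, part number in the low 11
    return ((bits << 11) | part_id).to_bytes(2, "little")


def unpack_id(raw):
    """Inverse of pack_id(): returns (part_id, set of capability names)."""
    value = int.from_bytes(raw, "little")
    bits = value >> 11
    return value & 0x07FF, {name for i, name in enumerate(CAPS) if bits & (1 << i)}
```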

The remaining parameters (application vector number for vector bootloaders, size of bootloader) can come from a few table bytes at the end of the bootloader, or failing that, from extended -x command line options.

For all this to work, avrdude's proposed core SPM programmer might need/benefit from

A programmer parameter is_bootloader in .conf (default off for backward compatibility)
The ability to switch off the removal of trailing 0xff sequences in the input file or on avr read
A readhook function at the end of input file reads, so the programmer can patch the vector table on upload and verify; the SPM programmer would insert a function during initialisation.
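
What such a readhook might do for a vector bootloader can be sketched as follows. The sketch only covers parts with at most 8 KiB flash, where vectors are single rjmp opcodes and the 12-bit rjmp offset wraps around the whole flash; the function names, the bootloader start and the vector number used by a caller are hypothetical.

```python
def rjmp(from_word, to_word):
    """Opcode of an rjmp placed at word address from_word that lands on to_word."""
    return 0xC000 | ((to_word - from_word - 1) & 0x0FFF)


def rjmp_target(from_word, opcode, flash_words):
    """Word address that an rjmp opcode at from_word jumps to."""
    if opcode & 0xF000 != 0xC000:
        raise ValueError("not an rjmp opcode")
    return (from_word + 1 + (opcode & 0x0FFF)) % flash_words


def patch_vectors(image, flash_size, boot_start, app_vector):
    """Point the reset vector to the bootloader and vector app_vector to the
    application start that the unpatched reset vector used to jump to."""
    flash_words = flash_size // 2
    if flash_words > 4096:
        raise ValueError("sketch only covers parts with up to 8 KiB flash")
    app_start = rjmp_target(0, int.from_bytes(image[0:2], "little"), flash_words)
    patched = bytearray(image)
    patched[0:2] = rjmp(0, boot_start // 2).to_bytes(2, "little")
    slot = 2 * app_vector  # byte offset of the two-byte vector slot
    patched[slot:slot + 2] = rjmp(app_vector, app_start).to_bytes(2, "little")
    return bytes(patched)
```

Running the same patch on the input file before both upload and verify keeps the two operations consistent, which is why a hook at the end of the file read is a convenient place for it.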

OK, that's the long of it. Comments appreciated.

stefanrueger · 2022-04-27T18:33:54Z

stefanrueger
Apr 27, 2022
Maintainer Author

From PR #936:

I don't think there's a big difference in program size between "upon 'chip erase', loop over all application flash pages, and erase them" vs. "upon 'chip erase', do nothing, but when programming a page, first erase it".

This is not wrong. Below are the two scenarios on the back of an SSD packet.

Upon 'chip erase', do nothing, but when programming a page, first erase it:

Bootloaders served by avrdude -c arduino will ignore a few requests anyway; with code for that already spent (out of necessity) the marginal cost of ignoring chip erase is zero
The marginal cost of a page erase before writing a page is 4 bytes (loading the page erase command and calling a subroutine that carries out the SPM opcode and waits/polls until it is finished)

Upon 'chip erase', loop over all application flash pages, and erase them:

Compare avrdude's request with chip erase and branch (4 bytes min)
Managing the page erase loop: initialise RAMPZ/Z, and increment/decrement that pointer by page size until it hits the bootloader/NULL; depending on part, page size, flash size, how well the compiler optimises (or you write assembler) I'd estimate min 12 bytes, typ 20 bytes, max 30 bytes
Page erase 4 bytes (loop contents)
Sign off code (min 2 bytes but probably more like 4-6 bytes)
Housekeeping (reinstating RAMPZ/Z etc) min zero bytes (but can be more)

So we are looking at approximately 22-44 bytes vs 4 bytes.
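
The totals follow from adding up the per-item estimates of the second scenario; a small worked check, taking the (min, max) pairs from the list above and zero for housekeeping:

```python
# (item, min bytes, max bytes) for "loop over all application pages on chip erase"
costs = [
    ("compare request with chip erase and branch", 4, 4),
    ("manage the page erase loop", 12, 30),
    ("page erase (loop contents)", 4, 4),
    ("sign off", 2, 6),
    ("housekeeping", 0, 0),
]

low = sum(c[1] for c in costs)
high = sum(c[2] for c in costs)
erase_before_write = 4  # marginal cost of the first scenario

print(f"{low}-{high} bytes vs {erase_before_write} bytes")
```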

One question is whether this difference is big: it can make up 10% of the bootloader size and, in many cases, risks pushing the bootloader size over a page boundary. And we are not even talking about reading the actual fuse byte that tells us whether chip erase implies EEPROM erase, and if needed erasing that too.

Another is whether the second scenario is really needed.

In my world there is nothing wrong with the first scenario. The unused flash may look messy from previous use, but the suggested SPM programmer could emulate the chip erase by looping in avrdude (plenty o' RAM there) to clear the flash (slower, but that's the trade-off to save flash for every application). And the user calls the shots by using -D or not for explicit chip erase.
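
A minimal sketch of that host-side emulation, assuming the bootloader sits at the top of flash, its size in pages is known, and its paged write erases the page implicitly. `write_page` stands for whatever routine sends one page to the bootloader; it and the function name are hypothetical.

```python
def emulate_chip_erase(write_page, flash_size, page_size, boot_pages):
    """Overwrite every application page with 0xff; returns the number of pages written."""
    blank = bytes([0xFF]) * page_size
    app_end = flash_size - boot_pages * page_size  # first byte of the bootloader
    for addr in range(0, app_end, page_size):
        write_page(addr, blank)
    return app_end // page_size
```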

I like keeping flash for the application. But I am also partial to bootloaders: I like them for rapid prototyping. And testing. And EEPROM r/w. And for providing a writepage(flash, ram) subroutine accessible by the application. And, hence, for the ability to use the flash not used by the application as poor man's EEPROM. (It is mildly satisfying to write a little sketch for a 3 USD ProMini rig with CR2032 and a temperature sensor that records to flash the 7200 temperature values of a 2 hour washing machine run, the result of which I can read with avrdude using the bootloader).


dl8dtl · 2022-04-27T20:04:40Z

dl8dtl
Apr 27, 2022
Maintainer

I was thinking that for an explicit SPM programmer implementation, it might be possible to offload a "chip erase emulation" (erase everything except the bootloader) into AVRDUDE - but then, only the bootloader knows the size of what constitutes "application flash", at least on small devices where it cannot be deduced from the fuses (which btw. is currently not possible anyway as AVRDUDE has no knowledge about the meaning of each fuse bit in each controller).

So, for an SPM programmer protocol, it would be essential to provide a method where AVRDUDE can query the size of the application section.


dl8dtl · 2022-04-27T20:05:40Z

dl8dtl
Apr 27, 2022
Maintainer

Btw., I tend to transfer this into the discussion area. Objections?


stefanrueger · 2022-04-27T21:24:17Z

stefanrueger
Apr 27, 2022
Maintainer Author

Btw., I tend to transfer this into the discussion area. Objections?

Great Idea

So, for an SPM programmer protocol, it would be essential to provide a method where AVRDUDE can query the size of the application section.

Correct. There are plenty of possibilities of doing so. The way I see it is that we have freedom to determine a) what SPM programmers can and cannot do (ie, how they should behave); b) how they communicate with the host; c) what avrdude promises to provide and what it promises to not request (to avoid code in the bootloader that ignores requests). The best outcome for this exercise is something that helps bootloader writers to write efficient bootloaders.

As an example, I have a prototype implementation of the following idea: the bootloader keeps a six-byte table at FLASHEND with

its version number (one byte)
a capabilities byte (EEPROM r/w, is_vector_bootloader, keep_reset_cause in R2, is_over_the_air bootloader, ...)
a two byte rjmp to a writepage(flash, ram) function or a ret if not implemented
the number of pages that the bootloader occupies
the vector number used for the r/jmp to the application

During handshake the bootloader lets avrdude know which part it sits on. Avrdude knows where FLASHEND of the part is and requests the six bytes below FLASHEND from the bootloader using the normal buffer flash read. So, that flash read routine doubles up for this info exchange at no extra cost for the bootloader. At the time I had not thought about bootloaders that do not wish to implement flash read.
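
On the host side, decoding such a table could look like the sketch below. The field order follows the list above, but the exact byte layout, the little-endian opcode and all names are assumptions; only the `ret` opcode value 0x9508 is a fixed AVR fact.

```python
from typing import NamedTuple

RET = 0x9508  # AVR opcode for ret


class BootTable(NamedTuple):
    version: int
    capabilities: int
    writepage_opcode: int  # rjmp to writepage(), or ret if not implemented
    boot_pages: int
    app_vector: int


def table_address(flash_size):
    """Byte address of the six-byte table that ends at the top of flash."""
    return flash_size - 6


def parse_table(raw):
    """Decode the six bytes read from table_address() onwards."""
    if len(raw) != 6:
        raise ValueError("table has six bytes")
    return BootTable(raw[0], raw[1], int.from_bytes(raw[2:4], "little"), raw[4], raw[5])


def application_limit(table, flash_size, page_size):
    """First byte address that belongs to the bootloader."""
    return flash_size - table.boot_pages * page_size
```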


stefanrueger · 2022-05-06T13:07:39Z

stefanrueger
May 6, 2022
Maintainer Author

provide a method where AVRDUDE can query the size of the application section

Several options exist with minimal code footprint in the bootloader

1. One byte specifying the number of pages of the bootloader in a table at FLASHEND is sufficient
2. Could be a new parameter in the SPM bootloader description in .conf
3. Could be an extended option -x for the user to specify

Method 2 can lead to a proliferation of SPM programmers in .conf or the user's config file, so I am slightly in favour of 1 if the bootloader provides a method to read flash and 3 if not.

Other query methods/protocols exist but I don't see how they can be implemented without spending more than one byte bootloader code.

The same set of methods could be used for SPM programmers communicating

Which vector to use for vector bootloaders
Capabilities/properties of the SPM programmer (apart from being able to read flash)
Version number of the bootloader

Backward compatibility: Existing bootloaders, eg, served by -c arduino should be recognised, modelled and served through the SPM programmer.

From a practical point of view, an implementation of a generic SPM programmer could (try to) figure out whether it talks to a bootloader with a specific protocol. For example, -c arduino and optiboot use the STK500 communication protocol to talk to each other about the tasks at hand. If the generic SPM programmer sees responses compatible with the STK500 communication protocol, it could indeed restrict itself to that protocol and implement the ideas developed here as far as possible. This would require Method 2 or 3 for communicating bootloader size when needed.


stefanrueger · 2022-05-06T13:09:47Z

stefanrueger
May 6, 2022
Maintainer Author

From Issue #944

If the SPM programmer realises that the bootloader does not implement flash read it would be good if it was able to tell upload() not to bother with flash verification. It is probably a better user experience if AVRDUDE explains to the user on flash verifies that the bootloader cannot perform flash reads and continues with its operation. Currently, a user requesting verification of all memories through absence of -V will see AVRDUDE exit with an error message, causing unnecessary worry.


stefanrueger · 2022-05-06T13:21:24Z

stefanrueger
May 6, 2022
Maintainer Author

Memory granularity.

AVRDUDE reads/writes/verifies the whole memory only (with the exception of specific, hardcoded named subsections of flash in the case of application, apptable and boot).

There is utility in being able to do so at a smaller granularity. Use examples:

-U flash.104.4:r:-:h to output the first four bytes at position 104 (for ATmega328p this is just after the vector table); putting tables or data into the .vectors section is a neat trick for developers to store read-only parameters at a known address
-U pastvectors.0.4:r:-:h same as above but the user does not need to know where the vector table ends (AVRDUDE knowing the size of a part's vector table is useful for SPM programmers anyway)
-U flash.-6.3:r:-:h to output the first three bytes of the top six bytes in flash
-U eeprom.256.8:w:-:h to write an 8-byte parameter into EEPROM at position 256
-U startboot.-14400.14400:r:-:h to output the top 7200 words of flash below the bootloader (that, eg, may have been used by the application as poor man's EEPROM to save the temperature profile of a washing machine run)
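
The memory.offset.length notation used in these examples could be resolved along the following lines. This is only a sketch of the suggested syntax, which AVRDUDE does not implement; the function name is hypothetical, and the size of the named memory is assumed to be supplied by the caller. Negative offsets count from the end of the memory.

```python
def resolve_segment(spec, mem_size):
    """Turn 'name', or 'name.offset.length', into (name, start, length)."""
    parts = spec.split(".")
    name = parts[0]
    if len(parts) == 1:
        return name, 0, mem_size  # whole memory, as today
    if len(parts) != 3:
        raise ValueError("expected name.offset.length")
    offset, length = int(parts[1], 0), int(parts[2], 0)
    start = offset + mem_size if offset < 0 else offset
    if length < 0 or start < 0 or start + length > mem_size:
        raise ValueError("segment outside memory")
    return name, start, length
```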

This is relatively easy to implement for SPM programmers because bootloaders have the super power of SPM page erase. The SPM programmer in AVRDUDE can therefore implement sub-page writes through a page read, modify and page write cycle. Again, at no extra code footprint to the bootloader.
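
A sketch of that read, modify and write cycle, assuming the bootloader offers a page read and a page write with implicit erase. `read_page` and `write_page` are hypothetical callables standing in for the protocol exchange.

```python
def write_bytes(read_page, write_page, page_size, addr, data):
    """Write data at an arbitrary byte address, touching only the pages it overlaps."""
    if not data:
        return
    end = addr + len(data)
    first = addr - addr % page_size  # start of the first affected page
    for base in range(first, end, page_size):
        page = bytearray(read_page(base))           # read
        lo, hi = max(addr, base), min(end, base + page_size)
        page[lo - base:hi - base] = data[lo - addr:hi - addr]  # modify
        write_page(base, bytes(page))               # erase and write
```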

The "small memory granularity" idea is simple. A backward compatible extension of the current AVRDUDE usage syntax is simple (eg, as indicated above). I do realise that the current codebase does not lend itself readily to a simple implementation, though. Essentially, this would require a remodelling of AVRDUDE's understanding of memory with consequences for how update() goes about its task.

Being able to write small subsections of flash is (relatively) easy for any AVRDUDE programmer that has explicit page erase or implicit page erase in paged writes at their disposal.

Small-granularity read/writes/verifies is an idea that transcends SPM programmers, though: it is also possible for AVRDUDE to effect this for any of its programmers. In the worst case, flash writes of a small section would involve a chip read (including EEPROM), modify, chip erase and write. (This is what a user would have to do manually otherwise).


stefanrueger · 2022-05-11T18:19:13Z

stefanrueger
May 11, 2022
Maintainer Author

Metadata.

One of the benefits of a bootloader is that it can, and often does, export a function writepage() for writing a page to flash or, alternatively, a function dospm() for executing a parameterised SPM opcode (which on some parts only works in "bootable" flash memory). The API for how the bootloader function(s) are accessed and what they do varies from bootloader to bootloader. This in itself is of little concern to AVRDUDE as, ultimately, it is the application that needs to interact with the bootloader in these cases.

However, the compiled application usually has no knowledge how large it is. An application that wishes to utilise writepage() for implementing an EEPROM-like storage really needs to know where the application ends so it can keep to the bounds of unused flash for the additional store.

AVRDUDE not only knows the code size but also can easily write it to a location in flash that the application can access. A natural place is just below the bootloader. Generalising this idea, AVRDUDE can (optionally, driven by the user) drop the last-modified date and name of the input file just below the bootloader, too.

The idea of an optional metadata section just below the bootloader only costs one byte flash expressing which pieces of metadata are available and where they reside.

Of course, users can, and always could, assemble their ultimate flash image outside AVRDUDE, and just use AVRDUDE to upload the entire image. However, this requires mastery of tools that manipulate code sections, patch the vector table (for a vector bootloader), merge code sections, tables, metadata, the bootloader itself etc.

AVRDUDE offering to write a few pertinent pieces of metadata of its own accord would give the "ordinary" user new functionality. Optional, so backward compatible.

5 replies

dl8dtl May 11, 2022
Maintainer

However, the compiled application usually has no knowledge how large it is.

Well, that's easy to handle, because the linker leaves global symbols for all kind of stuff. With the standard AVR linker script, the symbol __data_load_end marks the end of flash allocation. The application can simply query that symbol.

We once used to have that kind of thing in AVRDUDE that doesn't really belong in it (the "cycle counter"), but it led me to the conclusion that this is basically something that would better be handled outside of the programming tool. If at all, such functionality should be made pretty generic to not just handle that single case but to allow for a kind of scripting that can then access both the loaded file metadata and the libavrdude backend functions. (Anyone asking for embedding a lua interpreter? ;-)

The bootloader itself (i.e. the code running on the AVR) is obviously outside of AVRDUDE's domain anyway. Of course, nothing prevents us from creating a kind of sibling project here next to avrdudes/avrdude, that might act as a kind of bootloader reference, but maintaining it is beyond my available time.

Regarding the AVRDUDE part of it, I'd like to point out again that I really wish to see page erasing become a part of AVRDUDE itself, and enable it not only for an SPM bootloader but also for all the tool/MCU combinations where it is available. I still think the default behavior should remain a chip erase (for backwards compatibility, and for data security), but page erases are (as I once have heard) much less stressful to the device so we might teach our users to prefer them wherever they are available.

stefanrueger May 11, 2022
Maintainer Author

symbol __data_load_end

Great tip, I had not known that. Thanks. Solves the inside-application access, indeed. But see below for external reading/writing contents from/to flash behind __data_load_end.

If at all, such functionality should be made pretty generic

I agree. The generic use is a kind of flexible apptable; I call that store. I have started exploring whether/how the SPM programmer can upon initialisation create a memory store that occupies the unused flash between application and metadata/bootloader. Here are some pretty generic uses:

avrdude -c urboot -U store:r:-:h reads out the data that an application has written to flash behind __data_load_end (no code needed in the application to serve the stored data, the bootloader does that)
avrdude -c urboot -U store:r:data.tab:h -U flash:w:myapp.hex:i -U store:w:data.tab:h for uploading a new application while keeping the previous data

stefanrueger May 12, 2022
Maintainer Author

wish to see page erasing to become a part of AVRDUDE itself

Certainly useful for programmers of parts that have external page erase through PDI/UPDI.

For now, in the absence of explicit AVRDUDE page erases, the SPM programmer can still easily handle this on its own. Two cases:

The bootloader's paged write carries out an implicit SPM erase before actually writing the data to flash: the SPM programmer calls its paged write routine that transfers the page data to the bootloader; this is also how optiboot works.
The bootloader instead provides two commands, explicit SPM page erase and paged write without page erase: the SPM programmer's paged write would need to call the bootloader's SPM page erase first followed by pushing the data to the bootloader's paged write.

The first case actually makes for more compact bootloader code, particularly for those bootloaders that provide a writepage() function.

Once AVRDUDE provides explicit page erase management, the first case can then be dealt with by ignoring AVRDUDE's page erase requests (or, if need be, emulating it by requesting the bootloader to write a page full of 0xff), and the second case by untangling the SPM programmer's paged write routine. Easy.

data security

Overrated. Anyone who is keen on data security should primarily analyse the board, the fuses, the application and any bootloader. They should not look to AVRDUDE. That has a limited role, principally because a malicious attacker is free to roll their own tools; they do not have to use AVRDUDE. So, if a board sports an MCU with U/PDI, and the fuses are set to allow using it, then, well, Eve can do all she wants with Bob's data on the MCU once she has physical access to the board. Same if an MCU has a bootloader, and Eve can invoke it by resetting the board, then she may well be able to read out the chip.

There is some role for AVRDUDE, though, wrt data security, which is protection against accidental or involuntary leaks. Imagine an MCU board with optiboot and an application containing the secret recipe of Coca Cola: When the board is decommissioned the security engineer might upload blink.hex in the mistaken belief that this erases the chip. Not cool! (However, if the upload was avrdude -D -U flash:w:blink.hex:i, well, Coca Cola wouldn't get much sympathy from me. And in fairness to optiboot, all the Makefiles I have ever seen for its use specified a -D option on the avrdude -c arduino command line).

While discussing -D: Let's assume my blink.hex has 239 bytes (deliberately prime). With my proposed SPM programmer avrdude -c urboot -D -U flash:w:blink.hex:i would actually only alter the first 239 bytes: it would first read the padding bytes for the last application page from the chip, so that bytes number 240 and higher remain unchanged. Obvs, if a bootloader does not implement flash read, it would have to be 0xff that are written when filling up the page.
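
The padding rule for the last page can be sketched like this. `read_flash(addr, n)` is a hypothetical callable for the bootloader's flash read; when it is not available the filler is 0xff, as described above.

```python
def pad_to_pages(image, page_size, read_flash=None):
    """Extend image to a whole number of pages without altering bytes behind it."""
    tail = len(image) % page_size
    if tail == 0:
        return bytes(image)
    missing = page_size - tail
    if read_flash is not None:
        filler = bytes(read_flash(len(image), missing))  # keep what is on the chip
    else:
        filler = b"\xff" * missing                        # no flash read available
    return bytes(image) + filler
```

For a 239-byte blink.hex and a 128-byte page this pads 17 bytes, so two pages are written and the contents from byte 240 onwards survive if they could be read first.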

stefanrueger May 13, 2022
Maintainer Author

Second thoughts

I find it hard to justify spending code bytes in the bootloader for separating any atomic erase and write page function into a stand-alone page erase and a paged write that only works when the page has been erased beforehand. As Confucius says, one hundred times ten bytes is (almost) a kilobyte.

So suggest only considering/modelling scenario 1 above.

stefanrueger May 13, 2022
Maintainer Author

bootloader reference

Good idea for the SPM programmer to be able to point to an existing reference bootloader. I plan to publish and maintain my suite of pretty small standard bootloaders (typ 256-384 bytes depending on whether code for EEPROM r/w or other frills is included) that will go hand in hand with the suggested SPM programmer.

Even without a bootloader reference the very existence of a well-conceived SPM programmer will make people take advantage of it. I imagine that optiboot was born out of a desire to reduce the size of the 2k default bootloader; they decided to model after avrdude -c arduino comms, so optiboot emulates an STK500 skeleton (though it never needed to be/claim/pose as one). Once AVRDUDE offers an alternative SPM programmer then I predict there will be people utilising that, possibly even the optiboot project itself, particularly as it will serve vector bootloaders with ease.

stefanrueger · 2022-05-11T18:22:32Z

stefanrueger
May 11, 2022
Maintainer Author

Summarising, I suggest implementing an SPM programmer that

Is backward compatible with current bootloaders served by -c arduino
Puts far less burden on the bootloader code through a slightly modified communication protocol
Offloads chip erase from bootloader to the suggested SPM programmer
Supports vector bootloaders through automated patching of the vector table
Entails metadata just below bootloader

I suggest not exporting higher-granularity memory r/w through -U ... at this time (though SPM programmers can easily do this) as this is an idea outside the domain of an encapsulated programmer and would require a non-trivial change of the code base.

1 reply

stefanrueger May 13, 2022
Maintainer Author

Also

Emulates stand-alone page erase (where needed) as upload of 0xff page

stefanrueger · 2022-05-13T17:34:13Z

stefanrueger
May 13, 2022
Maintainer Author

Mandate

When the user asks a file to be programmed without chip erase, I don't see a mandate to alter bytes outside the segment(s) that the input file defines, so suggest read(s) if needed (and possible) to pad page, then modify the page, followed by an atomic erase and write page.

In fairness, more often than not the users won't care what's outside the to be uploaded file (but how can I be sure?), so the read would be unnecessary and just cost time. Can be solved by ...

ignoring (not anticipating a lot of extra time to read the couple of occasions that need padding the page, so can run with that)
another option -Q for quick (and dirty) padding with 0xff


stefanrueger · 2022-05-21T10:36:51Z

stefanrueger
May 21, 2022
Maintainer Author

AVRDUDE terminal

SPM programmers give a poor experience when combined with AVRDUDE's -t terminal use because

Bootloaders typically only provide paged flash/EEPROM access whilst the terminal code uses byte access
Bootloaders time out (you'd have to type pretty fast!)

Therefore you currently get no joy when doing, eg,

$ echo hello, world | avrdude -qq -c arduino -p m328p -P /dev/ttyUSB0 -U eeprom:w:-:r
$ echo "d ee 0 32"  | avrdude -qq -c arduino -p m328p -P /dev/ttyUSB0 -t
>>> d ee 0 32 
0000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
0010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

I solved this in my urclock SPM bootloader by emulating flash/EEPROM byte reads through calling paged reads:

$ echo "d ee 0 32"  | avrdude -qq -c urclock -p m328p -P /dev/ttyUSB0 -t
>>> d ee 0 32 
0000  68 65 6c 6c 6f 2c 20 77  6f 72 6c 64 0a ff ff ff  |hello, world ...|
0010  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|

BTW, this also goes some way to address the memory granularity access mentioned earlier. Above allows bootloader users to extract quickly certain sections of the EEPROM/flash, eg, to obtain a board id that the project stored somewhere.

Given that bootloaders are often connected with a slow-ish serial connection, that solution is pretty slow for larger memory dumps through the overhead of address loads for each byte. Here, an explicit programmer characterisation is_bootloader might tell the terminal code to use paged reads for larger memory dumps only. Despite its name bootloaders' *_paged_load() can read 256 bytes at a time irrespective of page size.
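
One way of emulating byte reads on top of paged reads without paying an address load per byte is to cache the last chunk read. This is a sketch only, not how urclock or term.c do it; `paged_load(addr, n)` is a hypothetical callable for the programmer's paged read, and the 256-byte chunk follows the remark above.

```python
class PagedByteReader:
    """Serve single-byte reads from chunks fetched through a paged read."""

    def __init__(self, paged_load, mem_size, chunk=256):
        self.paged_load = paged_load
        self.mem_size = mem_size
        self.chunk = chunk
        self.base = None   # start address of the cached chunk
        self.cache = b""
        self.loads = 0     # number of paged reads issued so far

    def read_byte(self, addr):
        if not 0 <= addr < self.mem_size:
            raise ValueError("address outside memory")
        base = addr - addr % self.chunk
        if base != self.base:  # cache miss: fetch the chunk containing addr
            n = min(self.chunk, self.mem_size - base)
            self.cache = bytes(self.paged_load(base, n))
            self.base = base
            self.loads += 1
        return self.cache[addr - base]
```

A terminal dump of 32 consecutive bytes then costs one paged read instead of 32 address loads. The cache would need invalidating after any write.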

The other aspect is keeping a bootloader alive in terminal mode. That can be achieved by changing readline() in term.c to use callbacks for is_bootloader programmers, so the terminal_get_input() function can keep the bootloader alive by requesting GET_SYNC every 100 ms or so.

Summarising, with an explicit characterisation of an AVRDUDE programmer as is_bootloader small-ish changes of the code in term.c could give bootloader users a much better AVRDUDE experience.

2 replies

mcuee Jun 11, 2022
Maintainer

@stefanrueger Just wondering if you have some open repo to test out your urclock SPM bootloader in terminal mode? Thanks.

stefanrueger Jun 11, 2022
Maintainer Author

Planning to publish an SPM programmer fit for AVRDUDE over the (northern hemisphere) summer. I want to take feedback from this discussion into consideration first.

Its integration into the AVRDUDE code base is relatively straightforward, but does require changes there. So far even small suggested changes otherwise (eg, PR #936) have not been accepted - so it remains unclear whether the AVRDUDE project is open to better bootloader support.

One of the stumbling blocks is that AVRDUDE models flash programming as NAND memories (where writing 0xff is a NOP) while bootloaders with their SPM/NVMCTRL page-erase-and-write sequences very much look like ordinary memories (where writing 0xff overwrites existing contents). This is, eg, visible in the cool feature of AVRDUDE to allow assembling your application piece by piece (code, table1, table2, ...) using multiple -U options. This does not work with bootloader upload unless the pieces are page-aligned. It is possible to make multiple -U work for the SPM programmer under discussion, and I have done so in my draft implementation (see the "Mandate" discussion above), but it's not clear whether the project welcomes this shift of modelling.

mcuee · 2022-06-05T02:39:48Z

mcuee
Jun 5, 2022
Maintainer

@stefanrueger Can you share your urclock bootloader? Thanks. I cannot get optiboot to work with EEPROM under avrdude.
Ref: #986

8 replies

MCUdude Jun 11, 2022
Maintainer

@mcuee for reference, I host a whole bunch of pre-compiled bootloaders for most AVR ATmega targets.
In each target folder, there's also a build info file that contains information about each build. To keep the bootloader size below 512 bytes for 32kB targets and smaller, I've left out EEPROM support. EEPROM support is present on all targets with 64kB flash or more.

Snippet from atmega328p_build_info.txt

Optiboot_flash build output for atmega328p 
https://github.com/MCUdude/optiboot_flash 
Build date and time: 11.04.2020 15:45 
Compiler: avr-gcc 7.3.0
 
F_CPU:		UART NUMBER:	LED PIN:	LED FLASHES:	EEPROM SUPPORT:	FLASH PG COPY:	DESIRED BAUD:	ACTUAL BAUD:	ERROR:	COMPILED SIZE:	OUTPUT FILE:
24000000L	UART0		B5		2		No		No		1000000		1000000		0.0%	482 bytes	optiboot_flash_atmega328p_UART0_1000000_24000000L_B5.hex
24000000L	UART0		B5		2		No		No		500000		500000		0.0%	482 bytes	optiboot_flash_atmega328p_UART0_500000_24000000L_B5.hex
24000000L	UART0		B5		2		No		No		250000		250000		0.0%	484 bytes	optiboot_flash_atmega328p_UART0_250000_24000000L_B5.hex
24000000L	UART0		B5		2		No		No		230400		230769		0.1%	484 bytes	optiboot_flash_atmega328p_UART0_230400_24000000L_B5.hex
24000000L	UART0		B5		2		No		No		115200		115384		0.1%	484 bytes	optiboot_flash_atmega328p_UART0_115200_24000000L_B5.hex
24000000L	UART0		B5		2		No		No		57600		57692		0.1%	484 bytes	optiboot_flash_atmega328p_UART0_57600_24000000L_B5.hex
24000000L	UART0		B5		2		No		No		38400		38461		0.1%	484 bytes	optiboot_flash_atmega328p_UART0_38400_24000000L_B5.hex
24000000L	UART0		B5		2		No		No		19200		19230		0.1%	484 bytes	optiboot_flash_atmega328p_UART0_19200_24000000L_B5.hex
22118400L	UART0		B5		2		No		No		460800		460800		0.0%	482 bytes	optiboot_flash_atmega328p_UART0_460800_22118400L_B5.hex
22118400L	UART0		B5		2		No		No		230400		230400		0.0%	484 bytes	optiboot_flash_atmega328p_UART0_230400_22118400L_B5.hex
22118400L	UART0		B5		2		No		No		115200		115200		0.0%	484 bytes	optiboot_flash_atmega328p_UART0_115200_22118400L_B5.hex

mcuee Jun 11, 2022
Maintainer

@MCUdude Yes I have tried your optiboot hex file and it works for my Arduino Uno clones (without EEPROM support as expected).
https://github.com/MCUdude/MegaCore/blob/master/avr/bootloaders/optiboot_flash/bootloaders/atmega328p/16000000L/optiboot_flash_atmega328p_UART0_115200_16000000L_B5.hex

However, the optiboot hex file for the ATmega2560 did not work (no LED flashing, not able to connect using avrdude) for my Arduino Mega2560 clone. I am using this particular hex file.
https://github.com/MCUdude/MegaCore/blob/master/avr/bootloaders/optiboot_flash/bootloaders/atmega2560/16000000L/optiboot_flash_atmega2560_UART0_115200_16000000L_B7_BIGBOOT.hex

mcuee Jun 11, 2022
Maintainer

See below for two bootloaders that work with AVRDUDE -c arduino and can do EEPROM r/w (and chip erase)

@stefanrueger Yes the file https://github.com/avrdudes/avrdude/files/8882667/atmega328p.txt works as expected for my Arduino Uno clone.

PS C:\work\avr\avrdude_test\avrdude-v7.0-windows-x64> .\avrdude -p atmega328p -c arduino -b 115200 -P COM9 -U eeprom:w:entest.eep:i -v

avrdude.exe: Version 7.0
             Copyright (c) Brian Dean, http://www.bdmicro.com/
             Copyright (c) Joerg Wunsch

             System wide configuration file is "C:/work/avr/avrdude_test/avrdude-v7.0-windows-x64/avrdude.conf"

             Using Port                    : COM9
             Using Programmer              : arduino
             Overriding Baud Rate          : 115200
             AVR Part                      : ATmega328P
             Chip Erase delay              : 9000 us
             PAGEL                         : PD7
             BS2                           : PC2
             RESET disposition             : dedicated
             RETRY pulse                   : SCK
             Serial program mode           : yes
             Parallel program mode         : yes
             Timeout                       : 200
             StabDelay                     : 100
             CmdexeDelay                   : 25
             SyncLoops                     : 32
             PollIndex                     : 3
             PollValue                     : 0x53
             Memory Detail                 :

                                               Block Poll               Page                       Polled
               Memory Type Alias    Mode Delay Size  Indx Paged  Size   Size #Pages MinW  MaxW   ReadBack
               ----------- -------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
               eeprom                 65    20     4    0 no       1024    4      0  3600  3600 0xff 0xff
               flash                  65     6   128    0 yes     32768  128    256  4500  4500 0xff 0xff
               lfuse                   0     0     0    0 no          1    1      0  4500  4500 0x00 0x00
               hfuse                   0     0     0    0 no          1    1      0  4500  4500 0x00 0x00
               efuse                   0     0     0    0 no          1    1      0  4500  4500 0x00 0x00
               lock                    0     0     0    0 no          1    1      0  4500  4500 0x00 0x00
               calibration             0     0     0    0 no          1    1      0     0     0 0x00 0x00
               signature               0     0     0    0 no          3    1      0     0     0 0x00 0x00

             Programmer Type : Arduino
             Description     : Arduino
             Hardware Version: 7
             Firmware Version: 7.6

avrdude.exe: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.00s

avrdude.exe: Device signature = 0x1e950f (probably m328p)
avrdude.exe: reading input file "entest.eep"
avrdude.exe: writing eeprom (512 bytes):

Writing | ################################################## | 100% 1.97s

avrdude.exe: 512 bytes of eeprom written
avrdude.exe: verifying eeprom memory against entest.eep:

Reading | ################################################## | 100% 0.68s

avrdude.exe: 512 bytes of eeprom verified

avrdude.exe done.  Thank you.

mcuee Jun 11, 2022
Maintainer

@stefanrueger Thanks and I can confirm your hex file https://github.com/avrdudes/avrdude/files/8882669/atmega2560_v3.txt works as well with my Arduino Mega2560 clone.

PS C:\work\avr\avrdude_test\avrdude-v7.0-windows-x64> .\avrdude -p atmega2560 -c arduino -b 115200 -P COM9 -U eeprom:w:entest.eep:i -v

avrdude.exe: Version 7.0
             Copyright (c) Brian Dean, http://www.bdmicro.com/
             Copyright (c) Joerg Wunsch

             System wide configuration file is "C:/work/avr/avrdude_test/avrdude-v7.0-windows-x64/avrdude.conf"

             Using Port                    : COM9
             Using Programmer              : arduino
             Overriding Baud Rate          : 115200
             AVR Part                      : ATmega2560
             Chip Erase delay              : 9000 us
             PAGEL                         : PD7
             BS2                           : PA0
             RESET disposition             : dedicated
             RETRY pulse                   : SCK
             Serial program mode           : yes
             Parallel program mode         : yes
             Timeout                       : 200
             StabDelay                     : 100
             CmdexeDelay                   : 25
             SyncLoops                     : 32
             PollIndex                     : 3
             PollValue                     : 0x53
             Memory Detail                 :

                                               Block Poll               Page                       Polled
               Memory Type Alias    Mode Delay Size  Indx Paged  Size   Size #Pages MinW  MaxW   ReadBack
               ----------- -------- ---- ----- ----- ---- ------ ------ ---- ------ ----- ----- ---------
               eeprom                 65    10     8    0 no       4096    8      0  9000  9000 0x00 0x00
               flash                  65    10   256    0 yes    262144  256   1024  4500  4500 0x00 0x00
               lfuse                   0     0     0    0 no          1    1      0  9000  9000 0x00 0x00
               hfuse                   0     0     0    0 no          1    1      0  9000  9000 0x00 0x00
               efuse                   0     0     0    0 no          1    1      0  9000  9000 0x00 0x00
               lock                    0     0     0    0 no          1    1      0  9000  9000 0x00 0x00
               calibration             0     0     0    0 no          1    1      0     0     0 0x00 0x00
               signature               0     0     0    0 no          3    1      0     0     0 0x00 0x00

             Programmer Type : Arduino
             Description     : Arduino
             Hardware Version: 7
             Firmware Version: 7.6

avrdude.exe: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.00s

avrdude.exe: Device signature = 0x1e9801 (probably m2560)
avrdude.exe: reading input file "entest.eep"
avrdude.exe: writing eeprom (512 bytes):

Writing | ################################################## | 100% 1.89s

avrdude.exe: 512 bytes of eeprom written
avrdude.exe: verifying eeprom memory against entest.eep:

Reading | ################################################## | 100% 0.37s

avrdude.exe: 512 bytes of eeprom verified

avrdude.exe done.  Thank you.

mcuee Jun 11, 2022
Maintainer

@MCUdude
I have created an issue for your repo and we can continue there.

Optiboot bootloader hex file not working for Arduino Mega2560 clone MCUdude/MegaCore#188

Edit: it is a false alarm. I have closed the issue.

stefanrueger · 2022-06-10T23:34:09Z

stefanrueger
Jun 10, 2022
Maintainer Author

Chip erase in the bootloader

I have now implemented (compile-time optional) chip erase in my bootloaders, which turns out to cost between 18 and 58 bytes of code. For roughly every other bootloader this pushes its size over a page boundary. Instead of implementing chip erase in the bootloader for these, the new SPM programmer under discussion will erase flash during upload (except the bootloader). This is not far from my earlier estimate, but I

Forgot that the watchdog timer needs resetting in the page erase loop so that short bootloader timeouts continue to work
Mistakenly thought the -c arduino programmer sends a STK_CHIP_ERASE command but instead it's a STK_UNIVERSAL command followed by the 4-byte SPI programming chip erase command; this needs more code in the bootloader to correctly decipher
Cannot remove the page erase from the bootloader writepage function despite implementing chip erase, as writepage is typically exported for use by the application

Here is a 512-byte optiboot drop-in bootloader atmega328p.hex with chip erase and EEPROM r/w that works with -c arduino (tested). And here is another one with chip erase for the ATmega2560, atmega2560_v3.hex (untested); note the latter is a vector bootloader (768 bytes instead of the otherwise minimum 1024-byte boot section), so it's important the fuses are set to jump to the reset vector on reset, not to the bootloader. The bootloader itself has code to patch and verify an uploaded program (so it works with -c arduino) but that's pretty wasteful; again something the new SPM programmer will fix. Both are template bootloaders that can engage the blinkenlights of your board once you replace the following three nops in the bootloader code with the corresponding set/clear bit commands to drive the LED. Only replace code bytes, not the table bytes on top of the bootloader.

mov r0, r0 -> LED sbi port
mov r1, r1 -> LED cbi port
mov r12, r12 -> LED sbi ddr

The table below shows for individual bootloaders the number of extra code bytes, and the number of extra bytes owing to crossing page boundaries, the actual size of the bootloader, the effective space needed considering page size, and what the bootloader can do:

w writepage() function in bootloader that can be used by applications
e EEPROM read/write
u requires SPM programmer under discussion and will not work with -c arduino
d dual boot from external SPI flash memory
j vector bootloader supported by the core programmer under discussion
V vector bootloader that patches upload and download for verify
v vector bootloader that patches upload only, ie, verify always fails
p protection from overwriting: useful for calls to writepage()
r keep reset cause in register R2

dsiz deff siz  eff ver  feat's name
  18   32 240  256 7.6 w-u-jpr attiny2313_min.hex
  24  256 276  512 7.6 w-u--pr atmega168_min.hex
  22  256 278  512 7.6 w-u-jpr atmega644p_min.hex
  28   64 280  320 7.6 w-u-jpr atmega88_min.hex
  28  128 280  384 7.6 w-u-jpr atmega328p_min.hex
  26  256 280  512 7.6 w-u--pr atmega8_min.hex
  28   64 282  320 7.6 w-u-jpr attiny84_min.hex
  28   64 282  320 7.6 w-u-jpr attiny85_min.hex
  28  128 282  384 7.6 w-u-jpr attiny167_min.hex
  30  128 284  384 7.6 w-u-jpr atmega32_min.hex
  40  256 294  512 7.6 w-u-jpr atmega1280_min.hex
  40  256 294  512 7.6 w-u-jpr atmega1284p_min.hex
  40  256 294  512 7.6 w-u-jpr atmega2560_min.hex
  24   64 324  384 7.6 weu-jpr atmega8_ur.hex
  24   64 332  384 7.6 weu-jpr atmega88_ur.hex
  26    0 344  384 7.6 weu-jpr atmega32_ur.hex
  26    0 346  384 7.6 weu-jpr pro_20mhz_ur.hex
  26    0 352  384 7.6 weu-jpr atmega328p_ur.hex
  26    0 352  384 7.6 weu-jpr pro_16mhz_ur.hex
  26    0 360  384 7.6 weu-jpr attiny85_ur.hex
  26    0 360  384 7.6 weu-jpr digispark.hex
  26    0 362  384 7.6 weu-jpr attiny167_ur.hex
  26    0 362  384 7.6 weu-jpr digisparkpro.hex
  26    0 364  384 7.6 weu-jpr atmega328p_8125khz_swio_ur.hex
  26    0 366  384 7.6 weu-jpr atmega328p_8000khz_swio_ur.hex
  26    0 366  384 7.6 weu-jpr atmega328p_8mhz_ur.hex
  26    0 366  384 7.6 weu-jpr lilypad_ur.hex
  26    0 366  384 7.6 weu-jpr pro_8mhz_ur.hex
  26    0 368  384 7.6 weu-jpr atmega328p_7875khz_swio_ur.hex
  26    0 368  384 7.6 weu-jpr atmega328p_ur_testing.hex
  42    0 374  512 7.6 weu-jpr atmega1280_ur.hex
  42    0 374  512 7.6 weu-jpr atmega2560_ur.hex
  26    0 376  384 7.6 weu-jpr luminet_baud9600_ur.hex
  42    0 444  512 7.6 we---pr atmega8.hex
  42    0 446  512 7.6 we---pr atmega32.hex
  42    0 452  512 7.6 we---pr pro_20mhz.hex
  42    0 458  512 7.6 we---pr diecimila.hex
  42    0 458  512 7.6 we---pr pro_16mhz.hex
  42    0 460  512 7.6 we---pr anarduino.hex
  42    0 460  512 7.6 we---pr atmega328p.hex
  42    0 460  512 7.6 we---pr atmega88.hex
  42    0 460  512 7.6 we---pr jeenode.hex
  42    0 460  512 7.6 we---pr moteino.hex
  42    0 460  512 7.6 we---pr promini_led13.hex
  42    0 460  512 7.6 we---pr promini_led9.hex
  42    0 460  512 7.6 we---pr timeduino.hex
  42    0 460  512 7.6 we---pr urclock.hex
  42    0 472  512 7.6 we---pr pro_8mhz.hex
  42    0 476  512 7.6 we---pr atmega328p_led9_50Hz_fp9.hex
  26    0 494  512 7.6 weud-pr moteino_cs8_d3ur.hex
  26    0 494  512 7.6 weud-pr timeduino_cs8_d3ur.hex
  26    0 494  512 7.6 weud-pr urclock_cs8_d3ur.hex
  26    0 502  512 7.6 weud-pr anarduino_cs5_d3ur.hex
  26    0 502  512 7.6 weud-pr atmega328p_d3ur.hex
  26    0 512  512 7.6 weudjpr attiny167_d3ur.hex
  40  256 536  768 7.6 weudjpr atmega1280_d3ur.hex
  40  256 536  768 7.6 weudjpr atmega2560_d3ur.hex
  40  256 546  768 7.6 we--Vpr atmega644p_v3.hex
  40  256 546  768 7.6 we--Vpr sanguino.hex
  58  256 554  768 7.6 we--vpr atmega1280_v2.hex
  58  256 558  768 7.6 we--vpr wildfirev2.hex
  58  256 566  768 7.6 we--vpr atmega1284p_v2.hex
  58  256 566  768 7.6 we--vpr bobuino.hex
  58  256 566  768 7.6 we--vpr mighty1284.hex
  58  256 566  768 7.6 we--vpr moteinomega.hex
  58  256 566  768 7.6 we--vpr timeduinomega.hex
  58  256 566  768 7.6 we--vpr urclockmega.hex
  58  256 566  768 7.6 we--vpr wildfire.hex
  48    0 648  768 7.6 we--Vpr atmega2560_v3.hex
  40    0 726  768 7.6 we-dVpr atmega644p_d3v3.hex
  42    0 732  768 7.6 we-dVpr atmega328p_d3v3.hex
  58  256 782 1024 7.6 we-dVpr timeduinomega_cs3_d3.hex
  58  256 782 1024 7.6 we-dVpr urclockmega_cs3_d3.hex
  58  256 788 1024 7.6 we-dVpr moteinomega_cs23_d3.hex
  58  256 790 1024 7.6 we-dVpr atmega1284p_d3v3.hex
  58  256 792 1024 7.6 we-dVpr atmega1280_d3v3.hex
  46  256 810 1024 7.6 we-dVpr atmega2560_d3v3.hex

0 replies

stefanrueger · 2022-06-18T16:29:39Z

stefanrueger
Jun 18, 2022
Maintainer Author

Terminal mode (cont'd)

@mariusgreuel Windows questions below

I have now added terminal support for bootloaders in my draft urclock programmer and term.c by

Sending a GET_SYNC command to the bootloader every 180 ms so its WDT is reset
Performing the terminal's byte-wise r/w through the bootloader's paged r/w
Adding r/w caches so that read/writes are fast and don't unnecessarily wear out the memories

This is now a real joy to use!

$ avrdude -qq -p m2560 -P /dev/ttyUSB0 -c urclock -t
avrdude> r ee 256 16
0100  ff c0 ff ee ff ff ff ff  ff ff ff ff ff ff ff ff  |................|

avrdude> w ee 257 0xca 0xfe 0xff

avrdude> r ee 256 16
0100  ff ca fe ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|

avrdude> r fl 0x3b4 16
03b4  48 65 6c 6c 6f 2c 20 77  6f 72 6c 64 0a 00 ff ff  |Hello, world ...|

avrdude> w fl 0x3b4 0x68

avrdude> r fl 0x3b4 16
03b4  68 65 6c 6c 6f 2c 20 77  6f 72 6c 64 0a 00 ff ff  |hello, world ...|
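
The byte-wise terminal access through paged r/w with a cache, as used in the session above, could look roughly like the following. This is only a minimal sketch and not the draft urclock code: page_cache, cache_load(), cache_write_byte() and the in-memory backing array that stands in for the bootloader's paged r/w are all hypothetical, and the page and memory sizes are assumed values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 128u                 /* assumed page size for this sketch */
#define MEM_SIZE  1024u                /* assumed memory size for this sketch */

/* Hypothetical stand-in for the bootloader's paged r/w (here: plain RAM) */
static uint8_t backing[MEM_SIZE];
static unsigned paged_writes;          /* counts page writes sent to the "device" */

static void paged_read(uint32_t base, uint8_t *buf) {
  memcpy(buf, backing + base, PAGE_SIZE);
}

static void paged_write(uint32_t base, const uint8_t *buf) {
  memcpy(backing + base, buf, PAGE_SIZE);
  paged_writes++;
}

/* One-page write-back cache */
typedef struct {
  uint8_t  data[PAGE_SIZE];
  uint32_t base;                       /* page-aligned address of the cached page */
  int      valid, dirty;
} page_cache;

/* Write the cached page back only if it was modified */
static void cache_flush(page_cache *c) {
  if (c->valid && c->dirty) {
    paged_write(c->base, c->data);
    c->dirty = 0;
  }
}

/* Make sure the page containing addr is in the cache */
static void cache_load(page_cache *c, uint32_t addr) {
  uint32_t base = addr - addr % PAGE_SIZE;
  assert(addr < MEM_SIZE);
  if (c->valid && c->base == base)
    return;
  cache_flush(c);                      /* evict the old page first */
  paged_read(base, c->data);
  c->base = base;
  c->valid = 1;
  c->dirty = 0;
}

static uint8_t cache_read_byte(page_cache *c, uint32_t addr) {
  cache_load(c, addr);
  return c->data[addr % PAGE_SIZE];
}

static void cache_write_byte(page_cache *c, uint32_t addr, uint8_t v) {
  cache_load(c, addr);
  if (c->data[addr % PAGE_SIZE] != v) { /* writes that change nothing cause no wear */
    c->data[addr % PAGE_SIZE] = v;
    c->dirty = 1;
  }
}
```

With this shape, the three byte writes of "w ee 257 0xca 0xfe 0xff" end up as a single page write once the cache is flushed.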

My solution for periodic "keep alive" calls to the bootloader uses callbacks in the readline() library. Unfortunately, the readline() code in term.c is in a #if !defined(WIN32NAIVE) branch. @mariusgreuel Any tips how readline can be made available in Windows? The other thing needed is a function that tells me whether or not the next read on standard input would block. My Linux solution is

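#include <sys/select.h>   /* select(), fd_set, FD_ZERO(), FD_SET(), struct timeval */
#include <unistd.h>       /* STDIN_FILENO */

/* Is any character available on standard input? Polls without blocking. */
static int readytoread(void) {
  struct timeval tv = { 0L, 0L };      /* zero timeout: return immediately */
  fd_set fds;
  FD_ZERO(&fds);
  FD_SET(STDIN_FILENO, &fds);

  return select(STDIN_FILENO + 1, &fds, NULL, NULL, &tv) > 0;
}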

How is that functionality available in Windows? Cheers

0 replies

stefanrueger · 2022-06-24T12:24:07Z

stefanrueger
Jun 24, 2022
Maintainer Author

Urprotocol proposal

Comments welcome [edited 2022-06-26 to reflect discussions].

Explicit communication between an uploader/downloader program ("the programmer") and the bootloader is driven by the programmer, which sends command sequences to the bootloader and evaluates its return sequence. A command sequence starts with a command byte, followed by its parameters, followed by an end-of-parameter byte UR_EOP. In return the bootloader sends a fixed byte UR_INSYNC to acknowledge the command, then executes it, possibly returning data, and finally sends a different fixed byte UR_OK.
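
To make the framing concrete, here is a minimal sketch of how a programmer might assemble a command sequence and check a return sequence. The function names ur_build_command() and ur_check_response() are made up for illustration, and the byte values of the command and of UR_EOP are passed in as parameters because the proposal text does not fix them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build a command sequence: command byte, parameters, end-of-parameter byte.
 * Returns the number of bytes written to out, which must hold nparams + 2. */
static size_t ur_build_command(uint8_t cmd, const uint8_t *params, size_t nparams,
                               uint8_t eop, uint8_t *out) {
  assert(out != NULL && (params != NULL || nparams == 0));
  out[0] = cmd;
  if (nparams)
    memcpy(out + 1, params, nparams);
  out[1 + nparams] = eop;
  return nparams + 2;
}

/* Check a return sequence: insync byte, ndata data bytes, ok byte.
 * Returns a pointer to the data on success or NULL on a protocol error. */
static const uint8_t *ur_check_response(const uint8_t *resp, size_t resplen,
                                        size_t ndata, uint8_t insync, uint8_t ok) {
  if (resplen != ndata + 2 || resp[0] != insync || resp[resplen - 1] != ok)
    return NULL;
  return resp + 1;
}
```

A programmer following the error handling rules of the proposal would stop programming as soon as ur_check_response() returns NULL.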

Although the UR_INSYNC and UR_OK are fixed constants for a particular bootloader, they can vary between bootloaders to indicate

Which MCU the bootloader sits on (using one of up to 2040 predefined different MCU IDs)
Whether or not the bootloader provides a paged read flash command
Whether or not the bootloader can handle EEPROM
Whether or not the bootloader can carry out a chip erase and/or, independent of this, a page erase
Whether or not writing a flash page looks like programming NOR memory

As UR_INSYNC and UR_OK should always differ, there are 256 · 255 possible combinations, one of which is reserved for backward compatibility mode where UR_INSYNC and UR_OK coincide with the respective STK500v1 constants. The urprotocol enables the bootloader to pass to the programmer log₂(256 · 255 - 1) bits = 15.9943... bits of configuration information without having to spend a single additional byte of bootloader code. Subtracting the 5 bits for the "whether or not" info leaves 10.9943... bits which allows 2040 ≈ 2^10.9943... MCU ids.
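
The counting argument can be spelled out in a few lines. This is a sketch of the arithmetic only; how the proposal actually maps MCU ids and properties onto byte pairs is not stated here, and the function names are hypothetical.

```c
#include <assert.h>

/* Number of usable (UR_INSYNC, UR_OK) pairs: the two bytes must differ, and
 * one pair is reserved for the STK500v1 backward compatibility mode */
static unsigned ur_usable_pairs(void) {
  return 256u * 255u - 1u;
}

/* Pairs left over for each combination of nflags yes/no properties
 * (integer division, so the result is rounded down) */
static unsigned ur_pairs_per_flag_combo(unsigned nflags) {
  assert(nflags < 16);
  return ur_usable_pairs() / (1u << nflags);
}
```

For the five yes/no properties this gives 65279 / 32 ≈ 2039.97, which is where the figure of roughly 2040 MCU ids comes from.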

Parameters. Paged EEPROM/flash read/write commands need the address followed by the length of the memory block, possibly followed by n=length bytes to be written. The address is always a byte address in big endian; it is 16-bit for MCUs that have up to 65536 bytes flash and 24-bit for MCUs with larger flash. Zero-length reads or writes are not supported by the protocol. If the flash page size is 256 or less, then the length parameter is sent as one byte (where 0 means 256 bytes). Otherwise the length parameter is sent as two bytes big endian (where 0 means 65536). Note, however, that the only valid length for the write flash page command is the MCU flash page size. The other three paged access commands can request any length between 1 and 256, and 1 and 65536, respectively. The programmer must always meet the following constraints: the EEPROM write length must not exceed the MCU flash page size; each address block must be fully within the range of EEPROM or flash on the device. It is best practice that the programmer generally sends page-aligned addresses for paged writes and page erases as some parts require even addresses for this. Whilst the sizes of the address and length parameters may differ between bootloaders, for a particular bootloader incarnation both the address and length are always given in the same way. This means that the EEPROM address on an MCU with a large flash will be a 24-bit address despite a small EEPROM. The length parameter must always be specified, even though the write flash page command only allows one valid length, and the page erase command does not need a length at all. This is to simplify the bootloader effort to decode the programmer's command sequences.
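
As an illustration of the parameter rules, the following sketch encodes the address and length the way the paragraph above describes. The function name ur_encode_addr_len() is hypothetical; flash size and page size are assumed to be known to the programmer from the MCU id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode the address and length parameters of a paged access command.
 * flash_size and page_size are properties of the target MCU.
 * Returns the number of bytes written to out (at most 5). */
static size_t ur_encode_addr_len(uint32_t flash_size, uint32_t page_size,
                                 uint32_t addr, uint32_t len, uint8_t *out) {
  size_t n = 0;
  uint32_t maxlen = page_size <= 256 ? 256 : 65536;

  assert(len >= 1 && len <= maxlen);   /* zero-length access is not supported */

  if (flash_size > 65536)              /* 24-bit address: extra top byte first */
    out[n++] = (uint8_t)(addr >> 16);
  out[n++] = (uint8_t)(addr >> 8);     /* big endian: high byte before low byte */
  out[n++] = (uint8_t)addr;

  if (page_size > 256)                 /* two-byte length: high byte first */
    out[n++] = (uint8_t)(len >> 8);
  out[n++] = (uint8_t)len;             /* a full 256 (or 65536) wraps to 0 */

  return n;
}
```

For an ATmega328P (32768 bytes flash, 128-byte pages) this yields three parameter bytes; for an ATmega2560 (262144 bytes flash, 256-byte pages) it yields four, with a length of 256 sent as 0.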

The following commands are known

UR_GET_SYNC does nothing beyond returning the two protocol bytes as a side effect
Its main purpose is to synchronise the programmer with the bootloader, to identify the type of bootloader and (some of) its properties, and to prevent the bootloader from timing out during interactive sessions. For synchronisation, the programmer should issue a number of UR_GET_SYNC commands until it receives consistent UR_INSYNC and UR_OK values. At this point the programmer knows whether or not to switch to backward compatibility mode using the STK500v1 protocol as in -c arduino, which MCU is to be programmed etc. It is advised the programmer sets its read timeout in the synchronisation phase to less than 100 ms to avoid triggering the bootloader's watchdog timer should the bootloader come out of reset too late to see the first UR_GET_SYNC. It is also recommended the programmer drain any residual input after successfully reading two response bytes to ensure the response has not been brought about by an application before the board was reset into bootloader mode. Again, the draining timeout should be small, preferably less than 100 ms.
UR_PROG_PAGE_FL writes one flash page to the device at the page boundary that the given address identifies
If the bootloader neither implements an UR_CHIP_ERASE nor an UR_PAGE_ERASE (see below), the bootloader is expected to program the page as atomic page erase, page load and page write. If the bootloader implements either erase command, it has the choice of erasing a flash page before programming it or not. In case the bootloader erases pages before writing them, the UR_PROG_PAGE_FL payload is programmed exactly as is: the programmer should implement desired sub-page modifications by first reading flash contents to correctly pad the page. If the bootloader does not erase the page at the beginning of UR_PROG_PAGE_FL, effectively the payload is anded to the existing contents of the page thereby exposing the physical property of the underlying NOR flash memory; the programmer should pad the page with 0xff for sub-page modifications, as writing 0xff is a NOP for AVR NOR flash memories.
UR_READ_PAGE_FL (optional) returns n=length bytes of flash contents from the given address
UR_READ_PAGE_EE (optional) returns n=length bytes of EEPROM contents from the given address
UR_PROG_PAGE_EE (optional) writes n=length bytes to the EEPROM at the given address
UR_PAGE_ERASE (optional) erases to 0xff a page at the given address (length must be given but is ignored)
UR_CHIP_ERASE (optional) erases to 0xff all flash except itself
It is advised the programmer temporarily sets its read timeout to longer than the bootloader will need to erase the flash memory to avoid the programmer timing out on the UR_OK handshake. 20 s should be sufficient. If the bootloader does not implement chip erase then the programmer should ensure that flash is erased to 0xff by, eg, repeated page erase commands or repeated page writes with 0xff-only pages. Either normally takes longer than bootloader chip erase but is otherwise functionally equivalent. The protocol does not expect EEPROM to be erased in either case. However, when implementing chip erase the bootloader is free to read fuses to determine whether or not EEPROM should also be erased, and do so.
UR_LEAVE_PROGMODE (optional) might reduce the watchdog timeout so the application starts faster after programming
Any other command should behave like UR_GET_SYNC, ie, the bootloader returns UR_INSYNC and UR_OK
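
The synchronisation step described under UR_GET_SYNC could be tracked with a small state machine like the one below. It is a sketch under assumptions: the proposal only asks for "consistent" values, so the number of agreeing responses (needed) is my own parameter, and ur_pair, ur_sync_state, ur_sync_feed() and ur_is_stk500v1() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t insync, ok; } ur_pair;

/* Last valid response and how many times in a row it has been seen */
typedef struct { ur_pair last; unsigned streak; } ur_sync_state;

/* Feed successive two-byte responses to UR_GET_SYNC. Returns 1 once `needed`
 * consecutive responses agreed on a valid pair (stored in *agreed), else 0. */
static int ur_sync_feed(ur_sync_state *s, ur_pair resp, unsigned needed, ur_pair *agreed) {
  assert(s != NULL && agreed != NULL && needed > 0);
  if (resp.insync == resp.ok) {        /* invalid: the two bytes must differ */
    s->streak = 0;
    return 0;
  }
  if (s->streak > 0 && resp.insync == s->last.insync && resp.ok == s->last.ok)
    s->streak++;
  else
    s->streak = 1;
  s->last = resp;
  if (s->streak < needed)
    return 0;
  *agreed = resp;
  return 1;
}

/* Backward compatibility mode is signalled by the STK500v1 constants */
static int ur_is_stk500v1(ur_pair p) {
  return p.insync == 0x14 && p.ok == 0x10;
}
```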

Error handling. It is generally considered an error if the programmer asks for non-implemented functionality, as it knows after synchronisation how the bootloader is configured. Hence, the bootloader WDT should reset on request of an optional, not implemented command. Typically, the bootloader would need to save the payload of EEPROM/flash writes to SRAM; for security reasons the bootloader should trigger a WDT reset if an illegitimate length of a paged write could overwrite the stack (eg, a request for writing 256 bytes EEPROM on a part with only 256 bytes SRAM). A protocol error detected by the bootloader (failure to match UR_EOP) should lead to a WDT reset. Protocol errors detected by the programmer (not matching UR_INSYNC or UR_OK) should lead to a termination of programming attempts. Frame errors in serial communication should also lead to a WDT reset or termination of programming, respectively. The bootloader should protect itself from being overwritten through own page writes and page erases.

Implicit communication of further bootloader properties such as the bootloader size happens through a small table located at the top of flash. Normally, the programmer can read this table after establishing the MCU id, and therefore the part for which the bootloader was compiled and the location of top flash. The 6-byte table contains (top to bottom)

Version byte: minor version number in three lsb (0..7), major version number in the 5 msb (0..31)
Capabilities byte detailing, eg, whether the bootloader uses urprotocol, supports EEPROM r/w, dual boot etc
Two-byte rjmp opcode to a writepage(ram, flash) function or a ret opcode if not implemented
Bootloader size as the number (1..127) of pages that it occupies
Vector number (1..127) used for the r/jmp to the application if it is a vector bootloader, 0 otherwise

If the bootloader does not implement flash read, the programmer cannot read this table, and the user needs to supply any necessary parameters on the command line.
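
Decoding the table on the programmer side might look as follows. This is a sketch with two assumptions of mine: "top to bottom" is read as the version byte sitting at the highest flash address, and the rjmp/ret opcode is stored low byte first like any AVR opcode in flash. The struct ur_table and ur_decode_table() are hypothetical names, and the capability bits are left undecoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  unsigned major, minor;     /* bootloader version */
  uint8_t  capabilities;     /* raw capability bits, not interpreted here */
  uint16_t rjmp_opcode;      /* rjmp to writepage() or a ret opcode */
  unsigned pages;            /* bootloader size in pages */
  unsigned vector;           /* vector number, 0 if not a vector bootloader */
} ur_table;

/* t holds the last six bytes of flash in ascending address order,
 * so t[5] is the topmost byte of flash (assumed to be the version byte) */
static ur_table ur_decode_table(const uint8_t t[6]) {
  ur_table r;
  assert(t != NULL);
  r.major        = t[5] >> 3;                       /* 5 msb */
  r.minor        = t[5] & 7;                        /* 3 lsb */
  r.capabilities = t[4];
  r.rjmp_opcode  = (uint16_t)(t[2] | (t[3] << 8));  /* low byte at lower address */
  r.pages        = t[1];
  r.vector       = t[0];
  return r;
}
```

Under these assumptions a version byte of 0x3e decodes to version 7.6.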

Backward compatibility mode. When urprotocol after synchronisation with the bootloader settles on UR_INSYNC and UR_OK values that turn out to be the STK500v1 values of 0x14 and 0x10, this triggers a backward compatibility mode. In this instance the programmer behaves (almost) like the STK500v1 implementation in avrdude's arduino programmer, ie, it handles optiboot and legacy bootloaders gracefully. In particular, the programmer can issue STK_READ_SIGN and two STK_UNIVERSAL requests (load extended address and chip erase) that the bootloader must implement in the backward compatibility mode. All EEPROM/flash addresses are sent as two-byte word addresses little endian, all length arguments are two-byte big endian, etc. Unlike avrdude -c arduino the programmer for the urprotocol should not pass on get and set hardware parameter requests, enquire software and hardware versions etc, as these requests would be wasteful for the bootloader. Under the urprotocol, bootloaders should be assured they do not need to even provide code to ignore these requests, even if they operate in the backward compatibility mode.
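
The different parameter encoding in backward compatibility mode can be sketched like this. The function names are hypothetical, and the sketch only covers the byte layout described above, not the surrounding STK500v1 command bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Address bytes in backward compatibility mode: a word address, low byte
 * first. Bits beyond 16 do not fit and need the load extended address request. */
static void stk_encode_word_addr(uint32_t byte_addr, uint8_t out[2]) {
  uint32_t word_addr = byte_addr / 2;
  assert(byte_addr % 2 == 0);          /* sketch only handles even byte addresses */
  out[0] = (uint8_t)word_addr;         /* little endian: low byte first */
  out[1] = (uint8_t)(word_addr >> 8);
}

/* Length bytes in backward compatibility mode: two bytes, high byte first */
static void stk_encode_len(uint16_t len, uint8_t out[2]) {
  out[0] = (uint8_t)(len >> 8);
  out[1] = (uint8_t)len;
}

/* Part of the word address that has to go into the extended address */
static uint8_t stk_extended_addr(uint32_t byte_addr) {
  return (uint8_t)((byte_addr / 2) >> 16);
}
```

Compare this with the native urprotocol parameters, which use byte addresses in big endian throughout.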

Limitations. Urprotocol has only provisions for reading EEPROM and flash, for writing EEPROM and for writing flash other than the bootloader area. In particular, urprotocol has no provisions for reading other memories such as the signature (other than in backward compatibility mode), calibration bytes, locks or fuses, and neither for writing lock bytes. The protocol does not consider sub-page flash writes, which are shifted to the programmer. If the bootloader's flash write does not look like NOR programming and if the bootloader does not provide flash read, then sub-page modifications simply cannot be done. Installing a bootloader has security implications as it provides a means to modify flash thus weakening the Harvard architecture of AVR microprocessors. Even bootloader implementations that are hardened against prohibited address and length parameters have, out of necessity, somewhere a code sequence that manipulates flash memory. A flawed application might still give an attacker a way to call these code sequences, so be warned here be dragons.
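
Why sub-page modifications work when a page write looks like NOR programming can be shown with a small model. This is a host-side sketch, not bootloader code: nor_program_page() merely imitates a page write without prior erase, and pad_subpage() is a hypothetical helper for the programmer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of a page write without prior erase: NOR flash can only clear bits,
 * so the payload is effectively anded to the existing page contents */
static void nor_program_page(uint8_t *page, const uint8_t *payload, size_t page_size) {
  for (size_t i = 0; i < page_size; i++)
    page[i] &= payload[i];
}

/* Build a full-page payload for a sub-page modification: 0xff everywhere
 * except for the n bytes at offset off; 0xff leaves existing bytes untouched */
static void pad_subpage(uint8_t *payload, size_t page_size,
                        size_t off, const uint8_t *data, size_t n) {
  assert(off <= page_size && n <= page_size - off);
  memset(payload, 0xff, page_size);
  memcpy(payload + off, data, n);
}
```

A page that already holds code in its first half and is still erased in its second half can thus receive a table in the second half without disturbing the code. If the bootloader erases the page before writing, the same padded payload would wipe the code, which is why the programmer then has to read the page first.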

2 replies

MCUdude Jul 10, 2022
Maintainer

Sorry for noob questions, but I haven't really worked with the "inner workings" of AVR bootloaders.
It would be very interesting to try your bootloader out, together with a modified version of Avrdude that supports urclock / urboot.
Would automatic baud rate detection be possible to add? I've seen Optiboot forks that calculated values for the baud rate registers based on the calculated timing from the stk500 get sync bytes.

urprotocol has no provisions for reading other memories such as the signature

Does this mean that there will not be any checks that the actual hardware connected is the same as what's specified by the -p flag? I'd personally prefer to have this check, as it has "saved" me from uploading programs that weren't meant for that particular device, typically when using Arduino IDE.

Supports vector bootloaders through automated patching of the vector table

Sorry, but what's the problem with setting the boot reset vector through the fuse bits? Why would we need all this reset vector patching? I have used Optiboot a lot, and the bootloader has always been preserved no matter how badly I screwed up while uploading. However, Optiboot also supports "virtual boot", which is intended for ATtinys where the reset vector can't be changed using fuse bits. When using "virtual boot", I've experienced that the entire bootloader went down after a failed upload. Can/will this happen to urboot/urclock?

stefanrueger Jul 11, 2022
Maintainer Author

It would be very interesting to try your bootloader out, together with a modified version of Avrdude that supports urclock / urboot

Thanks for your interest! It'll be some 4-8 weeks. (Publishing this is chicken/egg. One kinda needs both, and they need to work towards the same protocol. The protocol is settling; I have tested it, the prototype bootloaders and my draft programmer. Currently, the programmer needs some work to integrate into AVRDUDE, but I first aim to finish avrdude.conf work that I've started and the terminal flash/eeprom read/write cache.)

stefanrueger · 2022-07-11T10:29:35Z

stefanrueger
Jul 11, 2022
Maintainer Author

Would automatic baud rate detection be possible to add?

Neat idea! I hadn't thought of that.

I would first try and see whether AVRDUDE could provide automated baud rate detection: one place, one algorithm, a gazillion more resources than AVRs. My overriding mantra here is "Do as much as possible in the programmer and as little as you can get away with in the bootloader". I don't know what is involved in doing that, and if possible whether a portable solution could be found. An AVRDUDE solution would benefit a slew of other use cases.

Failing that, it's eminently possible to add automated baud rate detection to the urboot bootloader, presumably in much the same way as the optiboot fork does it. I would implement this as a "neat-to-have" compile-time option. You will be aware of the limitations of trimming the baud rate registers of AVR USARTs, though: at the high end of baud rate divided by F_CPU, the granularity available for matching the PC baud rate is not great. For example, an 8 MHz AVR finds it hard to match 115200 baud with the UART. Actually, for that combo I need to use software bit toggling instead in the urboot bootloader.
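
The granularity problem is easy to quantify. The sketch below computes the closest baud rate an AVR USART can generate in double-speed mode, where baud = F_CPU / (8 * (UBRR + 1)). The function names are hypothetical, the limited width of the UBRR register is ignored, and results are integer-truncated.

```c
#include <assert.h>
#include <stdint.h>

/* Closest achievable baud rate of an AVR USART in double-speed mode (U2X = 1) */
static uint32_t usart_actual_baud(uint32_t f_cpu, uint32_t baud) {
  uint32_t div;
  assert(baud > 0);
  div = (f_cpu + 4 * baud) / (8 * baud);   /* F_CPU/(8*baud) rounded to nearest */
  if (div == 0)
    div = 1;                               /* divisor is UBRR + 1, so at least 1 */
  return f_cpu / (8 * div);
}

/* Relative baud rate error in tenths of a percent, truncated towards zero */
static int32_t usart_error_permille(uint32_t f_cpu, uint32_t baud) {
  int64_t actual = usart_actual_baud(f_cpu, baud);
  return (int32_t)((actual - (int64_t)baud) * 1000 / (int64_t)baud);
}
```

At 24 MHz a request for 115200 baud lands on 115384 baud, matching the build info table earlier in this thread. At 8 MHz the closest rate is 111111 baud, an error of about -3.5%, which is uncomfortably large for reliable serial communication.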

0 replies

stefanrueger · 2022-07-11T10:45:27Z

stefanrueger
Jul 11, 2022
Maintainer Author

Should urprotocol cater for other memories than flash/EEPROM?

urprotocol has no provisions for reading other memories such as the signature

The protocol could (actually relatively easily) be extended to provide a means of reading other memories; I thought about this but didn't see a lot of benefit or use cases to justify implementation. I am happy to be convinced otherwise.

Does this mean that there will not be any checks that the actual hardware connected is the same as what's specified by the -p flag?

No, urboot bootloaders know exactly which MCU they have been compiled for and pass a unique MCU id on to the programmer.

My ideal user scenario is: if the user does not specify the -p flag, there are no checks, and the MCU passed on by the bootloader is used; if the user specifies an MCU, a check happens just as it always has: AVRDUDE requests the signature bytes from the urclock programmer, which looks up the signature bytes of the exact part that the urboot bootloader has indicated (without bothering the bootloader).
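A rough sketch of that scenario, under loud assumptions: the MCU ids `1` and `2`, the table layout and the function name `select_part` are all invented for illustration and do not reflect the actual urprotocol ids or AVRDUDE's internals. Only the signature bytes are the real ones for the two named parts.

```python
# Hypothetical table: bootloader MCU id -> (part name, signature bytes).
# The ids are made up for this sketch; the signatures belong to the named parts.
PARTS_BY_MCU_ID = {
    1: ("ATmega328P", bytes([0x1E, 0x95, 0x0F])),
    2: ("ATmega328", bytes([0x1E, 0x95, 0x14])),
}
SIGNATURES_BY_NAME = {name.lower(): sig for name, sig in PARTS_BY_MCU_ID.values()}


def select_part(mcu_id, user_part=None):
    """Pick the part to program from the MCU id the bootloader reported.

    Without a user-specified part (no -p), the bootloader's part is used
    unchecked. With one, the signature looked up for the bootloader's part
    must match the signature of the part the user asked for.
    """
    name, signature = PARTS_BY_MCU_ID[mcu_id]
    if user_part is None:
        return name
    expected = SIGNATURES_BY_NAME[user_part.lower()]
    if expected != signature:
        raise ValueError(
            f"bootloader reports {name} ({signature.hex()}), "
            f"but -p asked for {user_part} ({expected.hex()})"
        )
    return name
```

The point of the design is that the lookup happens entirely on the PC side; the bootloader only ever sends its compact id.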

As you'll probably know, bootloaders such as optiboot don't actually read the signature bytes from the device; instead they send the SIGNATURE_0..2 avr-libc compile-time constants for the MCU to the programmer. Using signature bytes for id-ing the device does not work universally: AVRDUDE and avr-gcc don't necessarily have the same signature bytes for the same devices, signature bytes are not unique to a device, and they do not even determine the flash memory size or the flash page size of the device.

"saved" me from uploading programs that weren't meant for that particular device

Same here! Happens easily when you have a number of boards with different MCUs connected to the PC :-O

stefanrueger · 2022-07-11T10:53:10Z

stefanrueger
Jul 11, 2022
Maintainer Author

Vector bootloaders

What's the problem with setting the boot reset vector through the fuse bits?

Sure, if you can use HW-supported boot sections, go for it. Urclock/urboot support ordinary HW-supported bootloading.

Vector bootloaders are SW-supported bootloaders. Their benefit is generality and size granularity:

They work on devices that simply have no HW support (some 30 classic AVRs, mostly ATtinys but also, eg, the venerable ATmega48).
Vector bootloaders can be virtually any multiple of the flash memory page size. So, not a problem to have a 384-byte bootloader on an ATmega328P (the full shebang with EEPROM support etc). That's 128 bytes less than the smallest HW-supported size.
Some 46 classic AVRs (think ATmega644 et al) have 1024 bytes as the smallest bootloader size. The XMEGA family offers only one bootloader size, which is comparatively large, bordering on wasteful.

Fun fact: I only set the fuse bits when the bootloader size for an MCU happens to be a HW-supported size.

The newer AVR8X devices are much better in that respect, though, as they support any number of 256-byte pages for bootloaders, and I suspect that vector bootloaders indeed bring no benefit for these.

Yes, what I call a vector bootloader really is the same idea as optiboot's virtual boot partitions:

The reset vector of the to-be-uploaded application is patched to point to the bootloader
A not otherwise used ISR vector is patched with a jump to the application
When the bootloader starts the application it jumps to that ISR vector instead of reset
A great choice for that ISR vector is the SPM_Ready vector as it is genuinely unused by applications: SPM cannot be usefully carried out in an application program, and the urboot bootloaders poll for SPM completion rather than use the interrupt

The urclock/urboot implementation is radically different from optiboot's: The latter patches programs on upload in the bootloader itself, which costs some 110 bytes of AVR flash memory, and is therefore likely to cut corners. Last time I looked at the optiboot code, eg, it failed to check whether the to-be-uploaded sketch was already patched for vector bootloaders (this arises when you want to upload a previously downloaded .hex image that would already have been patched during upload).

In contrast, an urboot vector bootloader shifts the patching to the external programmer, which can spend a lot more resources on making sure it just works.
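The patching steps above, done on the programmer side, could look roughly like the following. This is a minimal sketch and not the urclock code: it assumes a part with 4-byte vector slots that each hold a plain `jmp` (as on parts with more than 8 KB of flash), targets below 64 Ki words, and an application image that starts at address 0. The names `jmp`, `jmp_target` and `patch_for_vector_bootloader` are hypothetical.

```python
import struct

JMP_OPCODE = 0x940C  # first word of an AVR jmp whose target fits in 16 bits


def jmp(word_addr):
    """Encode `jmp word_addr` (little-endian) for targets below 64 Ki words."""
    if not 0 <= word_addr < 0x10000:
        raise ValueError("target out of range for this sketch")
    return struct.pack("<HH", JMP_OPCODE, word_addr)


def jmp_target(image, offset):
    """Decode the word address of the jmp stored at byte offset `offset`."""
    opcode, target = struct.unpack_from("<HH", image, offset)
    if opcode != JMP_OPCODE:
        raise ValueError("vector does not hold a plain jmp")
    return target


def patch_for_vector_bootloader(image, boot_start, vector_index):
    """Return a copy of `image` patched for a vector bootloader.

    boot_start is the bootloader's byte address; vector_index is the
    zero-based index of the otherwise unused ISR vector.
    """
    patched = bytearray(image)
    boot_word = boot_start // 2
    app_start = jmp_target(patched, 0)
    if app_start == boot_word:
        # Reset already points at the bootloader: treat as already patched
        return bytes(patched)
    slot = 4 * vector_index
    patched[slot:slot + 4] = jmp(app_start)  # ISR vector -> application
    patched[0:4] = jmp(boot_word)            # reset vector -> bootloader
    return bytes(patched)
```

For a 384-byte bootloader on an ATmega328P one would call `patch_for_vector_bootloader(image, 0x7E80, 25)`, assuming SPM_Ready is the 26th vector on that part. Because the function returns early when the reset vector already points at the bootloader, re-uploading a previously downloaded (and hence already patched) image leaves it unchanged.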

When using "virtual boot", I've experienced that the entire bootloader went down after a failed upload.

Yes, I noticed the same when I tried that a few years back. I suspect the optiboot code isn't (wasn't?) watertight.

Can/will this happen to urboot/urclock?

Yes, it can, but no, it won't.

It can because as soon as an incident destroys the reset vector the bootloader stops working (same incident category as accidentally zapping the bootloader itself). It won't because urclock is carefully coded to avoid that ;). It hasn't been an issue for me. But thanks for that question; I will review the code specifically for that.
