Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make GPIO functions for AVR atomic #9575

Closed
wants to merge 5 commits into from

Conversation

sigprof
Copy link
Contributor

@sigprof sigprof commented Jun 28, 2020

Description

On AVR MCUs the GPIO macros like writePinLow() perform read-modify-write operations on PORTx and DDRx registers, unless their argument is a compile time constant (in that case the compiler can optimize those operations into single cbi or sbi instructions). Usually this works without any problems; however, the software PWM emulation code in quantum/backlight/backlight_avr.c manipulates GPIOs from interrupt handlers, and this may cause issues which manifest in a flickering backlight on some boards (#5953). The problem happens when the read-modify-write sequence performed by the matrix scanning code is interrupted by the software PWM handler, and the software PWM uses a pin from the same port as one of the matrix row pins (or column pins for a ROW2COL matrix); in this case the resumed read-modify-write code will overwrite the update performed by the software PWM emulation code.

Change the implementation of all GPIO functions so that they are atomic with respect to interrupts; this is achieved by disabling interrupts around the read-modify-write operations. This fixes #5953, and should not break other boards, although there will be some additional overhead for some GPIO operations (but the most common case of performing writePinLow() or writePinHigh() on pins which are known at compile time should have no additional overhead, because in this case the assembly implementation using cbi or sbi should be picked by the compiler).


This change fixes the issue #5953 (tested on the XD87 HS board), at least if the RGB underglow is not used (a running rgblight animation would still cause noticeable flicker, especially when the backlight brightness is low, but that is a separate issue). However, I'm not really happy about adding overhead to all those GPIO function (although that overhead may be not that large in practice — I measured the scanning rate on XD87 HS, and this change decreased it from 2187 to 2077 scans per second, which is not very significant).

Technically some part of these changes are not really required to fix the backlight issue:

  • The backlight code actually touches only the PORTx register from interrupts, therefore the code which accesses DDRx may be left as is, but that just does not feel right.
  • Intermediate variables are used so that the compiler will hopefully perform those calculations outside of the code section where interrupts are disabled (this does not actually work 100%, but at least the shifting loop is usually ran with interrupts still enabled). This should decrease interrupt latency, which could be especially important for V-USB devices (although maybe we could disable this extra code for them anyway; at the moment I don't have any V-USB hardware and cannot check whether it would have any issues with interrupt latency when this code is enabled).
  • Disabling interrupts around the body of setPinInput() and setPinInputHigh() as a whole is not required — it could be possible to add a special case for the constant argument there and just perform a sequence of two cbi/sbi instructions without disabling interrupts. However, that would result in a possibility of receiving an interrupt in an intermediate state when the pin is already configured as input, but the pullup state is wrong. (Note that for switching from input to output that possibility still exists, and the intermediate state in that case might be even worse — the pin could actively output a wrong logical level for some time; maybe we should add functions like setPinOutputLow() and setPinOutputHigh(), which change PORTx before DDRx, and use setPinOutputLow() in the matrix scanning code.)

Types of Changes

  • Core
  • Bugfix
  • New feature
  • Enhancement/optimization
  • Keyboard (addition or update)
  • Keymap/layout/userspace (addition or update)
  • Documentation

Issues Fixed or Closed by This PR

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • I have tested the changes and verified that they work and don't break anything (as well as I can manage).

On AVR MCUs the GPIO macros like writePinLow() perform read-modify-write
operations on PORTx and DDRx registers, unless their argument is a
compile time constant (in that case the compiler can optimize those
operations into single `cbi` or `sbi` instructions).  Usually this works
without any problems; however, the software PWM emulation code in
quantum/backlight/backlight_avr.c manipulates GPIOs from interrupt
handlers, and this may cause issues which manifest in a flickering
backlight on some boards (qmk#5953).  The problem happens when the
read-modify-write sequence performed by the matrix scanning code is
interrupted by the software PWM handler, and the software PWM uses a pin
from the same port as one of the matrix row pins (or column pins for a
ROW2COL matrix); in this case the resumed read-modify-write code will
overwrite the update performed by the software PWM emulation code.

Change the implementation of all GPIO functions so that they are atomic
with respect to interrupts; this is achieved by disabling interrupts
around the read-modify-write operations.  This fixes qmk#5953, and should
not break other boards, although there will be some additional overhead
for some GPIO operations (but the most common case of performing
writePinLow() or writePinHigh() on pins which are known at compile time
should have no additional overhead, because in this case the assembly
implementation using `cbi` or `sbi` should be picked by the compiler).
@sigprof sigprof mentioned this pull request Jun 28, 2020
@sigprof
Copy link
Contributor Author

sigprof commented Jun 28, 2020

Making those functions static inline does not work because some people use them in plain inline functions without static, and converting them into C99-style inline + extern inline pairs also does not work, because in this case avr-gcc does not want to inline them at all (even asm optimizations for constant arguments do not work). Sigh.

sigprof added 2 commits June 29, 2020 10:34
Previous change to make AVR GPIO operations atomic converted the GPIO
macros into `static inline` function.  Unfortunately, there is some code
that uses those operations in `inline` functions, and gcc complains when
an `inline` function tries to call a `static inline` function.  Using a
plain `inline` for AVR GPIO functions also did not work, because in this
case gcc does not actually inline the code, therefore the optimization
for constant arguments does not work.  Apparently the only solution is
to convert those functions back to macros, so that the code would always
be inlined properly.

The GPIO_FORCE_PRECOMPUTE() and GPIO_BARRIER() macros were added to make
the code which executes with interrupts disabled as small as possible;
without those macros gcc sometimes puts the slow _BV((pin) & 0xF)
calculation (which contains a loop) after `cli`, which needlessly
increases the interrupt latency.
Some existing code attempted to perform GPIO operations on NO_PIN.  The
previous implementation of the GPIO macros for AVR effectively treated
those calls as noops, because `_BV((pin) & 0xF)` truncated to a byte
ended up being 0; however, the atomic implementation attempted to
generate `sbi` or `cbi` instructions with out-of-range operands if the
`pin` argument was a compile time constant equal to NO_PIN.  Add checks
to avoid assembler errors and make such calls do nothing again.
@sigprof
Copy link
Contributor Author

sigprof commented Jun 29, 2020

Because using static inline did not work, I converted the GPIO functions back to macros, and also added some bandaids to make the existing broken code compile as before (the handwired/lovelive9 keyboard uses NO_PIN in MATRIX_ROW_PINS, which generated some broken assembly code; it should probably be fixed to use DIRECT_PINS instead).

This version of the code was not tested on any real hardware at the moment.

@mtei
Copy link
Contributor

mtei commented Jun 29, 2020

I'd like the GPIO pin operation macros to keep it simple and direct, because I'll use those macros in the interrupt handling routine as well.

So don't add atomic operations to the existing macros. Instead, you are welcome to add a new macro with atomic operations, such as the following

  • setPinInput_atomic(pin)
  • setPinInputHigh_atomic(pin)
  • setPinOutput_atomic(pin)
  • writePinHigh_atomic(pin)
  • writePinLow_atomic(pin)
  • writePin_atomic(pin, level)
  • readPin_atomic(pin)
  • togglePin_atomic(pin)

@sigprof
Copy link
Contributor Author

sigprof commented Jun 29, 2020

The issue is that most of the existing code (e.g., quantum/matrix.c, which is the main culprit, but any other GPIO access may cause the same problem) must use atomic GPIO operations. So the way to make this 100% correct is to make the existing operation atomic (they will still work inside interrupt handlers, although slower than necessary), and then maybe add writePinHigh_fast(pin) and friends for use inside interrupt handlers.

In addition, setPinOutput(pin), writePinHigh(pin), writePinLow(pin) and writePin(pin, level) are already compiled to a single instruction if the pin argument is constant. setPinInput(pin) and setPinInputHigh(pin) do not have this special case (the compiler will still use the cbi and sbi instructions for them if pin is constant, but they will be wrapped in a cli/restore SREG block), because they perform operations on two separate registers, and handing an interrupt between those operations may not be desired in some cases (but for the particular case in #5953 it does not actually matter).

The macro implementation did not work again, this time due to the code in the chidori keyboard:

    status ? writePinHigh(pin) : writePinLow(pin);

Looks like I need to wrap those macros in a layer of __extension__ to make this kind of code compile.

@mtei mtei requested review from fauxpark, skullydazed and a team June 29, 2020 15:47
sigprof added 2 commits June 30, 2020 09:21
Use the gcc statement expression extension to make AVR GPIO macros
usable inside expressions, in particular, in a ternary operator:

     level ? writePinHigh(pin) : writePinLow(pin)

The code using a ternary operator was found in the `chidori` keyboard;
it was apparently written before `writePin(pin, level)` was introduced.

The implementation of `writePin(pin, level)` is also reverted to the one
using the ternary operator, like shown above, to make sure that this
style of code is kept working.

There is still a difference from the old implementation: old macros
returned the value of the last modified register (`PORTx` or `DDRx`),
while new ones have type `void`.  It is not possible to keep the old
behavior without losing the `cbi`/`sbi` optimization, and if any old
code was trying to actually use that return value, it should break in an
obvious way with new macros.
@drashna drashna added the breaking_change Changes that need to wait for a version increment label Jul 5, 2020
@drashna
Copy link
Member

drashna commented Jul 5, 2020

If this is an issue, then it is something that should be fixed. But given how the functions are used, it definitely needs to be handled carefully.

@sigprof
Copy link
Contributor Author

sigprof commented Jul 5, 2020

The unfortunate thing here is that apparently the only case when AVR GPIOs are manipulated from interrupt handlers is the BACKLIGHT_PWM_TIMER code, which is required mostly for badly designed hardware. Maybe it would make sense to implement some option like USE_ATOMIC_GPIO_ACCESS, which could be enabled in config.h just for these problematic boards (enabling it automatically would require moving lots of preprocessor logic from quantum/backlight/backlight_avr.c to, I suppose, quantum/config_common.h).

I also looked at how ChibiOS implements the GPIO access. The documentation for palSetLine() and other similar macros says that those operations are not guaranteed to be atomic, so the problem may potentially happen there too, if some code tries to manipulate GPIOs from interrupt handlers or other threads.

For STM32 the actual implementation of GPIO functions in lib/chibios/os/hal/ports/STM32/LLD/GPIOv1/ seems to be atomic even for the “set mode” part (atomic in the sense “the code should not permanently leave registers in an invalid state”, but if _pal_lld_setgroupmode() is interrupted, the hardware registers may be left in some intermediate state for a longer time than desired). Unfortunately, this applies only to GPIOv1 — the implementation of _pal_lld_setgroupmode() in lib/chibios/os/hal/ports/STM32/LLD/GPIOv2/hal_pal_lld.c and lib/chibios/os/hal/ports/STM32/LLD/GPIOv3/hal_pal_lld.c is not atomic (although apparently it does not access the data register, so it might still be safe if the interrupt code does not try to change modes).

For KINETIS the quality of implementation in lib/chibios-contrib/os/hal/ports/KINETIS/LLD/GPIOv1/ seems to be worse — although pal_lld_setpad() and pal_lld_clearpad() are atomic, pal_lld_writepad() is implemented in a non-atomic way for some reason; _pal_lld_setpadmode() is not atomic too (but apparently does not actually conflict with a concurrent pal_lld_setpad() or pal_lld_clearpad(), like in the STM32 GPIOv{2,3} case).

@tzarc tzarc deleted the branch qmk:develop February 27, 2021 20:27
@noroadsleft noroadsleft reopened this Feb 27, 2021
@drashna
Copy link
Member

drashna commented Mar 14, 2021

This has merge conflicts

@fauxpark
Copy link
Member

@sigprof
Copy link
Contributor Author

sigprof commented Apr 3, 2021

Yes, #10491 solved the most important part of the problem, and a more general solution might not be needed for now, so this PR can be closed.

@sigprof sigprof closed this Apr 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking_change Changes that need to wait for a version increment core enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hardware PWM on PD0
6 participants