Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethink STM32 I2C v2 HAL #15350

Merged
merged 4 commits into from
Dec 5, 2022

Conversation

multiplemonomials
Copy link
Contributor

Summary of changes

Hi all. As you may have heard, I have been doing a lot of work on my mbed-ce fork to better test and fix up the I2C HAL for STM32 devices. I think that several serious bugs have been fixed, so I'd like to upstream these changes if possible.

In summary, I created a new test program that runs lots of different I2C scenarios against real hardware. As-is, the STM32 I2C v2 HAL failed many of these test cases, and all of the ones that involved single-byte functions (e.g I2C::write(int)). These functions are problematic because they are so low-level that the STM32 HAL cannot be used, and chip registers must be accessed directly to implement them. There were several places where the I2C peripheral could get into a bad state, freezing it and preventing subsequent transactions from going through.

To fix the issue, I rethought how a lot of the low-level I2C code works, implementing a new state enum to track just what the hardware is doing at any one time. After several weeks of nights on the logic analyzer working through test failures, the code is finally able to handle anything I can throw at it, and is much more compliant with how other manufacturers' HALs work.

Impact of changes

This PR is a major rethink of the STM32 I2C code for their v2 I2C peripheral. It tries to get it into a more API-compliant and less buggy state. It should fix several known bugs:

Additionally, I found more issues while making this PR that will also be fixed:

  • Doing a single-byte operation that fails, then doing a transaction operation after that causes the transaction to return early and blindly report success, because the STOPF flag from earlier was not cleared
  • Doing an asynchronous read transaction then a synchronous write transaction could cause the asynchronous read transaction to be done a second time due to incorrect logic in the interrupt handler
  • The entire I2C peripheral was being reinitialized whenever a NACK was received, which is a fairly expensive operation, especially for an interrupt handler. Now NACKs are handled differently from other errors and don't cause a complete re-init.
  • Doing a repeated start with the single-byte API did not work, because the RELOAD and NBYTES fields have to be zeroed before the repeated start can be sent.

The biggest change this PR makes is the addition of a new state enum, obj_s->state. This provides a simple indicator of what the I2C code is doing at any given time, making certain parts of the logic a lot simpler and enabling some of the fixes to work. I also added a lot of new checks based on this state variable, so if you use the I2C object in an incorrect way, you have a fighting chance of getting a useful error code back instead of undefined behavior. (it's still tough though because certain functions do not have the ability to return an error code...)

Also, up until now, the single-byte API would return after enqueuing the byte into hardware, without waiting for the byte to actually be sent. I think this was done to provide a performance increase, but means that the HAL does not comply with standard Mbed behavior because it cannot return a NACK for a byte until later on, after it's been sent. (and the performance increase didn't mean a whole lot anyway, because it still spinlocks for byte N before sending byte N+1). I have switched things around so it's now compliant and returns the correct result for the byte it was received on.

Migration actions required

Documentation

None, this just brings it into compliance with the existing API.


Pull request type

[X] Patch update (Bug fix / Target update / Docs update / Test update / Refactor)
[] Feature update (New feature / Functionality change / New API)
[] Major update (Breaking change E.g. Return code change / API behaviour change)

Test results

[] No Tests required for this change (E.g docs only update)
[] Covered by existing mbed-os tests (Greentea or Unittest)
[X] Tests / results supplied as part of this PR

I ran my new comprehensive I2C test on both a Nucleo L452RE and a Nucleo H743ZI dev board, and all test cases passed. I also ran the EEPROM test case, and everything passed except the 4096 byte ("full chip") operations. This is due to another bug.


Reviewers

@0xc0170
@jeromecoutant


@@ -77,39 +77,6 @@ struct serial_s {
#endif
};

struct i2c_s {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit silly that this structure is copied verbatim into like 10 different objects.h headers, despite all of them being identical and being used by a single version of i2c_api.c. Since I had to modify the structure, I consolidated these into a single version in stm_i2c_api.h while I was at it.

@ciarmcom ciarmcom added the release-type: patch Indentifies a PR as containing just a patch label Nov 21, 2022
@ciarmcom ciarmcom requested review from 0xc0170, jeromecoutant and a team November 21, 2022 08:00
@ciarmcom
Copy link
Member

@multiplemonomials, thank you for your changes.
@0xc0170 @jeromecoutant @ARMmbed/mbed-os-maintainers please review.

Copy link
Contributor

@0xc0170 0xc0170 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's list of changes in the commit message. Was this pull request squashed? It's not easy to review so much changes within i2c implementation, should this be split into smaller logical changes (my 2 below comments are related to the same remark).

hal/include/hal/i2c_api.h Show resolved Hide resolved
hal/include/hal/i2c_api.h Outdated Show resolved Hide resolved
@0xc0170
Copy link
Contributor

0xc0170 commented Nov 24, 2022

@jeromecoutant Would you find time to review ?

@jeromecoutant
Copy link
Collaborator

@jeromecoutant Would you find time to review ?

To be honest, for a complete review, I think not...
But I executed tests with CI shield with some families, it is OK.
I would like to start FPGA tests as well

@multiplemonomials
Copy link
Contributor Author

multiplemonomials commented Nov 25, 2022

To answer your question, yes this was squashed together from a bunch of commits, many of which were bugfixes. If you want, I can try to break a couple of the standalone fixes (e.g. adding docs, adding return codes) into separate MRs or separate commits. However, I'll warn you, at least 80% of the total changes need to be made at once and reviewed together, or they won't work and they won't make sense.

I definitely understand that this type of change is hard to review -- the STM I2C register level operation is complex and not very intuitive in how it maps to Mbed's I2C API. I can count at least 10 different times when my test broke and it turns out the hardware was working in a way completely different from how I'd expected. But please believe me that I've tested it extensively -- I'm pretty confident that if there were issues in the code, my hardware test cases would catch them (even if the FPGA board tests don't, because they only do one transaction before deleting the I2C object). And even if there are some small issues, please believe me that it's an order of magnitude less buggy than it was before.

I really don't want to let the perfect be the enemy of the good here, especially when people are constantly running into these I2C bugs. I can personally say that, in ~4 years of Mbed OS development, I have seen developers run into this bug on 4 separate occasions, each time causing major havoc (and one time requiring me to drive all the way to a lab at UC San Diego). I really would like to put it to bed.

@0xc0170
Copy link
Contributor

0xc0170 commented Nov 25, 2022

understood, so lets just split what can be (my 2 remarks above, possibly moving struct i2c_s into the header can also be separate logical unit, the rest within i2c implementation can stay to address the bugs there) - separate commits, no need to create another PR.

@multiplemonomials multiplemonomials force-pushed the upstreamed/rethink-stm32-i2c-hal branch from a567b81 to a850734 Compare November 28, 2022 05:19
@mergify mergify bot dismissed 0xc0170’s stale review November 28, 2022 05:19

Pull request has been modified.

@0xc0170
Copy link
Contributor

0xc0170 commented Nov 28, 2022

Can you check the styling check failing (just one line).

I'll need to fix license check, it looks like something broke in the scancode dependencies

Copy link
Contributor

@0xc0170 0xc0170 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding 2 commits, it looks good to me (just styling fixes required).

} else {

if(obj_s->state == STM_I2C_SB_READ_IN_PROGRESS)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep the styling consistent, it uses { attached to the condition, and space after if, see line 1143, please fix all instances in this file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a command I can run to autoformat everything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, dang notepad++...

@multiplemonomials
Copy link
Contributor Author

Regarding the license check, feel free to use my patch mbed-ce#104

@pombredanne
Copy link

I'll need to fix license check, it looks like something broke in the scancode dependencies

@0xc0170 I maintain scancode and this is now fixed. Sorry for the churn!

@multiplemonomials multiplemonomials force-pushed the upstreamed/rethink-stm32-i2c-hal branch from a850734 to 0c2ee4f Compare November 29, 2022 08:12
0xc0170
0xc0170 previously approved these changes Nov 29, 2022
@mergify mergify bot added needs: CI and removed needs: work labels Nov 29, 2022
@mbed-ci
Copy link

mbed-ci commented Nov 29, 2022

Jenkins CI Test : ❌ FAILED

Build Number: 1 | 🔒 Jenkins CI Job | 🌐 Logs & Artifacts

CLICK for Detailed Summary

jobs Status
jenkins-ci/mbed-os-ci_build-cloud-example-ARM
jenkins-ci/mbed-os-ci_build-cloud-example-GCC_ARM
jenkins-ci/mbed-os-ci_unittests ✔️
jenkins-ci/mbed-os-ci_build-greentea-ARM
jenkins-ci/mbed-os-ci_build-greentea-GCC_ARM
jenkins-ci/mbed-os-ci_build-example-GCC_ARM
jenkins-ci/mbed-os-ci_build-example-ARM

@mergify mergify bot added needs: work and removed needs: CI labels Nov 29, 2022
@multiplemonomials
Copy link
Contributor Author

Hmm, do you have any idea why the CI failed? I don't really know how to read the logs

@0xc0170
Copy link
Contributor

0xc0170 commented Nov 29, 2022

Compile [ 97.9%]: i2c_api.c
[Error] i2c_api.c@432,16: no member named 'sda_func' in 'struct i2c_s'
[Error] i2c_api.c@433,16: no member named 'scl_func' in 'struct i2c_s'
[Error] i2c_api.c@507,37: no member named 'sda_func' in 'struct i2c_s'
[Error] i2c_api.c@508,37: no member named 'scl_func' in 'struct i2c_s'
[ERROR] ./mbed-os/targets/TARGET_STM/i2c_api.c:432:16: error: no member named 'sda_func' in 'struct i2c_s'
        obj_s->sda_func = pinmap->sda_function;
        ~~~~~  ^
./mbed-os/targets/TARGET_STM/i2c_api.c:433:16: error: no member named 'scl_func' in 'struct i2c_s'
        obj_s->scl_func = pinmap->scl_function;
        ~~~~~  ^
./mbed-os/targets/TARGET_STM/i2c_api.c:507:37: error: no member named 'sda_func' in 'struct i2c_s'
    pin_function(obj_s->sda, obj_s->sda_func);
                             ~~~~~  ^
./mbed-os/targets/TARGET_STM/i2c_api.c:508:37: error: no member named 'scl_func' in 'struct i2c_s'
    pin_function(obj_s->scl, obj_s->scl_func);
                             ~~~~~  ^
4 errors generated.

This are the errors there for DISCO_L475VG_IOT01A board.

@multiplemonomials
Copy link
Contributor Author

OK, looks like this was a mistake in my original Mbed CE MR. Fixed it there and here.

@mergify mergify bot dismissed 0xc0170’s stale review December 4, 2022 07:20

Pull request has been modified.

@mergify mergify bot added needs: CI and removed needs: work labels Dec 5, 2022
@0xc0170
Copy link
Contributor

0xc0170 commented Dec 5, 2022

CI started

@mbed-ci
Copy link

mbed-ci commented Dec 5, 2022

Jenkins CI Test : ✔️ SUCCESS

Build Number: 2 | 🔒 Jenkins CI Job | 🌐 Logs & Artifacts

CLICK for Detailed Summary

jobs Status
jenkins-ci/mbed-os-ci_build-cloud-example-ARM ✔️
jenkins-ci/mbed-os-ci_build-cloud-example-GCC_ARM ✔️
jenkins-ci/mbed-os-ci_unittests ✔️
jenkins-ci/mbed-os-ci_build-greentea-ARM ✔️
jenkins-ci/mbed-os-ci_build-example-ARM ✔️
jenkins-ci/mbed-os-ci_build-greentea-GCC_ARM ✔️
jenkins-ci/mbed-os-ci_build-example-GCC_ARM ✔️
jenkins-ci/mbed-os-ci_greentea-test ✔️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants