Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NUC123] USB driver bug fixes #244

Merged
merged 6 commits into from
Dec 31, 2020

Conversation

alexclewontin
Copy link
Member

@alexclewontin alexclewontin commented Dec 22, 2020

@fishman without knowing the exact configuration you're using to build QMK it's hard to say anything for certain, but here are a couple of things I've fixed that may help.

  1. For the moment, I've removed all support for HW endpoints 6 & 7. Endpoint 6 suffers from a known hardware error where in certain configurations it will incorrectly mirror the behavior of endpoint 5. More can be found in the errata from Nuvoton. The long term solution here is to make the HW endpoint allocation more sophisticated, so that a working configuration is used, but short term, this is safer than leaving it enabled. This was causing a hard fault for me in my QMK build, so this seems like the most likely candidate.
    2. I reverted the wake-up/suspend/resume handling to the way the driver handled it previously. My QMK build was (and continues to) having some issues where it was getting stuck asleep. I don't think this is a USB issue, and This more likely has something to do with the dummy matrix scan function I injected not waking the board properly, but on the off chance it might help you, I've changed this back.
    3. I changed the order of handling bus events & endpoint events. There were situations where the ISR was too slow, which would cause some zero-length packets to get handled out of order, which messes with the EP0 state machine. This probably needs some continued attention, but this tweak might help in the mean time. This problem for me has been very intermittent, however, and so if your computer is categorically not recognizing the keyboard then I would guess this is not what your problem is.

If one of these this works, we'll get it committed here. If not, I'll probably need to see your full QMK config files to really be of any additional help.

EDIT: Either 2 or 3 caused something to break. I'm pretty sure those weren't your problem though

@alexclewontin alexclewontin force-pushed the nuc123-usb-driver branch 2 times, most recently from f7f7b58 to 395ec32 Compare December 24, 2020 01:41
@alexclewontin
Copy link
Member Author

alexclewontin commented Dec 24, 2020

As an update:

  1. EP3 remains disabled. This will require a more sophisticated HW EP allocation system, as detailed above
  2. I did add the _usb_wakeup event callback to the RESUME event detection in the ISR. USB suspend/wake may need continued testing.
  3. I finally found and fixed a bug that had been intermittently popping up for a while: I thought it was just a matter of the device being too slow to respond because if you set the HCLK speed to maximum you could outrun it, but it was actually a DATA0/1 PID misalignment that was causing the device to lag behind the host, which would occasionally result in bad transitions on the EP0 state machine. This was only really noticeable if you had assertions enabled in chconf.h, but it also probably meant that the maximum throughput was cut in half. In any case, problem solved.

@fishman
Copy link
Contributor

fishman commented Dec 24, 2020

@alexclewontin this is the pull request. I added the config from your USB testhal. I guess the board init may not be working.

qmk/qmk_firmware#11112

@alexclewontin
Copy link
Member Author

alexclewontin commented Dec 24, 2020

@fishman using your config, plus the updates in this PR, plus this:

/**
 * @brief   Early initialization code.
 * @details This initialization must be performed just after stack setup
 *          and before any other initialization.
 */
void __early_init(void) { NUC123_clock_init(); }

added to the end of my board.h file, I can reproduce. For me, a temporary fix is to remove the line in your keyboard's rules.mk that says

MOUSEKEY_ENABLE = yes       # Mouse keys

I have literally no idea why that changes things, but for some reason it does. (CAVEAT: this is not an identical reproduction, because I do not have the ducky hardware, but rather a simulation on the dev board hardware I do have)

Tomorrow I can try to look at what MOUSEKEY_ENABLE does, but I am not super familiar with qmk.

Actually I'm not even sure MOUSEKEY_ENABLE is the culprit. I'll do some more digging tomorrow

@fishman
Copy link
Contributor

fishman commented Dec 24, 2020

I tried your board and disabling mousekeys(not sure why that should matter) and it doesn't work. your testhal works(as in a usb device shows up), maybe you have some usb init in the usbcfg that's incompatible with the way qmk inits? It's a little frustrating to be honest. The hal and mcu config should be the same as yours except for usb_wait. Can you check the branch I posted above?

@alexclewontin
Copy link
Member Author

@fishman Using your exact rules.mk (except I had to comment out DIP_SWITCH_ENABLE = yes because it was causing some errors during compilation, my best guess is that this isn't the problem though) and mcuconf.h I've identified and fixed another bug in the USB driver that was causing some bad descriptors to be transmitted in some edge cases (One of these edge cases was the HID report descriptor that resulted when both MOUSEKEY_ENABLE and EXTRAKEY_ENABLE were set to yes. With it fixed, the keyboard should be recognized even when they are both set to yes). My Mac recognizes the keyboard as a connected device, and running hidutil list includes it as a connected HID device.

I'm sorry it's frustrating; unfortunately this is the nature of debugging embedded software remotely. I think we're making progress here though. Please give this a try

@alexclewontin
Copy link
Member Author

The CI failure has nothing to do with this commit/platform

@fpoussin
Copy link
Member

fpoussin commented Dec 24, 2020

ChibiOS stable branch is broken, I'll open a bug report
edit: http://www.chibios.com/forum/viewtopic.php?f=35&t=5729

@fpoussin
Copy link
Member

Fix went live

@fishman
Copy link
Contributor

fishman commented Dec 29, 2020

@alexclewontin still not working. What's frustrating about it isn't debugging. It's that it was working and since it's a complete rewrite it's not easy to see where the error is. I get that your code is a lot cleaner and better than the original implementation, so I appreciate that.

@fishman
Copy link
Contributor

fishman commented Dec 29, 2020

I've just tested it again. The keyboard is detected, but only sometimes, so it's pretty unreliable. When it is detected it will work for one keypress and then get stuck.

@alexclewontin
Copy link
Member Author

@fishman right and I get how it is frustrating, but the original driver did not work for some pretty basic test cases.

Re the problems: I can reproduce the single keypress & getting stuck (I had assumed that was an issue with the mock GPIO driver I rigged up, but clearly not), but not the unreliability of connection (I can see the keyboard connect every time I plug into the computer). Can you make sure you are using the most recent commit in this PR? If you are, can you set DIP_SWITCH_ENABLE = no? I don't think that is the problem, but that is the only setting I could not copy in my build.

@alexclewontin
Copy link
Member Author

alexclewontin commented Dec 31, 2020

@fishman Okay, I have identified what was causing it to get stuck after one keypress (a continuation of the known hardware bug discussed earlier in this thread) and fixed that issue. As I mentioned, I cannot reproduce the unreliability issue (it shows up every time I connect on my computer), but I have added a couple of delays in the initialization routine, in case that helps.

When you try this out, make sure you pull down a fresh copy of this branch, because I squashed the new fix into 891a0b3, as it is essentially one single bug.

@fishman
Copy link
Contributor

fishman commented Dec 31, 2020

@alexclewontin LGTM, thank you.

I noticed this delay missing compared to the other driver. Can you please also removed the unused variable on line 501?

@fishman
Copy link
Contributor

fishman commented Dec 31, 2020

@alexclewontin I know why you don't see the connection issue. Try to connect your device through a hub and you'll probably run into this issue.

@fishman
Copy link
Contributor

fishman commented Dec 31, 2020

@fpoussin @alexclewontin i'd probably recommend to merge this and add the rest in a separate pr.

@alexclewontin
Copy link
Member Author

@fishman I've fixed the unused variable (squashed into 75cbc08). I'll mark it ready for merge.

@alexclewontin alexclewontin marked this pull request as ready for review December 31, 2020 17:19
@fpoussin fpoussin merged commit 61baa6b into ChibiOS:chibios-20.3.x Dec 31, 2020
@fpoussin
Copy link
Member

Thanks :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants