-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pulling Serial1 RX low causes Artemis to hang with v2.1 of the core #423
Comments
Running the following sketch shows that holding the RX1 pin low for ~9.5 bit periods causes the hang. I.e. half way through the stop bit. (1 start bit, eight data bits, no parity, 1 stop bit). So I guess there must be some subtle difference in the way the UART is configured in v2.1 compared with v1.2?
At 9600 baud, the code hangs when delayBy reaches 983 microseconds = 9.4 bits: At 4800 baud and starting with delayBy = 1900, the code hangs when delayBy reaches 1981 microseconds = 9.5 bits: |
Looked at this issue before, it is described + solution in this issue : #349 regards, |
Thank you Paul (@paulvha ), I think this issue might be the cause of problems we've been seeing on the Artemis Global Tracker. I upgraded the firmware to use v2.1.0 of the core recently and some customers have reported the new code hanging. We have the Iridium modem connected to Serial1. Power for the modem is switched by a separate circuit controlled by a GPIO pin. I think the modem may be pulling the RX pin low briefly as it powers up causing the code to hang. I suspect that adding a pull-up resistor in this case may not work but I will give it a try! Thanks again, |
Some progress... Disabling the RX pin seems to avoid the issue. This sketch keeps running on v2.1.0:
|
I think we're in business! This sketch:
produces: |
Could this be the explanation? In am_hal_uart.c But in serial_api.c If pulling the RX pin low causes an RX_TMOUT, does this mean the UART will hang because the interrupt is not enabled and not being serviced? I suspect we need to change line 229:
to:
And in line 242, change:
to:
Thoughts please... |
@PaulZC I'll be back working on Monday, I'll take a look first thing! |
Same here...
|
Thank you both! No rush... |
Alright. Here is what I found so far. It looks like when the RX line is going low we are hanging in the UART::rxISR.
More specifically the UnbufferedSerial::readable() is returning true (erroneously), and we are getting stuck in UnbufferedSerial::read(&c, 1);. The read function is a blocking function that does not exit until we have read atleast 1 byte, since the line is just being held low there is no data to read and we just sit there. Digging into the mbed driver the read is utilizing the serial_getc implementation. which is hanging in this loop:
We don't have much control over this behavior since we can see above:
The problem really seems to be that the call before this ( UnbufferedSerial::readable() which thru mbed will arrive at our driver implementation of serial_readable) is returning true.
So RXFE (according to the datasheet "This bit holds the receive FIFO empty indicator") is changing to false. indicating that the FIFO of the hardware serial is not empty... then we are attempting to read the FIFO, finding it empty and waiting for it to not be empty... I have confirmed this is the problem by commenting out the body of the UART::rxISR and then running this sketch.
When I jump RX to ground the RXFE register swaps to 0. So I have found what I believe to be the root cause and now I'm working on figuring out a solution. |
Reading between the lines here, I think the Again, reading between the lines, I'm guessing that v1.2 was OK with the RX pin being pulled low because the ISR checked the
The code never got as far as calling
Can we change As a nasty thought experiment, what would happen if a valid serial byte was received (with a proper stop bit) followed immediately by the RX pin being pulled low for 10+ bits? Maybe, if that happens, whatever is in the FIFO should be discarded? Thanks for digging into this...! |
I have extended your sketch to include reading the DR-status bits. Turns out an error has occurred ( parity, framing, overrun or break) when connecting a wire to GND.
I had included the serial.print() in UART: rxISR and in case of an error when we try to read the character it will "hang". The reason is that thru MBED, Serial_getc(in Serial_API.c), am_hal_uart_transfer (in am_hal_uart.c), read_timeout(in am_hal_uart.c), read_nonblocking (in am_hal_uart.c) we end up in uart_fifo_read ((in am_hal_uart.c). Here it performs a check on the error bits and if detected it will return an error AND will NOT increase the number of bytes read. The returned error code is neglected in Serial_getc it will only check that data has been received :
Now you got your "hang". Unfortunately, the error code can NOT be passed back thru MBED, only the read character is passed back. Passing back a 0x00 or 0xFF can not be done as that might be valid data. Potential solution
We need to define the UART module in HardwareSerial.h, setting default as 1 as the UART1 might be mapped to different pins: We need to add a line to determine that UART module 0 is used:
regards, |
Thank you Paul - I really appreciate your thorough detective work (as always), |
Arg. So I end up missing characters when I try to make this solution work. That is, this fixes the problem... but it seems to introduce problems in normal communications. I think this is because reading this register makes it so that it is busy and the data doesn't end up populating it... I can't figure out exactly what it is, but this new problem happens when you try to read Uart->DR_b before doing a read, even if there is no error in the error bits. This happens regardless of baud rates. This makes me think that a solution like this can't work, but maybe I am missing something obvious. I am tempted to just have the getc function in the serial driver return 0 (or FF or 55 or something) when it hits an error like this. As you pointed out this seems like a pretty bad solution since this could be misinterpreted as valid data, but it sure beats just hanging in this function in the middle of an ISR....
The other solution would be to just ditch the use of UnbufferedSerial::read in the ISR. Thoughts? |
I forgot to mention above, if you want to reproduce the problems I speak of. Try example 4_serial included with the core. |
I have tested and apparently, when reading part of the data register (the status bits), the character is also assumed to be read and removed from the FIFO. Instead of reading the status from the DR, we can also read the status from the STATUS register.
I have checked this also and now the correct data is passed on with the code in UART::rxISR
regards, |
Ahh, good catch. I got too focused on it being a timing issue! I am testing this out now. I am also considering moving this check into the serial_readable function in the mbed driver. I have not tested it yet, but I imagine that the BufferedSerial in mbed will have this exact same problem, as the ISR follows the same logic. something along the lines of
|
Could be the right place. I would make a small change and first check there is a character in the FIFO before checking the status. something like :
Regards, |
Thinking a bit more about adding in Serial_readable(), the proposed setup could still cause an issue if there are 2 characters after each other in the FIFO in error. The first one is caught, but the second is not. So maybe change to :
regards, |
Alright, this change is in the v2.2.0 pre-release. @PaulZC Are there any other things that you were hoping I would include in the v2.2.0 release for the global tracker? |
Hi @Wenn0101 , Excellent stuff - thank you! Many thanks, |
Closing. This is fixed in 2.2.0 |
Re-opening issue. Received internal bug report of a negative interaction between Servo and Serial. After a servo write, Serial.available is no longer reporting received characters. Confirmed the issue with this sketch. Confirmed that this sketch worked in v2.1.1 and not in v2.2.0. Suspect it will end up being related to the changes made here, will move to its own issue if I determine it is not.
|
Weird... looked into this. The OESTAT flag is set (overflow) and stays set. It does not seem to clear when reading the DR register. reproduce In the sketch loop(), before Serial.available(), I check for the flag register values. Once OESTAT becomes 1, I read the DR and print it. Once OESTAT is 1 in every next loop it keeps indicating the FIFO is not empty, the OESTAT stays 1 and with every read it keeps showing the 0x62. To clear it I had added writing UARTn(0)->RSR_b.OESTAT = 1; At least NOW that clears the error and it that stops repeating (SO NO MATTER WHAT WE NEED TO ADD CLEAR THE ERROR CONDITION BY WRITING A 1) BUT that does not solve the issue, because with nearly every keyboard input it generates an OESTAT error. Something else is going on. Using the standard sketch as you provided, and in serial_readable() I removed checking on OESTAT. if you enter a small word (asdfg ) and just do the small word a couple of times, it often happens that characters are lost. I then tried the same standard sketch on V2.1.1. Get exactly the same errors when you enter a small word it often happens that characters are lost (On V2.1.1 the same as on V2.2.0). Something going on with PWM setting. Maybe one of the system timers is set differently. Need more time to look at that. regards, |
I am also seeing the missed characters in v2.1.1. I tried v1.2.3 and I still see missed characters, and OESTAT is set. I think I am going to try to isolate this issue in a baremetal AmiqSuiteSDK project, to see if I get similar errors in a simpler example. |
The root cause is the following section in : the file : apollo3/2.1.1/cores/arduino/sdk/core-implement/CommonAnalog.cpp,
If you comment out the lines : AM_CRITICAL_BEGIN and AM_CRITICAL_END. I am not losing any characters anymore and also the OESTAT error is not happening anymore. I can understand that we would like a clean-cut over to the next PWM output, but the chance of any sketch/program updating the CTIMER while we read/wait for the cutover is extremely low. I would not expect this to be a critical section as we only read and would remove the 2 lines. regards, |
I wondered yesterday WHY commenting out AM_CRITICAL_BEGIN / AM_CRITICAL_END would make that difference. I guessed there is a problem with the FIFO handling. Looking into this for sure the FIFO is not enabled due to calls in serial_api.c, functions serial_init() and serail_format(). MBED_ASSERT(am_hal_uart_configure_fifo(obj->serial.uart_control->handle, &(obj->serial.uart_control->cfg), false) == AM_HAL_STATUS_SUCCESS); In am_hal_usrt_configure(), the last 'false" is used as 'bEnableFIFO' to enable or disable the FIFO The same 'bEnableFIFO' is used to enable/disable TX and RX interrupts but then there need to be RX/TX buffers defined, which are not set in serial_init(). Changing the 'false' to 'true' in both locations the characters are not lost anymore, BUT the reading of the FIFO is only starting after 34 characters. So the interrupt seems to be only raised when the FIFO is full and not when one character is in the FIFO. Looking into that now. regards, |
The FIFO handling in the UART is still a mystery for me. From the remarks at the top of serial_api.c :
I have been reading the different UART registers as part of the loop. Clearly, in the FR register, the TXFE is set to zero as soon as I sent a byte (or actually 3 bytes: byte + \r + \n). BUT it does not generate an IRQ. Nothing in the EIS or MIS is changing. It seems to waits for the FIFO to become to the threshold before it raises an interrupt and then bytes from the FIFO are read. I have also changed in serial_init() setting the obj->serial.uart_control->cfg.ui32FifoLevels = AM_HAL_UART_RX_FIFO_7_8; Or any other value. The IRQ triggers earlier, but never after 2 or 3 bytes even with AM_HAL_UART_RX_FIFO_1_2 (which is setting IFLS to zero). Once there are enough bytes in the FIFO, it will read some of them, but not all although the FR register indicates there is the FIFO is not empty! I have not figured out this FIFO. Maybe I am not alone, looking at V1.2.3 I see in aps_uart_cpp, function _begin(), line 350 : I can read all the data from the FIFO with the following line in the sketch :
Still (very) puzzled, maybe you have better ideas. regards, |
Hey Paul, Thanks for your input here. I wrote this driver and I am trying to remember everything that I was thinking at the time. I remember that I had significant problems getting the IRQ to trigger when I wanted it to. I committed the only solution that I found that would raise the interrupt when MBED was expected it. I think it is very possible that there is a better solution, but I think you are hitting the same confusion I did. I had difficulty using the FIFO and getting the interrupt to fire, I remember simple "echo" serial examples that would only echo characters back after 4 or 8 characters were submitted. This resulted in me thinking like the FIFO was only useful in situations where you were relying on polling the FIFO rather than utilizing interrupts. I cant help but feel like it has to be possible to configure the UART module to behave the way that we want, but that "// Disable that pesky FIFO" comment does make it seem like others before us have tried and failed. To summarize this new issue, it seems like there are 3 pieces to this problem:
It sounds like I can quickly fix item 1, and item 2 may be a simple fix (although I want to make sure I understand why this is here, I think it might have something to do with "inverted PWM signals" that I remember being an issue, but that I didn't work on). But item 3 is still a bit of a problem that may not have a direct solution. -Kyle |
Glad I am not the only one who does not understand the UART- FIFO. Looking at Ambiq SDK code the ' AM_HAL_UART_RX_FIFO_1_2' seems that the IRQ happens when the FIFO is half-full. I can't find a good description for the other settings. We could add a check (as part of available()) the RXFE status, if not 1 call the routine "serial_available" to read all the characters.. but it is only a work-around (and maybe not the best).. As it relates to the ap_pwm_output. We should not remove the check, but in my mind it does not need to be AM_CRITICAL_BEGIN as am_hal_ctimer_read() is only reading. Not updating or writting. The worsed that would happen is that it misses a servo-cycle regards, |
Fix is in dev and scheduled for release 2.2.1 |
Hi gang,
Another interesting Apollo3 v2.1 quirk:
Pulling the Serial1 RX pin low causes the Artemis to hang with v2.1 of the core...
Steps to reproduce:
Redboard Artemis ATP
Run the following sketch
Use a jumper wire to pull the RX1 pin to GND
With v2.1.0 and v2.1.1 of the core, the Artemis hangs permanently - even when the jumper wire is removed
With v1.2.1, the code keeps on running
I haven't yet figured out why this happens, or exactly where the code hangs. I'm hoping @paulvha may have some insight on this (thank you in advance - as always).
I know that holding a serial RX pin low is 'illegal' (manifesting as a continuous start bit?) but folks need to know about this just in case the thing they have attached to the RX pin is (e.g.) powered down and pulls the pin low.
I'm kicking myself, because I actually noticed this a few weeks ago when I was upgrading OpenLog Artemis. I noticed that Serial1 RX would hang after deep sleep if nothing was connected to the RX pin. I eventually figured out that the (weak?) pull-up on the RX pin is removed after deep sleep. If the pin then floats low, the code hangs. I worked around this at the time by manually re-enabling the pull-up on the RX pin after deep sleep. I guess this might explain some of the behavior seen in #411?
Best wishes,
Paul (ZC)
The text was updated successfully, but these errors were encountered: