Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SX128X begin() SPI timeout when BUSY defined on certain platforms #716

Closed
MUSTARDTIGERFPV opened this issue Mar 30, 2023 · 9 comments
Closed
Labels
bug Something isn't working resolved Issue was resolved (e.g. bug fixed, or feature implemented)

Comments

@MUSTARDTIGERFPV
Copy link

MUSTARDTIGERFPV commented Mar 30, 2023

Describe the bug
Using RadioLib on ESP8285/SX1280 (ExpressLRS Receiver Hardware), SX128x::begin() returns -705. Instrumenting RadioLib I can see that it's hitting an SPI timeout during the reset() call's attempt to put the chip in standby. Adding an interrupt handler on the BUSY pin, I can see that the pin is toggling state but that SPIwriteStream and SPITransferStream's delays waiting for the BUSY pin to toggle occur just after the pin changes state. It seems like there's a race condition in catching the transition using the current approach (a spin loop) on the ESP8266 platform.

I've tried:

  • Increasing the ESP8285 to 160MHz core speed
  • Running without the BUSY pin (works for TX, but readData() requires BUSY to function properly)
  • Commenting out some of the validation in begin() (gets us further, but eventually something breaks down the line)

To Reproduce
In-situ: https://github.com/MUSTARDTIGERFPV/ESP32-INAV-Radar/blob/master/src/lib/Radios/LoRa_SX128X.cpp
Minimal reproduction:

// include the library
#include <RadioLib.h>
#include <SPI.h>

// https://github.com/ExpressLRS/ExpressLRS/blob/master/src/hardware/RX/Generic%202400%20Diversity%20PA.json
#define PIN_BUSY 5
#define PIN_DIO 4
#define PIN_NSS 15
#define PIN_RST 2


// begin() fails with SPI timeout when this is used
//SX1281 radio = new Module(PIN_NSS, PIN_DIO, PIN_RST, PIN_BUSY);
// readData() doesn't actually output any data when this is used (since it doesn't wait on the chip to be done)
SX1281 radio = new Module(PIN_NSS, PIN_DIO, PIN_RST);

// IRQ Handling
volatile bool irqFired = false;
void IRAM_ATTR onDIO(void)
{
  irqFired = true;
}

void setup()
{
  Serial.begin(115200);

  // Initialize SX1280 with default settings
  Serial.print(F("[SX1280] Initializing ... "));
  int state = RADIOLIB_ERR_NONE;
  do
  {
    state = radio.begin(2500.0f, 812.5f, 5, 6, 0x17, 2, 12);
    // int state = radio.begin();
    radio.setDio1Action(onDIO);
    if (state == RADIOLIB_ERR_NONE)
    {
      Serial.println(F("success!"));
    }
    else
    {
      Serial.print(F("failed, code "));
      Serial.println(state);
    }
    delay(1000);
  } while (state != RADIOLIB_ERR_NONE);
  // Configure radio settings
  radio.implicitHeader(16);
  radio.setCRC(0);
  // Set amplifier / antenna switch pins
  pinMode(10, OUTPUT);
  digitalWrite(10, LOW);
  pinMode(9, OUTPUT);
  digitalWrite(9, LOW);
  radio.startReceive();
}

uint8_t buf[16];

void loop()
{
  if (irqFired)
  {
    irqFired = false;
    if (radio.getIrqStatus() & RADIOLIB_SX128X_IRQ_RX_DONE)
    {
      int state = radio.readData(buf, 16);
      if (state != RADIOLIB_ERR_NONE)
      {
        Serial.printf("[SX1280] readData failure: %d\n", state);
      }
      radio.startReceive();
      Serial.print("[SX1280] Data:");
      for (uint8_t i = 0; i < 16; i++)
      {
        Serial.printf("%02X ", buf[i]);
      }
      Serial.println();

      Serial.print(F("[SX1280] RSSI:\t\t"));
      Serial.print(radio.getRSSI());
      Serial.println(F(" dBm"));

      // print the SNR (Signal-to-Noise Ratio)
      // of the last received packet
      Serial.print(F("[SX1280] SNR:\t\t"));
      Serial.print(radio.getSNR());
      Serial.println(F(" dB"));

      // print the Frequency Error
      // of the last received packet
      Serial.print(F("[SX1280] Frequency Error:\t"));
      Serial.print(radio.getFrequencyError());
      Serial.println(F(" Hz"));
    }
  }
}

Expected behavior
SX1280 can initialize without an SPI timeout when the BUSY pin is used such that readData returns valid data.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional info (please complete):

  • MCU: ESP8285 on a Matek R24-D ExpressLRS Receiver
  • Wireless module type: SX1281
  • PlatformIO version: 6.1.6
  • Library version: 5.7.0
@jgromes
Copy link
Owner

jgromes commented Mar 30, 2023

I can see that the pin is toggling state but that SPIwriteStream and SPITransferStream's delays waiting for the BUSY pin to toggle occur just after the pin changes state. It seems like there's a race condition in catching the transition using the current approach (a spin loop) on the ESP8266 platform

I'm not sure I fully understand. Do you mean to say that the BUSY does go low after reset (as it should be held high during startup), but then goes high again indefinitely before SPITransferStream starts the waiting loop?

@MUSTARDTIGERFPV
Copy link
Author

That seems about right - it looks like BUSY stays high after SPIwriteStream does its command write to enter standby mode, so the SPIcheckStream call later never sees BUSY go low again. Here's some logs that hopefully show the issue:

"F" is falling interrupt, "R" is rising interrupt on the BUSY pin

writestream sending command
begin ensure GPIO is low
FR
begin wait for GPIO to go high an
F
d then low
writestream verifying command
R
begin ensure GPIO is low
timeout in ensure GPIO is low
failed, code -705

Explained:

SPIwriteStream calls SPItransferStream to send the initial command

In the section // ensure GPIO is low, we get a falling, then a rising shortly after

A falling interrupt arrives just before the section // wait for GPIO to go high and then low in SPItransferStream so that got skipped during the initial command send from SPIwriteStream.

SPIwriteStream then goes back to verify the command (calls SPIcheckStream) and SPIcheckStream calls SPItransferStream again

In that go-around, the // ensure GPIO is low never sees GPIO go low, because it rose again after the last command.

@jgromes
Copy link
Owner

jgromes commented Mar 30, 2023

Thank you for the explanation. Could you try moving the digitalWrite that's pulling NSS low after the BUSY waiting loop? It's currently before that loop which is a bit of a strange order.

@MUSTARDTIGERFPV
Copy link
Author

Ahaha, I did exactly that just a few minutes ago! That was able to get things to initialize, but readData() still seems to leave the buffer unmodified, or get hung up for a few dozen ms (likely waiting on BUSY).

@jgromes
Copy link
Owner

jgromes commented Mar 30, 2023

Thank you for confirming this, I suspected the order of NSS/BUSY was weird (from an unrelated issue), this is likely a bug.

Regarding readData, are you sure there is really something to read? You don't seem to be checking the length of received packet using getPacketLength

@jgromes jgromes added the bug Something isn't working label Mar 30, 2023
@MUSTARDTIGERFPV
Copy link
Author

Thanks! I'm 99.9% certain the transmit side is functional; the buffer I'm feeding to the transmit function has valid data in it. The transmitter and receiver are flashed with a version that has NSS after the busy check, but the readData call seems to not modify the buffer that's handed to it.

Here's the actual code in-situ: https://github.com/MUSTARDTIGERFPV/ESP32-INAV-Radar/blob/master/src/lib/Radios/LoRa_SX128X.cpp

Note that the crypto call is a no-op for debugging purposes right now, I'm sending the buffer as-is, which always has at least some non-null data on account of the CRC at the end.

@MUSTARDTIGERFPV
Copy link
Author

Hmm, that's notable: getPacketLength() is returning 0. I'm explicitly setting the packet length in the transmit call so I have no reason to doubt the packet should have length > 0.

@MUSTARDTIGERFPV
Copy link
Author

It's worth saying: very very similar code (see LoRa_SX127X.cpp) works flawlessly on the SX127X series of chips, it's only the SX1280 that's giving me trouble.

@jgromes
Copy link
Owner

jgromes commented Mar 31, 2023

I was not able to replicate the issue, until I noticed that you are using implicit header mode. A bit more digging revealed that this mode is not actually handled properly, and getPacketLength always returns 0. It should be fixed in the latest commit, thanks for reporting!

@jgromes jgromes closed this as completed Mar 31, 2023
@jgromes jgromes added the resolved Issue was resolved (e.g. bug fixed, or feature implemented) label Mar 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working resolved Issue was resolved (e.g. bug fixed, or feature implemented)
Projects
None yet
Development

No branches or pull requests

2 participants