Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FAILURE at src/lmic/lmic.c:1606 #209

Closed
frazar opened this issue Jan 24, 2019 · 7 comments
Closed

FAILURE at src/lmic/lmic.c:1606 #209

frazar opened this issue Jan 24, 2019 · 7 comments

Comments

@frazar
Copy link

frazar commented Jan 24, 2019

I am experiencing a FAILURE using LMIC with cfg_EU868 along with a private gateway, which then connects to a private lora-server.

I am able to reproduce the failure consistently by simply running the ttn-otaa.ino example sketch. I can not predict when it happens, but usually it happens within the first 5 minutes.

The content of lmic_project_config.h is

#define CFG_sx1276_radio 1
#define CFG_eu868 1
#define DISABLE_BEACONS
#define DISABLE_PING
#define LMIC_DEBUG_LEVEL 2
#define LMIC_ENABLE_DeviceTimeReq 1
#define LMIC_PRINTF_TO Serial

The serial output is

Starting
RXMODE_RSSI
168: engineUpdate, opmode=0x8
Packet queued
214: EV_JOINING
233: engineUpdate, opmode=0xc
423613: engineUpdate, opmode=0xc
423891: EV_TXSTART
423916: engineUpdate, opmode=0x88c
424038: TXMODE, freq=868100000, len=23, SF=7, BW=125, CR=4/5, IH=0
613235: setupRx1 txrxFlags 00 --> 01
start single rx: now-rxtime: -126905
613374: RXMODE_SINGLE, freq=868100000, SF=7, BW=125, CR=4/5, IH=0
rxtimeout: entry: 613341 rxtime: 740242 entry-rxtime: -126901 now-entry: 7 rxtime-txend: 312596
805595: setupRx2 txrxFlags 0x1 --> 02
start single rx: now-rxtime: 6
805749: RXMODE_SINGLE, freq=869525000, SF=12, BW=125, CR=4/5, IH=0
rxtimeout: entry: 4624263 rxtime: 805718 entry-rxtime: 3818545 now-entry: 8 rxtime-txend: 378072
4624478: processRx2Jacc txrxFlags 0x2 --> 00
4624690: engineUpdate, opmode=0xc
4638588: engineUpdate, opmode=0xc
4646157: engineUpdate, opmode=0xc
4654595: engineUpdate, opmode=0xc
4676344: engineUpdate, opmode=0xc
4686463: engineUpdate, opmode=0xc
4691190: engineUpdate, opmode=0xc
5075687: engineUpdate, opmode=0xc
5075967: EV_TXSTART
5075996: engineUpdate, opmode=0x88c
5076119: TXMODE, freq=868300000, len=23, SF=7, BW=125, CR=4/5, IH=0
5392200: setupRx1 txrxFlags 00 --> 01
start single rx: now-rxtime: -58
5392292: RXMODE_SINGLE, freq=868300000, SF=7, BW=125, CR=4/5, IH=0
5395438: EV_JOINED
netid: 0
devaddr: XXXXXX
artKey: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
nwkKey: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
5395803: engineUpdate, opmode=0x808
5396422: EV_TXSTART
5396451: engineUpdate, opmode=0x888
5396576: TXMODE, freq=868100000, len=26, SF=7, BW=125, CR=4/5, IH=0
5462656: setupRx1 txrxFlags 0x1 --> 01
start single rx: now-rxtime: -57
5462749: RXMODE_SINGLE, freq=868100000, SF=7, BW=125, CR=4/5, IH=0
5465826: process options (olen=0x6)
5465866: LinkAdrReq: p1:52 chmap:0007 chpage:00 uprt:01 ans:87
5466015: decodeFrame txrxFlags 0x1 --> 21
5466237: Received downlink, window=RX1, port=-1, ack=0, txrxFlags=0x21
5466614: EV_TXCOMPLETE (includes waiting for RX windows)
5466924: engineUpdate, opmode=0x800
6716924: engineUpdate, opmode=0x808
6717375: EV_TXSTART
6717404: engineUpdate, opmode=0x888
6717532: TXMODE, freq=868300000, len=31, SF=7, BW=125, CR=4/5, IH=0
Packet queued
6784253: setupRx1 txrxFlags 0x21 --> 01
start single rx: now-rxtime: 7
6784343: RXMODE_SINGLE, freq=868300000, SF=7, BW=125, CR=4/5, IH=0
6787419: process options (olen=0x5)
6787459: LinkAdrReq: p1:54 chmap:0007 chpage:00 uprt:01 ans:87
6787607: decodeFrame txrxFlags 0x1 --> 21
6787830: Received downlink, window=RX1, port=-1, ack=0, txrxFlags=0x21
6788207: EV_TXCOMPLETE (includes waiting for RX windows)
6788517: engineUpdate, opmode=0x800
8038517: engineUpdate, opmode=0x808
8038968: EV_TXSTART
8038996: engineUpdate, opmode=0x888
8039058: TXMODE, freq=868500000, len=28, SF=7, BW=125, CR=4/5, IH=0
Packet queued
8105459: setupRx1 txrxFlags 0x21 --> 01
start single rx: now-rxtime: 6
8105549: RXMODE_SINGLE, freq=868500000, SF=7, BW=125, CR=4/5, IH=0
8108626: process options (olen=0x5)
8108666: LinkAdrReq: p1:55 chmap:0007 chpage:00 uprt:01 ans:87
8108815: decodeFrame txrxFlags 0x1 --> 21
8109037: Received downlink, window=RX1, port=-1, ack=0, txrxFlags=0x21
8109415: EV_TXCOMPLETE (includes waiting for RX windows)
8109724: engineUpdate, opmode=0x800
9359724: engineUpdate, opmode=0x808
9360179: EV_TXSTART
9360207: engineUpdate, opmode=0x888
9360269: TXMODE, freq=868100000, len=28, SF=7, BW=125, CR=4/5, IH=0
Packet queued
9426670: setupRx1 txrxFlags 0x21 --> 01
start single rx: now-rxtime: 6
9426760: RXMODE_SINGLE, freq=868100000, SF=7, BW=125, CR=4/5, IH=0
9429837: process options (olen=0x5)
9429877: LinkAdrReq: p1:57 chmap:0007 chpage:00 uprt:01 ans:87
9430026: decodeFrame txrxFlags 0x1 --> 21
9430248: Received downlink, window=RX1, port=-1, ack=0, txrxFlags=0x21
9430626: EV_TXCOMPLETE (includes waiting for RX windows)
9430936: engineUpdate, opmode=0x800
10680936: engineUpdate, opmode=0x808
10681388: EV_TXSTART
10681420: engineUpdate, opmode=0x888
10681483: TXMODE, freq=868300000, len=28, SF=7, BW=125, CR=4/5, IH=0
Packet queued
10747884: setupRx1 txrxFlags 0x21 --> 01
start single rx: now-rxtime: 6
10747974: RXMODE_SINGLE, freq=868300000, SF=7, BW=125, CR=4/5, IH=0
10750412: process options (olen=0)
10750448: decodeFrame txrxFlags 0x1 --> 21
10750493: Received downlink, window=RX1, port=-1, ack=0, txrxFlags=0x21
10750872: EV_TXCOMPLETE (includes waiting for RX windows)
10751187: engineUpdate, opmode=0x800
12001188: engineUpdate, opmode=0x808
12001640: EV_TXSTART
12001672: engineUpdate, opmode=0x888
12001734: TXMODE, freq=868500000, len=26, SF=7, BW=125, CR=4/5, IH=0
Packet queued
12067815: setupRx1 txrxFlags 0x21 --> 01
start single rx: now-rxtime: 6
12067905: RXMODE_SINGLE, freq=868500000, SF=7, BW=125, CR=4/5, IH=0
rxtimeout: entry: 13673215 rxtime: 12067938 entry-rxtime: 1605277 now-entry: 9 rxtime-txend: 62596
13673446: setupRx2 txrxFlags 0x1 --> 02
start single rx: now-rxtime: 1540270
13673681: RXMODE_SINGLE, freq=869525000, SF=12, BW=125, CR=4/5, IH=0
rxtimeout: entry: 17185682 rxtime: 12133414 entry-rxtime: 5052268 now-entry: 9 rxtime-txend: 128072
17185914: processRx2DnData txrxFlags 0x2 --> 00
17478418: processDnData txrxFlags 00 --> 20
17478464: EV_TXCOMPLETE (includes waiting for RX windows)
17478630: engineUpdate, opmode=0x800
18728631: engineUpdate, opmode=0x808
18729086: EV_TXSTART
18729118: engineUpdate, opmode=0x888
18729245: TXMODE, freq=868100000, len=26, SF=7, BW=125, CR=4/5, IH=0
Packet queued
18795262: setupRx1 txrxFlags 0x20 --> 01
start single rx: now-rxtime: 6
18795352: RXMODE_SINGLE, freq=868100000, SF=7, BW=125, CR=4/5, IH=0
rxtimeout: entry: 19909125 rxtime: 18795385 entry-rxtime: 1113740 now-entry: 10 rxtime-txend: 62596
19909363: setupRx2 txrxFlags 0x1 --> 02
start single rx: now-rxtime: 1048739
19909597: RXMODE_SINGLE, freq=869525000, SF=12, BW=125, CR=4/5, IH=0
rxtimeout: entry: 19919282 rxtime: 18860861 entry-rxtime: 1058421 now-entry: 7 rxtime-txend: 128072
19919539: processRx2DnData txrxFlags 0x2 --> 00
20189262: processDnData txrxFlags 00 --> 20
20189307: EV_TXCOMPLETE (includes waiting for RX windows)
20189474: engineUpdate, opmode=0x800
FAILURE 
/home/frazar/Arduino/libraries/MCCI_LoRaWAN_LMIC_library/src/lmic/lmic.c:1606
@terrillmoore
Copy link
Member

terrillmoore commented Jan 24, 2019

Line for reference:

ASSERT((LMIC.opmode & OP_TXRXPEND)!=0);

This is checking internal state. I wonder if this is a bug due to the fact that you're receiving during window 2? I say this because I've never seen a normal RX during window 2 before, and it's (of course) the networks' option to use or not use window 2. More study needed.

Instead of the ASSERT, can we try something like:

if ((LMIC.opmode & OP_TXRXPEND) == 0) {
   // debug print -- at least the LMIC.opmode value.
   // trick to get a useful message in the assert:
   ASSERT(0 && "LMIC.opmode & OP_TXRXPEND must be non-zero");
}

@frazar
Copy link
Author

frazar commented Jan 25, 2019

So, I add the following lines inside the if of the snippet you provided (because otherwise I wouldn't get any message)

LMIC_DEBUG_PRINTF("LMIC.opmode: %d\n", LMIC.opmode);
delay(1000);

The result is the following

Starting
RXMODE_RSSI
208: engineUpdate, opmode=0x8
Packet queued
256: EV_JOINING
274: engineUpdate, opmode=0xc
294948: engineUpdate, opmode=0xc
295226: EV_TXSTART
295252: engineUpdate, opmode=0x88c
295326: TXMODE, freq=868300000, len=23, SF=7, BW=125, CR=4/5, IH=0
611346: setupRx1 txrxFlags 00 --> 01
start single rx: now-rxtime: -57
611437: RXMODE_SINGLE, freq=868300000, SF=7, BW=125, CR=4/5, IH=0
614622: EV_JOINED
netid: 0
devaddr: XXXXXX
artKey: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
nwkKey: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
614974: engineUpdate, opmode=0x808
615588: EV_TXSTART
615613: engineUpdate, opmode=0x888
615689: TXMODE, freq=868100000, len=26, SF=7, BW=125, CR=4/5, IH=0
681708: setupRx1 txrxFlags 0x1 --> 01
start single rx: now-rxtime: 7
681798: RXMODE_SINGLE, freq=868100000, SF=7, BW=125, CR=4/5, IH=0
684981: process options (olen=0x6)
685021: LinkAdrReq: p1:52 chmap:0007 chpage:00 uprt:01 ans:87
685212: decodeFrame txrxFlags 0x1 --> 21
685429: Received downlink, window=RX1, port=-1, ack=0, txrxFlags=0x21
685801: EV_TXCOMPLETE (includes waiting for RX windows)
686107: engineUpdate, opmode=0x800
1592840: setupRx2 txrxFlags 0x21 --> 02
start single rx: now-rxtime: 845706
1593004: RXMODE_SINGLE, freq=869525000, SF=12, BW=125, CR=4/5, IH=0
rxtimeout: entry: 1602695 rxtime: 747306 entry-rxtime: 855389 now-entry: 31 rxtime-txend: 128072
1603186: processRx2DnData txrxFlags 0x2 --> 00
LMIC.opmode: 2048

@terrillmoore
Copy link
Member

Sorry for the slow response.

Here are the bits from the opmode:

enum { OP_NONE = 0x0000,
OP_SCAN = 0x0001, // radio scan to find a beacon
OP_TRACK = 0x0002, // track my networks beacon (netid)
OP_JOINING = 0x0004, // device joining in progress (blocks other activities)
OP_TXDATA = 0x0008, // TX user data (buffered in pendTxData)
OP_POLL = 0x0010, // send empty UP frame to ACK confirmed DN/fetch more DN data
OP_REJOIN = 0x0020, // occasionally send JOIN REQUEST
OP_SHUTDOWN = 0x0040, // prevent MAC from doing anything
OP_TXRXPEND = 0x0080, // TX/RX transaction pending
OP_RNDTX = 0x0100, // prevent TX lining up after a beacon
OP_PINGINI = 0x0200, // pingable is initialized and scheduling active
OP_PINGABLE = 0x0400, // we're pingable
OP_NEXTCHNL = 0x0800, // find a new channel
OP_LINKDEAD = 0x1000, // link was reported as dead
OP_TESTMODE = 0x2000, // developer test mode
};

This looks like OP_NEXTCHNL is set. Hmm. It looks like the high-level sequence of events is:

  1. Join.
  2. Do an uplink.
  3. get a downlink and get a mac request that does LinkAdrReq.
  4. Then there's a second RX (after the EV_TXCOMPLETE). That's a little strange as this is a Class A device, and should not try to receive at all.

The time stamps suggest about 65000 os-ticks/second in your setup, so this later time is about 13 seconds after the last operation.

This looks like a memory corruption problem or some other "meta" problem. I have never seen the LMIC behave like this, and the only way to cause an RX2 event at that point is for an osjob_t to get rescheduled without setting it up properly. Nothing in the trace suggests that the LMIC is doing this. If you're using osjob_t yourself, look for a double use of a block or something similar that causes a dangling pointer to be de-referenced.

@frazar
Copy link
Author

frazar commented Mar 4, 2019

Actually, the code being run is that of the ttn-otaa example.
However, I'm not able to reproduce the problem consistently anymore.
If this is related to receiving a packet in the 2nd window, is there a way to force the gateway to use the 2nd receive window?

@terrillmoore
Copy link
Member

terrillmoore commented Mar 16, 2019

I think this is related to a complex app plus #249. The multiple events are exactly the symptom of unanticipated re-entrancy to engineUpdate() (engineUpdate -> reportEvent -> client code -> API incall -> engineUpdate). I saw this kind of stuff while I was getting my head wrapped around how the events really work.

ttn_otaa does a doSend from onEvent; and doSend calls an API that calls engineUpdate. You can check whether you can work around this by using an os job to delay the call to doSend.

Also, btw: the print of EV_TXSTART and the timestamp happens at an awkward time - it uses up some of the timeslot. I'm strongly considering moving all the printing out of onEvent() because ... EV_TXSTART and a few other events are timing critical.

@terrillmoore
Copy link
Member

Based on my experience this weekend, ABP plus #249 is likely to be the issue, combined with the overhead introduced by debug prints. Have you modified your ttn-otaa? The one in the tree disables ABP after a join.

@frazar
Copy link
Author

frazar commented Mar 25, 2019

No, it was the default ttn-otaa example.

However, since I'm not able to reproduce the issue anymore, I suspect it was related to the setup.

Closing for the moment. I will re-open the issue if it occurs again.

@frazar frazar closed this as completed Mar 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants