DAQC2plate - ADC Nodes lockup if external python script to toggle DOUT pins is executed #22

Open
KD4Z opened this issue Aug 16, 2021 · 3 comments

KD4Z commented Aug 16, 2021

This may not be an issue with node-red-contrib-pi-plates, however...

The attached example flow merely reads two ADC values. Import it into Node-RED, configure the Pi-Plate, and deploy.
Verify that the ADC nodes are receiving values.

Log into a shell on the Pi running Node-RED.
Create wigglepin.py from the attachment:

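import piplates.DAQC2plate as DAQC2
import RPi.GPIO as GPIO
import time

try:
    # Toggle DOUT bit 5 on the plate at address 0: high for 0.9 s, low for 0.1 s.
    while True:
        DAQC2.setDOUTbit(0, 5)
        time.sleep(0.9)
        DAQC2.clrDOUTbit(0, 5)
        time.sleep(0.1)

except KeyboardInterrupt:
    # On Ctrl-C, leave the output low and release the GPIO pins.
    DAQC2.clrDOUTbit(0, 5)
    GPIO.cleanup()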

Execute it
python3 wigglepin.py

The DOUT 5 pin should start toggling; sometimes the script instead fails with the call stack listed below.
Notice that the ADC values stop updating and the Pi-Plate is no longer accessible from Node-RED.
Restart the flow. Nope. Still dead.

Restart the Pi to reconnect to the Pi-Plate.

Wash-Rinse-Repeat

Error Callstack

Traceback (most recent call last):
 File "wigglepin.py", line 1, in <module>
   import piplates.DAQC2plate as DAQC2
 File "/usr/local/lib/python3.7/dist-packages/piplates/DAQC2plate.py", line 748, in <module>
   quietPoll()
 File "/usr/local/lib/python3.7/dist-packages/piplates/DAQC2plate.py", line 713, in quietPoll
   getCalVals(i)
 File "/usr/local/lib/python3.7/dist-packages/piplates/DAQC2plate.py", line 727, in getCalVals
   values[j]=CalGetByte(addr,6*i+j)
 File "/usr/local/lib/python3.7/dist-packages/piplates/DAQC2plate.py", line 586, in CalGetByte
   return resp[0]
IndexError: list index out of range

ADC_Demo_flow.json.txt
stepsToRepro.txt
wigglepin.py.txt

mharsch commented Aug 17, 2021

There should really only be one process talking to the Pi Plates at a time. node-pi-plates (which is used by node-red-contrib-pi-plates under the hood) spawns a python co-process (plate_io.py) that imports the python pi-plates module and makes calls to the pi-plates api.

To let multiple processes share the Pi-Plates, we'd need a different architecture in which the process talking to the plates exposes an API that multiple consumers can connect to and make requests through (as pigpiod does for GPIO).
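
To make the single-owner idea concrete, here is a minimal in-process sketch, not anything node-pi-plates actually ships. One worker thread owns the plate backend and every caller goes through a queue, so requests can never interleave. `PlateBroker` and `FakePlate` are hypothetical names invented for this example; a real daemon would put a socket or pipe in front of the queue so that separate processes could connect, and would wrap the real piplates module instead of the fake.

```python
import queue
import threading
from concurrent.futures import Future


class PlateBroker:
    """Owns the plate backend; runs every request on one worker thread."""

    def __init__(self, backend):
        self._backend = backend          # object exposing the plate calls
        self._requests = queue.Queue()   # pending (future, name, args) tuples
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:             # sentinel queued by close()
                break
            future, name, args = item
            try:
                future.set_result(getattr(self._backend, name)(*args))
            except Exception as exc:     # hand the failure to the caller only
                future.set_exception(exc)

    def call(self, name, *args, timeout=None):
        """Queue one request and block until the worker has answered it."""
        future = Future()
        self._requests.put((future, name, args))
        return future.result(timeout=timeout)

    def close(self):
        """Stop the worker after it has drained earlier requests."""
        self._requests.put(None)
        self._worker.join()


class FakePlate:
    """Hypothetical stand-in for the hardware so the sketch runs anywhere."""

    def __init__(self):
        self.dout = set()

    def setDOUTbit(self, addr, bit):
        self.dout.add((addr, bit))

    def clrDOUTbit(self, addr, bit):
        self.dout.discard((addr, bit))

    def getADC(self, addr, channel):
        return 1.25                      # fixed dummy reading
```

With this shape, the ADC reader and the pin toggler would both call `broker.call("getADC", 0, 1)` or `broker.call("setDOUTbit", 0, 5)`, and a failure in one request is reported to that caller without taking down the owner.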

mharsch commented Aug 20, 2021

So, what's happening when the plate(s) appear to stop responding from the Node-RED interface is a crash of the underlying python process that talks to the plates on behalf of the Node-RED pi-plates nodes. This is probably triggered by a reset of the pi plates microcontroller which means a period of time where the python API calls fail. The wiggle script can also fail in a similar way, but since it's making calls less frequently, it usually survives longer than the node-pi-plates plate_io.py process. The plates are actually back to being functional within a second or two, but the node-pi-plates python co-process won't be re-spawned until Node-RED itself is restarted (e.g. systemctl restart nodered).

So, first we should be providing more helpful error messages when our python co-process crashes: Harsch-Systems/node-pi-plates#10

Secondly, we should really survive such crashes by re-spawning the co-process automatically (or, better yet, offer a configuration option to respawn upon python process crash). Harsch-Systems/node-pi-plates#11

We should probably keep a counter of how many times we've restarted and mention that in the error message each time, so the user can easily detect this kind of 'dueling processes' failure scenario.
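
A rough sketch of the respawn-with-counter idea follows. node-pi-plates itself is Node.js, so this Python version only illustrates the logic and is not the project's implementation. The `supervise` function, its parameters, and the message wording are all assumptions made for the example.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=5, delay=1.0, log=print):
    """Run cmd, re-spawning it after each abnormal exit.

    Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(cmd)
        if exit_code == 0:
            return restarts              # clean shutdown: stop supervising
        if restarts >= max_restarts:
            log(f"co-process exited with code {exit_code}; "
                f"giving up after {restarts} restarts")
            return restarts
        restarts += 1
        log(f"co-process exited with code {exit_code}; "
            f"restart #{restarts} (repeated restarts may mean another "
            f"process is also talking to the plates)")
        time.sleep(delay)


if __name__ == "__main__":
    # Stand-in child that always crashes; replace with the real command.
    crashing_child = [sys.executable, "-c", "import sys; sys.exit(1)"]
    supervise(crashing_child, max_restarts=2, delay=0.1)
```

Capping the restarts keeps a permanently broken co-process from spinning forever, and putting the count in every message makes the 'dueling processes' pattern visible in the log.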

@pi-plates

The Pi-Plates microcontroller does not reset unless explicitly told to do so. What can happen (in our later products) is that the processor will "give up" on a data exchange with the RPi if there is no response within 50 ms and reset the I/O process. This may lead to erroneous data being received by the RPi and/or a loss of synchronization. Our older products (DAQC, MOTOR, and RELAY) use a simpler protocol and require lots of undesirable delays to maintain synchronization.
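
If an abandoned exchange shows up on the RPi side as an empty response (the IndexError in the traceback above is one such case), a caller could pause and retry instead of crashing. The wrapper below is a hedged sketch: `call_with_retry` is a hypothetical helper, not part of the piplates module, and the 0.1 s default pause is an assumption chosen only to be longer than the 50 ms timeout. It does not cure two processes contending for the plates; it only rides out a short gap.

```python
import time


def call_with_retry(func, *args, retries=3, delay=0.1):
    """Call func(*args); on IndexError wait and try again.

    Re-raises the last IndexError once all attempts are used up.
    """
    for attempt in range(1, retries + 1):
        try:
            return func(*args)
        except IndexError:
            if attempt == retries:
                raise                    # out of attempts: surface the error
            time.sleep(delay)            # give the plate time to resync
```

Usage would look like `call_with_retry(DAQC2.getADC, 0, 1)`, assuming the failure really does surface as an IndexError in that call path.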
