measuring bitrate and arbitration delays #14

Open
john30 opened this issue Sep 3, 2023 · 8 comments
Labels
wontfix This will not be worked on

Comments

@john30
Owner

john30 commented Sep 3, 2023

this is a continuation of john30/ebusd#891 as it was misplaced there.

the recently released firmware is capable of analysing the signal wrt bitrate and arbitration delay.

in order to use that feature, the adapter needs to be updated and then the following steps need to be performed:

  • connect to ebus
  • don't run an ebusd connected to it (in order to really analyse only the target and not the adapter itself)
  • open the REPL (e.g. via web UI)
  • enter "ebus -v" there several times in short intervals

the interesting part of the output after each "ebus -v" will look like this:

init_ebus: detected: 2412 Hz on 6 edges with 1990 +/1990 - edge width, 995 H/994 L pulse width, i.e. 415 us H/414 us L period
init_ebus: master 8 0x31 delay=197 us [193-198]
init_ebus: master 25 0xff delay=197 us [196-205]

the first line shows bitrate measurement details with the calculated bitrate, min. clock counts for consecutive pos/neg edges, min. clock counts for H/L pulse widths, as well as calculated H/L pulse widths in microseconds.
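
For comparing several readings, the fields of that first line can be pulled apart with a regular expression. This is only a sketch based on the sample line above; the names `BitrateReading`, `DETECTED_RE` and `parse_detected` are made up for illustration and are not part of the firmware or ebusd.

```python
import re
from typing import NamedTuple, Optional


class BitrateReading(NamedTuple):
    """Fields of one 'detected:' line, in the order they appear."""
    hz: int               # calculated bitrate
    edges: int            # number of edges the measurement is based on
    pos_edge_counts: int  # min. clock counts between positive edges
    neg_edge_counts: int  # min. clock counts between negative edges
    high_counts: int      # min. clock counts of the H pulse
    low_counts: int       # min. clock counts of the L pulse
    high_us: int          # H pulse width in microseconds
    low_us: int           # L pulse width in microseconds


# Pattern written against the sample output shown in this thread.
DETECTED_RE = re.compile(
    r"detected: (\d+) Hz on (\d+) edges with (\d+) \+/(\d+) - edge width, "
    r"(\d+) H/(\d+) L pulse width, i\.e\. (\d+) us H/(\d+) us L period"
)


def parse_detected(line: str) -> Optional[BitrateReading]:
    """Return the parsed reading, or None if the line has a different shape."""
    match = DETECTED_RE.search(line)
    if match is None:
        return None
    return BitrateReading(*(int(group) for group in match.groups()))


line = (
    "init_ebus: detected: 2412 Hz on 6 edges with 1990 +/1990 - edge width, "
    "995 H/994 L pulse width, i.e. 415 us H/414 us L period"
)
reading = parse_detected(line)
print(reading)
```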

at 2400 Bd, the target H/L pulse width is 416 us, which is basically 1000 clock counts. the minimal pos/neg edge width is targeted at 2000.
so in the above example it looks like the actual bitrate on the line is a little bit higher than it should be, i.e. 2412 Bd instead of 2400, which is 0.5%. this is still tolerable, but when it exceeds 1.2% it might be problematic.
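
The arithmetic behind these numbers can be checked with a few lines. This is a rough sketch, not firmware code: the 2.4 MHz counting clock is an assumption derived from "1000 clock counts per 416 us", and the helper names are made up for illustration.

```python
NOMINAL_BAUD = 2400        # eBUS nominal bitrate
CLOCK_HZ = 2_400_000       # assumption: 1000 counts per nominal bit time
TOLERANCE_PERCENT = 1.2    # limit mentioned above


def deviation_percent(measured_hz: float) -> float:
    """Relative deviation of a measured bitrate from 2400 Bd, in percent."""
    return (measured_hz - NOMINAL_BAUD) / NOMINAL_BAUD * 100.0


def counts_to_us(counts: int) -> float:
    """Convert clock counts to microseconds under the assumed clock."""
    return counts / CLOCK_HZ * 1_000_000


def within_tolerance(measured_hz: float) -> bool:
    """True if the deviation stays inside the 1.2% limit."""
    return abs(deviation_percent(measured_hz)) <= TOLERANCE_PERCENT


print(f"{deviation_percent(2412):.2f} %")  # 0.50 %
print(f"{counts_to_us(995):.1f} us")       # 414.6 us
print(within_tolerance(2412))              # True
```

Under that assumption, 995 and 994 counts round to the 415 us and 414 us printed in the sample output.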

anyway, the firmware now also allows compensating for such situations by using the new bitrate deviation option in the web UI. this way, a slightly higher bitrate can be set which will then be used for transmission as well as reception.

the other 2 lines reveal the seen master addresses, independent of whether the message was valid wrt CRC, so when arbitration errors occur, "fantasy" master addresses would appear here.
anyway, for each master it shows its current arbitration delay as well as the minimum and maximum seen for it since boot.

this allows for using the second feature of the adapter: to adjust the arbitration delay.
the default for the delay is 200 us, but when other participants have a lower delay, then this should be adjusted accordingly.
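
To see at a glance whether any participant arbitrates faster than the 200 us default, the master lines can be collected and the lowest observed delay picked out. This is a sketch under the assumption that the lines always look like the sample above; `MASTER_RE` and `parse_masters` are illustrative names, and which value to finally configure is left to the reader.

```python
import re
from typing import Dict, Tuple

DEFAULT_DELAY_US = 200  # adapter default mentioned above

# Pattern written against the sample "master" lines in this thread.
MASTER_RE = re.compile(
    r"master (\d+) (0x[0-9a-fA-F]{2}) delay=(\d+) us \[(\d+)-(\d+)\]"
)


def parse_masters(output: str) -> Dict[str, Tuple[int, int, int]]:
    """Map master address -> (current, minimum, maximum) delay in us."""
    masters = {}
    for match in MASTER_RE.finditer(output):
        _number, address, current, lowest, highest = match.groups()
        masters[address] = (int(current), int(lowest), int(highest))
    return masters


output = """\
init_ebus: master 8 0x31 delay=197 us [193-198]
init_ebus: master 25 0xff delay=197 us [196-205]
"""

masters = parse_masters(output)
lowest_seen = min(lowest for _current, lowest, _highest in masters.values())
print(masters)
print(lowest_seen, lowest_seen < DEFAULT_DELAY_US)  # 193 True
```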

@maffi-git

I've done the measuring as described and I get weird readings:

init_ebus: already switched to enhanced eBUS mode on TCP port.
init_ebus: clock frq 80000000 div 33 1/3, src clk 4
init_ebus: detected: 2575 Hz on 244 edges with 1980 +/1982 - edge width, 1005 H/858 L pulse width, i.e. 419 us H/358 us L period
init_ebus: master 2 0x10 delay=210 us [208-234]
execute: ebus -v
init_ebus: already switched to enhanced eBUS mode on TCP port.
init_ebus: clock frq 80000000 div 33 1/3, src clk 4
init_ebus: detected: 2711 Hz on 1023 edges with 1980 +/1982 - edge width, 910 H/858 L pulse width, i.e. 379 us H/358 us L period
init_ebus: master 2 0x10 delay=216 us [208-234]
execute: ebus -v
init_ebus: already switched to enhanced eBUS mode on TCP port.
init_ebus: clock frq 80000000 div 33 1/3, src clk 4
init_ebus: detected: 2424 Hz on 254 edges with 1980 +/1983 - edge width, 1120 H/859 L pulse width, i.e. 467 us H/358 us L period
init_ebus: master 2 0x10 delay=216 us [208-234]

I'm starting to think that my connection issues with that Vaillant auroCOMPACT are because of the large deviations. In most cases it says 2424Hz. But 2711 is quite out of bounds.

I tried quite a lot of things, but nothing enabled a stable connection to the heater. Is it "broken"?

@john30
Owner Author

john30 commented Dec 17, 2023

the more edges are counted (i.e. the longer the time between two "ebus -v" calls is), the more unreliable the measurement is, so the 2711 Hz reading is not really reliable. I guess your device is right at the edge with 2424 Hz, so I'd suggest setting the deviation to 30 and seeing if that helps.
"stable connection" meaning communication with the devices on the bus I guess? or do you refer to WIFI stability? in the latter case, try with the new firmware version released today, as it contains several commits in ESP-IDF wrt WIFI

@maffi-git

Sorry for getting back so late.
Stable would mean, for me, retrieving data via the command line on my Raspberry Pi running ebusd. Besides successful responses, I receive various error messages, like "SYN Received", "received ERR: invalid argument" or "wrong symbol received".
My ping times to the adapter look like this, usually around 2-4 ms:
18 packets transmitted, 18 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 2.041/4.520/14.681/3.697 ms
I had to move the wifi-mesh-station next to the heater or the adapter would connect to a different base. Bases are connected over ethernet then.
For now I tried your latest firmware, which improves the situation a bit. I also switched the raspberry pi to an ethernet-connection.

I'm thinking about ditching the network connection, moving my Pi next to the heater and using USB.

@john30
Owner Author

john30 commented Mar 31, 2024

you might want to give the current version a try

@Zolli

Zolli commented Aug 29, 2024

Hey @john30, thanks for this great guide on adjusting these 2 values, but I need some clarification, if you are able to help me a bit.
I have an Ariston Nimbus split HP system and added the adapter (C6 version). ebusd runs on a Raspberry Pi 3B in a container. The connection between the adapter and the Raspberry Pi is USB.

Ebusd config:

EBUSD_DEVICE='ens:/dev/ttyACM0'
EBUSD_CONFIGPATH='/etc/ebusd'
EBUSD_RECEIVETIMEOUT=35
EBUSD_LATENCY=10
EBUSD_POLLINTERVAL=5
EBUSD_SENDRETRIES=5
EBUSD_ACQUIRERETRIES=2
EBUSD_ACQUIRETIMEOUT=15
EBUSD_LOG='all:error'

# MQTT
EBUSD_MQTTHOST='172.10.1.30'
EBUSD_MQTTPORT=1883
EBUSD_MQTTUSER='ebusd'
EBUSD_MQTTPASS='****'
EBUSD_MQTTINT='/etc/ebusd/mqtt/homeassistant.cfg'
EBUSD_MQTTTOPIC='ebusd/%circuit/%name'

Command argument:

--mqttjson --lograwdata --mqttqos=1

I configured polling for some values (around 30 or so, at a 5 s interval) and sometimes get "arbitration lost" messages in my log:

...
Ebusd  | 2024-08-29 10:34:56.359 [bus error] poll heatpump heatpump_comp_discharge_temp failed: ERR: arbitration lost
Ebusd  | 2024-08-29 10:37:26.405 [bus error] poll heatpump heatpump_exp_valve failed: ERR: arbitration lost
Ebusd  | 2024-08-29 10:40:20.331 [bus error] poll heatpump heatpump_LWT_temp failed: ERR: arbitration lost
Ebusd  | 2024-08-29 10:40:38.469 [bus error] poll heatpump heatpump_comp_discharge_temp failed: ERR: arbitration lost
Ebusd  | 2024-08-29 10:41:38.325 [bus error] poll energymgr dhw_store_temp failed: ERR: arbitration lost
Ebusd  | 2024-08-29 10:51:20.331 [bus error] poll heatpump heatpump_compr_frequency failed: ERR: arbitration lost

I ran the procedure you described and got the following output:

 easi>
Build: 20240825 (on swap partition)
ebusd device string: ens:/dev/ttyACM0 (number may vary)
WiFi station/client: Sun-IoT_2G, 172.10.10.215, 100% (-47dBm)
WiFi access point: inactive
Chip ID: 543204291be0, ESP32-C6, rev 1
Hostname: ebus-fffe29
Up time: 319
Free heap: 203264 / 367720
ebusd connected: yes (inactive)
eBUS signal: acquired
 REPL

execute: ebus -v
init_ebus: already switched to enhanced eBUS mode on JTAG. updating verbosity.
init_ebus: detected: 2429 Hz on 91 edges with 1976 +/1976 - edge width, 986 H/989 L pulse width, i.e. 411 us H/412 us L period
init_ebus: master 1 0x00 delay=186 us [185-230]
init_ebus: master 2 0x10 delay=201 us [188-207]
init_ebus: master 4 0x70 delay=225 us [180-231]
init_ebus: master 11 0x03 delay=198 us [198-198]
init_ebus: master 12 0x13 delay=207 us [181-215]
init_ebus: master 24 0x7f delay=294 us [257-355]

execute: ebus -v
init_ebus: already switched to enhanced eBUS mode on JTAG.
init_ebus: detected: 2434 Hz on 734 edges with 1973 +/1974 - edge width, 985 H/985 L pulse width, i.e. 410 us H/410 us L period
init_ebus: master 1 0x00 delay=207 us [185-230]
init_ebus: master 2 0x10 delay=201 us [188-207]
init_ebus: master 4 0x70 delay=210 us [180-231]
init_ebus: master 11 0x03 delay=198 us [198-198]
init_ebus: master 12 0x13 delay=207 us [181-215]
init_ebus: master 24 0x7f delay=294 us [257-355]

execute: ebus -v
init_ebus: already switched to enhanced eBUS mode on JTAG.
init_ebus: detected: 2507 Hz on 334 edges with 1975 +/1975 - edge width, 925 H/988 L pulse width, i.e. 385 us H/412 us L period
init_ebus: master 1 0x00 delay=207 us [185-230]
init_ebus: master 2 0x10 delay=201 us [188-207]
init_ebus: master 4 0x70 delay=210 us [180-231]
init_ebus: master 11 0x03 delay=198 us [198-198]
init_ebus: master 12 0x13 delay=206 us [181-215]
init_ebus: master 24 0x7f delay=294 us [257-355]

execute: ebus -v
init_ebus: already switched to enhanced eBUS mode on JTAG.
init_ebus: detected: 2790 Hz on 190 edges with 1975 +/1975 - edge width, 731 H/988 L pulse width, i.e. 305 us H/412 us L period
init_ebus: master 1 0x00 delay=207 us [185-230]
init_ebus: master 2 0x10 delay=201 us [188-207]
init_ebus: master 4 0x70 delay=210 us [180-231]
init_ebus: master 11 0x03 delay=186 us [186-198]
init_ebus: master 12 0x13 delay=206 us [181-215]
init_ebus: master 24 0x7f delay=294 us [257-355]

execute: ebus -v
init_ebus: already switched to enhanced eBUS mode on JTAG.
init_ebus: detected: 2431 Hz on 152 edges with 1975 +/1975 - edge width, 985 H/988 L pulse width, i.e. 410 us H/412 us L period
init_ebus: master 1 0x00 delay=207 us [185-230]
init_ebus: master 2 0x10 delay=201 us [188-207]
init_ebus: master 4 0x70 delay=210 us [180-231]
init_ebus: master 11 0x03 delay=186 us [186-198]
init_ebus: master 12 0x13 delay=206 us [181-215]
init_ebus: master 24 0x7f delay=294 us [257-355]

execute: ebus -v
init_ebus: already switched to enhanced eBUS mode on JTAG.
init_ebus: detected: 2434 Hz on 206 edges with 1974 +/1974 - edge width, 985 H/985 L pulse width, i.e. 410 us H/410 us L period
init_ebus: master 1 0x00 delay=207 us [185-230]
init_ebus: master 2 0x10 delay=201 us [188-207]
init_ebus: master 4 0x70 delay=224 us [180-231]
init_ebus: master 11 0x03 delay=186 us [186-198]
init_ebus: master 12 0x13 delay=206 us [181-215]
init_ebus: master 24 0x7f delay=294 us [257-355]

I discarded the entries with high values (above 2450 Hz) and figured that 35 is a good value to set as "bitrate deviation", but this did not resolve it.
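
The filtering step can be reproduced roughly as follows, using the six "detected" values from the output above and the 2450 Hz cut-off. This is only a sketch: the unit of the "bitrate deviation" setting is not stated in this thread, so it reports the averaged bitrate and its percentage offset from 2400 Bd rather than a value for the web UI.

```python
from statistics import mean

# "detected: ... Hz" values from the six "ebus -v" runs above
readings_hz = [2429, 2434, 2507, 2790, 2431, 2434]

CUTOFF_HZ = 2450     # readings above this are treated as outliers
NOMINAL_BAUD = 2400  # eBUS nominal bitrate

# Keep only the plausible readings and average them.
kept = [hz for hz in readings_hz if hz <= CUTOFF_HZ]
average_hz = mean(kept)
deviation = (average_hz - NOMINAL_BAUD) / NOMINAL_BAUD * 100

print(kept)                  # [2429, 2434, 2431, 2434]
print(average_hz)            # 2432
print(f"{deviation:.2f} %")  # 1.33 %
```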

After that I looked at the reported master delays and increased the arbitration delay to 300, but after ~2 hrs this setting completely killed the bus in my system: the bus was reset and was not able to rediscover devices until I force-shut down the indoor unit and restarted it.

After setting this back to 200 it looks like it is working stably again, but the mentioned errors still occur from time to time.
A bit of an update: the bus was killed again after roughly 6 hrs of working fine. Sadly I don't have any logs. I reverted to the default settings (200 and 0), let's see what happens next.

Update:
This was in my logs just before bus collapsed:

Ebusd  | 2024-08-29 14:46:26.332 [bus error] poll heatpump heatpump_LWT_temp failed: ERR: arbitration lost
Ebusd  | 2024-08-29 14:54:14.361 [bus error] poll heatpump heatpump_LWT_temp failed: ERR: arbitration lost
Ebusd  | 2024-08-29 14:59:56.438 [bus error] poll heatpump heatpump_LWT_temp failed: ERR: arbitration lost
Ebusd  | 2024-08-29 15:03:02.076 [bus error] device status: buffer overflow
Ebusd  | 2024-08-29 15:03:02.076 [bus error] device status: buffer overflow
Ebusd  | 2024-08-29 15:03:04.047 [bus error] poll heatpump heatpump_electric_heater failed: ERR: no signal
Ebusd  | 2024-08-29 15:03:04.047 [bus error] signal lost
Ebusd  | 2024-08-29 15:03:07.065 [bus error] signal lost
Ebusd  | 2024-08-29 15:03:17.056 [bus error] signal lost

Update:
Happened again, after 14 hrs of uptime. I saw that my Pi rebooted just before my monitoring system reported the adapter as down (ebus.signal is false on the v1/status endpoint). Can this cause that error to happen? Like, maybe the adapter writes some strange character to the bus on boot; I always connected the eBUS wires when the adapter had already booted up.

@john30
Owner Author

john30 commented Sep 5, 2024

@Zolli as this is a multi-master bus with arbitration access, arbitration errors are normal and nothing unexpected. especially when polling, ebusd does not immediately repeat a request if arbitration was lost, since it is going to be polled again later on anyway
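
To put the frequency into perspective, the timestamps from the first log excerpt can be tallied. This is a rough estimate only: it assumes one poll every 5 seconds (taken from EBUSD_POLLINTERVAL=5 in the config above), and since the window starts at the first error it slightly overstates the rate. The helper name is made up for illustration.

```python
from datetime import datetime

POLL_INTERVAL_S = 5  # assumption: one poll request every 5 seconds

log = """\
Ebusd  | 2024-08-29 10:34:56.359 [bus error] poll heatpump heatpump_comp_discharge_temp failed: ERR: arbitration lost
Ebusd  | 2024-08-29 10:37:26.405 [bus error] poll heatpump heatpump_exp_valve failed: ERR: arbitration lost
Ebusd  | 2024-08-29 10:40:20.331 [bus error] poll heatpump heatpump_LWT_temp failed: ERR: arbitration lost
Ebusd  | 2024-08-29 10:40:38.469 [bus error] poll heatpump heatpump_comp_discharge_temp failed: ERR: arbitration lost
Ebusd  | 2024-08-29 10:41:38.325 [bus error] poll energymgr dhw_store_temp failed: ERR: arbitration lost
Ebusd  | 2024-08-29 10:51:20.331 [bus error] poll heatpump heatpump_compr_frequency failed: ERR: arbitration lost
"""


def arbitration_lost_times(text):
    """Collect the timestamps of all 'arbitration lost' lines."""
    times = []
    for line in text.splitlines():
        if "ERR: arbitration lost" not in line:
            continue
        # The timestamp is the first 23 characters after the "|" separator.
        stamp = line.split("|", 1)[1].strip()[:23]
        times.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f"))
    return times


times = arbitration_lost_times(log)
span_s = (times[-1] - times[0]).total_seconds()
estimated_polls = span_s / POLL_INTERVAL_S
rate = len(times) / estimated_polls

print(len(times), "errors in", round(span_s), "s")  # 6 errors in 984 s
print(f"{rate:.1%} of estimated polls")             # 3.0% of estimated polls
```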

@Zolli

Zolli commented Sep 5, 2024

@Zolli as this is a multi-master bus with arbitration access, arbitration errors are normal and nothing unexpected. especially when polling, ebusd does not immediately repeat a request if arbitration was lost, since it is going to be polled again later on anyway

Thanks, I know that, I just assumed these "errors" are caused by some other event. Do you maybe have some ideas about the other part of my comment, where I mentioned that my system bus collapsed 3 times when the adapter was connected?

Just to mention, I have a hunch, but I'm really curious about your opinion as well.

(Maybe the SD card is failing in my Pi. I saw a few times that commands got executed very slowly (not that the execution itself is slow, it felt like huge delays), and maybe during one of these delays ebusd tried to write something incomplete to the bus that caused the whole bus to collapse. I will get back from holiday within a week and try to get some low-power x86 platform to run ebusd on.)

@Zolli

Zolli commented Oct 22, 2024

Just a small update on this: I switched to a small N3350-based mini PC today. I will wait and see what happens in the coming days; right now it works as expected.

But I still need to mention that the 'ebus -v' command on the adapter REPL works, but after I enter it, it causes my system bus to collapse. I still don't know why; the adapter logs do not contain anything useful about why this happens. Fortunately, an indoor unit (this is a split HP system) restart fixes it.

3 participants