Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Telnet OTA command is broken #2562

Closed
Sag06 opened this issue Dec 2, 2022 · 42 comments
Closed

Telnet OTA command is broken #2562

Sag06 opened this issue Dec 2, 2022 · 42 comments
Labels

Comments

@Sag06
Copy link

Sag06 commented Dec 2, 2022

Device

Sonoff Mini

Version

ESPURNA_MINIMAL_ARDUINO_OTA dev 220212

Bug description

ESPURNA_MINIMAL_ARDUINO_OTA today dev compilation. Telnet connection is OK. But OTA command doesn't connect to http server. Tested on Sonoff mini three times. Compiled and esptool flashed. After that, I flashed my last compilation, from may 19th and it works flawlessly.

Steps to reproduce

No response

Build tools used

Arduino ide 1.8.19
Putty 0.74
HFS 2.3M

Any relevant log output (when available)

No response

Decoded stack trace (when available)

No response

@Sag06 Sag06 added the bug label Dec 2, 2022
@mcspr
Copy link
Collaborator

mcspr commented Dec 3, 2022

What's the server it fails to connect to?
Does it work after changing http client type?
In general .h, /#define OTA_CLIENT/ and try changing it to httupdate variant

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

I did all tasks as always. Same server. HFS program on Windows 10. Running on same Notebook. In fact, old compilations work fine here. Last one is from may 2022.
Did anything change in general.h from may 2022 to now? I never changed this file.

@mcspr
Copy link
Collaborator

mcspr commented Dec 3, 2022

I have no idea what version is may 2022 though, git commit is the only way to determine build tree contents with certainty
(and why I always ask about it in the issue template, but so far it have not proved to be very effective :)
Updating with ota form latest version to latest version minimal .bin works for me w/ https://github.com/caddyserver/caddy

@mcspr
Copy link
Collaborator

mcspr commented Dec 3, 2022

Does it immediately disconnect after the command? Perhaps there is a crash output / info telling us that there was a crash?
And by 'does not connect', what is ota command output (if any)?
Is it trying to open a domain name or an IP address? (i.e. ota http://blah.lan/my-firmware.bin.gz)

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

Now, I tried from same Sonoff Mini flashed with compillation "full". I mean, Web UI, home assistant, mqtt, etc. Same results, ota command doesn't work.

[026344] [TELNET] Connected 10.12.16.188:51759 [060079] [MAIN] Resetting stability counter [300163] [MAIN] Uptime: 00y 00d 00h 05m 00s [300164] [MAIN] Heap: initial 41200 available 22984 contiguous 21080 [300165] [MAIN] VCC: 3499 mV [300166] [MAIN] Datetime: 2022-12-03 17:28:10 [471941] [WEBSOCKET] #1 disconnected [600177] [MAIN] Uptime: 00y 00d 00h 10m 00s [600178] [MAIN] Heap: initial 41200 available 24480 contiguous 22856 [600179] [MAIN] VCC: 3499 mV [600180] [MAIN] Datetime: 2022-12-03 17:33:10 ota http://10.12.16.188:8080/espurna.lightcore.jemmwifi.190522.bin [900191] [MAIN] Uptime: 00y 00d 00h 15m 00s [900192] [MAIN] Heap: initial 41200 available 24480 contiguous 22856 [900193] [MAIN] VCC: 3499 mV [900193] [MAIN] Datetime: 2022-12-03 17:38:10

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

I tried with another Sonoff Mini devices, compilations from january and may, and ota command works like a charm.
I didn't change anything in hardware. Always
#define OTA_CLIENT OTA_CLIENT_ASYNCTCP // Terminal / MQTT OTA support
I think something is broken now.
You are right with git commits. I downloaded zip and compiled. I don't know how to identify commit it was then with zip file. I should learn more about github. Sorry.

@mcspr
Copy link
Collaborator

mcspr commented Dec 3, 2022

So, zero [OTA] ERROR: ...something wrong here... some time (15-30sec) after ota command?
Is SDK version the same in both old and new builds? (in WebUI or terminal info ... starting with sdk: ...)

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

Not working, yesterday compilation:
SDK version
2.2.2-dev(38a443e)
Core version
3.0.2

Working:
SDK version
2.2.2-dev(38a443e)
Core version
2.7.4

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

So, zero [OTA] ERROR: ...something wrong here... some time (15-30sec) after ota command? Is SDK version the same in both old and new builds? (in WebUI or terminal info ... starting with sdk: ...)

No error. It remains without output. After >3 min, extra info again:
ota http://10.12.16.188:8080/espurna.lightcore.jemmwifi.190522.bin [100371] [MAIN] Uptime: 00y 00d 01h 25m 00s [100373] [MAIN] Heap: initial 41200 available 23712 contiguous 21632 [100374] [MAIN] VCC: 3499 mV [100374] [MAIN] Datetime: 2022-12-03 18:48:10

@mcspr
Copy link
Collaborator

mcspr commented Dec 3, 2022

Going back to 2.7.4 might help

Other issue I have in mind is Windows... But, you'll need to fetch https://github.com/earlephilhower/esp-quick-toolchain/releases/download/3.1.0-gcc10.3/x86_64-w64-mingw32.xtensa-lx106-elf-e5f9fec.220621.zip and c/p replace C:\Users\YOU\AppData\Local\Arduino15\packages\esp8266\tools\xtensa-lx106-elf-gcc\3.0.4-gcc10.3-1757bed contents with the archive ones (just verify that calling bin/xtensa-lx106-elf-gcc -v in that dir shows the same hash as the archive - e5f9fec), then rebuild the .bin

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

Should give any debug info if I run ota command on Espurna debug windows?

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

I don't think is Windows related. I brought bin files compiled from another house/PC. They failed and I compiled again on my notebook to not avail.
As OTA with old bin files from may and january work, router, server, etc shouldn't be cause too.
I'm lost.

@mcspr
Copy link
Collaborator

mcspr commented Dec 3, 2022

Have you tried to change toolchain version though?
Or downloading releasesnapshot .bin binaries?

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

Last snapshot is from june, isn't it?

@mcspr
Copy link
Collaborator

mcspr commented Dec 3, 2022

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

I,ll try just now

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

espurna-1.15.0-dev.git70d25aea+github221116-espurna-minimal-arduino-ota-1m.bin
Same results. It doesn't work.
[096530] [TELNET] Connected 10.12.16.188:63023
ota http://10.12.16.188:8080/espurna.mini.190522.bin
Time to recover via Esptool :-(

@mcspr
Copy link
Collaborator

mcspr commented Dec 3, 2022

Have you tried caddy?

caddy file-server --browse  --listen 10.12.16.188:8080

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

Have you tried caddy?

caddy file-server --browse  --listen 10.12.16.188:8080

I'll try. But HFS works with all my old compilations. I don't think it should be problem.

@Sag06
Copy link
Author

Sag06 commented Dec 3, 2022

Caddy, same results.
[556942] [TELNET] Connected 10.12.16.188:52243
ota http://10.12.16.188:8080/espurna.mini.221129.bin

@mcspr
Copy link
Collaborator

mcspr commented Dec 4, 2022

Our error reporting might be to blame here, OTA needs debug enabled which minimal builds disable. But you should've seen at least this while using normal ones

[788466] [OTA] Connecting to 10.12.16.188:8080
+OK
...
[807057] [OTA] ERROR: Connection aborted
[807057]
[807058] [EEPROM] Enabling EEPROM rotation
[807066] [OTA] Disconnected

(don't really see the reason to not at least generate 'Connecting' message in #2562 (comment))


Still (strongly) suggest to use the toolchain update regardless of whether it helps OTA or not. This is not some hearsay - just a known bug that will crash our code at random when using Windows builds
esp8266/Arduino#8393
(will be synced with boards as soon as Core 3.1.x is released)
PIO use it by default already, Arduino IDE installs roll the dice...

@Sag06
Copy link
Author

Sag06 commented Dec 4, 2022

Not sure if understood. Does windows build mean compiled in a Windows OS machine?
Last snapshot failed too. Is it windows build too?

@Sag06
Copy link
Author

Sag06 commented Dec 4, 2022

Do you suggest to update Espurna using an old minimal bin that works?
People should be warned. Two step update is almost mandatory for 1 mega devices.

@Sag06
Copy link
Author

Sag06 commented Dec 4, 2022

Thinking twice. Snapshots are PIO build, aren't them?

@mcspr
Copy link
Collaborator

mcspr commented Dec 4, 2022

Two separate issues; toolchain package shipped with ESP8266 Core 3.0.2 is partially broken on Windows for our builds. You are building on Windows, hence my warning.

^ is my log printout on a sonoff-basic .bin downloaded from the releases page, which works for me just fine. You should be able to see OTA logs when using non-minimal builds. -minimal .bin does not include them because of disabled DEBUG_SUPPORT

@Sag06
Copy link
Author

Sag06 commented Dec 5, 2022

Two separate issues; toolchain package shipped with ESP8266 Core 3.0.2 is partially broken on Windows for our builds. You are building on Windows, hence my warning.
I don´t think this is the problem. Snapshot build (github221116) is not Windows build, and it fails too.
^ is my log printout on a sonoff-basic .bin downloaded from the releases page, which works for me just fine. You should be able to see OTA logs when using non-minimal builds. -minimal .bin does not include them because of disabled DEBUG_SUPPORT
Ok. I flashed espurna-1.15.0-dev.git70d25aea+github221116-itead-sonoff-mini.bin and tried OTA once more.
It doesn't work and NO OTA logs either:
[022690] [TELNET] Connected 10.12.16.188:51322 [038258] [NTP] Server pool.ntp.org [038259] [NTP] Last Sync 2022-12-05 19:35:28 (UTC) [038259] [NTP] UTC Time 2022-12-05 19:35:28 ota http://10.12.16.188:8080/espurna.lightcore.jemmwifi.190522.bin [060077] [MAIN] Resetting stability counter [300169] [MAIN] Uptime: 00y 00d 00h 05m 00s [300170] [MAIN] Heap: initial 41840 available 24696 contiguous 22448 [300171] [MAIN] VCC: 3499 mV [300172] [MAIN] Datetime: 2022-12-05 19:39:49 ^C[600176] [MAIN] Uptime: 00y 00d 00h 10m 00s [600178] [MAIN] Heap: initial 41840 available 24496 contiguous 22936 [600178] [MAIN] VCC: 3499 mV [600179] [MAIN] Datetime: 2022-12-05 19:44:49 ota http://10.12.16.188:8080/espurna.lightcore.jemmwifi.190522.bin
You can see I tried OTA command twice and no OTA response.

One interesting issue to add. No way to Telnet with password. Connection is always refused in spite password is right. I tried "fibonacci" too to no avail. More detailed:

My full working compilation from last may, sorry to don't know github commit:
[418746] [TELNET] Client #0 connected Password: myPassword Password correct, welcome!

And now, Snapshot build (github221116):
`Password (disconnects after 1 failed attempt): myPassword

And Putty disconnection error!`

I don't know if this new issue is somehow connected to OTA one. Hope it helps.

@mcspr
Copy link
Collaborator

mcspr commented Dec 5, 2022

Are you using telnet mode or something like that?
Have you tried different clients (netcat, nc)? Do other commands work at all btw? heap, get, set, etc.

@Sag06
Copy link
Author

Sag06 commented Dec 5, 2022

I always use Putty. Never had any problem. I can Putty Telnet now to all my Espurna devices 2.7.4 Core version.

@mcspr
Copy link
Collaborator

mcspr commented Dec 5, 2022

I mean, do any other commands work? help, get hostname, or reset
edit: on espurna-1.15.0-dev.git70d25aea+github221116-itead-sonoff-mini.bin

@Sag06
Copy link
Author

Sag06 commented Dec 5, 2022

As I told, I can Telnet (Putty) to 3.0.2 Core on minimal bin or full Espurna bin in case Telnet is set without authentication. Odd.

@Sag06
Copy link
Author

Sag06 commented Dec 5, 2022

I mean, do any other commands work? help, get hostname, or reset edit: on espurna-1.15.0-dev.git70d25aea+github221116-itead-sonoff-mini.bin

I´ll flash again. Let me one minute

@Sag06
Copy link
Author

Sag06 commented Dec 5, 2022

No response at all from these commands.
`[077751] [TELNET] Connected 10.12.16.188:55845
get hostname
[099252] [NTP] Server pool.ntp.org
[099252] [NTP] Last Sync 2022-12-05 20:46:07 (UTC)
[099253] [NTP] UTC Time 2022-12-05 20:46:07
get hostname

`

@mcspr mcspr closed this as completed Dec 5, 2022
@mcspr
Copy link
Collaborator

mcspr commented Dec 5, 2022

...in favour of a 2nd issue. This is a telnet server problem, not ota

@Sag06
Copy link
Author

Sag06 commented Dec 5, 2022

It makes sense

@mcspr
Copy link
Collaborator

mcspr commented Dec 5, 2022

See #2563
Gist of it is we dont handle terminal clients that send data in batches, and expect a single network data packet as cmd\r\n (aka what netcat does)
PuTTY sends it as cmd, then \r\n. Not intended to happen, just a badly applied optimization on our side

@richtoy
Copy link

richtoy commented Dec 12, 2022

OTA still not working for me.

 12/12/2022   16:40.50   /home/mobaxterm  telnet 192.168.1.50
Trying 192.168.1.50...
Connected to 192.168.1.50.
Escape character is '^]'.
[780853] [TELNET] Connected 192.168.1.20:13990
info
-ERROR: TERMINAL: UnterminatedQuote
info
ESPURNA 1.15.0-dev.git312c3ef2+github221212 built 2022-12-12 12:35:56
device: ESPURNA_MINIMAL_ARDUINO_OTA
mcu: esp8266 chipid: CC50E3C8C3C3 freq: 80mhz
sdk: 2.2.2-dev(38a443e) core: 2.7.4
md5: b205220c4d753117b468415091aad7e0
support: DEBUG_TELNET MDNS ARDUINO_OTA OTA_CLIENT TELNET TERMINAL
system: OK boot counter: 1
last reset reason: External System
extra info: External System
+OK
[800146] [MAIN] Uptime: 00y 00d 00h 30m 00s
[800147] [MAIN] Heap: initial 50232 available 39688 contiguous 39408
[800148] [MAIN] VCC: 3521 mV
ota http://github.com/xoseperez/espurna/releases/download/github221212/espurna-1.15.0-dev.git312c3ef2+github221212-neo-coolcam-power-plug-wifi.bin
[806225] [OTA] Connecting to github.com:80
+OK
[806261] [EEPROM] Disabling EEPROM rotation
[806261] [OTA] Downloading /xoseperez/espurna/releases/download/github221212/espurna-1.15.0-dev.git312c3ef2+github221212-neo-coolcam-power-plug-wifi.bin
[806511]
[806512] [EEPROM] Enabling EEPROM rotation
[806516] [OTA] Disconnected

I always get an error for the first command and OTA silently fails.

Is there a simple way to recover to a working image from this MINIMAL_OTA build?

@mcspr
Copy link
Collaborator

mcspr commented Dec 12, 2022

^
ota <url> is not the only way to upgrade
https://github.com/xoseperez/espurna/wiki/OTA#using-espotapy - -minimal-arduino-ota
https://github.com/xoseperez/espurna/wiki/OTA#updating-via-http - minimal-webui (or, just open the :80 port in browser)

for PuTTY, switch to raw protocol instead of telnet (#2563)

@richtoy
Copy link

richtoy commented Dec 12, 2022

Isn't the PuTTY bug fixed in the latest snapshot? I have flashed a 1.14.1 with ESPURNA_MINIMAL_ARDUINO_OTA from today (2022-12-12) as a first step because there was insufficient memory. Next step is to flash the latest 1.15.0 dev snapshot or revert to 1.14.1.
Does the espota.py assume a reset device with the default ip of 192.168.4.1?

Tried with raw too but OTA still silently failed both from MobaXterm and PuTTY

@mcspr
Copy link
Collaborator

mcspr commented Dec 12, 2022

Isn't the PuTTY bug fixed in the latest snapshot

Only the 'unable to use any command' part of the issue. TELNET in PuTTY is silently sending some extra data before you even start typing anything, causing the 'first command' error. We can't guarantee our workarounds for telnet always work though, so I would suggest to switch to RAW mode instead to avoid any more issues
(and testing mostly happens with raw text clients like ncat)

Does the espota.py assume a reset device with the default ip of 192.168.4.1?

espota.py works in both AP and STA modes and it should have the same IP & hostname on the network you used with a normal build

Tried with raw too but OTA still silently failed both from MobaXterm and PuTTY

Command wouldn't have worked anyway. We don't handle redirects and HTTPS support is missing.

> curl -I 'http://github.com/xoseperez/espurna/releases/download/github221212/espurna-1.15.0-dev.git312c3ef2+github221212-neo-coolcam-power-plug-wifi.bin'
HTTP/1.1 301 Moved Permanently
Content-Length: 0
Location: https://github.com/xoseperez/espurna/releases/download/github221212/espurna-1.15.0-dev.git312c3ef2+github221212-neo-coolcam-power-plug-wifi.bin

Not logging that upgrade did not happen is an issue, though

@richtoy
Copy link

richtoy commented Dec 12, 2022

Not having much success, letting the device boot ESPURNA_MINIMAL_ARDUINO_OTA and then trying espota.py gives the following: -
 12/12/2022   20:35.42   /home/mobaxterm  python espota.py --progress --ip 192.168.1.50 --auth XXXXXX --file espurna-1.14.1-neo-coolcam-power-plug-wifi.bin
Authenticating...OK
20:36:01 [ERROR]: No response from device

Is there something else that needs to be done to make it accept a new firmware?

 12/12/2022   20:36.01   /home/mobaxterm  telnet 192.168.1.50
Trying 192.168.1.50...
Connected to 192.168.1.50.
Escape character is '^]'.
[556319] [TELNET] Connected 192.168.1.20:60522
info
-ERROR: TERMINAL: UnterminatedQuote
info
ESPURNA 1.15.0-dev.git312c3ef2+github221212 built 2022-12-12 12:35:56
device: ESPURNA_MINIMAL_ARDUINO_OTA
mcu: esp8266 chipid: CC50E3C8C3C3 freq: 80mhz
sdk: 2.2.2-dev(38a443e) core: 2.7.4
md5: b205220c4d753117b468415091aad7e0
support: DEBUG_TELNET MDNS ARDUINO_OTA OTA_CLIENT TELNET TERMINAL
system: OK boot counter: 1
last reset reason: Power On
extra info: Power On
+OK

With debug: -

 12/12/2022   20:39.49   /home/mobaxterm  python espota.py --progress --debug --ip 192.168.1.50 --auth XXXXXXXX --file espurna-1.14.1-neo-coolcam-power-plug-wifi.bin
20:40:34 [DEBUG]: Options: {'esp_ip': '192.168.1.50', 'host_ip': '0.0.0.0', 'esp_port': 8266, 'host_port': 37575, 'auth': 'XXXXXXXX', 'image': 'espurna-1.14.1-neo-coolcam-power-plug-wifi.bin', 'spiffs': False, 'debug': True, 'progress': True}
20:40:34 [INFO]: Starting on 0.0.0.0:37575
20:40:34 [INFO]: Upload size: 484800
20:40:34 [INFO]: Sending invitation to: 192.168.1.50
Authenticating...OK
20:40:34 [INFO]: Waiting for device...
20:40:44 [ERROR]: No response from device

@mcspr
Copy link
Collaborator

mcspr commented Dec 12, 2022

Make sure espota.py isn't blocked by firewall, it would listen on :RANDOM_PORT_10000_TO_60000 and device would open a connection there
(see 37575 above; --host_port=PORT to override)

I would also caution about 1.15 -> 1.14.1. Anything could happen to our config, it does not expect version downgrades

@richtoy
Copy link

richtoy commented Dec 12, 2022

Thanks, it was Windows defender. Downgrade worked fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants