Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OTA and upload of larger files fails #9990

Closed
1 task done
TD-er opened this issue Jul 5, 2024 · 3 comments · Fixed by #9991 or tasmota/arduino-esp32#457
Closed
1 task done

OTA and upload of larger files fails #9990

TD-er opened this issue Jul 5, 2024 · 3 comments · Fixed by #9991 or tasmota/arduino-esp32#457
Labels
Status: In Progress Issue is in progress

Comments

@TD-er
Copy link
Contributor

TD-er commented Jul 5, 2024

Board

Any with large enough flash

Device Description

This is about larger uploads of either the sketch, or file to the filesystem.
So any board with large enough flash to have an upload of files > 2 MB

Hardware Configuration

Version

latest master (checkout manually)

IDE Name

PlatformIO

Operating System

Windows 11

Flash frequency

40MHz

PSRAM enabled

yes

Upload speed

115200

Description

OTA updates of larger sketches (typically > 2 MB in size) lately tend to fail.
The device then is inaccessible for 10 - 15 minutes.

On the serial console, you'll see the dots per 1436 bytes of upload and they typically stop at 1100 - 1600 dots.
Thus somewhere between 1.4 - 2.2 MB and never seem to make it to the full 2.5 MB needed for my "max" build bin files.

I found 2 issues here.

This function does no longer have a timeout implemented:

int WebServer::_uploadReadByte(NetworkClient &client) {
int res = client.read();
if (res < 0) {
while (!client.available() && client.connected()) {
delay(2);
}
res = client.read();
}
return res;
}

Related issue: #9163
Related PR: https://github.com/espressif/arduino-esp32/pull/9167/files

IMHO the old implementation of this function was just fine:

int WebServer::_uploadReadByte(WiFiClient& client){
  int res = client.read();
  if(res < 0) {
    // keep trying until you either read a valid byte or timeout
    unsigned long startMillis = millis();
    long timeoutIntervalMillis = client.getTimeout();
    boolean timedOut = false;
    for(;;) {
      if (!client.connected()) return -1;
      // loosely modeled after blinkWithoutDelay pattern
      while(!timedOut && !client.available() && client.connected()){
        delay(2);
        timedOut = millis() - startMillis >= timeoutIntervalMillis;
      }

      res = client.read();
      if(res >= 0) {
        return res; // exit on a valid read
      }
      // NOTE: it is possible to get here and have all of the following
      //       assertions hold true
      //
      //       -- client.available() > 0
      //       -- client.connected == true
      //       -- res == -1
      //
      //       a simple retry strategy overcomes this which is to say the
      //       assertion is not permanent, but the reason that this works
      //       is elusive, and possibly indicative of a more subtle underlying
      //       issue

      timedOut = millis() - startMillis >= timeoutIntervalMillis;
      if(timedOut) {
        return res; // exit on a timeout
      }
    }
  }

  return res;
}

(of course needs to have NetworkClient as function argument now)

The other issue is in WebServer::handleClient():

case HC_WAIT_READ:
// Wait for data from client to become available
if (_currentClient.available()) {
if (_parseRequest(_currentClient)) {
// because HTTP_MAX_SEND_WAIT is expressed in milliseconds,
// it must be divided by 1000
_currentClient.setTimeout(HTTP_MAX_SEND_WAIT); /* / 1000 removed, WifiClient setTimeout changed to ms */
_contentLength = CONTENT_LENGTH_NOT_SET;
_handleRequest();

setTimeout should be called before _parseRequest

I'm not 100% sure this is all that's needed to make OTA uploads work in all situations again.
But at least with these changes I managed to get the OTA working again on the 2nd attempt where it was nearly impossible before these changes.
Also it is nice the board will remain accessible and not hang for 10+ minutes in that _uploadReadByte function :)

Sketch

-

Debug Message

-

Other Steps to Reproduce

No response

I have checked existing issues, online documentation and the Troubleshooting Guide

  • I confirm I have checked existing issues, online documentation and Troubleshooting guide.
@TD-er
Copy link
Contributor Author

TD-er commented Jul 7, 2024

I did a few tests with my PR and it does for sure solve the 10+ minutes hanging, but OTA updating with files > 2MB still often fail.
So either my initial tests were extremely lucky, or there was something else in Parsing.cpp changed which does also affect these uploads.
My initial tests were to restore the Parsing.cpp file from ~2 years ago, but this also contains some other fixes which I didn't want to 'break' using this fix, so I only included the obvious mistakes in my PR.

Maybe someone else has some idea why OTA'ing of large files is now hardly working where it used to work just fine.
I also tested with multiplying the timeout with a factor 3 in int WebServer::_uploadReadByte(NetworkClient &client) and I got a successful upload again, but the next attempt already failed again, so not sure if it was just a fluke.

@TD-er
Copy link
Contributor Author

TD-er commented Jul 7, 2024

Hmm this issue is even bigger...
I have some code where I generate a CSV file download on the fly while reading bin files and decoding them.
This does generate file downloads of several MB in size and used to work just fine.
I just tested this on the corrent code base and it consistently fails now before 1M has been downloaded.

@TD-er
Copy link
Contributor Author

TD-er commented Jul 7, 2024

I think I found the remaining issue.
HTTP_MAX_CLOSE_WAIT was set to 2000, where all other HTTP_xxx_WAIT values were set to 5000.

After changing this, the OTA update was working just fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: In Progress Issue is in progress
Projects
None yet
3 participants