Ignore fragLoadPolicy and levelLoadPolicy on livestreams when 404 error #5647

Closed
5 tasks done
PavelFomin90 opened this issue Jul 12, 2023 · 7 comments · Fixed by #6853
Labels: Error handling, Revisit-at-later-release-cycle, Works as expected

Comments

@PavelFomin90
Contributor

What version of Hls.js are you using?

1.4.3

What browser (including version) are you using?

Chrome 114.0.5735.198 (Official Build) (x86_64)

What OS (including version) are you using?

macOS 11.6

Test stream

https://mirismanager.ubicast.eu/media/storage/ts-404.m3u8

Configuration

{  
  debug: false,
  enableWorker: true,
  lowLatencyMode: true,
  backBufferLength: 60,
  autoStartLoad: true,
  startLevel: 3,
  liveSyncDurationCount: 2,
  capLevelToPlayerSize: true,
  fragLoadPolicy: {
    default: {
      maxTimeToFirstByteMs: 5000,
      maxLoadTimeMs: 20000,
      timeoutRetry: {
        maxNumRetry: 2,
        retryDelayMs: 0,
        maxRetryDelayMs: 0,
      },
      errorRetry: {
        maxNumRetry: 4,
        retryDelayMs: 1000,
        maxRetryDelayMs: 2000,
      },
    },
  }
}

Additional player setup steps

No response

Checklist

Steps to reproduce

  1. Try to play a stream whose level playlist or fragment returns a 404 error.

Expected behaviour

The player should wait for the delay configured in fragLoadPolicy/levelLoadPolicy between two attempts and should respect the configured attempt limit.
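To make the expectation concrete, here is a minimal sketch of the retry schedule the errorRetry settings above would produce. It assumes the delay doubles on each retry and is capped at maxRetryDelayMs; that backoff rule and the helper names (`hypotheticalRetryDelay`, `expectedRetrySchedule`) are illustrative assumptions, not hls.js exports.

```typescript
// Shape of the retry settings used in the configuration above.
interface RetrySettings {
  maxNumRetry: number;
  retryDelayMs: number;
  maxRetryDelayMs: number;
}

// Assumption: the delay doubles with each retry and is capped at maxRetryDelayMs.
// This is an illustrative helper, not part of the hls.js API.
function hypotheticalRetryDelay(settings: RetrySettings, retryCount: number): number {
  return Math.min(settings.retryDelayMs * Math.pow(2, retryCount), settings.maxRetryDelayMs);
}

// Returns the delay before each allowed retry, in order.
function expectedRetrySchedule(settings: RetrySettings): number[] {
  const delays: number[] = [];
  for (let retryCount = 0; retryCount < settings.maxNumRetry; retryCount++) {
    delays.push(hypotheticalRetryDelay(settings, retryCount));
  }
  return delays;
}

// The errorRetry values from the configuration in this report.
const errorRetry: RetrySettings = { maxNumRetry: 4, retryDelayMs: 1000, maxRetryDelayMs: 2000 };
const schedule = expectedRetrySchedule(errorRetry);
```

Under these assumptions the player would make at most four retries, each at least one second apart, rather than back-to-back requests.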

What actually happened?

When playing a live stream (VOD works fine), a 404 error on a fragment or level is retried with no delay between attempts, and the maxNumRetry limit is not applied either, so the player sends endless segment/level requests with no delay.
Another problem: when the level is OK but a .ts segment returns 404, the player reloads the level with no delay.
HAR file: https://drive.google.com/file/d/1aIKrJEn-b46fZDinun26mp0qRi8-a48T/view?usp=sharing

The problem is similar to the one reported with the old versions of the settings: #4312

For now we work around it by calling hls.stopLoad() when an error occurs and hls.startLoad() after a timeout, but in older versions of hls.js the delay worked.
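A minimal sketch of that workaround, assuming the only player methods needed are stopLoad() and startLoad(). The helper name `createDelayedRestart`, the `LoadControl` interface, and the injectable scheduler are hypothetical and exist only to keep the example self-contained.

```typescript
// Minimal structural type covering the two hls.js methods the workaround uses.
interface LoadControl {
  stopLoad(): void;
  startLoad(): void;
}

// Injectable timer so the helper can be exercised without real timeouts.
type Scheduler = (callback: () => void, delayMs: number) => unknown;

// Hypothetical helper: on each reported error, stop loading and resume after
// delayMs. Once maxAttempts restarts have been used, loading stays stopped.
// The returned function reports whether a restart is (or already was) scheduled.
function createDelayedRestart(
  player: LoadControl,
  delayMs: number,
  maxAttempts: number,
  schedule: Scheduler = (cb, ms) => setTimeout(cb, ms),
): () => boolean {
  let attempts = 0;
  let pending = false;
  return () => {
    if (pending) return true; // a restart is already scheduled
    player.stopLoad();
    if (attempts >= maxAttempts) return false; // budget exhausted, stay stopped
    attempts++;
    pending = true;
    schedule(() => {
      pending = false;
      player.startLoad();
    }, delayMs);
    return true;
  };
}
```

In a real page the returned function could be called from an `Hls.Events.ERROR` listener when `data.type` is `Hls.ErrorTypes.NETWORK_ERROR`; the delay and attempt limit are values the application chooses.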

Console output

Status:
261.784 | Media element detached
261.787 | Loading https://mirismanager.ubicast.eu/media/storage/ts-404.m3u8
261.794 | Loading manifest and attaching video element...
261.805 | 1 quality levels found
261.805 | Manifest successfully loaded
261.807 | Media element attached
Error:
261.88 | Error while loading fragment https://mirismanager.ubicast.eu/media/storage/404ts-10660.ts
261.953 | Error while loading fragment https://mirismanager.ubicast.eu/media/storage/404ts-10661.ts
262.027 | Error while loading fragment https://mirismanager.ubicast.eu/media/storage/404ts-10662.ts
262.094 | Error while loading fragment https://mirismanager.ubicast.eu/media/storage/404ts-10662.ts
262.164 | Error while loading fragment https://mirismanager.ubicast.eu/media/storage/404ts-10662.ts
262.232 | Error while loading fragment https://mirismanager.ubicast.eu/media/storage/404ts-10662.ts
262.299 | Error while loading fragment https://mirismanager.ubicast.eu/media/storage/404ts-10662.ts
262.3 | A network error occurred: fragLoadError

Chrome media internals output

No response

@PavelFomin90 added the Bug and Needs Triage labels Jul 12, 2023
@robwalch
Collaborator

When a 4xx status is returned, there is no expectation that re-requesting the fragment will change the result, so the player moves on to the next fragment. If you would like to retry on 4xx segment errors, try calling stopLoad() followed by startLoad().

@robwalch removed the Bug and Needs Triage labels Jul 12, 2023
@robwalch
Collaborator

If you would like to be able to provide a shouldRetry callback in the load policies that overrides this behavior, please file a PR or feature request. The callback should implement this helper's signature, with an additional boolean parameter taking the result of this check, and it should return a boolean that overrides the default result:

export function shouldRetry(
  retryConfig: RetryConfig | null | undefined,
  retryCount: number,
  isTimeout: boolean,
  httpStatus?: number | undefined
): retryConfig is RetryConfig & boolean {
  return (
    !!retryConfig &&
    retryCount < retryConfig.maxNumRetry &&
    (retryForHttpStatus(httpStatus) || !!isTimeout)
  );
}
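As a rough sketch of the callback described in this comment: the type below takes the helper's parameters plus the default decision and returns the final decision. `ShouldRetryOverride`, `shouldRetryWithOverride`, and `retry404` are hypothetical names, and `retryForHttpStatus` here is a simplified stand-in that only encodes "do not retry 4xx or a missing status"; none of this is presented as the library's implementation.

```typescript
interface RetryConfig {
  maxNumRetry: number;
  retryDelayMs: number;
  maxRetryDelayMs: number;
}

// Hypothetical callback type: the helper's parameters plus the default result.
type ShouldRetryOverride = (
  retryConfig: RetryConfig | null | undefined,
  retryCount: number,
  isTimeout: boolean,
  httpStatus: number | undefined,
  defaultResult: boolean,
) => boolean;

// Simplified stand-in (assumption): never retry 4xx or a missing status.
function retryForHttpStatus(httpStatus: number | undefined): boolean {
  return !!httpStatus && (httpStatus < 400 || httpStatus > 499);
}

// Computes the default decision, then lets the optional override replace it.
function shouldRetryWithOverride(
  retryConfig: RetryConfig | null | undefined,
  retryCount: number,
  isTimeout: boolean,
  httpStatus?: number,
  override?: ShouldRetryOverride,
): boolean {
  const defaultResult =
    !!retryConfig &&
    retryCount < retryConfig.maxNumRetry &&
    (retryForHttpStatus(httpStatus) || isTimeout);
  return override
    ? override(retryConfig, retryCount, isTimeout, httpStatus, defaultResult)
    : defaultResult;
}

// Example override: also retry 404 responses while the retry budget lasts.
const retry404: ShouldRetryOverride = (config, count, _isTimeout, status, defaultResult) =>
  defaultResult || (status === 404 && config != null && count < config.maxNumRetry);
```

Passing the default result into the callback lets an application widen or narrow the decision without reimplementing the whole check.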

@PavelFomin90
Contributor Author

> When a 4xx status is returned, there is no expectation that re-requesting the fragment will change the result, so the player moves on to the next fragment. If you would like to retry on 4xx segment errors, try calling stopLoad() followed by startLoad().

Yes, you're right, but a livestream may have no #EXT-X-ENDLIST, so hls.js starts retrying the last segment in the list, as you can see in the log. It does so without delay and sometimes ignores maxNumRetry.

We also have the same problem with a 404 error on the level m3u8. It just goes crazy 😃
livestream_manifest_404.log
livestream_manifests_404.har.zip
screen record: https://drive.google.com/file/d/1ynGY1976PfJuqx6lNQNxuE4nGIlUkQnZ/view?usp=sharing

@robwalch
Collaborator

Sounds like an HLS serving issue. Segments that return 404 should not be advertised. Your playlist publishing sounds a bit eager.

@PavelFomin90
Contributor Author

Thank you for helping!
I will solve it with stopLoad and startLoad after a timeout.
It's interesting for me to research this. If I find something, I will let you know.

@robwalch
Collaborator

Are you getting the same loop loading issue with v1.4.9?

For VOD and live, the player should switch levels. If it cannot switch because there are no alternatives, I would expect a fatal error eventually.

@PavelFomin90
Contributor Author

It reproduces with 1.4.9:

[log] > Debug logs enabled for "Hls instance" in hls.js version 1.4.9
hls.ts:410 [log] > stopLoad
hls.ts:379 [log] > loadSource:https://pavelfomin.ru/streams/index_404.m3u8
stream-controller.ts:569 [log] > [stream-controller]: Trigger BUFFER_RESET
hls.ts:351 [log] > attachMedia
level-controller.ts:269 [log] > [level-controller]: manifest loaded, 2 level(s) found, first bitrate: 358000
buffer-controller.ts:148 [log] > 1 bufferCodec event(s) expected
hls.ts:400 [log] > startLoad(-1)
level-controller.ts:351 [log] > [level-controller]: Switching to level 0 from level -1
level-controller.ts:520 [log] > [level-controller]: Loading level index 0 with URI 1/1 https://pavelfomin.ru/streams/ts-404_vod.m3u8
base-stream-controller.ts:1754 [log] > [stream-controller]: STOPPED->IDLE
base-stream-controller.ts:1754 [log] > [subtitle-stream-controller]: STOPPED->IDLE
buffer-controller.ts:800 [log] > [buffer-controller]: Media source opened
stream-controller.ts:634 [log] > [stream-controller]: Level 0 loaded [10660,10662][part-10662--1], cc [0, 0] duration:9
buffer-controller.ts:692 [log] > [buffer-controller]: Updating Media Source duration to 9.000
base-stream-controller.ts:727 [log] > [stream-controller]: Loading fragment 10660 cc: 0 of [10660-10662] level: 0, target: 0
base-stream-controller.ts:1754 [log] > [stream-controller]: IDLE->FRAG_LOADING
base-stream-controller.ts:1754 [log] > [stream-controller]: FRAG_LOADING->IDLE
level-controller.ts:351 [log] > [level-controller]: Switching to level 1 from level 0
level-controller.ts:520 [log] > [level-controller]: Loading level index 1 with URI 1/1 https://pavelfomin.ru/streams/ts-404_vod.m3u8
base-stream-controller.ts:1754 [log] > [stream-controller]: IDLE->WAITING_LEVEL
base-stream-controller.ts:1754 [log] > [stream-controller]: WAITING_LEVEL->IDLE
level-controller.ts:351 [log] > [level-controller]: Switching to level 1 from level 1
level-controller.ts:520 [log] > [level-controller]: Loading level index 1 with URI 1/1 https://pavelfomin.ru/streams/ts-404_vod.m3u8
base-stream-controller.ts:1754 [log] > [stream-controller]: IDLE->WAITING_LEVEL
stream-controller.ts:634 [log] > [stream-controller]: Level 1 loaded [10660,10662][part-10662--1], cc [0, 0] duration:9
base-stream-controller.ts:1754 [log] > [stream-controller]: WAITING_LEVEL->IDLE
base-stream-controller.ts:727 [log] > [stream-controller]: Loading fragment 10660 cc: 0 of [10660-10662] level: 1, target: 0
base-stream-controller.ts:1754 [log] > [stream-controller]: IDLE->FRAG_LOADING
base-stream-controller.ts:1754 [log] > [stream-controller]: FRAG_LOADING->IDLE
base-stream-controller.ts:727 [log] > [stream-controller]: Loading fragment 10661 cc: 0 of [10660-10662] level: 1, target: 3
base-stream-controller.ts:1754 [log] > [stream-controller]: IDLE->FRAG_LOADING
base-stream-controller.ts:1754 [log] > [stream-controller]: FRAG_LOADING->IDLE
base-stream-controller.ts:727 [log] > [stream-controller]: Loading fragment 10662 cc: 0 of [10660-10662] level: 1, target: 6
base-stream-controller.ts:1754 [log] > [stream-controller]: IDLE->FRAG_LOADING
base-stream-controller.ts:1754 [log] > [stream-controller]: FRAG_LOADING->IDLE
base-stream-controller.ts:727 [log] > [stream-controller]: Loading fragment 10662 cc: 0 of [10660-10662] level: 1, target: 9
base-stream-controller.ts:1754 [log] > [stream-controller]: IDLE->FRAG_LOADING
base-stream-controller.ts:1754 [log] > [stream-controller]: FRAG_LOADING->IDLE
base-stream-controller.ts:727 [log] > [stream-controller]: Loading fragment 10662 cc: 0 of [10660-10662] level: 1, target: 9
base-stream-controller.ts:1754 [log] > [stream-controller]: IDLE->FRAG_LOADING
base-stream-controller.ts:1754 [log] > [stream-controller]: FRAG_LOADING->IDLE
base-stream-controller.ts:727 [log] > [stream-controller]: Loading fragment 10662 cc: 0 of [10660-10662] level: 1, target: 9
base-stream-controller.ts:1754 [log] > [stream-controller]: IDLE->FRAG_LOADING
base-stream-controller.ts:1754 [log] > [stream-controller]: FRAG_LOADING->IDLE
hls.ts:410 [log] > stopLoad
base-stream-controller.ts:1754 [log] > [stream-controller]: IDLE->STOPPED
base-stream-controller.ts:1754 [log] > [subtitle-stream-controller]: IDLE->STOPPED

As you can see, the player tries to load the last fragment several times.

I think it happens because there is no comparison with the previous attempt.

base-stream-controller.ts:

if (
  fragState === FragmentState.OK ||
  (fragState === FragmentState.PARTIAL && frag.gap)
) {
  fragPrevious = frag;
}
if (
  fragPrevious &&
  frag.sn === fragPrevious.sn &&
  (!loadingParts || partList[0].fragment.sn > frag.sn)
) {
  // Force the next fragment to load if the previous one was already selected. This can occasionally happen with
  // non-uniform fragment durations
  const sameLevel = fragPrevious && frag.level === fragPrevious.level;
  if (sameLevel) {
    const nextFrag = fragments[curSNIdx + 1];
    if (
      frag.sn < endSN &&
      this.fragmentTracker.getState(nextFrag) !== FragmentState.OK
    ) {
      frag = nextFrag;
    } else {
      frag = null;
    }
  }
}

On your test stand it's not a big problem, because maxNumRetry works correctly for both VOD and live streams.
But on our site the player sometimes tries to switch levels, the retry counter resets to 0, and it becomes an infinite loop. It happens with both levels and fragments. I understand that it's first of all a backend issue, but I would be happy to have flexible 404 handling.

For now I deal with it on our side by handling NETWORK_ERROR, and it works for us. I opened this issue just to let you know about the problem, and maybe to hear of a better way to fix it :)
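One way to keep the counter from falling back to 0 on a level switch is to count failures per URL on the application side. The sketch below is an assumption about how such handling could look; `UrlFailureBudget` is a hypothetical helper, not something hls.js provides.

```typescript
// Hypothetical helper: tracks failures per URL so the count survives level
// switches, which is where the built-in counter was observed to reset.
class UrlFailureBudget {
  private failures = new Map<string, number>();
  private maxFailures: number;

  constructor(maxFailures: number) {
    this.maxFailures = maxFailures;
  }

  // Records a failure; returns true while another attempt is still allowed.
  recordFailure(url: string): boolean {
    const count = (this.failures.get(url) ?? 0) + 1;
    this.failures.set(url, count);
    return count <= this.maxFailures;
  }

  // Clears the count once the URL loads successfully.
  recordSuccess(url: string): void {
    this.failures.delete(url);
  }
}
```

A NETWORK_ERROR handler could call recordFailure with the failing URL and stop loading for good once it returns false, instead of restarting indefinitely.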

@PavelFomin90 reopened this Jul 17, 2023
@robwalch added the Revisit-at-later-release-cycle label May 12, 2024
@robwalch added this to the 1.6.0 milestone Nov 18, 2024
robwalch added a commit that referenced this issue Nov 18, 2024
… when there are no alternates

Resolves #6741
Resolves #5647
Resolves #5153
Closes #6171 (replaces)
robwalch added a commit that referenced this issue Nov 19, 2024
… when there are no alternates

Resolves #6741
Resolves #5647
Resolves #5153
Closes #6171 (replaces)