Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DASH Live stream stops playing multiperiod manifest #4492

Closed
bhjortsberg opened this issue Jul 10, 2018 · 27 comments
Closed

DASH Live stream stops playing multiperiod manifest #4492

bhjortsberg opened this issue Jul 10, 2018 · 27 comments
Assignees

Comments

@bhjortsberg
Copy link

bhjortsberg commented Jul 10, 2018

Issue description

Playing a DASH Live stream with multiperiods.
The player is stuck and stops playing when there is a new period. Sometimes the player continues after 30-60s and eventually the player also crashes with

07-10 11:52:25.114  9551  9551 D AndroidRuntime: Shutting down VM
07-10 11:52:25.133  9551  9551 E AndroidRuntime: FATAL EXCEPTION: main
07-10 11:52:25.133  9551  9551 E AndroidRuntime: Process: ---------, PID: 9551
07-10 11:52:25.133  9551  9551 E AndroidRuntime: java.lang.IndexOutOfBoundsException
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.google.android.exoplayer2.util.Assertions.checkIndex(Assertions.java:68)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.google.android.exoplayer2.source.dash.DashMediaSource$DashTimeline.getPeriod(DashMediaSource.java:1054)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.google.android.exoplayer2.analytics.AnalyticsCollector$MediaPeriodQueueTracker.updateMediaPeriodToNewTimeline(AnalyticsCollector.java:778)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.google.android.exoplayer2.analytics.AnalyticsCollector$MediaPeriodQueueTracker.onTimelineChanged(AnalyticsCollector.java:727)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.google.android.exoplayer2.analytics.AnalyticsCollector.onTimelineChanged(AnalyticsCollector.java:424)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.google.android.exoplayer2.ExoPlayerImpl$PlaybackInfoUpdate.notifyListeners(ExoPlayerImpl.java:746)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.google.android.exoplayer2.ExoPlayerImpl.updatePlaybackInfo(ExoPlayerImpl.java:681)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.google.android.exoplayer2.ExoPlayerImpl.handlePlaybackInfo(ExoPlayerImpl.java:622)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.google.android.exoplayer2.ExoPlayerImpl.handleEvent(ExoPlayerImpl.java:567)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.google.android.exoplayer2.ExoPlayerImpl$1.handleMessage(ExoPlayerImpl.java:109)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at android.os.Handler.dispatchMessage(Handler.java:105)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at android.os.Looper.loop(Looper.java:164)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at android.app.ActivityThread.main(ActivityThread.java:6798)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at java.lang.reflect.Method.invoke(Native Method)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.android.internal.os.Zygote$MethodAndArgsCaller.run(Zygote.java:240)
07-10 11:52:25.133  9551  9551 E AndroidRuntime: 	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:767)

Using an older version of ExoPlayer works fine (2.6.1).

Reproduction steps

See the provided DASH media presentation

Link to test content

Unfortunately no link but the manifest file that cases the issue:
manifest_period.mpd.txt

Version of ExoPlayer being used

Non-working version: 2.8.2
Working fine in: 2.6.1

Device(s) and version(s) of Android being used

Emulator Android 7.1.1 and OnePlus3T Android 8.0

Link

The following issue seem to be similar use case:
#3431

@ojw28
Copy link
Contributor

ojw28 commented Jul 12, 2018

@tonihei - The crash looks like a bug in MediaPeriodQueueTracker. I didn't get as far as reproducing, but my hypothesis is that:

  1. There's an active period in the timeline.
  2. The timeline updates and the period is removed. updateMediaPeriodToNewTimeline is called. newPeriodIndex is INDEX_UNSET for the removed period, and so the period entry is returned without being updated. Importantly, this means there may now be an entry in activeMediaPeriods with an invalid period index.
  3. The timeline updates again. updateMediaPeriodToNewTimeline is called, and tries to use the invalid period index to access the timeline.

I think you'd normally expect onMediaPeriodReleased to happen between (2) and (3), which would avoid the problem, but I doubt that's guaranteed and it seems flaky to rely on it.

@bhjortsberg - For the playback getting stuck, this may be a result of invalid changes being made to the manifest. Unfortunately we can't determine whether this is the case just from a snapshot of the manifest. We'd need a working test stream ideally, or alternatively a sequence of manifest files showing how it changes over time.

@ojw28 ojw28 added the bug label Jul 12, 2018
@ojw28 ojw28 assigned tonihei and unassigned ojw28 Jul 12, 2018
@bhjortsberg
Copy link
Author

bhjortsberg commented Jul 13, 2018

It's possible that it can be something with the manifest, I can't see anything myself though. Attaching a sequence of manifest when the player get stuck. I think it's hard to get a working stream, but I can do a network capture if that would help.

I'm not sure if it get stuck when the old period (id: p_3_382868305) disappears from the manifest or when the new (id: p_3_382868325) is created.
Also in manifest_11.mpd there are 2 new segments compared to manifest_10.mpd, but that should not be a problem?

manifest_player_stop_08.mpd.txt
manifest_player_stop_09.mpd.txt
manifest_player_stop_10.mpd.txt
manifest_player_stop_11.mpd.txt
manifest_player_stop_12.mpd.txt
manifest_player_stop_13.mpd.txt

ojw28 pushed a commit that referenced this issue Jul 23, 2018
…t update.

Issue:#4492

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=205636634
ojw28 pushed a commit that referenced this issue Jul 23, 2018
…t update.

Issue:#4492

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=205636634
@ojw28
Copy link
Contributor

ojw28 commented Jul 23, 2018

The actual crash should be fixed now. Please could you try dev-v2 to see if your live stream still gets stuck (i.e. there's something else wrong), or whether the issue is now resolved?

@bhjortsberg
Copy link
Author

Hi, yes the crash is fixed! However, the stream still gets stuck.

@ojw28
Copy link
Contributor

ojw28 commented Jul 27, 2018

Ok. I'm not sure we can do much about that without a live test stream, given we're unable to reproduce with other streams we have. I couldn't see anything obviously wrong from the manifests provided.

@ojw28 ojw28 added need more info and removed bug labels Jul 27, 2018
@ojw28 ojw28 assigned ojw28 and unassigned tonihei Jul 27, 2018
@bhjortsberg
Copy link
Author

Ok, I'll see if I can come arrange something.

@bhjortsberg
Copy link
Author

bhjortsberg commented Aug 1, 2018

Hi, I have now created a live stream.
Available here:
removed
Conditions are not ideal but I'll try to keep the stream alive.

@ojw28
Copy link
Contributor

ojw28 commented Aug 1, 2018

Using the latest dev-v2 branch, I'm not able reproduce a permanent freeze. I do see a freeze for a few seconds at the end of the green/white striped part at 29:27, but it always recovers after a few seconds. I haven't worked out whether that's due to the content or not yet.

Could you describe exactly what you're seeing when you try and play this stream?

@bhjortsberg
Copy link
Author

There is a new period when switching to and from the green/white stripe part. This stream works better and I can see what you are seeing that it freezes at 29:27, the period ends at 30:00. This freeze does not happen on 2.6.1 or in shaka-player.

@bhjortsberg
Copy link
Author

Sometimes it also freezes at 09:27 just before green/white stripe preiod should start, sometimes you can see just millisecond stall. And sometimes it just works without problem. A timing issue?

@ojw28
Copy link
Contributor

ojw28 commented Aug 2, 2018

I tried reproducing again with some additional logging this morning. Playback worked flawlessly for about half an hour. Now I'm seeing very frequent socket timeouts, which is making it difficult to test. Could you take a look and see if there's a problem?

For what it's worth, my suspicion is that there's something subtly wrong with the stream. The logging I've added is to look at the sample timestamps being extracted (after adjustment by presentation offset) from each period. I'm interested in checking they align properly with what the manifest indicates is there (i.e. they start at t~=0 and go up to t~=period_duration). When the stream was working flawlessly they looking absolutely fine, but that doesn't really tell us much :).

@ojw28
Copy link
Contributor

ojw28 commented Aug 2, 2018

In addition to checking the socket timeout problem, is it possible for you to increase the size of the live window for the stream? It's quite frustrating getting to the point where the issue reproduces, and not being able to rewind to go over that point again because it's dropped off the back of the live window.

@bhjortsberg
Copy link
Author

Yes, I can increase the live window. And I will try to make the connection more stable.

@bhjortsberg
Copy link
Author

You should have 20min live window now (at least when it has been running for 20min). Segments will be available for 60min.

@ojw28
Copy link
Contributor

ojw28 commented Aug 3, 2018

I think this issue occurs only when the DASH manifest is both multi-period and contains EventStream elements. We'll push a fix shortly, and would be grateful if you could verify on your end. Thanks!

@ojw28
Copy link
Contributor

ojw28 commented Aug 3, 2018

In addition, I'm fairly confident that the Event.presentationTime values are incorrect in the test stream. As per the DASH spec they should be relative to the start of the period, however they appear to be absolute (i.e. relative to the start of the live stream). This may cause an eventual hang, even after the fix we'll be pushing. I'm not too sure about this though.

@bhjortsberg
Copy link
Author

Nice! Yes, I'll be able to verify the fix.
You are probably right about the Event.presentationTime, thanks for pointing that out. I'll fix that on our side.

@bhjortsberg
Copy link
Author

If I remove the EventStream element everything works fine.

ojw28 added a commit that referenced this issue Aug 6, 2018
EventSampleStream.readData required eventStreamUpdatable to be
false for END_OF_STREAM to be returned, but it was only being
set to false when the manifest became static. This change also
sets it to false if the period is not the last one in a dynamic
manifest. It also renames it the parameter to appendable, since
that's what it really represents. Events may still be removed
from the start of the EventStream as the period starts moving
out of the live window.

Issue: #4492

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=207517265
@ojw28
Copy link
Contributor

ojw28 commented Aug 6, 2018

The ExoPlayer part of this issue is addressed by the change ref'd above. I'm not certain, but you may still encounter eventual freezing for as long as the presentation timestamps in the EventStream are incorrect. You should fix that on your side.

Thanks for the test stream; very helpful!

@ojw28 ojw28 closed this as completed Aug 6, 2018
ojw28 added a commit that referenced this issue Aug 6, 2018
EventSampleStream.readData required eventStreamUpdatable to be
false for END_OF_STREAM to be returned, but it was only being
set to false when the manifest became static. This change also
sets it to false if the period is not the last one in a dynamic
manifest. It also renames it the parameter to appendable, since
that's what it really represents. Events may still be removed
from the start of the EventStream as the period starts moving
out of the live window.

Issue: #4492

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=207517265
@bhjortsberg
Copy link
Author

I've verified the fix and it works! Thanks a lot!
Will this go into verion 2.8.3?

@ojw28
Copy link
Contributor

ojw28 commented Aug 6, 2018

Yes, it has been cherry picked already: e92d655

ojw28 pushed a commit that referenced this issue Sep 3, 2018
The current structure tries to associate events to media periods and windows
based on the reported values and the current timeline. However the reported
EventTime may not always be consistent in case the timeline doesn't contain
windows or media periods yet or not anymore.

The recent changes to MediaPeriodId allow to use it as a unique identifer for
media periods independent of the timeline. This enables more accurate tracking
of the media period queue and prevents reporting events with inconsistent
data.

Issue:#4492
Issue:#4634

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=210713120
@nicusorflorin

This comment has been minimized.

@ojw28

This comment has been minimized.

@nicusorflorin

This comment has been minimized.

@ojw28

This comment has been minimized.

@nicusorflorin

This comment has been minimized.

@ojw28

This comment has been minimized.

@google google locked and limited conversation to collaborators Dec 20, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants