High memory pressure with live dash playback #6070
Comments
Updating the issue from a Slack chat with @avelad. The image above is from a test run with the nightly build, but compiled. It is strange, because the eviction logic seems correct, yet we still see the references being retained. In any case, even with no DVR there is some memory pressure with the new version versus the old one (we were on 4.3.0 and planning to upgrade). We are going to test in which version this behaviour started so we can track down the change that caused it.
On our side, Tizen and WebOS seemed to be crashing with out-of-memory errors. We rolled back to 4.5.0 for now, as it seems stable.
Interesting. I will start to investigate this a bit too. Maybe using …
Thanks for the feedback @ricardo-a-alves-alb, and thanks @nrcrast. It sounds like you might know a possible root cause. We were about to start bisecting to find the PR that introduced this, but from these comments maybe you can spot it faster than us.
It's probably just the PR that introduced the TimelineSegmentIndex -- #5061. I'm not 100% sure, but that was a large change, and I'd imagine that if there was a memory issue with it, it would have been there since the beginning.
I remember this PR, and I actually think @avelad mentioned it to me on another issue related to performance on smart TVs. We didn't notice any improvement in VST specifically and assumed our app might be consuming too much memory to feel the gains, so we started improving more on the front-end side.
I think I've found one of the issues -- the … At that point, the …
@nrcrast thanks for the update. Any chance you are planning to push this fix so we can test it?
Hi! I will push at some point this week. I want to do some long-running tests here first so I'm confident in the fix before I push anything. I can push a branch at some point if you want to do some preliminary testing before I open a PR.
@nrcrast for sure, if you can set up a side branch we can build and test from it.
You mean that in the release function of "shaka.dash.TimelineSegmentIndex" we should ensure the segment references are released? Happy to check your branch, I'm testing it too.
It cannot be released in the … The issue with releasing everything in the overridden … I am still chasing down the issue here -- it seems that, for some reason, even though I'm setting … I've just made a branch with this change in it, as well as using splice, if you also want to do some testing:
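For readers following along, here is a minimal sketch of the splice-based eviction and release idea being discussed; this is not the actual Shaka Player source, and the class and field names are illustrative assumptions only:

```js
// Hypothetical sketch of the splice-based eviction idea (not the actual
// Shaka Player source; class and field names are illustrative only).
class TimelineSegmentIndexSketch {
  constructor(timeline) {
    /** @private {!Array<{start: number, end: number}>} */
    this.timeline_ = timeline;
  }

  evict(time) {
    // Count how many entries end before the eviction time.
    let numEvicted = 0;
    while (numEvicted < this.timeline_.length &&
           this.timeline_[numEvicted].end <= time) {
      numEvicted++;
    }
    // splice() trims the existing array in place and drops the evicted
    // entries, instead of slice(), which would allocate a new copy on
    // every call and leave the old array alive until the next GC cycle.
    this.timeline_.splice(0, numEvicted);
  }

  release() {
    // Dropping the array reference lets the GC reclaim the remaining
    // timeline entries and anything they point to.
    this.timeline_ = [];
  }
}
```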
Hi @nrcrast, I tried to chase it down to the segment reference's "partialReferences" and "initSegmentReference", but I'm not sure that's correct. Thanks for the information about release; in the end it's what I thought after reading the code more deeply.
Hello. It seems that the memory leak and the high allocation rate are still present in the timeline-segment-index-free-all branch.
Interesting. From what I can tell, the eviction logic in the TimelineSegmentIndex seems to work pretty much the same way as the eviction logic in the base SegmentIndex, plus it only creates SegmentReference objects when it's asked to. So I'm surprised at the number still remaining. That's definitely concerning. Have we checked if #5762 had any impact? I have no reason to suspect that it did, but it would be an interesting test.
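As a rough illustration of the "only creates SegmentReference objects when it's asked to" behaviour mentioned above, assuming a much-simplified index (not the real TimelineSegmentIndex API):

```js
// Simplified sketch of on-demand reference creation (illustrative only).
class LazyIndexSketch {
  constructor(timeline, initSegmentReference) {
    this.timeline_ = timeline;  // lightweight {start, end, uri} records
    this.initSegmentReference_ = initSegmentReference;
  }

  // A full reference object is only built when a position is requested,
  // so at steady state the index mostly holds small timeline records
  // rather than one heavyweight reference per segment.
  get(position) {
    const entry = this.timeline_[position];
    if (!entry) {
      return null;
    }
    return {
      startTime: entry.start,
      endTime: entry.end,
      uris: [entry.uri],
      // Every created reference points back at the shared init segment
      // reference, which is why that object stays alive as long as any
      // outstanding reference does.
      initSegmentReference: this.initSegmentReference_,
    };
  }
}
```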
Some more interesting info! Here is a comparison of my fork of v3.3.10 on the left and current shaka main on the right. Both contain the Timeline Segment Index, but my fork does not contain #5762 or any other changes that happened since my initial contribution. Going to play the "binary search" game and see if I can reproduce similar results using official shaka releases and not my own fork.
"Have we checked if #5762 had any impact? I have no reason to suspect that it did, but it would be an interesting test." I just tagged and checked 2 versions from timeline-segment-index-free-all branch on web: a commit right before #5762 (#5848 ) and #5762 itself. Looks like the heap size is quite stable, no known memory leak being detected: |
It seems that I found the root cause (checked on web and webOS 6 for the memory leak so far): #5899. The memory allocation magnitude is still high on webOS 6 for DASH live playback:
Interesting. I don't fully understand, though, what the real difference is or how that PR could cause this. Even before that PR, the TSI was holding a reference to an initSegmentReference -- that PR just makes it so that reference changes periodically. It's also unclear to me why my … Something isn't quite adding up for me 🤔
We are testing it commit by commit, so it's hard to tell. One thing that can be a gotcha is that the code we see and the code we ship are not the same in JS land. The code change may look safe, but the compiled code might optimize it and create some closure.
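A contrived example of the kind of accidental retention being described, where a closure keeps a large structure alive even though the code looks like it released it:

```js
// Contrived example: a closure can keep data alive that the author
// believes was released.
function makeIndex() {
  const bigTimeline = new Array(1e6).fill({start: 0, end: 2});

  const index = {
    get: (i) => bigTimeline[i],        // the closure captures bigTimeline
    release: () => { /* forgets to drop bigTimeline */ },
  };
  return index;
}

const idx = makeIndex();
idx.release();
// bigTimeline is still reachable through idx.get, so the GC cannot
// reclaim it until idx itself becomes unreachable.
```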
@Illmatikx I think this has always been the case, though. Before the Timeline Segment Index was introduced, all of these SegmentReferences were created at manifest parse time, and each SegmentReference had a ref to the InitSegmentReference. With the introduction of the Timeline Segment Index, these SegmentReferences are only created upon retrieval (…).

@OrenMe The only difference between your two tests was that single commit for #5899? It just seems like such an innocuous change!

@avelad any idea? Maybe this is some weird compilation issue, but on a linear stream SegmentReferences will be periodically evicted. They should then be garbage collected. As long as there is no other reference to the InitSegmentReference (which I tried to fix in my branch), the InitSegmentReference should be collected as well eventually. In practice, I must be missing something, because it's not happening that way 😆.
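To make the reachability argument concrete, a small illustrative sketch (hypothetical objects, not Shaka's classes) of why a single leftover field can keep the InitSegmentReference alive even after every SegmentReference has been evicted:

```js
// Illustrative only: reachability is what matters to the GC, not eviction.
const initSegmentReference = {uris: ['init.mp4']};

let references = [
  {startTime: 0, endTime: 2, initSegmentReference},
  {startTime: 2, endTime: 4, initSegmentReference},
];

// "Evicting" the segment references removes one path to the init reference...
references = [];

// ...but if the index itself still stores it in a field, that single path
// keeps it reachable, so it is never collected:
const index = {initSegmentReference};

// Only once every such field is cleared (for example, setting
// index.initSegmentReference = null on release) can the object actually
// be garbage collected.
```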
Apart from that, do you think it is appropriate to create a new InitSegmentReference …? Shouldn't we check whether the properties used to create the InitSegmentReference have actually changed?
Yes, please do it.
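A hedged sketch of the reuse idea proposed above; the helper name, fields, and comparison are assumptions for illustration, not the actual Shaka Player API:

```js
// Sketch of reusing the previous InitSegmentReference when the inputs that
// would produce it have not changed (names are illustrative assumptions).
function maybeCreateInitSegmentReference(previous, uris, startByte, endByte) {
  const sameInputs = previous &&
      previous.startByte === startByte &&
      previous.endByte === endByte &&
      previous.uris.length === uris.length &&
      previous.uris.every((uri, i) => uri === uris[i]);

  if (sameInputs) {
    // Returning the existing object avoids allocating a new reference on
    // every manifest update, which is what drives allocation-rate pressure
    // on low-end devices.
    return previous;
  }
  return {uris, startByte, endByte};
}
```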
@Illmatikx Can you share the same comparison against 4.8.1? Thanks!
@avelad that is the plan from my side, but I am blocked at the moment by #6533.
@Illmatikx Yesterday we released 4.8.3, can you test on WebOS if this issue is fixed now?
Closing due to inactivity. If this is still an issue for you or if you have further questions, the OP can ask shaka-bot to reopen it by including …
Hello @avelad, sorry for the late response, but I managed to take a glance at the 4.8.4 version with our demo app. The test targeted a start-over video with a 15h35m timeShiftBufferDepth, so it was a bit of an edge case.
@Illmatikx can you test if #6610 (comment) reduces the pressure? (It seems like a bit of an ugly hack, but I would be willing to add it if it works for everyone.) Thanks!
In version 4.10.6 we have introduced some improvements. Can you validate whether they are enough, or do we need more? Thanks!
Closing due to inactivity. If this is still an issue for you or if you have further questions, the OP can ask shaka-bot to reopen it by including …
Have you read the FAQ and checked for duplicate open issues?
Yes
If the problem is related to FairPlay, have you read the tutorial?
Not applicable
What version of Shaka Player are you using?
4.7.0
Can you reproduce the issue with our latest release version?
Not tested, but compared the branch against 4.7.0
Can you reproduce the issue with the latest code from main?
Not tested, but compared the branch against main
Are you using the demo app or your own custom app?
Custom app (just a page with shaka player)
If custom app, can you reproduce the issue using our demo app?
Not tried
What browser and OS are you using?
Web: Windows, Chrome 120.0.6099.199 (Official Build) (64-bit)
webOS 4/6: Chromium 53 and 79, respectively
Tizen 4: Chromium 56
For embedded devices (smart TVs, etc.), what model and firmware version are you using?
webOS 4: 43UM7300PLB (2019)
webOS 6: 43UP81006LA (2021)
Tizen 4: UE43NU7400U (2018)
What are the manifest and license server URIs?
I will send you the URL via email.
What configuration are you using? What is the output of player.getConfiguration()?
Player configuration is attached.
player_config_web.txt
What did you do?
Started playback of an identical live DASH stream with Shaka versions 4.3.0 and 4.7.0 and kept it running for up to 2 hours. I monitored memory with different tools depending on the platform; on web, I took heap dumps every 10-30 minutes.
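For context, the lightweight sampling used between heap dumps can be approximated in Chromium-based environments with the non-standard performance.memory API (a rough sketch; the interval and logging are arbitrary choices):

```js
// Rough sketch: periodically log JS heap usage during playback.
// performance.memory is non-standard and Chromium-only; its values are
// coarse, so heap snapshots are still needed for real analysis.
setInterval(() => {
  if (performance.memory) {
    const usedMb =
        (performance.memory.usedJSHeapSize / (1024 * 1024)).toFixed(1);
    console.log(`used JS heap: ${usedMb} MB`);
  }
}, 10 * 60 * 1000);  // every 10 minutes, matching the test cadence
```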
What did you expect to happen?
The memory footprint should not keep growing after the buffer fills with 4.7.0 (as is the case with 4.3.0). Also, the memory allocation rate for 4.7.0 should be as small as for 4.3.0 (tested on smart TV devices with webOS 4/6 and Tizen 4).
What actually happened?
The Shaka 4.7.0 memory footprint grew over time during live playback (no interaction with the custom demo app; tested on web and webOS 4/6). The memory growth appears to depend on the DVR window size, since it stabilized after 60 minutes of testing on web (timeShiftBufferDepth is 1 hour). During two tests I observed the heap reaching 39 and 68 MB respectively, whereas with Shaka 4.3.0 it stayed around 11-12 MB.
Another issue: the memory allocation/cleanup magnitude is much higher for 4.7.0 compared to 4.3.0 under the same testing conditions (tested on webOS 4/6 and Tizen 4 with a 1h DVR). The consequence of high memory pressure on low-end smart TV devices is UI freezing during interaction with the app, as the OS and browser are busy with memory housekeeping tasks such as swapping, GC, and OS-level memory allocation delays. Plain live playback with our own app started to freeze on the webOS 4 (2019) device after ~15 minutes without any UI interaction.
The heap dump comparison points to memory kept by SegmentIndex/XML DOM data structures. As for the high memory allocation rate, I suspect the evict() changes in the TimelineSegmentIndex class: slice() was added on the timeline and references objects, which creates shallow copies of the arrays (the relevant changes may be in #5061).
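To illustrate the slice()-versus-splice() concern in the last paragraph, a contrived comparison (not the actual evict() implementation):

```js
// Contrived comparison of the two eviction styles (not Shaka's code).
function makeTimelineEntries(n) {
  return Array.from({length: n}, (_, i) => ({start: i, end: i + 1}));
}

let timeline = makeTimelineEntries(3600);  // e.g. one entry per second of DVR

// slice(): every eviction allocates a brand-new array, so a long DVR
// window churns through large short-lived allocations on each update.
function evictWithSlice(numEvicted) {
  timeline = timeline.slice(numEvicted);
}

// splice(): the existing array is trimmed in place; only the removed
// entries become garbage.
function evictWithSplice(numEvicted) {
  timeline.splice(0, numEvicted);
}
```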