-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Persist filters in between segments. #179
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One cosmetic change request, otherwise looks good!
// Filter: The challenge here is around fps filter adding and dropping frames. | ||
// The fps filter expects a strictly monotonic input pts: frames with | ||
// earlier timestamps get dropped, and frames with too-late timestamps | ||
// will see a bunch of duplicated frames be generated to catch up with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if segment missed? Like, orchestrator receives segment 1 and then segment 10?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question. This actually works correctly. The fps filter duplicates the last frame to make up for the "missing frames" if it detects insertion(s) are necessary between the current frame and the last frame in order to maintain a smooth rate of output. If the current and last frames are "far apart", this does result in outputting a zillion duplicates to fill the gap.
But since the last frame is a flush frame, the duplicates are dropped before hitting the encoder. This leads to the final output still being correct. We actually have a test that implicitly checks this behavior via alternating timestamps in 4ab522f which encodes segments 1 and 3 after dropping segments 0 and 2. But I'll see about adding an explicit check for skipped segments.
This brings up another issue I had overlooked until now. Since the fps filter comes before the scale filter, all duplicated frames get rescaled, then dropped. This is additional computation that really should be avoided since it is a clear attack vector: send artificial timestamps that are years apart and pin down the CPU / GPU as the filtergraph churns through all the duplicated frames.
Lines 213 to 217 in d03c749
var filters string | |
if param.Framerate > 0 { | |
filters = fmt.Sprintf("fps=%d/1,", param.Framerate) | |
} | |
filters += fmt.Sprintf("%s='w=if(gte(iw,ih),%d,-2):h=if(lt(iw,ih),%d,-2)'", scale_filter, w, h) |
An easy fix for this would be to rearrange the filter string such that the fps adjustment comes after the rescaling, but it also means doing unnecessary work if the fps is being adjusted downwards (eg, 60fps -> 30fps) which is a rather common case (and that's the reason for the filters being ordered the way they are right now). But given the choice between avoiding redundant work, and protecting against an easy DoS, it'd be better to rearrange the filters to avoid DoS.
That being said, I think it might be good to spend a little more time figuring out how to support alternating timestamps without making the code even more terrible than it already is, because I suspect we'll be running into ordering issues often with retries etc. Supporting alternating timestamps should allow us to preserve the current filtergraph ordering while mitigating the DoS.
Note that we've always had an issue with the user requesting an extremely high fps output, but theoretically that could be addressed by short circuiting the transcode as described in #129
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it'd be better to rearrange the filters to avoid DoS.
I agree we should do this until we implement support alternating timestamps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this one livepeer/go-livepeer#1422 will be an issue with persistent filtergraph too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this one livepeer/go-livepeer#1422 will be an issue with persistent filtergraph too?
It shouldn't, since timestamps are carried as int64 everywhere, and most sane formats can accommodate rollovers (including mpegts and RTMP). Dunno what the issue is with that ticket.
Filters rearranged and skip test added in c5d1824
This helps ensure that changes to sensitive areas that munge pts such as the filtergraph continue to produce the correct results.
More sanity checking.
This is a useful test and we should always run it for software.
Rebased on top of master and squashed the fixups from earlier (and also probably some other small modifications which I've lost track of). But here are some relevant changes from the past day or so: c5d1824 Filters rearranged due to the DoS issue mentioned in #179 (comment) and tests also added for skipped segments. The compute savings are real with the rearranged filters: with two skipped segments (240 frames), the test runs roughly 20% faster on my machine. Or seen in another light, we've avoided a 20% slowdown. But we may still end up a bit slower here due to redundant rescaling in downsampling the framerate. 6d66e5b Added a pts check for framerate pass through. Good thing this was added, because there was a bug in the timestamp rewriting logic. 6c4daa9 Aforementioned bug fixed. See the commit for details on what it does. 33c7a90 Cosmetic changes as requested in #179 (comment) .
This was a bit of a silly bug. The memory leak PR ( https://github.com/livepeer/lpms/pull/177/files ) unintentionally introduced a small change meant for this patch series (the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Re-initializing the filtergraph turns out to be expensive, especially if GPUs are in the picture. We get around 30% more performance by persisting the filtergraph. In order to flush the filter, we cache the most recent audio/video frame and feed those into the filter repeatedly with a sentinel value set (`AVFrame.opque`). Once we receive a frame from the filtergraph with the sentinel set, we know the filter has been completely flushed.
This is a consequence of persisting the fps filter which expects pts timestamps to be monotonic. Non-monotonic frames are dropped. Accommodating out-of-order segments would involve totally rewriting output timestamps based on the input value from the source frame, while forging the pts that the filter receives.
Also add a test for skipping segments.
Rebased and merged. |
This PR fixes GPU performance and VRAM growth issues by reusing the filtergraph in between segments. With this, LPMS is about equal to ffmpeg. Compared to previous LPMS versions, this gives us roughly a 30% performance gain for the workload of 1080p -> 720p,576p,460p,360p .
Note: Github for some reason refuses to show commits in the order they appear in the tree, and insists on incorrectly showing them chronologically based on commit time. The below is the correct ordering in tree, as will be reflected in a pull. The ordering is somewhat important, as some tests are added prior to the bulk of the functional modifications as a guard against behavioral changes. Ideally, nothing about the behavior as a result of this PR should change; it's just an extensive plumbing fix for performance.
Mostly ready for review, but not entirely comfortable with this yet ; in particular, SW encoding seems to show some differences at higher frame rates between 97c48b3 and a03f838 .
eab2b37 ffmpeg: Add tests for first and last PTS values. …
This helps ensure that changes to sensitive areas that munge pts
such as the filtergraph continue to produce the correct results
97c48b3 ffmpeg: Check both software and Nvidia frame counts. …
More sanity checking.
a03f838 ffmpeg: Persist filtergraph in between transcodes. …
Re-initializing the filtergraph turns out to be expensive,
especially if GPUs are in the picture. We get around 30% more
performance by persisting the filtergraph.
In order to flush the filter, we cache the most recent audio/video
frame and feed those into the filter repeatedly with a sentinel
value set (
AVFrame.opque
). Once we receive a frame from thefiltergraph with the sentinel set, we know the filter has been
completely flushed.
4ab522f ffmpeg: Disallow out-of-order segments. …
This is a consequence of persisting the fps filter which expects
pts timestamps to be monotonic. Non-monotonic frames are dropped.
Accommodating out-of-order segments would involve totally
rewriting output timestamps based on the input value from the
source frame, while forging the pts that the filter receives.
00fe946 ffmpeg: Log improvements and cosmetics.
00163c5 ffmpeg: Documentation updates.