Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add keystroke overlay #496

Open
wants to merge 38 commits into
base: main
Choose a base branch
from

Conversation

Utagai
Copy link
Contributor

@Utagai Utagai commented Jun 30, 2024

I'm sorry it took so long. This wasn't really my priority the last couple of weeks so I didn't give it as much time as I could have.

Adds functionality for injecting a keystroke overlay into a vhs recording. This addresses #164 & #468. A large portion of this work is adapted from my work on steno.

The main overview of the approach here is that:

  • We add support for Set KeyStrokes [Hide|Show] in tape files. This can be toggled on/off throughout the recording.
  • We wrap various types from rod, including rod.Page, rod.KeyActions and rod.Keyboard, and make it so that any time vhs codebase wants to "type" something in some fashion, it goes through these decorated wrappers. The wrappers will record keystrokes.
  • At the "end", when we go to call ffmpeg to generate output files, we use a series of drawtext filters based on the recorded keystroke events to render the keystroke overlay.

Testing was done via some extended/added unit tests, which can be found in this PR.
I also ran E2E tests, but the verification was manual and via eyeball since there's no good way to automatically test vhs in an E2E manner (at least currently). You can find my test cases here, which are executed via a Python script, also on that branch (here).

NOTE 1: This PR should be reviewable commit-by-commit if you prefer.
NOTE 2: Force-pushes were from me trying to fix my GPG signing configuration on WSL.

Demo

readme_demo

Open Questions

  • I currently use a series of symbols to print non-printable characters (e.g. space, control, etc). These can sometimes lead to issues if the system's font configuration uses a font that does not have the glyph in question and renders tofu. The user can fix this via fontconfig, which is what ffmpeg uses, but I'm open to changing the code to only use basic ASCII for example.
  • Should I regenerate all the example GIFs? They shouldn't really appear changed to the human eye, but the timing fix (73cf9f0) technically affects all recordings.
  • Should we continue to hardcode the overlay font size as the defaultFontSize, or should we match whatever the specified font size is?

Utagai added 12 commits June 30, 2024 16:58
ADd #	modified:   parser/parser_test.go
This, as you might expect, includes KeyStrokes coverage!
We're going to need this, at least temporarily, for supporting the incoming key
stroke recording types.
This currently includes a few different things:
- KeyStrokeEvent: Mostly taken from steno almost verbatim.
- KeyStrokeEvents: A wrapper around a slice of KeyStrokeEvents that helps us
package a little state with the events such that we can record exactly when they
should be rendered.
- Page/Keyboard: Another wrapper type around their equivalently named rod types.
These help us ensure that we continue to log the keypresses without substantial
changes to the call sites, and also minimizes the chance of bugs when command
execution is modified since all inputs go through these types.
I was hesitant about this approach at first, and technically still am to a
degree. However, ApplyLoopOffset() seems to be setting a precedent where
VideoOpts may be modified at a later point in the program's lifespan, so I think
this might be a fine approach.
Utagai added 17 commits June 30, 2024 17:18
This also makes it easier to interop with potential future changes that make the
overlay follow the set font family for the recording.
This allows us to not have to do things like draw an actual newline and instead
draw some readable symbol for it.
This is actually still too simple. We'll have to do testing with the margin or
whatever settings in vhs.
This also gets rid of the minor TODO comment about
an empty events slice, which could technically
happen.
Makes it so that the keystroke overlay's drawing
agrees with the specified margin, otherwise, we'd
be drawing on the margin which is probably OK, but
doesn't make as much sense as honoring the margin.

For padding, I opted to just reuse
opts.Style.Padding for consistency.
Before, we were either showing all the overlay the
moment things were turned on, or never at all. Now
we properly support turning it on/off in the
middle of the recording.
We weren't actually using them, and in general
from our talks with Maas, we want to err on
implementing less in the first implementation, and
add more as requests come in.
This is mostly to avoid confusion with so-called
'keypress commands'.
This also fixes the time-drift that became more and more pronounced as the
recording proceeded.
We need to count the runes in the conditional check, not just the truncating of
the display string. Simple oversight.
Does some tune-up of our special symbols:
- Delete now uses its correct symbol
- PageUp & PageDown is replaced with their correct symbols
- Escape replaced with left diagonal arrow (way more readable too)
- Alt replaced with the option key symbol, not sure how I feel about this.
- Control replaced with ^.
- Shift added.

It should be noted that "correct" here is a bit misleading. I don't know if
there are any 'standard' symbols for some of these keystrokes, but I'm just
choosing ones that seem to map to the key when you search them up online.
@Utagai Utagai marked this pull request as ready for review July 3, 2024 04:42
@Utagai Utagai requested a review from maaslalani as a code owner July 3, 2024 04:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant