<transcript> element inside <video> and <audio> for deafblind users #45

Closed
accessabilly opened this issue Jan 20, 2022 · 8 comments

@accessabilly

Transcripts of multimedia are essential for deafblind users

I previously submitted this elsewhere, but was told it would be a better fit here.

Deafblind multimedia users need everything in a machine-readable text format so that a screen reader can display it on a refreshable braille device. Closed captions in videos are not usable text for them: although some modern braille readers can display the captions, the captions change too quickly to be read in real time as the video plays. Also, there is no easy way to access the captions separately from the video, even if the captions are in a text format. A separate transcript of the audio or video is the only way that multimedia content can be made accessible to deafblind users.

Use Cases

There are ways to create transcripts already, like placing transcript content into a separate markup container after the multimedia content. But this technique has accessibility and usability issues:

  • There is no universal role for transcripts: anything can be a transcript and not be named as such. For deafblind users, it's hard to find a transcript for a multimedia element. The closeness to the multimedia element and the naming by authors are the only indicators for something that works as a "transcript".
  • Transcripts are not connected to the <video> or <audio>, either directly or semantically.
  • A connection via aria-describedby to the multimedia content is not usable for deafblind users, because it does not allow pausing or navigating the text in a screen reader. It makes the screen reader read the whole thing at once.
  • A large transcript would need a skip link placed before it to make it possible for other users to skip the content.
  • There should be an accessible way to show or hide the transcript for all users, like there is for closed captions (for example through buttons in the UI of the multimedia element).
  • The duty to prepare a transcript rests solely with the author. The necessity for a transcript is not obvious to developers/authors.

Goals

  • Allow a <transcript> element inside the <video> or <audio>, which ensures a semantic connection and controllability via the multimedia player. It should be possible to place sectioning content and flow content inside, like in a <section>.
  • To make this usable for all users, a <transcript> should be reflected by a button in the multimedia player to show/hide the transcript, similar to what already exists for closed captions.
  • It should be possible to reference a whole HTML document as a transcript, maybe something like this: <transcript src="/transcript.html">. An embedded solution like an <iframe> would be possible, but would have the same security and accessibility issues.
  • This would also allow referencing a <section> of the current document as the transcript, like this: <transcript src="#my-custom-transcript">.
  • A reference outside the multimedia element like this should be supported by a new ARIA role, as in <div role="transcript" id="my-custom-transcript">. A custom transcript outside the multimedia element would allow custom styling.
  • A native HTML element would highlight the importance of a transcript for deafblind users, who can't access multimedia in any way other than text.
  • Search engines and their users could also benefit from a semantically correct transcript.
  • Missing <transcript> elements could be automatically detected via automated testing tools like HTML Validator, Lighthouse, AXE, WAVE etc.
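
The automated detection mentioned in the last goal could look roughly like the sketch below. It uses Python's standard html.parser module and assumes the <transcript> element proposed in this issue, which is not part of HTML today. The names TranscriptAudit and find_media_without_transcript are made up for illustration and do not come from any existing tool.

```python
from html.parser import HTMLParser

MEDIA_TAGS = {"video", "audio"}


class TranscriptAudit(HTMLParser):
    """Collect <video>/<audio> elements that contain no <transcript>.

    <transcript> is the hypothetical element proposed in this issue.
    """

    def __init__(self):
        super().__init__()
        # Stack of [tag, line, has_transcript] for media elements still open.
        self._open = []
        # (tag, line) for each closed media element without a transcript.
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag in MEDIA_TAGS:
            line, _column = self.getpos()
            self._open.append([tag, line, False])
        elif tag == "transcript" and self._open:
            # Mark the innermost open media element as having a transcript.
            self._open[-1][2] = True

    def handle_endtag(self, tag):
        if tag in MEDIA_TAGS and self._open and self._open[-1][0] == tag:
            name, line, has_transcript = self._open.pop()
            if not has_transcript:
                self.missing.append((name, line))


def find_media_without_transcript(html):
    """Return (tag, line) pairs for media elements lacking a <transcript>."""
    parser = TranscriptAudit()
    parser.feed(html)
    parser.close()
    return parser.missing


if __name__ == "__main__":
    page = (
        "<video>\n"
        '  <source src="/video.mp4" type="video/mp4">\n'
        "  <transcript><p>Lorem ipsum</p></transcript>\n"
        "</video>\n"
        '<audio src="/talk.mp3"></audio>\n'
    )
    for tag, line in find_media_without_transcript(page):
        print(f"<{tag}> starting on line {line} has no <transcript>")
```

The sketch only reports media elements that have an explicit end tag, and it does not follow src references; a real checker would need to handle both.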

Examples

Example 1: embedded <transcript>

<video>
    <source src="/video.mp4" type="video/mp4">
    <transcript>
        <h1>Transcript for my Video</h1>
        <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</p>
    </transcript>
    Sorry, your browser doesn't support embedded videos.
</video>

Example 2: embedded <transcript> with reference in the document

<video>
    <source src="/video.mp4" type="video/mp4">
    <transcript src="#my-transcript" />
    Sorry, your browser doesn't support embedded videos.
</video>
<div role="transcript" id="my-transcript">
    <h1>Transcript for my Video</h1>
    <p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</p>
</div>
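
A fragment reference as in Example 2 only helps if it actually resolves, so a checker might verify that every <transcript src="#..."> points at an element with that id and role="transcript". The following is a rough sketch under the assumptions of this proposal: neither the <transcript> element nor the transcript role exists today, and the names TranscriptRefs and unresolved_transcript_refs are hypothetical.

```python
from html.parser import HTMLParser


class TranscriptRefs(HTMLParser):
    """Record <transcript src="#id"> references and the role of every id."""

    def __init__(self):
        super().__init__()
        self.refs = []   # fragment ids referenced by <transcript src="#...">
        self.roles = {}  # id -> role attribute value (None if no role is set)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        if tag == "transcript" and src.startswith("#"):
            self.refs.append(src[1:])
        if attributes.get("id"):
            self.roles[attributes["id"]] = attributes.get("role")


def unresolved_transcript_refs(html):
    """Return referenced ids with no matching role="transcript" element."""
    parser = TranscriptRefs()
    parser.feed(html)
    parser.close()
    return [ref for ref in parser.refs if parser.roles.get(ref) != "transcript"]


if __name__ == "__main__":
    page = (
        "<video>"
        '<source src="/video.mp4" type="video/mp4">'
        '<transcript src="#my-transcript" />'
        "</video>"
        '<div role="transcript" id="my-transcript"><p>Lorem ipsum</p></div>'
    )
    print(unresolved_transcript_refs(page))
```

A check like this would also catch a mismatch between the id used in src and the id on the target element, for example a misspelled id.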

Privacy & Security Considerations

Privacy

I think it would be possible to monitor and track whether the <transcript> was viewed if it is toggled via a control in the multimedia element. But that does not reveal data about the person viewing it; the viewer could just as well be a robot as a human.

Security

Assuming that <transcript> can work like an <iframe>, it could have the same security issues.

@tomayac

tomayac commented Jan 20, 2022

For the status quo, I guess this is how I would have marked this up with the solutions of today. It obviously does not solve all points from your list, but it's valid (ignore the data URL issue). The semantic association between the transcript and the video is the figure.

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>Foo</title>
  </head>
  <body>
    <figure>
      <video controls>
        <source src="foo.mp4" type="video/mp4" />
      </video>
      <figcaption>
        <details>
          <summary>Transcript</summary>
          <section>
            <h1>Foo</h1>
            <p>Bar</p>
          </section>
        </details>
      </figcaption>
    </figure>
  </body>
</html>

I don't claim that this is the solution or to be an expert, simply was wondering how I would solve this today.

@Malvoz

Malvoz commented Jan 20, 2022

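Apparently, there already exists a proposal for <transcript>:

* https://w3c.github.io/html-transcript/html-transcript-src.html

* https://www.w3.org/WAI/PF/HTML/wiki/Full_Transcript

/cc @chaals

Other useful resources:

* https://www.w3.org/WAI/media/av/transcripts/

* https://www.w3.org/WAI/perspective-videos/captions/#transcript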

@accessabilly
Author

Apparently, there already exists a proposal for <transcript>:

* https://w3c.github.io/html-transcript/html-transcript-src.html

* https://www.w3.org/WAI/PF/HTML/wiki/Full_Transcript

/cc @chaals

Other useful resources:

* https://www.w3.org/WAI/media/av/transcripts/

* https://www.w3.org/WAI/perspective-videos/captions/#transcript

OK, I was not aware of that. Thank you for the notice.

@accessabilly
Author

For the status quo, I guess this is how I would have marked this up with the solutions of today. It obviously does not solve all points from your list, but it's valid (ignore the data URL issue). The semantic association between the transcript and the video is the figure.

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>Foo</title>
  </head>
  <body>
    <figure>
      <video controls>
        <source src="foo.mp4" type="video/mp4" />
      </video>
      <figcaption>
        <details>
          <summary>Transcript</summary>
          <section>
            <h1>Foo</h1>
            <p>Bar</p>
          </section>
        </details>
      </figcaption>
    </figure>
  </body>
</html>

I don't claim that this is the solution or to be an expert, simply was wondering how I would solve this today.

There are ways to make a transcript and connect it semantically, sure. My point was: it is not enforced by the web standard. It depends totally on the author/developer and therefore has no universal role, which it should have for braille users to make it easier to find it.

@tomayac

tomayac commented Jan 20, 2022

[I]t is not enforced by the web standard. It depends totally on the author/developer and therefore has no universal role, which it should have for braille users to make it easier to find it.

Not sure how Braille readers represent this in practice, but the HTML spec says that "[t]he figcaption element represents a caption or legend for the rest of the contents of the figcaption element's parent figure element, if any".

@accessabilly
Author

Yes, this is perfectly right. But the use of a <figure> and <figcaption> for a transcript is not mandatory. People use a simple <div> or other markup. There is no standard way of doing this, as there is for captions with a <track>. But there should be one, to ensure universal naming and connection. It should not be the developer's duty.

@accessabilly
Author

As there is an official proposal for a <transcript> element, I decided to close this issue. Sorry for the noise...

@Malvoz

Malvoz commented May 10, 2022

@accessabilly I don't decide what's accepted. I didn't mean to indicate that this (or the WHATWG issue) should be closed. I only wanted to point to the existing proposal as a means to indicate there's interest in this elsewhere. Feel free to re-open, especially since the existing HTML proposal hasn't been brought up here or at WHATWG yet.
