Wright State Online

Closed Captioning for Audio and Video

Closed Captioning for Audio and Video Overview

Captions are written versions of audio content, displayed in sync with the audio/video source. Captions ensure that audio/video material is accessible for all users, including deaf or hard-of-hearing learners, non-native English learners, and learners with cognitive disabilities. Captions are also helpful for students working in loud environments, or those unable to turn their sound on.

All audio/visual material in a course must have closed captions. Transcripts alone will not suffice.

Note: A closed caption is a caption track that is separate from the audio/video source and can be toggled on or off by the viewer. Open captions are always visible because they are integrated into the video file itself.

Captions are necessary in all classroom settings:

Synchronous online classes

WebEx provides real-time automatic closed captions for online class sessions.

In-person classes

Any audio/visual materials shared in a lecture should have closed captions enabled or added to the media.

Asynchronous material

Faculty must generate and add closed captions to all audio/visual content in their asynchronous courses.

For help getting started with creating your own captions, refer to WSO’s guide for creating Automatic Speech Recognition (ASR) closed captions in Panopto.

Closed Captioning for Audio and Video (PDF)

 

After generating your ASR closed captions, use the information below to proofread and edit them in accordance with ADA and WCAG standards.

Closed Captioning Best Practices

The following information is adapted from the Described and Captioned Media Program’s Captioning Key and 3Play Media’s Ultimate Guide to Closed Captioning.

Developments in automatic speech recognition technology have made generating closed captions much easier, but it is important to take the time to ensure your captioned material meets the needs of your students.

Captions must be accurate, consistent, clear, readable, and equal in their implementation.

1. Accurate

Strive for error-free captions. The ADA and FCC require at least 99% accuracy, while unedited automatic speech recognition is typically only 80–95% accurate.

  • No spelling errors.
    • This is especially important for technical and discipline-specific terminology.
  • Correct punctuation (e.g., commas, quotation marks, exclamation points, question marks).
  • Proper capitalization.
  • Accurate grammar.
  • Use of the Oxford comma in lists.
2. Consistent

Uniformity in style and presentation.

  • All captions must have the same visual style (color, font, location on screen).
    • White text and a black background are the default and recommended choices.
  • If a word can be spelled multiple ways, choose one and stick to it for all your media.
  • Maintain consistency in use of acronyms.
  • Choose a method of writing out numbers and be consistent with it.
3. Clear

Create a complete textual representation of the audio.

  • If multiple speakers are present in the media, identify who is speaking and when.
    • Use parentheses to indicate speaker information.
    • Speaker identification should be on its own line above the caption.
    • If the speaker’s name is known, use their name for the identification.
    • If the speaker is unknown, use a descriptive title. (e.g., (Child) or (Narrator #1))
  • Include slang, quirks of speech, or profanity exactly as spoken.
  • Hesitation or stuttering should be included in the closed captions.
    • This can be cleaned up for better pacing and clarity in instructional material.
  • Use an ellipsis when there is a significant pause or the speaker trails off.
  • Relevant music is given an objective description in brackets (e.g., [upbeat pop music]).
  • Pertinent non-speech information is depicted in brackets (e.g., [dog barking]).
  • Only use all-caps text if the speaker is shouting, not for emphasis.
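The speaker-identification and bracket conventions above can be sketched in WebVTT, the caption format accepted by Panopto and most video players. The timings, names, and dialogue below are invented for illustration:

```
WEBVTT

00:00:01.000 --> 00:00:04.000
(Dr. Rivera)
Today we'll review the syllabus.

00:00:04.500 --> 00:00:06.500
[dog barking]

00:00:07.000 --> 00:00:10.500
(Narrator #1)
Well... I'm not sure about that.
```

Note that each speaker identification sits on its own line above the caption text, non-speech sounds appear in brackets, and an ellipsis marks the speaker trailing off.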
4. Readable

Ensure users can easily follow and understand the captions.

  • All captions are synchronized with audio from start to finish.
  • Captions must be displayed long enough to be read completely.
  • Use no more than two lines of text on screen at once.
  • Unnecessary “um” or “uh” filler sounds may be omitted for clarity.
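The pacing guidelines above can also be sketched in WebVTT: rather than crowding more than two lines on screen, a long sentence is split across two synchronized cues, each displayed long enough to read. Timings and wording here are invented for illustration:

```
WEBVTT

00:00:12.000 --> 00:00:15.500
The mitochondria supply most of the
chemical energy a cell needs,

00:00:15.500 --> 00:00:18.500
which is why they are often called
the powerhouse of the cell.
```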
5. Equal

Complete parity of user experience for all audio/visual material.

  • Caption the entire video.
  • Do not paraphrase or omit any spoken or meaningful sounds.
  • If captioning a completely silent video, include a single caption that says [silence] for the first five seconds.
  • Include descriptive words in brackets to convey the style of speech as needed (e.g., [sarcastic], [laughing], [angry], [whispering]).
  • Make captions readily available without requiring an accommodation request.
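As a sketch, the silent-video rule above would look like this in WebVTT (the timing values are illustrative):

```
WEBVTT

00:00:00.000 --> 00:00:05.000
[silence]
```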


Additional Resources

Note: WSO and CaTS do not provide support for the following programs. Direct troubleshooting and other inquiries to the software’s customer support.

Automatic Speech Recognition Closed Captioning Alternatives

For an alternative to Panopto, you can create captions using automatic speech recognition on YouTube. As with all ASR captions, you must proofread and edit anything YouTube generates before including it in your course. For more information on this, see YouTube’s help section on automatic captioning.

Features for the following alternatives may be limited based on your account’s plan. Sign-up is required.

Manual Captioning Alternatives

The following tools can help you create closed captions but lack speech recognition, so all captions must be typed by hand. These tools can also import existing captions or transcripts and let you edit text and adjust timing.