Many conferences have moved online this year due to the pandemic, and many attendees are expecting captions on videos (both live and recorded) to help them understand the content. Captions can help people who are hard of hearing, but they also help people who are trying to watch presentations in noisy environments and those who lack good audio setups as they are watching sessions. Conferences arguably should have been providing live captions for the in-person events they previously held. But since captions are finally becoming a wider topic of concern, I want to discuss how captions work and what to look for when choosing how to caption content for an online conference.
There was a lot of information that I wanted to share about captions, and I wanted it to be available in one place. If you don’t have the time or desire to read this post, there is a summary at the bottom.
Note: I’m not a professional accessibility specialist. I am a former conference organizer and current speaker who has spent many hours learning about accessibility and looking into options for captioning. I’m writing about captions here to share what I’ve learned with other conference organizers and speakers.
Closed captions provide the option to turn captions on or off while watching a video. They are usually shown at the bottom of the video. Here’s an example of one of my videos on YouTube with closed captions turned on.
The placement of the captions may vary based upon the service used and the dimensions of the screen. For instance, if I play this video full screen on my wide screen monitor, the captions cover some of the content instead of being shown below.
Open captions are always displayed with the video – there is no option to turn them off. The experience with open captions is somewhat like watching a subtitled foreign film.
But despite captions often being referred to colloquially as subtitles, there is a difference between the two. Captions are made for those who are hard of hearing or have auditory processing issues. Captions should include any essential non-speech sound in the video as well as speaker differentiation if there are multiple speakers. Subtitles are made for viewers who can hear and just need the dialogue provided in text form.
For online conferences, I would say that closed captions are preferred, so viewers can choose whether or not to show the captions.
Captions can either be created as a sort of timed transcript that gets added to a pre-recorded video, or they can be done in real time. Live captioning is sometimes called communication access real-time translation (CART).
If you are captioning a pre-recorded video, the captions get created as a companion file to your video. There are several formats for caption files, but the most common I have seen are .SRT (SubRip Subtitle) and .VTT (Web Video Text Tracks). These are known as simple closed caption formats because they are human readable: each caption shows a sequence number and/or a timestamp, followed by the caption in plain text, with a blank line between captions.
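To make the two formats concrete, here is a minimal Python sketch that converts SRT text to VTT text. It relies on the two differences I am sure of: a VTT file starts with a WEBVTT header line, and VTT timestamps use a period before the milliseconds where SRT uses a comma. The function name srt_to_vtt and the sample captions are made up for illustration, and the sketch ignores less common features such as styling tags.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,500".
SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert SubRip (.srt) caption text to WebVTT (.vtt) text (illustrative helper)."""
    # Normalize line endings and trim stray blank lines at the edges.
    body = srt_text.replace("\r\n", "\n").strip()
    # SRT puts a comma before the milliseconds; WebVTT uses a period.
    body = SRT_TIMING.sub(r"\1.\2 --> \3.\4", body)
    # A WebVTT file begins with the "WEBVTT" header followed by a blank line.
    return "WEBVTT\n\n" + body + "\n"


# Hypothetical sample captions, including a non-speech sound in brackets.
sample_srt = """1
00:00:01,000 --> 00:00:04,500
Welcome to the session.

2
00:00:05,000 --> 00:00:08,250
[applause]
"""

print(srt_to_vtt(sample_srt))
```

The SRT sequence numbers are left in place because VTT allows an optional identifier line before each timing line.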
There are multiple options for creating captions. The first thing to understand is that captioning is a valuable service and it costs money and/or time.
In general, there are 3 broad options for creating captions on pre-recorded video:
Some video editing applications allow authors to create caption files. For example, Camtasia provides a way to manually add captions or to upload a transcript and sync it to your video.
Alternatively, there is a VTT Creator that lets you upload your video, write your captions with the video shown so you get the timing right, and then output your .VTT file.
Another approach is to use speech-to-text software to create a transcript of everything said during the presentation and then edit that transcript into a caption file.
Services like YouTube offer auto-captioning, so if you are able to upload your video privately and download the resulting caption file, that is a good start. With either of these approaches, you will still need to go back through and edit the captions to ensure accuracy. Vimeo also offers automatic captioning, and its results likewise need to be reviewed and edited for accuracy.
These are valid approaches when you don’t have other options, but they can be very time consuming and the quality may vary. This might be ok for one short video, but is probably not ideal for a conference.
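Because caption files written or edited by hand are plain text, a small syntax slip can leave the file unusable. One way to catch the most common mistakes before uploading is a quick check like the Python sketch below. It only looks at three things (the WEBVTT header, the shape of each timing line, and that each caption starts before it ends), so treat it as a rough sanity check and not a full validator. The function names are my own, not part of any caption tool.

```python
import re
from typing import List

# A VTT timing line: "HH:MM:SS.mmm --> HH:MM:SS.mmm" (the hours part is optional).
TIMING = re.compile(
    r"^((?:\d{2,}:)?\d{2}:\d{2}\.\d{3}) --> ((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})"
)


def to_seconds(stamp: str) -> float:
    """Turn 'HH:MM:SS.mmm' or 'MM:SS.mmm' into a number of seconds."""
    parts = stamp.split(":")
    seconds = float(parts[-1])
    minutes = int(parts[-2])
    hours = int(parts[0]) if len(parts) == 3 else 0
    return hours * 3600 + minutes * 60 + seconds


def find_vtt_problems(vtt_text: str) -> List[str]:
    """Return a list of human-readable problems found in VTT caption text."""
    problems = []
    lines = vtt_text.replace("\r\n", "\n").split("\n")
    if not lines[0].startswith("WEBVTT"):
        problems.append("line 1: file must start with WEBVTT")
    for number, line in enumerate(lines, start=1):
        if "-->" not in line:
            continue  # Only timing lines are checked in this sketch.
        match = TIMING.match(line.strip())
        if match is None:
            problems.append(f"line {number}: malformed timing line")
        elif to_seconds(match.group(1)) >= to_seconds(match.group(2)):
            problems.append(f"line {number}: start time is not before end time")
    return problems


# Hypothetical example: an SRT-style comma was left in a VTT file.
for problem in find_vtt_problems("WEBVTT\n\n00:00:01,000 --> 00:00:04.500\nHello\n"):
    print(problem)
```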
If you are going to make presenters responsible for their own captions, you need to provide them with plenty of time to create the captions and suggest low-cost ways to auto-generate captions. I’ve seen estimates that it can take up to 5 hours for an inexperienced person to create captions for one hour of content. Please be aware of the time commitment you are requesting of your presenters if you put this responsibility on them.
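To get a feel for what that means across a whole event, here is a small Python sketch that applies the "up to 5 hours per hour of content" estimate to a list of session lengths. The session lengths are invented for illustration, and the 5:1 ratio is the upper-end estimate mentioned above, so the result is a rough worst case and not a prediction.

```python
def estimate_captioning_hours(content_minutes: float,
                              hours_per_content_hour: float = 5.0) -> float:
    """Estimate manual captioning effort in hours for a given amount of video.

    The default ratio is the upper-end estimate for an inexperienced captioner.
    """
    return (content_minutes / 60.0) * hours_per_content_hour


# Hypothetical session lengths in minutes for a small one-day conference.
session_minutes = [60, 60, 45, 30]

for minutes in session_minutes:
    effort = estimate_captioning_hours(minutes)
    print(f"{minutes}-minute session: up to {effort:.1f} hours of captioning")

total = sum(estimate_captioning_hours(m) for m in session_minutes)
print(f"Total across all sessions: up to {total:.1f} hours")
```

Each presenter only sees their own session's number, but as an organizer it helps to see the total effort you are asking the speakers to absorb.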
Depending on the platform you use, your presentation software might provide AI-driven live captioning services. This is also known as Automatic Speech Recognition (ASR). For example, Teams offers a live caption service. As of today (November 2020), my understanding is that Zoom, GoToMeeting, and GoToWebinar do not offer built-in live caption services. Zoom allows you to let someone type captions or integrate with a 3rd party caption service. Zoom and GoToMeeting/GoToWebinar do offer transcriptions of meeting audio after the fact using an AI service.
PowerPoint also offers live captioning via its subtitles feature. My friend Echo made a video and blog post to show the effectiveness of PowerPoint subtitles, which you can view here. There are a couple of things to note before using this PowerPoint feature:
Google Slides also offers live captions. The same limitations noted for PowerPoint apply to Google Slides as well.
Third-Party Caption Services
There are many companies that provide captioning services for both recorded and live sessions. This can be a good route to go to ensure consistency and quality. But not all services are created equal, and quality will vary. For recorded sessions, you send them video files and they give you back caption files (.VTT, .SRT, or another caption file format). They generally charge you per minute of content. Some companies offer only AI-generated captions. Others offer a choice of AI-generated captions, human-generated captions, or AI-generated captions with human review. Having humans transcribe your content tends to cost more than using AI, but it also tends to be more accurate. That said, I have seen some impressively accurate AI captions. Captions on recorded content are often less expensive than live captions (CART).
Below are a few companies I have come across that offer caption services. This is NOT an endorsement. I’m listing them so you can see examples of their offerings and pricing. Most of them offer volume discount or custom pricing.
The Described and Captioned Media Program maintains a list of captioning service vendors for your reference. If you have used a caption service for a conference and want to share your opinion to help others, feel free to leave a comment on this post.
For recorded or live video:
For recorded video:
For captions on live sessions:
If you’d like more information about captions, 3PlayMedia has an Ultimate Guide to Closed Captioning with tons of good info. Feel free to share any tips or tricks you have for captioning conference sessions in the comments.
I’ve summarized the info in this post below for quick reference.
Manual creation of caption files for recorded sessions
Cost: None
Time/Effort: High
Pros:
• Doesn’t require a third-party integration
• Supports closed captions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Accuracy will vary widely
• Manual syntax errors can cause the file to be unusable
Upload to YouTube, Vimeo or another service that offers free captions
Cost: None to Low
Time/Effort: Medium
Pros:
• Supports closed captions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Not available for live sessions
• Requires editing of captions to achieve acceptable accuracy
• Requires an account with the service and (at least temporary) permission to upload the video
• Accuracy will vary widely
Auto-generated captions in presentation software (e.g., PowerPoint, Google Slides)
Cost: Low
Time/Effort: Low
Pros:
• Works for live and recorded sessions
• No third-party integrations required
Cons:
• Requires that all presenters use presentation software with this feature
• Must be enabled by the presenter
• Won’t work when speaker is showing another application
• Often offers only open captions
• Accuracy may vary
• Often only captures one speaker
ASR (AI-generated) captions from captioning service
Cost: Medium
Time/Effort: Low
Pros:
• Works for live and recorded sessions
• Supports closed captions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Accuracy may vary
• Requires planning to meet lead times for recorded sessions
• Poor viewer experience if delay is too large during live sessions
Human-generated or human-reviewed captions from a captioning service
Cost: High
Time/Effort: Low
Pros:
• Ensures the highest quality with the lowest effort from conference organizers and speakers
• Works for live and recorded sessions
• Works no matter what application is shown on the screen
• Works no matter what application is used to record and edit video
Cons:
• Requires planning to meet lead times for recorded sessions
• Poor viewer experience if delay is too large during live sessions
I hope you find this exploration of options for captions in online conference content helpful. Let me know in the comments if you have anything to add to this post to help other conference organizers.
The post Captioning Options for Your Online Conference appeared first on SQLServerCentral.