A guide to the visual language of closed captions and subtitles

Gareth Ford Williams
Published in UX Collective
10 min read · Oct 31, 2021

The Doctor asks her phone, Tell me about the Visual Language of Closed Captions and Subtitles

If you have ever watched a film or TV programme, or gone to an opera or theatre production, with either captions (subtitles in the UK) for improved accessibility or subtitles (also called subtitles in the UK) for language interpretation, you will have seen a visual language that, when followed, helps users get more from the content.

These are the things professional subtitlers and captioners do when they follow best practice, because they help with reading comprehension. They have been researched and tested, and put into practice, by broadcasters in particular, for over 40 years.

This guide should be applicable to TV or Film open captioning, Translation Subtitles, YouTube Subtitles, Burned-in Captions on social media videos and Video Game Captions.

Some of these are fairly obvious but there are others that can convey subtleties that can unlock meaning.

One thing that needs to be said is that the presentation of captions and translation subtitles should always follow the same rules. They have different purposes but the function is the same: reading the spoken word.
They are both relaying narrative and dialogue, and the end-users include people with or without access to sound, who might also have vision or cognitive impairments. There are people who will say differently, but they only consider translation subtitles being accessed by a standard issue audience, especially hearing audiences. As soon as you consider that disabled people also watch foreign-language films and TV, you realise that the editorial accessibility conventions are just as relevant.

The following is a user-centric guide to the editorial conventions of an accessible caption or subtitle experience.

If you are a content producer then the main 16 considerations that can be universally applied on any VOD social media platform are on this downloadable cheat sheet. There are more links to resources at the end of the article.

For some of the following there is more than one variation. This is because the captioning conventions in North America differ from those in Europe. The key differences are that in the US square brackets are used for multiple purposes in delivering information, and there is a greater tendency to add the names of speakers to the beginning of lines.

1. Different Colours.
When text appears in a different colour, this denotes a different speaker. Colours for individual speakers should persist. If possible, a colour associated with a character could be used, like red for Iron Man, blue for Captain America or green for The Hulk, as this helps cognitively.

Hello I’m The Doctor. (new speaker) Doctor who?

2. A hyphen at the Start of a Line.
Where the platform doesn’t support different colours, hyphens are used instead to tell us the speaker has changed.

- (new speaker) Hello I’m The Doctor. - (new speaker) Doctor who?
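
For anyone generating caption text programmatically, the hyphen convention can be sketched in a few lines of Python. This is only an illustration: the function name `mark_speaker_changes` and the `(speaker, text)` input shape are assumptions made for this example, and the rule that a line from the same speaker as the previous one gets no hyphen is one reading of the convention rather than something taken from a published standard.

```python
def mark_speaker_changes(turns):
    """Return caption lines, prefixing '- ' whenever the speaker changes.

    `turns` is a list of (speaker, text) pairs (a hypothetical input shape).
    The first line is also treated as a change of speaker.
    """
    lines = []
    previous = None
    for speaker, text in turns:
        if speaker != previous:
            lines.append(f"- {text}")
        else:
            # Same speaker carries on: no hyphen (assumption of this sketch).
            lines.append(text)
        previous = speaker
    return lines


dialogue = [
    ("The Doctor", "Hello I’m The Doctor."),
    ("Rose", "Doctor who?"),
]
for line in mark_speaker_changes(dialogue):
    print(line)
```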

3. Combined Colours and Hyphens.
This is when there is either redundancy built into the design so it can be used on multiple platforms, or the content producer is ensuring this also works for colour blind users.

- (new speaker) Hello I’m The Doctor. - (new speaker) Doctor who?

4. Single Quotation Marks.
This means that the person speaking is either a voiceover artist, a narrator or on the phone.

‘Hello I’m The Doctor’

5. Named Speakers.
When there are too many speakers for colours and/or hyphens, or the pace and volume of text allows it, names of speakers can be used.

This can also be used the first time someone speaks to introduce their caption colour, especially if the first time we encounter them is a particularly busy scene with lots of characters in the dialogue.

Names can either be displayed in capitals followed by a colon, which is European, or in square brackets, which is North American.

THE DOCTOR: Hello I’m The Doctor. ROSE: Doctor who?
[The Doctor] Hello, I’m The Doctor. [Rose] Doctor who?
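
The two naming styles shown above can be sketched as a small Python helper. The function name `label_speaker` and the style keys `"european"` and `"north_american"` are made up for this illustration; the capitalisation simply follows the two example lines above.

```python
def label_speaker(name, line, style="european"):
    """Prefix a caption line with the speaker's name.

    style="european"       gives  NAME: line
    style="north_american" gives  [Name] line
    """
    if style == "european":
        return f"{name.upper()}: {line}"
    if style == "north_american":
        return f"[{name}] {line}"
    raise ValueError(f"unknown style: {style!r}")


print(label_speaker("The Doctor", "Hello, I’m The Doctor."))
print(label_speaker("Rose", "Doctor who?", style="north_american"))
```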

6. Double Quotation Marks.
When there are double quotation marks, this tells us that the voice is coming from a radio or loudspeaker. This could also be a synthetic voice.

“Hello, I’m The Doctor”

7. Double Quotation Marks after No Quotation Marks.
This simply means that one speaker is quoting another.

You said, “The Doctor”

8. Arrows.
When there is an arrow at the start or end of a line, it tells us that the speaker is off the screen in the direction the arrow is pointing.

<Over here! (new speaker) No, over here!>

9. Speech Delivery in Brackets.
When there is a description of the delivery of speech in brackets at the start of the line, this can mean that their speech has changed and/or how they are speaking is important to the narrative. For instance, this could mean someone is injured, intoxicated or under stress.

(SLURRED): Just one more.

10. Hyphens In-between Letters.
When there are repeated letters with hyphens this tells us that the speaker is stammering.

G-g-g-g-go for it!

11. Declared Accent.
When a person or character’s accent is relevant to the story being told or gives valuable context to a character, it is pointed out the first time they speak.

AMERICAN ACCENT: Okay, okay, I’ve got you!

12. Apostrophes Instead of Silent Letters.
If a character’s accent means they either soften or miss out letters, like accents with a glottal stop, then these letters are replaced with an apostrophe. This helps convey more richness in the characterisation.

Take ‘im darn the ‘ospital

13. Bracketed Words
If a character whispers some lines, this is conveyed in the caption using brackets. It is also OK to precede the whispered words with an explanation, but that makes less efficient use of caption real estate.

Freeze, Sarah Jane. (If you move we’re dead)
Freeze, Sarah Jane. [whispers] if you move, we’re dead
Freeze, Sarah Jane. WHISPERS: If you move we’re dead.

Words in square brackets can convey several different things. They can either describe the sound qualities of the speech, [synthetic voice] [quietly] [whispers] [screams], they can contain song information [“9 to 5” by Dolly Parton intro playing], they can be an alternative for all caps for sound effects [gunshot] and information such as [narrator].

[synthetic voice] Daleks don’t have a concept of elegance!

14. Question and Exclamation Marks in Brackets
When you see either (!) or (?) at the end of the line, this tells you that the speaker is being sarcastic. The exclamation mark tells you that it is a sarcastic statement whilst the question mark is a sarcastic question.
Like whispering, there can be an indicator at the beginning of the line that says SARCASTIC: but this uses up more real estate and also does not differentiate between a question and a statement.

Well that’s just fine and dandy(!)
What do you think it is, a space helmet for a cow(?)
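
As a rough sketch of how this could be applied when authoring captions in code, the Python helper below swaps the closing punctuation for the bracketed marker. The name `mark_sarcasm` is hypothetical, and the choice to replace the final full stop, exclamation mark or question mark (rather than append after it) is an assumption based on the two examples above.

```python
def mark_sarcasm(line):
    """Swap the closing punctuation for (?) or (!) to flag sarcasm."""
    text = line.rstrip()
    if text.endswith("?"):
        # Sarcastic question.
        return text[:-1] + "(?)"
    if text.endswith((".", "!")):
        # Sarcastic statement.
        return text[:-1] + "(!)"
    # No closing punctuation: treat the line as a statement.
    return text + "(!)"


print(mark_sarcasm("Well that’s just fine and dandy."))
print(mark_sarcasm("What do you think it is, a space helmet for a cow?"))
```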

15. Question Mark followed by Exclamation Mark.
The addition of ?! at the end of a line indicates that the words have been delivered with an incredulous tone, because the speaker is unable or unwilling to believe in something. Like sarcasm, knowing this can completely change the meaning.

You mean you’re going to marry him?!

16. FULL CAPS AND AN EXCLAMATION MARK!
This simply tells us that the speaker is shouting or screaming. Both can be indicated using the words SHOUTS: or SCREAMS: but these take up a lot of space and add an additional word to read, so full caps and an exclamation mark saves space.

WHY DON’T YOU JUST DIE! (new speaker) “You would have made a good Dalek”

17. FULL CAPS
If words are displayed in full caps, it can also mean they are describing a noise and are not speech.
The addition of a chevron or arrow can also tell us what direction the sound came from, if that is important information.

GUNSHOT>

18. Rrrarrrgghh!
Sometimes sound effect words can be substituted for or added to descriptions of sounds. These words are not onomatopoeias but more like the effect words used in comic books. This is the content producer telling us important contextual information and trying to bring the sound to life.
LIONS ROARING is more factual and would be used if the source of the noise is obvious to someone who can hear it, whereas <Rrrarrrgghh! tries to imitate the sound, bringing the roar to life, but does not reveal what is making the sound.

<Rrrarrrgghh!

19. A Line Starting With Two Dots
If a line begins with two dots, this is because it is in response to speech that is unheard by viewers who have access to the sound. This could be a situation like a character listening and responding to someone on the phone that we can’t hear.

..and who am I speaking to?

20. Some ALL CAPS words in the middle of a sentence.
If a single word or words in a sentence are displayed as all caps this is because the speaker is stressing those words in their delivery.

The only exception is if the word ‘I’ is stressed, it is then often displayed in a different colour.

For heaven’s sake, WHY would you do that?
I can’t believe you would do that. (The letter ‘I’ is in a different colour)

21. Three Dots in the Middle of a Word or Sentence
The use of three dots in the middle of a sentence shows that the speaker has paused, which could be important because it could show us they are considering something, realising something or changing their mind.

And… he’s wonderful.

22. Three Dots at the End of an Unfinished Sentence
This tells us that speech has trailed off. If the trailed-off speech was a question, an exclamation or delivered with disbelief, then the three dots are followed immediately with ?, ! or ?!

Everything’s got to end… sometime.

23. If the speaker comes back to the sentence after a pause, then the rest of the sentence is preceded by two dots.

Everything’s got to end sometime… ..otherwise nothing would ever get started.
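
The pause-and-resume pattern can be sketched in Python as a pair of captions. The helper name `split_at_pause` is invented for this example, and it uses three plain full stops where the example above uses a single ellipsis character; either could be substituted depending on the platform.

```python
def split_at_pause(before, after):
    """Split one sentence into two captions around a pause.

    The first caption ends with three dots (speech paused);
    the second starts with two dots (the same sentence resumes).
    """
    first = before.rstrip() + "..."
    second = ".." + after.lstrip()
    return first, second


first, second = split_at_pause(
    "Everything’s got to end sometime",
    "otherwise nothing would ever get started.",
)
print(first)
print(second)
```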

24. Music Information
There are lots of reasons why music is used. Sometimes it helps create atmosphere, other times a particular piece or its lyrics can help with storytelling, or sometimes it’s just there as audio decoration.

There are different ways a content producer can give context to the use of music.

If knowing what a piece of music is is important, there can be an informational caption.

MUSIC: “Mr Blue Sky” by The Electric Light Orchestra
[ “Mr Blue Sky” by The Electric Light Orchestra ]

If it is important to know the style and delivery of the piece then this can be in an ALL CAPS description or in square brackets.

SHE WHISTLES A HAPPY TUNE
[She whistles a happy tune]

If these are combined, then descriptors and information are combined also.

If the music is incidental but the atmosphere it creates is important, then this is in an ALL CAPS description or in square brackets.

EERIE MUSIC

When characters, artists or crowds sing and hearing the words is important, the lyrics can be presented in two ways: topped and tailed with either hashes or musical notes.

#Sun is shinin’ in the sky. There ain’t a cloud in sight…#
♪Sun is shinin’ in the sky, There ain’t a cloud in sight… ♪

If the song is interrupted, the heard lyrics end with three dots, and if the lyrics start with two dots that tells us that this is not the beginning of the song.

#..Hey you with the pretty face. Welcome to the human race…#
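
A minimal Python sketch of the lyric conventions follows. The function `wrap_lyrics` and its parameters `starts_song` and `interrupted` are hypothetical names for this illustration, and it follows the example line above in using two leading dots for lyrics that join the song part-way through.

```python
def wrap_lyrics(lyrics, marker="#", starts_song=True, interrupted=False):
    """Top and tail sung lyrics with a marker ('#' or the musical note '♪')."""
    text = lyrics.strip()
    if not starts_song:
        # Two leading dots: we have joined the song part-way through.
        text = ".." + text
    if interrupted:
        # Three trailing dots: the song is cut off.
        text = text.rstrip(".") + "..."
    return f"{marker}{text}{marker}"


print(wrap_lyrics(
    "Hey you with the pretty face. Welcome to the human race.",
    starts_song=False,
    interrupted=True,
))
```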

Compound Use.
Lots of things like expression and context are conveyed using punctuation and symbols rather than described in words, because this means more information can be delivered more efficiently.
Both the number of words displayed and how long they are displayed for have their limitations, so understanding the subtitle and caption visual language gives all users the opportunity to access richer information.
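
To show how several of these signals can stack on a single line, here is a small Python sketch that reads a caption and lists the cues it finds. The `CUES` table and the `describe` function are illustrative only: the patterns cover a handful of the conventions in this guide, they are deliberately simple, and a real caption checker would need many more rules and exceptions.

```python
import re

# (pattern, meaning) pairs for a few of the conventions described in this guide.
CUES = [
    (r"^- ", "change of speaker"),
    (r"\(!\)$", "sarcastic statement"),
    (r"\(\?\)$", "sarcastic question"),
    (r"\?!$", "incredulous tone"),
    (r"^\.\.(?!\.)", "reply to unheard speech, or a sentence resumed after a pause"),
    (r"^[#♪].*[#♪]$", "sung lyrics"),
    (r"^<|>$", "off-screen, in the direction of the arrow"),
    (r"\b\w-\w-", "stammering"),
]


def describe(caption):
    """Return the meanings of every cue found in one caption line."""
    line = caption.strip()
    return [meaning for pattern, meaning in CUES if re.search(pattern, line)]


for caption in [
    "- Well that’s just fine and dandy(!)",
    "G-g-g-g-go for it!",
    "You mean you’re going to marry him?!",
]:
    print(caption, "->", describe(caption))
```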

Resources
If you would like to know more about subtitling and captioning, please check out these resources:

How captions increase ROI and audience for media creators
BBC Subtitling (Closed Caption) Guidelines

BBC R&D 360 Video and VR Captions Display Research
BBC R&D Subtitles and Closed Caption Quality Research
How Big Should Closed Captions and Subtitles Be?
How TV Subtitles and Closed Captions are Produced
How to Create Subtitles and Closed Captions
The History of Access Services at the BBC
The Do’s and Dont’s of Captioning in the US
W3C Closed Caption Guidelines
YouTube Guide to Captioning

#closedcaptions #subtitles #captions #a11y #accessibility

Director at Ab11y.com and The Readability Group. I am an Ex-Head of UX Design and Accessibility at the BBC and I have ADHD and I’m Dyslexic.