Knowing context in designing for voice

Published in

UX Collective

5 min read · May 26, 2018

Getting computers to hold a conversation is a hard problem. Many assistants claim to be conversational, but what they really do is ask a series of one-off questions that simulate a conversation. One-off commands like ‘set an alarm for 7 AM’ can be useful, but it would be awkward to say ‘when should I leave for the airport for my wife’s flight number 747 from Bangalore?’ A much more natural conversation would go something like this:

User: Tell me when my wife’s flight lands.
Assistant: Oh, the flight from Bangalore? It lands at 9 PM. She should be out by 9:30.
User: Cool. When should I leave?
Assistant: It’s better you leave at 8:30. Should I book you a cab?
User: Umm, no. I’ll drive today.

… after a few hours

User: Hey, can you book me the cab?
Assistant: Sure! I’ll book it right away.

There are a few key components to a good conversation: contextual awareness, memory of previous interactions and an exchange of appropriate information. The computer needs to know the context in which the conversation is happening. If I shout at it, it could respond loudly. But if I whisper, can it whisper back?
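Two of these components, memory of previous interactions and matching the user's volume, can be sketched in a few lines. This is only an illustration: the `ConversationContext` class, the `reply_volume` function and the decibel thresholds are all assumptions made for the example, not part of any real assistant.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationContext:
    """Remembers things said earlier so that later turns can refer back to them."""
    slots: dict = field(default_factory=dict)

    def remember(self, key: str, value: str) -> None:
        self.slots[key] = value

    def resolve(self, key: str, default: str = "unknown") -> str:
        # Look up something mentioned earlier, e.g. the cab offer the user declined.
        return self.slots.get(key, default)


def reply_volume(user_volume_db: float) -> str:
    """Mirror the user's loudness. Both thresholds are hypothetical values."""
    if user_volume_db < 40:   # assumed level for a whisper
        return "whisper"
    if user_volume_db > 70:   # assumed level for shouting
        return "loud"
    return "normal"
```

In the airport conversation above, the assistant would call `remember("pending_offer", "book a cab")` when the user declines, so that "book me the cab" a few hours later still makes sense.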

Maintaining an illusion of awareness

Users engage with a system more when they sense that it is aware of their presence, so the assistant needs strategies for maintaining an illusion of awareness. Harry Gottlieb wrote “The Jack Principles of the Interactive Conversation Interface” in 2002. In this paper, Gottlieb outlines tips for creating the illusion of awareness in a conversational system; specifically, he suggests responding with human intelligence and emotion to the following:

The user’s actions
The user’s inactions
The user’s past actions
A series of the user’s actions
The actual time and space that the user is in (time is obvious, place can refer to geographical space, which app the user is in or a place in the house like kitchen, living room, etc.)
The comparison of different users’ situations and actions

Gottlieb also outlined tips for maintaining the illusion of awareness:

Use dialog that conveys a sense of intimacy
Ensure that characters act appropriately while the user is interacting
Ensure that dialog never seems to repeat
Be aware of the number of simultaneous users
Be aware of the gender of the users
Ensure that the performance of the dialog is seamless
Avoid the presence of characters when user input cannot be evaluated
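One of these tips, that dialog should never seem to repeat, lends itself to a small sketch. The `PromptRotator` class below is a hypothetical helper, not something from Gottlieb's paper; it assumes the variants are distinct strings and simply avoids reusing the last few lines spoken.

```python
import random
from collections import deque


class PromptRotator:
    """Picks a phrasing that was not used in the last `memory` turns."""

    def __init__(self, variants, memory=2, seed=None):
        if len(variants) <= memory:
            raise ValueError("need more variants than remembered turns")
        self.variants = list(variants)
        self.recent = deque(maxlen=memory)  # the most recently spoken variants
        self.rng = random.Random(seed)

    def pick(self):
        # Only consider phrasings the user has not just heard.
        choices = [v for v in self.variants if v not in self.recent]
        picked = self.rng.choice(choices)
        self.recent.append(picked)
        return picked
```

With three greetings and `memory=2`, the user never hears the same greeting twice within any three consecutive turns.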

There are a few other ways of maintaining this illusion of awareness when designing modern VUIs. It is imperative to keep in mind the context of the user when designing any conversational interaction.

The user’s physical location

If the assistant knows where the user is and responds accordingly, it will seem more aware. Knowing the user’s location helps the assistant answer queries in context. For example, when a user asks for “party places,” the assistant can suggest clubs near the user rather than returning irrelevant places from around the world.
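As a rough sketch of the “party places” example, the assistant could filter candidate places by their distance from the user. The place dictionaries, the `nearby` function and the 10 km default radius are assumptions for illustration; the distance uses the standard haversine formula.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearby(places, user_lat, user_lon, radius_km=10.0):
    """Return names of places within radius_km of the user, closest first."""
    scored = [
        (haversine_km(user_lat, user_lon, p["lat"], p["lon"]), p["name"])
        for p in places
    ]
    return [name for dist, name in sorted(scored) if dist <= radius_km]
```

A real assistant would more likely pass the user's coordinates to a places search service; the point here is only that location narrows the result set before anything is spoken.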

Type of users

It is a good strategy for the system to know whether the user is interacting with it for the first time or uses it regularly. For example, a life-logging app might require users to log their mood every day. This is how the app could prompt different users:

Beginner
Assistant: “How are you feeling today? Make sure you take a minute to think about your day and select one or more options that correspond with your mood.”

Advanced
Assistant: “Hey there! How’s it going?”

To judge a user’s proficiency, it is important to look at how frequently they use the assistant rather than the total number of times they have used it. A user might have used the assistant only once a month, yet over a long enough period the total count could still look high.
It is also important for the system to adapt to the user’s behavior rather than simply nudging the user to use the assistant. For example, the system can learn when the user typically uses the assistant and prompt only during those times.
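Both ideas can be sketched together: choose the prompt from recent usage frequency rather than lifetime count, and prompt only at hours when the user usually shows up. The function names, the 28-day window, the threshold of three sessions per week and the `top_n` value are illustrative assumptions, not recommendations from the sources.

```python
from collections import Counter
from datetime import datetime, timedelta

BEGINNER_PROMPT = (
    "How are you feeling today? Make sure you take a minute to think about "
    "your day and select one or more options that correspond with your mood."
)
ADVANCED_PROMPT = "Hey there! How's it going?"


def sessions_per_week(sessions, now, window_days=28):
    """Recent usage frequency: sessions inside the window, scaled to a week."""
    cutoff = now - timedelta(days=window_days)
    recent = [s for s in sessions if cutoff <= s <= now]
    return len(recent) / (window_days / 7)


def pick_prompt(sessions, now, threshold=3.0):
    """Frequent recent users get the short prompt; everyone else the guided one."""
    if sessions_per_week(sessions, now) >= threshold:
        return ADVANCED_PROMPT
    return BEGINNER_PROMPT


def usual_hours(sessions, top_n=2):
    """Hours of the day at which the user has most often used the assistant."""
    counts = Counter(s.hour for s in sessions)
    return {hour for hour, _ in counts.most_common(top_n)}


def should_prompt(sessions, now):
    """Nudge only at an hour when the user normally uses the assistant."""
    return now.hour in usual_hours(sessions)
```

Under these assumptions, a user with two years of once-a-month sessions still gets the beginner prompt, while someone who has used the app daily for two weeks gets the advanced one.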

Priming

In psychology, priming is a technique whereby exposure to one stimulus influences a response to a subsequent stimulus, without conscious guidance or intention. For example, you are more likely to answer ‘Mongolia’ when asked to name a place that starts with the letter ‘M’ if you’ve just seen a documentary about Mongolia.

Letting the user know what to expect is also a form of priming. It tells users how to prepare themselves. Priming can also be subtle. If the VUI responds to the query “Can you play me ‘Fix You’?” with “Playing ‘Fix You’ by Coldplay,” then next time the user can simply say “Play ‘Fix You’ by Coldplay.”
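The subtle form of priming can be sketched as a confirmation that echoes the request in its fullest form, so the user hears a phrasing they can reuse. The `CATALOG` lookup and the `confirm_playback` function are hypothetical stand-ins for a real music search.

```python
# Hypothetical catalog mapping a normalized title to (display title, artist).
CATALOG = {"fix you": ("Fix You", "Coldplay")}


def confirm_playback(query_title: str) -> str:
    """Echo the resolved song with its artist to model a complete command."""
    match = CATALOG.get(query_title.strip().lower())
    if match is None:
        return "Sorry, I couldn't find that song."
    title, artist = match
    # Naming the artist primes the user to include it next time.
    return f"Playing '{title}' by {artist}."
```

The confirmation costs almost nothing in speech time, yet it teaches the user the most precise way to ask.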

Type of device

Voice interfaces have moved beyond IVR to smartphones, smart speakers, cars and wristwatches, and will soon be an integral part of our lives. When designing for voice, we are designing for two different things: an input mechanism through voice and an output mechanism that is not necessarily voice. Although the input is voice, the output depends on the context in which the solution is being used.


If you ask an assistant on your phone to tell you “the top ten movies of 2017,” reading out the whole list would put a cognitive load on you. It would be much better to just show the list of movies.

Doing this has advantages beyond reducing cognitive load: the assistant can present much more information about each movie than its name, such as the actors, directors and awards received, which would be difficult to convey with voice alone. It is important to remember that there exist interfaces beyond a speaker and a microphone.
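A minimal sketch of choosing the output by device might look like this. The `Device` type, the `render_list` function and the cut-off of three spoken items are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    has_screen: bool


def render_list(device: Device, title: str, items: list, max_spoken: int = 3) -> dict:
    """Decide what to say and what to show for a list-style answer."""
    if device.has_screen:
        # Keep speech short and let the screen carry the full list.
        return {"speech": f"Here are {title}.", "display": list(items)}
    # Voice only: read a few items, then offer the rest to limit cognitive load.
    spoken = ", ".join(items[:max_spoken])
    more = len(items) - max_spoken
    tail = f", and {more} more. Want to hear the rest?" if more > 0 else "."
    return {"speech": f"From {title}: {spoken}{tail}", "display": None}
```

On a phone the same request returns a one-line spoken summary plus the full list to display, while a screenless speaker reads only the first few entries and offers to continue.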



Written by Kore
