UX Requirements for Voice Interfaces

Defining UX requirements for a Voice User Interface Product

Published in UX Collective · 4 min read · Nov 17, 2018

I have been designing Voice UX for about two and a half years. During this period, I was lucky to get to work on a plethora of domains, devices, and platforms. I have had opportunities to work with a melange of designers and add new tools to my skill set.

I also got a chance to experiment with my process: implementing theoretical methods, borrowing from my fellow designers, and learning from my pet projects.

Here, I am trying to document the Voice User Experience (VUX) design process that I have now roughly settled on. (I am in no way claiming that you should adopt this process, or that it is the best/only process. I am sharing my two cents, on the off chance that somebody finds it helpful.)

Ok, so here goes.

When designing for Voice, I can roughly divide my work into three phases: 1. Discover & Define 2. Ideate & Design 3. Evaluate

In this post, I will expand on the Discover & Define phase. This phase is a collaboration with Product Managers (PMs), Developers, and Marketing teams.

Project Kick-off & Validation

The starting point for any project is a loose brief of the project idea — the high-level purpose of the project.
Ex: I want users to be able to access Goodreads through voice.
(I will be using a Goodreads voice app as an example throughout the article)

To check the possibility of a compelling product, a research phase is necessary. The following needs to be covered:
1. User Needs: What is the user need that the product is addressing? What are the main tasks users will be performing? Who is the target audience, and what are their expectations from a voice interface?
The methods used could be primary (user interviews) or secondary (existing research, analytics from an existing product).

2. Competitor Audit: If a similar product exists in the marketplace, looking into its user reception & feedback helps to find points of interest. Moreover, this gives an idea regarding user mental models & expectations.

3. Research the platform/tech: If working on a new voice platform, it is very important to grasp its capabilities & limitations. Points to consider include follow-up support, multiple-user support, barge-in command support, anaphora, etc. (If you are completely new to voice interfaces, you can refer to this VUI glossary to understand some of the jargon.)

Another important consideration is if the platform has/needs a display component. That will impact the supported features and goals.
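
One way to keep this research in front of the team is to record the findings as a small checklist. The following is a minimal sketch in Python, not any platform's API: the `PlatformCapabilities` class, its field names, and the sample values are assumptions made up for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PlatformCapabilities:
    """Hypothetical checklist of the platform points named above."""
    follow_up: bool       # can the user ask a follow-up without starting over?
    multiple_users: bool  # can the platform tell different speakers apart?
    barge_in: bool        # can the user interrupt a spoken response?
    anaphora: bool        # can "it" or "that one" refer to an earlier item?
    display: bool         # is there a screen alongside the voice channel?


def limitations(caps: PlatformCapabilities) -> list[str]:
    """Return the names of the capabilities the platform lacks."""
    return [f.name for f in fields(caps) if not getattr(caps, f.name)]


# Made-up values for an imaginary voice-only speaker.
speaker = PlatformCapabilities(
    follow_up=True,
    multiple_users=False,
    barge_in=True,
    anaphora=False,
    display=False,
)
print(limitations(speaker))
```

A list like this can sit next to the product brief, so that features depending on a missing capability get flagged before design starts.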

User Scenarios

Based on the (primary/secondary) research, we can now create a few core user scenarios.

Ex: a few user scenarios for Goodreads:

John’s friend, Andrew, mentions an amazing book he has recently finished. He insists that John should definitely read it. John adds the book to his ‘to-read’ shelf.
Sally is binging on Netflix after a long day at work. The title of a book in one of the characters’ hands looks quite appealing. She looks up the book on Goodreads.

Mostly, the core functionality of an app becomes the most important user scenario. Ex: a weather app offering information on weather, and alerts. However, in some cases, that may not hold true. Ex: in Goodreads, scrolling to check friends’ activity doesn’t seem like a scenario for a voice interface.

Journey Mapping

For each of the user scenarios, we outline the user journey to find where the voice modality can fit in.
We look into each step to see whether voice is suited to completing the task.

[Figure: journey map. Here, all the steps can be supported by voice.]

[Figure: journey map. Voice-only may not be the best medium for the last task.]

For beginners, Webcredible’s checklist is a good resource for deciding whether voice is the right modality for a task.

Defining Goals/Capabilities

From the journey maps, we can now decide the user goals that will be supported:
1. If all the steps are suitable for voice, and the platform supports all the tasks, that journey can be a goal.
2. If a step is not suited for voice, is it possible (and good UX) to complete part of the task by voice and let the user finish the rest elsewhere, for example on a synced device? If so, it can be added as a goal.

Goal 1: Look up a book
Goal 2: Look up the rating of a book
Goal 3: Mark a book as to-read, currently reading, or read
etc.
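
The two rules above can be read as a small decision procedure over a journey map. Here is a rough sketch under assumptions: the `Step` type, the `classify_journey` function, and the step data for the Goodreads journeys are all hypothetical, and judging whether a step suits voice remains a design call, not something code can decide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One step of a journey map (hypothetical structure)."""
    name: str
    suits_voice: bool         # the designer's judgment from journey mapping
    platform_supported: bool  # finding from the platform research


def classify_journey(steps: list[Step], handoff_is_good_ux: bool = False) -> str:
    """Apply the two rules: full goal, goal with handoff, or not a goal."""
    if all(s.suits_voice and s.platform_supported for s in steps):
        return "goal"  # rule 1: every step works in voice
    voice_part = [s for s in steps if s.suits_voice and s.platform_supported]
    if voice_part and handoff_is_good_ux:
        return "goal with handoff"  # rule 2: finish the rest on a synced device
    return "not a goal"


# Made-up journey maps for illustration only.
add_to_shelf = [
    Step("name the book", True, True),
    Step("confirm the match", True, True),
    Step("add to the to-read shelf", True, True),
]
look_up_book = [
    Step("name the book", True, True),
    Step("hear the rating", True, True),
    Step("browse reviews", False, True),
]

print(classify_journey(add_to_shelf))
print(classify_journey(look_up_book, handoff_is_good_ux=True))
print(classify_journey(look_up_book))
```

The `handoff_is_good_ux` flag is deliberately a human input: whether splitting a task across devices feels acceptable is exactly the question rule 2 asks the designer to answer.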

Note: In some cases, core goals need to be added even if they don’t fulfill these criteria.
Ex: A banking app needs to handle the user utterance to transfer money, even if the platform/tech doesn’t support it. Designing for that goal is all about finding the best way to handle the error.
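
In practice, that kind of error handling often means recognizing the request and answering it helpfully, so the user does not hit a generic "I didn't understand" reply. A minimal sketch, assuming hypothetical intent names and response wording (none of this is taken from a real platform SDK):

```python
# Hypothetical intent names for an imaginary banking voice app.
SUPPORTED_INTENTS = {"check_balance"}
CORE_UNSUPPORTED_INTENTS = {"transfer_money"}  # core goal, not doable by voice here


def respond(intent: str) -> str:
    """Pick a response strategy for a recognized intent name."""
    if intent in SUPPORTED_INTENTS:
        return f"OK, handling {intent}."
    if intent in CORE_UNSUPPORTED_INTENTS:
        # Acknowledge the request and point to a channel that can complete it.
        return ("I can't transfer money by voice yet. "
                "You can do that in the mobile app.")
    # Anything else falls through to the generic fallback.
    return "Sorry, I didn't catch that."


print(respond("transfer_money"))
```

The point of the separate `CORE_UNSUPPORTED_INTENTS` set is that the app still understands what the user asked for and can offer a next step, even though it cannot complete the task itself.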

Defining Personality

In voice interfaces, the product’s personality is overt. The personality is defined by the voice and tone of verbal interactions.
For existing services, personality may already be defined. That can be referenced to maintain a consistent personality across mediums.
If the personality is yet to be created, it needs to align with the product’s purpose.

The platform’s defined persona also needs to be factored in. Ex: Alexa has some defined rules regarding her behavior.

This concludes the define phase.
Documents at the end of the phase: The Product Spec, usually maintained by the PM. (This may get updated later, based on unforeseen issues during the design and development phases.)

Note: I am in no way affiliated with Goodreads, just a frequent user of the product.