Designing magical interfaces (without going to the dark side)

steve turbek
Published in UX Collective · Feb 9, 2019


Photo by Cederic X on Unsplash

“Any sufficiently advanced technology is indistinguishable from magic.” -Arthur C. Clarke

This quote is usually seen as a positive recognition of technology’s power. The new user interfaces of Voice, Chat, and Gestures also aspire to magic: they let people control technology without an obvious screen-based user interface. To many designers, they are the obvious future of interfaces.

“I see a natural progression from knobs and dials, to clicks and taps, to swipes and gestures, to voice and emotion.” -Imran Chaudhri, Apple iPhone UI designer

Yet, we all know magic has a dark side. Could today’s trend of magical UX spell doom for users?

The User Experience of Magic

The appeal of magic is power: the ability to do something that others can’t, or to do it more easily. Why wash the floor when you can enchant a broom to do it for you? The modern mobile phone’s power to summon a cab makes it a clear descendant of the magic wand.

Mickey Mouse conjuring brooms to carry water in ‘Fantasia’

It might seem odd to talk about the UX of magic, but if books and movies are to be believed, magic is a technology like any other, and technology must be controlled by people in some way. If spells are apps and the magic wand is your device, then those who create the spells are the UX designers of magic.

As it’s usually shown, a master magician is able to create amazing works with their bare hands.

Copyright Marvel Studios

How do new users know how these gestures turn into commands? To the uninitiated, it looks an awful lot like just waving your hands around.

Dr Strange struggling to learn magic. Copyright Marvel Studios

RTFM (Read This For Magic)

All computer users have experienced the frustration of knowing that a feature or function is possible, but not being able to find out how to do it. In the old days of computing, the solution was telling people to “Read The [Friendly] Manual.”

Most tools are designed so that it’s obvious how to hold them. (Image: Wikipedia)

Manuals were essential because the original computers offered little in the way of guidance. The cursor of the Command Line Interface just waited, blinking. And few were able to make the magic.

If there is one rule of UX design, it is that the user should not need a manual to perform an application’s primary functions — just as with most physical tools, which are designed so it’s obvious how to hold and use them. But the new generation of user interfaces that rely on gesture, voice, and chat has one obvious similarity to the command-line interface: they lack affordances — cues such as buttons, links, or menus that help users know what they can do next.

The loss of affordances in applications risks returning us to the bad old days of personal computing, when the user was responsible for somehow divining how to use an application.

Mickey Mouse consulting the manual

Gestures

Mickey Mouse, perhaps using the iPhone XX

Nothing seems more magical than making things happen by waving your hands around. The modern phone is up to the challenge and can detect whether the user is pinching or swiping with their fingers. In many cases, this experience is delightful — for example, if the user appears to be physically manipulating a digital map.

But gestures also have a dark side. In most gestural user interfaces, there are no cues that indicate to the user what functionality is available. Plus, gestures are implemented inconsistently from app to app.
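
To make that invisibility concrete, here is a minimal UIKit sketch (the class and action names are invented for illustration) of how an app attaches a swipe gesture. Nothing in the API produces an on-screen cue that the gesture exists; discoverability is left entirely to the designer.

```swift
import UIKit

class PhotoViewController: UIViewController {
    override func viewDidLoad() {
        super.viewDidLoad()
        // Attach a left-swipe gesture. Nothing appears on screen to hint
        // that it exists: the API makes adding the gesture trivial and
        // makes discovering it the user's problem.
        let swipeLeft = UISwipeGestureRecognizer(target: self,
                                                 action: #selector(showNextPhoto))
        swipeLeft.direction = .left
        view.addGestureRecognizer(swipeLeft)
    }

    @objc func showNextPhoto() {
        // Advance to the next photo. A user who never swipes never finds this.
        print("next photo")
    }
}
```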

The iPhone X provides an excellent case study because Apple’s designers notably removed what had historically been the iPhone’s most prominent control: the Home button. Instead, the iPhone X user interface relies on gestures. While extending the use of gestures on the iPhone X was an interesting idea, its many unresolved design compromises mean that, a year later, users are still dealing with confusion. Let’s look at some of the gestures on the iPhone X.

Swiping Left & Right

The Safari browser app has a longstanding gesture: swiping from left to right goes back a page; right to left, forward a page. However, on the iPhone X, starting that swipe less than a centimeter lower unintentionally switches the user between apps.

Swiping in the browser
Swiping on the bottom bar

Swiping Up and Down

On earlier iPhones, the main function of the Home button is to go to the home screen; on the iPhone X, the swipe-up gesture does it. This is quickly learned, as it is an essential action. Unfortunately, when you’re trying to use a drop-down menu or play Pokémon Go, swiping up accidentally triggers the home gesture.

Safari select menu gesture conflicts with iPhone X gesture
Swiping up in Pokémon Go leaves the app
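
Apple did eventually give apps a partial escape hatch: since iOS 11, a view controller can ask the system to defer its edge gestures so that a first swipe up doesn’t immediately leave the app. A minimal sketch of that opt-out (the class name is illustrative):

```swift
import UIKit

class GameViewController: UIViewController {
    // Defer the bottom-edge system gesture: the first swipe up now only
    // reveals the home indicator, and a second swipe is required to
    // actually go home.
    override var preferredScreenEdgesDeferringSystemGestures: UIRectEdge {
        return .bottom
    }
}
```

That a game has to opt out of the platform’s most basic gesture says a lot about the conflict.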

3D Touch

3D Touch was Apple’s attempt to add a new gesture by sensing how hard a user is pressing the screen. Unfortunately, the feature has never taken off — probably because most people can’t tell the difference between a hard press and a long press. Try it. Now try to explain the interaction to someone else. The latest iPhones seem to be doing away with this gesture, but it is unlikely to be missed.
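
The ambiguity shows up even at the API level. Below is a rough sketch (the 0.75 threshold is invented for illustration) of how an app distinguishes the two: a hard press is a pressure reading from UITouch, while a long press is purely a matter of duration.

```swift
import UIKit

class PressableView: UIView {
    override func touchesMoved(_ touches: Set<UITouch>, with event: UIEvent?) {
        guard let touch = touches.first else { return }
        // A "hard" press is read as pressure. On devices without 3D Touch,
        // maximumPossibleForce is 0 and the feature silently vanishes.
        if touch.maximumPossibleForce > 0,
           touch.force / touch.maximumPossibleForce > 0.75 { // arbitrary threshold
            print("hard press")
        }
    }
}

// A long press, by contrast, is only about time on screen, with no pressure
// involved at all. No wonder users can't feel the difference.
let longPress = UILongPressGestureRecognizer()
longPress.minimumPressDuration = 0.5 // the system default
```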

Shake to Undo

Another example of a less-than-successful gesture is shake to undo. According to John Gruber:

“[The] shake gesture is dreadful — impossible to discover through exploration of the on-screen UI, bad for accessibility, and risks your phone flying out of your hand. How many iOS users even know about Shake to Undo?”

If people are still complaining about a UX-design choice ten years later, it probably needs improvement. For UX designers, the real risk is assuming that the user understands all these subtle gestural-UI concepts, so there is no need for additional cues.

Design is all about tradeoffs, but what was the tradeoff here? The reason for this design choice appears to have been to hide features and preserve a clean look. The iPad keyboard, by contrast, simply displays the undo action as a button.
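
For the curious, the whole gesture reduces to a single invisible motion callback. Here is a minimal, hand-rolled sketch of the same idea (UIKit also offers a built-in version via the system undo alert), which makes it obvious why there is nothing on screen to discover:

```swift
import UIKit

class EditorViewController: UIViewController {
    // Motion events are delivered only to the first responder.
    override var canBecomeFirstResponder: Bool { true }

    override func viewDidAppear(_ animated: Bool) {
        super.viewDidAppear(animated)
        becomeFirstResponder()
    }

    override func motionEnded(_ motion: UIEvent.EventSubtype, with event: UIEvent?) {
        // This is all Shake to Undo is: an invisible, device-wide motion event.
        if motion == .motionShake {
            undoManager?.undo()
        }
    }
}
```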

This is not to say that gestures are bad. Some gestures are perfectly natural: pinch to zoom out; spread to zoom in. The key is that the gesture’s intent is physical rather than metaphorical: the user is manipulating the virtual item on the screen, not sending an abstract command (see the sketch below).

Manipulating the virtual
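
A minimal UIKit sketch of that kind of direct manipulation (the class name is invented): the pinch maps straight onto the on-screen object’s scale, so the gesture explains itself.

```swift
import UIKit

class ZoomableImageView: UIImageView {
    override init(image: UIImage?) {
        super.init(image: image)
        isUserInteractionEnabled = true // UIImageView ignores touches by default
        addGestureRecognizer(UIPinchGestureRecognizer(target: self,
                                                      action: #selector(pinched)))
    }

    required init?(coder: NSCoder) { fatalError("init(coder:) not implemented") }

    // The pinch scale is applied directly to the view, so the object appears
    // to be physically manipulated; the gesture needs no explanation.
    @objc func pinched(_ gesture: UIPinchGestureRecognizer) {
        transform = transform.scaledBy(x: gesture.scale, y: gesture.scale)
        gesture.scale = 1 // reset so scaling is incremental, not compounded
    }
}
```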

Voice

Traditional magic usually requires the spoken incantation of spells. Today’s technology — including Siri, Cortana, Google Assistant, and Alexa — follows that pattern.

Given the amount of press voice user interfaces (VUIs) have received, you would think they’ve become the dominant user interfaces of our age. However, most people seem to be using their phones and computers just as they always have. Reportedly, people are using these always-on recording devices primarily to play music or as kitchen timers, never learning many other commands. (Who knew there was such an underserved need for kitchen timers?)

Is anyone surprised this interface is hard to learn?

But is anyone who studies users surprised that people aren’t mastering voice user interfaces? There are implicit limitations to voice user interfaces. Spoken language is imprecise — often by choice. When language is precise, speech is usually slow. We’ve all suffered through PowerPoint presentations with the presenter reading out every word on every slide. People read approximately twice as fast as they can speak (typical speech runs around 150 words per minute, while silent reading is roughly 250 to 300). Keep the phrase “A picture is worth a thousand words” in mind when people try to convince you that voice is the future of user interfaces.

By far the biggest issue with voice user interfaces is that they recreate the problem of the command-line interface: users don’t know what they can and cannot do.

Let’s consider Siri on the iPhone. Siri prompts the user, asking how she can help, then helpfully makes some suggestions regarding what she can do for the user. This is an improvement over the initial version of Siri, but it’s still not totally clear why the user would want to use this feature. Most of the examples could be done faster on screen than verbally. One example, “book a table for four in Palo Alto,” demonstrates the tradeoffs.

To make this voice application appear useful, Apple had to take a number of shortcuts:

  • Apple picked the time
  • Apple picked the four restaurants
  • Apple picked the reservation network
  • Apple doesn’t even suggest any restaurant that takes reservations over the phone.

The point is that real life is complicated. While voice assistants may aspire to replace a person, the necessary compromises make this prospect seem doubtful. It’s not a question of artificial intelligence; it’s the voice user interface itself that is limiting. Changing the details of time and place for a reservation is just so much easier on a screen than doing it verbally. Imagine trying to do the usual balancing of restaurant style, location, and availability by voice. Voice lets you say only one thing at a time, and only one person can speak at a time. This voice-as-command-line user experience is so limited that it’s hard to see how it can ever move beyond being a toy.

And let us not forget that interactive voice response (IVR) systems have been common on phones for decades, but people still consider using them a frustrating, slow experience. Despite their generally being good at using the phone’s physical affordances — for example, “Press 1 for sales; press 2 for service” or even “Why don’t you say the name of the movie?” — the slow nature of voice communication is a fundamental challenge for voice as a user interface.

From a design perspective, voice user interfaces may even be a step backward from graphic user interfaces. The creators of VUIs seem to expect that users should be able to formulate their requests to meet the requirements of the machine. I’m sure VUIs will get better as the computing power of mobile devices increases, but an over-reliance on voice interactions is a limitation, not an enhancement. While some believe that this is because computers were originally designed to be visual, it’s also because human beings have really good visual perception. In an attempt to solve the problems created by choosing verbal over visual interactions, Amazon has come up with a revolutionary idea: the Echo Show — a VUI device with a screen!

This is not to say that voice interactions don’t have a role in modern user interfaces. They’re pretty handy for hands-free use cases — such as sending a text saying, “I’ll be ten minutes late,” while driving. But the ideology that voice is the future sometimes causes today’s UX designers to make poor design choices.
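
On iOS, for example, that hands-free niche is served by the Speech framework. A minimal sketch of transcribing a recorded utterance (error handling omitted; the function name is illustrative):

```swift
import Speech

func transcribe(audioFileURL: URL) {
    SFSpeechRecognizer.requestAuthorization { status in
        guard status == .authorized,
              let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
              recognizer.isAvailable else { return }
        let request = SFSpeechURLRecognitionRequest(url: audioFileURL)
        recognizer.recognitionTask(with: request) { result, _ in
            // Partial results stream in; the final one carries the full text,
            // e.g. "I'll be ten minutes late", ready to drop into a message.
            if let result = result, result.isFinal {
                print(result.bestTranscription.formattedString)
            }
        }
    }
}
```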

“Negative Affordances”

A benefit of visible affordances that designers rarely discuss is that they let people know what they can do in an app. Users don’t generally expect features for which there are no affordances. For example, even though the iPhone has a Stocks app, people don’t expect to use it to trade stocks because its user interface lacks any buttons that would allow the user to do that. The app has not promised to provide that capability.

In contrast, voice user interfaces’ lack of affordances implicitly promises that they can do anything. If there are no obvious limitations, people expect apps to do everything and will be disappointed.

At the same time, users have no way of knowing what they can actually do. I’ve observed people trying out a few requests, then giving up altogether or learning just a small subset of common features. With voice assistants, people typically use only one or two skills — basic things such as playing music or setting a kitchen timer.

Chatbots are another user interface whose lack of guidance makes it difficult for users to know what they can do. So bots often end up sneaking affordances back in by displaying options to tap — which is not really chatting anymore!

Search

Search user experiences are the best example of enabling the user to accomplish an open-ended task in their own words, but mainly because the UX model is simple: Web search does not use a structured language of commands such as SQL; instead, it lets the user type a few words to query a search engine and find content on the Web.

Beware of dark magic!

In every fairy tale, unwary people stumble into magic unawares. One day, while driving on the highway, I accidentally summoned dark magic by triggering the Emergency SOS feature on my phone while trying to silence a persistent caller. The very loud Emergency SOS alarm almost caused an accident!

Video (warning: very loud): https://youtu.be/0nfc0qOmSRo

I’m sure this feature is a good idea, but hiding it and expecting users to invoke it using an obscure gesture means users might accidentally trigger it and have no idea what has happened or why. It’s like a loaded UX bomb that apparently strikes when people are trying to reboot or take a screenshot! After playing the loud alarm, the feature automatically calls 911. How many accidental calls to 911 are these designers responsible for?

Design Languages Are Languages

User experience is a language and language evolves. We create new words to describe new ideas. However, the evolution of language is not a linear process, nor is it purely functional. Slang evolves faster than formal language because it is driven by a need for novelty and the experience of knowing something other people don’t know or wanting to be ahead of the curve. The incorporation of most truly successful words — such as OK — into a language happens so quickly it is almost imperceptible.

User interfaces also evolve — and they have their own slang and trends. Experiments such as 3D Touch and virtual reality come and go. But, just as with language, their rate of change can leave people behind. People are forced to keep up with ever-evolving technology. Not everyone wants to spend time keeping up with each release of iOS or Android. Believe it or not, most people don’t really care what version they’re using. They just want to do stuff.

Sometimes an evolution is a simplification of a visual interface. For example, the lock screen on the iPhone is a basic function that has evolved from the obvious to the abstract.

Three generations of iPhone lock screens, showing increasing abstraction.

Tension exists between UX designers who are continually trying to advance the user experience and those who want to maintain consistency for existing users. While it’s great when designers make meaningful design improvements, they need to be wary of changing things just for stylistic reasons such as flat design or minimalism — especially if they’re attempting to achieve minimalism by making functions invisible.

Skeuomorphism got a bad rap, but it did help people to recognize a calendar or notepad on their screen. The trend toward flat design has had a tendency to make every app look the same, which makes it harder for users to recognize their current context.

The original iPhone was so successful because the user could simply touch an element on the screen to do something, and they could always get home again. The Home button was the only button on the front of the phone. There seems to be a new generation of UX designers who think basic UX-design standards are boring and are constantly seeking the next big thing.

The use of gestures can become a crutch — similar to the urge to push all decisions to the settings screen. The biggest sin in design is denial: hiding complexity instead of solving it.

“People who understand the system can take advantage of it, but people that don’t — the people that don’t even change their ringtone — suffer from this sort of thing.” -Jesús Díaz, “The iPhone X Is a User Experience Nightmare”


Mastering the Mystical Arts

The dream of magical user interfaces is to be more like the movies: complexity that is effortless and easy to navigate. Design is all about tradeoffs. Steve Jobs was famously against adding a second mouse button because he wanted to keep the device simple; at the other extreme, a mouse covered in buttons is clearly too complex for novice users. Nevertheless, the scroll wheel has been a useful addition to computing.

Microsoft has a long history of teaching people how to use computers. It offers the useful concept of keyboard accelerators: shortcuts presented alongside the most discoverable way of performing an action — for example, showing Ctrl+S next to Save in menus and context menus.

Gestures can also be accelerators, but for the average user, there should be an obvious affordance for every action. iOS Mail’s swipe to delete a message from an email list is a great example of an accelerator: if users don’t know it’s there, they can just as easily tap each email message, then tap its Delete icon. iOS Mail also hides its search bar under a swipe-down gesture.

Such small affordances do not block users and could help them become more proficient with their applications. The only design compromise necessary to give search a visible affordance would be the addition of an icon — and let’s be honest, there is space for one.
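
In UIKit, such an accelerator plus a visible fallback costs very little. A sketch (the class name is hypothetical) of a mail-style list where the swipe is the fast path and an ordinary, tappable Delete control remains the discoverable one:

```swift
import UIKit

class MailboxViewController: UITableViewController {
    // The swipe is an accelerator: a faster path for users who already know it.
    override func tableView(_ tableView: UITableView,
                            trailingSwipeActionsConfigurationForRowAt indexPath: IndexPath)
        -> UISwipeActionsConfiguration? {
        let delete = UIContextualAction(style: .destructive, title: "Delete") { _, _, completion in
            // Remove the message from the model here. The tappable Delete icon
            // inside each message remains the discoverable path for everyone else.
            completion(true)
        }
        return UISwipeActionsConfiguration(actions: [delete])
    }
}
```

The swipe speeds up experts without hiding the function from everyone else.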

However, the Camera icon on iOS 6’s lock screen is an example of the wrong way to educate users. There used to be a Camera button on the lock screen; the designers then changed the action to a swipe up. They could simply have made tapping the icon work, but instead chose to display a passive-aggressive animation that communicated: “I know what you want to do, but I’m going to forcibly teach you how I want you to do it.”

“Like Magic”

For voice and gesture interfaces to become truly powerful, we need to understand their limitations and evolve past the ideological demand that we do everything verbally. Imagine a restaurant-booking experience that showed the user’s request on the screen and allowed the user to edit it using both touch interactions and verbal commands.

“Like magic” is the highest compliment a design can get, but as we all know, fooling with magic can have dire consequences. If we don’t acknowledge the technological powers we summon, we UX designers risk repeating the technology-driven conditions that created the UX field in the first place.

But when we get the balance right, it is…

Shia LaBeouf on Saturday Night Live

An earlier version of this article was published at UXmatters.com
