EXPERIENCE DESIGN
Video conferencing is failing our new COVID-19 reality
How to create virtual spaces that establish presence.
Until recently, I worked on Facebook’s Portal device, where I helped plan and design the future of video calling. My research and design work have focused on the idea of presence in virtual environments, or how to make online meetings feel more like real life. This all felt very niche until this year, when a virus spread across the planet and disrupted daily life.
As most office workers have shifted to working from home under quarantine, a sudden spotlight has been cast on the shortcomings of today’s video conferencing options. People isolated at home are at higher risk for depression, anxiety, and a sense of disconnectedness. While many tech platforms talk about connection in their mission statements, no platform has fully succeeded at creating a sense of genuine presence: the sense that we are truly with other people. The options for video conferencing today are grim: inflexible, asocial, and visually dated. To move forward, these solutions, or wholly new ones, will need to embrace the importance of presence by supporting more flexible camera ecosystems, increasing caller visibility (in feeling as much as visuals), and seeking out ways to communicate the idiosyncrasies that make being with someone feel real.
For centuries, fairy tales told of enchanted objects like magic mirrors and crystal balls that enabled one-way or two-way visual communication. Radio and television made those fantasies seem possible, and at the 1964 World’s Fair, AT&T unveiled the first commercially available teleconferencing system, called Picturephone. It had a call time limit of ten minutes and was astronomically expensive. Little movement happened in the field until the 1990s, when advances in video compression, camera hardware, and the internet made webcams and chat rooms a part of modern living.
Presence starts with visibility
As cable internet dramatically increased the fidelity of online experiences, enterprise players like WebEx, UStream, and Macromedia Breeze forged the first major video conferencing and e-learning platforms. They entrenched themselves in legacy enterprise IT solutions and have done little to advance the user experience since.
Designs for these platforms were focused on broadcast rather than conversation, designed with of-their-day metaphors, like “call” or “hang up,” for easier use, and often favored a one-way presentation format. Audience members felt anonymous — no profile pictures, rarely video streams of their own, and no indication of whether they were active or away. The lack of visibility in the experience hurt participation.
I worked for a professional coaching company at the time, and it was exhausting for online instructors to keep the invisible participants focused. They lamented how much harder it was to engage an online classroom than a physical one.
By today’s standards, the designs of the 2000s-era platforms were hollow and comically literal. Today’s interfaces, if they haven’t already, should do away with dated physical references and emphasize direct ways to show that participants are online and active. Relying on physical metaphors often takes more space on the screen and can create an uncanny valley that inhibits an authentic sense of presence.
Utilities like Google Docs and Dropbox Paper today show how even platforms not primarily focused on synchronous conversation can become more social by showing your collaborators’ cursors and avatars to create a greater sense of connection. We don’t need visual cues that remind us that connecting via video vaguely resembles a tool we used to use called a telephone. By stripping those cues away, we help make the connections between each other feel more immediate.
Of all these early approaches, Skype was really the first to point the way to making presence a greater priority. It was one of the first platforms I ever used to video-call friends and family while traveling abroad. Those early calls look pixelated and rough now, but they were a leap forward in seeing, hearing and being near to someone.
Meanwhile, experiments like ChatRoulette, launched in 2009, showed that video conferencing could actually be fun and fluid. World of Warcraft and Xbox Live were pioneers in how big and how important online communities built on presence could be, even if, in these cases, “you” were represented by your in-game avatar. These gaming platforms also began to spur a shift in the perception of how people could play and collaborate virtually rather than in person. The anonymous paradigms that had built the giants in the industry were changing almost as soon as they were formed.
Real presence is flexible
Mobile phones, and more recently smart home devices, catalyzed a new wave of video platforms just as cable internet had done before. Apple launched FaceTime. Google launched Hangouts. Ex-WebEx employees founded Zoom. Amazon, Facebook, and Google would all create smart home displays that leveraged cameras to make conversations more casual and seamless.
The world’s biggest social networks, including Facebook, YouTube, and Twitter, built live video features and pushed them aggressively, in the process familiarizing most people with a new vocabulary of interactions based on real-time reactions and commenting. Pop-tech phenomena boomed around synchronous (Periscope) and asynchronous (Vine, Snapchat) video communication.
In gaming, Twitch built an entire subculture around live streaming video games. Professional gamers, Twitch celebrities, and eSports all became a thing. Discord built experiences around walkie-talkie style audio chat for teams of gamers.
These platforms all raised the bar in user experience. Not only did the diversity of devices allow for more flexible times and positions for making video calls, but it also encouraged video calls to feel more casual and friendly. When you can sit as you would with a friend in real life, you transcend a video chat toward a state of true presence in your conversation.
For Portal, the breakthrough feature was the smart AI “cameraman.” The Portal device itself is a small mounted display, roughly a foot across, that sits comfortably on a kitchen counter or living room end table. The smart cameraman can zoom out to include new speakers in the room or follow you while you naturally move around a room. The freedom to sit, stand, or walk comfortably while talking to someone else greatly contributes to a sense of togetherness. In fact, it was jarring to go back to a Cisco WebEx system in my new office, where the camera stared flatly at a wide angle and left the speaker as a tiny pixelated corner of the image. At home now, I feel chained to the view range of my webcam. I’m constantly reminded of how hard it is to zoom in on an object I’m holding or to bring a fellow “quarantiner” into frame.
While enterprises over the past several years have shifted much of the communication load off of email and onto shorter, more casual text chat platforms, B2B video conferencing didn’t take the hint on flexibility and remained largely stagnant. Imagine being able to take a call in your home office and choose to continue the discussion as you walk to your kitchen for a glass of water, with the various camera-equipped smart devices in your home handing the call off to one another along the way. Our homes aren’t yet outfitted with the hardware to handle such flexibility. But with the white-collar world working from home for the foreseeable future, now is the time to accelerate such options and help free workers from their chairs into a more natural, organic virtual conversation space.
Personal idiosyncrasies help build presence
Zoom backgrounds and AR filters have introduced games and storytelling to video calls, which help make connections between people over video feel more personal. Experiments in workplace VR are reviving design discussions around how to establish an authentic sense of presence among virtual participants while steering clear of the uncanny valley.
These efforts feel like the right first steps toward the third key pillar in establishing presence: personality, which is built on idiosyncrasy. I have a favorite design principle that I quote frequently: reduce frustration, not friction. Too many interface designers assume that getting someone from A to B as quickly as possible is the best experience. You’ll often hear them brag about how you can do something in “three clicks or less.” But in establishing presence, the criteria are more complicated.
Real connection is built on the messy interactions between people — which is not an efficiency problem as much as it is an environment problem.
When designing for Facebook’s Portal, I often looked for opportunities to take UI out of the task at hand. If you and a friend were playing a game physically next to one another, idiosyncrasies around how a player holds their cards, whether they challenge your Scrabble word, and even how they set up the board are all parts of an experience that contribute to a sense that you’re really there. My grandmother is an infamous cheater who likes keeping her UNO cards close to the edge of a table so she can ditch some discreetly. It’s a detail that makes playing with her a (ahem) distinctive experience.
Interfaces until now have often sought to pave over these moments with efficiencies. If a product designer today made a virtual game you could play across video calls, they’d probably go straight to auto-sorted cards, auto-challenging invalid words, and randomly deciding who goes first in the game. While the intent would be to make the experiences “easier to use,” it actually sanitizes these interactions of their humanity.
Video conferencing today should be mindful of these opportunities for personality and perhaps even remove some assumed efficiencies in pursuit of more human-driven exchanges. What if video conferencing could introduce more serendipity to daily work? How might physical gestures like a hand wave goodbye or a shushing finger to the lips remove some of the digital layer between two humans talking?
It’s also important to remember that much of what constitutes an in-person social experience is passive. A physical office is an always-on experience where it is easy to chime in, interrupt, or watch something with someone. We see from emerging apps like Airtime and Squad that quality time can be spent just watching together. We learn from platforms like Twitch and YouTube that following someone’s “work” can drive togetherness, discussion, and community. The ability to move naturally in front of a device highlights the impact that smarter cameras and human-centered design can have. When our video conferencing blends with our physical space, we don’t feel like we’re quarantined. We feel together.
The opportunity to finally redesign and bring about closer relationships through virtual platforms is exciting. More than ever, our mental health and our business success depend on easy collaboration that feels realistic and human. From day-to-day interactions with co-workers to sales calls and client meetings to large-scale events, every social dimension of business will likely be mediated by digital platforms for months to come. That’s a harrowing thought, but it also gives us the time — and creates the imperative — to make these digital interactions better.
For now, the best solutions lie in blending the platforms that get aspects of presence right. But we can do better. Together we can spend this challenging time sharing best practices, developing and lifting up new platforms that offer better solutions, and working with account teams to share feedback with product teams. Our virtual spaces need visibility, flexibility, and idiosyncrasy in this unprecedented time of isolation, helping us hold onto our human ties and maybe even make them stronger.