Synthetic faces, facial recognition and the lesson of Hikikomori

Dr. Adam Hart
Published in UX Collective
7 min read · Jan 18, 2020


Sam and Roman © Soul Machines

“Patients frequently describe a sense of relief at being able to escape from the painful realities of life outside the boundaries of their home. …With advances in digital and communication technologies that provide alternatives to in‐person social interaction, hikikomori may become an increasingly relevant concern.” World Psychiatry Today 1/10/20

“But, unlike real people, digital humans are accessible 24 hours a day, seven days a week; they never have an off day; they never forget; and they can use deep troves of data to provide in-depth, real-time answers.” digitalhumans.com (previously FaceMe.com)

The Turk was a fake chess-playing machine, purporting to be an automaton, built in 1770 by Wolfgang von Kempelen. It astounded audiences and the emperors of Europe with its high level of play, including demonstrations of the knight's tour, whose solution was poorly understood at the time. In fact, the machine concealed a succession of chess masters cleverly hidden inside its cabinetry.

Von Kempelen’s reluctance to keep exhibiting the machine only drove demand for displays of its chess prowess, but he was more interested in researching steam engines and machines that replicated human speech, so he claimed the Turk was in disrepair.

In 1781, Emperor Joseph II ordered the machine delivered to Vienna for a demonstration before Grand Duke Paul of Russia, the future Emperor Paul I. Its popularity continued: Napoleon I played and lost to the Turk in 1809, and it was exhibited in New York in 1826. Its secret was only revealed in 1857, by an American chess magazine.

Complex devices have existed since ancient times (the Antikythera mechanism, for example), and inventors have long pushed the boundaries of what is possible in mimicking, automating, replacing or augmenting human work.

Robots, at least, are real machines. Welcome instead to the new world of Soul Machines: your new best digital human ‘friends’ and enterprise customer sales and service assistants.

Soul Machines markets digital avatars with chatbot-like language capability, including speech, calling them ‘Digital Heroes’; behind each face sits an IBM Watson Q&A back end. These are next-generation user experiences being upsold as the solution to effective and efficient customer engagement.

A live example is helping people with a disability access information through this new automated user experience. The Australian Government’s National Disability Insurance Scheme deployment of Soul Machines’ ‘Nadia’ has been deferred after a $3.5M spend, but it will be released eventually, because the numbers demand it: roughly 6,000 calls a week at about $25 each comes to about $7.8M a year.

It is superficially a noble premise, but a worrying one. People with disabilities, some of them cognitive in nature, among the most vulnerable in society, are being fobbed off to an automated UX: forced to interact with a digital IVR on steroids, backed by a 3,500-entry Q&A database and voiced pro bono by Cate Blanchett, with every propensity for nuance to be lost and misinterpretation or confusion to ensue.

It is perplexing how much the marketer’s vocabulary taints Soul Machines’ public portal:

“Digital Heroes”, “Human OS”, “Digital Brain”, “Modelled on research from the world’s leading neuroscientists”, “89% of customers achieved their goals through engagement with a Digital Hero”, “More than 81% of customers would chat to a Digital Hero again”, “Creating deep connections between brands and fans”, “Emotional Intelligence”, “Emotional Responsiveness”.

Even using the word ‘Soul’ is really odd.

Just as video games provide a virtual world to explore, build or kill in, this is a very smart, gamified UX avatar: a digital 2D representation of a human face that uses nonverbal cues to elicit an emotional response from the real human observer.

Soul Machines’ New Zealand CEO, Associate Prof. Mark Sagar, spent years in the US film industry, working on films such as Avatar, King Kong and Spider-Man 2, and won two Academy Awards for his facial animation technology. Pairing that expertise with the University of Auckland’s Bioengineering Institute is impressive.

Crossing risky boundaries

What is not impressive, and perhaps unconsidered, is that, as with UneeQ and Realbotix, a boundary is being crossed. This is the gamification of an inauthentic experience leaking into, and intermixing with, reality.

A video game is like an immersive adventure novel: we turn it on, willingly suspend disbelief, and remain within that fantasy context the whole time. No harm done, except lost time or, in rare cases, addiction.

Now these new classes of UX are stepping out of a discrete, bounded context to become contiguous with a wider, complex reality, one where we will make decisions based on advice from code that bears no consequence for its errors, unlike a human assistant, who can face penalties for wrong or misleading advice.

Using a biomechanically based model that integrates with current models of the brain, emotional cues are rendered almost naturally, while the IBM Watson Q&A or chatbot databases, delivered as human speech, aim to coax human cognition into digesting the machine’s utterances.
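
To make the pattern concrete, here is a minimal sketch of such a pipeline in TypeScript. Every name in it (answerQuestion, cueFor, the confidence threshold) is a hypothetical illustration of the architecture described above, not Soul Machines’ or IBM Watson’s actual API.

```typescript
// Hypothetical sketch: a Q&A back end feeding an emotive avatar front end.
// None of these names correspond to a real vendor API.

interface QnAAnswer {
  text: string;       // the scripted answer drawn from the Q&A database
  confidence: number; // how sure the back end is of the match (0..1)
}

interface EmotionCue {
  expression: "smile" | "concern" | "neutral";
  intensity: number;  // 0..1, drives the facial rig
}

// Stand-in for the Watson-style back end: match the user's utterance
// against a fixed question/answer set.
function answerQuestion(utterance: string, qna: Map<string, string>): QnAAnswer {
  const key = [...qna.keys()].find((q) =>
    utterance.toLowerCase().includes(q.toLowerCase())
  );
  return key
    ? { text: qna.get(key)!, confidence: 0.9 }
    : { text: "Sorry, I don't have an answer for that.", confidence: 0.1 };
}

// Stand-in for the avatar front end: choose the emotional cue rendered
// alongside the spoken answer.
function cueFor(answer: QnAAnswer): EmotionCue {
  return answer.confidence > 0.5
    ? { expression: "smile", intensity: 0.7 }
    : { expression: "concern", intensity: 0.4 };
}

// One conversational turn: text in, speech text plus facial cue out.
const qna = new Map([["opening hours", "We are open 9am to 5pm, Monday to Friday."]]);
const answer = answerQuestion("What are your opening hours?", qna);
console.log(answer.text, cueFor(answer));
```

The sketch is deliberately thin, because that is the point: a string-matching lookup feeding an expression selector is enough to present as an empathetic face.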

This is potentially risky wherever a mistake can be made, a cognitive disability is in play, or miscommunication occurs.

What is much worse than all of this, not really disclosed but probable, is that throughout the experience the tech measures us, captures our data and applies facial recognition to the user.

The bioengineering technology that lets them generate a face mimicking the facial movements that convey empathy obviously has its source not in analogues but in real, recorded humans expressing a suite of emotions that have been catalogued and API-enabled. Each human user is thus continuously measured and recorded for their response to the synthetic stimuli.
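
What that continuous measurement might look like, reduced to a skeleton, is sketched below. Again, every name and signature is hypothetical, not any vendor’s actual telemetry code; the point is the shape of the loop: stimulus out, facial response in, both recorded.

```typescript
// Hypothetical sketch: logging the user's facial response to each
// synthetic stimulus the avatar presents.

interface ExpressionEstimate {
  valence: number; // -1 (negative) .. 1 (positive)
  arousal: number; // 0 (calm) .. 1 (excited)
}

interface FeedbackSample {
  timestampMs: number;
  stimulus: string;             // what the avatar was doing
  response: ExpressionEstimate; // how the user's face reacted
}

// Stand-in for a facial-expression model scoring one camera frame.
// A real system would run a trained model here; a fixed placeholder
// keeps the sketch self-contained.
function estimateExpression(frame: Uint8Array): ExpressionEstimate {
  return { valence: 0.2, arousal: 0.5 };
}

const feedbackLog: FeedbackSample[] = [];

// The bidirectional loop: render a stimulus, measure the reaction,
// keep the record. This is the half of the exchange the user never sees.
function recordTurn(stimulus: string, frame: Uint8Array): void {
  feedbackLog.push({
    timestampMs: Date.now(),
    stimulus,
    response: estimateExpression(frame),
  });
}

recordTurn("smile:0.7", new Uint8Array(0)); // placeholder camera frame
console.log(feedbackLog);
```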

Just as chatting with the Mitsuku chatbot can elicit a real response of laughter or anger, interacting with the digital ‘soul’ is a new class of synthetic user experience built on bidirectional feedback.

While the previously clear line between real and non-real (digital) experiences is being crossed and blurred by synthetic digital experiences, the real experience is being captured in a feedback loop.

Dr. Sagar even made a synthetic digital learning baby that grows up, Baby X.

Baby X

This new class of UX makes for a synthetically compelling channel experience, one customers are enticed to rely on, forming ‘meaningful’ bonds on behalf of corporate clients who want to offload the cost of customer contact.

The fundamental premise of human experience is to seek out meaning and authenticity. The back end of these faux ‘digital heroes’ may be a weak Q&A set at the moment, but if the front end eventually converges with an emergent artificial general intelligence (AGI), whether in 10 years or 100+, does that diminish the need to seek out meaningful experiences?

And, instead of making a meaningful connection with a real human, would I prefer to interact with a synthetic reflected version of my own identity, a shard of myself?

In Japan today, an estimated 700,000 people, with an average age of 31, choose to live in extreme social isolation. The cause of this phenomenon, called hikikomori, is uncertain.

What is certain is that even conservative medical practitioners are concerned about the loss of social interaction as online technologies provide ever easier alternatives to in-person contact.

If, as a child, I was rejected and learned to practise interpersonal avoidance to reduce hurt, these new classes of UX provide the easy option; and because of the facial measurement that facial recognition enables, the synthetic interface is a far more compelling, and far more dangerous, feedback-empowered interface than a wordy chatbot like Mitsuku [1].

Designed not to offend, learning what you talk about so it can serve up more of the same, narrowing real experience: this must have consequences for cognition.

What is very sad about this development is that the people most vulnerable to these experiences will gravitate towards them. Those at risk of social isolation, whether through a disability, a past hurtful experience or some other tendency, are the likely first casualties of tech that crosses the boundary between the real and the non-real, filling a gap that society cannot, at the cost of part of their sanity, as they derive faux yet seemingly meaningful experiences from synthetic digital conversations.

As we have said before, while technology is ethically neutral, it is the human inventors and owners of the tech who have a duty to see it deployed for good and sane purposes, not just for profit. Marketing this tech with language like ‘digital heroes’, ‘digital brain’, ‘soul machines’ and ‘Baby X’, absent any critical thinking about the possible bad outcomes for people at risk, is poor form. It may result in real psychological harm that, while legally waived by customers accepting the software’s terms of use, cannot be ethically waived at all.

Footnote

[1] Although a cartoon-like avatar experience is available on Twitch.
