Human-centered AI cheat-sheet

Josh Lovejoy · Published in UX Collective · May 11, 2019

On May 17, 2017, Sundar Pichai stood onstage at Google I/O and told everyone that Google was moving from being mobile-first to AI-first. Meanwhile, back at the Google offices, there were quite a few folks looking around asking “what does that actually mean?”

Those of us fortunate few UXers who’d been tinkering with integrating AI into early-stage product development immediately started to see an influx of interest from our peers. So we began scraping together internal workshops to share the tips and tricks we’d picked up along the way. Those grew into a series of talks, then articles, company-wide office hours, a mentorship program, an internal education series, and ultimately the People + AI Guidebook.

Throughout my journey as a UXer working on AI, I’ve been refining this cheat-sheet of questions-as-guidance. It’s helped me through countless consultations, crits, and jam sessions, and has continued to be my safety blanket as I’ve transitioned to Microsoft, where I lead design for Ethics & Society. Hopefully others will find it useful, too :)

Over time, the cheat-sheet has affinitized into five areas:

  1. Foundations… when I feel like I’m missing some of the “ok, but why?” basics.
  2. Quality… when I feel like there are important unstated assumptions.
  3. Machine teaching… when I feel like the value of common sense might not be getting fully appreciated.
  4. Learning over time… when I feel unclear about the feedback loops.
  5. User agency and optionality… when I feel like there may be a “we know better than the user” attitude.

Foundations

AI is uniquely suited to situations where people can collectively agree on what “good” looks like, but it would either be infeasible to code if/then/else application logic that’d consistently produce good results or impractical for people to perform the task manually.

It’s also important to maintain a healthy skepticism about where AI can actually add unique value…

People: Thrive on novelty but get easily distracted
Machines: Thrive on repetition but never lose focus

People: Ask questions out of curiosity
Machines: Can respond instantly with precision

People: Adapt their behavior based on a small number of examples
Machines: Adapt their behavior based on a large number of examples

Inspiration

What human needs are being addressed?

Differentiator

How might a probabilistic* system uniquely address these needs? (* A system that uses fuzzy logic that’s been “learned” from patterns in reference data instead of deterministic logic that’s been manually coded)
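To make that footnote concrete, here’s a minimal sketch in Python (using scikit-learn) of the same toy task solved both ways; the task and every name in it are hypothetical:

```python
# A deterministic rule vs. a probabilistic, "learned" one, on a toy
# message-flagging task. All names and data here are hypothetical.
from sklearn.linear_model import LogisticRegression

# Deterministic: manually coded logic; same input, same answer, always.
def flag_message(num_links: int, has_greeting: bool) -> bool:
    return num_links > 3 and not has_greeting

# Probabilistic: the boundary is learned from labeled examples, and the
# output is a confidence (a "hunch"), not a hard rule.
X = [[0, 1], [1, 1], [5, 0], [8, 0]]  # features: [num_links, has_greeting]
y = [0, 0, 1, 1]                      # labels: 0 = fine, 1 = flag
model = LogisticRegression().fit(X, y)

print(flag_message(4, False))         # True, by fiat
print(model.predict_proba([[4, 0]]))  # e.g. [[0.3, 0.7]]: a hunch, by pattern
```

The deterministic version will never get better or worse; the probabilistic one is only as good as the patterns in its reference data.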

Aspiration

Our job is to improve the lives of as many people as possible by augmenting their capabilities, so…

Let’s make … (Product or program)
For … (Who)
So they can … (Something they couldn’t do before)

Quality

The burden of proof that any part of a system can be automated should hinge on the strength of agreement — across a broad spectrum of people — about what useful outcomes look like; and how well that agreement holds up across a diversity of users, use cases, and environments of use. Furthermore, it must be possible to compare performance between socially-constructed groups, even if those groups aren’t equally represented in the data set.
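As a sketch of what that comparison could look like in practice, disaggregated evaluation can be as simple as computing the same metric per group rather than only in aggregate; the data and group labels below are hypothetical:

```python
# Compare a model's accuracy across groups, not just overall.
# All labels, predictions, and group names here are hypothetical.
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    hits, totals = defaultdict(int), defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        hits[group] += int(truth == pred)
    return {g: hits[g] / totals[g] for g in totals}

# Group "b" is underrepresented, but its performance is still
# measurable and comparable against group "a".
print(accuracy_by_group(
    y_true=[1, 0, 1, 1, 0, 1],
    y_pred=[1, 0, 0, 1, 0, 0],
    groups=["a", "a", "a", "a", "b", "b"],
))  # {'a': 0.75, 'b': 0.5}
```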

Level-setting

What are we trying to predict?

Interaction

What goals will drive optimization? Over what time frame?

Utility

What would a useful prediction look like?

Perspective-taking

In what contexts do we believe predictions will be most useful? Why?

Impact

Who stands to benefit most? Least?

Representativeness

Why do we believe the training data are representative of the expected users, use cases, and contexts of use?

Accountability

How will we respond when the system makes unwanted predictions that substantively harm people?

Machine teaching

Machine learning is a process of teaching an AI to develop a ‘hunch’ about something. Traditional software engineering, by contrast, is about rote memorization; i.e. ‘recognize this precise scenario and do this precise thing’. But remember, if a human can’t do it, neither can an AI, so it’s important to initially focus on tasks that are grounded in some form of real — or at least theoretical — human expertise.

Therefore, when evaluating the unique capabilities of an AI, it can often be useful to frame things from the perspective of a human expert; and in particular how that expert might struggle due to the hard constraints of time, attention, or memory. An AI trained with sufficiently representative examples of what useful options look like can pore over mountains of data — applying the fuzzy logic it’s learned — without ever getting tired, distracted, or forgetful.

Expertise

Describe the way a theoretical human “expert” might perform the task today.

Learning

If a human were to perform the task, what information would they need? How would they get that information?

Agreement

If a human were to perform the task, what information would they consider critical to pay attention to and what would be ignorable?

Critical to pay attention to … (Examples of what people think they would pay attention to if they were performing the task)

Ignorable … (Examples of what people think they could ignore if they were performing the task)

Humility

If a human were to perform the task, what might they say or do when they weren’t confident?

Obviousness

If a human were to perform the task, what assumptions would you want them to make?

Reinforcement

If a human expert were to perform the task, how would you respond so that they’d improve the next time if they … (the four outcomes below form a standard confusion matrix; see the sketch after the list)

  • Made a helpful prediction about what to do (true positive)
  • Made an unhelpful prediction about what to do (false positive)
  • Made a helpful prediction about what not to do (true negative)
  • Made an unhelpful prediction about what not to do (false negative)
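Those four outcomes are exactly the cells of a confusion matrix, and tallying them separately is what makes the feedback loop actionable, since a reinforcing response (for the helpful cases) looks different from a corrective one. A minimal sketch of the bookkeeping, with hypothetical labels and data:

```python
# Tally the four outcome types named above into a confusion matrix.
# 1 = "do it", 0 = "don't do it"; all data here is hypothetical.
def confusion_counts(y_true, y_pred):
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred and truth:
            counts["tp"] += 1    # helpful "do it"
        elif pred and not truth:
            counts["fp"] += 1    # unhelpful "do it"
        elif not pred and not truth:
            counts["tn"] += 1    # helpful "don't"
        else:
            counts["fn"] += 1    # unhelpful "don't"
    return counts

print(confusion_counts(y_true=[1, 0, 0, 1], y_pred=[1, 1, 0, 0]))
# {'tp': 1, 'fp': 1, 'tn': 1, 'fn': 1}
```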

Learning over time

When learning to perform a task, people are well adapted to quickly evaluating and pruning the characteristics of the task that lack practical value. Algorithms, meanwhile, will treat all features of the data as equally important unless told otherwise. Furthermore, the more confident people feel about their abilities when interacting with a system, the more they will persevere, learn, and succeed in using that system.

Feedback

What can the user do to make the AI work better for them?

Adaptation

How should the AI grow with the user? How do we expect the user’s behavior to change after the 10th use? 100th use? 1000th use?

Context

How should the context of use affect the way the AI behaves?

Performance

How “wrong” can the AI be, and when?

Freshness

Which parts of the reference data will stay the same over time? Which parts will change? How much change is acceptable before it’s time for a refresh?

User agency and optionality

The UX of an AI should start with the assumption that a human being will have the final say. The more an AI’s behavior is — or should be — affected by personal context, the more reference points and calibration opportunities should be offered to the user. Said another way: The role of AI shouldn’t be to find the needle in the haystack for people, but to show them how much hay it can clear so they can better see the needle themselves.

Mental modeling

If a human were to perform the task, what questions might a user ask them in order to understand their goals? How might these questions change based on the user’s context? (e.g. time of day, environment, past experiences)

Authorship

How might the AI affect the way the user expresses their creativity or expertise?

Safety

What do we want the user to rely on the AI to do? What might the user stop doing because of that reliance?

Josh leads design for Ethics & Society at Microsoft, guiding technical and experience innovation towards ethical, responsible, and sustainable outcomes.
