On input masking: taking the pain out of forms

So what? 😕

Published in UX Collective · 15 min read · May 22, 2020

A form is, at its core, a series of inputs into a database. Yet users don’t always fill out inputs the same way, even with clear prompts. If users enter invalid data, affirm stodgy database architects, it’s their fault for not following ostensibly clear instructions. This assumption is as haughty as it is common. Indeed, an average of 89 percent of users enter numerical data based on personal preference rather than in the format the input field displays.

When the overwhelming majority of users make a mistake, it’s not a mistake; it’s an intuition unaccounted for in the form’s design. The assumptions underlying that design must change to accommodate the behavior, not vice-versa.

An old form dating to the census of 1850. — Digital forms reflect tabular data, such as these entries from the 1850 Census. Source: U.S. Census

The Wisdom of Forms 🦉

The global shift from paper to digital pages has done nothing to remove forms from our lives. Forms have come a long way since Roman surveyors recorded property boundaries on tablets. In the early days of graphical user interfaces, when computers had moved beyond the DOS-like command line into grayscale GUIs, forms closely resembled their paper predecessors. Empty input fields with little to no validation, they did little to prevent user input errors. The silver lining, of course, was that typed input was at least legible.

We’ve had a go at collecting and standardizing forms for a few millennia, and over that time a few things have come clear:

Information tends to fall into discrete categories.
It’s best to ask for information one detail at a time.
Ask for what’s needed rather than what’s already known.

I’ve distilled these into six lessons — six because it’s a bare minimum for what I consider best practice for form inputs. Afterwards, I provide a bunch of code prototypes that put these lessons into practice; feel free to borrow them for your own projects.

Lesson 1: Reconsider What’s in a Name

A bone of contention among designers is the ideal way to enter a user’s full name. Forms designed for English speakers split names into three parts: first name, middle name or initial, and last name.

A more recent trend condenses these parts into a single field that separates them again with a string parser. Efficient though it may seem, running a parser over a “full name” input field is risky. How does the parser handle hyphenation or compound surnames? Can it accept a middle name? Does it accommodate users without traditional middle names? What about orthographic accents to denote exceptions to standard syllabic pronunciation?

How, for example, might a full name parser manage naming customs in Spanish or Portuguese-speaking countries? Though customs vary by region, Hispanophones (and Lusophones, with some differences) generally receive two first names, an optional middle name, and paternal and maternal surnames. The combination of compound surnames and prepositional particles, such as “y” or “de,” yields names easily capable of overwhelming all but the most carefully crafted parser.

The anatomy of the former prime minister of Spain, broken into his forename, paternal surname, and maternal surname. — A breakdown of the full legal name of Spain’s prime minister between 2004 and 2011.

An Anglocentric string parser would interpret Francisco Gómez de Quevedo y Santibáñez Villegas, a seventeenth-century Spanish poet, as “Francisco Villegas” rather than Francisco de Quevedo. Briefer names, like Gabriel García Márquez, would fare no better; García Márquez is a combination of his paternal and maternal surnames.

Dom Manuel II, the last king of Portugal. — Manuel II, the last king of Portugal (1889–1932) whose reign ended with the proclamation of the Portuguese Republic in 1910. Source: Agência Geral de Gravura de Lisboa

Sure, nobody expects a parser to make sense of names like that of the hapless last king of Portugal, Manuel Maria Filipe Carlos Amélio Luís Miguel Rafael Gabriel Gonzaga Xavier Francisco de Assis Eugénio de Bragança Orleães Sabóia e Saxe-Coburgo-Gotha. Most people refer to him as “Manuel” and tack on “the Unfortunate” as a sobriquet, his misfortune having nothing to do with the algorithm that compresses his name into “Manuel Saxe-Coburgo-Gotha.”

To keep things simple, keep them separate. Always include a first and last name field, and an optional field for the user’s middle name or initial. If you’re masking the field’s input, allow for multiple names, hyphens, spaces, inter-word separators, and orthographic accents.

Lesson 2: It’s Better to Prevent than to Fix 🛡️🔧

Asking for too much information has always been annoying, but it’s also costly. No fewer than 23% of online shoppers abandon a checkout flow because it’s too long, with the average flow containing 23.48 form elements. (Around 53% leave off because of costs and 31% do the same because the site tethers feature usage to creating an account.)

Flow abandonment is the designer’s worst nightmare; it suggests a lack of understanding of, or worse, a lack of respect for the user. Keep form flows brief by asking the bare minimum needed for the user to complete a task. You can always ask for more information elsewhere or later.

Lesson 3: Don’t Ask for Everything at Once

Nowadays, conversational UX is de rigueur. There are conversational onboarding flows to make users feel welcome and conversational chatbots to mollify them when facing a problem. There’s conversational copy, conversational ads, and yes, conversational forms, too. Stilted professional language is out, replaced with casual tones that sometimes verge on banter.

The etymology of “conversation” gives us clues as to what the term might mean, and it’s not just scrapping formalities. Conversation literally means to “keep company with” (con, or “with” and “versare,” the Latin frequentative of vertere, “to turn”) at least one other person. A mutual turning together, conversations are by definition interactive, entailing a back-and-forth of ideas or information.

Multi-column input fields aren’t the ideal way of structuring a form. Above: input fields abound in Newegg’s and Cabela’s checkout flows. Both companies have updated their checkout flows since these screenshots featured as examples of poor checkout flows in a recent Baymard article.

Conversations tend to treat things one at a time; otherwise, they end in awkwardness. At a cocktail party, you’d scare off the guests if you asked for everyone’s details all at once or in no sensible order. So it is with digital forms. Assuming the user wants to surrender his or her data, ask for one thing at a time in a way that makes sense: names before payment information, and so on. It also makes sense to organize questions seriatim, either in a single-column layout split across a few pages (hopefully just a few) or in a stepper with a progress indicator that displays one question per screen.

Lesson 4: Assume Nothing. Oh, and Give Examples.

Labels and placeholder text can make or break an input field. Labels should be succinct — two or three words at most — and sit above the input field. Never use placeholder text as a label. Very light gray text doesn’t call attention to itself and may be difficult for some users to read. Worse, placeholder text vanishes the moment the user clicks inside the input field.

Shorn of a label above, the input field becomes a box with a blinking cursor. When space is a premium, floating labels are a viable alternative as long as you account for accessibility.

Input fields should contain persistent labels and placeholder text that, while explanatory, shouldn’t substitute for labels.

Placeholder text is key to giving the user context about what kind of input to enter. It works best when written as a question or by supplying an example. If the label reads “First Name,” the placeholder text might ask, “What’s your first name?” Those unfamiliar with ZIP Codes, such as newcomers to the United States, would benefit from a placeholder like “e.g., 94110.” Well-crafted placeholders are more than just a gimmick: blind or visually impaired users rely on assistive technologies to read placeholder text.

An example of an input field with a persistent label and an example of the input.

Buttons should never read “Submit” — a verb as threatening as it is vague. It’s only fair that after putting in the effort to complete a form, users know what completing the form actually means.

Clicking a button performs an action the user should expect. “Create Account,” “Send Message,” and “Subscribe” are all familiar actions for anyone who’s used the Internet. They also give a clear indication of what clicking on the button at the end of the form will do.

A massive action button with a large and specific CTA.

Lesson 5: Accommodate Multiple Genders.

Western culture has traditionally considered everyone to be either male or female. The binary between man and woman, the masculine and the feminine, has long been considered as natural and self-explanatory as the weather.

More recently, gender theorists have pointed out what should have been obvious all along: how we talk about the sexes has little to do with biology. We even have separate words to handle the distinction: sex and gender. Gender is the social construction of sex, and while sex follows the glacial pace of evolution, gender is fluid, its shape formed in the cultural tides. These tides shift with each generation, sometimes sooner, and so it is that gender non-binarism became a cultural mainstay in the 2010s. (Official recognition is another matter.)

Between the traditional symbols representing men and women, a new symbol representing people who identify as gender-neutral.

As with algorithms, a form reifies its designer’s assumptions. Common among them is to drop a pair of radio buttons, “Male” and “Female,” that allow users to select a gender. This binarism is long enshrined in law and public policy, a not-so-tacit refusal to recognize non-binary gender identities. Still, in the United States as of late 2019, some seven states recognize gender fluidity officially by including “non-binary” (versus “male” and “female”) and “X” (versus “M” and “F”) on certain legal documents. No one should be surprised that political identification plays an outsized role in support for legal recognition of non-binary people.

What happens at the state house needn’t determine how designers or the organizations they represent handle gender: making a small adjustment to accommodate more users makes good business sense. For the designer working at a consumer SaaS startup, tailoring profiles and onboarding flows to non-binary users is a product decision that varies widely between organizations; for the designer working at an organization in a highly regulated industry, such as health care, making room for those who identify as non-binary may be considerably harder.

Yet even when users are forced to identify as male or female, they can resort to creative workarounds. Blue Shield of California added a non-binary option following the passage of the California Gender Recognition Act in 2017, while OneMedical, a boutique medicine practice, allows users to qualify their gender even while requiring them to identify as male or female. The availability and cost of medical care in the United States hinge on one’s biological sex, something the federal and most state and local governments conflate with gender identity.

Sometimes the best way of avoiding gender bias on a form is not to ask about it at all. Regulations may require the user to indicate a gender as a precondition to receiving a particular service, but not always. Where no such requirement applies, and in the spirit of keeping forms short, omit the question about gender altogether.

Lesson 6: Garbage In, Garbage Out.

Approximately 98 percent of e-commerce sites place restrictions on inputs for specific fields, such as phone numbers or ZIP codes. How sites validate input against these restrictions varies widely. At least five interaction patterns prevail:

Users type whatever they want into the input field, submit the form, and receive a request from a customer service representative to fix the errors.
Users type whatever they want into the input field and only learn about errors when submitting the form.
Users type whatever they want into the input field and learn about errors upon exiting the input field.
Users can only enter valid data into an input field (e.g., numbers for ZIP codes, letters for names, etc.).
Users can only enter valid data into an input field, which automatically formats the input based on the valid format.

Take, for instance, the variability in entering a now-defunct phone number in Pima County, Arizona:

(520) 529–2036
520–529–2036
520 529 2036
5205292036

The complexity of these combinations grows when prepended with a country code (1 or +1) and an extension. And it can balloon depending on the restrictions placed on the input field. Let’s review a form containing five input fields designed to accept a phone number in the United States:

No Masking or Validation
No Masking with Validation on Submit
No Masking with Validation on Blur
Limited Masking with Validation on Blur
Full Masking with Validation on Blur

The first pattern is as effortless — as in, no effort made — as it is unforgivable. It’s also (fortunately) rare and therefore not worth further discussion. The second and third patterns, though, are more worrisome: 64% of e-commerce sites rely on these patterns, of which some are justified. A website without a portal that sorts incoming traffic by country can’t place meaningful restrictions on the length of a phone number.

Phone numbers in the United States are a reliable 10 digits, while those in Germany, which maintained an open telephone number system until 2010, can range between five and 11 digits. Patterns four and five entail significant effort and are only worth it if they match regional conventions (e.g., a five-digit ZIP code) and can handle editing in the middle of the string.

Technical glitches or errors born of rickety assumptions can frustrate users into abandoning a form flow. If these problems go unaddressed, organizations may see spikes in customer support inquiries and declines in revenue and Net Promoter Scores.

A skeleton screen of a single-column form.

Demos 💃

If you’re going to mask and validate input on your site (or app), start with the basics, the kind of data every user must provide as a precondition to signing up, signing in, or accessing his or her account. The fate of e-commerce sites, for example, hinges on the quality of their checkout flow. Asking for too many details or requiring too many steps is a recipe for abandonment. So is placing masking and validations on fields that confuse the user rather than ease their journey through the flow.

Two Suggestions ☝️

Just kidding: before diving into specific types of masking, I’ve found that two types of event listener work well across different scenarios for masking and validation:

Masking does well when setting an input event listener on the input field. In JavaScript, using keyup or keydown can sometimes work, but just as often interferes with typing. Use these event listeners with caution.
Validation should take place after the user exits the input field. For this, I prefer blur, but focusout should work equally well.
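
The two suggestions above can be sketched in a few lines. This is a minimal illustration, not a finished library: the helper name attachMaskAndValidation, the sample mask and validator, and the "zip" element id are all assumptions made for the example.

```javascript
// Hypothetical helper: run a mask on every "input" event and a validator on "blur".
function attachMaskAndValidation(input, mask, validate, onResult) {
  // "input" fires after the value has changed (typing, pasting, dropping text),
  // so the mask never has to intercept individual keystrokes.
  input.addEventListener('input', () => {
    input.value = mask(input.value);
  });

  // Validate only once the user leaves the field.
  input.addEventListener('blur', () => {
    onResult(validate(input.value));
  });
}

// Sample mask and validator for a five-digit code (illustrative only).
const digitsOnly = (value) => value.replace(/\D/g, '').slice(0, 5);
const isFiveDigits = (value) => /^\d{5}$/.test(value);

// Browser-only wiring; the "zip" id is an assumption.
if (typeof document !== 'undefined') {
  const field = document.getElementById('zip');
  if (field) {
    attachMaskAndValidation(field, digitsOnly, isFiveDigits, (ok) => {
      field.setAttribute('aria-invalid', String(!ok));
    });
  }
}
```

One caveat: assigning to input.value moves the caret to the end of the field, so a fuller version would save and restore the selection to support edits in the middle of the string. If a single listener on the form should cover many fields, focusout is the better choice because it bubbles, whereas blur does not.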

Now onto masking!

Names 📛

First, middle, and last names should allow for orthographic accents, hyphenation, and compound words. With a few exceptions, such as “de” or “von,” names are capitalized. Name fields should also contain a few restrictions:

Periods should only appear after names — “St.” comes to mind — and never in multiples.
Users should only be able to type one period, dash, and space between letters.
Name fields should accommodate most orthographic accents. The example below can handle accents in most languages in the Americas and Western Europe; the Regular Expression lists each of the allowed characters rather than matching them by range, as can be done by specifying Unicode blocks.
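
The original demo is not reproduced here. As a stand-in, a minimal sketch of these rules might look like the following. Note that it matches letters by Unicode property (\p{L}) instead of listing each accented character, and that the function names maskName and isValidName are assumptions.

```javascript
// Mask applied while typing: strip disallowed characters and repeated separators.
function maskName(value) {
  return value
    // Keep letters in any script, combining marks, periods, spaces, and hyphens.
    .replace(/[^\p{L}\p{M}. -]/gu, '')
    // A name cannot start with a separator.
    .replace(/^[. -]+/, '')
    // Collapse a repeated separator ("--", "..", double spaces) into one.
    .replace(/([. -])\1+/g, '$1')
    // Drop any period that does not directly follow a letter.
    .replace(/(^|[^\p{L}\p{M}])\./gu, '$1');
}

// Validation applied on blur: words made of letters, each optionally ending in
// a period, joined by a single space or hyphen.
function isValidName(value) {
  return /^\p{L}[\p{L}\p{M}]*\.?(?:[ -]\p{L}[\p{L}\p{M}]*\.?)*$/u.test(value.trim());
}
```

The mask leaves a single trailing space or hyphen in place so the user can keep typing; the stricter check waits until the user exits the field. Neither function enforces capitalization, since particles such as “de” or “von” are legitimately lowercase.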

Passwords 🔐

Passwords, I hope, will soon become passé, supplanted by better alternatives like Slack’s magic link. Until then, we’ll be saddled with passwords whose complexity seems to balloon quicker than the national debt. Length, variety, and randomness are all ingredients in strong passwords—“strong” meaning impossible to guess and tedious to crack with brute force attempts. Though I’m partial to pass phrases, a common formula for strong passwords is a minimum of 8 characters in length and a mix of capitalized letters, numbers, and special characters.

It helps to give users feedback when creating a password of this sort with a progress bar or another indicator rather than leave them guessing until a general error message tells them where they’ve gone astray. Also, unless absolutely necessary, allow users to toggle the password field’s visibility rather than require users to enter their password twice. Less is more.

Here’s the code in vanilla HTML, CSS, and JavaScript — vendor prefixes and all — you’ll need to set up client-side password masking (more on validation for different types of HTML fields in a bit):
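
The original demo is not reproduced here. What follows is a hedged, JavaScript-only sketch of the two ideas described above: a strength score that can drive a progress bar and a visibility toggle. The function names and the element ids (password, password-strength, toggle-password) are assumptions.

```javascript
// Check a password against the common formula: 8+ characters, an uppercase
// letter, a number, and a special character.
function passwordChecks(password) {
  return {
    length: password.length >= 8,
    uppercase: /[A-Z]/.test(password),
    number: /\d/.test(password),
    // Anything outside A-Z, a-z, and 0-9 counts as special here, spaces included.
    special: /[^A-Za-z0-9]/.test(password),
  };
}

// Fraction of checks passed, from 0 to 1, suitable for a <progress> element.
function passwordScore(password) {
  const checks = Object.values(passwordChecks(password));
  return checks.filter(Boolean).length / checks.length;
}

// Switch a field between hidden and visible text.
function toggleVisibility(input) {
  input.type = input.type === 'password' ? 'text' : 'password';
  return input.type;
}

// Browser-only wiring; the element ids are assumptions.
if (typeof document !== 'undefined') {
  const field = document.getElementById('password');
  const meter = document.getElementById('password-strength');
  const toggle = document.getElementById('toggle-password');
  if (field && meter) {
    field.addEventListener('input', () => {
      meter.value = passwordScore(field.value);
    });
  }
  if (field && toggle) {
    toggle.addEventListener('click', () => toggleVisibility(field));
  }
}
```

Returning the individual checks, and not just the score, makes it possible to tell users which requirement is still unmet instead of showing a general error.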

Email ✉️

HTML5 includes a wide array of input types that the browser can validate, by built-in rules or a Regular Expression in the pattern attribute, with the HTMLInputElement.checkValidity() method. Popular among them is email, or <input type="email">, given the centrality of email in account management and user outreach. Email fields, however, only accept plain letters, numbers, and a handful of special characters; the HTML Standard defines the accepted pattern, a deliberately simplified version of the address syntax in RFC 5322: Internet Message Format. And checkValidity() only becomes useful after the user enters an email address, not while typing in the field.
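
As a rough sketch of validating an email field on blur: the browser wiring below relies on checkValidity() and validationMessage, while looksLikeEmail() applies a pattern modeled on the one in the HTML Standard so the same check can run where no DOM is available. The function name and the element ids (email, email-error) are assumptions.

```javascript
// Pattern modeled on the HTML Standard's definition of a valid e-mail address.
const HTML_EMAIL_PATTERN = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

// DOM-free check, useful outside the browser or as a server-side backstop.
function looksLikeEmail(value) {
  return HTML_EMAIL_PATTERN.test(value);
}

// Browser-only wiring; the element ids are assumptions.
if (typeof document !== 'undefined') {
  const field = document.getElementById('email');
  const error = document.getElementById('email-error');
  if (field && error) {
    field.addEventListener('blur', () => {
      const ok = field.checkValidity();
      field.setAttribute('aria-invalid', String(!ok));
      // validationMessage holds the browser's own description of the problem.
      error.textContent = ok ? '' : field.validationMessage;
    });
  }
}
```

A syntactically valid address can still be undeliverable, so a confirmation email remains the only real proof that an address works.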

Phone Number ☎️

Phone numbers, reviewed above, only come in certain combinations in the United States. The checkValidity() method works on phone number fields, or <input type="tel">, too. Unlike email, tel enforces no format of its own, so validation depends entirely on the Regular Expression in the input field's pattern attribute, which can be adjusted for international phone numbers.
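
A minimal sketch of full masking for a United States phone number follows. It assumes the (520) 529-2036 display format and drops a leading country code of 1, which is safe because area codes never start with 1. The names maskUSPhone and isValidUSPhone are assumptions.

```javascript
// Reformat whatever the user typed into (XXX) XXX-XXXX as digits accumulate.
function maskUSPhone(value) {
  const digits = value
    .replace(/\D/g, '')   // strip spaces, dashes, parentheses, plus signs
    .replace(/^1/, '')    // drop a leading country code
    .slice(0, 10);
  if (digits.length < 4) return digits;
  if (digits.length < 7) return `(${digits.slice(0, 3)}) ${digits.slice(3)}`;
  return `(${digits.slice(0, 3)}) ${digits.slice(3, 6)}-${digits.slice(6)}`;
}

// Validation on blur: the masked value must be complete.
function isValidUSPhone(value) {
  return /^\(\d{3}\) \d{3}-\d{4}$/.test(value);
}
```

Because the mask rebuilds the string from its digits on every input event, each of the four variants listed earlier ends up in the same format, and separators vanish again when the user deletes digits.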

Social Security Number 🔒

In the United States, Social Security Numbers (SSN) remain, unfortunately, the main way of proving one’s identity. This seems to have worked well through much of the twentieth century, but is now antiquated, a favorite tool of identity thieves. It’s nonetheless a mainstay for governments and financial institutions. If you must ask users for their SSN, do so in a way that inspires trust: allow users to toggle its visibility and encrypt it once it enters your database. Better yet, don’t ask for it in the first place.
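
If the field is unavoidable, a sketch along these lines formats the number as it is typed and produces an obscured version for display when visibility is toggled off. The function names are assumptions, and 123-45-6789 is a placeholder, not a real number.

```javascript
// Format digits as XXX-XX-XXXX while the user types.
function maskSSN(value) {
  const d = value.replace(/\D/g, '').slice(0, 9);
  if (d.length < 4) return d;
  if (d.length < 6) return `${d.slice(0, 3)}-${d.slice(3)}`;
  return `${d.slice(0, 3)}-${d.slice(3, 5)}-${d.slice(5)}`;
}

// Hide every digit that still has at least four digits after it,
// leaving only the last four readable.
function obscureSSN(formatted) {
  return formatted.replace(/\d(?=(?:\D*\d){4})/g, '*');
}
```

For example, obscureSSN(maskSSN('123456789')) yields ***-**-6789. Obscuring the display is a courtesy to the user, not a security measure; the encryption mentioned above still has to happen on the server.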

Percentage

Creating a field that accepts a percentage raises a host of questions. Should it permit anything lower than 0% or higher than 100%? Should it accommodate decimal points? If so, how many? For input fields designed for percentages, I recommend allowing a range between 0% and 100% without decimals. Updating the maxlength attribute in the HTML and changing 100 in if (Number(this.value.replace(/[$,%\s]/g, '')) > 100) in the JavaScript will allow for different uses of the field. To allow for decimals, add a full stop to the character class in this.value.replace(/[^\d/]/, ''), giving this.value.replace(/[^\d\./]/, ''). There's more to it, of course, but a full account of Regular Expressions in String.prototype.replace() lies beyond the scope of this article.
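
The original demo is not reproduced here. As a stand-in, a percentage mask along the lines described above might be sketched as follows, with the ceiling and the number of decimals exposed as options. The function names and the options object are assumptions.

```javascript
// Mask applied while typing: digits only (plus one period when decimals are
// allowed), clamped to the maximum.
function maskPercent(value, { max = 100, decimals = 0 } = {}) {
  let cleaned = value.replace(decimals > 0 ? /[^\d.]/g : /\D/g, '');
  if (decimals > 0 && cleaned.includes('.')) {
    // Keep the first period and at most `decimals` digits after it.
    const [whole, ...rest] = cleaned.split('.');
    cleaned = `${whole}.${rest.join('').slice(0, decimals)}`;
  }
  // Drop leading zeros such as "007", but keep the zero in "0.5".
  cleaned = cleaned.replace(/^0+(?=\d)/, '');
  return Number(cleaned) > max ? String(max) : cleaned;
}

// Applied on blur: append the percent sign once the user is done typing.
function formatPercent(value) {
  return value === '' ? '' : `${value}%`;
}
```

The percent sign is added on blur and not during typing on purpose: a mask that re-appends a trailing symbol on every keystroke makes it impossible to delete the last digit with the Backspace key.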

Currency 💰

A while back, in the face of a deadline, I quested after a simple way of automatically formatting numbers to display as currency. The rules, I thought, were simple enough: a dollar sign prepended to the number, a separator for thousands, millions, and billions (that’s enough for most use cases), and a decimal separator.

The main obstacle to getting this to work was the separators. The thousands separator would pop up as expected when any number exceeded 1,000, but wouldn’t disappear if the number fell. Decimal points presented the same issue, and even when I managed to snuff out these bugs, the deadline loomed. I decided to split the difference by formatting the field to accept numbers, commas, and periods up to 16 characters and lean on a plugin, cleave.js, to do the heavy lifting. If someone has written a lightweight plugin in vanilla JavaScript, I surmised, there was no reason (other than personal edification) to continue tinkering. This hybrid approach did the trick:
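
The cleave.js demo is not reproduced here. The block below is a different, dependency-free sketch, not the hybrid approach described above. It sidesteps the stale-separator problem by rebuilding the formatted string from the raw digits on every input event. The function name maskCurrency is an assumption.

```javascript
// Rebuild a US dollar string from whatever the field currently contains.
function maskCurrency(value) {
  // Keep digits and periods; the first period is the decimal separator.
  const cleaned = value.replace(/[^\d.]/g, '');
  if (cleaned === '') return '';

  const [rawWhole, ...rest] = cleaned.split('.');
  // Drop leading zeros, and show "0" if the user starts with a period.
  const whole = rawWhole.replace(/^0+(?=\d)/, '') || '0';
  // Insert a comma before every group of three digits.
  const grouped = whole.replace(/\B(?=(\d{3})+(?!\d))/g, ',');

  // Preserve a trailing period so the user can go on to type cents.
  const hasDecimal = rest.length > 0;
  const cents = rest.join('').slice(0, 2);
  return `$${grouped}${hasDecimal ? `.${cents}` : ''}`;
}
```

Since existing commas are stripped before new ones are inserted, the thousands separator disappears as soon as the number falls below 1,000. The sketch does not preserve the caret position, which is one of the things a plugin handles.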

ZIP Code 🗺

ZIP Codes, or “Zone Improvement Plan Code,” originated in large cities in the early 1940s and took on their current form in the 1960s. The USPS tacked on an additional four digits, sometimes called “plus-four codes,” in 1983, an optional add-on that pinpoints the location of an address.

Today ZIP Codes remain ubiquitous and, on web forms, often obviate the need to specify a city and state. Relative to other numbers, masking and validation for ZIP Codes is fairly straightforward. All that’s needed is a block on all content types except for numbers and a hyphen that appears if the user tacks on a plus-four code. Trigger an error message if the field contains anything other than five numbers alone or with the four-digit add-on with a hyphen separating the two sequences.
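
The description above translates almost directly into code. In this minimal sketch, the names maskZip and isValidZip are assumptions.

```javascript
// Mask applied while typing: digits only, with a hyphen inserted once the
// user goes past five digits.
function maskZip(value) {
  const digits = value.replace(/\D/g, '').slice(0, 9);
  return digits.length > 5
    ? `${digits.slice(0, 5)}-${digits.slice(5)}`
    : digits;
}

// Validation on blur: five digits, optionally followed by a hyphen and four more.
function isValidZip(value) {
  return /^\d{5}(?:-\d{4})?$/.test(value);
}
```

A partially typed plus-four code such as 94110-12 passes the mask but fails validation, which is the cue to show the error message when the user exits the field.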

Icons and Shaking 🎲

Error messaging is an art and science of its own. Suffice it to say that while error messages require thoughtful timing and presentation, above all they should tell users what’s wrong with the field input and give them a sense of how to correct it, a heuristic so sensible that it predates modern websites and apps.

Over and above error semantics, immediate visual feedback can signal whether input is correct or incorrect. For this, I’ve used two methods:

On the right side of the input field, display a green checkmark for valid input and a red x for invalid input.
Shake the field with invalid input when a user exits à la macOS.

Both combine well with inline error messaging in small text below the field.

Method 1: Icons
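
The original demo is not reproduced here. A rough JavaScript-only sketch of the icon approach follows. It assumes an icon element that sits next to the field and CSS classes named is-valid and is-invalid that color it green or red; the function name, the class names, and the glyphs are all assumptions.

```javascript
// Reflect a validation result on the field and its companion icon.
function setValidityIcon(field, icon, isValid) {
  // classList.toggle(name, force) adds the class when force is true
  // and removes it when force is false.
  field.classList.toggle('is-valid', isValid);
  field.classList.toggle('is-invalid', !isValid);
  icon.textContent = isValid ? '✓' : '✕';
  // Expose the state to assistive technologies as well.
  field.setAttribute('aria-invalid', String(!isValid));
}
```

Called from a blur listener, this pairs naturally with an inline message below the field, since color and icons alone won't tell the user how to fix the input.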

Method 2: Shaking
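
The original demo is not reproduced here. A minimal sketch of the shake, assuming a CSS class named shake that runs a short keyframe animation (outlined in the comments), could look like this. The function and class names are assumptions.

```javascript
// Assumed CSS, shown here for reference only:
//   .shake { animation: shake 0.3s; }
//   @keyframes shake {
//     25% { transform: translateX(-6px); }
//     75% { transform: translateX(6px); }
//   }

// Shake a field once, for example from a blur listener when validation fails.
function shake(field) {
  field.classList.remove('shake');
  // Reading offsetWidth forces a reflow, so re-adding the class restarts
  // the animation even if the field was already shaking.
  void field.offsetWidth;
  field.classList.add('shake');
  // Clean up when the animation finishes so the next error can shake again.
  field.addEventListener(
    'animationend',
    () => field.classList.remove('shake'),
    { once: true }
  );
}
```

Motion can be unwelcome for some users, so it is worth disabling the animation inside a prefers-reduced-motion media query and letting the inline error message carry the feedback.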

What Input Masking Can’t Do

For all its merits, input masking can’t prevent users from making mistakes. Valid information isn’t necessarily correct. Until artificial intelligence renders input fields obsolete, user interfaces can only do so much to prevent mistakes; a GUI can’t solve problems for users any more than it can preclude them from entering incorrect data. Aside from confirmation, there’s little point in asking for something you already have.

Input masking and client-side validation are more or less helpless against users familiar enough with code to be dangerous. Without server-side validations as backup (and they aren’t enough, either), users need only open the Inspect Element browser tool and start editing attributes and patterns in HTML fields or the JavaScript itself. In the end, the thrust behind input masking is to make filling out form fields easier, not to block invalid information from entering the database. Even the best masking and validation can’t always stop that from happening.

Thanks for reading! 🙏

Chris Kark is a Denver-based product designer with a background in teaching foreign languages and literatures. You can reach him at ckark@alumni.stanford.edu.