Data basics for UX people

Why Data, Anyway?
UX designers have numerous methods for improving their designs, such as user interviews, focus groups, diary studies, personas, storyboards, task analysis, and customer journey maps. Many of these methods are intuitive and powerful; they reveal a lot about user needs and stories. These qualitative methods are driven by the urge to understand users and empathize with them in order to create better solutions.
However, qualitative methods aren’t always the best choice, especially when it comes to evaluating prototypes and products. We often face evaluator effects when we conduct usability tests: we think we are observing something objectively, but in fact we might just be seeing what we want to see. As the Yale psychology professor Paul Bloom says:
People are often highly confident in their ability to see things as others do, but their attempts are typically barely better than chance.
That’s why we need to step back and take a different perspective to understand the users. That’s where quantitative data and statistics come in.
Quantitative Metrics in UX Research
Quantitative research is not as difficult or expensive as one might think. If you conduct a qualitative usability test, such as a think-aloud session, you can simply record some measurements during the session and/or add a short questionnaire afterwards.
Here are some metrics that you can use to measure effectiveness, efficiency and satisfaction.
Metrics for Effectiveness
Can the user complete the tasks successfully? You can measure Task Success, Errors, Issues-based Metrics (usability problems found in the test), or Self-reported Metrics.
Metrics for Efficiency
How easily and quickly can a user complete the tasks? You can measure Time on Task, Efficiency Metrics such as page count and click count before completing the tasks, Learnability Metrics such as Task Time across Trials, or a combination of those metrics. Behavioral metrics such as eye tracking are also useful for measuring efficiency.
Metrics for Satisfaction
Is the user satisfied with the interaction with your product? You can use Self-Reported Metrics such as the System Usability Scale (SUS), the Computer System Usability Questionnaire (CSUQ) and the Net Promoter Score (NPS). Additionally, you could use behavioral and physiological metrics such as heart rate, skin conductance or facial expression to estimate stress levels and emotions that participants are not willing to express (some participants are too nice to voice their negative emotions!).
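As an illustration of how a self-reported metric becomes a number, here is a minimal Python sketch of the standard SUS scoring rule: odd-numbered items contribute (answer - 1), even-numbered items contribute (5 - answer), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name and the sample answers are made up for this example.

```python
def sus_score(responses):
    """Score one participant's SUS questionnaire.

    `responses` holds the 10 answers in questionnaire order,
    each on a 1 (strongly disagree) to 5 (strongly agree) scale.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher is better.
            total += answer - 1
        else:
            # Even items are negatively worded: lower is better.
            total += 5 - answer
    return total * 2.5


# Hypothetical answers from a single participant
participant = [4, 2, 5, 1, 4, 2, 4, 2, 5, 1]
print(sus_score(participant))  # prints 85.0
```

In practice you would score each participant separately and then summarize the scores across participants.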
Which exact data you should collect depends on the goals of the users, the goals of your product, and conditions such as project schedule, budget and other resources.
The 4 Types of Data You should Know
Understanding your data is critical when analyzing the research results, because different types of data require different types of analysis. Below are the 4 types of data that you should know to do some statistics.
Nominal Data
Nominal data are groups or categories that have no inherent order. For example, if you want to compare the performance of different user groups, such as “male vs. female” or “frequent user vs. non-frequent user”, the groups are not ordered and are therefore nominal data. (Of course you can assign a number to each group for convenience, but that doesn’t mean you can compare the groups as numbers, because the coding is arbitrary.) Examples of nominal data include user characteristics, locations, gender or groups with different expertise.
Ordinal Data
Ordinal data are ordered groups or categories. For example, when you ask users how often they use your website and let them choose from “very often”, “often”, “sometimes” and “rarely”, the acquired data are ordinal. In this case, “very often” means more than “often”, and “often” means more than “sometimes”. The ranks are comparable, but the distance between them is meaningless. In other words, you don’t know how much more often a user who chose “very often” uses the site compared to a user who chose “sometimes”. Therefore, you shouldn’t summarize the data with an average rank or with other statistics that assume equal distances between values; use methods that rely only on order or counts instead.
Interval Data
Interval data are continuous data where the distances between values are meaningful but there is no true zero point. An example is temperature in Fahrenheit or Celsius: the distance between 10 degrees and 20 degrees is meaningful, but the zero point is arbitrary. In UX research, subjective rating data are often treated as interval data. Let’s look at this scale.

It can be treated as ordinal data, but if the distances between the points are equal and meaningful, it can be treated as interval data. You can judge whether a scale is ordinal or interval by asking if the halfway point between two adjacent points makes sense.
Ratio Data
Ratio data are almost the same as interval data, but they have a true zero point. For example, when you measure task completion time, the acquired data are ratio data because there is an absolute zero point (zero seconds means no time at all, whereas zero degrees Celsius is just a certain point on the scale). Other examples of ratio data include age, height, weight and number of tasks completed.
What kind of Statistical Test should you use?
Now that we understand the differences between the four data types, let’s look at some statistical methods for analyzing the data.
For Nominal Data — Frequency, Chi-square
Simple descriptive statistics can be used for nominal data. For example, you can count the number of observations in each category and make a frequency table. To compare nominal data, you can use a statistical test called chi-square. The chi-square test is used to decide whether two categorical variables are related or unrelated (“dependent” or “independent” in statistical terms). For example, if you want to compare preferences between different user groups, you can run a chi-square test to see whether the difference between the groups is statistically significant. If you use Excel, the function “CHITEST” makes the chi-square test easy.
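If you prefer code to a spreadsheet, the same idea can be sketched in a few lines of Python. This is a minimal example, not a full statistics library: it handles only a 2x2 table (two groups, two choices), applies no continuity correction, and uses counts that are invented for illustration. For a 2x2 table the test has one degree of freedom, so the p-value can be computed with the standard library's math.erfc.

```python
import math


def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of counts.

    Returns (chi2, p_value). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count we would expect if group and preference were unrelated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square tail probability
    # reduces to the complementary error function.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows are user groups, columns are preferred design (A, B)
observed = [
    [30, 10],  # frequent users
    [20, 20],  # non-frequent users
]
chi2, p = chi_square_2x2(observed)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

A small p-value (commonly below 0.05) suggests that preference and user group are related rather than independent.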
For Ordinal Data — Frequency, Chi-square, Wilcoxon Rank Sum Test
Looking at frequencies is a common way to analyze ordinal data. For example, 20% of users rated the design excellent, 40% of users rated it good, and so on. The distances between ranks are meaningless, so average ratings are not meaningful either. For more advanced analysis, there is a method called the Wilcoxon Rank Sum Test, which compares ordinal data from two independent groups. The Wilcoxon test can be done in Excel, but it is easier with a programming language such as R.
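To show what the Wilcoxon Rank Sum Test does under the hood, here is a rough Python sketch using only the standard library. It ranks all ratings together (tied ratings share an average rank), sums the ranks of the first group, and converts that sum to a p-value with a normal approximation that is corrected for ties. The approximation is an assumption that works better with larger samples, no continuity correction is applied, and the ratings below are invented.

```python
import math
from statistics import NormalDist


def rank_sum_test(x, y):
    """Wilcoxon rank-sum test using a tie-corrected normal approximation.

    Returns (w, z, p): the rank sum of x, its z-score, and a two-sided p-value.
    """
    combined = sorted(x + y)
    midrank = {}   # value -> average rank shared by all tied observations
    tie_term = 0   # sum of (t^3 - t) over groups of t tied values
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        t = j - i
        midrank[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    w = sum(midrank[value] for value in x)
    mean_w = n1 * (n + 1) / 2
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p


# Hypothetical ratings coded 1 (poor) to 5 (excellent) for two designs
design_a = [4, 5, 3, 4, 5, 4, 2, 5]
design_b = [2, 3, 3, 1, 4, 2, 3, 2]
w, z, p = rank_sum_test(design_a, design_b)
print(f"rank sum = {w}, z = {z:.2f}, p = {p:.3f}")
```

Note that the numeric codes are used only to put the ratings in order; the test never relies on the distance between them.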
For Interval data — All descriptive statistics, T-Test, ANOVA, Correlation
Interval data allow you to use a wide range of descriptive statistics, such as the average and the standard deviation. You can also use inferential statistics to draw a general conclusion that applies to a larger population, not only to your test participants. One of the most common ways to analyze interval data is to compare means (averages) using a T-test or ANOVA. A T-test is used to compare two samples; ANOVA is used to compare three or more samples. If you use Excel, you can simply use the function “TTEST” for the T-test. It’s also possible to do ANOVA in Excel, using the Analysis ToolPak add-in.
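The calculation behind a T-test can be sketched in Python as follows. This example uses Welch's variant, which does not assume that both groups have the same variance; that choice is mine, not something prescribed above. It returns the t statistic and the degrees of freedom only, because the standard library has no t-distribution; you would look up the p-value in a t-table or get it from a tool such as Excel. The sample data are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t-test for two independent samples.

    Returns (t, df): the t statistic and the Welch-Satterthwaite
    degrees of freedom.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # variance() is the sample variance (divides by n - 1)
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical measurements for two designs, six participants each
design_a = [32, 41, 29, 38, 35, 44]
design_b = [48, 39, 52, 45, 57, 50]
t, df = welch_t(design_a, design_b)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The further t is from zero, the less plausible it is that the two designs have the same underlying mean.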

Another useful way to analyze interval data is to look at the relationship between different variables. The chart above is an example of a scatterplot with a trend line, showing the correlation between two variables. You can calculate the correlation coefficient to see how strongly the two variables are correlated. Excel has the “CORREL” function for calculating the correlation coefficient, as well as chart functions for drawing a scatterplot with a trend line, including the r-squared value that shows how strongly the values are correlated (r-squared is simply the square of the correlation coefficient).
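For readers who want to see the arithmetic, here is a small Python sketch of the Pearson correlation coefficient and r-squared, written out by hand so it runs on any recent Python version. The variable names and the paired values are assumptions made up for illustration.

```python
import math
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys):
        raise ValueError("both lists need the same number of values")
    mean_x, mean_y = mean(xs), mean(ys)
    # Sum of products of deviations, and sums of squared deviations
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical paired measurements, one pair per participant
sessions_per_week = [1, 2, 3, 4, 5]
ease_rating = [2, 4, 5, 4, 5]
r = pearson_r(sessions_per_week, ease_rating)
print(f"r = {r:.2f}, r-squared = {r ** 2:.2f}")
```

A value of r near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or none.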
For Ratio Data — All descriptive statistics (including geometric means), T-Test, ANOVA, Correlation
There is not much difference between interval data and ratio data, and all the statistics used for interval data can also be used for ratio data. One difference is that you can use the geometric mean for ratio data. The geometric mean is another way to calculate an average; it is useful for summarizing task times, where a few very slow participants can pull the ordinary (arithmetic) mean upward.
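The difference between the two averages is easy to see in a short Python example. It uses geometric_mean from the standard statistics module (available in Python 3.8 and later); the task times are invented, with one deliberately slow participant.

```python
from statistics import geometric_mean, mean, median

# Hypothetical task times in seconds; the last participant was much slower
task_times = [20, 25, 30, 35, 120]

arithmetic = mean(task_times)
geometric = geometric_mean(task_times)

print(f"arithmetic mean: {arithmetic:.1f} s")
print(f"geometric mean:  {geometric:.1f} s")
print(f"median:          {median(task_times)} s")
```

With these numbers the geometric mean lands closer to the typical participant than the arithmetic mean does, because the single slow time has less influence on it.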
How many Participants do you need?
The number of participants needed for a study depends on the goals of your research and your tolerance for a margin of error. Generally, you need fewer participants in the early stages of design and development, and more participants in the later stages to find the remaining issues.
For quantitative research, what you need to consider is how much statistical error you can tolerate. With fewer participants, your estimates tend to contain more error; with many participants, they tend to be closer to the true population values. That’s why confidence intervals are important. A confidence interval is a range of values that is likely to include the true population value for a statistic, such as a mean. You decide what level of confidence you need, such as 90% or 95%, and calculate the confidence interval to show how precise your measurements actually are.
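To make the effect of sample size concrete, here is a hedged Python sketch that computes a confidence interval for a task success rate. It uses the adjusted Wald (Agresti-Coull) method, which is one common choice for proportions from small samples; the method, the function name and the success counts are my own assumptions for this example. A confidence interval for a mean follows the same logic but needs the t-distribution, which the standard library does not provide.

```python
import math
from statistics import NormalDist


def adjusted_wald_interval(successes, n, confidence=0.95):
    """Adjusted Wald (Agresti-Coull) confidence interval for a proportion.

    Returns (low, high) as proportions between 0 and 1.
    """
    # z is about 1.96 for 95% confidence and about 1.64 for 90%
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# Hypothetical results: the same 80% success rate with 10 and 100 participants
for successes, n in [(8, 10), (80, 100)]:
    low, high = adjusted_wald_interval(successes, n)
    print(f"{successes}/{n} succeeded: 95% CI from {low:.0%} to {high:.0%}")
```

The observed success rate is identical in both cases, but the interval for 100 participants is much narrower than the one for 10.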
This article quickly introduced the basic ideas behind data and statistics, and some terms might have sounded too technical. But you don’t need to do all the intimidating statistics at once, of course. You can start by gathering simple data, such as counting task successes or asking one question (e.g. “how easy was the task?”) during a qualitative usability test. Try these out and iterate on the process to make it work better. Quantitative data and statistics are not very intuitive for most people, but they can help a lot in improving usability and user experience.
(Reference: Measuring the User Experience by Tom Tullis and Bill Albert)