Colouring data visualizations

For data visualization, colour is more than aesthetic. Depending on your audience and the story you want to tell, the colours you select and the way you use them can have a huge impact on how the information is perceived. Colour can be used to:
- differentiate categories of data, like different ice cream flavours.
- represent values, like ice cream sales by neighbourhood.
- call attention to specific values, like highlighting your neighbourhood’s ice cream sales.



When used with purpose, colour tells a story and brings clarity to your data, but when used without consideration, colour can work against you.
We saw this firsthand while exploring custom colour palettes for visualizations in IBM Cognos Analytics. You might be thinking, “How could creating custom palettes be a problem? What’s the worst that could happen?” Well, after talking to customers, we realized that while they’re business and/or analytics experts, they’re not necessarily colour experts. The freedom to create your own palette, while liberating to our customers, also opened them up to the potential risks involved in using colour if they don’t have a solid understanding of how to apply it to data. A poor selection of colours can unknowingly compromise visual clarity or even skew the data.
I’m not pointing this out to scare you. If it’s difficult to distinguish the colours on your dashboard, profits aren’t going to instantly plummet… or at least it’s safe to assume colour palettes weren’t the reason. I’m pointing this out so that we can give colour the weight it deserves, especially when using it to tell stories with our data. Colour shouldn’t be something we unconsciously apply to “make it look pretty” because colour conveys meaning whether it is intentional or not.
Factors that complicate colour
On the surface, colour seems relatively straightforward, but underlying factors make selecting colours for data visualizations more complex and can lead people to misinterpret the information. Common issues with colour include:
Perceptual factors
These factors include our ability to see colours and distinguish them from one another. Visual impairments like colour blindness or cataracts, colour differences on a phone compared to a TV or a laptop, or even the size and shape of what you’re looking at can affect how easily one colour is distinguished from another.
Semantic factors
This is how we interpret meaning from colour. The associations we have with colour depend on the cultural, environmental and personal contexts that we’ve been exposed to. These differences can be seen in “Colours in Culture” from Information is Beautiful. The colour green, for example, is associated with good luck in Arab, Japanese, and Western cultures, while in African, Chinese, and Eastern European cultures, good luck is associated with the colour red.

By being aware of these perceptual and semantic factors, you can select colour combinations that avoid common constraints and help focus attention on the message you want to share.
Questions to ask before picking colours
Let’s imagine that you’re a part owner in the Udderly Delicious Ice Cream Company, a small fleet of ice cream carts in Ottawa, Canada. Here are a few questions you can ask yourself when picking colours for data visualizations.
Who’s the audience?
Are you making visualizations for a specific group of people? Are there cultural or industry-specific conventions they use for colour?
Let’s say you thought it would be helpful to put together a few visualizations to share with your business partners to help inform budget, sales targets, and other areas to focus on this year.
You know that, when it comes to cultural or industry-specific conventions in Canada, green means good and red means bad. You’ve also seen blue and red used together when referencing temperature.
Taking a few moments to write down what you know about your audience, including their goals and motivations, makes your story clearer.
What’s the story?
OK, now that we know who the audience is, it’s important to focus on the story we want to tell before thinking about colour. What are you trying to explain to your audience? For example, at the Udderly Delicious Ice Cream Company, what are the three most important questions that you and your business partners need answered to inform the business this year?
Maybe you’re wondering what the annual sales for the top 5 ice cream flavours are because it will affect what you choose to order for your first batch this year. Or maybe you want to see sales volume by neighbourhood, so that you can strategically distribute the ice cream carts to meet customer demand. Or perhaps you want to visualize customer satisfaction per cart to see if any trends become obvious.
By knowing the story and how it relates to your audience, you’ll know how to check if you’ve achieved what you set out to do. It means you know why sharing this information matters and what data you’ll be using, and this will often lead you to the best way to visualize it.
When to use colour in data visualizations
Now that we know the audience and have figured out what information we want to share, we can focus on colouring the data.
It’s important to remember that colour needs a purpose. If the information can be understood without it, then don’t add colour. As a general rule, if the visualization only has two dimensions of data, like gross profit over the years, then you don’t even need a palette; a single colour is perfect.

When colouring data visualizations, the type of data being coloured determines the type of colour palette that will be used. We will be focusing on the two most common types of data: qualitative (categories) and quantitative (values).
Qualitative data is information that has no logical order and can’t be translated into numeric order. Ice cream flavours are one example, since “cookie dough” isn’t higher or lower than “mint chocolate chip”.
Quantitative data is information that implies an order. Daily temperature during the month of August is one example, since a temperature of 32°C on one day is higher than 16°C on another day.

Below, I’ve aggregated some best practices for creating data visualization colour palettes. To see how colour can be applied to different data, let’s use the business questions we’ve identified for the Udderly Delicious Ice Cream Company.
Best practices for creating data viz palettes
Qualitative data
Qualitative data uses categorical palettes with distinct colour swatches to differentiate the categories.
Best practices for categorical colour palettes:
- Swatches should be distinct hues (colours). Colours that are too similar can form visual clusters or groups, or give a perception of order.
- No one colour should stand out relative to the other colour swatches. The colours should be visually equal in luminance (brightness) and chroma (saturation).
- Palettes should use a small number of colours. People are only able to reliably distinguish 5–8 colours simultaneously.
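
To make these guidelines a little more concrete, here is a minimal Python sketch of how a categorical palette could be sanity-checked. It is only a rough illustration, not a method from any particular tool: the function name `check_categorical` and the thresholds (a 30° minimum hue gap, a 0.15 lightness spread, a maximum of 8 colours) are my own assumptions, and HLS lightness from the standard `colorsys` module is only a crude stand-in for perceived luminance.

```python
import colorsys


def hex_to_rgb(hex_colour):
    """Convert '#rrggbb' to a tuple of floats in the range 0..1."""
    h = hex_colour.lstrip("#")
    return tuple(int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))


def check_categorical(palette, min_hue_gap=30.0,
                      max_lightness_spread=0.15, max_colours=8):
    """Return a list of warnings for a categorical palette.

    All thresholds are illustrative assumptions, not established limits.
    """
    warnings = []
    if len(palette) > max_colours:
        warnings.append(
            f"{len(palette)} colours; consider {max_colours} or fewer")

    # colorsys returns (hue, lightness, saturation), each in 0..1.
    hls = [colorsys.rgb_to_hls(*hex_to_rgb(c)) for c in palette]
    hues = [h * 360 for h, _, _ in hls]

    # Compare every pair of hues; hue is circular, so 350 and 10 are 20 apart.
    for i in range(len(hues)):
        for j in range(i + 1, len(hues)):
            gap = abs(hues[i] - hues[j])
            gap = min(gap, 360 - gap)
            if gap < min_hue_gap:
                warnings.append(
                    f"{palette[i]} and {palette[j]} have similar hues")

    # A large lightness spread suggests one swatch may dominate the others.
    lightness = [l for _, l, _ in hls]
    if lightness and max(lightness) - min(lightness) > max_lightness_spread:
        warnings.append("lightness varies; one colour may stand out")
    return warnings


# Hypothetical palette for three ice cream flavours.
print(check_categorical(["#ff0000", "#ff2000", "#0000ff"]))
```

A check like this won’t replace looking at the palette on a real chart, but it can flag obvious problems, such as two near-identical reds, before anyone has to squint at a legend.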

Quantitative data
Quantitative data can be coloured in one of two ways. If the data follows a path from low to high (or vice versa), like sales volume by neighbourhood, then it can be coloured using a sequential palette. Sequential palettes use a colour gradient to show when values are low vs. high.
Best practices for sequential colour palettes:
- Two distinct colours (e.g. blue and yellow) work better in a gradation than using variations of a single colour (e.g. blue only). Having two colours gives more contrast, making it easier to differentiate colour along the gradation.
- Colours must clearly indicate the order of the palette. It should be obvious which data values are smaller and larger compared to the other data values (e.g. the palette should go from a light colour to a dark colour).
- Gradation of colours should be even across the palette — you should be able to determine how far two values are from one another from any point on the palette.
- Lastly, gradients should be divided into 5 equal steps (or bins). Stepping a gradient makes it easier to distinguish where values are distributed along the scale (if you’re looking for an awesome tool to do this for you, I’ve provided a link at the end).
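
If you’re curious what stepping a gradient looks like in practice, the sketch below builds a 5-step sequential palette between two colours and assigns a value to one of 5 equal-width bins. Treat it as a simplified illustration: the function names `stepped_gradient` and `bin_index` and the example hex values are assumptions of mine, and the straight-line blend in RGB used here is not perceptually even. Dedicated tools can interpolate in perceptual colour spaces, which gives a smoother gradation.

```python
def hex_to_rgb(hex_colour):
    """Convert '#rrggbb' to a tuple of integers in the range 0..255."""
    h = hex_colour.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert a tuple of 0..255 integers back to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def stepped_gradient(start, end, steps=5):
    """Return `steps` colours evenly spaced between start and end.

    Uses simple linear interpolation in RGB (an assumption made for
    brevity; it is not perceptually uniform).
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    colours = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at the start colour, 1.0 at the end colour
        blended = tuple(round(x + (y - x) * t) for x, y in zip(a, b))
        colours.append(rgb_to_hex(blended))
    return colours


def bin_index(value, low, high, steps=5):
    """Return the index of the equal-width bin that `value` falls into."""
    if high <= low:
        raise ValueError("high must be greater than low")
    fraction = (value - low) / (high - low)
    # Clamp so that value == high lands in the last bin, not past it.
    return min(steps - 1, max(0, int(fraction * steps)))


# Hypothetical example: light yellow (low sales) to dark blue (high sales).
palette = stepped_gradient("#ffffcc", "#253494")
sales = 620  # made-up sales figure for one neighbourhood
print(palette[bin_index(sales, low=0, high=1000)])
```

Going from a light colour to a dark one keeps the order of the palette obvious, and the fixed bins make it easier to say which neighbourhoods fall in the same range.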

Some types of quantitative data, such as temperature or customer satisfaction, are better visualized by divergent palettes. Divergent palettes use a third neutral colour in the middle of a sequential palette to show the change in data values from two directions relative to a neutral midpoint (often zero).
Best practices for divergent colour palettes:
- A neutral colour should be placed at the midpoint of the palette. Light greys, yellows or even white will work. Just make sure the neutral colour is still visible on the visualization’s background.
- End colours must be balanced in terms of luminance (brightness) and chroma (saturation), so that the perception of colour progression from dark to light (towards the midpoint) is equal on both ends.
- Gradation of colours should be even across the palette — you should be able to determine how far two values are from one another from any point on the palette.
- Lastly, gradients should be divided into 5 equal steps (or bins). Stepping a gradient makes it easier to distinguish where values are distributed along the scale.
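
The same idea extends to divergent palettes. The sketch below joins two gradients at a neutral midpoint and then checks whether that midpoint is still visible against the background. Again, this is an illustration under assumptions: `diverging_palette` and `contrast_ratio` are names I’ve made up, the hex values are examples, the blend is a plain RGB interpolation, and the contrast formula is the WCAG-style ratio (designed for text, so read it as a rough signal only).

```python
def hex_to_rgb(hex_colour):
    """Convert '#rrggbb' to a tuple of integers in the range 0..255."""
    h = hex_colour.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert a tuple of 0..255 integers back to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def ramp(start, end, steps):
    """Linearly interpolate `steps` colours from start to end in RGB."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    return [
        rgb_to_hex(tuple(round(x + (y - x) * i / (steps - 1))
                         for x, y in zip(a, b)))
        for i in range(steps)
    ]


def diverging_palette(low, mid, high, steps=5):
    """Build a palette running low -> mid -> high with mid at the centre."""
    if steps < 3 or steps % 2 == 0:
        raise ValueError("steps must be an odd number of at least 3")
    half = steps // 2 + 1
    left = ramp(low, mid, half)
    right = ramp(mid, high, half)
    return left + right[1:]  # drop the duplicated midpoint


def relative_luminance(hex_colour):
    """Relative luminance of an sRGB colour, from 0 (black) to 1 (white)."""
    def channel(c):
        c = c / 255
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in hex_to_rgb(hex_colour))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(colour_a, colour_b):
    """WCAG-style contrast ratio: 1 (identical) up to 21 (black on white)."""
    la, lb = relative_luminance(colour_a), relative_luminance(colour_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


# Hypothetical example: blue (cold) through light grey to red (hot).
palette = diverging_palette("#2166ac", "#f7f7f7", "#b2182b")
background = "#ffffff"
print(palette)
print(round(contrast_ratio(palette[2], background), 2))
```

A contrast ratio close to 1 means the midpoint nearly disappears into the background, which is a hint to pick a slightly darker neutral. Comparing `relative_luminance` for the two end colours is one rough way to check that they are balanced.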

Final thoughts
As we’ve learned, when it comes to data visualization, colour can either help or hinder the message you’re trying to communicate. My hope is that you’ve come away with a new appreciation for the role it plays in your story, as well as a frame of reference for applying colour with meaning.
We can bring clarity to the data when we take a moment to think about our audience, the story we want to tell, and which colours carry meaning for them.
I’ll leave you with some of my favourite tools for creating data visualization palettes:
- Viz Palette: a tool to visualize palettes across visualization types and test for visually similar colours and names
- Chroma.js: a tool for helping create sequential and divergent palettes
- ColorBrewer: a library of sequential and divergent palettes
Cate Wilcox is a UX Designer at IBM Cognos Analytics based in Ottawa, Canada. The above article is personal and does not necessarily represent IBM’s positions, strategies or opinions.