The hidden benefits of building an insights repository

Creating an insights repository made us better researchers in ways that we didn’t expect.

Published in UX Collective · 7 min read · Feb 24, 2020

(This is an article version of a talk I gave in February 2020 for the UX + Data Meetup in NYC. Slides are here. Video is here.)

My UX Research team and I couldn’t go a week without being asked if we had done research on a specific topic, or if we could quickly pull information on a certain user group. To answer these questions we’d sift through our list of reports, searching for ones that seemed relevant, and then dig through the raw data to find anything pertinent. Sometimes we’d conduct the primary research again with the sneaking suspicion that if we could just search through our old data easily we’d find something valuable.

It was time to build an insights repository.

What is a repository?

The world of research repositories is vast. Any research that is collected and stored could be called a repository. So all researchers have a repository whether they know it or not!

I’m a member of an incredible group of researchers and research ops folks called the Research Ops Community. We are working on teasing apart what a repository truly is and how it’s created and used. Here is the working definition:

“A research repository is any platform, system, drive, database, content collaboration tool, library, knowledge base, wiki, or file cabinet that stores research data, notes, transcripts, images, videos, recordings, findings, insights, reports, metadata, etc. to support consumption and reuse by the entire team.”
— Research Repositories Workshops in 2020, A Research Ops Community project

What my team built is one type of repository: an insights repository. We can define this more narrowly:

An insights repository is a place to store and access insights gathered from user research.

Our insights are the smallest meaningful morsels of information that we glean from user research. Usually, an insight looks like a statement with a corresponding direct quote from a user. In a half-hour interview, I usually collect about 20 unique insights. Some are pertinent to the research questions I’m addressing, and others are simply interesting but not relevant . . . yet. If those insights do become relevant in the future, an insights repository is where they can be found.

What we expected to gain from our repository

There is a general discourse around insights repositories. They help a research team do two things really well:

Democratize data so that anyone in your organization has access to user insights.
Enable researchers to reuse old studies so they don’t repeat research.

We set out to build our repository with these goals in mind. Soon our old research would be available and organized so that anyone could make meaning!

Here’s what actually happened.

Creating the repository

We began the journey of building our insights repository by choosing a few recent research reports that had “evergreen” insights; we knew they would be relevant for a long period of time. We brainstormed a taxonomy by creating as many categories and tags as we could. The taxonomy got huge quickly! We managed to cut it down to 22 categories, which is still quite large, but it was a starting place.

Then we tried out a bunch of repository tools: EnjoyHQ, Dovetail, and Aurelius among others. We wanted a robust search, easy-to-scan search results, and a seamless process for inputting insights. Plus, some of our data is sensitive, so we had to make sure any third-party app complied with our privacy needs. None of these tools was quite right for us, so we did the next best thing: we made our own.

Many large companies have built their own research repository platforms (WeWork and MailChimp, for example). We had no software developers available to do this, so instead we partnered with our data engineering team.

Our journey from spreadsheet, to data warehouse, to analytics dashboard.

We created a table inside our data warehouse called the Insights Table and set up the columns to match our taxonomy categories. This means our data is secure in our own system, and we can upload a spreadsheet of insights as long as the columns match the ones in the repository.
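To make the "columns must match" rule concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the actual warehouse setup: SQLite stands in for the data warehouse, and the column names (`insight`, `quote`, `study`, `user_group`, `emotional_state`) and function names are hypothetical stand-ins for the real taxonomy categories.

```python
import csv
import io
import sqlite3

# Hypothetical column names; a real Insights Table would have one
# column per taxonomy category.
INSIGHT_COLUMNS = ["insight", "quote", "study", "user_group", "emotional_state"]


def create_insights_table(conn):
    """Create the insights table with one text column per taxonomy category."""
    cols = ", ".join(f"{name} TEXT" for name in INSIGHT_COLUMNS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS insights ({cols})")


def upload_insights(conn, csv_text):
    """Load rows from a spreadsheet export (as CSV text).

    The upload is refused if the header does not contain exactly the
    table's columns, in any order.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    if sorted(header) != sorted(INSIGHT_COLUMNS):
        raise ValueError(f"columns do not match the Insights Table: {header}")

    column_list = ", ".join(INSIGHT_COLUMNS)
    placeholders = ", ".join("?" for _ in INSIGHT_COLUMNS)
    rows = [tuple(row[name] for name in INSIGHT_COLUMNS) for row in reader]
    conn.executemany(
        f"INSERT INTO insights ({column_list}) VALUES ({placeholders})", rows
    )
    return len(rows)
```

Checking the header before inserting anything means a mislabeled spreadsheet fails loudly instead of quietly landing data in the wrong category.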

Here’s the magic part! Because our insights live in our own data warehouse, we can pull that data into a data visualization tool like Looker. There we have an Insights dashboard where we (or anyone at the company) can search through our insights by keyword or category. Finally! Insights from our past research at our fingertips.
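Under the hood, a keyword-or-category search boils down to a filtered query on the insights table. The sketch below shows that idea only; it does not represent how Looker is configured. It assumes a simplified table with hypothetical `insight`, `quote`, and `category` columns, uses SQLite in place of the warehouse, and the sample rows are made up.

```python
import sqlite3


def search_insights(conn, keyword=None, category=None):
    """Return (insight, quote, category) rows matching the optional filters."""
    sql = "SELECT insight, quote, category FROM insights WHERE 1=1"
    params = []
    if keyword:
        # Match the keyword in either the insight statement or the user quote.
        sql += " AND (insight LIKE ? OR quote LIKE ?)"
        params += [f"%{keyword}%", f"%{keyword}%"]
    if category:
        sql += " AND category = ?"
        params.append(category)
    return conn.execute(sql + " ORDER BY insight", params).fetchall()


# Tiny in-memory example with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE insights (insight TEXT, quote TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO insights VALUES (?, ?, ?)",
    [
        ("Users want faster onboarding", "Setup took me all afternoon.", "onboarding"),
        ("Billing emails are confusing", "I wasn't sure what I was charged for.", "billing"),
    ],
)

for row in search_insights(conn, keyword="setup"):
    print(row)
```

Passing the filters as query parameters, rather than pasting them into the SQL string, keeps the search safe when someone types a quote mark into the search box.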

The real benefits of building a repository

After populating the repository with enough insights to be useful, our team was able to reflect on its value. It turns out the benefits of the repository were not at all what we expected.

1. Researchers dig deep quickly

While the repository is accessible to anyone, only our research team uses it deeply. Having a repository alone is not enough to democratize data. The democratization of data involves so much more than giving folks access to the data. You have to teach them what to look for and how to make meaning from what they find.

It’d be risky for other colleagues to dive into the repository without a sense of how prevalent a given type of data is or the context in which it was collected. If you’re looking in a research repository to prove or disprove an idea, it’s easy to make biased assumptions about what you find and then say that you did “research” on it.

On the other hand, it’s safe for the research team to query the repository and serve up insights and answers quickly. Why? Because researchers have research chops and institutional knowledge. Even with a perfectly tagged insights repository, institutional knowledge is golden.

2. Easily reconsider (not reuse) old research

When pulling up old insights in the repository, it became clear that we had to consider the relevancy of the insights to our current research. I’ve yet to encounter a research question where we can simply pull insights from a past study and find a fitting answer. We’re reconsidering our old research, not simply reusing it.

More often than not, we are finding old insights that fill in gaps or add evidence to the new insights we’ve collected. This mashing up of new and old data is why we decided to create a repository at the insight level instead of the report level. We’re able to pull up a single user quote on a certain theme and see how it jibes with new primary research. If the old insights seem to contradict the new data, we can dig into why. Maybe the demographic or market needs changed since we did the original research. Digging into the differences enables us to uncover new meaning, making our findings that much stronger.

The act of creating and using our insights repository has made my team better at our jobs.

We understand the nature of our data better.

To get our insights repository off the ground we had to:

examine our past studies to identify what would be useful in the repository
create a taxonomy so that our data could be categorized
review and clean up past insights with our new tags

This was a large project! It took us a whole quarter to get enough data into the repository so that it was usable. But now we have in-depth knowledge of what research we’ve collected in the past, and we know the nature of the data we can collect. For example, we know we have more insights from one user demographic than others, and that we have more information on churned users than prospective customers. Armed with this information, we are able to set up our research plans more effectively. We know where we will have to rely only on primary research, or where we can lean on the repository to color our findings.
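Spotting that kind of imbalance is a simple count once insights are tagged. Here is a minimal sketch, assuming each insight is a dictionary and that `user_segment` is one of the taxonomy fields (both the field name and the sample values are hypothetical).

```python
from collections import Counter


def coverage_by(insights, field):
    """Count insights per value of one taxonomy field, most common first."""
    counts = Counter(row.get(field) or "untagged" for row in insights)
    return counts.most_common()


# Made-up rows for illustration.
sample = [
    {"user_segment": "churned"},
    {"user_segment": "churned"},
    {"user_segment": "prospective"},
    {},  # an insight nobody tagged yet
]

for segment, count in coverage_by(sample, "user_segment"):
    print(segment, count)
```

Counting untagged rows separately is a deliberate choice here: it shows how much cleanup is left as well as where the coverage gaps are.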

We examine our data from more angles.

All the insights that go into the repository have to be properly tagged. This means that we code data with tags for the repository that we wouldn’t have otherwise considered.

For example, we record “emotional state” for our insights on a numeric scale: 1-negative to 5-positive. Because “emotional state” is a repository category, we now consider it when we’re tagging all primary data. In the past, we may have only considered emotional state in analysis when the research question was about emotion (“How do users feel when . . .”) or when the raw quote was very emotionally charged. Now, we always consider it and it’s helped lead to new insights about how users are feeling.
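Because the scale is numeric, it can be checked on the way in and averaged later. This is a rough sketch under assumptions, not our tooling: the function names are hypothetical, and each insight is represented as a simple (tag, emotional state) pair.

```python
from collections import defaultdict
from statistics import mean


def validate_emotional_state(value):
    """Return the score as an int, enforcing the 1 (negative) to 5 (positive) scale."""
    score = int(value)
    if not 1 <= score <= 5:
        raise ValueError(f"emotional state must be between 1 and 5, got {value!r}")
    return score


def average_emotion_by_tag(insights):
    """Average emotional state per tag for an iterable of (tag, score) pairs."""
    scores = defaultdict(list)
    for tag, value in insights:
        scores[tag].append(validate_emotional_state(value))
    return {tag: mean(values) for tag, values in scores.items()}
```

An average is a blunt summary of how users feel, so it works best as a pointer to which quotes to reread rather than as a finding on its own.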

We think deeper about relevancy.

Since we can spend more time considering past insights instead of collecting new ones, we have more time to focus on determining if the insights are relevant and actionable.

This has led to a whole lot of thinking about what a “good” insight is. If an insight is relevant to the research question, but not actionable or novel, should it even be included in analysis? And what determines relevancy? Due to our repository, we can be pickier about what information we examine because we have more information!

The process of creating the insights repository didn’t come without challenges. It’s still time-consuming to format data so that it uploads nicely into the repository. We have to be careful that we don’t export and re-upload the same data back into the repository. We’re still getting used to the care and rigor that’s necessary to store all our insights properly.
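One way to guard against re-uploading the same data is to fingerprint each insight and skip anything already stored. The sketch below is one possible approach, not a description of our process. It assumes rows are dictionaries and that `insight`, `quote`, and `study` together identify an insight (hypothetical field names).

```python
import hashlib

# Hypothetical fields that together identify an insight.
KEY_FIELDS = ("insight", "quote", "study")


def fingerprint(row):
    """Stable key for a row, ignoring case and surrounding whitespace."""
    parts = [str(row.get(field, "")).strip().lower() for field in KEY_FIELDS]
    # Join with a separator that will not appear in normal text.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def new_rows_only(existing_rows, incoming_rows):
    """Return incoming rows that are not already stored and not repeated in the batch."""
    seen = {fingerprint(row) for row in existing_rows}
    fresh = []
    for row in incoming_rows:
        key = fingerprint(row)
        if key not in seen:
            seen.add(key)
            fresh.append(row)
    return fresh
```

Normalizing case and whitespace catches copies that were lightly reformatted in a spreadsheet, though an insight that was genuinely reworded would still need a human eye.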

The benefits we found by building our insights repository were not the ones we initially expected. We didn’t democratize data or find a way to simply reuse old studies. Instead, our repository has given us the means to examine our data more closely without losing speed. It’s helped us focus on finding insights that align to the needs of our company. We are measurably better researchers because of our insights repository.

Icons: dashboard by Rafael Garcia Motta, database by Prashanth Rapolu, spreadsheet by Ralf Schmitzer, path by Adrien Coquet from the Noun Project