Ways to quick-start a taxonomy from scratch
If you’re creating your organization’s first centralized or enterprise taxonomy, the pressure is on. It’s a massive undertaking. You’ve got to show initial progress to further support for the project. And because it’s foundational, you’ve got to get a whole lot right from the beginning.



Leverage existing systems
You can hunt down taxonomy hiding in your company’s systems wherever they are. While not an alluring option, this is probably a good place to start. In fact, it’s what led us to the decision to create a central taxonomy in the first place.
For us at QuickBooks, it was no surprise that each system had its own one-off taxonomy, if any at all. The structures weren’t created by taxonomists or information architects. The models were a mess when we lined them up. Even though they didn’t help us define structure, they did prompt quite a bit of discussion on term definitions and supplied vocabulary for alternate and hidden terms.
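As a rough illustration of what "lining up" system vocabularies can look like, here is a minimal Python sketch. The system names and terms are hypothetical, made up for this example, and the normalization is deliberately naive (lowercasing and whitespace cleanup only), so plural forms like "invoices" still surface as separate entries worth discussing.

```python
from __future__ import annotations

from collections import defaultdict


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(term.lower().split())


def line_up(vocabularies: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each normalized term to the set of systems that use it."""
    usage: dict[str, set[str]] = defaultdict(set)
    for system, terms in vocabularies.items():
        for term in terms:
            usage[normalize(term)].add(system)
    return dict(usage)


# Hypothetical system names and terms, for illustration only.
vocabularies = {
    "help_center": ["Invoice", "Expense report", "Sales tax"],
    "community": ["invoice", "Expenses", "sales  tax"],
    "search": ["Invoices", "Sales Tax"],
}

usage = line_up(vocabularies)
# Terms used by more than one system are candidates for shared concepts.
shared = sorted(t for t, systems in usage.items() if len(systems) > 1)
# Terms seen in only one system may be alternates, hidden terms, or noise.
one_off = sorted(t for t, systems in usage.items() if len(systems) == 1)
```

The one-off list is where the definition debates tend to start: each entry is either a variant of a shared term or something only one system cares about.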
Develop your own taxonomy utilities
Wading through all the content and data for terms would have been a slow job. We had James Butler at Digital Agua develop a tool, Keyworderator, that allowed us to accept, reject, map, and load terms into our taxonomy management system, PoolParty, and that added reporting capabilities.
What would have taken two months, the team accomplished in two weeks using Keyworderator. It enabled us to go live with a V1 consisting of roughly 700 concept objects with 4,000 terms.
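To make the accept, reject, and map workflow concrete, here is a small Python sketch of a term triage utility. It is not how Keyworderator works internally; the class and method names are invented for illustration, and the load step is left out because it depends on whatever import format your taxonomy management system accepts.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    preferred: str
    alternates: list[str] = field(default_factory=list)


@dataclass
class Triage:
    """Hypothetical triage queue: every candidate term gets one decision."""

    concepts: dict[str, Concept] = field(default_factory=dict)
    rejected: list[str] = field(default_factory=list)

    def accept(self, term: str) -> None:
        """Promote a candidate term to a new concept (no-op if it exists)."""
        self.concepts.setdefault(term, Concept(preferred=term))

    def reject(self, term: str) -> None:
        """Record a candidate that should stay out of the taxonomy."""
        self.rejected.append(term)

    def map_to(self, term: str, preferred: str) -> None:
        """Attach a candidate as an alternate term of an accepted concept."""
        if preferred not in self.concepts:
            raise KeyError(f"accept {preferred!r} before mapping terms to it")
        concept = self.concepts[preferred]
        if term not in concept.alternates:
            concept.alternates.append(term)

    def report(self) -> dict[str, int]:
        """Count concepts, total terms (preferred plus alternates), rejects."""
        terms = sum(1 + len(c.alternates) for c in self.concepts.values())
        return {
            "concepts": len(self.concepts),
            "terms": terms,
            "rejected": len(self.rejected),
        }


triage = Triage()
triage.accept("invoice")
triage.map_to("invoices", "invoice")
triage.map_to("bill a customer", "invoice")
triage.reject("stuff")
summary = triage.report()
```

Even a simple report like this gives you concept and term counts to show progress as the first version takes shape.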

Generate taxonomy with ML
Alternatively, you might generate the taxonomy from your body of content. Early on, our data science partners hoped to construct a taxonomy this way. The processing was fast, with the idea that we could iterate over time. At the least, the result represented our world, based largely on our own data and enriched by content from the web.
We tried clustering concepts and relationships, expanding nodes and terms, but our own content quality was too poor for it to work. Analysis resulted in term variation we didn’t want in the taxonomy. It was time to focus on the performance of the model instead.
Happy note: our content has been subsequently rewritten by a team of newly hired writers.
Build, buy, mod
At some point, it’s natural to believe someone out there must have solved this before. If you’re in a common space like chemicals or health sciences, licensing an existing third-party taxonomy could provide 80% of what you need.
Of course we shopped Taxonomy Warehouse (I just like saying that). There was nothing in existence for the accounting domain of our product, QuickBooks. Even if we could have modified an existing taxonomy, it could have taken an unwieldy amount of customization to make it fit.
Decide on the type of taxonomy
Figure out what kind of taxonomy to build for your requirements as soon as possible. Should it be flat lists, hierarchy, facets, ontology…? Will you start with one and evolve to another?
Complexity lies in our domain. We’re dealing with practices and regulations that vary by location and language. We have different types of users with different names for the same thing: from expert accounting pros to small businesses to self-employed novices who might refer to “ACH transactions” as “stuff.”
We found the usual ambiguity of any taxonomy, plus overlap in ours, for instance:
- “I tried to expense this.” Not the same as an “expense report.”
- “Can I invoice it?” “Invoice” is a noun that’s also a verb, and the action itself includes other verbs, as in “create and send invoice.”
We landed on creating a thesaurus. Doing so also gave us the capability to meet a requirement to globalize vocabulary with one-to-one, one-to-many and none-to-one relationships from the source language.
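Here is one way a thesaurus entry with those translation relationships might be modeled in Python. This is a sketch under assumptions, not our production schema: the field names borrow from SKOS label types (preferred, alternate, hidden), and the French terms are illustrative examples rather than entries from our vocabulary.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ThesaurusEntry:
    # Source-language preferred term; None when only the target language has it.
    source_term: Optional[str]
    alt_labels: list[str] = field(default_factory=list)
    hidden_labels: list[str] = field(default_factory=list)
    # Target-language equivalents for one locale.
    targets: list[str] = field(default_factory=list)


def relationship(entry: ThesaurusEntry) -> str:
    """Classify how the source term relates to its target-language terms."""
    count = len(entry.targets)
    if entry.source_term is None:
        return "none-to-one" if count == 1 else "unmapped"
    if count == 1:
        return "one-to-one"
    if count > 1:
        return "one-to-many"
    return "unmapped"


# Illustrative English-to-French entries, not taken from a real vocabulary.
entries = [
    ThesaurusEntry("invoice", alt_labels=["bill a customer"], targets=["facture"]),
    ThesaurusEntry("bill", targets=["facture", "addition"]),
    ThesaurusEntry(None, targets=["auto-entrepreneur"]),
    ThesaurusEntry("ACH transaction", hidden_labels=["stuff"]),
]

kinds = [relationship(e) for e in entries]
```

Entries that come back as "unmapped" are worth flagging for a localization review instead of forcing an equivalent that may not exist.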
Open standards, open source
If you check and find no open source taxonomy you can use, at least use web standards to speed things up.
We took everything we could from ISO, NISO, SKOS, schema.org, and combed through the Financial Industry Business Ontology. While FIBO was created for the financial industry, it didn’t cover accounting. In fact, their committee rep asked us to contribute to it.
Looking back, these activities gave us speed, and when they didn’t work as expected, we still uncovered inputs we could use. The challenge of getting started mainly had to do with our complex, uncharted domain. QuickBooks has been in the software business for decades, and this is the first year we’re tagging content by topic.
Intuit has an amazing mass of teams partnering to build a central taxonomy and infrastructure for delivering it. Morgen, Rick, James, Svetlana... I’m your biggest fan.