knowledge modeling: what I wish I'd known before starting

Quick definitions:

Taxonomy – A method for classifying information into hierarchical categories and subcategories, similar to biological classification (kingdom, phylum, class, etc.).

Ontology – An advanced framework that defines categories and illustrates the relationships between concepts, forming a network of connections rather than a simple hierarchical tree.

Knowledge Modeling – The practice of creating these categories and relationships to organize information according to how people and systems actually use and understand it.

"I’d like for you to help develop a knowledge model for our marketing materials. The team testing the new knowledge modeling tool needs a preliminary structure to work with."

I nodded confidently, opened my laptop, and then immediately Googled "what is knowledge modeling."

Over the next week, I collected files from repositories, worked through metadata, and developed a preliminary concept scheme. By the end, I had created a proposed model for the marketing content and gained new insight into the future of content management. I haven’t seen the tool itself yet—that’s the implementation team’s job—but I did learn just how much brainwork happens before you let the shiny semantic tech do its thing.

The Reality of Starting from Scratch

Knowledge modeling is how we teach content to make sense of itself.

The company is testing new software, a semantic technology platform that can generate starter models. But they wanted something custom-built for their specific marketing needs first—a human-developed understanding of their content that could inform how they configure and use the tool. That's where I came in: developing the conceptual framework to help the team get started with the testing.

For confidentiality reasons, I invented the sample material you see below.

Phase 1: The Great Content Collection

My first task was straightforward: gather recent marketing materials from the company's file repository into a text folder. The requirements were clear:

  • Recent content only

  • File formats: PDF, DOC, or HTML

  • Focus on marketing materials

This turned out to be relatively manageable. The company doesn't have a huge amount of marketing materials, which was actually a blessing for a first knowledge modeling project. I found:

  • Mostly PDFs (the vast majority of content)

  • A handful of DOC files

  • No HTML files at all

While the collection process itself wasn't difficult, it did reveal something interesting: the company's marketing content was fairly contained and consistent in format. This would make the modeling process cleaner, but it also meant every piece needed to count—there wasn't endless content to fill gaps.

Phase 2: The Metadata Marathon

With a manageable set of files collected, I moved on to tagging. I needed to apply metadata to each file to identify patterns and relationships. This is where I learned my first big lesson: metadata is only useful if it's consistent.

My initial tagging attempts were too granular. I had created tags like:

  • "customer-success-story"

  • "client-case-study"

  • "success-narrative"

  • "implementation-story"

...they all essentially mean the same thing. I had to step back and develop a controlled vocabulary—a defined set of terms to use consistently.
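Once a controlled vocabulary exists, enforcing it can be as simple as a lookup table that maps every variant back to one canonical term. A minimal sketch in Python, using my invented sample tags rather than any real vocabulary:

```python
# Map every near-duplicate tag a tagger might reach for to one canonical term.
# These tag names are invented sample data, not an actual vocabulary.
CONTROLLED_VOCAB = {
    "customer-success-story": "case-study",
    "client-case-study": "case-study",
    "success-narrative": "case-study",
    "implementation-story": "case-study",
}

def normalize_tag(tag: str) -> str:
    """Return the canonical term for a tag, or the cleaned tag if unmapped."""
    cleaned = tag.lower().strip()
    return CONTROLLED_VOCAB.get(cleaned, cleaned)

raw_tags = ["Client-Case-Study", "success-narrative", "pricing-sheet"]
print([normalize_tag(t) for t in raw_tags])
# ['case-study', 'case-study', 'pricing-sheet']
```

Four spellings collapse to one term, and anything not yet in the vocabulary passes through unchanged so it can be reviewed and added later.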

Phase 3: Pattern Recognition (Or: Staring at Spreadsheets Until They Make Sense)

With files tagged, I started looking for patterns. This involved:

  • Sorting content by different metadata fields

  • Creating pivot tables to see tag frequency

  • Identifying cluster topics that appeared repeatedly

  • Noting relationships between content types

Slowly, a structure emerged. I could see that the marketing materials naturally grouped around certain themes.
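The tag-frequency part of that spreadsheet work can be sketched in a few lines of Python. The filenames and tags below are invented sample data standing in for my actual tagging sheet:

```python
from collections import Counter

# Hypothetical per-file metadata, standing in for the tagging spreadsheet.
files = {
    "healthcare-roi.pdf": ["case-study", "healthcare", "decision"],
    "intro-webinar.doc": ["thought-leadership", "awareness"],
    "ai-trends.pdf": ["thought-leadership", "healthcare", "awareness"],
}

# Count how often each tag appears across the whole collection.
tag_counts = Counter(tag for tags in files.values() for tag in tags)

# Most frequent tags first: these are the candidate cluster topics.
for tag, count in tag_counts.most_common():
    print(f"{tag}: {count}")
```

Tags that show up across many files suggest top-level concepts; tags that appear once or twice may belong deeper in the hierarchy, or not at all.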

Phase 4: Building the Model

Based on the patterns, I proposed a preliminary model using a hierarchical structure that could eventually be implemented in the tool. What I was creating was essentially a taxonomy—a formal classification system that organizes information into categories and subcategories, showing how different concepts relate to each other. Think of it like the classification system in biology (Kingdom > Phylum > Class > Order) but for marketing content.

[Image: a page showing the biological taxonomy of some ocean invertebrates]

In knowledge modeling, a taxonomy typically follows this structure:

  • Scheme (the overarching framework)

    • Concepts (main categories)

      • Subconcepts (detailed breakdowns)

For example:

  • Marketing Knowledge (scheme)

    • Content Purpose (concept)

      • Educational Content (subconcept)

      • Thought Leadership (subconcept)

      • Product Information (subconcept)

    • Buyer Journey Stage (concept)

      • Awareness (subconcept)

      • Consideration (subconcept)

      • Decision (subconcept)

    • Content Format (concept)

      • Long-form Content (subconcept)

      • Brief Content (subconcept)

      • Visual Content (subconcept)

This seems obvious in hindsight, but arriving at this clean structure required wading through lots of files and multiple false starts.
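One way to sanity-check a structure like this is to write it down as plain data and spell out every path. Here is the example model above as a nested Python dict; this is just a sketch of the three-level shape, not the tool's actual storage format:

```python
# The example hierarchy as a nested dict: scheme -> concepts -> subconcepts.
TAXONOMY = {
    "Marketing Knowledge": {
        "Content Purpose": ["Educational Content", "Thought Leadership",
                            "Product Information"],
        "Buyer Journey Stage": ["Awareness", "Consideration", "Decision"],
        "Content Format": ["Long-form Content", "Brief Content",
                           "Visual Content"],
    }
}

def paths(taxonomy):
    """Yield every scheme > concept > subconcept path as a string."""
    for scheme, concepts in taxonomy.items():
        for concept, subconcepts in concepts.items():
            for sub in subconcepts:
                yield f"{scheme} > {concept} > {sub}"

for p in paths(TAXONOMY):
    print(p)
```

Listing the paths makes gaps and overlaps easy to spot: every piece of content should have an obvious home on at least one path.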

Understanding the Software from the Outside

While I haven't used the software directly, researching it helped me understand what kind of model structure would be most useful for the implementation team. This tool is a semantic technology platform that helps create and manage knowledge models (technically called taxonomies, thesauri, or ontologies).

From what I learned, my model needed to:

  • Follow hierarchical logic that could translate to semantic relationships

  • Use consistent terminology that could become a controlled vocabulary

  • Include clear parent-child relationships

  • Avoid circular references or logical conflicts
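The last requirement, no circular references, is easy to check mechanically once the parent-child relationships are written down. A sketch with invented terms, where the last entry introduces a cycle on purpose:

```python
# Hypothetical child -> parent map for a draft taxonomy.
# The last entry deliberately creates a circular reference.
parents = {
    "Educational Content": "Content Purpose",
    "Content Purpose": "Marketing Knowledge",
    "Marketing Knowledge": "Educational Content",  # cycle!
}

def find_cycle(parents):
    """Return a list of terms forming a cycle, or None if the hierarchy is clean."""
    for start in parents:
        seen = [start]
        node = start
        while node in parents:
            node = parents[node]
            if node in seen:
                # Close the loop so the cycle reads start -> ... -> start.
                return seen[seen.index(node):] + [node]
            seen.append(node)
    return None

print(find_cycle(parents))
```

Running a check like this before handing a model to an implementation team catches the kind of logical conflict that a semantic platform will otherwise reject, or worse, silently mangle.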

A good taxonomy does more than just organize—it creates a shared language. When everyone agrees that "thought leadership" means one specific thing and belongs in a specific place in the hierarchy, you eliminate confusion and improve findability.

Early versions of my taxonomy were either too broad (everything lumped into three huge categories) or too granular (dozens of micro-categories with only one or two items each).

What I Wish I'd Known from Day One

1. Your logic isn't their logic (and that's the whole point)

When I started, I assumed the taxonomy would be organized purely by topic—straightforward and tidy. But once I dug into the content, I realized it was more complicated than that.

Marketers don’t look for “a topic about healthcare.” They look for “something for healthcare prospects in the awareness stage.” The difference may sound small, but it shifts the entire structure.

The model I ended up proposing looked very different from what I first imagined. Instead of:

Topic → Healthcare

It became:

Industry Solutions → Healthcare → Customer Success Stories

The content itself didn’t change, but the framing did. The lesson? Knowledge modeling isn’t about building the simplest possible structure—it’s about shaping a taxonomy around how your users actually look for and use content.

[Photo: the arched Pennybacker Bridge in Austin, Texas]

2. You're building a bridge, not a destination

I never thought I was building the “final model”—I’m an intern, after all. My job was to sketch the bridge: something that would give the team a clearer view of the content landscape before they got into the platform. That shift in perspective let me aim for clarity and documentation rather than chasing perfection.

3. Document your reasoning, not just your results

The team implementing this in the tool won't have the context of why I made certain decisions. Why are webinars under "thought leadership" and not "product education"? I learned to document my reasoning as thoroughly as my recommendations.

The Unexpected Value

What surprised me most was how much strategic thinking happens before any technology enters the picture. I thought knowledge modeling would be about learning sophisticated software. Instead, it was about:

  • Deep content analysis

  • Understanding user mental models

  • Creating logical structures

  • Identifying patterns and relationships

The fancy semantic technology platforms are powerful, but they're only as good as the human thinking that goes into them. My work—collecting, tagging, analyzing, and structuring—laid the conceptual foundation that makes the technology implementation possible.

For Fellow Beginners

If you're asked to support a knowledge modeling project, you don't necessarily need to know the technology. The teams testing tools need people who can:

  • Systematically analyze content

  • Identify patterns

  • Think about user needs

  • Document decisions clearly

  • Create logical structures

The preliminary work—understanding what you have and how it should connect—is just as valuable as the technical implementation.

Looking Forward

My preliminary model is now with the team testing the tool. They'll refine it, test it, and probably discover issues I didn't anticipate. But I've given them something crucial: a human-validated understanding of our content landscape and how our marketers think about it.

 august 27, 2025
