How to Leverage LLMs for Auto-tagging & Content Enrichment
https://enterprise-knowledge.com/how-to-leverage-llms-for-auto-tagging-content-enrichment/

When working with organizations on key data and knowledge management initiatives, we’ve often noticed that a common roadblock is the lack of quality (relevant, meaningful, or up-to-date) content. Stakeholders may be excited to get started with advanced tools as part of their initiatives, like graph solutions, personalized search solutions, or advanced AI solutions; however, without a strong backbone of semantic models and context-rich content, these solutions are significantly less effective. For example, without proper tags and content types, a knowledge portal development effort can’t fully demonstrate the value of faceting and aggregating pieces of content and data together in ‘knowledge panes’. With a more semantically rich set of content to work with, the portal can begin showing value through search, filtering, and aggregation, leading to further organizational and leadership buy-in.

One key step in preparing content is the application of metadata and organizational context to pieces of content through tagging. There are several tagging approaches an organization can take to enrich pre-existing content with metadata and organizational context, including manual tagging, automated tagging capabilities from a taxonomy and ontology management system (TOMS), using apps and features directly from a content management solution, and various hybrid approaches. While several of these approaches, in particular acquiring a TOMS, are recommended as long-term auto-tagging solutions, EK has recommended and implemented Large Language Model (LLM)-based auto-tagging capabilities across several recent engagements. Because LLM-based tagging requires a lower initial investment than a TOMS and is far more efficient than manual tagging, these auto-tagging solutions have been able to provide immediate value and jumpstart the process of re-tagging existing content. This blog will dive deeper into how LLM tagging works, the value of semantics, technical considerations, and next steps for implementing an LLM-based tagging solution.

Overview of LLM-Based Auto-Tagging Process

Similar to existing auto-tagging approaches, the LLM suggests a tag by parsing through a piece of content, processing and identifying key phrases, terms, or structure that give the document context. Through prompt engineering, the LLM is then asked to compare the similarity of key semantic components (e.g., named entities, key phrases) with various term lists, returning a set of terms that could be used to categorize the piece of content. These responses can be adjusted in the tagging workflow to only return terms meeting a specific similarity threshold. These tagging results are then exported to a data store and applied to the content source. Many factors, including the particular LLM used, the knowledge an LLM is working with, and the source location of content, can greatly impact tagging effectiveness and accuracy. In addition, adjusting parameters, taxonomies/term lists, and/or prompts to improve precision and recall can ensure tagging results align with an organization’s needs. The final step is the auto-tagging itself: the application of the tags in the source system, typically via a script or workflow that applies the stored tags to pieces of content.

Figure 1: High-level steps for LLM content enrichment
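To make the steps in Figure 1 concrete, here is a minimal Python sketch of the parse-compare-filter loop described above. The `llm` function, the prompt wording, the taxonomy terms, and the 0.7 threshold are all illustrative assumptions rather than EK's implementation; wire the call to whichever model provider you use.

```python
import json

# Hypothetical stand-in for any chat-completion client (OpenAI, Azure, Bedrock, etc.).
# It takes a prompt and returns the model's text response.
def llm(prompt: str) -> str:
    raise NotImplementedError("connect to your LLM provider here")

TAXONOMY = ["Green Account", "Member Services", "Regulatory Compliance"]  # illustrative terms

PROMPT_TEMPLATE = """You are a content tagger. Compare the document below to the term list.
Return a JSON array of objects: {{"term": <term>, "score": <0-1 similarity>}}.
Only use terms from this list: {terms}

Document:
{document}
"""

def suggest_tags(document: str, threshold: float = 0.7) -> list[str]:
    """Ask the LLM to score each taxonomy term against the document,
    then keep only terms that meet the similarity threshold."""
    prompt = PROMPT_TEMPLATE.format(terms=", ".join(TAXONOMY), document=document)
    scored = json.loads(llm(prompt))
    return [item["term"] for item in scored if item["score"] >= threshold]

# The filtered tags would then be exported to a data store and applied
# to the content source by a separate script or workflow.
```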

EK has put these steps into practice, for example, when engaging with a trade association on a content modernization project to migrate and auto-tag content into a new content management system (CMS). The organization had been struggling with content findability, standardization, and governance, in particular with the language used to describe the diverse areas of work the trade association covers. As part of this engagement, EK first worked with the organization’s subject matter experts (SMEs) to develop new enterprise-wide taxonomies and controlled vocabularies integrated across multiple platforms to be utilized by both external and internal end-users. To operationalize and apply these common vocabularies, EK developed an LLM-based auto-tagging workflow utilizing the high-level steps above to auto-tag metadata fields and identify content types. This content modernization effort set up the organization for document workflows, search solutions, and generative AI projects, all of which are able to leverage the added metadata on documents.

Value of Semantics with LLM-Based Auto-Tagging

Semantic models such as taxonomies, metadata models, ontologies, and content types can all be valuable inputs to guide an LLM on how to effectively categorize a piece of content. When considering how an LLM is guided for auto-tagging content, greater emphasis needs to be placed on organization-specific context. If using a taxonomy as a tagging input, organizational context can be added through weighting specific terms, increasing the number of synonyms/alternative labels, and providing organization-specific definitions. For example, by providing organizational context through a taxonomy or business glossary that the term “Green Account” refers to accounts that have met a specific environmental standard, the LLM would not accidentally tag content related to the color green or an account that is financially successful.

Another benefit of an LLM-based approach is the ability to evolve both the semantic model and the LLM workflow as tagging results are received. As sets of tags are generated for an initial set of content, the taxonomies and content models used to guide the LLM can be refined to better fit the specific organizational context. This could look like adding alternative labels, adjusting the definitions of terms, or adjusting the taxonomy hierarchy. Similarly, additional tools and techniques, such as weighting and prompt engineering, can tune the results provided by the LLM to achieve higher recall (the rate at which the LLM includes the correct terms) and precision (the rate at which the LLM selects only correct terms) when recommending terms. One example of this is adding a weighting from 0 to 10 for all taxonomy terms and assigning higher scores to terms the organization prefers to use. The workflow developed alongside the LLM can use this context to include or exclude a particular term.
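As a rough illustration of that weighting idea, the sketch below combines a hypothetical 0-10 organizational weight with the LLM's similarity score. The term names, weights, and cutoff are invented for the example, not recommended values.

```python
# Illustrative organizational weights (0-10): higher means the organization
# prefers this term; low weights effectively deprecate a term.
TERM_WEIGHTS = {"Green Account": 9, "Sustainable Account": 3, "Account": 1}

def rank_tags(llm_scores: dict[str, float], min_combined: float = 4.0) -> list[str]:
    """Combine the LLM's similarity score (0-1) with the 0-10 term weight,
    so preferred terms win ties and deprecated terms drop out."""
    combined = {
        term: score * TERM_WEIGHTS.get(term, 5)  # unweighted terms get a neutral 5
        for term, score in llm_scores.items()
    }
    return [t for t, v in sorted(combined.items(), key=lambda kv: -kv[1]) if v >= min_combined]

# The LLM scores two near-synonyms similarly, but weighting steers the
# workflow toward the organization's preferred label.
print(rank_tags({"Green Account": 0.8, "Sustainable Account": 0.75}))
# -> ['Green Account']  (0.8 * 9 = 7.2 passes; 0.75 * 3 = 2.25 falls below the cutoff)
```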

Implementation Considerations for LLM-Based Auto-Tagging 

Several factors, such as the timeframe, volume of information, necessary accuracy, types of content management systems, and desired capabilities, inform the complexity and resources needed for LLM-based content enrichment. The following considerations expand upon the factors an organization must consider for effective LLM content enrichment. 

Tagging Accuracy

The accuracy of tags from an LLM directly impacts end-users and systems (e.g., search instances or dashboards) that are utilizing the tags. Safeguards need to be implemented to ensure end-users can trust the accuracy of the tagged content they are using. These safeguards help ensure that a user does not mistakenly access or use the wrong document, and that they are not frustrated by the results they get. To mitigate both of these concerns, high recall and precision scores for the LLM tagging improve overall accuracy and lower the chance of miscategorization. This can be done by investing further in human test-tagging and input from SMEs to create a gold-standard set of tagged content as training data for the LLM. The gold-standard set can then be used to adjust how the LLM weights or prioritizes terms, based on the organizational context it captures. These practices will help to avoid hallucinations (factually incorrect or misleading content) that could appear in applications utilizing the auto-tagged set of content.
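Measuring tagging accuracy against a gold-standard set can be as simple as the following sketch; the example tags are hypothetical.

```python
def precision_recall(suggested: set[str], gold: set[str]) -> tuple[float, float]:
    """Score LLM-suggested tags against a human-built gold-standard set."""
    true_positives = len(suggested & gold)
    precision = true_positives / len(suggested) if suggested else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical gold standard from SME test-tagging of one document:
gold = {"Green Account", "Regulatory Compliance"}
suggested = {"Green Account", "Member Services"}

p, r = precision_recall(suggested, gold)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.50
```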

Content Repositories

One factor that greatly adds technical complexity is accessing the various types of content repositories that an LLM solution, or any auto-tagging solution, needs to read from. The best content management practice for auto-tagging is to read content in its source location, limiting the risk of duplication and the effort needed to download and then read content. When developing a custom solution, each content repository often needs a distinct approach to read and apply tags. A content or document repository like SharePoint, for example, has a robust API for reading content and seamlessly applying tags, while a less widely adopted platform may not have the same level of support. It is important to account for the unique needs of each system in order to limit the disruption end-users may experience when embarking on a tagging effort.
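As one hedged example of applying tags in a source system, the sketch below writes tags back to a SharePoint list item through the Microsoft Graph list-item fields endpoint. The site, list, and item IDs, the "Topics" column, and the token acquisition are all placeholders; a managed-metadata column would require a different payload shape than the plain-text column assumed here.

```python
import requests

GRAPH = "https://graph.microsoft.com/v1.0"

def apply_tags(token: str, site_id: str, list_id: str, item_id: str, tags: list[str]) -> None:
    """Write stored tags back to a SharePoint list item via Microsoft Graph.
    'Topics' is a hypothetical plain-text column on the target list."""
    url = f"{GRAPH}/sites/{site_id}/lists/{list_id}/items/{item_id}/fields"
    response = requests.patch(
        url,
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        json={"Topics": "; ".join(tags)},  # update the column with the tag list
    )
    response.raise_for_status()  # surface permission or schema errors early
```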

Knowledge Assets

When considering the scalability of the auto-tagging effort, it is also important to evaluate the breadth of knowledge asset types being analyzed. While the ability of LLMs to process several types of knowledge assets has been growing, each step of additional complexity, particularly evaluating multiple asset types, can require additional resources and time to read and tag documents. A PDF document with 2-3 pages of content will take far fewer tokens and resources for an LLM to read than a long visual or audio asset. Going from a tagging workflow of structured knowledge assets to tagging unstructured content will increase the overall time, resources, and custom development needed to run a tagging workflow.

Data Security & Entitlements

When utilizing an LLM, it is recommended that an organization invest in a private or in-house LLM to complete analysis, rather than leveraging a publicly available model. This does not mean the LLM needs to be ‘on-premises’; several providers offer options for running LLMs within your company’s own environment. This ensures a higher level of document security and additional features for customization. Particularly when tackling use cases with higher levels of personal information and access controls, a robust mapping of content and an understanding of what needs to be tagged is imperative. As an example, if a publicly facing LLM were reading confidential documents on how to develop a company-specific product, this information could then be leveraged in other public queries and would have a higher likelihood of being accessed outside of the organization. In an enterprise data ecosystem, running an LLM-based auto-tagging solution can raise red flags around data access, controls, and compliance. These challenges can be addressed through a Unified Entitlements System (UES) that creates a centralized policy management system for both end users and the LLM solutions being deployed.

Next Steps

One major consideration with an LLM tagging solution is maintenance and governance over time. For some organizations, after completing an initial enrichment of content by the LLM, a combination of manual tagging and forms within each CMS helps them maintain tagging standards over time. However, a more mature organization that is dealing with several content repositories and systems may want to either operationalize the content enrichment solution for continued use or invest in a TOMS. With either approach, completing an initial LLM enrichment of content is a key method to prove the value of semantics and metadata to decision-makers in an organization. 
Many technical solutions and initiatives that excite both technical and business stakeholders can be actualized by an LLM content enrichment effort. With content that is tagged and adheres to semantic standards, solutions like knowledge graphs, knowledge portals, semantic search engines, or even an enterprise-wide LLM solution can demonstrate even greater organizational value.

If your organization is interested in upgrading your content and developing new KM solutions, contact us!

Defining Governance and Operating Models for AI Readiness of Knowledge Assets
https://enterprise-knowledge.com/defining-governance-and-operating-models-for-ai-readiness-of-knowledge-assets/

Artificial intelligence (AI) solutions continue to capture both the attention and the budgets of many organizations. As we have previously explained, a critical factor to the success of your organization’s AI initiatives is the readiness of your content, data, and other knowledge assets. When correctly executed, this preparation will ensure your knowledge assets are of the appropriate quality and semantic structure for AI solutions to leverage with context and inference, while identifying and exposing only the appropriate assets to the right people through entitlements.

This, of course, is an ongoing challenge rather than a moment-in-time initiative. To ensure the important work you’ve done to get your content, data, and other assets AI-ready is not lost, you need governance, as well as an operating model to guide it. Indeed, well before any AI readiness initiative, governance and the operating model must be top of mind.

Governance is not a new term within the field. Historically, we’ve identified four core components to governance in the context of content or data:

  • Business Case and Measurable Success Criteria: Defining the value of the solution and the governance model itself, as well as what success looks like for both.
  • Roles and Responsibilities: Defining the individuals and groups necessary for governance, as well as the specific authorities and expectations of their roles.
  • Policies and Procedures: Detailing the timelines, steps, definitions, and actions for the associated roles to play.
  • Communications and Training: Laying out the approach to two-way communications between the associated governance roles/groups and the various stakeholders.

These traditional components of governance all have held up, tried and true, over the quarter-century since we first defined them. In the context of AI, however, it is important to go deeper and consider the unique aspects that artificial intelligence brings into the conversation. Virtually every expert in the field agrees that AI governance should be a priority for any organization, but that must be detailed further in order to be meaningful.

In the context of AI readiness for knowledge assets, we focus AI governance, and more broadly its supporting operating model, on five key elements for success:

  • Coordination and Enablement Over Execution
  • Connection Instead of Migration
  • Filling Gaps to Address the Unanswerable Questions
  • Acting on “Hallucinations”
  • Embedding Automation (Where It Makes Sense)

There is, of course, more to AI governance than these five elements, but in the context of AI readiness for knowledge assets, our experience shows that these are the areas where organizations should be focusing and shifting away from traditional models. 

1) Coordination and Enablement Over Execution

In traditional governance models (e.g., content governance or data governance), most of the work was done in the context of a single system. Content would live in a content management system and have a content governance model. Data would live in a data management solution and have a data governance model. The shift is that today’s AI governance solutions shouldn’t care what types of assets you have or where they are housed. This presents an amazing opportunity to remove artificial silos within an organization, but brings a marked challenge.

If you were previously defining a content governance model, you most likely possessed some level of control or ownership over your content and document management systems. Likewise, if you were in charge of data governance, you likely “owned” some or all of the major data solutions, like master data management or a data warehouse, within your organization. With AI, however, an enormous benefit of a correctly architected enterprise AI solution that leverages a semantic layer is that it can draw on source systems you don’t own. The systems housing the content, data, and other knowledge assets are likely, at least in part, managed by other parts of your organization. In other words, in an AI world, you have less control over the sources of the knowledge assets, and thereby over the knowledge assets themselves. This may well change as organizations evolve in the “Age of AI,” but for now, the role and responsibility for AI governance becomes more about coordination and less about execution or enforcement.

In practice, this means an AI Governance for Knowledge Asset Readiness group must coordinate with the owners of the various source systems for knowledge assets, providing additive guidance to define what it means to have AI-ready assets as well as training and communications to enable and engage system and asset owners to understand what they must do to have their content, data, and other assets included within the AI models. The word “must” in the previous sentence is purposeful. You alone may not possess the authority of an information system owner to define standards for their assets, but you should have the authority to choose not to include those assets within the enterprise AI solution set.

How do you apply that authority? As the lines continue to blur between the purview of KM, Data, and AI teams, this AI Governance for Knowledge Asset Readiness group should comprise representatives from each of these once siloed teams to co-own outcomes as new AI use cases, features, and capabilities are developed. The AI governance group should be responsible for delineating key interaction points and expected outcomes across teams and business functions to build alignment, facilitate planning and coordination, and establish expectations for business and technical stakeholders alike as AI solutions evolve. Further, this group should define what it means (and what is required) for an asset to be AI-ready. We cover this in detail in previous articles, but in short, this boils down to semantic structure, quality, and entitlements as the three core pillars to AI readiness for knowledge assets. 

2) Connection Instead of Migration

The idea of connections over migration aligns with the previous point. Past monolithic efforts in your organization would commonly have included massive migrations and consolidations of systems and solutions. The roadmaps of past MDMs, data warehouses, and enterprise content management initiatives are littered with failed migrations. Again, part of the value of an enterprise AI initiative that leverages a semantic layer, or at least a knowledge graph, is that you don’t need to absorb the cost, complexity, and probable failure of a massive migration. 

Instead, the role of the AI Governance for Knowledge Asset Readiness group is one of connection. Once the group has set the expectation for AI-ready knowledge assets, the next step is to ensure the systems that house those assets are connected and available, ready for the enterprise AI solutions to ingest and understand them. This can be a highly iterative process, not to be rushed, as the sanctity of the assets ingested by AI is more important than their depth. Said differently, you have few chances to deliver wrong answers: your end users will lose trust quickly in a solution that delivers inaccurate information that they know is unmistakably incorrect; but if they receive an incomplete answer instead, they will be more likely to raise this and continue to engage. The role of this AI governance group is to ensure the right systems and their assets are reliably available for the AI solution(s) at the right time, after your knowledge assets have passed through the appropriate requirements.

 

3) Filling Gaps to Address the Unanswerable Questions

As AI solutions are deployed, AI governance shifts from being proactive to being reactive. There is a great opportunity associated with this that bears particular focus. In the history of knowledge management, and more broadly the fields of content management, data management, and information management, there has always been a creeping concern that an organization “doesn’t know what it doesn’t know.” What are the gaps in knowledge? What are the organizational blind spots? These questions have been nearly impossible to answer at the enterprise level. However, with enterprise-level AI solutions implemented, this awareness is suddenly possible.

Even before deploying AI solutions, a well-designed semantic layer can help pinpoint organizational gaps in knowledge by finding taxonomy elements lacking in applied knowledge assets. However, this potential is magnified once the AI solution is fully deployed. Today’s mature AI solutions are “smart” enough to know when they can’t answer a question and to highlight that unanswerable question to the AI governance group. Imagine possessing the organizational intelligence to know what your colleagues are seeking to understand, having insight into what they are trying to learn or answer but currently cannot.

In this way, once an AI solution is deployed, the primary role of the AI governance group should be to diagnose and then respond to these automatically identified knowledge gaps, using their standards to fill them. It may be that the information does, in fact, exist within the enterprise, but that the AI solution wasn’t connected to those knowledge assets. Alternatively, it may be that the right semantic structure wasn’t placed on the assets, resulting in a missed connection and a false gap from the AI. However, it may also be that the answer to the “unanswerable” question only exists as tacit knowledge in the heads of the organization’s experts, or doesn’t exist at all. This is the most core and true value of the field of knowledge management, and has never been so possible.
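A minimal sketch of how a retrieval-augmented solution might surface unanswerable questions to the governance group; the `retrieve` and `generate` callables, the confidence score, and the CSV log are stand-ins for whatever your AI stack actually provides.

```python
import csv
from datetime import datetime, timezone

UNANSWERED_LOG = "unanswered_questions.csv"  # reviewed by the AI governance group

def answer_or_flag(question: str, retrieve, generate, min_score: float = 0.5) -> str:
    """Answer from retrieved knowledge assets when confidence is high enough;
    otherwise decline and log the question as a candidate knowledge gap."""
    passages, score = retrieve(question)  # your retriever returns (passages, top score)
    if score < min_score:
        with open(UNANSWERED_LOG, "a", newline="") as f:
            csv.writer(f).writerow([datetime.now(timezone.utc).isoformat(), question, score])
        return "I can't answer that reliably from the current knowledge assets."
    return generate(question, passages)
```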

4) Acting on “Hallucinations”

Aligned with the idea of filling gaps, a similar role for the AI governance group is to address hallucinations, or failures of the AI to deliver an accurate, consistent, and complete “answer.” For organizations attempting to implement enterprise AI, a hallucination is little more than a cute word for an error, and should be treated as such by the AI governance group. These errors have many causes, ranging from poor-quality (i.e., wrong, outdated, near-duplicate, or conflicting) knowledge assets, to insufficient semantic structure (e.g., taxonomy, ontology, or a business glossary), to poor logic built into the model itself. Any of these issues should be treated with immediate action. Your organization’s end users will quickly lose trust in an AI solution that delivers inaccurate results. Your governance model and associated organizational structure must be equipped to act quickly: first, to leverage communications and feedback channels to ensure your end users are telling you when they believe something is inaccurate or incomplete, and moreover, to diagnose and address it.

As a note, for the most mature organizations, this action won’t be entirely reactive: organizational subject matter experts will be involved in perpetuity, especially right before and after enterprise AI deployment, to hunt for errors in these systems. You can think of this governance function as the “Hallucination Killers” within your organization, likely to be one of the most critical roles as AI continues to expand.

5) Embedding Automation (Where It Makes Sense)

Finally, one of the most important roles of an AI governance group will be to use AI to make AI better. Almost everything we’ve described above can be automated. AI can and should be used to automate the identification of knowledge gaps, as well as to address those gaps by pinpointing organizational subject matter experts and targeting them to deliver their learning and experience at the right moments. It can also play a major role in helping to apply the appropriate semantic structure to knowledge, through tagging of taxonomy terms as metadata or identification of potential terms for inclusion in a business glossary. Central to all of this automation, however, is ensuring the ‘human is in the loop’; the AI governance group plays an advisory and oversight role throughout these automations to ensure the design doesn’t fall out of alignment. This element further facilitates AI governance coordination across the organization by supporting stakeholders and knowledge asset stewards through technical enablement.

All of this presents a world of possibility. Governance was historically one of the drier and more esoteric concepts within the field, often where good projects went bad. We have the opportunity to do governance better by leveraging AI in the areas where humans historically fell short, while maintaining the important role of human experts with the right authority to ensure organizational alignment and value.

If your AI efforts aren’t yet yielding the results you expected, or you’re seeking to get things started right from the beginning, contact EK to help you.

How to Ensure Your Content is AI Ready
https://enterprise-knowledge.com/how-to-ensure-your-content-is-ai-ready/

In 1996, Bill Gates declared “Content is King” because of its importance (and revenue generating potential) on the World Wide Web. Nearly 30 years later, content remains king, particularly when leveraged as a vital input for Enterprise AI. Having AI-ready content is critical to successful AI implementation because it decreases hallucinations and errors, improves the efficiency and scalability of the model, and ensures seamless integration with evolving AI technologies. Put simply: if your content isn’t AI-ready, your AI initiatives will fail, stall, or deliver low value.  

In a recent blog, “Top Ways to Get Your Content and Data Ready for AI,” Sara Mae O’Brien-Scott and Zach Wahl outlined an approach for ensuring your organization is ready to undertake an AI initiative. While the previous blog provided a broad view of AI-readiness for all types of knowledge assets collectively, this blog will leverage the same approach, zeroing in on actionable steps to ensure your content is ready for AI. Content, also known as unstructured information, is pervasive in every organization. In fact, for many organizations it comprises 80% to 90% of the total information held within the organization. Within that corpus of content, there is a massive amount of value, but there also tends to be chaos. We’ve found that most organizations should only be actively maintaining 15-20% of their unstructured information, with the rest being duplicate, near-duplicate, outdated, or completely incorrect. Without taking steps to clean it up, contextualize it, and ensure it is properly accessible to the right people, your AI initiatives will flounder. The steps we detail below will enable you to implement Enterprise AI at your organization, minimizing the pitfalls and struggles many organizations have encountered while trying to implement AI.

1) Understand What You Mean by “Content” (Knowledge Asset Definition) 

In a previous blog, we discussed the many types of knowledge assets organizations possess, how they can be connected, and the collective value they offer. Identifying content, or unstructured information, as one of the types of knowledge assets to be included in your organization’s AI solutions will be a foregone conclusion for most. However, that alone is insufficient to manage scope and understand what needs to be done to ensure your content is AI-ready. There are many types of content, held in varied repositories, with much likely sprawling on existing file drives and old document management systems. 

Before embarking on an AI initiative, it is essential to focus on the content that addresses your highest priority use cases and will yield the greatest value, recognizing that more layers can be added iteratively over time. To maximize AI effectiveness, it is critical to ensure the content feeding AI models aligns with real user needs and AI use cases. Misaligned content can lead to hallucinations, inaccurate responses, or poor user experiences. The following actions help define content and prepare it for AI:

  • Identify the types of content that are critical for priority AI use cases.
  • Work with Content Governance Groups to identify content owners for future inclusion in AI testing. 
  • Map end-to-end user journeys to determine where AI interacts with users and the content touchpoints that need to be referenced by AI applications.
  • Inventory priority content across enterprise-wide source systems, breaking knowledge asset silos and system silos.
  • Identify where different assets serve the same intent to flag potential overlap or duplication, helping AI applications ingest only relevant content and minimizing noise during AI model training.

What content means can vary significantly across organizations. For example, in a manufacturing company, content can take the form of operational procedures and inventory reports, while in a healthcare organization, it can include clinical case documentation and electronic health records. Understanding what content truly represents in an organization and identifying where it resides, often across siloed repositories, are the first steps toward enabling AI solutions to deliver complete and context-rich information to end users.

2) Ensure Quality (Content Cleanup)

Your AI model is only as good as what’s going into it. ‘Garbage in, garbage out’; ‘a house is only as steady as its foundation’: there are any number of ways to express that if the content going into an AI model lacks quality, the outputs will too. Strong AI starts with strong content. Below, we have detailed both manual and automated actions that can be taken to improve the quality of your content, thereby improving your AI outcomes.

Content Quality

Content created without regard for quality is common in the everyday workflow. While this content might serve business-as-usual processes, it can be detrimental to AI initiatives. Therefore, it’s crucial to address content quality issues within your repositories. Steps you can take to improve content quality and accelerate content AI readiness include:

  • Automate content cleanup processes by leveraging a combination of human-led and system-driven approaches, such as auto-tagging content for update, archival, or removal.
  • Scan and index content using automated processes to detect potential duplication by comparing titles, file sizes, metadata, and semantic similarity (see the sketch after this list).
  • Apply similarity analysis to define business rules for deleting, archiving, or modifying duplicate or near-duplicate content.
  • Use analytics to flag content that has low use or no use.
  • Combine analytics and content age to determine a retention cut-off (such as removing any content older than 2 years).
  • Leverage semantic tools like Named Entity Recognition (NER) and Natural Language Processing (NLP) to apply expert knowledge and determine the accuracy of content.
  • Use NLP to detect overly complex sentence structure and enterprise specific jargon that may reduce clarity or discoverability.
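A minimal sketch of the semantic-similarity pass referenced above; the `embed` function is a placeholder for any sentence-embedding model, and the 0.92 threshold is illustrative rather than a recommended value.

```python
import math

def embed(text: str) -> list[float]:
    # Placeholder: wire to any sentence-embedding model or API.
    raise NotImplementedError

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def find_near_duplicates(docs: dict[str, str], threshold: float = 0.92) -> list[tuple[str, str, float]]:
    """Flag document pairs whose embeddings are nearly identical; pairs above
    the threshold become candidates for the delete/archive/modify rules."""
    ids = list(docs)
    vectors = {doc_id: embed(docs[doc_id]) for doc_id in ids}
    pairs = []
    for i, a in enumerate(ids):                 # pairwise comparison; fine for
        for b in ids[i + 1:]:                   # small corpora, index for scale
            sim = cosine(vectors[a], vectors[b])
            if sim >= threshold:
                pairs.append((a, b, sim))
    return pairs
```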

Content Restructuring

In the blog Improve Enterprise AI with Semantic Content Management, we note that content in an organization exists on a continuum of structure depending on many factors. The same is true for the amount of content restructuring that may or may not need to happen to enable your AI use case. We recently saw with a client that introducing even basic structure to a document improved AI outcomes by almost 200%. However, this step requires clear goals and prioritization. Oftentimes, this part of ensuring your content is AI-ready happens iteratively: as the model is applied, you can determine what level of restructuring needs to occur to best improve AI outcomes. Restructuring content to prepare it for AI involves activities such as:

  • Apply tags, such as heading structures, to unstructured content to improve AI outcomes and enhance the end-user experience.
  • Use an AI-assisted check to validate that heading structures and tags are used appropriately and are machine readable, so that content can be ingested smoothly by AI systems (a simple version is sketched after this list).
  • Simplify and restructure content that has been identified as overly complex and could result in hallucinations or unsatisfactory responses generated by the AI model.
  • Focus on reformatting longer, text-heavy content to achieve a more linear, time-based, or topic-based flow and improve AI effectiveness. 
  • Develop repeatable structures that can be applied automatically to content during creation or retroactively to provide AI with relevant content in a consumable format. 
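A simple version of such a heading-structure check, assuming markdown-style headings; real documents may need format-specific parsing (e.g., DOCX styles or HTML tags) instead of this regex.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+\S")  # markdown-style headings

def heading_outline_issues(text: str) -> list[str]:
    """Check that a document has headings and that levels never skip
    (e.g., an H1 followed directly by an H3), which hurts machine readability."""
    issues, last_level = [], 0
    levels = [len(m.group(1)) for line in text.splitlines() if (m := HEADING.match(line))]
    if not levels:
        issues.append("no headings found: content is one unstructured block")
    for level in levels:
        if last_level and level > last_level + 1:
            issues.append(f"heading level jumps from H{last_level} to H{level}")
        last_level = level
    return issues

print(heading_outline_issues("# Policy\n### Scope\nBody text"))
# -> ['heading level jumps from H1 to H3']
```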

In brief, cleaning up and restructuring content assets improves machine readability of content and therefore allows the AI model to generate stronger and more accurate outputs. To prioritize assets that need cleanup and restructuring, focus on activities and resources that will yield the highest return on investment for your AI solution. However, it is important to recognize that this may vary significantly across organizations, industries, and AI use cases. For example, an organization with a truly cross-functional use case, such as enterprise search, may prioritize deduplication of content to ensure information from different business areas doesn’t conflict when providing AI-generated responses. On the other hand, an organization with a more function-specific use case, such as streamlining legal contract review, may prioritize more hands-on content restructuring to improve AI comprehension.

3) Fill Gaps (Tacit Knowledge Capture)

Even with high-quality content, knowledge gaps that exist in your full enterprise ecosystem can cause AI errors and introduce the risk of unreliable outcomes. Considering your AI use case, the questions you want to answer, the discovery you’ve completed in previous steps, and the actions detailed below you can start to identify and fill gaps that may exist. 

Content Coverage 

Even with the best content strategy, it is not uncommon for different types of content to “fall through the cracks” and be unavailable or inaccessible for any number of reasons. Many organizations “don’t know what they don’t know”, so it can be difficult to begin this process. However, it is crucial to be aware of these content gaps, particularly when using LLMs to avoid hallucinations. Actions you may take to ensure content coverage and accelerate your journey toward content AI readiness include: 

  • Leverage systems analytics to assess user search behavior and uncover content gaps. This may include unused content areas of a repository, abandoned search queries, or searches that returned no results (see the sketch after this list).
  • Identify content gaps by using taxonomy analytics to identify missing categories or underrepresented terms and as a result, determine what content should be included.
  • Leverage SMEs and other end users during AI testing to evaluate AI-generated responses and identify areas where content may be missing. 
  • Use AI governance to ensure the model is transparent and can communicate with the user when it is not able to find a satisfactory answer.
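A minimal sketch of mining search logs for repeated zero-result queries, one of the gap signals named above; the log schema and the repeat-count cutoff are assumptions for illustration.

```python
from collections import Counter

def zero_result_queries(search_log: list[dict], min_count: int = 5) -> list[tuple[str, int]]:
    """Surface frequently repeated queries that returned no results:
    strong signals of content gaps. Each log row is assumed to look like
    {'query': 'parental leave policy', 'result_count': 0}."""
    misses = Counter(
        row["query"].strip().lower()
        for row in search_log
        if row.get("result_count", 0) == 0
    )
    # Keep only queries repeated often enough to indicate a real gap.
    return [(q, n) for q, n in misses.most_common() if n >= min_count]
```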

Fill the Gap

Once missing content has been identified from the information sources feeding the AI model, the real challenge is to fill in those gaps to prevent “hallucinations” and avoid the user frustration that incomplete or inaccurate answers may generate. This may include creating new assets, locating existing ones, or applying other techniques that together can move the organization from AI to Knowledge Intelligence. Steps you may take to remediate the gaps and help your organization’s content be AI ready include:

  • Use link detection to uncover relationships across the content, identify knowledge that may exist elsewhere, and increase the likelihood of surfacing the right content. This can also inform later semantic tagging activities.
  • Identify, by analyzing content repositories, sources where content identified as “missing” could possibly exist.
  • Apply content transformation practices to “missing” content identified during the content repository analysis to ensure machine readability.
  • Conduct knowledge capture and transfer activities such as SME interviews, communities of practice, and collaborative tools to document tacit knowledge in the form of guides, processes, or playbooks. 
  • Institutionalize content that exists in private spaces that aren’t currently included in the repositories accessed by AI.
  • Create draft content using generative AI, making sure to include a human-in-the-loop step for accuracy. 
  • Acquire external content when gaps aren’t organization specific. Consider purchasing or licensing third-party content, such as research reports, marketing intelligence, and stock images.

By evaluating the content coverage for a particular use case, you can start to predict how well (or poorly) your AI model may perform. When critical content mostly exists in people’s heads, rather than in a documented, accessible format, the organization is exposed to significant risk. For example, for an organization deploying a customer-facing AI chatbot to help with case deflection in customer service centers, gaps in content can lead to false or misleading responses. If the chatbot tries to answer questions it wasn’t trained for, it could result in out-of-policy exceptions, financial loss, decreased customer trust, or lower retention due to inaccurate, outdated, or non-existent information. This example highlights why it is so important to identify and fill knowledge gaps to ensure your content is ready for AI.

4) Add Structure and Context (Semantic Components)

Once you have identified the relevant content for an AI solution, ensured its quality for AI, and addressed major content gaps for your AI use cases, the next step in getting content ready for AI involves adding structure and context to content by leveraging semantic components. Taxonomy and metadata models provide the foundational structure needed to categorize unstructured content and provide meaningful context. Business glossaries ensure alignment by defining terms for shared understanding, while ontology models provide contextual connections needed for AI systems to process content. The semantic maturity of all of these models is critical to achieve successful AI applications. 

Semantic Maturity of Taxonomy and Business Glossaries

Some organizations struggle with the state of their taxonomies when starting AI-driven projects. Organizations must actively design and manage taxonomies and business glossaries to properly support AI-driven applications and use cases. This is not only essential for short-term implementation of the AI solution, but most importantly for long-term success. Standardization and centralization of these models help balance organization-wide needs and domain-specific needs. Properly structured and annotated taxonomies are instrumental in preparing content for AI. Taking the following actions will ensure that you have the Semantic Maturity of Taxonomies and Business Glossaries needed to achieve AI ready content:

  • Balance taxonomies across business areas to ensure organization-wide standardization, enabling smooth implementation of AI use cases and seamless integration of AI applications. 
  • Design hierarchical taxonomy structures with the depth and breadth needed to support AI use cases.
  • Refine concepts and alternative terms (synonyms and acronyms) in the taxonomy to more adequately describe and apply to priority AI content.
  • Align taxonomies with usability standards, such as ANSI/NISO Z39.19, and interoperability/machine-readability standards, such as SKOS, so that taxonomies are both human and machine readable (see the sketch after this list).
  • Incorporate definitions and usage notes from an organizational business glossary into the taxonomy to enrich meaning and improve semantic clarity.
  • Store and manage taxonomies in a centralized Taxonomy Management System (TMS) to support scalable AI readiness.
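As a hedged illustration of a SKOS-conformant, machine-readable taxonomy term, the sketch below uses the rdflib library; the namespace and the "Green Account" concept are invented for the example.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import SKOS

EX = Namespace("https://example.org/taxonomy/")  # hypothetical namespace

g = Graph()
g.bind("skos", SKOS)

term = EX.GreenAccount
g.add((term, SKOS.prefLabel, Literal("Green Account", lang="en")))
g.add((term, SKOS.altLabel, Literal("Environmentally Certified Account", lang="en")))
g.add((term, SKOS.definition, Literal(
    "An account that has met the organization's environmental standard.", lang="en")))
g.add((term, SKOS.broader, EX.Account))  # hierarchical placement in the taxonomy

print(g.serialize(format="turtle"))  # rdflib 6+ returns the Turtle as a string
```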

Semantic Maturity of Metadata 

Before content can effectively support AI-driven applications, organizations must also establish metadata practices to ensure that content has been sufficiently described and annotated. This involves not only establishing shared or enterprise-wide coordinated metadata models, but more importantly, applying complete and consistent metadata to that content. The following actions will ensure that the Semantic Maturity of your Metadata model meets the standards required for content to be AI ready:

  • Structure metadata models to meet the requirements of AI use cases, helping derive meaningful insights from tagged content.
  • Design metadata models that accurately represent different knowledge asset types (types of content) associated with priority AI use cases.
  • Apply metadata models consistently across all content source systems to enhance findability and discoverability of content in AI applications. 
  • Document and regularly update metadata models.
  • Store and manage metadata models in a centralized semantic repository to ensure interoperability and scalable reuse across AI solutions.

Semantic Maturity of Ontology

Just as with taxonomies, metadata, and business glossaries, developing semantically rich and precise ontologies is essential to achieve successful AI applications and to enable Knowledge Intelligence (KI) or explainable AI. Ontologies must be sufficiently expressive to support semantic enrichment, traceability, and AI-driven reasoning. They must be designed to accurately represent key entities, their properties, and relationships in ways that enable consistent tagging, retrieval, and interpretation across systems and AI use cases. By taking the following actions, your ontology model will achieve the level of semantic maturity needed for content to be AI ready:

  • Ensure ontologies accurately describe the knowledge domain for the in-scope content.
  • Define key entities, their attributes, and relationships in a way that supports AI-driven classification, recommendation, and reasoning.
  • Design modular and extensible ontologies for reuse across domains, applications, and future AI use cases.
  • Align ontologies with organizational taxonomies to support semantic interoperability across business areas and content source systems.
  • Annotate ontologies with rich metadata for human and machine readability.
  • Adhere to ontology standards such as OWL, RDF, or SHACL for interoperability with AI tools.
  • Store ontologies in a central ontology management system for machine readability and interoperability with other semantic models.

Preparing content for AI is not just about organizing information, it’s about making it discoverable, valuable, and usable. Investing in semantic models and ensuring a consistent content structure lays the foundation for AI to generate meaningful insights. For example, if an organization wants to deliver highly personalized recommendations that connect users to specific content, building customized taxonomies, metadata models, business glossaries, and ontologies not only maximizes the impact of current AI initiatives, but also future-proofs content for emerging AI-driven use cases.

5) Semantic Model Application (Content Tagging)

Designing structured semantic models is just one part of preparing content for AI. Equally important is the consistent application of complete, high-quality metadata to organization-wide content. Metadata enrichment of unstructured content, especially across siloed repositories, is critical for enabling AI-powered systems to reliably discover, interpret, and utilize that content. The following actions to enhance the application of content tags will help you achieve content AI readiness:

  • Tag unstructured content with high-quality metadata to enhance interpretability in AI systems.
  • Ensure each piece of relevant content for the AI solution is sufficiently annotated, or in other words, it is labeled with enough metadata to describe its meaning and context. 
  • Promote consistent annotation of content across business areas and systems using tags derived from a centralized and standardized taxonomy. 
  • Leverage mechanisms, like auto-tagging, to enhance the speed and coverage of content tagging. 
  • Include a human-in-the-loop step in the auto-tagging process to improve the accuracy of content tagging (as sketched below).
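A minimal sketch of that human-in-the-loop routing: tags the model is confident about are applied automatically, while uncertain ones are queued for review. The 0.8 threshold and tag scores are illustrative.

```python
REVIEW_THRESHOLD = 0.8  # tags below this confidence go to a human reviewer

def route_tags(doc_id: str, scored_tags: dict[str, float]) -> dict[str, list[str]]:
    """Split auto-tags into those applied automatically and those queued
    for human review, keeping a person in the loop on uncertain calls."""
    auto = [t for t, s in scored_tags.items() if s >= REVIEW_THRESHOLD]
    review = [t for t, s in scored_tags.items() if s < REVIEW_THRESHOLD]
    return {"apply": auto, "queue_for_review": review}

print(route_tags("doc-123", {"Green Account": 0.95, "Member Services": 0.55}))
# -> {'apply': ['Green Account'], 'queue_for_review': ['Member Services']}
```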

Consistent content tagging provides an added layer of meaning and context that AI can use to deliver more complete and accurate answers. For example, an organization managing thousands of unstructured content assets across disparate repositories and aiming to deliver personalized content experiences to end users, can more effectively tag content by leveraging a centralized taxonomy and an auto-tagging approach. As a result, AI systems can more reliably surface relevant content, extract meaningful insights, and generate personalized recommendations.

6) Address Access and Security (Unified Entitlements)

As Joe Hilger mentioned in his blog about unified entitlements, “successful semantic solutions and knowledge management initiatives help the right people see the right information at the right time.” But to achieve this, access permissions must be in place so that only authorized individuals have visibility into the appropriate content. Unfortunately, many organizations still maintain content in old repositories that don’t have the right features or processes to secure it, creating a significant risk for organizations pursuing AI initiatives. Therefore, now more than ever, it is important to properly secure content by defining and applying entitlements, preventing access to highly sensitive content by unauthorized people and as a result, maintaining trust across the organization. The actions outlined below to enhance Unified Entitlements will accelerate your journey toward content AI readiness:

  • Define an enterprise-wide entitlement framework to apply security rules consistently across content assets, regardless of the source system (illustrated in the sketch after this list).
  • Automate security by enforcing privileges across all systems and types of content assets using a unified entitlements solution.
  • Leverage AI governance processes to ensure that content creators, managers, and owners are aware of the entitlements on content they handle that needs to be consumed by AI applications.
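A hedged sketch of a unified entitlement check applied before retrieved assets ever reach an AI solution's context window; the group names, classifications, and policy shape are invented for illustration.

```python
# Hypothetical centralized policy: group -> set of content classifications it may see.
POLICY = {
    "all-employees": {"public", "internal"},
    "hr-team": {"public", "internal", "hr-confidential"},
}

def user_can_see(user_groups: set[str], content_classification: str) -> bool:
    """Single entitlement check applied uniformly, regardless of which
    source system (SharePoint, CMS, file share) holds the asset."""
    return any(content_classification in POLICY.get(g, set()) for g in user_groups)

def filter_for_ai(user_groups: set[str], retrieved_assets: list[dict]) -> list[dict]:
    """Drop assets the requesting user isn't entitled to before the AI
    solution sees them, so answers never leak restricted content."""
    return [a for a in retrieved_assets if user_can_see(user_groups, a["classification"])]

print(user_can_see({"all-employees"}, "hr-confidential"))  # False
```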

Entitlements are important because they ensure that content remains consistent, trustworthy, and reusable for AI systems. For example, if an organization developing a Generative AI solution stores documents and web content about products and clients across multiple SharePoint sites, content management systems, and webpages, inconsistent application of entitlements may represent a legal or compliance risk, potentially exposing outdated, or even worse, highly sensitive content to the wrong people. On the other hand, the correct definition and application of access permissions through a unified entitlements solution plays a key role in mitigating that risk, enabling operational integrity and scalability, not only for the intended Generative AI solution, but also for future AI initiatives.

7) Maintain Quality While Iteratively Improving (Governance)

Effective governance for AI solutions can be very complex because it requires coordination across systems and groups, not just within them, especially among content governance, semantic governance, and AI governance groups. This coordination is essential to ensure content remains up to date and accessible for users and AI solutions, and that semantic models are current and centrally accessible. 

AI Governance for Content Readiness 

Content Governance 

Not all organizations have supporting organizational structures with defined roles and processes to create, manage, and govern content that is aligned with cross-organizational AI initiatives. The existence of an AI Governance for Content Readiness Group ensures coordination with the traditional Content Governance Groups and provides guidance to content owners of the source systems on how to get content AI ready to support priority AI use cases. By taking the following actions, the AI Governance for Content Readiness Group will help ensure that you have the content governance practices required to achieve AI-ready content:

  • Define how content should be captured and managed in a way that is consistent, predictable, and interoperable for AI use cases.
  • Incorporate in your AI solution roadmap a step, delivered through the Content Governance Groups, to guide content owners of the source systems on what is required to get content AI ready for inclusion in AI models.
  • Provide guidance to the Content Governance Group on how to train and communicate with system owners and asset owners on how to prepare content for AI.
  • Take the technical and strategic steps necessary to connect content source systems to AI systems for effective content ingestion and interpretation.
  • Coordinate with the Content Governance Group to develop and adopt content governance processes that address content gaps identified through the detection of bias, hallucinations, misalignment, or unanswered questions during AI testing.
  • Automate AI governance processes leveraging AI to identify content gaps, auto-tag content, or identify new taxonomy terms for the AI solution.

Semantic Models Governance

Similar to the importance of coordinating with the content governance groups, coordinating with semantic models governance groups is key for AI readiness. This involves establishing roles and responsibilities for the creation, ownership, management, and accountability of semantic models (taxonomy, metadata, business glossary, and ontology models) in relation to AI initiatives. This also involves providing clear guidance for managing changes in the models and communicating updates to those involved in AI initiatives. By taking the following actions, the AI Governance for Content Readiness Group will help ensure that your organization has the semantic governance practices required to achieve AI-ready content: 

  • Develop governance structures that support the development and evolution of semantic models in alignment with both existing and emerging AI initiatives.
  • Align governance roles (e.g. taxonomists, ontologists, semantic engineers, and AI engineers) with organizational needs for developing and maintaining semantic models that support enterprise-wide AI solutions.
  • Ensure that the systems used to manage taxonomies, metadata, and ontologies support enforcing permissions for accessing and updating the semantic models.
  • Work with the Semantic Models Governance Groups to develop processes that help remediate gaps in the semantic models uncovered during AI testing. This includes providing guidance on the recommended steps for making changes, suggested decision-makers, and implementation approaches.
  • Work with the Semantic Models Governance Groups to establish metrics and processes to monitor, tune, refine, and evolve semantic models throughout their lifecycle and stay up to date with AI efforts.
  • Coordinate with the Semantic Models Governance Groups to develop and adopt processes that address semantic model gaps identified through the detection of bias, hallucinations, misalignment, or unanswered questions during AI solution testing.

For example, imagine an organization is developing business taxonomies and ontologies that represent skills, job roles, industries, and topics to support an Employee 360 View solution. It is essential to have a governance model in place with clearly defined roles, responsibilities, and processes to manage and evolve these semantic models as the AI solutions team ingests content from diverse business areas and detects gaps during AI testing. Therefore, coordination between the AI Governance for Content Readiness Group and the Semantic Models Governance Groups helps ensure that concepts, definitions, entities, properties, and relationships remain current and accurately reflect the knowledge domain for both today’s needs and future AI use cases.  

Conclusion

Unstructured content remains one of the most common knowledge assets in organizations. Getting that content ready to be ingested by AI applications is a balancing act. By cleaning it up, filling in gaps, applying rich semantic models to add structure and context, securing it with unified entitlements, and leveraging AI governance, organizations will be better positioned to succeed in their own AI journey. We hope after reading this blog you have a better understanding of the actions you can take to ensure your organization’s content is AI ready. If you want to learn how our experts can help you achieve Content AI Readiness, contact us at info@enterprise-knowledge.com.

Top Ways to Get Your Content and Data Ready for AI
https://enterprise-knowledge.com/top-ways-to-get-your-content-and-data-ready-for-ai/

As artificial intelligence has quickly moved from science fiction, to pervasive internet reality, and now to standard corporate solutions, we consistently get the question, “How do I ensure my organization’s content and data are ready for AI?” Pointing your organization’s new AI solutions at the “right” content and data is critical to AI success and adoption, and failing to do so can quickly derail your AI initiatives.

Though the world is enthralled with the myriad of public AI solutions, many organizations struggle to make the leap to reliable AI within their organizations. A recent MIT report, “The GenAI Divide,” reveals a concerning truth: despite significant investments in AI, 95% of organizations are not seeing any benefits from their AI investments. 

One of the core impediments to achieving AI within your own organization is poor-quality content and data. Without the proper foundation of high-quality content and data, any AI solution will be rife with ‘hallucinations’ and errors. This will expose organizations to unacceptable risks, as AI tools may deliver incorrect or outdated information, leading to dangerous and costly outcomes. This is also why tools that perform well as demos fail to make the jump to production. Even the most advanced AI won’t deliver acceptable results if an organization has not prepared its content and data.

This blog outlines seven top ways to ensure your content and data are AI-ready. With the right preparation and investment, your organization can successfully implement the latest AI technologies and deliver trustworthy, complete results.

1) Understand What You Mean by “Content” and/or “Data” (Knowledge Asset Definition)

While it seems obvious, the first step to ensuring your content and data are AI-ready is to clearly define what “content” and “data” mean within your organization. Many organizations use these terms interchangeably, while others use one as a parent term of the other. This obviously leads to a great deal of confusion. 

Leveraging the traditional definitions, we define content as unstructured information (ranging from files and documents to blocks of intranet text), and data as structured information (namely the rows and columns in databases and other applications like Customer Relationship Management systems, People Management systems, and Product Information Management systems). You are wasting the potential of AI if you are not applying it to both content and data, giving end users complete and comprehensive information. In fact, we encourage organizations to think even more broadly, going beyond just content and data to consider all the organizational assets that can be leveraged by AI.

We’ve coined the term knowledge assets to express this. Knowledge assets comprise all the information and expertise an organization can use to create value. This includes not only content and data, but also the expertise of employees, business processes, facilities, equipment, and products. This manner of thinking quickly breaks down artificial silos within organizations, getting you to consider your assets collectively, rather than by type. Moving forward in this article, we’ll use the term knowledge assets in lieu of content and data to reinforce this point. Put simply and directly, each of the below steps to getting your content and data AI-ready should be considered from an enterprise perspective of knowledge assets, so rather than discretely developing content governance and data governance, you should define a comprehensive approach to knowledge asset governance. This approach will not only help you achieve AI-readiness, it will also help your organization to remove silos and redundancies in order to maximize enterprise efficiency and alignment.


2) Ensure Quality (Asset Cleanup)

We’ve found that most organizations are maintaining approximately 60-80% more information than they should and, in many cases, may not even be aware of what they still have. At the high end, that means four out of five knowledge assets are old, outdated, duplicate, or near-duplicate.

There are many costs to this over-retention before even considering AI, including the administrative burden of maintaining this excess (along with the cost and environmental impact of unnecessary server storage), and the usability and findability cost to the organization’s end users when they must sift through obsolete knowledge assets.

The AI cost becomes even higher for several reasons. First, AI typically “white labels” the knowledge assets it finds. If a human were to find an old and outdated policy, they might recognize the old corporate branding or note the date from several years ago; but when AI leverages the information within that knowledge asset and resurfaces it, the result looks new and those contextual clues are lost.

Next, we have to consider the old adage of “garbage in, garbage out.” Incorrect knowledge assets fed to an AI tool will result in incorrect results, also known as hallucinations. While prompt engineering can be used to try to avoid these conflicts and potentially even errors, the only surefire guarantee is to ensure the accuracy of the original knowledge assets, or at least the vast majority of them.

Many AI models also struggle with near-duplicate knowledge assets, unable to discern which version is trusted. Consider your organization’s version control issues, working documents, data modeled with different assumptions, and iterations of large deliverables and reports that are all currently stored. Knowledge assets may go through countless iterations, and most of the time, all of these versions are saved. When ingested by AI, multiple versions present potential confusion and conflict, especially when these versions didn’t simply build on each other but were edited to improve findings or recommendations. Each of these is an opportunity for AI to fail your organization.
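As an illustration of how this cleanup can be jumpstarted, below is a minimal sketch of near-duplicate detection using TF-IDF vectors and cosine similarity. The sample documents, similarity threshold, and review workflow are illustrative assumptions, not a prescription for any particular toolset.

```python
# Minimal sketch: flagging near-duplicate documents during content cleanup.
# Assumes scikit-learn is installed; the 0.9 threshold is illustrative and
# should be tuned against a sample of your own corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = {
    "policy_2019.txt": "Employees must submit travel requests two weeks in advance...",
    "policy_2023.txt": "Employees must submit travel requests ten days in advance...",
    "onboarding.txt": "New hires complete orientation during their first week...",
}

names = list(documents)
vectors = TfidfVectorizer(stop_words="english").fit_transform(documents.values())
similarity = cosine_similarity(vectors)

# Report pairs above the threshold for human review -- automated deletion is
# risky, so a reviewer decides which version is the trusted one.
THRESHOLD = 0.9
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if similarity[i, j] >= THRESHOLD:
            print(f"Possible near-duplicates: {names[i]} <-> {names[j]} "
                  f"(score {similarity[i, j]:.2f})")
```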

Finally, this is also the point at which to consider restructuring your assets for improved readability, by both humans and machines. From a human perspective, this could include formatting improvements to lower cognitive lift and improve consistency. For both humans and AI, it could also mean adding text and tags to better describe images and other non-text-based elements. From an AI perspective, proximity and order can have a negative impact on precision in longer and more complex assets, so this could include restructuring documents to make them more linear, chronological, or topically aligned. This is not necessary or even important for all types of assets, but it remains an important consideration, especially for longer, text-based assets.


3) Fill Gaps (Tacit Knowledge Capture)

The next step to ensure AI readiness is to identify your gaps. At this point, you should be looking at your AI use cases and considering the questions you want AI to answer. In many cases, your current repositories of knowledge assets will not have all of the information necessary to answer those questions completely, especially in a structured, machine-readable format. This presents a risk in itself, especially if the AI solution is unaware that it lacks the complete range of knowledge assets necessary and portrays incomplete or limited answers as definitive.

Filling gaps in knowledge assets is extremely difficult. The first step is to identify what is missing. To invoke another old adage, organizations have long worried that they “don’t know what they don’t know,” meaning they lack the organizational maturity to identify gaps in their own knowledge. This becomes a major challenge when proactively seeking to arm an AI solution with all the knowledge assets necessary to deliver complete and accurate answers. The good news, however, is that the process of getting knowledge assets AI-ready helps to identify gaps. In the next two sections, we cover semantic design and tagging. These steps, among others, can reveal where knowledge assets appear to be missing. In addition, given the iterative nature of designing and deploying AI solutions, the inability of AI to answer a question can trigger gap filling, as we cover later.

Of course, once you’ve identified the gaps, the real challenge begins, in that the organization must then generate new knowledge assets (or locate “hidden” assets) to fill those gaps. There are many techniques for this, ranging from tacit knowledge capture to content inventories, which collectively can help an organization move from AI to Knowledge Intelligence (KI).


4) Add Structure and Context (Semantic Components)

Once the knowledge assets have been cleansed and gaps have been filled, the next step in the process is to structure them so that they can be related to each other correctly, with the appropriate context and meaning. This requires the use of semantic components, specifically, taxonomies and ontologies. Taxonomies deliver meaning and structure, helping AI to understand queries from users, relate knowledge assets based on the relationships between the words and phrases used within them, and leverage context to properly interpret synonyms and other “close” terms. Taxonomies can also house glossaries that further define words and phrases that AI can leverage in the generation of results.
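To make this concrete, here is a minimal sketch of how a taxonomy fragment with synonyms might be used to normalize the terms AI encounters into controlled concepts. The concepts and synonyms are invented for illustration; a production solution would manage these in a dedicated taxonomy and ontology management system rather than in code.

```python
# Minimal sketch: a taxonomy fragment with preferred labels and synonyms,
# used to resolve the varied words in queries and documents to one concept.
# The terms below are illustrative, not a real client taxonomy.
TAXONOMY = {
    "Artificial Intelligence": ["AI", "machine intelligence"],
    "Knowledge Management": ["KM", "knowledge sharing"],
    "Taxonomy": ["controlled vocabulary", "term list"],
}

# Build a reverse index so any synonym resolves to its preferred concept.
synonym_index = {}
for concept, synonyms in TAXONOMY.items():
    synonym_index[concept.lower()] = concept
    for s in synonyms:
        synonym_index[s.lower()] = concept

def normalize(term: str) -> str | None:
    """Resolve a raw term from a query or document to its preferred concept."""
    return synonym_index.get(term.lower())

print(normalize("KM"))                     # -> "Knowledge Management"
print(normalize("controlled vocabulary"))  # -> "Taxonomy"
```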

Though often confused or conflated with taxonomies, ontologies deliver a much more advanced type of knowledge organization, which is both complementary to taxonomies and unique. Ontologies focus on defining relationships between knowledge assets and the systems that house them, enabling AI to make inferences. For instance:

<Person> works at <Company>

<Zach Wahl> works at <Enterprise Knowledge>

<Company> is expert in <Topic>

<Enterprise Knowledge> is expert in <AI Readiness>

From this, a simple inference based on structured logic can be made, which is that the person who works at the company is an expert in the topic: Zach Wahl is an expert in AI Readiness. More detailed ontologies can quickly fuel more complex inferences, allowing an organization’s AI solutions to connect disparate knowledge assets within an organization. In this way, ontologies enable AI solutions to traverse knowledge assets, more accurately make “assumptions,” and deliver more complete and cohesive answers. 
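A minimal sketch of this inference pattern follows. The hard-coded rule and triples are illustrative only; a production ontology would express this logic in a standard like OWL/RDF and rely on a reasoner rather than hand-written loops.

```python
# Minimal sketch: deriving the inference described above from explicit triples.
triples = [
    ("Zach Wahl", "works_at", "Enterprise Knowledge"),
    ("Enterprise Knowledge", "expert_in", "AI Readiness"),
]

# Rule: if <Person> works at <Company> and <Company> is expert in <Topic>,
# infer that <Person> is an expert in <Topic>.
inferred = []
for person, p1, company in triples:
    if p1 != "works_at":
        continue
    for subj, p2, topic in triples:
        if subj == company and p2 == "expert_in":
            inferred.append((person, "expert_in", topic))

print(inferred)  # [('Zach Wahl', 'expert_in', 'AI Readiness')]
```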

Collectively, you can consider these semantic components an organizational map of what the organization does, who does it, and how. Semantic components can show an AI how to get where you want it to go without getting lost or taking wrong turns.

5) Semantic Model Application (Tagging)

Of course, it is not sufficient simply to design the semantic components; you must complete the process by applying them to your knowledge assets. If the semantic components are the map, applying semantic components as metadata is the GPS that allows you to use it easily and intuitively. This step is commonly a stumbling block for organizations, and again is why we are discussing knowledge assets rather than discrete areas like content and data. To best achieve AI readiness, all of your knowledge assets, regardless of their state (structured, unstructured, semi-structured, etc.), must have consistent metadata applied to them.

When applied properly, this consistent metadata becomes an additional layer of meaning and context for AI to leverage in pursuit of complete and correct answers. With the latest updates to leading taxonomy and ontology management systems, the process of automatically applying metadata or storing relationships between knowledge assets in metadata graphs is vastly improved, though it still requires a human in the loop to ensure accuracy. Even so, what used to be a major hurdle in metadata application initiatives is now much simpler.
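As a rough sketch of what the final write-back step might look like, the snippet below applies reviewed tags to assets. The data shapes, the review flag, and the print placeholder standing in for a real CMS API call are assumptions for illustration; real systems (SharePoint, Drupal, etc.) each have their own schemas.

```python
# Minimal sketch: applying stored auto-tagging results back to a source system.
import json

tagging_results = [
    {"asset_id": "doc-001", "tags": ["AI Readiness", "Governance"], "reviewed": True},
    {"asset_id": "doc-002", "tags": ["Taxonomy"], "reviewed": False},
]

def apply_tags(result: dict) -> None:
    # Keep a human in the loop: only write back tags a reviewer approved.
    if not result["reviewed"]:
        print(f"Skipping {result['asset_id']}: awaiting human review")
        return
    # Placeholder for the real write-back call, e.g. a PATCH to the CMS API.
    print(f"Tagging {result['asset_id']} -> {json.dumps(result['tags'])}")

for r in tagging_results:
    apply_tags(r)
```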


6) Address Access and Security (Unified Entitlements)

What happens when you finally deliver what your organization has been seeking, and give it the ability to collectively and completely serve end users the knowledge assets they need? If this step is skipped, the answer is calamity. One of the key selling points of AI is that it can uncover hidden gems in knowledge assets, make connections humans typically can’t, and combine disparate sources to build new knowledge assets and new answers within them. This is incredibly exciting, but it also presents a massive organizational risk.

At present, many organizations have an incomplete, or frankly poor, model for entitlements: ensuring the right people see the right assets, and the wrong people do not. We consistently discover highly sensitive knowledge assets in various forms on organizational systems that should be secured but are not. Some of this takes the form of a discrete document or a row of data in an application, which is surprisingly common but relatively easy to address. Even more of it is only visible when you take an enterprise view of the organization.

For instance, Database A might contain anonymized health information about employees for insurance reporting purposes but maps to discrete unique identifiers. File B includes a table of those unique identifiers mapped against employee demographics. Application C houses the actual employee names and titles for the organizational chart, but also includes their unique identifier as a hidden field. The vast majority of humans would never find this connection, but AI is designed to do so and will unabashedly generate a massive lawsuit for your organization if you’re not careful.
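The sketch below reproduces that scenario with three toy tables. The table and column names are invented, but the join shows how individually “safe” sources combine into a sensitive record.

```python
# Minimal sketch of the re-identification risk described above: three sources
# that are each innocuous alone but join into a sensitive record.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE health_reporting (emp_uid TEXT, condition TEXT);       -- Database A
CREATE TABLE demographics     (emp_uid TEXT, age INTEGER);          -- File B
CREATE TABLE org_chart        (emp_uid TEXT, name TEXT, title TEXT);-- Application C
INSERT INTO health_reporting VALUES ('u-117', 'chronic condition');
INSERT INTO demographics     VALUES ('u-117', 52);
INSERT INTO org_chart        VALUES ('u-117', 'Jane Doe', 'Director');
""")

-- comment placeholder (see note below)
""" if False else None

# A join no individual employee would think to make, but one an AI traversing
# all three sources effectively performs -- linking a name to "anonymized"
# health data.
cur.execute("""
SELECT o.name, o.title, d.age, h.condition
FROM health_reporting h
JOIN demographics d ON d.emp_uid = h.emp_uid
JOIN org_chart o    ON o.emp_uid = h.emp_uid
""")
print(cur.fetchall())  # [('Jane Doe', 'Director', 52, 'chronic condition')]
```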

If you have security and entitlement issues with your existing systems (and trust me, you do), AI will inadvertently discover them, connect the dots, and surface knowledge assets and connections between them that could be truly calamitous for your organization. Any AI readiness effort must confront this challenge, before your AI solutions shine a light on your existing security and entitlements issues.


7) Maintain Quality While Iteratively Improving (Governance)

Steps one through six describe how to get your knowledge assets ready for AI, but the final step gets your organization ready for AI. With a massive investment both in getting your knowledge assets into the right state for AI and in the AI solution itself, the final step is to ensure the ongoing quality of both. Mature organizations will invest in a core team to ensure knowledge assets go from AI-ready to AI-mature, including:

  • Maintaining and enforcing the core tenets to ensure knowledge assets stay up-to-date and AI solutions are looking at trusted assets only;
  • Reacting to hallucinations and unanswerable questions to fill gaps in knowledge assets; 
  • Tuning the semantic components to stay up to date with organizational changes.

The most mature organizations, those wishing to become AI-Powered organizations, will look first to their knowledge assets as the key building block to drive success. Those organizations will seek ROCK (Relevant, Organizationally Contextualized, Complete, and Knowledge-Centric) knowledge assets as the foundation for delivering Enterprise AI that can be truly transformative for the organization.

If you’re seeking help to ensure your knowledge assets are AI-Ready, contact us at info@enterprise-knowledge.com.

The post Top Ways to Get Your Content and Data Ready for AI appeared first on Enterprise Knowledge.

Auto-Classification for the Enterprise: When to Use AI vs. Semantic Models https://enterprise-knowledge.com/auto-classification-when-ai-vs-semantic-models/ Tue, 26 Aug 2025 18:19:23 +0000

Auto-classification is a valuable process for adding context to unstructured content. Nominally speaking, some practitioners distinguish between auto-classification (placing content into pre-defined categories from a taxonomy) and auto-tagging (assigning unstructured keywords or metadata, sometimes generated without a taxonomy). In this article, I use ‘auto-classification’ in the broader sense, encompassing both approaches. While it can take many forms, its primary purpose remains the same: to automatically enrich content with metadata that improves findability, helps users immediately determine relevance, and provides crucial information on where content came from and when it was made. And while tagging content is always a recommended practice, it is not always scalable when human time and effort are required to perform it. To solve this problem, we have been helping organizations automate the process and minimize the amount of manual effort required, especially in the age of AI, where organized and well-labeled information is the key to success.

This includes designing and implementing auto-classification solutions that save time and resources – using methods such as natural language processing, machine learning, and rapidly evolving AI models such as large language models (LLMs). In this article, I will demonstrate how auto-classification processes can deliver measurable value to organizations of all sizes and industries, using real-world examples to illustrate the costs and benefits. I will then give an overview of common methods for performing auto-classification, comparing their high-level strengths and weaknesses, and conclude by discussing how incorporating semantics can significantly enhance the performance of these methods.

How Can Auto-Classification Help My Organization?

It’s a good bet that your organization possesses a large repository of unstructured information such as documents, process guides, and informational resources, either meant for internal use or for display on a public webpage. Such a collection of knowledge assets is valuable – but only as valuable as the organization’s ability to effectively access, manage, and utilize them. That’s where auto-classification can shine: by serving as an automated processor of your organization’s unstructured content and applying tags, an auto-classifier quickly adds structure that provides value in multiple ways, as outlined below.

Time Savings

First, an auto-classifier saves content creators time in two key ways. Manually reading through documents and applying metadata tags to each one individually is tedious, taking time away from content creators’ other responsibilities; auto-classification frees up that time for more crucial tasks. On the other end of the process, auto-classification and the use of metadata tags improve findability, saving employees time when searching for documents. When paired with a taxonomy or set list of terms, an auto-classifier can also standardize the search experience by ensuring content is consistently tagged with standard language.

Content Management and Strategy

These standard tags can also play a role in more content strategy-focused efforts, such as identifying gaps in content and content deduplication. For example, if some taxonomy terms feature no associated content, content strategists and managers may identify an organizational gap that needs to be filled via the authoring of new content. In contrast, too many content pieces identified as having similar themes can be deduplicated so that the most valuable content is prioritized for end users. These analytics-based decisions can help organizations maximize the efficacy of their content, increase content reach, and cut down on the cost of storing duplicate content. 

Ensuring Security

Finally, we have seen auto-classification play a key role in keeping sensitive content and information secure. Auto-classifiers can determine what content should be tagged with certain sensitivity classifications (for example, employee addresses being tagged as visible by HR only). One example of this is through dark data detection, where an auto-classifier parses through all organizational content to identify information that should not be visible to all end users. Assigning sensitivity classifications to content through auto-tagging can help to automatically address security concerns and ensure regulatory compliance, saving organizations from the reputational and legal costs associated with data leaks. 

Common Auto-Classification Methods

An infographic of the six common auto-classification methods: rules-based tagging, regular expression (RegEx) tagging, frequency-based tagging, natural language processing, machine learning-based tagging, and LLM-based tagging

So, how do we go about tagging content automatically? Organizations can choose to employ one of a number of methods as a standalone solution, or combine them as part of a hybrid solution. Below, I will give a high-level overview of six of the most commonly used methods in auto-classification, along with some considerations for each; a brief sketch of two of the simpler pattern-based methods follows the list.

1. Rules-Based Tagging: Uses deterministic rules to map content to tags. Rules can be built from dictionaries/keyword lists, proximity or co-occurrence patterns (e.g., “treatment” within 10 words of “disorder”), metadata values (author, department), or structural cues (headings, templates).

  • Considerations: Highly transparent and auditable; great for regulated/compliance use cases and domain terms with stable phrasing. However, rules can be brittle, require ongoing maintenance, and may miss implied meaning or novel phrasing unless rules are continually expanded.

2. Regular Expression (RegEx) Tagging: A specialized form of rules-based tagging that applies RegEx patterns to detect and tag structured strings (for example, SKUs, case numbers, ICD-10 codes, dates, or email addresses).

  • Considerations: Excellent precision for well-formed patterns and semi-structured content; lightweight and fast. Can produce false positives without careful validation of results. Best combined with other methods (such as frequency or NLP) for context checks.

3. Frequency-Based Tagging: Frequency-based tagging considers the number of times a certain term (or variations of that term) appears in a document, and assigns the most frequently appearing tags to the content. Early search engines, website indexers, and tag-mining software relied heavily on this approach for its simplicity and transparency; however, the frequency of a term does not always guarantee its importance.

  • Considerations: Works well with a well-structured taxonomy that has ample synonyms for terms, and with content in which key terms appear frequently. Not as strong a method when meaning is implied, terms are not explicitly used, or terms are excessively repeated.

4. Natural Language Processing (NLP): Uses techniques such as tokenization and similarity scoring to find the best matches in meaning between two pieces of text (such as a content piece and terms in a taxonomy).

  • Considerations: Can work well for terms that are not organization- or domain-specific, but struggles with acronyms and more specialized terms. Better than frequency-based tagging at determining implied meaning.

5. Machine Learning-Based Tagging: Machine learning methods allow for the training of models on pre-tagged content, empowering organizations to improve models iteratively for better results. By comparing new content against patterns they have already learned, machine learning models can infer the most relevant concepts and tags for a content piece and apply them consistently. User input can help refine the classifier to identify patterns, trends, and domain-specific terms more accurately.

  • Considerations: A stock model may initially perform at a lower-than-expected level, while a well-trained model can deliver high-grade accuracy. However, this can come at the expense of time and computing resources.

6. Large Language Model (LLM)-Based Tagging: The newest form of auto-classification, this involves providing a large language model with a tagging prompt, content to tag, and a taxonomy/list of terms if desired. As interest around generative AI and LLMs grows, this method has become increasingly popular for its ability to parse more complex content pieces and analyze meaning deeply.

  • Considerations: Tags content like a human would, meaning results may vary or become inconsistent if the same corpus is tagged multiple times. While LLMs can be smart regarding implied meaning and content sensitivity, they can be inconsistent without specific model tuning and prompt engineering. They can also suffer from accuracy and precision issues when fed a large taxonomy.
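As referenced above, here is a minimal sketch of the two simpler pattern-based methods working together: RegEx tagging for structured strings and frequency-based tagging against a small term list. The patterns, terms, and frequency threshold are illustrative assumptions.

```python
# Minimal sketch combining RegEx tagging and frequency-based tagging.
import re
from collections import Counter

document = """Case #A-4821: Patient reported a persistent cough.
Treatment plan discussed; follow-up treatment scheduled.
Contact: jane.doe@example.com"""

# --- RegEx tagging: detect well-formed patterns like case numbers and emails.
regex_tags = set()
if re.search(r"#[A-Z]-\d{4}", document):
    regex_tags.add("Case Record")
if re.search(r"[\w.]+@[\w.]+\.\w+", document):
    regex_tags.add("Contains PII")

# --- Frequency-based tagging: count taxonomy-term occurrences and keep
# terms that appear at least twice (an illustrative threshold).
term_list = ["treatment", "cough", "invoice"]
words = re.findall(r"[a-z]+", document.lower())
counts = Counter(w for w in words if w in term_list)
frequency_tags = {t for t, n in counts.items() if n >= 2}

print(regex_tags | frequency_tags)  # e.g. {'Case Record', 'Contains PII', 'treatment'}
```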

Some taxonomy and ontology management systems (TOMS), such as Graphwise PoolParty or Progress Semaphore, also offer auto-classification add-ons or extensions to their platforms that make use of one or more of these methods.

The Importance of Semantics in Auto-Classification

Imagine your repository of content as a bookstore, and your auto-classifier as the diligent (but easily confused!) store manager. You have a wide range of books you want to sort into different categories, such as their audience (children, teen, adult) and genre (romance, fantasy, sci-fi, nonfiction).

Now, imagine if you gave your manager no instructions on how to sort the books. They start organizing too specifically. They put four books together on one shelf that says “Nonfiction books about history in 1814.” They put another three books on a shelf that says “Romance books in a fantasy universe with dragons.” They put yet another five books on a shelf that says “Books about knowledge management.” 

Before you know it, your bookstore has 1,098 shelves, and no happy customers. 

Therein lies the danger of tagging content without a taxonomy, leading to what’s known as semantic drift. While tagging without a taxonomy and creating an initial set of tags can be useful in some circumstances, such as when trying to generate tags or topics to later organize into a hierarchy as part of a taxonomy, it has its limitations. Tags often become very specific and struggle to maintain alignment in a way that makes them useful for search or for grouping larger amounts of content together. And, as I mentioned at the beginning of this article, auto-classification without a taxonomy in place is not auto-classification in the true sense of the word; rather, such approaches are auto-tagging, and may not produce the results business leaders and decision-makers expect.

I’ve seen this in practice when testing auto-classification methods with and without a taxonomy. When an LLM was given the same content corpus of 100 documents to tag, but in one case generated its own terms and in the other was given a taxonomy, the results differed greatly. The LLM without a taxonomy generated 765 extremely domain-specific terms that often applied to only a single content piece. In contrast, the LLM given a taxonomy tagged the content with 240 terms, allowing the same tags to apply to multiple content pieces and creating topic clusters and groups of similar content that users can easily browse, search, and navigate. This makes discovery faster, more intuitive, and less fragmented than when every piece is labeled with unique, one-off terms.

Bar graph showing the precision, recall, and accuracy of LLMs with and without semantics

Overall, incorporating a taxonomy into LLM-based auto-classification transforms fragmented, messy one-off tags into consistent topic clusters and hierarchies that make content easier to browse, search, and discover.

This illustrates the utility of a taxonomy in auto-classification. When you give your store manager a list of shelves to stock, they can avoid the “overthinking” of semantic drift and place books onto well-architected shelves (e.g., Young Adult, Sci-Fi). A well-defined taxonomy acts as the blueprint for organizing content meaningfully and consistently using an auto-tagger.
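To show what constraining an LLM to a taxonomy might look like in practice, here is a minimal sketch. `call_llm` is a hypothetical stand-in for whatever model API you use, and the prompt wording, taxonomy, and validation step are illustrative rather than a definitive implementation.

```python
# Minimal sketch of constraining an LLM to a taxonomy during tagging.
TAXONOMY_TERMS = ["Young Adult", "Sci-Fi", "Fantasy", "Romance", "Nonfiction"]

def build_tagging_prompt(content: str) -> str:
    terms = ", ".join(TAXONOMY_TERMS)
    return (
        "You are a content tagger. Tag the document below using ONLY terms "
        f"from this controlled list: {terms}. "
        "Return a comma-separated list of matching terms and nothing else. "
        "If no term applies, return NONE.\n\n"
        f"Document:\n{content}"
    )

def tag_content(content: str, call_llm) -> list[str]:
    raw = call_llm(build_tagging_prompt(content))
    # Validate the response: discard anything outside the taxonomy, which
    # guards against the semantic drift described above.
    suggested = [t.strip() for t in raw.split(",")]
    return [t for t in suggested if t in TAXONOMY_TERMS]
```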

 

When Should I Use AI, Semantic Models, or Both?

Bar graphs: accuracy, precision, and recall of different auto-classification methods. While results may vary by use case, methods combining AI and semantic models tend to score higher across the board; these results come from one specific content corpus we tested internally.

As demonstrated above, tags created by generative AI models without any semantic model in place can become unwieldy and excessive, as LLMs look to create the best tag for an individual content piece rather than a tag that can serve as an umbrella term for multiple pieces of content. However, that does not completely eliminate AI as a standalone solution for all tagging use cases. These auto-tagging models and processes can prove helpful in the early stages of creating a term list, identifying common themes across a corpus and forming initial topic clusters that can later bring structure to a taxonomy, whether as hierarchies or facets. Once again, while not true auto-classification as the industry defines it, auto-tagging with AI alone can work well for domains where topics don’t neatly fit within a hierarchy, or where domain models and knowledge evolve so quickly that a hierarchical structure would be infeasible.

On the other hand, semantic models are a great way to add the aforementioned structure to an auto-classification process, and they work very well for exact or near-exact term matching. When combined with a frequency-based, NLP, or machine learning-based auto-classifier in these situations, they tend to excel in terms of precision, applying very few incorrect tags. Additionally, these methods perform well in situations where content contains domain-specific jargon or acronyms located within semantic models, as they tag with a greater emphasis on these exact matches.

Semantic models alone can prove to be a more cost-effective option for auto-classification as well, as lighter, less compute-heavy models that do not require paid cloud hosting can tag some content corpora with a high level of accuracy. Finally, semantic models can assist greatly in cases where security and compliance are paramount, as leading AI models are generally cloud-hosted, and most methods using semantics alone can be run on-premises without introducing privacy concerns.

Nonetheless, semantic models and AI can combine as part of auto-classification solutions that are more robust and well-equipped for complex use cases. LLMs can extract meaning from complex documents where topics may be implied and compare content against a taxonomy or term list, which helps ensure content is easy to organize and consistent with an organization’s model for knowledge. However, one key consideration with this method is taxonomy size: if a taxonomy grows too large (terms in the thousands, for example), an LLM may struggle to find and apply the right tag within a limited context window without mitigation strategies such as retrieving tags in batches, as sketched below.
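A minimal sketch of that batching mitigation follows. The batch size and the hypothetical `call_llm` stand-in are assumptions; a real implementation would also need to handle rate limits and reconcile conflicting tags across batches.

```python
# Minimal sketch: tagging against a large taxonomy in batches so each prompt
# stays within the model's context window, then merging the results.
def batched(terms: list[str], size: int):
    for i in range(0, len(terms), size):
        yield terms[i:i + size]

def tag_with_large_taxonomy(content: str, taxonomy: list[str], call_llm,
                            batch_size: int = 200) -> set[str]:
    tags: set[str] = set()
    for batch in batched(taxonomy, batch_size):
        prompt = (
            "Tag the document using ONLY terms from this list: "
            + ", ".join(batch)
            + ". Return matching terms, comma-separated, or NONE.\n\n"
            + content
        )
        response = call_llm(prompt)
        # Keep only terms that actually appear in this batch of the taxonomy.
        tags.update(t.strip() for t in response.split(",") if t.strip() in batch)
    return tags
```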

In more advanced use cases, an LLM can also be paired with an ontology, which can help LLMs understand more about interrelationships between organizational topics, concepts, and terms, and apply tags to content more intelligently. For example, a knowledge base of clinical notes and guidelines could be paired with a medical ontology that maps symptoms to potential conditions, and conditions to recommended treatments. An LLM that understands this ontology could tag a physician’s notes with all three layers (symptoms, conditions, and treatments) so when a doctor searches for “persistent cough,” the system retrieves not just symptom references, but also likely diagnoses (e.g., bronchitis, asthma) and corresponding treatment protocols. This kind of ontology-guided tagging makes the knowledge base more searchable and user-friendly and helps surface actionable insights instead of isolated pieces of information.
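Below is a minimal sketch of that ontology-guided expansion. The symptom, condition, and treatment mappings are invented purely for illustration; a real solution would traverse a managed medical ontology rather than a hard-coded dictionary.

```python
# Minimal sketch: expanding a symptom tag into condition and treatment tags
# by traversing "may indicate" and "treated by" relationships.
ONTOLOGY = {
    ("persistent cough", "may_indicate"): ["bronchitis", "asthma"],
    ("bronchitis", "treated_by"): ["rest and fluids", "bronchodilators"],
    ("asthma", "treated_by"): ["inhaled corticosteroids"],
}

def expand_tags(symptom_tags: list[str]) -> dict[str, list[str]]:
    layers = {"symptoms": symptom_tags, "conditions": [], "treatments": []}
    for symptom in symptom_tags:
        for condition in ONTOLOGY.get((symptom, "may_indicate"), []):
            layers["conditions"].append(condition)
            layers["treatments"] += ONTOLOGY.get((condition, "treated_by"), [])
    return layers

print(expand_tags(["persistent cough"]))
# {'symptoms': ['persistent cough'],
#  'conditions': ['bronchitis', 'asthma'],
#  'treatments': ['rest and fluids', 'bronchodilators', 'inhaled corticosteroids']}
```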

In some cases, privacy or security concerns may dictate that AI cannot be used alongside a semantic model. In others, an organization may lack a semantic model and may only have the capacity to tag content with AI as a start. However, as a whole, the majority of use cases for auto-classification benefit from a well-architected solution that combines AI’s ability to intelligently parse content with the structure and specific context that semantic models provide.

Conclusion

Auto-classification adds an important step in automation to organizations looking to enrich their content with metadata – whether it be for findability, analytics, or understanding. While there are many methods to choose from when exploring an auto-classification solution, they all rely on semantics in the form of a well-designed taxonomy to function to the best of their ability. Once implemented and governed correctly, these automated solutions can serve as key ways to unblock human efforts and direct them away from tedious tagging processes, allowing your organization’s experts to get back to doing what matters most. 

Looking to set up an auto-classification process within your organization? Want to learn more about auto-classification best practices? Contact us!

The post Auto-Classification for the Enterprise: When to Use AI vs. Semantic Models appeared first on Enterprise Knowledge.

Emily Crockett Participating in “Using Storytelling to Transform User Assistance” Panel at ConVEx Ideas Conference https://enterprise-knowledge.com/emily-crockett-participating-in-panel-at-convex-ideas-conference/ Mon, 11 Aug 2025 20:52:39 +0000

Emily Crockett, Senior Content Engineering Consultant at Enterprise Knowledge, will be participating as an expert panelist at the upcoming ConVEx Ideas Conference. The Component Content Alliance panel, titled, “Using Storytelling to Transform User Assistance,” will explore how structured content, metadata, and user insights come together to create meaningful narratives at scale. The panel will incorporate several unique voices in content, with Crockett representing the perspective of Knowledge Management and the understanding of content as an enterprise knowledge asset.

The session will be held online on Wednesday, September 17 from 9:00 to 10:00 AM PST. For more information and to register, visit here.

The post Emily Crockett Participating in “Using Storytelling to Transform User Assistance” Panel at ConVEx Ideas Conference appeared first on Enterprise Knowledge.

Semantic Layer for Content Discovery, Personalization, and AI Readiness https://enterprise-knowledge.com/semantic-layer-for-content-discovery-personalization-and-ai-readiness/ Tue, 29 Jul 2025 13:20:52 +0000

The Challenge

A professional association needed to improve their members’ content experiences. With tens of thousands of content assets published across 50 different websites and 5 disparate content management systems (CMSes), they struggled to coordinate a content strategy and improve content discovery. They could not keep up with the demands of managing content, leading to problems with outdated content and content pieces that were hard to discover. They also lacked the ability to identify and act on user data and trends, to better plan and tailor their content to member needs. Ultimately, members could not discover and take full advantage of the wealth of resources provided to them by the association.

Overall, the key driver behind this challenge was that the professional association lacked semantic maturity. While the association had a way to structure their content through a number of taxonomies across their web properties, their models were not aligned or mapped to one another and updates were not coordinated. Tagging expertise—and time to contribute to content tagging—varied considerably between content creators, resulting in inconsistent and irregular content tagging. The association also struggled to maintain their content due to an absence of clear governance responsibilities and practices. More broadly, the association lacked organization-wide processes to align semantic modeling with content governance—processes that ensure taxonomies and metadata models evolve in step with new content areas, and that governance practices consistently enforce tagging standards across content types and updates. This gap was also reflected in their technology stack: the association lacked an organization-wide solution architecture that would support their ability to coordinate and share semantics, data, and content across their systems. These challenges prevented the association from developing more engaging content experiences for their members. They needed support developing the strategies, semantic models, and solution architecture to enable their vision.

The Solution

EK partnered with the professional association to establish the foundational content strategy, semantic models, and solution architecture to enable their goals for content discovery and analytics. First, EK conducted a current state analysis and target state definition, as well as a semantic maturity assessment. This helped EK understand the factors that could be leveraged to help the association realize its goals. EK subsequently completed three parallel workstreams, followed by an auto-tagging proof of concept:

  1. Content Assessment: EK audited a sample of assets on priority web properties to understand the condition of the association’s content and semantic practices. EK identified recommendations for how to enhance the performance, governance, and discoverability of content. Based on these recommendations, EK provided step-by-step procedures to support the association in completing a comprehensive audit to enhance their content quality and aid in future findability enhancement and content personalization efforts.
  2. Taxonomy and Ontology Development: EK developed an enterprise taxonomy and ontology framework for the association—to provide a standardized vocabulary for use across the association’s systems, and increase the maturity of the association’s semantic models. The enterprise taxonomy included 12 facets to support 12 metadata fields, with a cumulative total of over 900 concepts. An ontology identified key relationships between the different taxonomy facets, establishing a foundation for identifying related content and supporting auto-tagging.
  3. Semantic Layer Architecture: EK provided recommendations for maturing the association’s tooling and integrations in support of their goals. Specifically, EK developed a solution architecture to integrate taxonomy, ontology, and auto-tagging across content, asset, and learning management systems, in order to inform a variety of content analytics, discovery, recommendation, and assembly applications. This architecture was designed to form the basis of a semantic layer that the association could later use to connect and relate content enterprise-wide. The architecture included the addition of a taxonomy and ontology management system (TOMS) to centralize semantic model management and to introduce auto-tagging capabilities. Alongside years of experience in tool evaluation, EK leveraged their proprietary TOMS evaluation matrix to score candidate vendors and TOMS solutions, supporting the association in selecting a tool that was the best fit for their needs.
  4. Auto-Tagging Proof of Concept: Building on these efforts, EK conducted an auto-tagging proof of concept (PoC), to support the association in applying the taxonomy to their content. The PoC automatically tagged all content assets in 2 priority CMSes with concepts from 2 prioritized topic taxonomy facets. The EK team prepared the processing pipeline for the auto-tagging effort, including pre-processing the content and conducting analysis of the tags to gauge quality and improvement over time.

To determine the exact level of improvement, EK worked with subject matter experts to establish a gold standard set of expected tags for a sample of content assets. The tags produced by the auto-tagger were compared to the expected tag set, to generate measures of recall, precision, and accuracy. EK used the analytics to inform adjustments to the taxonomy facets and to fine-tune and improve the auto-tagger’s performance over successive rounds.
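For readers curious about the mechanics, here is a minimal sketch of how such per-asset scoring can be computed with set comparisons. The sample tags and vocabulary are illustrative, and accuracy here is defined over the full candidate tag vocabulary, one reasonable convention among several.

```python
# Minimal sketch: scoring auto-tagging output against a gold standard.
def score(predicted: set[str], expected: set[str], vocabulary: set[str]):
    tp = len(predicted & expected)               # correctly applied
    fp = len(predicted - expected)               # applied but wrong
    fn = len(expected - predicted)               # missed
    tn = len(vocabulary - predicted - expected)  # correctly left off
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(vocabulary)
    return precision, recall, accuracy

vocab = {"Ethics", "Licensing", "Education", "Events", "Research"}
print(score({"Ethics", "Events"}, {"Ethics", "Education"}, vocab))
# (0.5, 0.5, 0.6)
```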

To support the association in continuing to grow and leverage their semantic maturity, EK provided a detailed semantic maturity implementation roadmap. The roadmap identified five target outcomes for semantic enrichment, including enhancing analytics to provide insights into content use and content gaps, and recommending content by using content tags to suggest related resources. For each outcome, EK detailed the requisite goals, business value, tasks, and dependencies, providing the association with the guidance they needed to realize each outcome and further advance their semantic maturity.

The EK Difference

EK was uniquely positioned to help the association improve their semantic maturity. As thought leaders in the semantic space, EK had the expertise and experience to assess the association’s semantic maturity, identify opportunities for growth, and define a vision and roadmap to help the association realize its business priorities. Further, EK has a deep understanding of the semantic technology landscape. This positioned EK to deliver tailored solutions that reflect the specific needs of the association, ensuring the solutions contribute to the association’s long-term technology roadmap.

EK leveraged a holistic approach to assessing and advancing the association’s semantic maturity. EK’s proprietary semantic maturity assessment accounts for the varied factors that influence an organization’s semantic maturity, including considerations for people, process, content, models, and technology. This positions the association to develop the capabilities required for semantic maturity across all contributing factors. Building off of the semantic maturity assessment, EK delivered end-to-end services that supported the entire semantic lifecycle, from strategy through design, implementation, and governance. This provided the association with the semantic infrastructure to realize near-term value; for instance, developing an enterprise taxonomy and applying it to their content assets using auto-tagging. By using proprietary, industry-leading approaches, EK was able to deliver these end-to-end services with tangible results within 4 months.

The Results

EK delivered a semantic strategy and solution architecture, as well as a content clean-up strategy and initial taxonomy and ontology designs, that helped the professional association establish a foundation for realizing their goals. This effort culminated in the implementation of an auto-tagging PoC. The PoC included configuring the selected TOMS, establishing system integrations, and developing processing pipelines and quality evaluations. Ultimately, the PoC captured tags for over 23,000 content assets using more than 600 concepts from 2 priority taxonomy facets. This foundational work helped the professional association establish the initial components required for a semantic layer. A final roadmap and recommendations report provided detailed next steps, with specific tasks, dependencies, and pilots, to guide the professional association in leveraging and extending their foundational semantic layer. The first engagement was deemed a success by association leadership, and the roadmap was approved for phased implementation, which EK is now supporting. This continued partnership is enabling the association to begin realizing its goals of enhancing member engagement with content by improving content discovery and overall user experience.

Want to improve your organization’s content discovery capabilities? Interested in learning more about the semantic layer? Learn more from our experience or contact us today!


The post Semantic Layer for Content Discovery, Personalization, and AI Readiness appeared first on Enterprise Knowledge.

Content Management Strategy for a Capital Producer https://enterprise-knowledge.com/content-management-strategy-for-a-capital-producer/ Wed, 16 Jul 2025 14:55:03 +0000

The Challenge

A capital producer understood the complexity of navigating international regulatory environments. Operating across nations in numerous fields of specialization, the organization had to uphold diverse and disparate ordinances, many of which have changed over time. Dedicated to providing high-quality services to their customers, the organization sought a solution that would help them better navigate revisions to compliance requirements and ensure adherence to rigorous standards of excellence.

Like many companies, the capital producer relied on manual processes to identify, track, and communicate regulations across the organization. Unfortunately, these manual approaches exposed the organization to human error, a possibility that threatened its ability to remain compliant. Since regulatory adherence depends on numerous team members throughout the organization, there were various potential points of failure, many of which were unknown. Staff had to personally determine how to best share sensitive information between groups, which created inefficiencies and risked information exposure. Even when these processes were performed correctly, they frequently included periods of redundancy in which staff members duplicated each other’s efforts, diminishing organizational productivity.

The Solution

To facilitate the organization’s compliance with regulations and standards, EK provided the capital producer with a comprehensive content management strategy rooted in knowledge management (KM) best practices. EK’s recommendations were informed by 11 separate interviews, four system demos, and 28 business unit validation workshop participants. EK spoke to executive stakeholders, content owners, system owners, and process performers throughout the organization. Based on these conversations and demonstrations of the organization’s current processes, EK developed a content strategy at the intersection of content and knowledge management. Recommendations for the organization were divided into five separate workstreams, based upon EK’s proprietary content strategy for KM evaluation framework, and broken down by the strategic and business impact of each item.

Leveraging EK’s expertise in semantics and data-driven knowledge management, EK delivered a content strategy with an emphasis on the structure, metadata, and management requirements for key organizational content types. For example, contracts currently exist as unstructured content and, in this organization’s use case, can continue to be managed as such. Formulas for certain products, however, require robust security and personalization to enable regulatory compliance across multiple countries. These complex requirements necessitated a recommendation for structured formula content managed in a Product Information Management System (PIMS).

Additionally, EK created a technology solution approach that not only identified existing pain points in the organization but also mapped each challenge to a corresponding technology solution. EK prioritized technical approaches that could easily work within the capital producer’s current technical ecosystem, minimizing the cost of integrating these solutions. At the start of the engagement, the organization was leveraging SharePoint for all content management needs. EK’s technology recommendations included strategies to optimize the use of SharePoint for appropriate use cases as well as recommendations for specialized contract management systems for product lifecycle management and contract management.

Implementing technological and procedural changes within the capital producer will allow the organization to continue to grow globally while providing compliant and high-quality products for its consumers. EK’s proposed content management approach will enable staff to better create, protect, share, and utilize compliance content to ensure the seamless continuity of operations, establish secure intellectual property, and achieve operational efficiencies. 

The EK Difference

Our team worked closely with the organization’s stakeholders to produce a content management strategy that would help them achieve larger knowledge objectives. Establishing processes and avenues for information sharing will enable the organization to not only uphold international standards of compliance but also increase productivity over time by efficiently sharing information and preserving tacit knowledge.

This engagement operated within the intersection of content management and KM. EK leveraged its KM background to guide this content strategy approach and used KM best practices to conduct knowledge-gathering activities, including document review, stakeholder interviews, stakeholder workshops, and system demos. After reviewing this information, EK was able to use its proprietary current state and target state framework to conduct a content management analysis at the organization. 

EK additionally utilized an ontological data modeling approach to guide its advanced content management strategy. The capital producer was exclusively a document-based organization at the beginning of the engagement; with EK’s support, they identified a future-ready content strategy for prioritized content use cases. There are a variety of content management approaches that can be used to provide structure to digital materials. These methods can be viewed on a continuum from file-level management to semantically enriched component management. However, not all approaches are the right fit for every client. Our content strategy and operations experts were able to ascertain the right level of content management for various use cases at the organization and ultimately provide them with a detailed technical plan for how to implement the right content management strategy. 

Content Management Continuum

The Results

At the end of the engagement, EK provided the organization with a clear roadmap for the adoption of a transformational content management strategy. Stakeholders from over ten different business units aligned on an approach that addressed their various needs and pain points, as well as an understanding of the investment required to achieve the target state content strategy. 

EK provided the organization’s stakeholders with the roadmap for a long-term vision and the tools for a quick return on investment. This came in five key accelerators, allowing the organization to deploy strategies, frameworks, and management approaches tailored to the organization’s unique needs. Each accelerator included a description of the recommendation, a path to implement the task successfully, success indicators to track, and the corresponding pain points it addressed. 

By implementing a more robust content management strategy, the capital producer will maintain compliance with regulations and standards, ensure content is secure and only accessible to those who need it, and improve overall efficiency of content operations. 


The post Content Management Strategy for a Capital Producer appeared first on Enterprise Knowledge.

Content Mastermind (Taylor’s Version): What Taylor Swift Can Teach Us About The Benefits of Repurposing Content https://enterprise-knowledge.com/content-mastermind-taylors-version-what-taylor-swift-can-teach-us-about-the-benefits-of-repurposing-content/ Mon, 23 Jun 2025 14:26:54 +0000

In January of 2025, Taylor Swift charted #1 on Billboard, breaking a record for most Number 1s on the Top Album Sales list with a new version of an almost six-year-old album. The 2025 repressing of Lover (Live from Paris) heart-shaped vinyl sold 100k copies within 45 minutes of its release, and continued to sell out every time it was restocked on the online store. 

Taylor Swift’s strategy of repurposing content, while unique for a singer, is very common from a business perspective. 94% of marketers repurpose content, indicating that reusing content is not a new concept… and yet, are you exploring the multi-faceted reuse of your content?

Since July 2020, Taylor Swift has released five original studio albums, four studio album re-recordings (“Taylor’s Version,” produced before Taylor was able to buy back her original catalog of recordings), presentation variants, deluxe editions, and live albums totaling 36 albums to date, with 20 million+ units sold. Swift has had a stratospheric few years of breaking records—including becoming the first musician ranked as a Forbes billionaire primarily from songs and performances—partially due to her intelligent “content” reuse. What can we learn from this? Read on to find out.

Results

Before delving into the ways you can reuse content, what results can you expect when you put in the foundational work to enable intelligent reuse?

Broaden Your Target Audience

Statistically speaking, if you increase the amount of content you produce, you are more likely to reach a wider audience. With the development of the Eras Tour (where each era represents one of her 11 studio albums, spanning several different genres), many Taylor Swift fans began to classify themselves by their preferred “era”, or the album that made them a fan of Swift. With each album and re-recording, she’s endeared more fans to her, based on their preferred genre. 

The same can be said for reusing and repurposing content. By using Structured Content Management and effective content reuse, you decrease the overhead associated with creating and managing content. This effectively enables more systematic ways to reuse content and frees up time for content producers to create new and interesting types of content. This results in both an increase in content and the opportunity to broaden your audience. Moreover, content reuse frees up content producers’ and content marketers’ time, paving the way for two vital capabilities: personalization and experimentation.

Increase Customer Engagement with Personalization

In this day and age, most marketers use personalized content to reach their customers, but 74% say they struggle with scaling that personalization. While structured content alone can enable personalization of content in a more systematic way, when you combine structured content with the power of a knowledge graph, you also pave the way for effective personalization at scale. Using a combination of metadata applied to content components, data known about customers, and a knowledge graph, dynamic content can be created and scaled to reach more segments of customers. By giving customers relevant and personalized content for their needs, you are more likely to increase customer engagement and satisfaction.
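As a simplified illustration of this idea, the sketch below ranks content components by the overlap between their tags and what is known about a member. The component tags, profile fields, and scoring are invented for illustration; a production system would resolve these relationships through a knowledge graph rather than simple set intersection.

```python
# Minimal sketch: metadata-driven personalization by tag overlap.
content_components = [
    {"id": "article-12", "tags": {"early career", "certification", "finance"}},
    {"id": "webinar-7",  "tags": {"leadership", "finance"}},
    {"id": "guide-3",    "tags": {"retirement", "tax"}},
]

member_profile = {"interests": {"certification", "finance"}, "stage": "early career"}

def personalize(components, profile):
    # Combine everything known about the member into one set of signals.
    signals = profile["interests"] | {profile["stage"]}
    ranked = sorted(components,
                    key=lambda c: len(c["tags"] & signals),
                    reverse=True)
    # Surface only components with at least one matching signal.
    return [c["id"] for c in ranked if c["tags"] & signals]

print(personalize(content_components, member_profile))
# ['article-12', 'webinar-7']
```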

Increase Conversion Rates with Experimentation

As a final highlighted benefit, deploying Structured Content Management enables your organization to run experiments on content, fail quickly, and adjust the content strategy as needed. While page variants and A/B testing can be deployed with traditional content management, that is not the same as being able to test an individual content component and run many different experiments quickly. These could be presentation experiments (does a CTA perform better on the side rail, or above the fold embedded in the body content?), but could also test which content performs best in the “related content” section: an infographic or a blog? What ultimately comes from experimentation is an invaluable feedback loop that enables your organization to develop high-value, high-performing content that improves engagement metrics such as conversion rates.

Types of Reuse

Now that we’ve covered the benefits, let’s turn our attention to the types of reuse that are possible. Swift’s 36 record-shattering albums reflect three core reuse strategies: a visual change, an audience change, and an assembly change. While there are certainly more, we’ll look at the same three methods in this blog: a new presentation, a new lens, and a new assembly. When it comes to your organization’s essential content, how can you reuse it in the same ways without it becoming stale?

Change the Presentation of Content

Visually, many of the albums Swift has released in the last five years have thematic visual ties in the album art. Speak Now (Taylor’s Version) was released in three different shades and hues of purple: Orchid Marbled, Violet Marbled, and Lilac Marbled. It’s not uncommon to see fans create “Franken Variants,” taking an LP from each version and putting them together. The parallel in content strategy is the presentation change made when employing multi-channel marketing. You may have written a long-form blog, but you’ll also send it out in an email, in a social post, and so on. Social posts vary by site, and many digital asset management systems (DAMs) can create automatic derivatives that fit the parameters of a given social media channel (e.g., Instagram is 1080 x 1080 pixels, while LinkedIn is 1350 x 440 pixels) without creating an entirely new copy of the content.
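
As a minimal sketch of that derivative idea, the Python snippet below uses the Pillow imaging library to turn one master asset into a correctly sized crop per channel. The channel dimensions are the ones cited above, and the function and dictionary names are illustrative, not any particular DAM’s API.

```python
from PIL import Image, ImageOps

# Target dimensions per channel (the sizes cited above)
CHANNEL_SIZES = {
    "instagram": (1080, 1080),
    "linkedin": (1350, 440),
}

def make_derivatives(master_path: str) -> dict[str, Image.Image]:
    """Produce one scaled, cropped derivative per channel from a single master asset."""
    master = Image.open(master_path)
    # ImageOps.fit scales and center-crops to the exact target aspect ratio
    return {channel: ImageOps.fit(master, size) for channel, size in CHANNEL_SIZES.items()}

for channel, image in make_derivatives("campaign-hero.png").items():
    image.save(f"hero-{channel}.png")
```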

What are other ways you could create a new presentation of the content? When we design Content Models at EK, we emphasize decoupling content from presentation to enable this kind of reuse. When you create a model for a content type, the focus should be on what information is being communicated rather than how it is presented. Consider a social proof component: when writing up a customer’s use case of your product in long-form content, you include quotes from that customer. Within the body of the long-form content, a quote may have one particular styling; on landing pages, you reuse the same quote as social proof in more of a card style. If you decouple the customer quote from the styling each channel needs, you can automatically populate the different styles without keeping multiple copies.
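
Here is a minimal sketch of that decoupling in Python: the component stores only the information being communicated, and each channel supplies its own rendering. The class, field names, and markup are hypothetical stand-ins for whatever your content model and front end actually define.

```python
from dataclasses import dataclass

@dataclass
class CustomerQuote:
    # The model records what is said, never how it looks
    text: str
    attribution: str
    company: str

def render_inline(quote: CustomerQuote) -> str:
    """Styling used inside the body of long-form content."""
    return (f"<blockquote>{quote.text}"
            f"<cite>{quote.attribution}, {quote.company}</cite></blockquote>")

def render_card(quote: CustomerQuote) -> str:
    """Card styling used on landing pages as social proof."""
    return (f'<div class="quote-card"><p>{quote.text}</p>'
            f"<span>{quote.attribution}</span></div>")

quote = CustomerQuote("We cut publishing time in half.", "A. Customer", "Example Co.")
# One stored component, two presentations; correcting the text once fixes every channel
print(render_inline(quote))
print(render_card(quote))
```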

This not only saves the time of creating new components for every reuse, but also reduces the risk of mistakes introduced by manually copying content. We saw this recently with a client who used social proof on many different pages across their marketing website; a content audit discovered that one of the quotes was misattributed to a customer in an entirely different industry. The client then had to go through the entire website (10,000 pages!) and scrub the quote. Had they already implemented Structured Content Management, they could have corrected every instance with a single content update.

Update the Tone or Perspective of the Content

For Record Store Day in 2023, Swift released a version of her Folklore album that had previously been heard only in a Disney+ special, The Long Pond Studio Sessions (LPSS). This record was an acoustic version of the pandemic release Folklore, recorded at Aaron Dessner’s Long Pond Studio. While many may wonder why you would buy another version of an album you already own, many fans prefer the LPSS version because Swift sounds more raw than on the studio recording. Tone is infamously difficult to convey on the internet (if I don’t use exclamation points or emojis, I worry you’ll think I’m cold), but tone of voice can still be incorporated into content and kept consistent with your organization’s branding. In the same way, when you communicate information to a group of stakeholders, you shift tone depending on the make-up of the group: you’ll address a group of executives differently than a group of individual contributors, or a group from IT differently than a group from HR. How can you use this with your customer-facing content?

Perhaps your company writes a lot of thought leadership, and customers browse it via an abstract or summary of each piece. You may have originally written the abstracts very technically, then realized that your audience is predominantly newer professionals who don’t yet know all of your industry’s lingo. Using this insight, you could update the abstracts to be more beginner-friendly and prompt more engagement with the posts. This could be a manual change, but generative AI can also adjust the tone or comprehension level of the abstracts to speed up rewriting and repurposing. Additionally, this paves the way for personalization: with variations of a component tagged for different audiences, content can be dynamically updated via a graph when a customer is identified as belonging to a certain group, making it more appealing to that customer and increasing engagement and satisfaction.
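
As a hedged sketch of that generative step, the Python snippet below uses the OpenAI SDK to rewrite an abstract at a more accessible reading level. The model name, prompt wording, and function name are illustrative assumptions, and any rewritten abstract should still pass through human review before publishing.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def simplify_abstract(abstract: str) -> str:
    """Rewrite a technical abstract in plainer language for early-career readers."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Rewrite the abstract for professionals new to the industry. "
                        "Preserve the meaning, remove jargon, keep it under 100 words."},
            {"role": "user", "content": abstract},
        ],
    )
    return response.choices[0].message.content
```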

Use a New Assembly of Content

On many levels, music is an assembly. A song is an assembly of notes and phrases; an album is an assembly of songs that tell a story; a playlist is an assembly of songs curated in a chosen order to evoke an event or a feeling. During the Eras Tour, Swift included a “Surprise Song” section in which she would play one song from her discography on guitar and one on piano. At the beginning of the tour she played single songs on each instrument, but by the end she was performing “mashups,” seamlessly mixing multiple songs into a new creation: I Hate It Here x the lakes, The Manuscript x Long Live, I Think He Knows x Gorgeous. Over the course of several months, Swift created many new songs that were assemblies of parts of other songs.

When talking about Structured Content Management, we frequently compare content components, or modular content, to Legos. By creating reusable “Legos” of content, you enable many different assemblies of them. This could take many forms, from marketing landing pages to the generation of proposals, but one of the easiest examples to understand is learning content. Internal trainings are ubiquitous in many organizations and often a sore spot because they can be irrelevant to an employee’s position. For example, perhaps you have a required training on harassment, but because the course is packaged as a single unit rather than broken up by its lessons, all employees end up sitting through topics that are only relevant to people managers. An employee may “check out” during those lessons and disengage from the rest of the training. By creating smaller blocks of content, you could offer one personalized assembly of topics tagged for individual contributors and another tagged for people managers, without creating multiple copies of the same course.
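
As a minimal sketch of that assembly, assuming lessons are stored once and tagged with audience metadata, a course can be composed per role at delivery time. The lesson IDs, titles, and tags below are hypothetical.

```python
# Each lesson block is stored once and tagged with the audiences it serves
LESSONS = [
    {"id": "hr-101", "title": "Recognizing Harassment",        "audiences": {"all"}},
    {"id": "hr-102", "title": "Reporting Channels",            "audiences": {"all"}},
    {"id": "hr-201", "title": "Handling Reports as a Manager", "audiences": {"people-manager"}},
]

def assemble_course(role: str) -> list[dict]:
    """Compose a personalized course from shared lesson blocks; nothing is copied."""
    return [lesson for lesson in LESSONS
            if "all" in lesson["audiences"] or role in lesson["audiences"]]

print([l["id"] for l in assemble_course("individual-contributor")])  # ['hr-101', 'hr-102']
print([l["id"] for l in assemble_course("people-manager")])          # all three blocks
```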

Conclusion

Taylor Swift is certainly not the first (or the only! or the last!) artist to develop methods of reuse, but love her or hate her, she is clearly a mastermind when it comes to engaging and expanding her fanbase. You can use these same techniques to expand your organization’s customer base. When you employ a clear content strategy and leverage methodical content engineering and content operations, your organization’s content has the potential to develop into a true business asset. If this has sparked your interest and you’re ready to get serious about bringing your content to its highest potential, give us a call.

Rebecca Wyatt to Present on Context-Aware Structured Content to Mitigate Hallucinations at ConVEx conference https://enterprise-knowledge.com/rebecca-wyatt-present-2025/ Mon, 31 Mar 2025 17:50:48 +0000 https://enterprise-knowledge.com/?p=23566

Rebecca Wyatt, Partner and Division Director for Content Strategy and Operations at Enterprise Knowledge, will be delivering a presentation on Context-Aware Structured Content to Mitigate Hallucinations at the ConVEx conference, which takes place April 7-9 in San Jose, CA. 

Wyatt will focus on techniques for ensuring that structured content remains tightly coupled with its source context, whether through improved ontologies, metadata-driven relationships, or content validation against trusted sources, to avoid the risks of hallucinations.

By the end of the session, attendees will have a deeper understanding of how to future-proof their content and make it both AI-ready and hallucination-resistant, fostering more accurate and trustworthy outputs from LLMs.

For more information on the conference, check out the schedule or register here.
