Enterprise Knowledge Speaking at KMWorld 2025

Enterprise Knowledge (EK) will once again have a strong presence at the upcoming KMWorld Conference in Washington, D.C. This year, EK is delivering 11 sessions throughout KMWorld and its four co-located events: Taxonomy Boot Camp, Enterprise Search & Discovery, Enterprise AI World, and the Text Analytics Forum. 

EK is offering an array of thought leadership sessions to share KM approaches and methodologies. Several of EK’s sessions include presentations with clients, where presenters jointly deliver advanced case studies on knowledge graphs, enterprise learning solutions, and AI.  



On November 17, EK-led events will include:

  • Taxonomy Principles to Support Knowledge Management at a Not-for-Profit, featuring Bonnie Griffin, co-presenting with Miriam Heard of YMCA – Learn how Heard and Griffin applied taxonomy design to tame tags, align content types, and simplify conventions, transforming the YMCA’s intranet so staff can find people faster, retrieve information reliably, and share updates with the right audiences.
  • Utilizing Taxonomies to Meet UN SDG Obligations, featuring Benjamin Kass, co-presenting with Mike Cannon of the American Speech-Language-Hearing Association (ASHA) – Discover how ASHA, a UN SDG Publishers Compact signatory, piloted automatic tagging to surface SDG-relevant articles, using taxonomies for robust metadata, analytics, and high-quality content collections.
  • Driving Knowledge Management With Taxonomy and Ontology, featuring Bonnie Griffin, co-presenting with Alexander Zichettello of Honda Development & Manufacturing of America – Explore how Zichettello and Griffin designed taxonomies and ontologies for a major automaker, unifying siloed content and terminology. Presenters will share with attendees a repeatable, standards-based process and best practices for scalable, sustainable knowledge management.

On November 18, EK-led events will include:

  • Taxonomy From 2006 to 2045: Are We Ready for the Future?, moderated by Zach Wahl, EK’s CEO and co-founder – Celebrate 20 years of Taxonomy Boot Camp with a look back at 2006 abstracts, crowd-voted predictions for the next two decades (AI included), lively debate, and a cake-cutting send-off.

On November 19, EK-led events will include:

  • Transforming Content Operations in the Age of AI, featuring Rebecca Wyatt and Elliott Risch – Learn how Wyatt and Risch partnered to leverage an AI proof of concept to prioritize and accelerate content remediation and improve content and search experiences on a flagship Intel KM platform.
  • Tracing the Thread: Decoding the Decision-Making Process With GraphRAG, featuring Urmi Majumder and Kaleb Schultz – Learn about GraphRAG and how pairing generative AI with a standards-based knowledge graph can unify data to tackle complex questions, curb hallucinations, and deliver traceable answers.
  • The Cost of Missing Critical Connections in Data: Suspicious Behavior Detection Using Link Analysis (A Case Study), featuring Urmi Majumder and Kyle Garcia – See how graph-powered link analysis and NLP can uncover hidden connections in messy data, powering fraud detection and risk mitigation, with practical modeling choices and a real-world, enterprise-ready case study.
  • Generating Structured Outputs From Unstructured Content Using LLMs, featuring Kyle Garcia and Joseph Hilger, EK’s COO and co-founder – Discover how LLMs guided by content models break long, unstructured documents into reusable, knowledge graph–ready components, reducing hallucinations while improving search, personalization, and cross-platform reuse.

On November 20, EK-led events will include:

  • Enterprises, KM, & Agentic AI, featuring Jess DeMay, co-presenting with Rachel Teague of Emory Consulting Services – This interactive discussion looks at organizational trends as well as new technologies and processes to enhance knowledge sharing, communication, collaboration, and innovation in the enterprises of the future.
  • Making Search Less Taxing: Leveraging Semantics and Keywords in Hybrid Search, featuring Chris Marino, co-presenting with Jaime Martin of Tax Analysts – Explore how Tax Analysts, the nonpartisan nonprofit behind Tax Notes, scaled an advanced search overhaul that lets subscribers rapidly find what they need while surfacing relevant content they didn’t know to look for.
  • The Future of Enterprise Search & Discovery, a panel including EK’s COO and co-founder Joseph Hilger – Get a glimpse of what’s next in enterprise search and discovery as this panel unpacks agentic AI and emerging trends, offering near- and long-term predictions for how tools, workflows, and roles will evolve.

Come to KMWorld 2025, November 17–20 in Washington, D.C., to hear from EK experts and learn more about the growing field of knowledge management. Register here.

How Taxonomies and Ontologies Enable Explainable AI

Taxonomy and ontology models are essential to unlocking the value of knowledge assets. They provide the structure needed to connect fragmented information across an organization, enabling explainable AI. As part of a broader Knowledge Intelligence (KI) strategy, these models help reduce hallucinations and make AI-generated content more trustworthy. This blog provides an overview of how taxonomies and ontologies connect disparate knowledge assets within an organization and improve the quality and accuracy of AI-generated content.


The Anatomy of AI

Here is a conceptual analogy to illustrate how taxonomies and ontologies support AI. While inspired by the human musculoskeletal system, the analogy is not intended to be anatomically accurate; rather, it shows how taxonomies provide foundational structure and ontologies enable flexible, contextual connections of knowledge assets within AI systems.

Just as the musculoskeletal system gives structure, support, and coherence to the human body, taxonomies and ontologies provide the structural framework that organizes and contextualizes knowledge assets for AI:

Taxonomies: the spine and bones, in other words, the hierarchical backbone for categorizing and organizing the concepts that describe an organization’s core knowledge assets.

Ontologies: the joints, ligaments, and muscles that provide the flexibility to connect related concepts across assets in an organization’s knowledge domain.

Depending on the organization’s domain or industry, certain types of knowledge assets become more relevant or strategically important. In the case of a healthcare organization, key knowledge assets may include content such as patients’ electronic health records, clinical guidelines and protocols, multidisciplinary case reviews, and research publications, as well as data such as diagnostic data and clinical trial data. Taxonomies that capture and group together key concepts, such as illnesses, symptoms, treatments, outcomes, medicines, and clinical specialties, can be used to tag and structure these assets. Continuing with the same scenario, an ontology in a healthcare organization can incorporate those key concepts (entities) from the taxonomy, along with their properties and relationships, to enable alignment and consistent interpretation of knowledge assets across systems. Together, taxonomies and ontologies make it possible to connect, for instance, a patient’s health record with diagnostic data and previous case reviews for other patients with the same (or similar) conditions, including illnesses, symptoms, treatments, and medicines. As a result, healthcare professionals can quickly access the information they need to make well-informed decisions about a patient’s care.
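
To make this concrete, below is a minimal sketch in Python using rdflib that mirrors the healthcare example: a tiny SKOS taxonomy acts as the “bones,” and a hypothetical ontology property (mentionsCondition) acts as the connective tissue linking a patient record to a prior case review. All URIs, concepts, and labels are illustrative, not a real clinical model.

```python
# Illustrative only: toy taxonomy + ontology links over knowledge assets.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import SKOS, RDF

EX = Namespace("http://example.org/")  # hypothetical namespace
g = Graph()
g.bind("skos", SKOS)

# Taxonomy (the "bones"): a small hierarchy of clinical concepts.
g.add((EX.Condition, RDF.type, SKOS.Concept))
g.add((EX.Anemia, RDF.type, SKOS.Concept))
g.add((EX.Anemia, SKOS.broader, EX.Condition))
g.add((EX.Anemia, SKOS.prefLabel, Literal("Anemia", lang="en")))

# Ontology (the "joints and muscles"): relationships across asset types.
g.add((EX.patientRecord123, EX.mentionsCondition, EX.Anemia))
g.add((EX.caseReview7, EX.mentionsCondition, EX.Anemia))

# Simple traversal: find prior assets that share a patient's condition.
for condition in g.objects(EX.patientRecord123, EX.mentionsCondition):
    for asset in g.subjects(EX.mentionsCondition, condition):
        if asset != EX.patientRecord123:
            print(f"Related asset: {asset} (shares condition {condition})")
```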


Where AI is Failing

AI has repeatedly failed to provide reliable information to employees, customers, and patients, undermining their confidence in AI-supported systems and sometimes leading to serious organizational consequences. You may be familiar with the case in which a medical association’s chatbot unintentionally gave harmful advice to people with eating disorders. Or perhaps you heard about the bank whose faulty AI system misclassified thousands of transactions as fraudulent due to a programming error, causing significant customer dissatisfaction and harming the organization’s reputation. There was also a case in which an AI-powered translation system failed to accurately assess asylum seekers’ applications, raising serious concerns about its fairness and accuracy and potentially affecting critical life decisions for those applicants. In each of these cases, had the corresponding AI systems effectively aggregated both unstructured and structured knowledge assets, and reliably linked them to encoded expert knowledge and relevant business context, the outcomes would have been far more closely aligned with the intended objectives, ultimately benefiting the end users as originally intended.


How Taxonomies And Ontologies Enable Explainable AI

When knowledge assets are consistently tagged with taxonomies and related via ontologies, AI systems can trace how a decision was made. This means that end users can understand the reasoning path, supported by defined relationships. This also means that bias and hallucinations can be more easily detected by auditing the semantic structure behind the results.
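As a simplified illustration of this kind of auditability, the sketch below (using rdflib over a toy graph; all URIs and properties are hypothetical) walks the defined relationships behind an answer, exposing the reasoning path from case to diagnosis to treatment.

```python
# Illustrative sketch: tracing a reasoning path with SPARQL over a toy graph.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")  # hypothetical namespace
g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:case42 ex:hasSymptom ex:fatigue ;
          ex:hasDiagnosis ex:anemia .
ex:anemia ex:recommendedTreatment ex:ironSupplement .
""", format="turtle")

# Why was this treatment suggested? Follow the defined relationships.
query = """
SELECT ?case ?diagnosis ?treatment WHERE {
  ?case ex:hasDiagnosis ?diagnosis .
  ?diagnosis ex:recommendedTreatment ?treatment .
}
"""
for row in g.query(query, initNs={"ex": EX}):
    print(f"{row.case} -> {row.diagnosis} -> {row.treatment}")
```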

As illustrated in the healthcare organization example, diagnoses can be tagged with medical industry taxonomies, while ontologies can help create relationships among symptoms, treatments, and outcomes. This can help physicians tailor treatments to individual patient needs by leveraging past patient cases and the collective expertise of other physicians. Similarly, a retail organization can enhance its customer service by implementing a chatbot linked to structured product taxonomies and ontologies that delivers consistent and explainable answers about products. More consistent and trustworthy customer interactions streamline end-user support and strengthen brand confidence.


Do We Really Need Taxonomies and Ontologies to be Successful With AI?

The examples above illustrate that explainability in AI really matters. Whether end users are patients, bank customers, or individuals requesting specific products or services, they all want more transparent, trustworthy, and human-centered AI experiences. Taxonomies and ontologies provide structure and connectedness to content, documents, data, expert knowledge, and overall business context, so that all of it is machine readable and findable by AI systems at the moment of need, ultimately creating meaningful interactions for end users.


Conclusion

Just like bones, joints, ligaments, and muscles in the human body, taxonomies and ontologies provide the essential structure and connection that allow AI systems to stand up to testing, be reliable, and perform with clarity. At EK, we have extensive experience identifying key knowledge assets and designing and implementing taxonomies and ontologies to successfully support AI initiatives. If you want to improve the Knowledge Intelligence (KI) of your existing or future AI applications and need help with your taxonomy and ontology efforts, don’t hesitate to get in touch with us.

How to Ensure Your Content is AI Ready

In 1996, Bill Gates declared “Content is King” because of its importance (and revenue generating potential) on the World Wide Web. Nearly 30 years later, content remains king, particularly when leveraged as a vital input for Enterprise AI. Having AI-ready content is critical to successful AI implementation because it decreases hallucinations and errors, improves the efficiency and scalability of the model, and ensures seamless integration with evolving AI technologies. Put simply: if your content isn’t AI-ready, your AI initiatives will fail, stall, or deliver low value.  

In a recent blog, “Top Ways to Get Your Content and Data Ready for AI,” Sara Mae O’Brien-Scott and Zach Wahl outlined an approach to ensuring your organization is ready to undertake an AI initiative. While that blog provided a broad view of AI readiness for all types of knowledge assets collectively, this blog leverages the same approach, zeroing in on actionable steps to ensure your content is ready for AI. Content, also known as unstructured information, is pervasive in every organization. In fact, for many organizations it comprises 80% to 90% of the total information held within the organization. Within that corpus of content there is a massive amount of value, but there also tends to be chaos. We’ve found that most organizations should only be actively maintaining 15–20% of their unstructured information, with the rest being duplicate, near-duplicate, outdated, or completely incorrect. Without taking steps to clean it up, contextualize it, and ensure it is properly accessible to the right people, your AI initiatives will flounder. The steps we detail below will enable you to implement Enterprise AI at your organization, minimizing the pitfalls and struggles many organizations have encountered while trying to implement AI.

1) Understand What You Mean by “Content” (Knowledge Asset Definition) 

In a previous blog, we discussed the many types of knowledge assets organizations possess, how they can be connected, and the collective value they offer. Identifying content, or unstructured information, as one of the types of knowledge assets to be included in your organization’s AI solutions will be a foregone conclusion for most. However, that alone is insufficient to manage scope and understand what needs to be done to ensure your content is AI-ready. There are many types of content, held in varied repositories, with much likely sprawling on existing file drives and old document management systems. 

Before embarking on an AI initiative, it is essential to focus on the content that addresses your highest priority use cases and will yield the greatest value, recognizing that more layers can be added iteratively over time. To maximize AI effectiveness, it is critical to ensure the content feeding AI models aligns with real user needs and AI use cases. Misaligned content can lead to hallucinations, inaccurate responses, or poor user experiences. The following actions help define content and prepare it for AI:

  • Identify the types of content that are critical for priority AI use cases.
  • Work with Content Governance Groups to identify content owners for future inclusion in AI testing. 
  • Map end-to-end user journeys to determine where AI interacts with users and the content touchpoints that need to be referenced by AI applications.
  • Inventory priority content across enterprise-wide source systems, breaking down knowledge asset silos and system silos (a minimal inventory sketch follows this list).
  • Flag where different assets serve the same intent to surface potential overlap or duplication, helping AI applications ingest only relevant content and minimizing noise during AI model training.
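
As a starting point for the inventory step above, here is a minimal sketch that walks file drives and records basic facts about each file for later prioritization. The mount points, fields, and output location are hypothetical placeholders.

```python
# Minimal content inventory sketch; paths and fields are illustrative.
import csv
import os
from datetime import datetime, timezone

SOURCE_ROOTS = ["/mnt/shared-drive", "/mnt/legacy-dms"]  # hypothetical mounts
FIELDS = ["path", "extension", "size_bytes", "modified_utc"]

rows = []
for root in SOURCE_ROOTS:
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            stat = os.stat(path)
            rows.append({
                "path": path,
                "extension": os.path.splitext(name)[1].lower(),
                "size_bytes": stat.st_size,
                "modified_utc": datetime.fromtimestamp(
                    stat.st_mtime, tz=timezone.utc).isoformat(),
            })

# Write a flat inventory that content owners can review and prioritize.
with open("content_inventory.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```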

What content means can vary significantly across organizations. For example, in a manufacturing company, content can take the form of operational procedures and inventory reports, while in a healthcare organization, it can include clinical case documentation and electronic health records. Understanding what content truly represents in an organization and identifying where it resides, often across siloed repositories, are the first steps toward enabling AI solutions to deliver complete and context-rich information to end users.

2) Ensure Quality (Content Cleanup)

Your AI model is only as good as what’s going into it. ‘Garbage in, garbage out,’ ‘a steady foundation makes a steady house’: there are any number of ways to say that if the content going into an AI model lacks quality, the outputs will too. Strong AI starts with strong content. Below, we have detailed both manual and automated actions that can be taken to improve the quality of your content, thereby improving your AI outcomes.

Content Quality

Content created without regard for quality is common in the everyday workflow. While this content might serve business-as-usual processes, it can be detrimental to AI initiatives. Therefore, it’s crucial to address content quality issues within your repositories. Steps you can take to improve content quality and accelerate content AI readiness include:

  • Automate content cleanup processes by leveraging a combination of human-led and system-driven approaches, such as auto-tagging content for update, archival, or removal.
  • Scan and index content using automated processes to detect potential duplication by comparing titles, file sizes, metadata, and semantic similarity (a minimal similarity sketch follows this list).
  • Apply similarity analysis to define business rules for deleting, archiving, or modifying duplicate or near-duplicate content.
  • Use analytics to flag low-use or no-use content.
  • Combine analytics and content age to determine a retention cut-off (such as removing any content older than 2 years).
  • Leverage semantic tools like Named Entity Recognition (NER) and Natural Language Processing (NLP) to apply expert knowledge and determine the accuracy of content.
  • Use NLP to detect overly complex sentence structure and enterprise specific jargon that may reduce clarity or discoverability.
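
For the similarity scan mentioned above, the following rough sketch uses TF-IDF cosine similarity (via scikit-learn) to flag likely near-duplicates for human review. The documents and the 0.8 threshold are illustrative only; a production pipeline would tune the threshold against a labeled sample and may add semantic embeddings.

```python
# Rough near-duplicate detector over plain-text documents (illustrative).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = {
    "travel_policy_v1.txt": "Employees must submit travel requests in advance.",
    "travel_policy_v2.txt": "Employees must submit travel requests two weeks in advance.",
    "cafeteria_menu.txt": "The cafeteria serves lunch from 11 a.m. to 2 p.m.",
}

names = list(docs)
tfidf = TfidfVectorizer().fit_transform(docs.values())
similarity = cosine_similarity(tfidf)

THRESHOLD = 0.8  # illustrative; tune against a labeled sample
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if similarity[i, j] >= THRESHOLD:
            print(f"Possible near-duplicates: {names[i]} <-> {names[j]} "
                  f"(score {similarity[i, j]:.2f})")
```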

Content Restructuring

In the blog Improve Enterprise AI with Semantic Content Management, we note that content in an organization exists on a continuum of structure depending on many factors. The same is true for the amount of content restructuring that may or may not need to happen to enable your AI use case. We recently saw with a client that introducing even basic structure to a document improved AI outcomes by almost 200%. However, this step requires clear goals and prioritization. Often, this part of ensuring your content is AI-ready happens iteratively: as the model is applied, you can determine what level of restructuring will best improve AI outcomes. Restructuring content to prepare it for AI involves activities such as:

  • Apply tags, such as heading structures, to unstructured content to improve AI outcomes and enhance the end-user experience.
  • Use an AI-assisted check to validate that heading structures and tags are being used appropriately and are machine readable, so that content can be ingested smoothly by AI systems.
  • Simplify and restructure content that has been identified as overly complex and could result in hallucinations or unsatisfactory responses generated by the AI model.
  • Focus on reformatting longer, text-heavy content to achieve a more linear, time-based, or topic-based flow and improve AI effectiveness. 
  • Develop repeatable structures that can be applied automatically to content during creation or retroactively to provide AI with relevant content in a consumable format (see the sketch after this list).
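
One way to apply a repeatable structure, sketched below under the assumption that content uses Markdown-style headings, is to split documents into heading-based components that an AI pipeline can ingest individually. Real content in other formats would need a corresponding parser.

```python
# Sketch: split a document into heading-based components (assumes "# " headings).
import re

def split_into_components(text: str) -> list[dict]:
    """Return a list of {"heading": ..., "body": ...} components."""
    components = []
    current = {"heading": "Untitled", "body": []}
    for line in text.splitlines():
        if re.match(r"^#{1,6} ", line):  # a new heading starts a new component
            if current["body"]:
                components.append(current)
            current = {"heading": line.lstrip("# ").strip(), "body": []}
        else:
            current["body"].append(line)
    components.append(current)
    return [
        {"heading": c["heading"], "body": "\n".join(c["body"]).strip()}
        for c in components
    ]

doc = "## Purpose\nExplains the policy.\n## Scope\nApplies to all staff."
for chunk in split_into_components(doc):
    print(chunk["heading"], "->", chunk["body"])
```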

In brief, cleaning up and restructuring content assets improves machine readability of content and therefore allows the AI model to generate stronger and more accurate outputs. To prioritize assets that need cleanup and restructuring, focus on activities and resources that will yield the highest return on investment for your AI solution. However, it is important to recognize that this may vary significantly across organizations, industries, and AI use cases. For example, an organization with a truly cross-functional use case, such as enterprise search, may prioritize deduplication of content to ensure information from different business areas doesn’t conflict when providing AI-generated responses. On the other hand, an organization with a more function-specific use case, such as streamlining legal contract review, may prioritize more hands-on content restructuring to improve AI comprehension.

3) Fill Gaps (Tacit Knowledge Capture)

Even with high-quality content, knowledge gaps across your full enterprise ecosystem can cause AI errors and introduce the risk of unreliable outcomes. Considering your AI use case, the questions you want to answer, the discovery you’ve completed in previous steps, and the actions detailed below, you can start to identify and fill the gaps that may exist.

Content Coverage 

Even with the best content strategy, it is not uncommon for different types of content to “fall through the cracks” and be unavailable or inaccessible for any number of reasons. Many organizations “don’t know what they don’t know,” so it can be difficult to begin this process. However, it is crucial to be aware of these content gaps, particularly when using LLMs, in order to avoid hallucinations. Actions you may take to ensure content coverage and accelerate your journey toward content AI readiness include:

  • Leverage systems analytics to assess user search behavior and uncover content gaps. This may include unused content areas of a repository, abandoned search queries, or searches that returned no results (see the sketch after this list).
  • Identify content gaps by using taxonomy analytics to find missing categories or underrepresented terms and, as a result, determine what content should be included.
  • Leverage SMEs and other end users during AI testing to evaluate AI-generated responses and identify areas where content may be missing. 
  • Use AI governance to ensure the model is transparent and can communicate with the user when it is not able to find a satisfactory answer.
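
The sketch below illustrates the search-analytics idea from the list above: mining a log for zero-result queries as content-gap candidates. The CSV layout and column names are hypothetical; real search platforms expose this data in their own formats.

```python
# Sketch: surface frequent zero-result queries as content-gap candidates.
import csv
from collections import Counter

zero_hits = Counter()
with open("search_log.csv", newline="") as f:  # hypothetical log export
    for row in csv.DictReader(f):
        if int(row["result_count"]) == 0:
            zero_hits[row["query"].strip().lower()] += 1

# The most frequent zero-result queries are strong content-gap candidates.
for query, count in zero_hits.most_common(10):
    print(f"{count:>4}  {query}")
```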

Fill the Gap

Once missing content has been identified from the information sources feeding the AI model, the real challenge is to fill in those gaps to prevent “hallucinations” and avoid the user frustration generated by incomplete or inaccurate answers. This may include creating new assets, locating existing ones, or applying other techniques that together can move the organization from AI to Knowledge Intelligence. Steps you may take to remediate the gaps and help your organization’s content be AI ready include:

  • Use link detection to uncover relationships across the content, identify knowledge that may exist elsewhere, and increase the likelihood of surfacing the right content. This can also inform later semantic tagging activities.
  • Analyze content repositories to identify sources where content flagged as “missing” could possibly exist.
  • Apply content transformation practices to “missing” content identified during the content repository analysis to ensure machine readability.
  • Conduct knowledge capture and transfer activities such as SME interviews, communities of practice, and collaborative tools to document tacit knowledge in the form of guides, processes, or playbooks. 
  • Institutionalize content that exists in private spaces that aren’t currently included in the repositories accessed by AI.
  • Create draft content using generative AI, making sure to include a human-in-the-loop step for accuracy. 
  • Acquire external content when gaps aren’t organization specific. Consider purchasing or licensing third-party content, such as research reports, marketing intelligence, and stock images.

By evaluating the content coverage for a particular use case, you can start to predict how well (or poorly) your AI model may perform. When critical content mostly exists in people’s heads, rather than in a documented, accessible format, the organization is exposed to significant risk. For example, for an organization deploying a customer-facing AI chatbot to help with case deflection in customer service centers, gaps in content can lead to false or misleading responses. If the chatbot tries to answer questions it wasn’t trained for, it could result in out-of-policy exceptions, financial loss, decreased customer trust, or lower retention due to inaccurate, outdated, or non-existent information. This example highlights why it is so important to identify and fill knowledge gaps to ensure your content is ready for AI.

4) Add Structure and Context (Semantic Components)

Once you have identified the relevant content for an AI solution, ensured its quality for AI, and addressed major content gaps for your AI use cases, the next step in getting content ready for AI involves adding structure and context to content by leveraging semantic components. Taxonomy and metadata models provide the foundational structure needed to categorize unstructured content and provide meaningful context. Business glossaries ensure alignment by defining terms for shared understanding, while ontology models provide contextual connections needed for AI systems to process content. The semantic maturity of all of these models is critical to achieve successful AI applications. 

Semantic Maturity of Taxonomy and Business Glossaries

Some organizations struggle with the state of their taxonomies when starting AI-driven projects. Organizations must actively design and manage taxonomies and business glossaries to properly support AI-driven applications and use cases. This is essential not only for short-term implementation of the AI solution, but most importantly for long-term success. Standardization and centralization of these models help balance organization-wide needs and domain-specific needs. Properly structured and annotated taxonomies are instrumental in preparing content for AI. Taking the following actions will ensure that you have the Semantic Maturity of Taxonomies and Business Glossaries needed to achieve AI-ready content:

  • Balance taxonomies across business areas to ensure organization-wide standardization, enabling smooth implementation of AI use cases and seamless integration of AI applications. 
  • Design hierarchical taxonomy structures with the depth and breadth needed to support AI use cases.
  • Refine concepts and alternative terms (synonyms and acronyms) in the taxonomy to more adequately describe and apply to priority AI content.
  • Align taxonomies with usability standards, such as ANSI/NISO Z39.19, and interoperability/machine-readability standards, such as SKOS, so that taxonomies are both human and machine readable (a minimal SKOS sketch follows this list).
  • Incorporate definitions and usage notes from an organizational business glossary into the taxonomy to enrich meaning and improve semantic clarity.
  • Store and manage taxonomies in a centralized Taxonomy Management System (TMS) to support scalable AI readiness.
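
As a minimal illustration of several of these practices, the sketch below builds a tiny SKOS taxonomy with rdflib: a hierarchy, a preferred label, alternative terms (synonyms), and a glossary-style definition. Concepts and labels are illustrative only.

```python
# Minimal SKOS taxonomy sketch; concepts and labels are illustrative.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import SKOS, RDF

EX = Namespace("http://example.org/taxonomy/")  # hypothetical namespace
g = Graph()
g.bind("skos", SKOS)

g.add((EX.Vehicle, RDF.type, SKOS.Concept))
g.add((EX.Vehicle, SKOS.prefLabel, Literal("Vehicle", lang="en")))

g.add((EX.Car, RDF.type, SKOS.Concept))
g.add((EX.Car, SKOS.prefLabel, Literal("Car", lang="en")))
g.add((EX.Car, SKOS.altLabel, Literal("Automobile", lang="en")))  # synonym
g.add((EX.Car, SKOS.altLabel, Literal("Auto", lang="en")))        # synonym
g.add((EX.Car, SKOS.broader, EX.Vehicle))  # hierarchy: Car narrower than Vehicle
g.add((EX.Car, SKOS.definition,
       Literal("A self-propelled road vehicle.", lang="en")))  # glossary entry

print(g.serialize(format="turtle"))  # human- and machine-readable output
```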

Semantic Maturity of Metadata 

Before content can effectively support AI-driven applications, organizations must also establish metadata practices to ensure that content has been sufficiently described and annotated. This involves not only establishing shared or enterprise-wide coordinated metadata models, but more importantly, applying complete and consistent metadata to that content. The following actions will ensure that the Semantic Maturity of your Metadata model meets the standards required for content to be AI ready:

  • Structure metadata models to meet the requirements of AI use cases, helping derive meaningful insights from tagged content.
  • Design metadata models that accurately represent different knowledge asset types (types of content) associated with priority AI use cases.
  • Apply metadata models consistently across all content source systems to enhance findability and discoverability of content in AI applications. 
  • Document and regularly update metadata models.
  • Store and manage metadata models in a centralized semantic repository to ensure interoperability and scalable reuse across AI solutions.

Semantic Maturity of Ontology

Just as with taxonomies, metadata, and business glossaries, developing semantically rich and precise ontologies is essential to achieve successful AI applications and to enable Knowledge Intelligence (KI) or explainable AI. Ontologies must be sufficiently expressive to support semantic enrichment, traceability, and AI-driven reasoning. They must be designed to accurately represent key entities, their properties, and relationships in ways that enable consistent tagging, retrieval, and interpretation across systems and AI use cases. By taking the following actions, your ontology model will achieve the level of semantic maturity needed for content to be AI ready:

  • Ensure ontologies accurately describe the knowledge domain for the in-scope content.
  • Define key entities, their attributes, and relationships in a way that supports AI-driven classification, recommendation, and reasoning.
  • Design modular and extensible ontologies for reuse across domains, applications, and future AI use cases.
  • Align ontologies with organizational taxonomies to support semantic interoperability across business areas and content source systems.
  • Annotate ontologies with rich metadata for human and machine readability.
  • Adhere to ontology standards such as OWL, RDF, or SHACL for interoperability with AI tools (see the validation sketch after this list).
  • Store ontologies in a central ontology management system for machine readability and interoperability with other semantic models.
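
To make the standards point concrete, here is a sketch of validating data against a SHACL shape, assuming the pySHACL library; the shape and data are toy examples, not a real enterprise model.

```python
# Sketch: SHACL validation with pySHACL (toy shape and data).
from pyshacl import validate

shapes_ttl = """
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/> .
ex:DocumentShape a sh:NodeShape ;
    sh:targetClass ex:Document ;
    sh:property [ sh:path ex:title ; sh:minCount 1 ] .
"""

data_ttl = """
@prefix ex: <http://example.org/> .
ex:doc1 a ex:Document .   # missing ex:title, so validation should fail
"""

conforms, _report_graph, report_text = validate(
    data_graph=data_ttl,
    shacl_graph=shapes_ttl,
    data_graph_format="turtle",
    shacl_graph_format="turtle",
)
print(conforms)      # False for this example
print(report_text)   # human-readable explanation of the violation
```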

Preparing content for AI is not just about organizing information; it’s about making it discoverable, valuable, and usable. Investing in semantic models and ensuring a consistent content structure lays the foundation for AI to generate meaningful insights. For example, if an organization wants to deliver highly personalized recommendations that connect users to specific content, building customized taxonomies, metadata models, business glossaries, and ontologies not only maximizes the impact of current AI initiatives, but also future-proofs content for emerging AI-driven use cases.

5) Semantic Model Application (Content Tagging)

Designing structured semantic models is just one part of preparing content for AI. Equally important is the consistent application of complete, high-quality metadata to organization-wide content. Metadata enrichment of unstructured content, especially across siloed repositories, is critical for enabling AI-powered systems to reliably discover, interpret, and utilize that content. The following actions to enhance the application of content tags will help you achieve content AI readiness:

  • Tag unstructured content with high-quality metadata to enhance interpretability in AI systems.
  • Ensure each piece of relevant content for the AI solution is sufficiently annotated, or in other words, it is labeled with enough metadata to describe its meaning and context. 
  • Promote consistent annotation of content across business areas and systems using tags derived from a centralized and standardized taxonomy. 
  • Leverage mechanisms, like auto-tagging, to enhance the speed and coverage of content tagging (a naive matching sketch follows this list).
  • Include a human-in-the-loop step in the auto-tagging process to improve accuracy of content tagging.
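
Below is a deliberately naive auto-tagging sketch: it matches taxonomy preferred and alternative labels against content text. A production pipeline would rely on NLP/NER and a managed taxonomy; the dictionary here is a hypothetical stand-in, and the human-in-the-loop review from the last bullet still applies.

```python
# Naive auto-tagging sketch; taxonomy and labels are hypothetical.
TAXONOMY = {
    "Vehicles": ["vehicle", "car", "automobile"],
    "Financing": ["loan", "financing", "credit"],
}

def suggest_tags(text: str) -> list[str]:
    """Suggest taxonomy concepts whose labels appear in the text."""
    lowered = text.lower()
    return [
        concept
        for concept, labels in TAXONOMY.items()
        if any(label in lowered for label in labels)
    ]

doc = "Our automobile loan calculator helps estimate monthly payments."
print(suggest_tags(doc))  # ['Vehicles', 'Financing'] -- pending human review
```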

Consistent content tagging provides an added layer of meaning and context that AI can use to deliver more complete and accurate answers. For example, an organization managing thousands of unstructured content assets across disparate repositories and aiming to deliver personalized content experiences to end users can more effectively tag content by leveraging a centralized taxonomy and an auto-tagging approach. As a result, AI systems can more reliably surface relevant content, extract meaningful insights, and generate personalized recommendations.

6) Address Access and Security (Unified Entitlements)

As Joe Hilger mentioned in his blog about unified entitlements, “successful semantic solutions and knowledge management initiatives help the right people see the right information at the right time.” But to achieve this, access permissions must be in place so that only authorized individuals have visibility into the appropriate content. Unfortunately, many organizations still maintain content in old repositories that don’t have the right features or processes to secure it, creating a significant risk for organizations pursuing AI initiatives. Therefore, now more than ever, it is important to properly secure content by defining and applying entitlements, preventing unauthorized access to highly sensitive content and, as a result, maintaining trust across the organization. The actions outlined below to enhance Unified Entitlements will accelerate your journey toward content AI readiness:

  • Define an enterprise-wide entitlement framework to apply security rules consistently across content assets, regardless of the source system.
  • Automate security by enforcing privileges across all systems and types of content assets using a unified entitlements solution (a simplified filter sketch follows this list).
  • Leverage AI governance processes to ensure that content creators, managers, and owners are aware of entitlements for the content they handle that needs to be consumed by AI applications.
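
A simplified sketch of the enforcement idea follows: filter content by entitlements before it ever reaches the AI system. The labels and access model are illustrative; real deployments would delegate these decisions to a policy engine.

```python
# Simplified entitlement filter; labels and access model are illustrative.
def authorized_subset(items: list[dict], user_entitlements: set[str]) -> list[dict]:
    """Return only items the user may see; feed ONLY these to the AI system."""
    return [i for i in items if i["access_label"] in user_entitlements]

corpus = [
    {"id": "faq-001", "access_label": "public"},
    {"id": "contract-42", "access_label": "legal-restricted"},
]
retrievable = authorized_subset(corpus, user_entitlements={"public"})
print([i["id"] for i in retrievable])  # ['faq-001']
```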

Entitlements are important because they ensure that content remains consistent, trustworthy, and reusable for AI systems. For example, if an organization developing a Generative AI solution stores documents and web content about products and clients across multiple SharePoint sites, content management systems, and webpages, inconsistent application of entitlements may represent a legal or compliance risk, potentially exposing outdated, or even worse, highly sensitive content to the wrong people. On the other hand, the correct definition and application of access permissions through a unified entitlements solution plays a key role in mitigating that risk, enabling operational integrity and scalability, not only for the intended Generative AI solution, but also for future AI initiatives.

7) Maintain Quality While Iteratively Improving (Governance)

Effective governance for AI solutions can be very complex because it requires coordination across systems and groups, not just within them, especially among content governance, semantic governance, and AI governance groups. This coordination is essential to ensure content remains up to date and accessible for users and AI solutions, and that semantic models are current and centrally accessible. 

AI Governance for Content Readiness 

Content Governance 

Not all organizations have supporting organizational structures with defined roles and processes to create, manage, and govern content that is aligned with cross-organizational AI initiatives. The existence of an AI Governance for Content Readiness Group ensures coordination with the traditional Content Governance Groups and provides guidance to content owners of the source systems on how to get content AI ready to support priority AI use cases. By taking the following actions, the AI Governance for Content Readiness Group will help ensure that you have the content governance practices required to achieve AI-ready content:

  • Define how content should be captured and managed in a way that is consistent, predictable, and interoperable for AI use cases.
  • Incorporate in your AI solution roadmap a step, delivered through the Content Governance Groups, to guide content owners of the source systems on what is required to get content AI ready for inclusion in AI models.
  • Provide guidance to the Content Governance Group on how to train and communicate with system owners and asset owners on how to prepare content for AI.
  • Take the technical and strategic steps necessary to connect content source systems to AI systems for effective content ingestion and interpretation.
  • Coordinate with the Content Governance Group to develop and adopt content governance processes that address content gaps identified through the detection of bias, hallucinations, or misalignment, or unanswered questions during AI testing.
  • Automate AI governance processes leveraging AI to identify content gaps, auto-tag content, or identify new taxonomy terms for the AI solution.

Semantic Models Governance

Similar to the importance of coordinating with the content governance groups, coordinating with semantic models governance groups is key for AI readiness. This involves establishing roles and responsibilities for the creation, ownership, management, and accountability of semantic models (taxonomy, metadata, business glossary, and ontology models) in relation to AI initiatives. This also involves providing clear guidance for managing changes in the models and communicating updates to those involved in AI initiatives. By taking the following actions, the AI Governance for Content Readiness Group will help ensure that your organization has the semantic governance practices required to achieve AI-ready content: 

  • Develop governance structures that support the development and evolution of semantic models in alignment with both existing and emerging AI initiatives.
  • Align governance roles (e.g., taxonomists, ontologists, semantic engineers, and AI engineers) with organizational needs for developing and maintaining semantic models that support enterprise-wide AI solutions.
  • Ensure that the systems used to manage taxonomies, metadata, and ontologies support enforcing permissions for accessing and updating the semantic models.
  • Work with the Semantic Models Governance Groups to develop processes that help remediate gaps in the semantic models uncovered during AI testing. This includes providing guidance on the recommended steps for making changes, suggested decision-makers, and implementation approaches.
  • Work with the Semantic Models Governance Groups to establish metrics and processes to monitor, tune, refine, and evolve semantic models throughout their lifecycle and stay up to date with AI efforts.
  • Coordinate with the Semantic Models Governance Groups to develop and adopt processes that address semantic model gaps identified through the detection of bias, hallucinations, or misalignment, or unanswered questions during AI solution testing.

For example, imagine an organization is developing business taxonomies and ontologies that represent skills, job roles, industries, and topics to support an Employee 360 View solution. It is essential to have a governance model in place with clearly defined roles, responsibilities, and processes to manage and evolve these semantic models as the AI solutions team ingests content from diverse business areas and detects gaps during AI testing. Therefore, coordination between the AI Governance for Content Readiness Group and the Semantic Models Governance Groups helps ensure that concepts, definitions, entities, properties, and relationships remain current and accurately reflect the knowledge domain for both today’s needs and future AI use cases.  

Conclusion

Unstructured content remains one of the most common knowledge assets in organizations. Getting that content ready to be ingested by AI applications is a balancing act. By cleaning it up, filling in gaps, applying rich semantic models to add structure and context, securing it with unified entitlements, and leveraging AI governance, organizations will be better positioned to succeed in their own AI journey. We hope that after reading this blog you have a better understanding of the actions you can take to ensure your organization’s content is AI ready. If you want to learn how our experts can help you achieve content AI readiness, contact us at info@enterprise-knowledge.com.

How to Ensure Your Data is AI Ready

Artificial intelligence has the potential to be a game-changer for organizations looking to empower their employees with data at every level. However, as business leaders look to initiate projects that incorporate data as part of their AI solutions, they frequently look to us to ask, “How do I ensure my organization’s data is ready for AI?” In the first blog in this series, we shared ways to ensure knowledge assets are ready for AI. In this follow-on article, we will address the unique challenges that come with connecting data—one of the most unique and varied types of knowledge assets—to AI. Data is pervasive in any organization and can serve as the key feeder for many AI use cases, so it is a high priority knowledge asset to ready for your organization.

The question of data AI readiness stems from the very real concern that when AI is pointed at data that isn’t correct or that doesn’t have the right context associated with it, organizations could face risks to their reputation, their revenue, or their customers’ privacy. Data brings additional nuance: it is often presented in formats that require transformation, lacks context, and frequently contains duplicates or near-duplicates with little explanation of their meaning. As a result, data (although seemingly already structured and ready for machine consumption) requires greater care than other forms of knowledge assets to become part of a trusted AI solution.

This blog focuses on the key actions an organization needs to perform to ensure their data is ready to be consumed by AI. By following the steps below, an organization can use AI-ready data to develop end-products that are trustworthy, reliable, and transparent in their decision making.

1) Understand What You Mean by “Data” (Data Asset and Scope Definition)

Data is more than what we typically picture it as. Broadly, data is any raw information that can be interpreted to garner meaning or insights on a certain topic. While the typical understanding of data revolves around relational databases and tables galore, often with esoteric metrics filling their rows and columns, data takes a number of forms, which can often be surprising. In terms of format, while data can be in traditional SQL databases and formats, NoSQL data is growing in usage, in forms ranging from key-value pairs to JSON documents to graph databases. Plain, unstructured text such as emails, social media posts, and policy documents are also forms of data, but traditionally not included within the enterprise definition. Finally, data comes from myriad sources—from live machine data on a manufacturing floor to the same manufacturing plant’s Human Resources Management System (HRMS). Data can also be categorized by its business role: operational data that drives day-to-day processes, transactional data that records business exchanges, and even purchased or third-party data brought in to enrich internal datasets. Increasingly, organizations treat data itself as a product, packaged and maintained with the same rigor as software, and rely on data metrics to measure quality, performance, and impact of business assets.

All these forms and types of data meet the definition of a knowledge asset—information and expertise that an organization can use to create value, which can be connected with other knowledge assets. No matter the format or repository type, ingested, AI-ready data can form the backbone of a valuable AI solution by allowing business-specific questions to be answered reliably in an explainable manner. This raises the question for organizational decision makers—what within our data landscape needs to be included in our AI solution? From your definition of what data is, start thinking of what to add iteratively. What systems contain the highest priority data? What datasets would provide the most value to end users? Select high-value data in easy-to-transform formats so end users can see the value in your solution. This can garner excitement across departments and help support future efforts to introduce additional data into your AI environment.

2) Ensure Quality (Data Cleanup)

The majority of organizations we’ve worked with have experienced issues with not knowing what data they have or what it’s intended to be used for. This is especially common in large enterprise settings as the sheer scale and variety of data can breed an environment where data becomes lost, buried, or degrades in quality. This sprawl occurs alongside another common problem, where multiple versions of the same dataset exist, with slight variations in the data they contain. Furthermore, the issue is exacerbated by yet another frequent challenge—a lack of business context. When data lacks context, neither humans nor AI can reliably determine the most up-to-date version, the assumptions and/or conditions in place when said data was collected, or even if the data warrants retention.

Once AI is introduced, these potential issues are only compounded. If an AI system is provided data that is out of date or of low quality, the model will ultimately fail to provide reliable answers to user queries. When data is collected for a specific purpose, such as identifying product preferences across customer segments, but not labeled for said use, and an AI model leverages that data for a completely separate purpose, such as dynamic pricing models, harmful biases can be introduced into the results that negatively impact both the customer and the organization.

Thankfully, there are several methods available to organizations today that allow them to inventory and restructure their data to fix these issues. Examples include data dictionaries, master data (MDM data), and reference data that help standardize data across an organization and point to what is available at large. Additionally, data catalogs are a proven tool to identify what data exists within an organization, and they include versioning and metadata features that can help label data with its version and context. To help populate catalogs and data dictionaries and to create MDM/reference data, performing a data audit alongside stewards can help rediscover lost context and label data for better understanding by humans and machines alike. Another way to deduplicate, disambiguate, and contextualize data assets is through lineage, a feature included in many metadata management tools that stores and displays metadata regarding source systems, creation and modification dates, and file contributors. Using this lineage metadata, data stewards can select which version of a data asset is the most current or relevant for a specific use case and expose only that asset to AI. These methods to ensure data quality and facilitate data stewardship feed into a larger governance framework. Finally, at a larger scale, a semantic layer can unify data and its meaning for easier ingestion into an AI solution, assist with deduplication efforts, and break down silos between different data users and consumers of knowledge assets at large.

Separately, for the elimination of duplicate/near-duplicate data, entity resolution can autonomously parse the content of data assets, deduplicate them, and point AI to the most relevant, recent, or reliable data asset to answer a question. 
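
As a toy illustration of entity resolution, the sketch below uses only the Python standard library to flag likely duplicate records: an exact match on a key field settles it; otherwise a fuzzy name comparison is applied. The fields and the 0.85 threshold are illustrative and would need tuning against real data.

```python
# Toy record-matching sketch; fields and threshold are illustrative.
from difflib import SequenceMatcher

def likely_same(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Flag two records as probable duplicates for steward review."""
    if rec_a["email"].strip().lower() == rec_b["email"].strip().lower():
        return True  # an exact match on a key field settles it
    name_score = SequenceMatcher(
        None, rec_a["name"].lower(), rec_b["name"].lower()
    ).ratio()
    return name_score >= threshold

a = {"name": "Jonathan Smith", "email": "j.smith@example.com"}
b = {"name": "Jonathon Smith", "email": "jon.smith@example.com"}
print(likely_same(a, b))  # True: near-identical names, flagged for review
```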

3) Fill Gaps (Data Creation or Acquisition)

With your organization’s data inventoried and priorities identified, it’s time to start identifying what gaps exist in your data landscape in light of the business questions and challenges you are looking to address. First, ask use case-based questions. Based on your identified use cases, what data would an AI model need to answer topical questions that your organization doesn’t already possess?

At a higher level, gaps in use cases for your AI solution will also exist. To drive use case creation forward, consider the use of a data model, entity relationship diagram (ERD), or ontology to serve as the conceptual map on which all organizational data exists. With a complete data inventory, an ontology can help outline the process by which AI solutions would answer questions at a high level, thanks to being both machine and human-readable. By traversing the ontology or data model, you can design user journeys and create questions that form the basis of novel use cases.

Often, gaps are identified that require knowledge assets outside of data to fill. A data model or ontology can help identify related assets, as they function independently of their asset type. Moreover, standardized metadata across knowledge assets and asset types can enrich assets, link them to one another, and provide insights previously not possible. When instantiated in a solution alongside a knowledge graph, this forms a semantic layer where data assets, such as data products or metrics, gain context and maturity based on related knowledge assets. We were able to enhance the performance of a large retail chain’s analytics team through such an approach utilizing a semantic layer.

To fill these gaps, organizations can look to collect or create more data, as well as purchase publicly available or open-source datasets (build vs. buy). Another common method of filling identified organizational gaps is the creation of content (and other non-data knowledge assets) via the extraction of tacit organizational knowledge. This is a method that more chief data officers/chief data and AI officers (CDOs/CDAOs) are employing as their roles expand and reliance on structured data alone to gather insights and solve problems is no longer feasible.

As a whole, this process will drive future knowledge asset collection, creation, and procurement efforts and consequently is a crucial step in ensuring data at large is AI ready. If no such data exists for AI to rely on for certain use cases, users will be presented unreliable, hallucination-based answers, or in a best-case scenario, no answer at all. Yet as part of a solid governance plan as mentioned earlier, the continuation of the gap analysis process post-solution deployment can empower organizations to continually identify—and close—knowledge gaps, continuously improving data AI readiness and AI solution maturity.

4) Add Structure and Context (Semantic Components)

A key component of making data AI-ready is structure—not within the data per se (e.g., JSON, SQL, Excel), but the structure relating the data to use cases. In our previous blog, ‘structure’ referred to meaning added to knowledge assets themselves; to avoid confusion, in this section ‘structure’ refers to the added, machine-readable context a semantic model gives data assets, rather than the format of the data assets, since data loses meaning once taken out of the structure or format it is stored in (e.g., when retrieved by AI).

Although we touched on one type of semantic model in the previous step, there are three semantic models that work together to ensure data AI readiness: business glossaries, taxonomies, and ontologies. Adding semantics to data for the purpose of getting it ready for AI allows an organization to help users understand the meaning of the data they’re working with. Together, taxonomies, ontologies, and business glossaries imbue data with the context needed for an AI model to fully grasp the data’s meaning and make optimal use of it to answer user queries. 

Let’s dive into the business glossary first. Business glossaries define business-context-specific terms that are often found in datasets in a plaintext, easy-to-understand manner. For AI models, which are often trained on general-purpose corpora, these glossary terms can further assist in the selection of the correct data needed to answer a user query.

Taxonomies group knowledge assets into broader and narrower categories, providing a level of hierarchical organization not available with traditional business glossaries. This can help data AI readiness in manifold ways. By standardizing terminology (e.g., referring to “automobile,” “car,” and “vehicle” all as “Vehicles” instead of separately), data from multiple sources can be integrated more seamlessly, disambiguated, and deduplicated for clearer understanding. 

Finally, ontologies provide the true foundation for linking related datasets to one another and allow for the definition of custom relationships between knowledge assets. When combining ontology with AI, organizations can perform inferences as a way to capture explicit data about what’s only implied by individual datasets. This shows the power of semantics at work, and demonstrates that good, AI-ready data enriched with metadata can provide insights at the same level and accuracy as a human. 
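
Here is a small sketch of that inference idea, assuming rdflib together with the owlrl package for RDFS reasoning; the classes and instance are toy examples. The fact that ex:alice is a Customer is never stated, only implied by the subclass relationship, and the reasoner makes it explicit.

```python
# Sketch: RDFS inference with owlrl over a toy graph (illustrative names).
import owlrl
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")  # hypothetical namespace
g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:PremiumCustomer rdfs:subClassOf ex:Customer .
ex:alice a ex:PremiumCustomer .
""", format="turtle")

# Expand the graph with RDFS entailments: "ex:alice a ex:Customer" is implied.
owlrl.DeductiveClosure(owlrl.RDFS_Semantics).expand(g)

result = g.query("ASK { ex:alice a ex:Customer }", initNs={"ex": EX})
print(result.askAnswer)  # True: the entailed triple is now explicit
```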

Organizations who have not pursued developing semantics for knowledge assets before can leverage traditional semantic capture methods, such as business glossaries. As organizations mature in their curation of knowledge assets, they are able to leverage the definitions developed as part of these glossaries and dictionaries, and begin to structure that information using more advanced modeling techniques, like taxonomy and ontology development. When applied to data, these semantic models make data more understandable, both to end users and AI systems. 

5) Semantic Model Application (Labeling and Tagging) 

The data management community has more recently been focused on the value of metadata and metadata-first architecture, and is scrambling to catch up to the maturity displayed in the fields of content and knowledge management. Through replicating methods found in content management systems and knowledge management platforms, data management professionals are duplicating past efforts. Currently, the data catalog is the primary platform where metadata is being applied and stored for data assets. 

To aggregate metadata for your organization’s AI readiness efforts, it’s crucial to look to data stewards as the owners of, and primary contributors to, this effort. Through the process of labeling data by populating fields such as asset descriptions, owner, assumptions made upon collection, and purposes, data stewards help to drive their data towards AI readiness while making tacit knowledge explicit and available to all. Additionally, metadata application against a semantic model (especially taxonomies and ontologies) contextualizes assets in business context and connects related assets to one another, further enriching AI-generated responses to user prompts. While there are methods to apply metadata to assets without the need for as much manual effort (such as auto-classification, which excels for content-based knowledge assets), structured data usually dictates the need for human subject matter experts to ensure accurate classification. 

With data catalogs and recent investments in metadata repositories, however, we’ve noticed a trend that we expect will continue to grow across organizations in the near future. Data system owners are increasingly keen to manage metadata and catalog their assets within the same systems where data is stored and used, adopting features that were previously exclusive to a data catalog. Major software providers are strategically acquiring or building semantic capabilities for this purpose, as underscored by the recent acquisition of multiple data management platforms by the creators of larger, flagship software products. As the data catalog evolves from a full, standalone application that stores and presents metadata into a component of a larger application that functions as a metadata store, the metadata repository is beginning to take hold as the predominant metadata management platform.

6) Address Access and Security (Unified Entitlements)

Applying semantic metadata as described above helps make data findable across an organization and contextualized with relevant datasets—but this needs to be balanced against security and entitlements considerations. Without regard for data security and privacy, AI systems risk pulling in data they shouldn’t have access to because access entitlements are mislabeled or missing, leading to leaks of sensitive information.

A common example of when this can occur is user re-identification. Data points that independently seem innocuous can, when combined by an AI system, leak information about an organization’s customers or users. With as few as 15 data points, information that was originally collected anonymously can be combined to identify an individual. Data elements like ZIP code or date of birth are not damaging on their own, but when combined, they can expose information about a user that should have been kept private. These concerns become especially critical in industries whose datasets cover small populations, such as rare disease treatment in the healthcare industry.
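The risk is easy to quantify with a k-anonymity-style check. The sketch below, using pandas with fabricated illustrative rows, counts how many quasi-identifier combinations in an “anonymized” table point to exactly one person:

    import pandas as pd

    # Hypothetical "anonymized" table: no names, but quasi-identifiers remain.
    df = pd.DataFrame({
        "zip_code":   ["20171", "20171", "22102", "22102"],
        "birth_date": ["1980-03-02", "1991-07-15", "1980-03-02", "1991-07-15"],
        "diagnosis":  ["A", "B", "C", "D"],
    })

    QUASI_IDENTIFIERS = ["zip_code", "birth_date"]
    group_sizes = df.groupby(QUASI_IDENTIFIERS).size()
    unique = (group_sizes == 1).sum()
    print(f"{unique} of {len(group_sizes)} combinations identify exactly one person")
    # 4 of 4 -- every row here is re-identifiable from just two fields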

EK’s unified entitlements work is focused on ensuring the right people and systems view the correct knowledge assets at the right time. This is accomplished through a holistic architectural approach with six key components. Components like a policy engine capture and enforce decisions about whether access to data should be granted, while components like a query federation layer ensure that only data a user is allowed to retrieve is brought back from the appropriate sources. 

The components of unified entitlements can be combined with other technologies like dark data detection, where a program scans an organization’s data landscape for any unlabeled information that is potentially sensitive, so that neither human users nor AI solutions can access data that could result in compliance violations or reputational damage. 
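As a simplified sketch of what a dark data scan might look like (the regex patterns and file layout are illustrative; production scanners layer far broader rule sets and ML-based detectors on top of this):

    import re
    from pathlib import Path

    PII_PATTERNS = {
        "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
        "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    }

    def scan_for_dark_data(root: str) -> dict[str, list[str]]:
        """Return {file_path: [pii_types]} for text files containing likely PII."""
        findings: dict[str, list[str]] = {}
        for path in Path(root).rglob("*.txt"):
            text = path.read_text(errors="ignore")
            hits = [name for name, rx in PII_PATTERNS.items() if rx.search(text)]
            if hits:
                findings[str(path)] = hits
        return findings

    # Flagged files go to stewards for labeling before any AI pipeline can index them.
    # print(scan_for_dark_data("/data/shared_drive"))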

As a whole, data that exposes sensitive information to the wrong set of eyes is not AI-ready. Unified entitlements can form the layer of protection that ensures data AI readiness across the organization.

7) Maintain Quality While Iteratively Improving (Governance)

Governance serves a vital purpose in ensuring data assets become, and remain, AI-ready. With the introduction of AI to the enterprise, we are now seeing governance manifest itself beyond the data landscape alone. As AI governance begins to mature as a field of its own, it is taking on its own set of key roles and competencies and separating itself from data governance. 

While AI governance is meant to guide innovation and future iterations while ensuring compliance with both internal and external standards, data governance personnel are taking on the new responsibility of ensuring data is AI-ready based on requirements set by AI governance teams. Barring the existence of AI governance personnel, data governance teams are meant to serve as a bridge in the interim. As such, your data governance staff should define a common model of AI-ready data assets and related standards (such as structure, recency, reliability, and context) for future reference. 
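One way to make such a common model actionable is to encode each standard as a simple automated check. The sketch below assumes hypothetical standards and thresholds; real ones would come from your governance team:

    from datetime import date, timedelta

    AI_READY_STANDARDS = {
        "structure":   lambda a: a["format"] in {"parquet", "csv", "json"},
        "recency":     lambda a: (date.today() - a["last_updated"]).days <= 365,
        "reliability": lambda a: bool(a["owner"]),
        "context":     lambda a: len(a["taxonomy_tags"]) > 0,
    }

    def failed_standards(asset: dict) -> list[str]:
        """Return the AI-readiness standards this asset does not yet meet."""
        return [name for name, check in AI_READY_STANDARDS.items() if not check(asset)]

    asset = {"format": "xlsx", "last_updated": date.today() - timedelta(days=30),
             "owner": "finance-data@example.com", "taxonomy_tags": []}
    print(failed_standards(asset))  # ['structure', 'context']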

Both data and AI governance personnel hold the responsibility of future-proofing enterprise AI solutions to ensure they continue to align with the above steps and meet requirements. Specific to data governance, organizations should ask themselves, “How do we update our data governance plan to ensure all of these steps remain applicable in perpetuity?” In parallel, AI governance should revolve around filling gaps in the solution’s capabilities. Once AI solutions launch to a production environment and user base, more gaps in the solution’s realm of expertise and capabilities will become apparent. As such, AI governance professionals need to stand up processes that use these gaps to continually identify new needs for knowledge assets, data or otherwise.

Conclusion

As we have explored throughout this blog, data is a uniquely varied form of knowledge asset, with its own set of considerations to take into account when standing up an AI solution. Following the steps above as part of an iterative implementation process will ensure your data is AI-ready and an invaluable part of an AI-powered organization.

If you’re seeking help to ensure your data is AI-ready, contact us at info@enterprise-knowledge.com.

The post How to Ensure Your Data is AI Ready appeared first on Enterprise Knowledge.

How to Fill Your Knowledge Gaps to Ensure You’re AI-Ready https://enterprise-knowledge.com/how-to-fill-your-knowledge-gaps-to-ensure-youre-ai-ready/ Mon, 29 Sep 2025 19:14:44 +0000 https://enterprise-knowledge.com/?p=25629

“If only our company knew what our company knows” has been a longstanding lament for leaders: organizations are routinely prevented from mobilizing their knowledge and capabilities toward their strategic priorities. Similarly, being able to locate knowledge gaps in the organization, whether we were initially aware of them (known unknowns) or initially unaware of them (unknown unknowns), represents an opportunity to gain new capabilities, mitigate risks, and navigate the ever-accelerating business landscape more nimbly.  

AI implementations are already showing signs of knowledge gaps: hallucinations, wrong answers, incomplete answers, and even “unanswerable” questions. There are multiple causes for AI hallucinations, but an important one is not having the right knowledge to answer a question in the first place. While LLMs may have been trained on massive amounts of data, that doesn’t mean they know your business, your people, or your customers. This is a common stumbling block when organizations make the leap from “Public AI” tools like ChatGPT, Gemini, or Copilot to building their own organization’s AI solutions. LLMs and agentic solutions need knowledge, your organization’s unique knowledge, to produce results that are unique to your and your customers’ needs, and to help employees navigate and solve the challenges they encounter in their day-to-day work. 

In a recent article, EK outlined key strategies for preparing content and data for AI. This blog post builds on that foundation by providing a step-by-step process for identifying and closing knowledge gaps, ensuring a more robust AI implementation.

 

The Importance of Bridging Knowledge Gaps for AI Readiness

EK lays out a six-step path to getting your content, data, and other knowledge assets AI-ready, yielding assets that are correct, complete, consistent, contextual, and compliant. The diagram below provides an overview of these six steps:

[Diagram] The six steps to AI readiness: 1) Define knowledge assets; 2) Conduct cleanup; 3) Fill knowledge gaps (we are here); 4) Enrich with context; 5) Add structure; 6) Protect the knowledge assets.

Identifying and filling knowledge gaps, the third step of EK’s path towards AI readiness, is crucial in ensuring that AI solutions have optimized inputs. 

Prior to filling gaps, an organization will have defined its critical knowledge assets and conducted a content cleanup. A content cleanup not only ensures the correctness and reliability of the knowledge assets, but also reveals the specific topics, concepts, or capabilities that the organization cannot currently supply to AI solutions as inputs.

This scenario presupposes that the organization has a clear idea of the AI use cases and purposes for its knowledge assets. Given the organization knows the questions AI needs to answer, an assessment to identify the location and state of knowledge assets can be targeted based on the inputs required. This assessment would be followed by efforts to collect the identified knowledge and optimize it for AI solutions. 

A second, more complicated, scenario arises when an organization hasn’t formulated a prioritized list of questions for AI to answer. The previously described approach, relying on drawing up a traditional knowledge inventory will face setbacks because it may prove difficult to scale, and won’t always uncover the insights we need for AI readiness. Knowledge inventories may help us understand our known unknowns, but they will not be helpful in revealing our unknown unknowns

 

Identifying the Gap

How can we identify something that is missing? At this juncture, organizations will need to leverage analytics, introduce semantics, and, if AI is already deployed, use it as a resource as well. There are different techniques to identify these gaps, depending on whether your organization has already deployed an AI solution or is ramping up for one. Available options include:

Before and After AI Deployment

Leveraging Analytics from Existing Systems

Monitoring and assessing different tools’ analytics is an established practice to understand user behavior. In this instance, EK applies these same methods to understand critical questions about the availability of knowledge assets. We are particularly interested in analytics that reveal answers to the following questions:

  • Where do our people give up when navigating different sections of a tool or portal? 
  • What sort of queries return no results?
  • What queries are more likely to get abandoned? 
  • What sort of content gets poor reviews, and by whom?
  • What sort of material gets no engagement? What did the user do or search for before getting to it? 

These questions aim to identify instances of users trying, and failing, to get knowledge they need to do their work. Where appropriate, these questions can also be posed directly to users via surveys or focus groups to get a more rounded perspective. 
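As a minimal sketch of this kind of analysis, the snippet below assumes a hypothetical CSV export of search logs with query and result_count columns, and surfaces the most frequent zero-result queries as candidate knowledge gaps:

    import csv
    from collections import Counter

    def zero_result_queries(log_path: str, top_n: int = 20) -> list[tuple[str, int]]:
        """Count the most frequent search queries that returned no results."""
        counts: Counter[str] = Counter()
        with open(log_path, newline="") as f:
            for row in csv.DictReader(f):  # expected columns: query, result_count
                if int(row["result_count"]) == 0:
                    counts[row["query"].strip().lower()] += 1
        return counts.most_common(top_n)

    # Each recurring zero-result query is a candidate knowledge gap to investigate.
    # print(zero_result_queries("search_log.csv"))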

Semantics

Semantics involve modeling an organization’s knowledge landscape with taxonomies and ontologies. When taxonomies and ontologies have been properly designed, updated, and consistently applied to knowledge, they are invaluable as part of wider knowledge mapping efforts. In particular, semantic models can be used as an exemplar of what should be there, and can then be compared with what is actually present, thus revealing what is missing.

We recently worked with a professional association within the medical field, helping them define a semantic model for their expansive amount of content, and then defining an automated approach to tagging these knowledge assets. As part of the design process, EK taxonomists interviewed experts across all of the association’s organizational functional teams to define the terms that should be present in the organization’s knowledge assets. After the first few rounds of auto-tagging, we examined the taxonomy’s coverage, and found that a significant fraction of the terms in the taxonomy went unused. We validated our findings with our clients’ experts, and, to their surprise, our engagement revealed an imbalance of knowledge asset production: while some topics were covered by their content, others were entirely lacking. 

Valid taxonomy terms or ontology concepts for which few to no knowledge assets exist reveal a knowledge gap where AI is likely to struggle.
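A simple way to operationalize this comparison is to count how many assets each taxonomy term has actually been applied to. The sketch below uses made-up terms and a hypothetical minimum-coverage threshold:

    def taxonomy_coverage_gaps(taxonomy_terms: set[str],
                               tagged_assets: dict[str, set[str]],
                               min_assets: int = 3) -> dict[str, int]:
        """Return taxonomy terms applied to fewer than min_assets knowledge assets."""
        usage = {term: 0 for term in taxonomy_terms}
        for tags in tagged_assets.values():
            for tag in tags & taxonomy_terms:
                usage[tag] += 1
        return {t: n for t, n in usage.items() if n < min_assets}

    taxonomy = {"Telehealth", "Billing", "Clinical Trials"}
    assets = {"doc-1": {"Billing"}, "doc-2": {"Billing", "Telehealth"},
              "doc-3": {"Billing"}}
    print(taxonomy_coverage_gaps(taxonomy, assets))
    # e.g. {'Telehealth': 1, 'Clinical Trials': 0} -- likely knowledge gaps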

After AI Deployment

User Engagement & Feedback

To ensure a solution can scale, evolve, and remain effective over time, it is important to establish formal feedback mechanisms for users to engage with system owners and governance bodies on an ongoing basis. Ideally, users should have a frictionless way to report an unsatisfactory answer immediately after they receive it, whether it is because the answer is incomplete or just plain wrong. A thumbs-up or thumbs-down icon has traditionally been used to solicit this kind of feedback, but organizations should also consider dedicated chat channels, conversations within forums, or other approaches for communicating feedback to which their users are accustomed.

AI Design and Governance 

Out-of-the-box, pre-trained language models are designed to prioritize providing a fluid response, often leading them to confidently generate answers even when their underlying knowledge is uncertain or incomplete. This core behavior increases the risk of delivering wrong information to users. However, this flaw can be preempted by thoughtful design in enterprise AI solutions: the key is to transform them from a simple answer generator into a sophisticated instrument that can also detect knowledge gaps. Enterprise AI solutions can be engineered to proactively identify questions which they do not have adequate information to answer and immediately flag these requests. This approach effectively creates a mandate for AI governance bodies to capture the needed knowledge. 
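A minimal sketch of this gap-flagging behavior in a retrieval-augmented solution, with hypothetical score thresholds and an in-memory backlog standing in for a real ticketing integration:

    GAP_BACKLOG: list[str] = []

    def log_knowledge_gap(question: str) -> None:
        """Stand-in for routing a question to the governance team's backlog."""
        GAP_BACKLOG.append(question)

    def answer_or_flag(question: str, retrieved_chunks: list[dict],
                       min_score: float = 0.75, min_chunks: int = 2) -> str:
        """Answer only when retrieval support is strong; otherwise record a gap."""
        strong = [c for c in retrieved_chunks if c["score"] >= min_score]
        if len(strong) < min_chunks:
            log_knowledge_gap(question)
            return "I don't have enough trusted information to answer this yet."
        # A real solution would pass the supporting chunks to the LLM here.
        return f"Answer drawn from {len(strong)} trusted sources."

    print(answer_or_flag("What is our parental leave policy in Brazil?", []))
    print(GAP_BACKLOG)  # ['What is our parental leave policy in Brazil?']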

AI can move beyond just alerting the relevant teams about missing knowledge. As we will soon discuss, AI holds additional capabilities to close knowledge gaps by inferring new insights from disparate, already-known information, and connecting users directly with relevant human experts. This allows enterprise AI to not only identify knowledge voids, but also begin the process of bridging them.

 

Closing the Gap

It is important, at this point, to make the distinction between knowledge that is truly missing from the organization and knowledge that is simply unavailable to the organization’s AI solution. The approach to close the knowledge gap will hinge on this key distinction. 

 

If the ‘missing’ knowledge is documented or recorded somewhere… but the knowledge is not in a format that AI can use, then:

Transform and migrate the present knowledge asset into a format that AI can more readily ingest. 

How this looks in practice:

A professional services firm had a database of meeting recordings meant for knowledge-sharing and disseminating lessons learned. The firm determined that there was a lot of knowledge “in the rough” that AI could incorporate into existing policies and procedures, but this was impossible to do by ingesting content in video format. EK engineers programmatically transcribed the videos, and then transformed the text into a machine-readable format. To make it truly AI-ready, we leveraged Natural Language Processing (NLP) and Named Entity Recognition (NER) techniques to contextualize the new knowledge assets by associating them with other concepts across the organization.
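The sketch below shows one plausible open-source version of such a pipeline, not the exact tooling used in the engagement: the openai-whisper package for transcription and spaCy for entity extraction.

    import whisper  # pip install openai-whisper
    import spacy    # pip install spacy; python -m spacy download en_core_web_sm

    def video_to_ai_ready_asset(video_path: str) -> dict:
        """Transcribe a recording, then enrich the text with named entities."""
        transcript = whisper.load_model("base").transcribe(video_path)["text"]
        doc = spacy.load("en_core_web_sm")(transcript)
        return {
            "source": video_path,
            "text": transcript,
            # Entities (people, orgs, products...) become metadata that links the
            # new asset to related concepts across the organization.
            "entities": sorted({(ent.text, ent.label_) for ent in doc.ents}),
        }

    # asset = video_to_ai_ready_asset("lessons_learned_2024-06.mp4")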

If the ‘missing’ knowledge is documented or recorded somewhere… but the knowledge exists in private spaces like email or closed forums, then:

Establish workflows and guidelines to promote, elevate, and institutionalize knowledge that had been previously informal.

How this looks in practice:

A government agency established online Communities of Practice (CoPs) to transfer and disseminate critical knowledge on key subject areas. Community members shared emerging practices and jointly solved problems. Community managers were able to ‘graduate’ informal conversations and documents into formal agency resources that lived within a designated repository, fully tagged, and actively managed. These validated and enhanced knowledge assets became more valuable and reliable for AI solutions to ingest.

If the ‘missing’ knowledge is documented or recorded somewhere… but the knowledge exists in different fragments across disjointed repositories, then: 

Unify the disparate fragments of knowledge by designing and applying a semantic model to associate and contextualize them. 

How this looks in practice:

A Sovereign Wealth Fund (SWF) collected a significant amount of knowledge about its investments, business partners, markets, and people, but kept this information fragmented and scattered across multiple repositories and databases. EK designed a semantic layer (composed of a taxonomy, ontology, and a knowledge graph) to act as a ‘single view of truth’. EK helped the organization define its key knowledge assets, like investments, relationships, and people, and wove together data points, documents, and other digital resources to provide a 360-degree view of each of them. We furthermore established an entitlements framework to ensure that every attribute of every entity could be adequately protected and surfaced only to the right end-user. This single view of truth became a foundational element in the organization’s path to AI deployment—it now has complete, trusted, and protected data that can be retrieved, processed, and surfaced to the user as part of solution responses. 
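As a toy illustration of how a knowledge graph weaves fragments together (using rdflib with fabricated entities, not the client’s actual model), a single query can traverse facts that originated in separate systems:

    from rdflib import Graph, Literal, Namespace, RDF

    EX = Namespace("http://example.org/")
    g = Graph()

    # Fragment from the investment system.
    g.add((EX.investment_42, RDF.type, EX.Investment))
    g.add((EX.investment_42, EX.managedBy, EX.person_ana))

    # Fragment from the HR system, about the same person.
    g.add((EX.person_ana, EX.name, Literal("Ana Pereira")))
    g.add((EX.person_ana, EX.role, Literal("Portfolio Manager")))

    # One query now spans both fragments for a 360-degree view.
    q = """
        SELECT ?name ?role WHERE {
          ?inv a ex:Investment ; ex:managedBy ?p .
          ?p ex:name ?name ; ex:role ?role .
        }"""
    for row in g.query(q, initNs={"ex": EX}):
        print(row["name"], "-", row["role"])  # Ana Pereira - Portfolio Manager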

If the ‘missing’ knowledge is not recorded anywhere… but the company’s experts hold this knowledge with them, then: 

Choose the appropriate techniques to elicit knowledge from experts during high-value moments of knowledge capture. It is important to note that we can begin incorporating agentic solutions to help the organization capture institutional knowledge, especially when agents can know or infer expertise held by the organization’s people. 

How this looks in practice:

Following a critical system failure, a large financial institution recognized an urgent need to capture the institutional knowledge held by its retiring senior experts. To address this challenge, they partnered with EK, who developed an AI-powered agent to conduct asynchronous interviews. This agent was designed to collect and synthesize knowledge from departing experts and managers by opening a chat with each individual and asking questions until the defined success criteria were met. This method allowed interviewees to contribute their knowledge at their convenience, ensuring a repeatable and efficient process for capturing critical information before the experts left the organization.

If the ‘missing’ knowledge is not recorded anywhere… and the knowledge cannot be found, then:

Make sure to clearly define the knowledge gap and its impact on the AI solution as it supports the business. When it has substantial effects on the solution’s ability to provide critical responses, then it will be up to subject matter experts within the organization to devise a strategy to create, acquire, and institutionalize the missing knowledge. 

How this looks in practice:

A leading construction firm needed to develop its knowledge and practices to be able to keep up with contracts won for a new type of project. Its inability to quickly scale institutional knowledge jeopardized its capacity to deliver, putting a significant amount of revenue at risk. EK guided the organization in establishing CoPs to encourage the development of repeatable processes, new guidance, and reusable artifacts. In subsequent steps, the firm could extract knowledge from conversations happening within the community and ingest them into AI solutions, along with novel knowledge assets the community developed. 

 

Conclusion

Identifying and closing knowledge gaps is no small feat, and predicting knowledge needs was nearly impossible before the advent of AI. Now, AI acts as both a driver and a solution, helping modern enterprises maintain their competitive edge.

Whether your critical knowledge is in people’s heads or buried in documents, Enterprise Knowledge can help. We’ll show you how to capture, connect, and leverage your company’s knowledge assets to their full potential to solve complex problems and obtain the results you expect out of your AI investments. Contact us today to learn how to bridge your knowledge gaps with AI.

The post How to Fill Your Knowledge Gaps to Ensure You’re AI-Ready appeared first on Enterprise Knowledge.

Top Ways to Get Your Content and Data Ready for AI https://enterprise-knowledge.com/top-ways-to-get-your-content-and-data-ready-for-ai/ Mon, 15 Sep 2025 19:17:48 +0000 https://enterprise-knowledge.com/?p=25370

As artificial intelligence has quickly moved from science fiction, to pervasive internet reality, and now to standard corporate solutions, we consistently get the question, “How do I ensure my organization’s content and data are ready for AI?” Pointing your organization’s new AI solutions at the “right” content and data is critical to AI success and adoption, and failing to do so can quickly derail your AI initiatives.  

Though the world is enthralled with the myriad of public AI solutions, many organizations struggle to make the leap to reliable AI in-house. A recent MIT report, “The GenAI Divide,” reveals a concerning truth: despite significant investments in AI, 95% of organizations are not yet seeing any benefits from them. 

One of the core impediments to achieving reliable AI within your own organization is poor-quality content and data. Without a proper foundation of high-quality content and data, any AI solution will be rife with ‘hallucinations’ and errors. This exposes organizations to unacceptable risks, as AI tools may deliver incorrect or outdated information, leading to dangerous and costly outcomes. It is also why tools that perform well as demos fail to make the jump to production. Even the most advanced AI won’t deliver acceptable results if an organization has not prepared its content and data.

This blog outlines seven top ways to ensure your content and data are AI-ready. With the right preparation and investment, your organization can successfully implement the latest AI technologies and deliver trustworthy, complete results.

1) Understand What You Mean by “Content” and/or “Data” (Knowledge Asset Definition)

While it seems obvious, the first step to ensuring your content and data are AI-ready is to clearly define what “content” and “data” mean within your organization. Many organizations use these terms interchangeably, while others use one as a parent term of the other. This obviously leads to a great deal of confusion. 

Leveraging the traditional definitions, we define content as unstructured information (ranging from files and documents to blocks of intranet text), and data as structured information (namely the rows and columns in databases and other applications like Customer Relationship Management systems, People Management systems, and Product Information Management systems). You are wasting the potential of AI if you’re not seeking to apply your AI to both content and data, giving end users complete and comprehensive information. In fact, we encourage organizations to think even more broadly, going beyond just content and data to consider all the organizational assets that can be leveraged by AI.

We’ve coined the term knowledge assets to express this. Knowledge assets comprise all the information and expertise an organization can use to create value. This includes not only content and data, but also the expertise of employees, business processes, facilities, equipment, and products. This manner of thinking quickly breaks down artificial silos within organizations, getting you to consider your assets collectively, rather than by type. Moving forward in this article, we’ll use the term knowledge assets in lieu of content and data to reinforce this point. Put simply and directly, each of the below steps to getting your content and data AI-ready should be considered from an enterprise perspective of knowledge assets, so rather than discretely developing content governance and data governance, you should define a comprehensive approach to knowledge asset governance. This approach will not only help you achieve AI-readiness, it will also help your organization to remove silos and redundancies in order to maximize enterprise efficiency and alignment.


2) Ensure Quality (Asset Cleanup)

We’ve found that most organizations are maintaining approximately 60-80% more information than they should, and in many cases, may not even be aware of what they still have. That means that as many as four out of five knowledge assets are old, outdated, duplicate, or near-duplicate. 

There are many costs to this over-retention even before considering AI, including the administrative burden of maintaining this excess (along with the cost and environmental impact of unnecessary server storage), and the usability and findability cost to the organization’s end users, who must wade through obsolete knowledge assets.

The AI cost becomes even higher for several reasons. First, AI typically “white labels” the knowledge assets it finds. If a human were to find an old and outdated policy, they may recognize the old corporate branding on it, or note the date from several years ago on it, but when AI leverages the information within that knowledge asset and resurfaces it, it looks new and the contextual clues are lost.

Next, we have to consider the old adage of “garbage in, garbage out.” Incorrect knowledge assets fed to an AI tool will result in incorrect outputs, also known as hallucinations. While prompt engineering can be used to try to avoid these conflicts and potential errors, the only surefire way to avoid this issue is to ensure the accuracy of the original knowledge assets, or at least the vast majority of them.

Many AI models also struggle with near-duplicate “knowledge assets,” unable to discern which version is trusted. Consider your organization’s version control issues, working documents, data modeled with different assumptions, and iterations of large deliverables and reports that are all currently stored. Knowledge assets may go through countless iterations, and most of the time, all of these versions are saved. When ingested by AI, multiple versions present potential confusion and conflict, especially when these versions didn’t simply build on each other but were edited to improve findings or recommendations. Each of these, in every case, is an opportunity for AI to fail your organization.

Finally, this is also the point at which you should consider restructuring your assets for improved readability, by both humans and machines. From a human perspective, this could include formatting improvements to lower cognitive lift and improve consistency. For both humans and AI, it could also mean adding text and tags to better describe images and other non-text-based elements. From an AI perspective, proximity and order can hurt precision in longer and more complex assets, so this could include restructuring documents to make them more linear, chronological, or topically aligned. This is not necessary or even important for all types of assets, but it remains an important consideration, especially for longer, text-based assets.


3) Fill Gaps (Tacit Knowledge Capture)

The next step to ensure AI readiness is to identify your gaps. At this point, you should be looking at your AI use cases and considering the questions you want AI to answer. In many cases, your current repositories of knowledge assets will not have all of the information necessary to answer those questions completely, especially in a structured, machine-readable format. This presents a risk itself, especially if the AI solution is unaware that it lacks the complete range of knowledge assets necessary and portrays incomplete or limited answers as definitive. 

Filling gaps in knowledge assets is extremely difficult. The first step is to identify what is missing. To invoke another old adage, organizations have long worried they “don’t know what they don’t know,” meaning they lack the organizational maturity to identify gaps in their own knowledge. This becomes a major challenge when proactively seeking to arm an AI solution with all the knowledge assets necessary to deliver complete and accurate answers. The good news, however, is that the process of getting knowledge assets AI-ready helps to identify gaps. In the next two sections, we cover semantic design and tagging. These steps, among others, can identify where there appears to be missing knowledge assets. In addition, given the iterative nature of designing and deploying AI solutions, the inability of AI to answer a question can trigger gap filling, as we cover later. 

Of course, once you’ve identified the gaps, the real challenge begins: the organization must then generate new knowledge assets (or locate “hidden” assets) to fill those gaps. There are many techniques for this, ranging from tacit knowledge capture to content inventories, which collectively can help an organization move from AI to Knowledge Intelligence (KI).    


4) Add Structure and Context (Semantic Components)

Once the knowledge assets have been cleansed and gaps have been filled, the next step in the process is to structure them so that they can be related to each other correctly, with the appropriate context and meaning. This requires the use of semantic components, specifically, taxonomies and ontologies. Taxonomies deliver meaning and structure, helping AI to understand queries from users, relate knowledge assets based on the relationships between the words and phrases used within them, and leverage context to properly interpret synonyms and other “close” terms. Taxonomies can also house glossaries that further define words and phrases that AI can leverage in the generation of results.

Though often confused or conflated with taxonomies, ontologies deliver a much more advanced type of knowledge organization, which is both complementary to taxonomies and unique. Ontologies focus on defining relationships between knowledge assets and the systems that house them, enabling AI to make inferences. For instance:

<Person> works at <Company>

<Zach Wahl> works at <Enterprise Knowledge>

<Company> is expert in <Topic>

<Enterprise Knowledge> is expert in <AI Readiness>

From this, a simple inference based on structured logic can be made, which is that the person who works at the company is an expert in the topic: Zach Wahl is an expert in AI Readiness. More detailed ontologies can quickly fuel more complex inferences, allowing an organization’s AI solutions to connect disparate knowledge assets within an organization. In this way, ontologies enable AI solutions to traverse knowledge assets, more accurately make “assumptions,” and deliver more complete and cohesive answers. 
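This kind of structured inference can be sketched in a few lines; here a single hand-written rule joins the two explicit facts above (the predicate names are illustrative):

    facts = {
        ("Zach Wahl", "works_at", "Enterprise Knowledge"),
        ("Enterprise Knowledge", "expert_in", "AI Readiness"),
    }

    # Rule: if X works_at Y and Y expert_in Z, then infer X expert_in Z.
    inferred = {
        (person, "expert_in", topic)
        for (person, p1, company) in facts if p1 == "works_at"
        for (company2, p2, topic) in facts if p2 == "expert_in" and company2 == company
    }
    print(inferred)  # {('Zach Wahl', 'expert_in', 'AI Readiness')}

At enterprise scale, a reasoner or graph database applies rules like this across millions of assets rather than a hand-coded set comprehension.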

Collectively, you can consider these semantic components as an organizational map of what it does, who does it, and how. Semantic components can show an AI how to get where you want it to go without getting lost or taking wrong turns.

5) Semantic Model Application (Tagging)

Of course, it is not sufficient simply to design the semantic components; you must complete the process by applying them to your knowledge assets. If the semantic components are the map, applying them as metadata is the GPS that allows you to use it easily and intuitively. This step is commonly a stumbling block for organizations, and again is why we are discussing knowledge assets rather than discrete areas like content and data. To best achieve AI readiness, all of your knowledge assets, regardless of their state (structured, unstructured, semi-structured, etc.), must have consistent metadata applied to them. 

When applied properly, this consistent metadata becomes an additional layer of meaning and context for AI to leverage in pursuit of complete and correct answers. With the latest updates to leading taxonomy and ontology management systems, the process of automatically applying metadata or storing relationships between knowledge assets in metadata graphs is vastly improved, though it still requires a human in the loop to ensure accuracy. Even so, what used to be a major hurdle in metadata application initiatives is much simpler than it once was.
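A stripped-down sketch of auto-classification against a taxonomy, with a toy synonym list and simple substring matching standing in for the statistical and ML methods commercial tools actually use:

    TAXONOMY = {
        "AI Readiness":    ["ai readiness", "ai-ready"],
        "Data Governance": ["data governance", "data steward"],
        "Semantic Layer":  ["semantic layer", "ontology", "taxonomy"],
    }

    def suggest_tags(text: str) -> list[str]:
        """Suggest taxonomy tags whose labels or synonyms appear in the text."""
        lowered = text.lower()
        return [tag for tag, synonyms in TAXONOMY.items()
                if any(s in lowered for s in synonyms)]

    doc = "Our data stewards maintain the ontology behind the semantic layer."
    print(suggest_tags(doc))  # ['Data Governance', 'Semantic Layer']
    # A human in the loop confirms or corrects suggestions before tags are saved.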


6) Address Access and Security (Unified Entitlements)

What happens when you finally deliver what your organization has been seeking, and give it the ability to collectively and completely serve end users the knowledge assets they’ve been looking for? If this step is skipped, the answer is calamity. One of the key value propositions of AI is that it can uncover hidden gems in knowledge assets, make connections humans typically can’t, and combine disparate sources to build new knowledge assets and new answers within them. This is incredibly exciting, but it also presents a massive organizational risk.

At present, many organizations have an incomplete, or frankly poor, model for entitlements, that is, for ensuring the right people see the right assets and the wrong people do not. We consistently discover highly sensitive knowledge assets in various forms on organizational systems that should be secured but are not. Some of this takes the form of a discrete document, or a row of data in an application, which is surprisingly common but relatively easy to address. Even more of it is only visible when you take an enterprise view of the organization. 

For instance, Database A might contain anonymized health information about employees for insurance reporting purposes but maps to discrete unique identifiers. File B includes a table of those unique identifiers mapped against employee demographics. Application C houses the actual employee names and titles for the organizational chart, but also includes their unique identifier as a hidden field. The vast majority of humans would never find this connection, but AI is designed to do so and will unabashedly generate a massive lawsuit for your organization if you’re not careful.

If you have security and entitlement issues with your existing systems (and trust me, you do), AI will inadvertently discover them, connect the dots, and surface knowledge assets and connections between them that could be truly calamitous for your organization. Any AI readiness effort must confront this challenge, before your AI solutions shine a light on your existing security and entitlements issues.


7) Maintain Quality While Iteratively Improving (Governance)

Steps one through six describe how to get your knowledge assets ready for AI, but the final step gets your organization ready for AI. With a massive investment in both getting your knowledge assets into the right state for AI and in the AI solution itself, the final step is to ensure the ongoing quality of both. Mature organizations will invest in a core team to take knowledge assets from AI-ready to AI-mature, including:

  • Maintaining and enforcing the core tenets to ensure knowledge assets stay up-to-date and AI solutions are looking at trusted assets only;
  • Reacting to hallucinations and unanswerable questions to fill gaps in knowledge assets; 
  • Tuning the semantic components to stay up to date with organizational changes.

The most mature organizations, those wishing to become AI-Powered organizations, will look first to their knowledge assets as the key building block to drive success. Those organizations will seek ROCK (Relevant, Organizationally Contextualized, Complete, and Knowledge-Centric) knowledge assets as the first line to delivering Enterprise AI that can be truly transformative for the organization. 

If you’re seeking help to ensure your knowledge assets are AI-ready, contact us at info@enterprise-knowledge.com. 

The post Top Ways to Get Your Content and Data Ready for AI appeared first on Enterprise Knowledge.

When Should You Use An AI Agent? Part One: Understanding the Components and Organizational Foundations for AI Readiness https://enterprise-knowledge.com/when-should-you-use-an-ai-agent/ Thu, 04 Sep 2025 15:39:43 +0000 https://enterprise-knowledge.com/?p=25285

It’s been recognized for far too long that organizations spend as much as 30-40% of their time searching for or recreating information. Now, imagine a dedicated analyst who doesn’t just look for or analyze data for you but also roams the office, listens to conversations, reads emails, and proactively sends you updates while spotting outdated data, summarizing new information, flagging inconsistencies, and prompting follow-ups. That’s what an AI agent does; it autonomously monitors content and data platforms, collaboration tools like Slack, Teams, and even email, and suggests updates or actions—without waiting for instructions. Instead of sending you on a massive data hunt to answer “What’s the latest on this client?”, an AI agent autonomously pulls CRM notes, emails, contract changes, and summarizes them in Slack or Teams or publishes findings as a report. It doesn’t just react, it takes initiative. 

The potential of AI agents for productivity gains within organizations is undeniable—and it’s no longer a distant future. However, the key question today is: when is the right time to build and deploy an AI agent, and when is simpler automation the more effective choice?

While the idea of a fully autonomous assistant handling routine tasks is appealing, AI agents require a complex framework to succeed. This includes breaking down silos, ensuring knowledge assets are AI-ready, and implementing guardrails to meet enterprise standards for accuracy, trust, performance, ethics, and security.

Over the past couple of years, we’ve worked closely with executives who are navigating what it truly means for their organizations to be “AI-ready” or “AI-powered”, and as AI technologies evolve, this challenge has only become more complex and urgent for all of us.

To move forward effectively, it’s crucial to understand the role of AI agents compared to traditional or narrow AI, automation, or augmentation solutions. Specifically, it is important to recognize the unique advantages of agent-based AI solutions, identify the right use cases, and ensure organizations have the best foundation to scale effectively.

In the first part of this two-part series, I’ll outline the core building blocks for organizations looking to integrate AI agents. The goal of this series is to provide insights that help set realistic expectations and contribute to informed decisions around AI agent integration—moving beyond technical experiments—to deliver meaningful outcomes and value to the organization.

Understanding AI Agents

AI agents are goal-oriented autonomous systems built from large language and other AI models, business logic, guardrails, and a supporting technology infrastructure needed to operate complex, resource-intensive tasks. Agents are designed to learn from data, adapt to different situations, and execute tasks autonomously. They understand natural language, take initiative, and act on behalf of humans and organizations across multiple tools and applications. Unlike traditional machine learning (ML) and AI automations (such as virtual assistants or recommendation engines), AI agents offer initiative, adaptability, and context-awareness by proactively accessing, analyzing, and acting on knowledge and data across systems.

 

[Infographic] AI agents: what they are, when to use them, and their limitations.

 

Components of Agentic AI Framework

1. Relevant Language and AI Models

Language models are the agent’s cognitive core, essentially its “brain”, responsible for reasoning, planning, and decision-making. While not every AI agent requires a Large Language Model (LLM), most modern and effective agents rely on LLMs and reinforcement learning to evaluate strategies and select the best course of action. LLM-powered agents are especially adept at handling complex, dynamic, and ambiguous tasks that demand interpretation and autonomous decision-making.

Choosing the right language model also depends on the use case, task complexity, desired level of autonomy, and the organization’s technical environment. Some tasks are better served by simpler, more deterministic workflows or specialized algorithms. For example, an expertise-focused agent (e.g., a financial fraud detection agent) is more effective when developed with purpose-built algorithms than with a general-purpose LLM, because the subject area requires hyper-specific, non-generalizable knowledge. Well-defined, repetitive tasks, such as data sorting, form validation, or compliance checks, can be handled by rule-based agents or classical machine learning models, which are cheaper, faster, and more predictable. LLMs, meanwhile, add the most value in tasks that require flexible reasoning and adaptation, such as orchestrating integrations with multiple tools, APIs, and databases to perform real-world actions: running a dynamic customer service process, placing trades, or interpreting incomplete and ambiguous information. In practice, we are finding that a hybrid approach works best.
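One way to picture the hybrid approach is as a simple router that sends each task to the cheapest component able to handle it reliably. The task types and component names below are purely illustrative:

    def route_task(task: dict) -> str:
        """Route a task to the cheapest component that can handle it reliably."""
        if task["type"] in {"form_validation", "compliance_check", "data_sorting"}:
            return "rules_engine"       # deterministic, fast, auditable
        if task["type"] == "fraud_scoring":
            return "specialized_model"  # purpose-built classifier
        if task.get("ambiguous") or task["type"] == "customer_request":
            return "llm_agent"          # flexible reasoning across tools
        return "rules_engine"           # default to the predictable option

    print(route_task({"type": "form_validation"}))                      # rules_engine
    print(route_task({"type": "customer_request", "ambiguous": True}))  # llm_agent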

2. Semantic Layer and Unified Business Logic

AI agents need access to a shared, consistent view of enterprise data to avoid conflicting actions, poor decision-making, or the reinforcement of data silos. Increasingly, agents will also need to interact with external data and coordinate with other agents, which compounds the risk of misalignment, duplication, or even contradictory outcomes. This is where a semantic layer becomes critical. By standardizing definitions, relationships, and business context across knowledge and data sources, the semantic layer provides agents with a common language for interpreting and acting on information, connecting agents to a unified business logic. Across several recent projects, implementing a semantic layer has improved the accuracy and precision of initial AI results from around 50% to between 80% and 95%, depending on the use case.

The semantic layer includes metadata management, business glossaries, and taxonomy/ontology/graph data schemas that work together to provide a unified and contextualized view of data across typically siloed systems and business units, enabling agents to understand and reason about information within the enterprise context. These semantic models define the relationships between data entities and concepts, creating a structured representation of the business domain the agent operates in and of how data relates to the business. By incorporating two or more of these semantic model components, the semantic layer provides the foundation for robust and effective agentic perception, cognition, action, and learning that can understand, reason, and act on organization-specific business data. For any AI, but specifically for AI agents, a semantic layer is critical in providing access to:

  • Organizational context and meaning to raw data to serve as a grounding ‘map’ for accurate interpretation and agent action;
  • Standardized business terms that establish a consistent vocabulary for business metrics (e.g., defining “revenue” or “store performance”), preventing confusion and ensuring the AI uses the same definitions as the business; and
  • Explainability and trust through metadata and lineage to validate and track why agent recommendations are compliant and safe to adopt.

Overall, the semantic layer ensures that all agents work from the same trusted source of truth, and enables them to exchange information coherently, align with organizational policies, and deliver reliable, explainable results at scale. In a multi-agent system with multiple domain-specific agents, not all agents may work off the same semantic layer, but each will have the organizational business context to interpret messages from the others, courtesy of the domain-specific semantic layers.

The bottom line is that, without this reasoning layer, the “black box” nature of agents’ decision-making processes erodes trust, making it difficult for organizations to adopt and rely on these systems.

3. Access to AI-Ready Knowledge Assets and Sources

Agents require accurate, comprehensive, and context-rich organizational knowledge assets to make sound decisions. Without access to high-quality, well-structured data, agents, especially those powered by LLMs, struggle to understand complex tasks or reason effectively, often leading to unreliable or “hallucinated” outputs. In practice, this means organizations making strides with effective AI agents need to:

  • Capture and codify expert knowledge in a machine-readable form that is readily interpretable by AI models so that tacit know-how, policies, and best practices are accessible to agents, not just locked in human workflows or static documents;
  • Connect structured and unstructured data sources, from databases and transactional systems to documents, emails, and wikis, into a connected, searchable layer that agents can query and act upon; 
  • Provide semantically enriched assets with well-managed metadata, consistent labels, and standardized formats to make them interoperable with common AI platforms; 
  • Align and organize internal and external data so agents can seamlessly draw on employee-facing knowledge (policies, procedures, internal systems) as well as customer-facing assets (product documentation, FAQs, regulatory updates) while maintaining consistency, compliance, and brand integrity; and
  • Enable access to AI assets and systems while maintaining strict controls over who can use it, how it is used, and where it flows.

Beyond static access to knowledge, this also means agents must query and interact dynamically with various sources of data and content. That includes connecting to applications, websites, content repositories, and data management systems, and taking direct actions, such as reading from and writing to enterprise applications, updating records, or initiating workflows.

Enabling this capability requires a strong design and engineering foundation, allowing agents to integrate with external systems and services through standard APIs, operate within existing security protocols, and respect enterprise governance and record compliance requirements. A unified approach, bringing together disparate data sources into a connected layer (see semantic layer component above), helps break down silos and ensures agents can operate with a holistic, enterprise-wide view of knowledge.

4. Instructions, Guardrails, and Observability

Organizations are largely unprepared for agentic AI due to several factors: the steep leap from traditional, predictable AI to complex multi-agent orchestration, persistent governance gaps, a shortage of specialized expertise, integration challenges, and inconsistent data quality, to name a few. Most critically, the ability to effectively control and monitor agent autonomy remains a fundamental barrier—posing significant security, compliance, and privacy risks. Recent real-world cases highlight how quickly things can go wrong, including tales of agents deleting valuable data, offering illegal or unethical advice, and amplifying bias in hiring decisions or in public-sector deployments. These failures underscore the risks of granting autonomous AI agents high-level permissions over live production systems without robust oversight, guardrails, and fail-safes. Until these gaps are addressed, autonomy without accountability will remain one of the greatest barriers to enterprise readiness in the agentic AI era.

As such, for AI agents to operate effectively within the enterprise, they must be guided by clear instructions, protected by guardrails, and monitored through dedicated evaluation and observability frameworks.

  • Instructions: Instructions define an AI agent’s purpose, goals, and persona. Agents don’t inherently understand how a specific business or organization operates. Instead, that knowledge comes from existing enterprise standards, such as process documentation, compliance policies, and operating models, which provide the foundational inputs for guiding agent behavior. LLMs can interpret these high-level standards and convert them into clear, step-by-step instructions, ensuring agents act in ways that align with organizational expectations. For example, in a marketing context, an LLM can take a general directive like, “All published content must reflect the brand voice and comply with regulatory guidelines”, and turn it into actionable instructions for a marketing agent. The agent can then assist the marketing team by reviewing a draft email campaign, identifying tone or compliance issues, and suggesting revisions to ensure the content meets both brand and regulatory standards.
  • Guardrails: Guardrails are safety measures that act as the protective boundaries within which agents operate. Agents need guardrails across different functions to prevent them from producing harmful, biased, or inappropriate content and to enforce security and ethical standards. These include relevance and output validation guardrails, personally identifiable information (PII) filters that detect unsafe inputs or prevent leakage of PII, reputation and brand alignment checks, privacy and security guardrails that enforce authentication, authorization, and access controls to prevent unauthorized data exposure, and guardrails against prompt attacks and content filters for harmful topics. 
  • Observability: Even with strong instructions and guardrails, agents must be monitored in real time to ensure they behave as expected. Observability includes logging actions, tracking decision paths, monitoring model outputs, cost monitoring and performance optimization, and surfacing anomalies for human review. A good starting point for managing agent access is mapping operational and security risks for specific use cases and leveraging unified entitlements (identity and access control across systems) to apply strict role-based permissions and extend existing data security measures to cover agent workflows.
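As a bare-bones sketch of the guardrail idea described above (a single regex-based PII redactor and a toy blocked-topic check; real deployments layer many more detectors on top):

    import re

    EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
    BLOCKED_TOPICS = ("medical advice", "legal advice")

    def apply_guardrails(draft: str) -> tuple[str, list[str]]:
        """Redact PII and flag policy issues before an agent's output is released."""
        violations = [t for t in BLOCKED_TOPICS if t in draft.lower()]
        redacted = EMAIL.sub("[REDACTED EMAIL]", draft)
        return redacted, violations

    draft = "Contact jane.doe@example.com for legal advice on the contract."
    safe_draft, flags = apply_guardrails(draft)
    print(safe_draft)  # Contact [REDACTED EMAIL] for legal advice on the contract.
    print(flags)       # ['legal advice']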

Together, instructions, guardrails, and observability form a governance layer that ensures agents operate not only autonomously, but also responsibly and in alignment with organizational goals. To achieve this, it is critical to plan for and invest in AI management platforms and services that define agent workflows, orchestrate these interactions, and supervise AI agents. Key capabilities to look for in an AI management platform include: 

  • Prompt chaining, where the output of one LLM call feeds the next, enabling multi-step reasoning (a minimal sketch follows this list); 
  • Instruction pipelines to standardize and manage how agents are guided;
  • Agent orchestration frameworks for coordinating multiple agents across complex tasks; and 
  • Evaluation and observability (E&O) monitoring solutions that offer features like content and topic moderation, PII detection and redaction, and protection against prompt injection or “jailbreaking” attacks. Furthermore, because training models involves iterative experimentation, tuning, and distributed computation, it is paramount to have benchmarks and business objectives defined from the onset in order to optimize model performance through evaluation and validation.
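As promised above, a minimal prompt-chaining sketch: each call consumes the previous call’s output. Here `llm` is any callable wrapping your model provider’s completion API; the stub below just echoes for demonstration:

    def chain(llm, document: str) -> str:
        """Prompt chaining: each LLM call feeds on the previous call's output."""
        summary = llm(f"Summarize the key obligations in this contract:\n{document}")
        risks = llm(f"List compliance risks implied by these obligations:\n{summary}")
        return llm(f"Draft a reviewer checklist addressing these risks:\n{risks}")

    fake_llm = lambda prompt: f"[model output for: {prompt[:40]}...]"
    print(chain(fake_llm, "Sample contract text"))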

In contrast to the predictable expenses of standard software, AI project costs are highly dynamic and often underestimated during initial planning. Many organizations are grappling with unexpected AI cost overruns due to hidden expenses in data management, infrastructure, and maintenance for AI. This can severely impact budgets, especially for agentic environments. Tracking system utilization, scaling resources dynamically, and implementing automated provisioning allows organizations to maintain consistent performance and optimization for agent workloads, even under variable demand, while managing cost spikes and avoiding any surprises.

Many traditional enterprise observability tools are now extending their capabilities to support AI-specific monitoring. Lifecycle management tools such as MLflow, Azure ML, Vertex AI, or Databricks help with the management of this process at enterprise scale by tracking model versions, automating retraining schedules, and managing deployments across environments. As with any new technology, the effective practice is to start with these existing solutions where possible, then close the gaps with agent-specific, fit-for-purpose tools to build a comprehensive oversight and governance framework.

5. Humans and Organizational Operating Models

There is no denying it—the integration of AI agents will transform ways of working worldwide. However, a significant gap still exists between the rapid adoption plans for AI agents and the reality on the ground. Why? Because too often, AI implementations are treated as technological experiments, with a focus on performance metrics or captivating demos. This approach frequently overlooks the critical human element needed for AI’s long-term success. Without a human-centered operating model, AI deployments continue to run the risk of being technologically impressive but practically unfit for organizational use.

Human Intervention and Human-In-the-Loop Validation: One of the most pressing considerations in integrating AI into business operations is the role of humans in overseeing, validating, and intervening in AI decisions. Agentic AI has the power to automate many tasks, but it still requires human oversight, particularly in high-risk or high-impact decisions. A transparent framework for when and how humans intervene is essential for mitigating these risks and ensuring AI complies with regulatory and organizational standards. Emerging practices we are seeing are showing early success when combining agent autonomy with human checkpoints, wherein subject matter experts (SMEs) are identified and designated as part of the “AI product team” from the onset to define the requirements for and ensure that AI agents consistently focus on and meet the right organizational use cases throughout development. 

Shift in Roles and Reskilling: For AI to truly integrate into an organization’s workflow, a fundamental shift in the fabric of an organization’s roles and operating model is becoming necessary. Many roles as we know them today are shifting—even for the most seasoned software and ML engineers. Organizations are starting to rethink their structure to blend human expertise with agentic autonomy. This involves redesigning workflows to allow AI agents to automate routine tasks while humans focus on strategic, creative, and problem-solving roles. 

Implementing and managing agentic AI requires specialized knowledge in areas such as AI model orchestration, agent–human interaction design, and AI operations. These skill sets are often underdeveloped in many organizations and, as a result, AI projects are failing to scale effectively. The gap isn’t just technical; it also includes a cultural shift toward understanding how AI agents generate results and the responsibility associated with their outputs. To bridge this gap, we are seeing organizations start to invest in restructuring data, AI, content, and knowledge operations/teams and reskilling their workforce in roles like AI product management, knowledge and semantic modeling, and AI policy and governance.

Ways of Working: To support agentic AI delivery at scale, it is becoming evident that agile methodologies must evolve beyond their traditional scope of software engineering and adapt to the unique challenges posed by AI development lifecycles. Agentic AI requires an agile framework that is flexible, experimental, and capable of iterative improvements. This further requires deep interdisciplinary collaboration across data scientists, AI engineers, software engineers, domain experts, and business stakeholders to navigate complex business and data environments.

Furthermore, traditional CI/CD pipelines, which focus on code deployment, need to be expanded to support continuous model training, testing, human intervention, and deployment. Integrating ML/AI Ops is critical for managing agent model drift and enabling autonomous updates. The successful development and large-scale adoption of agentic AI hinges on these evolving workflows that empower organizations to experiment, iterate, and adapt safely as both AI behaviors and business needs evolve.
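As one hedged example, a pipeline gate for drift might compare live model scores against a reference distribution before promoting a build; the sketch below uses a two-sample Kolmogorov-Smirnov test from SciPy, with illustrative data and thresholds.

```python
# A minimal sketch of a drift check a CI/CD pipeline could run before
# promoting an agent model; the threshold and data are illustrative.
from scipy.stats import ks_2samp

def drift_detected(reference_scores, live_scores, alpha: float = 0.05) -> bool:
    """Flag drift when the live score distribution diverges from the reference."""
    _, p_value = ks_2samp(reference_scores, live_scores)
    return p_value < alpha

reference = [0.90, 0.88, 0.91, 0.87, 0.92, 0.89]  # hypothetical eval scores at release
live = [0.78, 0.74, 0.80, 0.76, 0.79, 0.75]       # hypothetical scores in production

if drift_detected(reference, live):
    print("Drift detected: trigger retraining instead of promoting the build")
```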

Conclusion 

Agentic AI will not succeed through technology advancements alone. Given the inherent complexity and autonomy of AI agents, it is essential to evaluate organizational readiness and conduct a thorough cost-benefit analysis when determining whether an agentic capability is truly needed or merely a nice-to-have.

Success will ultimately depend on more than just cutting-edge models and algorithms. It also requires dismantling artificial, system-imposed silos between business and technical teams, while treating organizational knowledge and people as critical assets in AI design. Therefore, a thoughtful evolution of the organizational operating model and the seamless integration of AI into the business’s core are critical. This involves selecting the right project management and delivery frameworks, acquiring the most suitable solutions, implementing foundational knowledge and data management and governance practices, and reskilling, attracting, hiring, and retaining individuals with the necessary skill sets. These considerations make up the core building blocks for organizations to begin integrating AI agents.

The good news is that when built on the right foundations, AI solutions can be reused across multiple use cases, bridge diverse data sources, transcend organizational silos, and continue delivering value beyond the initial hype. 

Is your organization looking to evaluate AI readiness? How well does it measure up against these readiness factors? Explore our case studies and knowledge base to see how other organizations are tackling this, or get in touch to learn more about our approaches to content and data readiness for AI.

How KM Leverages Semantics for AI Success https://enterprise-knowledge.com/how-km-leverages-semantics-for-ai-success/ Wed, 03 Sep 2025 19:08:31 +0000


To get the most out of Large Language Model (LLM)-driven AI solutions, you need to provide them with structured, context-rich knowledge that is unique to your organization. Without purposeful access to proprietary terminology, clearly articulated business logic, and consistent interpretation of enterprise-wide data, LLMs risk delivering incomplete or misleading insights. This infographic highlights how KM incorporates semantic technologies and practices across scenarios to enhance AI capabilities, and when they are foundational: empowering your organization to strategically leverage semantics for more accurate, actionable outcomes while cultivating sound knowledge intelligence practices and investing in your enterprise's knowledge assets.

Use Case: Expert Elicitation (Semantics for AI Enhancement)
Efficiently capture valuable knowledge and insights from your organization's experts about past experiences and lessons learned, especially when these insights have not yet been formally documented. By using ontologies to spot knowledge gaps and taxonomies to clarify terms, an LLM can capture and structure undocumented expertise, storing it in a knowledge graph for future reuse.
Example: Capturing a senior engineer's undocumented insights on troubleshooting past system failures to streamline future maintenance.

Use Case: Discovery & Extraction (Semantics for AI Enhancement)
Quickly locate key insights or important details within a large collection of documents and data, synthesizing them into meaningful, actionable summaries delivered directly back to the user. Ontologies ensure concepts are recognized and linked consistently across wording and format, enabling insights to be connected, reused, and verified outside an LLM's opaque reasoning process.
Example: Scanning thousands of supplier agreements to locate variations of key contract clauses, despite inconsistent wording, then compiling a cross-referenced summary for auditors to accelerate compliance verification and identify high-risk deviations.

Use Case: Context Aggregation (Semantics for AI Foundations)
Gather fragmented information from diverse sources and combine it into a unified, comprehensive view of your business processes or critical concepts, enabling deeper analysis, more informed decisions, and previously unattainable insights. Knowledge graphs unify fragmented information from multiple sources into a persistent, coherent model that both humans and systems can navigate. Ontologies make relationships explicit, enabling the inference of new knowledge that reveals connections and patterns not visible in isolated data.
Example: Integrating financial, operational, HR, and customer support data to predict resource needs and reveal links between staffing, service quality, and customer retention for smarter planning.

Use Case: Cleanup and Optimization (Semantics for AI Enhancement)
Analyze and optimize your organization's knowledge base by detecting redundant, outdated, or trivial (ROT) content, then recommend targeted actions or automatically archive and remove irrelevant material to keep information fresh, accurate, and valuable. Leverage taxonomies and ontologies to recognize conceptually related information even when it is expressed in different terms, formats, or contexts, allowing the AI to uncover hidden redundancies, spot emerging patterns, and make more precise recommendations than keyword or RAG search alone could justify.
Example: Automatically detecting and flagging outdated or duplicative policy documents, despite inconsistent titles or formats, across an entire intranet, streamlining reviews and ensuring only current, authoritative content remains accessible.

Use Case: Situated Insight (Semantics for AI Enhancement)
Proactively deliver targeted answers and actionable suggestions uniquely aligned with each user's expressed preferences, behaviors, and needs, enabling swift, confident decision-making. Use taxonomies to standardize and reconcile data from diverse systems, and apply knowledge graphs to connect and contextualize a user's preferences, behaviors, and history, creating a unified, dynamic profile that drives precise, timely, and highly relevant recommendations.
Example: Instantly curating a personalized learning path (complete with recommended modules, mentors, and practice projects) based on an employee's recent performance trends, skill gaps, and long-term career goals, accelerating both individual growth and organizational capability.

Use Case: Context Mediation and Resolution (Semantics for AI Foundations)
Bridge disparate contexts across people, processes, and technologies into a common, resolved, machine-readable understanding that preserves nuance while eliminating ambiguity. Semantics establish a shared, machine-readable understanding that bridges differences in language, structure, and context across people, processes, and systems. Taxonomies unify terminology from diverse sources, while ontologies and knowledge graphs capture and clarify the nuanced relationships between concepts, eliminating ambiguity without losing critical detail.
Example: Reconciling varying medical terminologies, abbreviations, and coding systems from multiple healthcare providers into a single, consistent patient record, ensuring that every clinician sees the same unambiguous history and enabling faster diagnosis, safer treatment decisions, and more effective care coordination.

To learn more about our work with AI and semantics, and to help your organization make the most of these investments, don't hesitate to reach out at https://enterprise-knowledge.com/contact-us/.
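To make the Context Aggregation pattern above concrete, here is a minimal sketch that merges facts from two hypothetical systems into one knowledge graph using the rdflib library; all URIs, properties, and data are invented for illustration.

```python
# A minimal sketch of context aggregation with rdflib: records from two
# hypothetical systems (HR and support) are merged into one graph and
# queried together. All URIs, properties, and data are illustrative.
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")
g = Graph()

# Fact from an HR system: Dana works on the Billing team
g.add((EX.dana, EX.worksOn, EX.billingTeam))
# Fact from a support system: the Billing team owns the "payments" queue
g.add((EX.billingTeam, EX.ownsQueue, Literal("payments")))

# One query now spans both source systems
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?person ?queue WHERE {
        ?person ex:worksOn ?team .
        ?team ex:ownsQueue ?queue .
    }
""")
for person, queue in results:
    print(person, "supports queue", queue)
```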

Auto-Classification for the Enterprise: When to Use AI vs. Semantic Models https://enterprise-knowledge.com/auto-classification-when-ai-vs-semantic-models/ Tue, 26 Aug 2025 18:19:23 +0000

Auto-classification is a valuable process for adding context to unstructured content. Strictly speaking, some practitioners distinguish between auto-classification (placing content into pre-defined categories from a taxonomy) and auto-tagging (assigning unstructured keywords or metadata, sometimes generated without a taxonomy). In this article, I use ‘auto-classification’ in the broader sense, encompassing both approaches. While it can take many forms, its primary purpose remains the same: to automatically enrich content with metadata that improves findability, helps users immediately determine relevance, and provides crucial information on where content came from and when it was created. And while tagging content is always a recommended practice, it is not always scalable when human time and effort are required to perform it. To solve this problem, we have been helping organizations automate the process and minimize the manual effort required, especially in the age of AI, where organized and well-labeled information is the key to success.

This includes designing and implementing auto-classification solutions that save time and resources, using methods such as natural language processing, machine learning, and rapidly evolving AI models such as large language models (LLMs). In this article, I will demonstrate how auto-classification processes can deliver measurable value to organizations of all sizes and industries, using real-world examples to illustrate the costs and benefits. I will then give an overview of common methods for performing auto-classification, comparing their high-level strengths and weaknesses, and conclude by discussing how incorporating semantics can significantly enhance the performance of these methods.

How Can Auto-Classification Help My Organization?

It’s a good bet that your organization possesses a large repository of unstructured information such as documents, process guides, and informational resources, meant either for internal use or for display on a public webpage. Such a collection of knowledge assets is valuable, but only as valuable as the organization’s ability to effectively access, manage, and utilize it. That’s where auto-classification shines: by automatically processing your organization’s unstructured content and applying tags, an auto-classifier quickly adds structure that provides value in multiple ways, as outlined below.

Time Savings

First, an auto-classifier saves content creators time in two key ways. Manually reading through documents and applying metadata tags to each one is tedious and takes time away from content creators’ other responsibilities; auto-classification frees up that time for more crucial tasks. On the other end of the process, auto-classification and the use of metadata tags improve findability, saving employees time when searching for documents. When paired with a taxonomy or set list of terms, an auto-classifier can standardize the search experience by consistently tagging content with a set of standard language.

Content Management and Strategy

These standard tags can also play a role in more content strategy-focused efforts, such as identifying gaps in content and content deduplication. For example, if some taxonomy terms feature no associated content, content strategists and managers may identify an organizational gap that needs to be filled via the authoring of new content. In contrast, too many content pieces identified as having similar themes can be deduplicated so that the most valuable content is prioritized for end users. These analytics-based decisions can help organizations maximize the efficacy of their content, increase content reach, and cut down on the cost of storing duplicate content. 

Ensuring Security

Finally, we have seen auto-classification play a key role in keeping sensitive content and information secure. Auto-classifiers can determine which content should be tagged with certain sensitivity classifications (for example, employee addresses being tagged as visible to HR only). One example is dark data detection, where an auto-classifier parses all organizational content to identify information that should not be visible to all end users. Assigning sensitivity classifications to content through auto-tagging can automatically address security concerns and help ensure regulatory compliance, saving organizations from the reputational and legal costs associated with data leaks.
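As a minimal sketch of dark data detection, the snippet below scans text for strings shaped like email addresses or US Social Security numbers and returns sensitivity labels; the patterns and labels are illustrative and far from exhaustive.

```python
# A minimal sketch of pattern-based sensitive data detection: scan text for
# strings shaped like emails or US Social Security numbers and flag the
# document for restricted visibility. Patterns and labels are illustrative.
import re

SENSITIVE_PATTERNS = {
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def sensitivity_tags(text: str) -> list[str]:
    """Return the sensitivity labels whose patterns appear in the text."""
    return [label for label, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)]

doc = "Contact Jane at jane.doe@example.com; SSN on file: 123-45-6789."
print(sensitivity_tags(doc))  # -> ['email_address', 'ssn'] => tag as HR-only
```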

Common Auto-Classification Methods

An infographic about the six common auto-classification methods: rules-based tagging, regular expressions tagging, frequency-based tagging, natural language processing, machine learning-based tagging, LLM-based tagging

So, how do we go about tagging content automatically? Organizations can choose to employ one of a number of methods as a standalone solution, or combine them as part of a hybrid solution. Below, I will give a high-level overview of six of the most commonly used methods in auto-classification, along with some considerations for each; a brief code sketch after the list illustrates the frequency-based approach in practice.

1. Rules-Based Tagging: Uses deterministic rules to map content to tags. Rules can be built from dictionaries/keyword lists, proximity or co-occurrence patterns (e.g., “treatment” within 10 words of “disorder”), metadata values (author, department), or structural cues (headings, templates).

  • Considerations: Highly transparent and auditable; great for regulated/compliance use cases and domain terms with stable phrasing. However, rules can be brittle, require ongoing maintenance, and may miss implied meaning or novel phrasing unless rules are continually expanded.

2. Regular Expression (RegEx) Tagging: A specialized form of rules-based tagging that applies RegEx patterns to detect and tag structured strings (for example, SKUs, case numbers, ICD-10 codes, dates, or email addresses).

  • Considerations: Excellent precision for well-formed patterns and semi-structured content; lightweight and fast. Can produce false positives without careful validation of results. Best combined with other methods (such as frequency or NLP) for context checks.

3. Frequency-Based Tagging: Frequency-based tagging considers the number of times that a certain term (or variations of it) appears in a document, and assigns the most frequently appearing tags to the content. Early search engines, website indexers, and tag-mining software relied heavily on this approach for its simplicity and transparency; however, the frequency of a term does not always guarantee its importance.

  • Considerations: Works well with a well-structured taxonomy with ample synonyms for terms, as well as content that has key terms appear frequently. Not as strong a method when meaning is implied/terms are not explicitly used or terms are excessively repeated.

4. Natural Language Processing (NLP): Uses foundational language-processing techniques (such as tokenization and similarity scoring) to find the best matches by meaning between two pieces of text (such as a content piece and terms in a taxonomy).

  • Considerations: Can work well for terms that are not organization/domain-specific, but struggles with acronyms/more specific terms. Better than frequency-based tagging at determining implied meaning.

5. Machine Learning-Based Tagging: Machine learning methods allow for the training of models on pre-tagged content, empowering organizations to improve models iteratively for better results. By comparing new content against patterns they have already learned/been trained on, machine learning models can infer the most relevant concepts and tags to a content piece and apply them consistently. User input can help refine the classifier to identify patterns, trends, and domain-specific terms more accurately.

  • Considerations: A stock model may initially perform at a lower-than-expected level, while a well-trained model can deliver high-grade accuracy. However, this can come at the expense of time and computing resources.

6. Large Language Model (LLM)-Based Tagging: The newest form of auto-classification, this involves providing a large language model with a tagging prompt, content to tag, and a taxonomy/list of terms if desired. As interest around generative AI and LLMs grows, this method has become increasingly popular for its ability to parse more complex content pieces and analyze meaning deeply.

  • Considerations: Tags content like a human, meaning results may vary or become inconsistent if the same corpus is tagged multiple times. While LLMs can be smart regarding implied meaning and content sensitivity, they can be inconsistent without specific model tuning and prompt engineering. Additionally, LLMs suffer from accuracy/precision issues when fed a large taxonomy.

Some taxonomy and ontology management systems (TOMS), such as Graphwise PoolParty or Progress Semaphore, also offer auto-classification add-ons or extensions to their platforms that make use of one or more of these methods.

The Importance of Semantics in Auto-Classification

Imagine your repository of content as a bookstore, and your auto-classifier as the diligent (but easily confused!) store manager. You have a large number of books you want to sort into different categories, such as their audience (children, teen, adult) and genre (romance, fantasy, sci-fi, nonfiction).

Now, imagine if you gave your manager no instructions on how to sort the books. They start organizing too specifically. They put four books together on one shelf that says “Nonfiction books about history in 1814.” They put another three books on a shelf that says “Romance books in a fantasy universe with dragons.” They put yet another five books on a shelf that says “Books about knowledge management.” 

Before you know it, your bookstore has 1,098 shelves, and no happy customers. 

Therein lies the danger of tagging content without a taxonomy, leading to what’s known as semantic drift. While tagging without a taxonomy and creating an initial set of tags can be useful in some circumstances, such as when trying to generate tags or topics to later organize into a hierarchy as part of a taxonomy, it has its limitations. Tags often become very specific and struggle to maintain alignment in a way that makes them useful for search or for grouping larger amounts of content together. And, as I mentioned at the beginning of this article, auto-classification without a taxonomy in place is not auto-classification in the true sense of the word; rather, such approaches are auto-tagging, and may not produce the results business leaders/decision-makers expect.

I’ve seen this in practice when testing auto-classification methods with and without a taxonomy. When an LLM was given the same content corpus of 100 documents to tag, but in one case generated its own terms and in the other was given a taxonomy, the results differed greatly. The LLM without a taxonomy generated 765 extremely domain-specific terms that often applied to only a single content piece. In contrast, the LLM given a taxonomy tagged the content with 240 terms, allowing the same tags to apply to multiple content pieces and creating topic clusters of similar content that users can easily browse, search, and navigate. This makes discovery faster, more intuitive, and less fragmented than when every piece is labeled with unique, one-off terms.

Bar graph showing the precision, recall, and accuracy of LLMs with and without semantics

Overall, incorporating a taxonomy into LLM-based auto-classification transforms fragmented, messy one-off tags into consistent topic clusters and hierarchies that make content easier to browse, search, and discover.

This illustrates the utility of a taxonomy in auto-classification. When you give your store manager a list of shelves to stock, they can avoid the “overthinking” of semantic drift and place books onto well-architected shelves (e.g., Young Adult, Sci-Fi). A well-defined taxonomy acts as the blueprint for organizing content meaningfully and consistently using an auto-tagger.

 

When Should I Use AI, Semantic Models, or Both?

Bar graphs showing the accuracy, precision, and recall of different auto-classification methods. While results may vary by use case, methods that combine AI and semantic models tend to score higher across the board; these charts reflect one specific content corpus we tested internally.

As demonstrated above, tags created by generative AI models without any semantic model in place can become unwieldy and excessive, as LLMs look to create the best tag for each individual content piece rather than a tag that can serve as an umbrella term for multiple pieces of content. However, that does not completely eliminate AI as a standalone solution for all tagging use cases. These auto-tagging models and processes can prove helpful in the early stages of creating a term list, as a method of identifying common themes across a corpus and forming initial topic clusters that can later bring structure to a taxonomy, either in the form of hierarchies or facets. Once again, while not true auto-classification as the industry defines it, auto-tagging with AI alone can work well for domains where topics don’t fit neatly within a hierarchy, or where domain models and knowledge evolve so quickly that a hierarchical structure would be infeasible.

On the other hand, semantic models are a great way to add the aforementioned structure to an auto-classification process, and they work very well for exact or near-exact term matching. When combined with a frequency-based, NLP, or machine learning-based auto-classifier in these situations, they tend to excel in terms of precision, applying very few incorrect tags. Additionally, these methods perform well when content contains domain-specific jargon or acronyms found within semantic models, as they tag with a greater emphasis on exact matches.
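As one hedged illustration of pairing a semantic model with an NLP-style matcher, the sketch below scores taxonomy terms against content by embedding similarity; it assumes the sentence-transformers library and an open pretrained model, and the terms and threshold are illustrative.

```python
# A minimal sketch of embedding-based matching between content and taxonomy
# terms, assuming the sentence-transformers library; the model choice,
# taxonomy terms, and similarity threshold are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

taxonomy_terms = ["Contract Management", "Employee Onboarding", "Data Privacy"]
content = "Guide to reviewing supplier agreements and renewal clauses."

term_vectors = model.encode(taxonomy_terms, convert_to_tensor=True)
content_vector = model.encode(content, convert_to_tensor=True)

# Cosine similarity between the content and each candidate term
scores = util.cos_sim(content_vector, term_vectors)[0]
tags = [term for term, score in zip(taxonomy_terms, scores) if score > 0.3]
print(tags)  # likely ['Contract Management'] for this example
```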

Semantic models alone can prove to be a more cost-effective option for auto-classification as well, as lighter, less compute-heavy models that do not require paid cloud hosting can tag some content corpora with a high level of accuracy. Finally, semantic models can assist greatly in cases where security and compliance are paramount, as leading AI models are generally cloud-hosted, and most methods using semantics alone can be run on-premises without introducing privacy concerns.

Nonetheless, semantic models and AI can be combined into auto-classification solutions that are more robust and better equipped for complex use cases. LLMs can extract meaning from complex documents where topics may be implied and compare content against a taxonomy or term list, which helps ensure content is organized consistently with an organization’s model for knowledge. One key consideration with this method is taxonomy size: if a taxonomy grows too large (terms in the thousands, for example), an LLM may struggle to find and apply the right tag within a limited context window without mitigation strategies such as retrieving tags in batches, as sketched below.
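The sketch below shows what taxonomy-constrained LLM tagging with batching might look like, assuming the official OpenAI Python client; the model name, prompt wording, and taxonomy are illustrative assumptions rather than a definitive implementation.

```python
# A minimal sketch of taxonomy-constrained LLM tagging, sending the taxonomy
# in batches to respect context limits. Assumes the official OpenAI Python
# client; the model name, prompt, and taxonomy are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
TAXONOMY = ["Contract Management", "Data Privacy", "Employee Onboarding"]  # + many more

def tag_with_taxonomy(content: str, batch_size: int = 100) -> set[str]:
    tags: set[str] = set()
    for i in range(0, len(TAXONOMY), batch_size):
        batch = TAXONOMY[i : i + batch_size]
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model choice
            messages=[{
                "role": "user",
                "content": "Tag the content using ONLY these terms, "
                           "comma-separated; reply 'none' if none apply.\n"
                           f"Terms: {', '.join(batch)}\nContent: {content}",
            }],
        )
        reply = response.choices[0].message.content.strip()
        if reply.lower() != "none":
            tags.update(t.strip() for t in reply.split(","))
    return tags & set(TAXONOMY)  # keep only terms actually in the taxonomy

print(tag_with_taxonomy("Guide to reviewing supplier agreements and renewals."))
```

Validating the reply against the taxonomy, as in the final line, is the guardrail that keeps the LLM from drifting back into one-off, invented tags.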

In more advanced use cases, an LLM can also be paired with an ontology, which can help LLMs understand more about interrelationships between organizational topics, concepts, and terms, and apply tags to content more intelligently. For example, a knowledge base of clinical notes and guidelines could be paired with a medical ontology that maps symptoms to potential conditions, and conditions to recommended treatments. An LLM that understands this ontology could tag a physician’s notes with all three layers (symptoms, conditions, and treatments) so when a doctor searches for “persistent cough,” the system retrieves not just symptom references, but also likely diagnoses (e.g., bronchitis, asthma) and corresponding treatment protocols. This kind of ontology-guided tagging makes the knowledge base more searchable and user-friendly and helps surface actionable insights instead of isolated pieces of information.
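A minimal rdflib sketch of this ontology-guided lookup, with all URIs, relations, and medical pairings invented purely for illustration (not clinical guidance), could look like this:

```python
# A minimal sketch of ontology-guided lookup with rdflib, mirroring the
# clinical example above; all URIs, relations, and medical pairings are
# invented for illustration.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/med/")
g = Graph()
g.add((EX.persistent_cough, EX.symptomOf, EX.bronchitis))
g.add((EX.persistent_cough, EX.symptomOf, EX.asthma))
g.add((EX.bronchitis, EX.treatedBy, EX.rest_and_fluids))
g.add((EX.asthma, EX.treatedBy, EX.inhaled_bronchodilator))

# Follow symptom -> condition -> treatment across the ontology
results = g.query("""
    PREFIX ex: <http://example.org/med/>
    SELECT ?condition ?treatment WHERE {
        ex:persistent_cough ex:symptomOf ?condition .
        ?condition ex:treatedBy ?treatment .
    }
""")
for condition, treatment in results:
    print(condition, "->", treatment)
```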

In some cases, privacy or security concerns may dictate that AI cannot be used alongside a semantic model. In others, an organization may lack a semantic model and may only have the capacity to tag content with AI as a start. However, as a whole, the majority of use cases for auto-classification benefit from a well-architected solution that combines AI’s ability to intelligently parse content with the structure and specific context that semantic models provide.

Conclusion

Auto-classification offers an important layer of automation for organizations looking to enrich their content with metadata, whether for findability, analytics, or understanding. While there are many methods to choose from when exploring an auto-classification solution, they all rely on semantics, in the form of a well-designed taxonomy, to function at their best. Once implemented and governed correctly, these automated solutions can unblock human effort and redirect it away from tedious tagging processes, allowing your organization’s experts to get back to doing what matters most.

Looking to set up an auto-classification process within your organization? Want to learn more about auto-classification best practices? Contact us!

The Evolution of Knowledge Management & Organizational Roles: Integrating KM, Data Management, and Enterprise AI through a Semantic Layer https://enterprise-knowledge.com/the-evolution-of-knowledge-management-km-organizational-roles/ Thu, 31 Jul 2025 16:51:14 +0000

On June 23, 2025, at the Knowledge Summit Dublin, Lulit Tesfaye and Jess DeMay presented “The Evolution of Knowledge Management (KM) & Organizational Roles: Integrating KM, Data Management, and Enterprise AI through a Semantic Layer.” The session examined how KM roles and responsibilities are evolving as organizations respond to the increasing convergence of data, knowledge, and AI.

Drawing from multiple client engagements across sectors, Tesfaye and DeMay shared patterns and lessons learned from initiatives where KM, Data Management, and AI teams are working together to create a more connected and intelligent enterprise. They highlighted the growing need for integrated strategies that bring together semantic modeling, content management, and metadata governance to enable intelligent automation and more effective knowledge discovery.

The presentation emphasized how KM professionals can lead the way in designing sustainable semantic architectures, building cross-functional partnerships, and aligning programs with organizational priorities and AI investments. Presenters also explored how roles are shifting from traditional content stewards to strategic enablers of enterprise intelligence.

Session attendees walked away with:

  • Insight into how KM roles are expanding to meet enterprise-wide data and AI needs;
  • Examples of how semantic layers can enhance findability, improve reuse, and enable automation;
  • Lessons from organizations integrating KM, Data Governance, and AI programs; and
  • Practical approaches to designing cross-functional operating models and governance structures that scale.
