Metadata Articles - Enterprise Knowledge http://enterprise-knowledge.com/tag/metadata/

Data Quality and Architecture Enrichment for Insights Visualization https://enterprise-knowledge.com/data-quality-and-architecture-enrichment-for-insights-visualization/ Wed, 10 Sep 2025 18:39:35 +0000


The Challenge

A radiopharmaceutical imaging company faced challenges in monitoring patient statistics and clinical trial logistics. A lack of visibility into this data hindered conversations with leadership regarding the status of active clinical trials, ultimately putting clinical trial results at risk. The company needed a trusted, single location to ask relevant business questions of its data and to see trends or anomalies across multiple clinical trials. Doing so was difficult, however, because trial data was sent by various vendors in different formats, with no standardized values across trials. To mitigate these issues, the company engaged Enterprise Knowledge (EK) to provide Semantic Data Management Advisory & Development as part of a data normalization and portfolio reporting program. The engagement’s goal was to develop data visualization dashboards that answer critical business questions with cleaned, normalized, and trustworthy patient data from four clinical trials, depicted in an easy-to-understand and actionable manner.

The Solution

To unlock data insights across trials, EK designed and developed a Power BI dashboard that visualizes data from multiple trials in one centralized, accessible location. To begin development, EK met with the client to confirm the business questions the dashboards would answer, ensuring the dashboards would visually display the patient and trial information needed to answer them. To remedy the varying data formats sent by vendors, EK mapped data values from trial reports to one another, normalizing and enriching the data with metadata and lineage. With structure and standardization added to the data, the dashboards could display robust insights into patient status, with filterable trial-specific information for the clinical imaging team.
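As a simplified illustration of this kind of normalization, the sketch below maps hypothetical vendor-specific patient status values to a shared standard and tags each record with lineage metadata. The field names, values, and mapping are illustrative assumptions, not the client’s actual data or code.

```python
import pandas as pd

# Hypothetical mapping of vendor-specific values to a standardized vocabulary
STATUS_MAP = {
    "scrn_fail": "Screen Failure",
    "screen failure": "Screen Failure",
    "enr": "Enrolled",
    "enrolled": "Enrolled",
    "wd": "Withdrawn",
    "withdrawn": "Withdrawn",
}

def normalize_trial_report(df: pd.DataFrame, trial_id: str, vendor: str) -> pd.DataFrame:
    """Normalize one vendor's trial report and enrich it with lineage metadata."""
    out = df.copy()
    # Map each vendor's raw status values to the shared, standardized values
    out["patient_status"] = out["patient_status"].str.strip().str.lower().map(STATUS_MAP)
    # Enrich with metadata and lineage so dashboard insights can be traced to their source
    out["trial_id"] = trial_id
    out["source_vendor"] = vendor
    out["ingested_at"] = pd.Timestamp.now(tz="UTC")
    return out
```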

EK also worked to transform the company’s data management environment—developing a medallion architecture to handle historical files and enforcing data cleaning and standardization on raw data inputs—to ensure dashboard insights were accurate and could scale to future trials. Implementing these data quality pre-processing steps and architecture considerations prepared the company for future applications of reliable data, including the development of data products and the creation of a single view into the company-wide data landscape.

The EK Difference

To support the usage, maintenance, and future expansion of the data environment and data visualization tooling, EK developed knowledge transfer materials. These proprietary materials included a semantic modeling foundation in the form of a data dictionary that explains and defines dashboard fields and features, a proposed future medallion architecture, and roadshow materials to socialize and expand the usage of visualization tools to additional parts of the company that could benefit from them.

Dashboard Knowledge Transfer Framework
To ensure the longevity of the dashboard, especially with the future inclusion of additional trial data, it was essential to develop materials for future dashboard users and developers. The knowledge transfer framework designed by EK outlined a repeatable process for dashboard development, with enough detail that someone unfamiliar with the dashboards could understand the background, use cases, data inputs, visualization outputs, and overall purpose of the dashboarding effort. Instructions for dashboard upkeep, including how to update and add data to the dashboard as business needs evolve, were also provided.

Semantic Model Foundations: Data Dictionary
To semantically enhance the dashboards, all dashboard fields and features were cataloged and defined by EK experts in semantics and data analysis. In addition to definitions, the dictionary included purpose statements and calculation rules for each dashboard concept (where applicable). This data dictionary was created to prepare the client to process all trial information moving forward and serve as a reference for the data transformation process.
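To give a sense of what such an entry can look like, here is a minimal, hypothetical sketch of a data dictionary record with a definition, purpose statement, and calculation rule; the field and rule are illustrative only and do not reflect the client’s actual dictionary.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataDictionaryEntry:
    field_name: str
    definition: str
    purpose: str
    calculation_rule: Optional[str] = None  # populated only where applicable

# Hypothetical dashboard field; actual fields and rules depend on the trial data
enrolled_patients = DataDictionaryEntry(
    field_name="Enrolled Patients",
    definition="Count of patients whose status is 'Enrolled' in a given trial.",
    purpose="Tracks recruitment progress against the trial's enrollment targets.",
    calculation_rule="COUNT(patient_id) WHERE patient_status = 'Enrolled'",
)
```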

Proposed Future Architecture
To optimize data storage going forward, EK proposed a medallion architecture strategy consisting of Bronze, Silver, and Gold layers to preserve historical data and pave the way for more mature logging techniques. At the time EK engaged the client, no proper data storage was in place. EK’s architecture strategy detailed storage preparation considerations for each layer, including workspace creation, file retention policies, and options for ingesting and storing data, drawing on EK’s technical expertise and rich background in architecture strategies to provide expert advisory on the client’s future architecture.
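As a rough sketch of how a medallion flow like this can work, the code below lands raw files in a Bronze layer, applies cleaning and standardization in Silver, and produces aggregated, dashboard-ready tables in Gold. The paths, cleaning rules, and aggregation are hypothetical and intended only to show the layered pattern, not the proposed architecture itself.

```python
from pathlib import Path
import pandas as pd

BRONZE = Path("lake/bronze")  # raw vendor files, preserved as-is for history
SILVER = Path("lake/silver")  # cleaned, standardized records
GOLD = Path("lake/gold")      # aggregated, dashboard-ready tables

def promote_trial_file(raw_file: Path, trial_id: str) -> None:
    """Move one raw trial report through the Bronze -> Silver -> Gold layers."""
    for layer in (BRONZE / trial_id, SILVER, GOLD):
        layer.mkdir(parents=True, exist_ok=True)

    df = pd.read_csv(raw_file)

    # Bronze: persist the untouched input so historical files are never lost
    df.to_parquet(BRONZE / trial_id / f"{raw_file.stem}.parquet")

    # Silver: enforce cleaning and standardization on the raw input
    cleaned = df.dropna(subset=["patient_id"]).drop_duplicates()
    cleaned.to_parquet(SILVER / f"{trial_id}.parquet")

    # Gold: aggregate into the measures the dashboards report on
    summary = cleaned.groupby("patient_status").size().rename("patient_count").reset_index()
    summary.to_parquet(GOLD / f"{trial_id}_status_summary.parquet")
```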

Roadshow Materials
EK developed materials that summarized the mission and value of the clinical imaging dashboards. These materials included a high-level overview of the dashboard ecosystem so audiences of all backgrounds could understand the dashboard’s purpose and execution. With a knowledge management focus, the materials aimed to gain organizational buy-in for the dashboard and build awareness of the clinical imaging team and the importance of its work. The roadshow materials also sought to promote dashboard adoption and the future expansion of dashboarding into other areas of the company.

The Results

Before the dashboard, employees had to track down separate spreadsheets for each trial, sent from different sources and stored in at least four different locations. After the engagement, the company had a functional dashboard displaying on-demand data visualizations across four clinical trials from a single data repository, giving the clinical imaging team a seamless way to identify trial data and patient discrepancies early and often, and preventing errors that could have resulted in unusable trial data. Having multiple trials’ information available in one streamlined view dramatically reduced the time and effort employees had previously spent tracking down and manually analyzing raw, disparate data for insights, from as much as 1–2 hours every week to as little as 15 minutes. Clinical imaging managers are now able to quickly and confidently determine and share trusted trial insights with their leadership, enabling informed decision-making with the resources to explain where those insights were derived from.

In addition to the creation of the dashboard, EK helped develop a knowledge transfer framework and future architecture and data cleaning considerations, providing the company with a clear path to expand and scale usage to more clinical trials, other business units, and new business needs. In fact, the clinical imaging team identified at least four additional trials that, as a result of EK’s foundational work, can be immediately incorporated into the dashboard as the company sees fit.

Want to improve your organization’s data quality and architecture? Contact us today!


Emily Crockett Participating in “Using Storytelling to Transform User Assistance” Panel at ConVEx Ideas Conference https://enterprise-knowledge.com/emily-crockett-participating-in-panel-at-convex-ideas-conference/ Mon, 11 Aug 2025 20:52:39 +0000

Emily Crockett, Senior Content Engineering Consultant at Enterprise Knowledge, will be participating as an expert panelist at the upcoming ConVEx Ideas Conference. The Component Content Alliance panel, titled “Using Storytelling to Transform User Assistance,” will explore how structured content, metadata, and user insights come together to create meaningful narratives at scale. The panel will incorporate several unique voices in content, with Crockett representing the perspective of Knowledge Management and the understanding of content as an enterprise knowledge asset.

The session will be held online on Wednesday, September 17, from 9:00 to 10:00 AM PST. For more information and to register, visit here.

The Semantic Exchange Webinar Series Recap https://enterprise-knowledge.com/the-semantic-exchange-webinar-series-recap/ Mon, 11 Aug 2025 15:18:30 +0000

Promotional graphic for The Semantic Exchange webinar by Enterprise Knowledge, featuring six semantic experts as moderators and presenters.

Enterprise Knowledge recently completed the first round of our new webinar series The Semantic Exchange, which offers participants an opportunity to engage in Q&A with EK’s Semantic Design thought leaders. Participants were able to engage with EK’s experts on topics such as the value of enterprise semantic architecture, best practices for generating buy-in for semantics across an organization, and techniques for semantic solution implementation. The series sparked thoughtful discussion on how to understand and address real-world semantic challenges. 

To view any of the recorded sessions and their corresponding published work, use the links below:

 

  • Why Your Taxonomy Needs SKOS (Infographic) – Bonnie Griffin
  • What is Semantics and Why Does it Matter? (Blog) – Ben Kass
  • Metadata Within the Semantic Layer (Blog) – Kathleen Gollner
  • A Semantic Layer to Enable Risk Management (Case Study) – Yumiko Saito
  • Humanitarian Foundation – SemanticRAG POC (Case Study) – James Egan

If you are interested in bringing semantics and data modeling solutions to your organization, contact us here!

The Semantic Exchange: Humanitarian Foundation – SemanticRAG POC https://enterprise-knowledge.com/the-semantic-exchange-humanitarian-foundation-semanticrag-poc/ Thu, 17 Jul 2025 18:25:33 +0000


Enterprise Knowledge is concluding the first round of our new webinar series, The Semantic Exchange. In this webinar series, we follow a Q&A style to provide participants an opportunity to engage with our semantic design experts on a variety of topics about which they have written. This webinar is designed for a variety of audiences, ranging from those working in the semantic space as taxonomists or ontologists, to folks who are just starting to learn about structured data and content, and how they may fit into broader initiatives around artificial intelligence or knowledge graphs.

This 30-minute session invites you to engage with James Egan’s case study, Humanitarian Foundation – SemanticRAG POC. Come ready to hear and ask about:

  • How various types of organizations can leverage standards-based semantic graph technologies;
  • How leveraging semantics addresses data integration challenges; and
  • What value semantics can provide to an organization’s overall data ecosystem.

This webinar will take place on Wednesday, July 23, from 2:00 to 2:30 PM EDT. Can’t make it? The session will also be recorded and published to registered attendees. View the recording here!

The Semantic Exchange: A Semantic Layer to Enable Risk Management at a Multinational Bank https://enterprise-knowledge.com/the-semantic-exchange-a-semantic-layer-to-enable-risk-management/ Fri, 11 Jul 2025 17:02:13 +0000


Enterprise Knowledge is continuing our new webinar series, The Semantic Exchange with the fourth session. This session is designed for a variety of audiences, ranging from those working in the semantic space as taxonomists or ontologists, to folks who are just starting to learn about structured data and content, and how they may fit into broader initiatives around artificial intelligence or knowledge graphs.

This 30-minute session invites you to engage with Yumiko Saito’s case study, A Semantic Layer to Enable Risk Management at a Multinational Bank. Come ready to hear and ask about:

  • The challenges financial firms encounter with risk management;
  • The semantic solutions employed to mitigate these challenges; and
  • The value created by employing semantic layer solutions.

This webinar will take place on Thursday, July 17, from 1:00 to 1:30 PM EDT. Can’t make it? The session will also be recorded and published to registered attendees. View the recording here!

The Semantic Exchange: Metadata Within the Semantic Layer https://enterprise-knowledge.com/the-semantic-exchange-metadata-within-the-semantic-layer/ Tue, 01 Jul 2025 18:32:10 +0000


Enterprise Knowledge is pleased to introduce a new webinar series, The Semantic Exchange. This session is the third of a five-part series in which we invite fellow practitioners to tune in and hear more about work we’ve published from the authors themselves. In these moderated sessions, we invite you to ask the authors questions in a short, accessible format. Think of the series as a chance for a little semantic snack!

This session is designed for a variety of audiences, ranging from those working in the semantic space as taxonomists or ontologists, to folks who are just starting to learn about structured data and content, and how they may fit into broader initiatives around artificial intelligence or knowledge graphs.

This 30-minute session invites you to engage with Kathleen Gollner’s blog, Metadata Within the Semantic Layer. Come ready to hear and ask about:

  • Why metadata is foundational for a semantic layer;
  • How to optimize metadata for use across knowledge assets, systems, and use cases; and
  • How metadata can be leveraged in AI solutions.

This webinar will take place on Wednesday, July 9, from 1:00 to 1:30 PM EDT. Can’t make it? The session will also be recorded and published to registered attendees. View the recording here!

Rebecca Wyatt to Present on Context-Aware Structured Content to Mitigate Hallucinations at ConVEx conference https://enterprise-knowledge.com/rebecca-wyatt-present-2025/ Mon, 31 Mar 2025 17:50:48 +0000

Rebecca Wyatt, Partner and Division Director for Content Strategy and Operations at Enterprise Knowledge, will be delivering a presentation on Context-Aware Structured Content to Mitigate Hallucinations at the ConVEx conference, which takes place April 7-9 in San Jose, CA. 

Wyatt will focus on techniques for ensuring that structured content remains tightly coupled with its source context, whether through improved ontologies, metadata-driven relationships, or content validation against trusted sources, in order to avoid the risks of hallucinations.

By the end of the session, attendees will have a deeper understanding of how to future-proof their content and make it both AI-ready and hallucination-resistant, fostering more accurate and trustworthy outputs from LLMs.

For more information on the conference, check out the schedule or register here.

Incorporating Unified Entitlements in a Knowledge Portal https://enterprise-knowledge.com/incorporating-unified-entitlements-in-a-knowledge-portal/ Wed, 12 Mar 2025 17:37:34 +0000

Recently, we have had a great deal of success developing a certain breed of application for our customers—Knowledge Portals. These knowledge-centric applications holistically connect an organization’s information—its data, content, people and knowledge—from disparate source systems. These portals provide a “single pane of glass” to enable an aggregated view of the knowledge assets that are most important to the organization. 

The ultimate goal of the Knowledge Portal is to provide the right people access to the right information at the right time. This blog focuses on the first part of that statement—“the right people.” This securing of information assets is called entitlements. As our COO Joe Hilger eloquently points out, entitlements are vital in “enabling consistent and correct privileges across every system and asset type in the organization.” The trick is to ensure that an organization’s security model is maintained when aggregating this disparate information into a single view so that users only see what they are supposed to.

 

The Knowledge Portal Security Challenge

The Knowledge Portal’s core value lies in its ability to aggregate information from multiple source systems into a single application. However, any access permissions established outside of the portal—whether in the source systems or an organization-wide security model—need to be respected. There are many considerations to take into account when doing this. For example, how does the portal know:

  • Who am I?
  • Am I the same person specified in the various source systems?
  • Which information should I be able to see?
  • How will my access be removed if my role changes?

Once I have logged in, the portal needs to know that I have Role A in the content management system, Role B in the HR system, and Role C in the financial system. Since the portal aggregates information from these systems, it uses this information to ensure that what I see in the portal reflects what I would see in any of the individual systems.

 

The Tenets of Unified Entitlements in a Knowledge Portal

At EK, we have a common set of principles that guide us when implementing entitlements for a Knowledge Portal. They include:

  • Leveraging a single identity via an Identity Provider (IdP).
  • Creating a universal set of groups for access control.
  • Respecting access permissions set in source systems when available.
  • Developing a security model for systems without access permissions.

 

Leverage an Identity Provider (IdP)

When I first started working in search over 20 years ago, most source systems had their own user stores—the feature that allows a user to log into a system and uniquely identifies them within the system. One of the biggest challenges for implementing security was correctly mapping a user’s identity in the search application to their various identities in the source systems sending content to the search engine.

Thankfully, enterprise-wide Identity Providers (IdPs) like Okta, Microsoft Entra ID (formerly Azure Active Directory), and Google Cloud Identity are ubiquitous these days. An IdP is like a digital doorkeeper for your organization: it identifies who you are and shares that information with your organization’s applications and systems.

By leveraging an IdP, I can present myself to all my applications with a single identifier such as “cmarino@enterprise-knowledge.com.” For the sake of simplicity in mapping my identity within the Knowledge Portal, I’m not “cmarino” in the content management system, “marinoc” in the HR system, and “christophermarino” in the financial system.

Instead, all of those systems, including the Knowledge Portal, recognize me as “cmarino@enterprise-knowledge.com,” and the portal’s subsequent decision to provide or deny access to information is greatly simplified. The portal needs to know who I am in all systems to make these determinations.
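In practice, the portal usually learns that identifier from a token issued by the IdP. Below is a minimal, hypothetical sketch of pulling the identifier out of an OIDC-style ID token payload; a real implementation must validate the token’s signature, issuer, audience, and expiry with the IdP’s own SDK before trusting any claim.

```python
import base64
import json

def identity_from_id_token(id_token: str) -> str:
    """Extract the user's single identifier from an OIDC-style ID token (illustration only).

    A production portal must verify the token's signature, issuer, audience,
    and expiry using the IdP's library before trusting these claims.
    """
    # A JWT is three base64url-encoded segments: header.payload.signature
    payload_segment = id_token.split(".")[1]
    padded = payload_segment + "=" * (-len(payload_segment) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    # Many IdPs expose the login identifier in 'email' or 'preferred_username'
    return claims.get("email") or claims.get("preferred_username")
```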

 

Create Universal Groups for Access Control

Working hand in hand with an IdP, establishing a set of universally used groups for access control is a critical step toward enabling Unified Entitlements. These groups are typically created within your IdP and should reflect the common groupings needed to enforce your organization’s security model. For instance, you might choose to create groups based on a department, a project, or a business unit. Most systems provide great flexibility in how these groups are created and managed.

These groups are used for a variety of tasks, such as:

  • Associating relevant users to groups so that security decisions are based on a smaller, manageable number of groups rather than on every user in your organization.
  • Enabling access to content by mapping appropriate groups to the content.
  • Serving as the unifying factor for security decisions when developing an organization’s security model.

As an example, we developed a Knowledge Portal for a large global investment firm that used Microsoft Entra ID as its IdP. Within Entra ID, we created a set of groups based on structures like business units, departments, and organizational roles. Access permissions were applied to content via these groups, whether in the source system or through an external security model that we developed. When a user logged in to the portal, we identified them and their group membership and used that in combination with the permissions of the content. Best of all, once they moved off a project or into a different department or role, a simple change to their group membership in the IdP cascaded down to their access permissions in the Knowledge Portal.

 

Respect Permissions from Source Systems

The first two principles have focused on identifying a user and their roles. The other key piece of the entitlements puzzle, however, rests with the content. Most source systems natively provide functionality to control access to content by setting access permissions. Examples include SharePoint for your organization’s sensitive documents, ServiceNow for tickets available only to a certain group, and Confluence pages viewable only by a specific project team.

When a security model already exists within a source system, the goal of integrating that content within the Knowledge Portal is simple: respect the permissions established in the source. The key here is syncing your source systems with your IdP and then leveraging the groups managed there. When specifying access to content in the source, use the universal groups. 

Thus, when the Knowledge Portal collects information from the source system, it pulls not only the content and its applicable metadata but also the content’s security information. The permissions are stored alongside the content in the portal’s backend and used to determine whether a specific user can view specific content within the portal. The permissions become just another piece of metadata by which the content can be filtered.
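One simplified way to picture this pattern: the allowed groups travel with each item into the portal’s backend, and every query is filtered by the current user’s group membership. The sketch below is an illustrative assumption of how that filter might look in application code; a production portal would typically push the same filter down into its search engine.

```python
from dataclasses import dataclass, field

@dataclass
class PortalItem:
    title: str
    source_system: str
    allowed_groups: set = field(default_factory=set)  # permissions pulled from the source system

def visible_items(items, user_groups):
    """Return only the items the user's group memberships entitle them to see."""
    user_groups = set(user_groups)
    return [item for item in items if item.allowed_groups & user_groups]

# Hypothetical content and groups, for illustration only
items = [
    PortalItem("Q3 Financial Forecast", "SharePoint", {"finance-dept"}),
    PortalItem("Onboarding Guide", "Confluence", {"all-employees"}),
]
print([i.title for i in visible_items(items, ["all-employees", "project-alpha"])])
# -> ['Onboarding Guide']
```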

 

Develop Security Model for Unsupported Systems

Occasionally, there will be source systems where access permissions are not or cannot be supported. In this case, you will need your own internal security model, either one you develop yourself or one managed with an entitlements tool. Instead of being stored within the source system, the entitlements will be managed through this internal model.

The steps to accomplish this include:

  • Identify the tools needed to support unified entitlements;
  • Build the models for applying the security rules; and
  • Develop the integrations needed to automate security with other systems. 

The process to implement this within the Knowledge Portal would remain the same: store the access permissions with the content (mapped using groups) and use these as filters to ensure that users see only the information they should.

 

Conclusion

Getting unified entitlements right for your organization plays a large part in a successful Knowledge Portal implementation. If you need proven expertise to help you manage access to your organization’s valuable information, contact us.

The “right people” in your organization will thank you.

The Resource Description Framework (RDF) https://enterprise-knowledge.com/the-resource-description-framework-rdf/ Mon, 24 Feb 2025 19:33:53 +0000

Simply defined, a knowledge graph is a network of entities, their attributes, and how they’re related to one another. While these networks can be captured and stored in a variety of formats, most implementations leverage a graph-based tool or database. However, within the world of graph databases, there are a variety of syntaxes, or flavors, that can be used to represent knowledge graphs. One of the most popular and ubiquitous is the Resource Description Framework (RDF), which provides a means to capture meaning, or semantics, in a way that is interpretable by both humans and machines.

What is RDF?

The Resource Description Framework (RDF) is a semantic web standard used to describe and model information for web resources or knowledge management systems. RDF consists of “triples,” or statements, with a subject, predicate, and object that resemble an English sentence. For example, take the English sentence: “Bess Schrader is employed by Enterprise Knowledge.” This sentence has:

  • A subject: Bess Schrader
  • A predicate: is employed by 
  • An object: Enterprise Knowledge

Bess Schrader and Enterprise Knowledge are two entities that are linked by the relationship “employed by.” An RDF triple representing this information would look like this:
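In a machine-readable form, that triple could be written out with a library such as Python’s rdflib; the namespace and identifiers below are illustrative placeholders rather than a prescribed model.

```python
from rdflib import Graph, Namespace

EX = Namespace("https://example.com/")  # hypothetical namespace for illustration

g = Graph()
g.bind("ex", EX)

# Subject: Bess Schrader, Predicate: employed by, Object: Enterprise Knowledge
g.add((EX.Bess_Schrader, EX.employedBy, EX.Enterprise_Knowledge))

print(g.serialize(format="turtle"))
# ex:Bess_Schrader ex:employedBy ex:Enterprise_Knowledge .
```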

What is the goal of using RDF?

RDF is a semantic web standard, and thus has the goal of representing meaning in a way that is interpretable by both humans and machines. As humans, we process information through a combination of our experience and logical deduction. For example, I know that “Washington, D.C.” and “Washington, District of Columbia” refer to the same concept based on my experience in the world – at some point, I learned that “D.C.” was the abbreviation for “District of Columbia.” On the other hand, if I were to encounter a breathing, living object that has no legs and moves across the ground in a slithering motion, I’d probably infer that it was a snake, even if I’d never seen this particular object before. This determination would be based on the properties I associate with snakes (animal, no legs, slithers).

Unlike humans, machines have no experience on which to draw conclusions, so everything needs to be explicitly defined in order for a machine to process information this way. For example, if I want a machine to infer the type of an object based on properties (e.g. “that slithering object is a snake”), I need to define what a snake is and what properties it has. If I want a machine to reconcile that “Washington, D.C.” and “Washington, District of Columbia” are the same thing, I need to define an entity that uses both of those labels.

RDF allows us to create robust semantic resources, like ontologies, taxonomies, and knowledge graphs, where the meaning behind concepts is well defined in a machine readable way. These resources can then be leveraged for any use case that requires context and meaning to connect and unify data across disparate formats and systems, such as semantic layers and auto-classification.

How does RDF work?

Let’s go back to our single triple representing the fact that “Bess Schrader works at Enterprise Knowledge.”

We can continue building out information about the entities in our (very small) knowledge graph by giving all of our subjects and objects types (which indicate the general category/class that an entity belongs to) and labels (which capture the language used to refer to the entity).

[Diagram: RDF graph building on the “Bess Schrader employed by Enterprise Knowledge” triple, with Person and Organization types added]

These types and labels help us define the semantics, or meaning, of each entity. By explicitly stating that “Bess Schrader” is a person and “Enterprise Knowledge” is an organization, we’re creating the building blocks for a machine to start to make inferences about these entities based on their types.

Similarly, we can create a more explicit definition of our relationship and attributes, allowing machines to better understand what the “employed by” relationship means. While the above diagram represents our predicate (or relationship) as a straight line between two entities, in RDF, our predicate is itself an entity and can have its own properties (such as type, label, and description). This is often referred to as making properties “first class citizens.”

Uniform Resource Identifiers (URIs)

But how do we actually make this machine readable? Diagrams in a blog are great for helping humans understand concepts, but machines need this information in a machine readable format. To make our graph machine readable, we’ll need to leverage unique identifiers.

One of the key elements of any knowledge graph (RDF or otherwise) is the principle of “things, not strings.” As humans, we often use ambiguous labels (e.g. “D.C.”) when referring to a concept, trusting that our audience will be able to use context to determine our meaning. However, machines often don’t have sufficient context to disambiguate strings – imagine “D.C.” has been applied as a tag to an unstructured text document. Does “D.C.” refer to the capital city of the US, the comic book publisher, “direct current,” or something else entirely? Knowledge graphs seek to reduce this ambiguity by using entities or concepts that have unique identifiers and one or more labels, instead of relying on labels themselves as unique identifiers.

RDF is no exception to this principle – all RDF entities are defined using a Uniform Resource Identifier (URI), which can be used to connect all of the labels, attributes, and relationships for a given entity.

Using URIs, our RDF knowledge graph would look like this:

[Diagram: sample knowledge graph using URIs to connect concepts]

These URIs make our triples machine readable by creating unambiguous identifiers for all of our subjects, predicates, and objects. URIs also enable interoperability and the ability to share information across multiple systems – because these URIs are globally unique, any two systems that reference the same URI should be referring to the same entity.
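Continuing the same illustrative example in rdflib, each entity now has a URI, a type, and a label, and the predicate is itself an entity that carries its own label; the URIs remain hypothetical placeholders.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("https://example.com/")  # hypothetical URIs for illustration
g = Graph()
g.bind("ex", EX)

# Each entity gets a globally unique URI, a type, and a human-readable label
g.add((EX.Bess_Schrader, RDF.type, EX.Person))
g.add((EX.Bess_Schrader, RDFS.label, Literal("Bess Schrader")))
g.add((EX.Enterprise_Knowledge, RDF.type, EX.Organization))
g.add((EX.Enterprise_Knowledge, RDFS.label, Literal("Enterprise Knowledge")))

# The predicate is a "first-class citizen": an entity with its own type and label
g.add((EX.employedBy, RDF.type, RDF.Property))
g.add((EX.employedBy, RDFS.label, Literal("employed by")))

# The original statement, now expressed entirely with unambiguous URIs
g.add((EX.Bess_Schrader, EX.employedBy, EX.Enterprise_Knowledge))
```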

What are the advantages to using RDF?

The RDF Specification has been maintained by the World Wide Web Consortium (W3C) for over two decades, meaning it is a stable, well documented framework for representing data. This makes it easy for applications and organizations to develop RDF data in an interoperable way. If you create RDF data in one tool and share it with someone else using a different RDF tool, they will still be able to easily use your data. This interoperability allows you to build on what’s already been done — you can combine your enterprise knowledge graph with established, open RDF datasets like Wikidata, jump-starting your analytic capabilities. This also makes data sharing and migration between internal RDF systems simple, enabling you to unify data and reducing your dependency on a single tool or vendor.

The ability to treat properties as “first-class citizens” with their own properties allows you to store your data model along with your data, explaining what properties mean and how they should be used. This reduces ambiguity and confusion for data creators, developers, and data consumers alike. However, this ability to treat properties as entities also allows organizations to standardize and connect existing data. RDF data models can store multiple labels for the same property, enabling them to act as a “Rosetta Stone” that translates metadata fields and values across systems. Connecting these disparate metadata values is crucial to being able to effectively retrieve, understand, and use enterprise data.

Many implementations of RDF also support inference and reasoning, allowing you to explore previously uncaptured relationships in your data, based on logic developed in your ontology. This reasoning capability can be an incredibly powerful tool, helping you gain insights from your business logic. For example, inference and reasoning can capture information about employee expertise – a relationship that’s notoriously difficult to explicitly store. While many organizations attempt to have employees self-select their skills or areas of expertise, the completion rate of these self-selections is typically low, and even those that do complete the selection often don’t keep them up to date. Reasoning in RDF can leverage business logic to automatically infer expertise based on your organization’s data. For example, if a person has authored multiple documents that discuss a given topic, an RDF knowledge graph may infer that this person has knowledge of or expertise in that topic.
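As a toy illustration of that expertise example, the sketch below uses a SPARQL query (run through rdflib) to derive a hasExpertiseIn relationship for anyone who authored two or more documents about the same topic. The vocabulary, threshold, and rule are assumptions made for this example; real deployments more often express such logic in an ontology or rule engine.

```python
from rdflib import Graph, Namespace

EX = Namespace("https://example.com/")
g = Graph()
g.bind("ex", EX)

# Hypothetical authorship data: documents, their authors, and their topics
for triple in [
    (EX.doc1, EX.authoredBy, EX.Bess_Schrader), (EX.doc1, EX.about, EX.KnowledgeGraphs),
    (EX.doc2, EX.authoredBy, EX.Bess_Schrader), (EX.doc2, EX.about, EX.KnowledgeGraphs),
]:
    g.add(triple)

# Find anyone who authored at least two documents on the same topic
results = g.query("""
    PREFIX ex: <https://example.com/>
    SELECT ?person ?topic
    WHERE { ?doc ex:authoredBy ?person ; ex:about ?topic . }
    GROUP BY ?person ?topic
    HAVING (COUNT(?doc) >= 2)
""")

# Materialize the inferred expertise relationship back into the graph
for person, topic in results:
    g.add((person, EX.hasExpertiseIn, topic))

print((EX.Bess_Schrader, EX.hasExpertiseIn, EX.KnowledgeGraphs) in g)  # True
```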

What are the disadvantages to using RDF?

To fully leverage the benefits of RDF, entities must be explicitly defined (see best practices below), which can require burdensome overhead. The volume and structure of these assertions, combined with the length and format of Uniform Resource Identifiers (URIs), can make getting started with RDF challenging for information professionals and developers used to working with more straightforward (albeit more ambiguous) data models. While recent advancements in generative AI have great potential to make the learning curve to RDF less onerous via human-in-the-loop RDF creation processes, learning to create and work with RDF still poses a challenge to many organizations.

Additionally, the “triple” format (subject – predicate – object) used by RDF only allows you to connect two entities at a time, unlike labeled property graphs. For example, I can assert that “Bess Schrader -> employed by -> Enterprise Knowledge,” but it’s not very straightforward in RDF to then add additional information about that relationship, such as what role I perform at Enterprise Knowledge, my start and end dates of employment, etc. While a proposed modification to RDF called RDF* (RDF-star) has been developed to address this, it has not been officially adopted by the W3C, and implementation of RDF* in RDF compliant tools has occurred only on an ad hoc basis.

What are some best practices when using RDF to create a knowledge graph?

RDF, and knowledge graphs in general, are well known for their flexibility – there are very few restrictions on how data must be structured or what properties must be used for their implementation. However, there are some best practices when using RDF that will enable you to maximize your knowledge graph’s utility, particularly for reasoning applications.

All concepts should be entities with a URI

The guiding principle is “things, not strings”. If you’re describing something with a label that might have its own attributes, it should be an entity, not a literal string.

All entities should have a label

Using URIs is important, but a URI without at least one label is difficult to interpret for both humans and machines.

All entities should have a type

Again, remember that our goal is to allow machines to process information similarly to humans. To do this, all entities should have one or more types explicitly asserted (e.g. “Washington, D.C.” might have the type “City”).

All entities should have a description

While using URIs and labels goes a long way in limiting ambiguity (see our “D.C.” example above), adding descriptions or definitions for each entity can be even more helpful. A well written description for an entity will leave little to no question around what this entity represents.

Following these best practices will help with reuse, governance, and reasoning.

Want to learn more about RDF, or need help getting started? Contact us today.

Metadata Within the Semantic Layer https://enterprise-knowledge.com/metadata-within-the-semantic-layer/ Thu, 06 Feb 2025 19:20:47 +0000

A Semantic Layer is a standardized framework that captures organizational knowledge and domain meaning in order to connect and coordinate assets across systems and repositories. Metadata, as one component of a Semantic Layer approach, is foundational.

Whether you are striving to enhance user experiences by improving search or navigation, or to improve asset management or reporting, in a single system or across multiple systems, you need metadata.

But not just any metadata. You need metadata that provides the information and context needed to leverage assets effectively and meaningfully.

For those seeking to extend or enhance the metadata for their organizational assets, it can be difficult to ascertain what metadata should be captured. In this blog post, I provide an overview of the role of metadata, and guidance on what to consider when defining metadata.

The Role of Metadata in a Semantic Layer

Metadata codifies characteristics of an asset—what it is about, how it should be managed and used, and what contexts it is relevant for. For example, a document may have metadata to identify its title, publication date, lifecycle status, document type, and topic.

Additionally, metadata should capture actionable characteristics of an asset, reflecting the characteristics necessary for finding, managing, using, and understanding assets. In other words, if the characteristic is used to support a requisite interaction with the asset, it should be captured as metadata.

By codifying actionable characteristics, metadata enables user experiences by providing the connection between a specific asset and supported interactions.

Metadata also enables AI-supported interactions. As an explicit signal of an asset’s characteristics, metadata ensures that AI-powered tools can reference and leverage the asset when appropriate. For instance, a semantic search engine can be tuned to prioritize information in assets marked with specific Document Types, or a content generation tool can be directed to summarize the most recent, published assets on a topic.
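To make the idea of actionable metadata concrete, the hypothetical sketch below shows a few metadata records and the kind of filter an AI-powered tool might apply before summarizing content: only assets of a given document type, with a published lifecycle status, ordered by recency. The field names and values are illustrative assumptions, not a prescribed schema.

```python
from datetime import date

# Hypothetical metadata records codifying actionable characteristics of each asset
assets = [
    {"title": "Clinical SOP v4", "document_type": "Standard Operating Procedure",
     "lifecycle_status": "Published", "topic": "Clinical Trials", "publication_date": date(2024, 11, 2)},
    {"title": "SOP Draft v5", "document_type": "Standard Operating Procedure",
     "lifecycle_status": "Draft", "topic": "Clinical Trials", "publication_date": date(2025, 1, 15)},
]

def assets_for_summarization(assets, topic, document_type):
    """Select the most recent, published assets on a topic for an AI summarization step."""
    candidates = [
        a for a in assets
        if a["topic"] == topic
        and a["document_type"] == document_type
        and a["lifecycle_status"] == "Published"
    ]
    return sorted(candidates, key=lambda a: a["publication_date"], reverse=True)

print([a["title"] for a in assets_for_summarization(assets, "Clinical Trials", "Standard Operating Procedure")])
# -> ['Clinical SOP v4']
```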

The utility of metadata is limited by any differences in its definition or application. If different terms are used to identify the same topic, it is more difficult to identify assets that are about the same, or related, topics. Similarly, if a topic is applied inconsistently—missing on relevant assets or present on irrelevant assets—it is more difficult to identify similar or related assets. 

This underscores the importance of standardizing metadata. To ensure concepts, like dates and topics, are identified and applied consistently, metadata should be defined as a shared representation; that is, characteristics common to assets across systems should share the same metadata definition, and common concepts, like topics and document types, should be standardized with a taxonomy. This approach controls what data should be captured for an asset and what terms can be used to identify concepts, helping to enforce a shared representation across assets.

When metadata is standardized, it is possible to improve user experiences, such as improving asset discovery within a repository via a common metadata definition and taxonomy or across repositories by establishing a metadata knowledge graph.

How to Define Metadata for a Semantic Layer

The process for defining metadata within your organization may look different, depending on what metadata, taxonomies, or other Semantic Layer components you already have in place. There are fundamental considerations, however, that can be useful regardless of what you have in place to date. These include:

  • Determining use cases: Consider what specific use cases you need to support. This will help you focus on the metadata that will be most actionable and impactful. Look to your users to understand what their tasks are and what pain points they have, like finding documents for research or identifying data sets for reporting. 
  • Identifying metadata fields: Investigate the ways users describe, look for, or interact with the assets for the use cases you’ve identified. Consider the different kinds of information and context that may be required to support the use cases.
  • Allocating metadata fields: Consider which assets and systems will need to use the metadata field. Identify metadata fields that are applicable to all assets on all (in-scope) systems or some assets on all (in-scope) systems. These metadata fields will be the focus for standardization.
  • Standardizing metadata fields: Establish definitions for each metadata field and ensure those definitions can be upheld across in-scope systems. Specify the role of each field, the type of data it captures, and, if applicable, whether it accommodates single or multiple values. For metadata fields that leverage a controlled list of terms, design taxonomies to further standardize and enrich the capture of information and context (a simple illustration follows this list).
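As a simple, hypothetical illustration of a standardized field, the sketch below defines a shared Document Type field with a taxonomy-backed controlled list and validates proposed values against it; the names and terms are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class MetadataField:
    name: str
    definition: str
    data_type: str
    multivalued: bool = False
    allowed_terms: set = field(default_factory=set)  # taxonomy-backed controlled list

    def validate(self, value: str) -> bool:
        """Check a proposed value against the field's controlled vocabulary, if any."""
        return not self.allowed_terms or value in self.allowed_terms

# Hypothetical shared field definition upheld across in-scope systems
document_type = MetadataField(
    name="Document Type",
    definition="The kind of document an asset represents, shared across in-scope systems.",
    data_type="string",
    multivalued=False,
    allowed_terms={"Policy", "Standard Operating Procedure", "Report", "Presentation"},
)

print(document_type.validate("Report"))       # True
print(document_type.validate("Misc. stuff"))  # False: not in the taxonomy
```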

Importantly, you don’t have to define all the metadata all at once. These considerations can be applied to a specific, priority use case to start, then revisited and expanded as needed. Even a small initial set of metadata can help transform user experiences. 

Contact us if you are grappling with how to define and standardize your metadata, or are interested in learning more about the Semantic Layer.
