semantic architecture Articles - Enterprise Knowledge
https://enterprise-knowledge.com/tag/semantic-architecture/

The Semantic Exchange: A Semantic Layer to Enable Risk Management at a Multinational Bank
https://enterprise-knowledge.com/the-semantic-exchange-a-semantic-layer-to-enable-risk-management/
Fri, 11 Jul 2025

Enterprise Knowledge is continuing our new webinar series, The Semantic Exchange, with its fourth session. This session is designed for a variety of audiences, ranging from those working in the semantic space as taxonomists or ontologists to those just starting to learn about structured data and content and how they fit into broader initiatives around artificial intelligence or knowledge graphs.

This 30-minute session invites you to engage with Yumiko Saito’s case study, A Semantic Layer to Enable Risk Management at a Multinational Bank. Come ready to hear and ask about:

  • The challenges financial firms encounter with risk management;
  • The semantic solutions employed to mitigate these challenges; and
  • The value created by employing semantic layer solutions.

This webinar will take place on Thursday, July 17th, from 1:00 – 1:30 PM EDT. Can't make it? The session will also be recorded and shared with registered attendees. View the recording here!

Unlocking Knowledge Intelligence from Unstructured Data
https://enterprise-knowledge.com/unlocking-knowledge-intelligence-from-unstructured-data/
Fri, 28 Mar 2025

Introduction

Organizations generate, source, and consume vast amounts of unstructured data every day, including emails, reports, research documents, technical documentation, marketing materials, learning content, and customer interactions. However, this wealth of information often remains hidden and siloed, making it challenging to utilize without proper organization. Unlike structured data, which fits neatly into databases, unstructured data often lacks a predefined format, making it difficult to extract insights or apply advanced analytics effectively.

Integrating unstructured data into a knowledge graph is an effective way to overcome these challenges. This approach allows businesses to move beyond traditional storage and keyword search methods to unlock knowledge intelligence. Knowledge graphs contextualize unstructured data by linking and structuring it around business-relevant concepts and relationships. This enhances enterprise search capabilities, automates knowledge discovery, and powers AI-driven applications.

This blog explores why structuring unstructured data is essential, the challenges organizations face, and the right approach to integrating unstructured content into a graph-powered knowledge system. It also highlights real-world implementations demonstrating how we have applied this approach to help organizations unlock knowledge intelligence, streamline workflows, and drive meaningful business outcomes.

Why Structure Unstructured Data in a Graph

Unstructured data offers immense value to organizations if it can be effectively harnessed and contextualized using a knowledge graph. Structuring content in this way unlocks potential and drives business value. Below are three key reasons to structure unstructured data:

1. Knowledge Intelligence Requires Context

Unstructured data often holds valuable information but is disconnected across different formats, sources, and teams. A knowledge graph enables organizations to connect these pieces by linking concepts, relationships, and metadata into a structured framework. For example, a financial institution can link regulatory reports, policy documents, and transaction logs to uncover compliance risks. With traditional document repositories, achieving knowledge intelligence may be impossible, or at least very resource intensive.

Additionally, organizations must ensure that domain-specific knowledge informs AI systems to improve relevance and accuracy. Injecting organizational knowledge into AI models enhances AI-driven decision-making by grounding models in enterprise-specific data.

2. Enhancing Findability and Discovery

Unstructured data lacks standard metadata, making traditional search and retrieval inefficient. Knowledge graphs power semantic search by linking related concepts, improving content recommendations, and eliminating reliance on simple keyword matching. For example, in the financial industry, investment analysts often struggle to locate relevant market reports, regulatory updates, and historical trade data buried in siloed repositories. A knowledge graph-powered system can link related entities, such as companies, transactions, and market events, allowing analysts to surface contextually relevant information with a single query, rather than sifting through disparate databases and document archives.
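
To make the pattern concrete, here is a minimal sketch of such a graph-powered lookup using Python's rdflib. The entities and predicates (ex:affects, ex:mentions) are invented for illustration, not drawn from any client model:

```python
# Minimal sketch of graph-powered semantic search over financial
# content. Classes and predicates are hypothetical examples.
# Requires: pip install rdflib
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.com/finance#")
g = Graph()
g.bind("ex", EX)

# Toy data: a report mentions a company that a market event affects.
g.add((EX.AcmeCorp, RDF.type, EX.Company))
g.add((EX.RateHike2024, RDF.type, EX.MarketEvent))
g.add((EX.RateHike2024, EX.affects, EX.AcmeCorp))
g.add((EX.Q3Report, EX.mentions, EX.AcmeCorp))

# A single query surfaces every report connected to a market event
# through the companies it affects -- no keyword matching involved.
for report, event in g.query("""
    PREFIX ex: <http://example.com/finance#>
    SELECT ?report ?event WHERE {
        ?event a ex:MarketEvent ; ex:affects ?company .
        ?report ex:mentions ?company .
    }"""):
    print(report, "is contextually linked to", event)
```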

3. Powering Explainable AI and Generative Applications

Generative AI and Large Language Models (LLMs) require structured, contextualized data to produce meaningful and accurate responses. A graph-enhanced AI pipeline (a minimal sketch follows the list below) allows enterprises to:

A. Retrieve verified knowledge rather than relying on AI-generated assumptions that are likely to result in hallucinations.

B. Trace AI-generated insights back to trusted enterprise data for validation.

C. Improve explainability and accuracy in AI-driven decision-making.
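
As an illustration of point A, the sketch below pulls verified facts from a small rdflib graph and formats them as grounding context for an LLM prompt. The graph contents and predicate names are invented:

```python
# Minimal sketch of graph-grounded retrieval: fetch asserted facts
# from a knowledge graph and format them as LLM context. The facts
# and predicates are invented. Requires: pip install rdflib
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.com/kg#")
g = Graph()
g.bind("ex", EX)
g.add((EX.PolicyX, EX.approvedBy, EX.RiskCommittee))
g.add((EX.PolicyX, EX.effectiveDate, Literal("2024-01-01")))

def grounded_context(subject) -> str:
    # Every line below is traceable back to a triple in the graph,
    # which is what makes the eventual AI answer explainable.
    nm = g.namespace_manager
    return "\n".join(f"{s.n3(nm)} {p.n3(nm)} {o.n3(nm)}"
                     for s, p, o in g.triples((subject, None, None)))

# This string would be prepended to an LLM prompt so the model answers
# from verified enterprise knowledge instead of its own assumptions.
print(grounded_context(EX.PolicyX))
```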

Challenges of Handling Unstructured Data in a Graph

While structured data fits neatly into predefined models that facilitate easy storage and retrieval, unstructured data presents a stark contrast. Unstructured data, encompassing diverse formats such as text documents, images, and videos, lacks the inherent organization and standardization that facilitate machine understanding and readability. This lack of structure poses significant challenges for data management and analysis, hindering the ability to extract valuable insights. The following key challenges highlight the complexities of handling unstructured data:

1. Unstructured Data is Disorganized and Diverse

Unstructured data arrives in many formats, including PDF documents, slide presentations, email communications, and video recordings. These diverse formats lack a standardized structure, making data extraction and organization challenging. Format inconsistency can hinder effective data analysis and retrieval, as each type presents unique obstacles to seamless integration and usability.

2. Extracting Meaningful Entities and Relationships

Turning free text into structured graph nodes and edges requires advanced Natural Language Processing (NLP) to identify key entities, detect relationships, and disambiguate concepts. Graph connections may be inaccurate, incomplete, or irrelevant without proper entity linking.

3. Managing Scalability and Performance

Storing large-scale unstructured data in a graph requires efficient modeling, indexing, and processing strategies to ensure fast query performance and scalability.

Complementary Approaches to Unlocking Knowledge Intelligence from Unstructured Data

A strategic and comprehensive approach is essential to unlock knowledge intelligence from unstructured data. This involves designing a scalable and adaptable knowledge graph schema, deconstructing and enriching unstructured data with metadata, leveraging AI-powered entity and relationship extraction, and ensuring accuracy with human-in-the-loop validation and governance.

1. Knowledge Graph Schema Design for Scalability

A well-structured schema efficiently models entities, relationships, and metadata. As outlined in our best practices for enterprise knowledge graph design, a strategic approach to schema development ensures scalability, adaptability, and alignment with business needs. Enriching the graph with structured data sources (databases, taxonomies, and ontologies) improves accuracy and enhances AI-driven knowledge retrieval, ensuring that knowledge graphs are robust and optimized for enterprise applications.
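
As an illustration only, a minimal rdflib sketch of a small, extensible schema might look like the following; the classes and properties are hypothetical, not a recommended enterprise model:

```python
# Minimal sketch of a knowledge graph schema using RDFS. The classes
# and properties are hypothetical. Requires: pip install rdflib
from rdflib import Graph, Namespace, RDF, RDFS

EX = Namespace("http://example.com/schema#")
g = Graph()
g.bind("ex", EX)

# Core classes: documents and the business concepts they discuss.
g.add((EX.Document, RDF.type, RDFS.Class))
g.add((EX.Concept, RDF.type, RDFS.Class))
# Scalability comes from extension: new types subclass existing ones
# instead of forcing schema rewrites.
g.add((EX.Report, RDFS.subClassOf, EX.Document))

# A typed property linking documents to the concepts they discuss.
g.add((EX.discusses, RDF.type, RDF.Property))
g.add((EX.discusses, RDFS.domain, EX.Document))
g.add((EX.discusses, RDFS.range, EX.Concept))

print(g.serialize(format="turtle"))
```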

2. Content Deconstruction and Metadata Enrichment

Instead of treating documents as static text, break them into structured knowledge assets, such as sections, paragraphs, and sentences, then link them to relevant concepts, entities, and metadata in a graph. Our Content Deconstruction approach helps organizations break large documents into smaller, interlinked knowledge assets, improving search accuracy and discoverability.
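
A deliberately simplified sketch of that idea follows; the real deconstruction approach linked above is more involved, and the colon-based heading detection here is an assumption made purely for illustration:

```python
# Simplified sketch of content deconstruction: split a document into
# section-level assets and link each one to its parent in a graph.
# Heading detection (lines ending in a colon) is a toy assumption.
# Requires: pip install rdflib
from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.com/content#")
g = Graph()

doc_text = """Overview:
This report covers market risk.
Findings:
Exposure increased in Q3."""

doc = EX.RiskReport
g.add((doc, RDF.type, EX.Document))

section = None
for line in doc_text.splitlines():
    if line.endswith(":"):  # toy heading detection
        section = EX["RiskReport_" + line.rstrip(":").replace(" ", "_")]
        g.add((section, RDF.type, EX.Section))
        g.add((doc, EX.hasSection, section))  # link section to parent
    elif section is not None:
        g.add((section, EX.text, Literal(line)))

# Each section is now an addressable knowledge asset that search can
# return directly, rather than pointing users at the whole document.
print(g.serialize(format="turtle"))
```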

3. AI-Powered Entity and Relationship Extraction

Advanced NLP and machine learning techniques can extract insights from unstructured text data. These techniques can identify key entities, categorize documents, recognize semantic relationships, perform sentiment analysis, summarize text, translate languages, answer questions, and generate text. They offer a powerful toolkit for extracting insights and automating tasks related to natural language processing and understanding.

A well-structured knowledge graph enhances AI’s ability to retrieve, analyze, and generate insights from content. As highlighted in How to Prepare Content for AI, ensuring content is well-structured, tagged, and semantically enriched is crucial for making AI outputs accurate and context-aware.
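
As a small example of the extraction step described above, the sketch below uses the open-source spaCy library (one of many possible tools, not necessarily what a given project would use) to turn named entities into graph nodes linked to their source document:

```python
# Minimal sketch of NLP-driven entity extraction feeding a graph.
# Uses spaCy's small English model as one example tool; first run:
#   pip install spacy rdflib && python -m spacy download en_core_web_sm
import spacy
from rdflib import Graph, Namespace, Literal, RDF

nlp = spacy.load("en_core_web_sm")
EX = Namespace("http://example.com/kg#")
g = Graph()

text = "Acme Corp filed its annual report with the SEC in March 2024."
doc_node = EX.Doc1

for ent in nlp(text).ents:
    # Each named entity becomes a node typed with its entity label
    # (e.g. ex:ORG, ex:DATE), linked to the document that mentions it.
    node = EX[ent.text.replace(" ", "_")]
    g.add((node, RDF.type, EX[ent.label_]))
    g.add((doc_node, EX.mentions, node))
    g.add((node, EX.label, Literal(ent.text)))

print(g.serialize(format="turtle"))
```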

4. Human-in-the-loop for Validation and Governance

AI models are powerful but have limitations and can produce errors, especially when leveraging domain-specific taxonomies and classifications. AI-generated results should be reviewed and refined by domain experts to ensure alignment with standards, regulations, and subject matter nuances. Combining AI efficiency with human expertise maximizes data accuracy and reliability while minimizing compliance risks and costly errors.

From Unstructured Data to Knowledge Intelligence: Real-World Implementations and Case Studies

Our innovative approach addresses the challenges organizations face in managing and leveraging their vast knowledge assets. By implementing AI-driven recommendation engines, knowledge portals, and content delivery systems, we empower businesses to unlock the full potential of their unstructured data, streamline processes, and enhance decision-making. The following case studies illustrate how organizations have transformed their data ecosystems using our enterprise AI and knowledge management solutions, which incorporate the four components discussed in the previous section.

  • AI-Driven Learning Content and Product Recommendation Engine
    A global enterprise learning and product organization struggled with the searchability and accessibility of its vast unstructured marketing and learning content, causing inefficiencies in product discovery and user engagement. Customers frequently left the platform to search externally, leading to lost opportunities and revenue. To solve this, we developed an AI-powered recommendation engine that seamlessly integrated structured product data with unstructured content through a knowledge graph and advanced AI algorithms. This solution enabled personalized, context-aware recommendations, improving search relevance, automating content connections, and enhancing metadata application. As a result, the company achieved increased customer retention and better product discovery, leading to six figures in closed revenue.
  • Knowledge Portal for a Global Investment Firm
    A global investment firm faced challenges leveraging its vast knowledge assets due to fragmented information spread across multiple systems. Analysts struggled with duplication of work, slow decision-making, and unreliable investment insights due to inconsistent or missing context. To address this, we developed Discover, a centralized knowledge portal powered by a knowledge graph that integrates research reports, investment data, and financial models into a 360-degree view of existing resources. The system aggregates information from multiple sources, applies AI-driven auto-tagging for enhanced search, and ensures secure access control to maintain compliance with strict data governance policies. As a result, the firm achieved faster decision-making, reduced duplicate efforts, and improved investment reliability, empowering analysts with real-time, contextualized insights for more informed financial decisions.
  • Knowledge AI Content Recommender and Chatbot
    A leading development bank faced challenges in making its vast knowledge capital easily discoverable and delivering contextual, relevant content to employees at the right time. Information was scattered across multiple systems, making it difficult for employees to find critical knowledge and expertise when performing research and due diligence. To solve this, we developed an AI-powered content recommender and chatbot, leveraging a knowledge graph, auto-tagging, and machine learning to categorize, structure, and intelligently deliver knowledge. The knowledge platform was designed to ingest data from eight sources, apply auto-tagging using a multilingual taxonomy with over 4,000 terms, and proactively recommend content across eight enterprise systems. This approach significantly improved enterprise search, automated knowledge delivery, and minimized time spent searching for information. Bank leadership recognized the initiative as “the most forward-thinking project in recent history.”
  • Course Recommendation System Based on a Knowledge Graph
    A healthcare workforce solutions provider faced challenges in delivering personalized learning experiences and effective course recommendations across its learning platform. The organization sought to connect users with tailored courses that would help them master key competencies, but its existing recommendation system struggled to deliver relevant, user-specific content and was difficult to maintain. To address this, we developed a cloud-hosted semantic course recommendation service, leveraging a healthcare-oriented knowledge graph and Named Entity Recognition (NER) models to extract key terms and build relationships between content components. The AI-powered recommendation engine was seamlessly integrated with the learning platform, automating content recommendations and optimizing learning paths. As a result, the new system outperformed accuracy benchmarks, replaced manual processes, and provided high-quality, transparent course recommendations, ensuring users understood why specific courses were suggested.

Conclusion

Unstructured data holds immense potential, but without structure and context, it remains difficult to navigate. Unlike structured data, which is already organized and easily searchable, unstructured data requires advanced techniques like knowledge graphs and AI to extract valuable insights. However, both data types are complementary and essential for maximizing knowledge intelligence. By integrating structured and unstructured data, organizations can connect fragmented content, enhance search and discovery, and fuel AI-powered insights. 

At Enterprise Knowledge, we know success requires a well-planned strategy, including preparing content for AI, AI-driven entity and relationship extraction, scalable graph modeling and enterprise ontologies, and expert validation. We help organizations unlock knowledge intelligence by structuring unstructured content in a graph-powered ecosystem. If you want to transform unstructured data into actionable insights, contact us today to learn how we can help your business maximize its knowledge assets.

Enterprise AI Architecture Series: How to Inject Business Context into Structured Data using a Semantic Layer (Part 3)
https://enterprise-knowledge.com/enterprise-ai-architecture-inject-business-context-into-structured-data-semantic-layer/
Wed, 26 Mar 2025

Introduction

AI has attracted significant attention in recent years, prompting me to explore enterprise AI architectures through a multi-part blog series this year. Part 1 of this series introduced the key technical components required for implementing an enterprise AI architecture. Part 2 discussed our typical approaches and experiences in structuring unstructured content with a semantic layer. In the third installment, we will focus on leveraging structured data to power enterprise AI use cases.

Today, many organizations have developed the technical ability to capture enormous amounts of data to power improved business operations or compliance with regulatory bodies. For large organizations, this data collection process is typically decentralized so that organizations can move quickly in the face of competition and regulations. Over time, such decentralization results in increased complexities with data management, such as inconsistent data formats across various data platforms and multiple definitions for the same data concept. A common example from EK's engagements is reviewing customer data from different sources with variations in spelling and abbreviations (such as "Bob Smith" vs. "Robert Smith" or "123 Main St" vs. "123 Main Street"), or seeing the same business concept (such as customer or supplier) referred to differently across various departments in an organization. With such extensive data quality and inconsistency issues, it is often impossible to integrate and harmonize data from the diverse underlying systems for a 360-degree view of the enterprise and enable cross-functional analysis and reporting. This is exactly the problem a semantic layer solves.

A semantic layer is a business representation of data that offers a unified and consolidated view of data across an organization. It establishes common data definitions, metadata, categories, and relationships, thereby enabling data mapping and interpretation across all organizational data assets. A semantic layer injects intelligence into an organization's structured data assets by providing standardized meaning and context to the data in a machine-readable format, which can be readily leveraged by Artificial Intelligence (AI) systems. We call this process of embedding business context into organizational data assets for effective use by AI systems knowledge intelligence (KI). Providing a common understanding of structured data using a semantic layer is the focus of this blog.

How a Semantic Layer Provides Context for Structured Data 

A semantic layer provides AI with a programmatic framework to make organizational context and domain knowledge machine readable. It does so by using one or more components such as metadata, business glossaries, taxonomies, ontologies, and knowledge graphs. Specifically, it helps enterprise AI systems (a minimal sketch follows the list below):

  • Leverage metadata to power understanding of the operational context;
  • Improve shared understanding of organizational nomenclature using business glossaries;
  • Provide a mechanism to categorize and organize the same data through taxonomies and controlled vocabularies;
  • Encode domain-specific business logic and rules in ontologies; and
  • Enable a normalized view of siloed datasets via knowledge graphs.
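
To ground these components, here is a deliberately small sketch showing a SKOS-style taxonomy relation, a glossary definition, and an ontology relationship coexisting in one graph that an AI system can traverse. All vocabulary is invented, not drawn from any client model:

```python
# Minimal sketch of semantic-layer components in one graph: a SKOS
# taxonomy relation, a glossary definition, and an ontology-style
# typed relationship. All names invented. Requires: pip install rdflib
from rdflib import Graph, Namespace, Literal, RDF, RDFS
from rdflib.namespace import SKOS

EX = Namespace("http://example.com/sl#")
g = Graph()
g.bind("skos", SKOS)
g.bind("ex", EX)

# Taxonomy: "Retail Customer" sits under the broader "Customer".
g.add((EX.Customer, RDF.type, SKOS.Concept))
g.add((EX.RetailCustomer, SKOS.broader, EX.Customer))

# Business glossary: a shared, human-readable definition.
g.add((EX.Customer, SKOS.definition,
       Literal("A party that purchases goods or services from the firm.")))

# Ontology: a typed relationship a machine can follow and reason over.
g.add((EX.placedBy, RDFS.domain, EX.Order))
g.add((EX.placedBy, RDFS.range, EX.Customer))

print(g.serialize(format="turtle"))
```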

Embedding Business Context into Structured Data: An Architectural Perspective

The figure below illustrates how the semantic layer components work together to enable Enterprise AI. This shows the key integration patterns via which structured data sources can be connected using a knowledge graph in the KI layer, including batch and incremental data pull using declarative and custom data mappings, as well as data virtualization.

Enterprise AI Architecture: Injecting Business Context into Structured Data using a Semantic Layer

AI models can reason and infer based on explicit knowledge encoded in the graph. This is achieved when both the knowledge or data schema (e.g., an ontology) and its instantiation are represented in the knowledge graph. This representation is made possible through a custom service that allows the ontology to be synchronized with the graph (labeled as Ontology Sync with Graph in the figure) and the graph construction pipelines described above.

Enterprise AI can derive additional context on linked data when taxonomies are ingested into the same graph via a custom service that allows the taxonomy to be synchronized with the graph (labeled as Taxonomy Sync with Graph in the figure). This is because taxonomies can be used to consistently organize this data and provide clear relationships between different data points. Finally, technical metadata collected from structured data sources can be connected with other semantic assets in the knowledge graph through a custom service that allows this metadata to be loaded into the graph (labeled as Metadata Load into Graph in the figure). This brings in additional context regarding data sourcing, ownership, versioning, access levels, entitlements, consuming systems and applications into a single location.

As is evident from the figure above, a semantic layer enables data from different sources to be quickly mapped and connected using a variety of mapping techniques, thus enabling a unified, consistent, and single view of data for use in advanced analytics. In addition, by injecting business context into this unified view via semantic assets such as taxonomies, ontologies, and glossaries, organizations can power AI applications ranging from semantic recommenders and knowledge panels to traditional machine learning (ML) model training and LLM-powered AI agents.
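
The tiny sketch below conveys the mapping idea with a hand-rolled stand-in for the declarative mappings the figure refers to; the column and predicate names are invented:

```python
# Minimal sketch of mapping rows from two relational sources onto one
# knowledge graph node, standing in for declarative mapping tools.
# All column and predicate names are invented.
# Requires: pip install rdflib
from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.com/bank#")
g = Graph()
g.bind("ex", EX)

# Two systems that describe the same customer differently.
crm_rows = [{"cust_id": "C1", "cust_name": "Robert Smith"}]
risk_rows = [{"client_ref": "C1", "risk_score": 0.82}]

# Both map onto a single ex:Customer node keyed by the shared
# identifier, giving downstream AI one connected view of the customer.
for row in crm_rows:
    node = EX["customer/" + row["cust_id"]]
    g.add((node, RDF.type, EX.Customer))
    g.add((node, EX.name, Literal(row["cust_name"])))

for row in risk_rows:
    node = EX["customer/" + row["client_ref"]]
    g.add((node, EX.riskScore, Literal(row["risk_score"])))

print(g.serialize(format="turtle"))
```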

Case Studies & Enterprise Applications

In many engagements, EK has used semantic layers with structured data to power various use cases, from enterprise 360 to AI enablement. As part of enterprise AI engagements, a common issue we’ve seen is a lack of business context surrounding data. AI engineers continue to struggle to locate relevant data and ensure its suitability for specific tasks, hindering model selection and leading to suboptimal results and abandoned AI initiatives. These experiences show that raw data lacks inherent value; it becomes valuable only when contextualized for its users. Semantic layers provide this context to both AI models and AI teams, driving successful Enterprise AI endeavors.

Last year, a global retailer partnered with EK to overcome delays in retrieving store performance metrics and creating executive dashboards. Their centralized data lakehouse lacked sufficient metadata, hindering engineers from locating and understanding crucial metrics. By standardizing metadata, aligning business glossaries, and establishing a taxonomy, we empowered their data visualization engineers to perform self-service analytics and rapidly create dashboards. This streamlined their insight generation without relying on source data system owners and IT teams. You can read more about how we helped this organization democratize their AI efforts using a semantic layer here.

In a separate case, EK facilitated the rapid development of AI models for a multinational financial institution by integrating business context into the company’s structured risk data through a semantic layer. The semantic layer expedited data exploration, connection, and feature extraction for the AI team, leading to the efficient implementation of enterprise AI systems like intelligent search engines, recommendation engines, and anomaly detection applications. EK also integrated AI model outputs into the risk management graph, enabling the development of proactive alerts for critical changes or potential risks, which, in turn, improved the productivity and decision-making of the risk assessment team.

Finally, a semantic layer plays a significant role in reducing data cleaning efforts and streamlining data management. Research consistently shows AI teams spend more time cleaning data than modeling it to produce valuable insights. By connecting previously siloed data using an identity graph, EK helped a large digital marketing firm gain a deeper understanding of its customer base through behavior and trend analytics. This solution resolved the discrepancy between 2 billion distinct records in their relational databases and the actual user base of 240 million.
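
For a sense of the mechanics, here is a toy entity-resolution sketch; production identity graphs use far richer matching than this normalization-key approach, and the alias table is invented:

```python
# Toy sketch of identity resolution: collapse record variants onto one
# identity node via a normalized match key. Real identity graphs use
# much more sophisticated matching. Requires: pip install rdflib
from rdflib import Graph, Namespace
from rdflib.namespace import OWL

EX = Namespace("http://example.com/id#")
g = Graph()
g.bind("ex", EX)

records = [
    {"id": "db1-42", "name": "Bob Smith", "addr": "123 Main St"},
    {"id": "db2-99", "name": "Robert Smith", "addr": "123 Main Street"},
]

ALIASES = {"bob": "robert", "st": "street"}  # tiny illustrative table

def match_key(rec):
    # Normalize names and addresses so variants yield the same key.
    words = (rec["name"] + " " + rec["addr"]).lower().split()
    return "_".join(ALIASES.get(w, w) for w in words)

for rec in records:
    identity = EX["identity/" + match_key(rec)]
    g.add((EX["record/" + rec["id"]], OWL.sameAs, identity))

# Both source records now resolve to the same identity node.
print(g.serialize(format="turtle"))
```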

Closing

Semantic layers effectively represent complex relationships between data objects, unlike traditional applications built for structured data. This allows them to support highly interconnected use cases like analyzing supply chains and recommendation systems. To adopt this framework, organizations must shift from an application-centric to a data-centric enterprise architecture. A semantic layer ensures that data retains its meaning and context when extracted from a relational database. In the AI era, this metadata-first framework is crucial for staying competitive. Organizations need to provide their AI systems with a consolidated, context-rich view of all transactional data for more accurate predictions. 

This article completes our discussion about the technical integration between semantic layers and enterprise AI, introduced here. In the next segment of this KI architecture blog series, we will move on to the second KI component and discuss the technical approaches for encoding expert knowledge into enterprise AI systems.

To get started with leveraging structured data, building a semantic layer, and the KI journey at your organization, contact EK!

Webinar: Semantic Layer Technical Deep Dive
https://enterprise-knowledge.com/webinar-semantic-layer-technical-deep-dive/
Wed, 01 May 2024

In this webinar, Enterprise Knowledge's Sara Mae O'Brien-Scott moderates a conversation with Chris Marino, Urmi Majumder, and Heather Hedden as they take a deep dive into the technical aspects of a Semantic Layer. The panelists explore the components that enable semantic capabilities, such as metadata managers, taxonomy managers, business glossary managers, data catalogs, ontology managers, and knowledge graphs. They emphasize how these components interconnect organizational knowledge and data assets, enhancing systems like recommendation engines and semantic search.

The discussion also covers the impact of generative AI and LLMs on semantic technologies, highlighting how these tools can enhance the retrieval and verification processes within semantic frameworks. The experts express excitement about the potential of semantic technologies to democratize data access within organizations, thereby enabling more informed decision-making.

Webinar: Different Applications of a Semantic Layer Industry Panel
https://enterprise-knowledge.com/webinar-different-applications-of-a-semantic-layer-industry-panel/
Wed, 24 Apr 2024

In this webinar, Enterprise Knowledge’s Lulit Tesfaye moderates a conversation with Malcolm Hawker, Polly Alexander, and Mohammed Aaser to discuss the role of Semantic Layers, what they are, and what value they offer organizations in the quickly changing world being shaped by the AI Revolution. The expert panelists share their views on the state of the industry and discuss real-world applications and implementations of a Semantic Layer.

Malcolm Hawker is a former Chief Product Officer and Gartner analyst with over 25 years of experience across the fields of Data Strategy, Master Data Management (MDM), and Data Governance. Polly Alexander is Director of Metadata and Taxonomy for WebMD Ignite, with expertise bridging the fields of Knowledge Management, AI, and Machine Learning. Mohammed Aaser is Chief Data Officer (CDO) of Domo and former CDO of McKinsey & Company.

Industry Panel: Different Applications of a Semantic Layer
https://enterprise-knowledge.com/industry-panel-different-applications-of-a-semantic-layer/
Thu, 28 Mar 2024

Join a collection of the world's most prominent voices in data science, information management, and artificial intelligence to discuss the role of Semantic Layers, what they are, and what value they offer organizations in the quickly changing world being shaped by the AI Revolution. In this webinar, Enterprise Knowledge Vice President Lulit Tesfaye will facilitate a spirited conversation with Malcolm Hawker, Polly Alexander, Mohammed Aaser, and Jeff Jonas. Over the course of the webinar, the expert panelists will share their views on the state of the industry and discuss real-world applications and implementations of a Semantic Layer.

Malcolm Hawker is a former Chief Product Officer and Gartner analyst with over 25 years of experience across the fields of Data Strategy, Master Data Management (MDM), and Data Governance. Polly Alexander is Director of Metadata and Taxonomy for WebMD Ignite, with expertise bridging the fields of Knowledge Management, AI, and Machine Learning. Mohammed Aaser is Chief Data Officer (CDO) of Domo and former CDO of McKinsey & Company. Jeff Jonas is Founder and CEO of Senzing and a former IBM Fellow with acclaimed expertise in data science with a specific focus on entity resolution.

This session will be held on Monday, April 22, from 3:00-4:00 pm EST. Register for the webinar at https://attendee.gotowebinar.com/register/3743095360256423520

Jumpstarting Your Semantic Solution Design with UML Diagrams
https://enterprise-knowledge.com/jumpstarting-your-semantic-solution-design-with-uml-diagrams/
Wed, 19 Oct 2022

Where do I start? Whether it be a taxonomy, an ontology, or a knowledge graph, this is a common question that we get from clients when they are beginning their scoping journey. We get it. It is difficult to define solution requirements for a specific issue when there are multiple competing priorities and a myriad of anecdotal and systemic inefficiencies across the organization caused by siloed, inconsistent, or poorly managed information or data.

At EK, we strive to find a balance between the top-down and bottom-up perspectives during the scoping process. Our approach seeks to anticipate the most common needs of users while leaving the door open to meet the dynamic situations that emerge from the content or systems when users interact with them. There have been cases in which the information needed to spec out a solution is not available in business databases, policies, or content, and stakeholders don't have any insights into user journeys due to the lack of concrete information emerging from the regular conduct of business.

When the organization doesn’t have representative business content to leverage, a resource that might provide just the right information to launch your scoping journey is architecture diagrams. In a previous blog we enticed the idea that Entity Relationship Diagrams (ERD) can be used as blueprints to navigate and leverage the data stored in relational databases. In this entry, we will dive a bit deeper into the information represented in diagrams, specifically in UML diagrams, and discuss some of the advantages and pitfalls of utilizing them in the early stages of your solution design. We will also share examples of the types of UML diagrams that we’ve found most helpful during the solution modeling process.

What are UML diagrams and how can they supplement your solution scope?

UML, shorthand for Unified Modeling Language, is a specification that allows IT professionals to represent the design and functionality of software systems through standard graphical elements: UML diagrams. There are fourteen UML diagram types grouped into two main categories: structural and behavioral. Structural diagrams show a system's structure, while behavioral diagrams show how a system should function.

A tree diagram with two branches: Structural and Behavioral. The Structural branch contains 7 components: Class, Component, Composite Structure, Deployment, Object, Profile, and Package. The Behavioral branch contains 7 components: Activity, Communication, Interaction Overview, Sequence, State, Timing, and Use Case

Given their capability to convey information in a concise, conventional format, these pictorial representations are some of the most popular tools in every software engineer's toolkit. For business owners and information managers, UML diagrams allow teams to visualize how a process is working, the sequence of tasks, how the data flows from one platform to another, and the systems that produce it. Below you will find a summary of the notation for a class diagram.

A summary of UML Class Diagram Notation

As you embark on your scoping journey, consider stress testing your solution’s requirements, functionalities, and assumptions against the organizational ecosystem illustrated in the UML diagrams. You can compare your design to the design of systems elsewhere in the organization, identifying points of alignment as well as the unique features of your own solution. This iterative exercise will let you construct a more accurate and complete picture of the agents, processes, scenarios, behaviors and data that you scoped.

A piece of advice before you move further down the page. It is important to warn you against the temptation to simply take any existing UML model and replicate it in your scope. Instead, you should analyze UML diagrams just as you would any other piece of content that you receive: assessing their shortcomings in their current state as well as their capabilities to support your envisioned “end state” solution.

Where do I start?

Now that we know that there are business artifacts that can function as charts to navigate the organizational information ecosystem, we can go back to the original question of where to start. In order to capitalize on the information contained in a UML diagram, first and foremost, you need to set a baseline that represents what you expect from your solution, whatever this might be. A simple way of visualizing your thoughts is to draw a diagram. You can use pen and paper, a whiteboard, or a diagramming software. For your drawing, think of the “things” that you want your model to represent and how they relate to each other.

In the initial phase, your drawing doesn’t have to be complete or accurate. What is important is that you capture the information that reflects your current view and understanding of the environment you will be operating in and the outcomes you would like to see. You should expect challenges, questions, and expansions to your original drawing. Changes are part of the development cycle, particularly during the initial stages.

While you can start your definition process at any level of abstraction, try to think in “concrete” objects. As you grow in your modeling competency and knowledge of the environment, you will be able to incorporate more abstract concepts in your model. Some relatable examples might be customers, orders, and products. These will be your concepts and can be represented as labeled circles.

Three circles, each containing one word: Customer, Order, Product

To represent relationships between concepts, you can simply draw a line from one circle to another and add a word or a short phrase that describes the connection.

Four circles with arrows linking them. Customer is linked to Product, and Product is linked to Order. The arrows linking the circles have their own names, which define the relationships.

Congratulations! You’ve successfully completed the first step in your scoping journey.

Talk to your stakeholders

At this point, you may want to show your diagram to stakeholders. The goal is to gather sufficient information that allows you to develop a functional definition for each one of the concepts and relationships that you originally came up with. This is particularly important if you are thinking about using your solution to support interoperable applications.

In our example above, the definition of what a product is may differ by department, team, or source of revenue. For instance, does product encompass services? Who is a customer? Can both people and organizations be customers? What information do we collect about them? Is a customer the same as a client, a patron, or a shopper (all terms currently in use by different teams)?

It doesn’t matter if your solution is centered around taxonomies, ontologies, or knowledge graphs. They all are about meaning. Explicit meaning allows people and machines to collect, process, and exchange information more efficiently, and reduces the risk of misinterpretation and downtime due to data incompatibility. It is worth the time and effort you spend clarifying and documenting the variations and nuances of each piece of content that you analyze.

Ask the DBAs for their UML diagrams

It is now time to use your knowledge of UML diagrams. In addition to your business owners, database administrators (DBAs) are a group you will want to reach out to during the early phases of your scope definition. Since their function is to design databases and structures for new and existing applications, DBAs usually have a grounded perspective of the overall application ecosystem and infrastructure of an organization. You should definitely seek their expertise to inform your solution design.

You need to be strategic and determine the type of diagram that could provide the information that you need to consolidate your initial design. When you meet with the DBAs, ask them to walk you through their UML diagrams. Do they validate the concepts or relationships that came out of your initial "pen and paper" drawing?

If you go back to the types of diagrams listed at the beginning of this post, you will notice that there is a diagram that allows software engineers to represent classes and relationships: the Class diagram. The following image represents the same concepts from our example above: customer, order, and product.

An example UML Diagram linking Customers, Orders, Order Details, Product, Payment, and Customer types

By studying these drawings, you can start filling gaps in your knowledge, refining or reassuring your initial assumptions and understanding. Once again, you must make the extra effort not to replicate what currently exists. The goal with this analysis is to identify reusable elements but, more importantly, to make note of any missing pieces that are critical to the success of your solution. In our original drawing for instance, we had considered concepts for customer, product, and order. By inspecting the UML diagram we realize that it might be worth considering adding the concept of payment to our ontology or breaking down the product concept into more specific subconcepts. We can also reach the conclusion that our solution does not require distinguishing between domestic and international customers.
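
To illustrate the translation step, the sketch below seeds ontology classes from the UML classes in the running example and turns associations into typed properties. The mapping is purely illustrative, not a formal UML-to-OWL standard:

```python
# Sketch: seeding an ontology from UML class-diagram elements. The
# mapping is illustrative, not a formal UML-to-OWL transformation.
# Requires: pip install rdflib
from rdflib import Graph, Namespace, RDF, RDFS
from rdflib.namespace import OWL

EX = Namespace("http://example.com/ontology#")
g = Graph()
g.bind("ex", EX)

# UML classes from the example diagram become OWL classes.
for uml_class in ["Customer", "Order", "Product", "Payment"]:
    g.add((EX[uml_class], RDF.type, OWL.Class))

# UML associations become object properties with domain and range.
associations = [("places", "Customer", "Order"),
                ("contains", "Order", "Product"),
                ("settledBy", "Order", "Payment")]
for name, domain, range_ in associations:
    g.add((EX[name], RDF.type, OWL.ObjectProperty))
    g.add((EX[name], RDFS.domain, EX[domain]))
    g.add((EX[name], RDFS.range, EX[range_]))

print(g.serialize(format="turtle"))
```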

Another common situation is when a solution demands that you specify the causes for an event. In those cases, an Activity diagram might be a more appropriate artifact to analyze since they denote the actions, decision points, flows, start and end points of a given process.

In the diagram below, we are showcasing the standard notation for Activity diagrams and specifying what happens when a customer places an order. Once again, compare against your “pen and paper” drawing. Do you need a concept for the shipping company or the agent that processes, fills, and closes the order?

An example UML diagram outlining the process for an order from request to close

As a final word, we would like to reiterate that it is extremely easy to get caught up in the details of what currently exists. Try to steer clear of discussing particulars such as the names of tables, columns, and data types, or trying to pin down the specifics of any existing platforms and deployments. At the onset, you should strive to define your solution in system-agnostic terms and detach it from specific implementations. Your ultimate goal when consulting UML diagrams is to help you determine what's in and out of scope as well as the essential components of your desired solution. This is how you start assessing information gaps, prioritizing systems, clarifying processes, and identifying key partners and stakeholders to collect additional input. Not a bad place to start!

Conclusion

Many of our projects have benefited from incorporating UML diagrams as part of the initial information gathering activities. This is no surprise. UML diagrams are concise sources of information that can augment, validate, prove or disprove the assumptions, preconditions, and requirements established in preliminary scopes or solution designs.

For more information on case studies where we’ve leveraged these techniques to scope solutions or to get help doing so for your organization, contact us at info@enterprise-knowledge.com.

The Importance of a Semantic Layer in a Knowledge Management Technology Suite
https://enterprise-knowledge.com/the-importance-of-a-semantic-layer-in-a-knowledge-management-technology-suite/
Thu, 27 May 2021

One of the most common Knowledge Management (KM) pitfalls at any organization is the inability to find fresh, reliable information at the time of need. 

One of the most prominent causes of this inability that EK has seen recently is that an organization possesses multiple content repositories that lack a clear intention or purpose. As a result, users are forced to visit each repository within their organization's technology landscape one at a time in order to search for the information that they need. Further, this problem is often exacerbated by other KM issues, such as a lack of proper search techniques, organizational mismanagement of content, and content sprawl and duplication. In addition to a loss in productivity, these issues lead to rework, individuals making decisions on outdated information, employees losing precious working time trying to validate information, and users relying on experts for information they cannot find on their own.

Along with a solid content management and KM-related strategy, EK recommends that clients experiencing these types of findability-related issues also seek solutions at the technical level. It is critical that organizations take advantage of the opportunity to streamline the way their users access the information they need to do their jobs; this will reduce the time and effort users spend searching for information and alleviate the aforementioned challenges. This blog will explain how organizations can proactively mitigate the challenges of siloed information in different applications by instituting a unique set of technical solutions, including taxonomy management systems, metadata hubs, and enterprise search.

With the abundance and variety of content that organizations typically possess, it is often unrealistic to have one repository that houses all types of content. There are very few, if any, content management systems on the market that can optimally support the storage of every type of content an organization may have, let alone possess the search and metadata capabilities required for proper content management. Organizations can address this dilemma by having a unified, centralized search experience that is able to search all content repositories in a secure and safe manner. This is achieved through the design and implementation of a semantic layer – a combination of unique solutions that work together to give users a single place to search for content while, behind the scenes, returning results from multiple locations.

In the following sections, I will illustrate the value of Taxonomy Management Systems, Enterprise Search, and Metadata Hubs that make up the semantic layer, which collectively enable a unique and highly beneficial set of solutions.

The semantic layer is made up of three main systems/solutions: a Taxonomy Management System (TMS), an Enterprise Search (ES) tool, and a Metadata Hub.
As seen in the image above, the semantic layer is made up of three main systems/solutions: a Taxonomy Management System (TMS), an Enterprise Search (ES) tool, and a Metadata Hub.

Taxonomy Management Systems

In order to pull consistent data values back from different sources and filter, sort, and facet that data, there must be a taxonomy in place that applies to all content, in all locations. This is achieved by the implementation of an Enterprise TMS, which can be used to create, manage, and apply an enterprise-wide taxonomy to content in every system. This is important because it's likely there are already multiple, separate taxonomies built into various content repositories that are different from one another and therefore cannot be leveraged in one system. An enterprise-wide taxonomy allows for the design of a taxonomy that applies to all content, regardless of its type or location. An additional benefit of having an enterprise TMS is that organizations can utilize the system's auto-tagging capabilities to assist in the tagging of content in various repositories. Most, if not all, major contenders in the TMS industry provide auto-tagging capabilities, and organizations can use these capabilities to significantly reduce the burden on content authors and curators to manually apply metadata to content. Once integrated with content repositories, the TMS can automatically parse content, assign metadata based on a controlled vocabulary (stored in the enterprise taxonomy), and return those tags to a central location.
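
The toy sketch below conveys the auto-tagging step; commercial TMS products use far richer NLP than this exact-match approach, and the taxonomy and content are invented:

```python
# Toy sketch of taxonomy-driven auto-tagging: match controlled
# vocabulary terms and their synonyms against content text.
# Commercial TMS auto-taggers use far richer NLP. Terms are invented.
taxonomy = {
    "Risk Management": ["risk management", "risk mitigation"],
    "Compliance": ["compliance", "regulatory requirement"],
}

def auto_tag(text: str) -> list[str]:
    text_lower = text.lower()
    # Tag the content with every concept whose preferred label or
    # synonym appears in the text.
    return [concept for concept, labels in taxonomy.items()
            if any(label in text_lower for label in labels)]

doc = "This policy outlines regulatory requirements for risk mitigation."
print(auto_tag(doc))  # ['Risk Management', 'Compliance']
```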

Metadata Hub

The next piece of this semantic layer puzzle is a metadata hub. We often find that one or more content repositories in an organization's KM ecosystem lack the necessary metadata capabilities to describe and categorize content. This is extremely important because metadata facilitates the efficient indexing and retrieval of content. A 'metadata hub' can help to alleviate this dilemma by effectively giving those systems their needed metadata capabilities as well as creating a single place to store and manage that metadata. The metadata hub, when integrated with the TMS, can apply the taxonomy and tag content from each repository, and store those tags in a single place for a search tool to index.

This metadata hub acts as a 'manage in place' solution. The metadata hub points to content in its source location. Tags and metadata that are being generated are only stored in the metadata hub and are not 'pushed' down to the source repositories. This "pushing down" of tags can be achieved with additional development, but is generally avoided so as not to disrupt the integrity of content within its respective repository. The main goal here is to have one place that contains metadata about all content in all repositories, and that this metadata is based on a shared, enterprise-wide taxonomy.

Enterprise Search

The final component of the semantic layer is Enterprise Search (ES). This is the piece that allows for individuals to perform a single search as opposed to visiting multiple systems and performing multiple searches, which is far from the optimal search experience. The ES solution acts as the enabling tool that makes the singular search experience possible. This search tool is the one that individuals will use to execute queries for content across multiple systems and includes the ability to filter, facet, and sort content to narrow down search results. In order for the search tool to function properly, there must be integrations set up between the source repositories, the metadata hub, and the TMS solution. Once these connectors are established, the search tool will be able to query each source repository with the search criteria provided by the user, and then return metadata and additional information made available by the TMS and metadata hub solutions. The result is a faceted search solution similar to what we are all familiar with at Amazon and other leading e-commerce websites. These three systems work together to not only alleviate the issues created by a lack of metadata functionalities in source repositories, but also to give users a single place to find anything and everything that relates to their search criteria.
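
A compressed sketch of that federated pattern appears below; the repository interfaces and index contents are hypothetical placeholders, not a real product API:

```python
# Compressed sketch of federated enterprise search: query each
# repository, carry along hub-style metadata, and merge the results.
# Repository interfaces and index contents are hypothetical.
from dataclasses import dataclass

@dataclass
class Hit:
    title: str
    source: str
    tags: list[str]

def search_repo(repo_name: str, query: str) -> list[Hit]:
    # Placeholder: a real connector would call the repository's API
    # and enrich each hit with tags from the metadata hub.
    fake_index = {
        "media": [("Onboarding video", ["HR", "Training"])],
        "articles": [("Onboarding checklist", ["HR"])],
    }
    return [Hit(title, repo_name, tags)
            for title, tags in fake_index.get(repo_name, [])
            if query.lower() in title.lower()]

def federated_search(query: str) -> list[Hit]:
    # One user query fans out to every repository; results come back
    # as a single list the UI can facet by source or tag.
    hits: list[Hit] = []
    for repo in ["media", "articles"]:
        hits.extend(search_repo(repo, query))
    return hits

for hit in federated_search("onboarding"):
    print(f"{hit.title} [{hit.source}] tags={hit.tags}")
```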

Bringing It All Together

The value of a semantic layer can be exemplified through a common use case:

Let's say you are trying to find out more information about a certain topic within your organization. In order to do this, you would love to perform a search for everything related to this certain topic, but realize that you have to visit multiple systems to do so. One of your content repositories stores digital media, i.e. videos and pictures, another of your content repositories stores scholarly articles, and another one stores information on individuals who are experts on the topic. There could be many more repositories, and you must visit each one separately and search within each system to gather the information you need. This takes considerable time and effort and, in a best-case scenario, makes for a painstakingly long search process. In a worst-case scenario, content is missed and the research is incomplete.

With the introduction of the semantic layer, searchers would only have to visit one location and perform a single search. When doing so, searchers would see the results from each individual repository all in one location. Additionally, searchers would have extensive metadata on each piece of content to filter by, ensuring that they find the information they are looking for. Normally, when we build these semantic layers, the search allows users to narrow results by source system, content type (article, person, digital media), date created or modified, and many more. Once the searcher has found their desired content, a convenient link is provided which will take them directly to the content in its respective repository.

Closing

The increasingly common issue of having multiple, disparate content repositories in a KM technology stack is one that causes organizations to lose valuable time and effort, while hindering employees’ ability to efficiently find information through mature, proven metadata and search capabilities. Enterprise Knowledge (EK) specializes in the design and implementation of the exact systems mentioned above and has proven experience building out these types of technologies for clients. If your company is facing issues with the findability of your content, struggling with having to search for content in multiple places, or even finding that searching for information is a cumbersome task, we can help. Contact us with any questions you have about how we can improve the way your organization searches for and finds information within your KM environment.

Five Steps to Implement Search with a Knowledge Graph
https://enterprise-knowledge.com/five-steps-to-implement-search-with-a-knowledge-graph/
Mon, 19 Apr 2021

Knowledge Graphs and Search are commonly linked together to support search use cases such as:

  • Returning contextual relationships with search results;
  • Displaying relevant topics in a knowledge panel; or
  • Powering an expert finder.

These advanced use cases enable an organization to provide more domain context and organizational information to users, reducing user time spent searching and improving a user’s ability to discover new content through recommendations. The five steps that EK recommends to implement search with a knowledge graph are as follows.

  1. Analyze the Search Content
  2. Develop an Ontology for the Knowledge Graph
  3. Design the User Search Experience
  4. Ingest the Data
  5. Implement and Iterate

Depending on your workflow, these steps may not occur in a waterfall order, so keep in mind that, for example, step 3 could be started while step 2 is still in progress. Also, these steps are analogous to the steps necessary to implement a semantic architecture.

Step One: Analyze the Search Content

The first step to a successful knowledge graph search implementation is to analyze the information available for users. If you are just starting a search effort, start small and analyze a handful of data sources that contain key information that end-users always need. This step often involves interviews with business and technical data source owners as well as users to answer the following questions. At the end of this step, you will have a collection of information about each data source’s content: what information exists, where it lives, and how it can be leveraged.

What information is available?

We want to identify each type of information available from a data source. If we are analyzing a content management system, it may contain deliverables and reports. However, do not stop there. Continue asking questions to dive deeper into what is available.

  • What metadata fields exist on a report?
  • Can we segment the deliverables at all? i.e. Can we retrieve or link to the pages separately?
  • What users worked on this document?

As you dive deep into the content, you will surface key pieces of information that can be put together to solve user needs.

Where is the information and how do we get it?

These two questions inform the development process later on and ensure that information is actually available for use. We find it key to meet with the technical owners of each data source, as they can pinpoint where information lives within a system, how it is generated, and, most importantly, how we can extract it for use in the knowledge graph. It is best to start this conversation early, as there may be security concerns or development steps needed to build out an integration point.

How is the information related to other information?

Once you know what information is available, facilitate a conversation with the business owners to determine where the information originates and how it relates to other data sources. With this question, we hope to surface concepts such as:

  • Content lifecycle processes that could be tracked to add more context to search;
  • Opportunities to combine information from multiple data sources together; or
  • New data sources that we should analyze in the future.

Knowledge graphs are great at representing and querying interconnected data as well as providing means to infer additional relationships. We want to take advantage of this feature as much as possible since it helps drive the search user interface design (that we will talk about later).

By collecting the answers to these questions, you are making it easier to take the next steps in implementing search. If step one is still unclear, think of it like designing a content type: our main goal is to create custom search results that utilize all of the information at your organization. Understanding not only where information is, but where it comes from and how it will change over time, is crucial to the next step of modeling the information.
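The inventory itself can be captured in something as lightweight as a spreadsheet or, as in this sketch, a small data structure; the field names and the example below are our own illustration, not a prescribed schema.

    from dataclasses import dataclass, field

    @dataclass
    class DataSourceProfile:
        """One entry in the step-one inventory; all field names are illustrative."""
        name: str
        information_types: list[str]          # what information is available?
        key_metadata: list[str]               # metadata fields worth surfacing in search
        access_method: str                    # how do we get it? (API, export, DB connection)
        related_sources: list[str] = field(default_factory=list)  # how does it relate?

    cms = DataSourceProfile(
        name="Content Management System",
        information_types=["deliverable", "report"],
        key_metadata=["title", "author", "date_modified"],
        access_method="REST API",
        related_sources=["HR system (author profiles)", "Project tracker"],
    )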

Step Two: Develop an Ontology for the Knowledge Graph

At the end of the first step, we have a large amount of data describing the information contained in all of our data sources and how they relate to each other. The next step is to figure out how we can leverage the information to answer user questions and build a model to support them. This model, academically referred to as an ontology, is the data model of the knowledge graph that we will be piecing together in step four.

Define the User Questions

We strongly believe that the best way to ensure any solution’s success is to gather requirements from the users. EK usually facilitates a search workshop with end-users and business stakeholders to elicit feature requirements and determine what information users find helpful. In step one, you collected a lot of data describing the types of information available. Use this data to ask pointed questions, gauging user interest in the data you uncovered. Work with the group to determine how they would like to see information displayed and what questions they would ask of the data. This is an opportunity for users and stakeholders to think outside the box and come up with their ideal solution, no matter how out of scope it may seem at the time. Every idea may be used later while iterating on the solution or to influence the creation of similar features.

Determine the Classes, Attributes, and Relationships

Almost all information can be represented using classes, their attributes, and the relationships between them. Once you know the questions that users want to ask and the requirements for the solution, you can begin to break down the data from step one into classes. For this process, work through the following questions.

  • What types of information does search need to display?
    (e.g. employees, deliverables)
  • For each type, what properties are necessary to display the information in an intuitive way for users?
    (e.g. do end-users need to see the employee’s email?)
  • For each type, what relationships exist to other types of information?
    (e.g. are employees related to deliverables at all?)

A majority of these questions will leverage the data collected in step one, but the data is now tuned to match the needs of the users and stakeholders. Use ontology design best practices to validate the reusability and scalability of the data model. The selected classes (types), properties (attributes), and relationships form the initial ontology.
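As an illustration only, a first cut of such an ontology might be expressed in RDF/OWL with a library like rdflib; the class and property names below are hypothetical stand-ins for whatever your step-one analysis surfaces.

    from rdflib import Graph, Namespace
    from rdflib.namespace import OWL, RDF, RDFS

    # Hypothetical namespace for the organization's ontology
    EX = Namespace("https://example.org/ontology#")

    g = Graph()
    g.bind("ex", EX)

    # Classes (types) surfaced in step one
    for cls in (EX.Employee, EX.Deliverable, EX.Project):
        g.add((cls, RDF.type, OWL.Class))

    # An attribute users asked to see in search results
    g.add((EX.email, RDF.type, OWL.DatatypeProperty))
    g.add((EX.email, RDFS.domain, EX.Employee))

    # A relationship connecting two classes
    g.add((EX.contributedTo, RDF.type, OWL.ObjectProperty))
    g.add((EX.contributedTo, RDFS.domain, EX.Employee))
    g.add((EX.contributedTo, RDFS.range, EX.Deliverable))

    print(g.serialize(format="turtle"))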

Map the Data Sources to the Ontology

It is critical to keep a mapping of the data source information to the ontology so that you can maintain and upgrade the ontology in future iterations. Keep track of where each type of information originates, how attributes are calculated, and what steps are taken to extrapolate relationships within the information. While developing the mapping, pull a sample set of information from the data sources and mock up some data. Use this mocked data to validate the data types that should be used for each attribute with a technical member of the team. This ensures that the mapping has realistic inputs and outputs that can be leveraged when creating the data pipelines in step four.

Use the knowledge you already have to create complete views of your organization’s information, including people and clients.
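Building on the mapping exercise above, here is a sketch of how such a mapping might be recorded as reviewable data alongside a mocked record; every field and property name shown is hypothetical.

    # Hypothetical mapping from CMS export fields to ontology terms; keeping the
    # mapping as reviewable data makes it easy to maintain as the ontology evolves.
    CMS_TO_ONTOLOGY = {
        "class": "ex:Deliverable",
        "fields": {
            "title":        {"property": "ex:title",        "datatype": "xsd:string"},
            "modified_on":  {"property": "ex:dateModified", "datatype": "xsd:date"},
            "author_email": {"property": "ex:contributedTo", "lookup": "ex:Employee"},
        },
    }

    # Mocked source record used to validate data types with the technical team
    sample_record = {"title": "Q3 Risk Report", "modified_on": "2020-09-30",
                     "author_email": "jane.smith@example.org"}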

Step Three: Design the User Search Experience

In steps one and two, we put our full attention on the data sources, interpreting the available information into a data model that will enable us to populate a knowledge graph. In this step, we want to shift our focus to the end-users and make sure we build a search solution that will solve user needs through an intuitive interface, leveraging the full capabilities of a knowledge graph.

Define the Search User Stories

Work with the application stakeholders and users to define user stories that will help guide the user interface design. Here’s a blog we have written about the three key benefits of user stories:

  • Perspective: Define the search and interface requirements (not features!) from the view of a user. What does a user need from the search solution?
  • Purpose: Determine why requirements are needed and the benefits they bring. This enables the team to brainstorm and build the best feature to meet the requirements.
  • Priority: Work with users and stakeholders to order the requirements. A prioritized backlog of requirements ensures the team delivers high-interest items first.

When defining the user stories, keep an eye out for use cases that could be solved through an action-oriented search result. We want to note what data points are important to users so that we can best leverage them in the design process to enable users to take immediate action.

Design using Search Best Practices

Start simple and include the basic search features: the search bar, results, and facets. These basic features ensure that anyone, regardless of background, can find and discover information within search. Facilitate design workshop sessions with users and stakeholders to design search results for each type of information, incorporating search best practices.

Use a consistent view when displaying the same content on multiple pages.

Determine which attributes and relationships in the data need to be highlighted in search results versus those that should only be displayed in spots requiring an additional click, like an accordion dropdown or an entirely new page. When designing the interface, standardize how users will interact with the interface and different content types. The consistent interactions build trust with users and ensure that interacting with search is intuitive.

Innovate with Knowledge Graph Search Features

Up until now, step three has been all about designing the search solution using search design best practices. Now that we have that baseline, we want to include knowledge graph-specific features like the ones below.

Identify the Search Subject

Use named-entity recognition (NER) or a knowledge graph entity lookup to identify what a user is looking for and present the user with all relevant compiled information about that entity. For example, imagine the search information includes people, documents, and projects. If a user searches for the id of a project, redirect the user to a project page that includes all of the project metadata, links to the documents associated with the project, and all team members who worked on the project. Creating these encyclopedia-like pages for an organization’s content can greatly improve the user’s ability to find the information they are looking for.
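A sketch of the graph side of such a page might look like the following, assuming a SPARQL endpoint and hypothetical ontology properties (projectId, partOfProject, workedOn); treat it as an illustration under those assumptions, not a definitive implementation.

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Hypothetical SPARQL endpoint and ontology prefix
    sparql = SPARQLWrapper("https://graph.example.org/sparql")
    sparql.setReturnFormat(JSON)

    def project_page_data(project_id: str) -> dict:
        """Gather everything linked to a project for an entity landing page."""
        # NB: real code should sanitize project_id before interpolating into SPARQL
        sparql.setQuery(f"""
            PREFIX ex: <https://example.org/ontology#>
            SELECT ?name ?status ?doc ?member WHERE {{
                ?project ex:projectId "{project_id}" ;
                         ex:name ?name ;
                         ex:status ?status .
                OPTIONAL {{ ?doc ex:partOfProject ?project . }}
                OPTIONAL {{ ?member ex:workedOn ?project . }}
            }}""")
        page = {"documents": set(), "team": set()}
        for row in sparql.queryAndConvert()["results"]["bindings"]:
            page["name"] = row["name"]["value"]
            page["status"] = row["status"]["value"]
            if "doc" in row:
                page["documents"].add(row["doc"]["value"])
            if "member" in row:
                page["team"].add(row["member"]["value"])
        return page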

Extend the Search Results

Along the same lines as the above, surface additional information (properties and relationships) about the search query and search results from the knowledge graph. If a specific term or entity is recognized in the search query, use it to populate a knowledge panel on the right-hand side with all relevant information about that term or entity. A knowledge panel provides users with a snapshot of information based on their search query. When displaying the knowledge panel and search results, pull the most up-to-date contextual information about a search result from the knowledge graph; for example, project statuses, the most recent documents, or the most similar content within the knowledge graph by metadata.

A knowledge panel collects and highlights project details in one place for a user search.

Natural Language Search Across Data

One of the most powerful resources for a knowledge graph search is natural language processing (NLP). NLP enables search to recognize entities in the graph as well as user intent. In one of our knowledge graph projects, EK developed an NLP-based search that recognized what entities a user was asking for and used that context to automatically collect and prioritize big data in a tabular format for analysts to review. This gave business analysts quick access to the data insights they needed from multiple large datasets. The ability to recognize the intent behind a user’s query enables the search interface to adapt and provide specialized answers to the most important questions.
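As a minimal sketch of the first stage of such a pipeline, off-the-shelf NER (here via spaCy; the example query and names are invented for illustration) can split a query into entities to resolve against the graph and keywords for the index.

    import spacy

    # Small pretrained pipeline for illustration; a production system would use a
    # model tuned to the organization's own entities (project ids, clients, etc.).
    nlp = spacy.load("en_core_web_sm")

    def parse_query(query: str) -> dict:
        """Split a search query into recognized entities and leftover keywords."""
        doc = nlp(query)
        entities = [(ent.text, ent.label_) for ent in doc.ents]
        keywords = [t.lemma_.lower() for t in doc
                    if not t.is_stop and not t.is_punct and not t.ent_type_]
        return {"entities": entities, "keywords": keywords}

    print(parse_query("reports Jane Smith wrote for the Atlas project in 2020"))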

Step Four: Ingest the Data

Steps one, two, and three focus on analyzing, prepping, and designing the knowledge graph search solution. Now that we have our initial plan, we can pull the data together through extract, transform, and load (ETL) pipelines and populate our knowledge graph.

Index the Data Source Information

Using the data collected in step one, build out the integrations with each of the required data sources. When possible, use application programming interfaces (APIs) or other feeds to extract content from the source systems. If this is not possible, database connections or temporary data exports may be required in order to prove out the integrations. Next, determine how content will be indexed from the sources by answering the following questions.

  • What amount of content should be extracted each time the pipeline runs?
  • How often should the content be indexed?
  • Does the content from this data source need to be combined with any other data source?

There are numerous indexing techniques to ensure that the knowledge graph and search data stay accurate and up to date without overloading the indexing or data pipelines.
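One common answer to the first two questions is delta (incremental) indexing, sketched below with hypothetical fetch and upsert hooks standing in for the real source and index clients.

    from datetime import datetime, timezone

    # In-memory cursor store for illustration; a real pipeline would persist this.
    _cursors: dict[str, datetime] = {}

    def incremental_index(fetch_changed_since, upsert, source_name: str) -> None:
        """Re-index only records changed since the last successful run.

        fetch_changed_since(last_run) and upsert(doc_id, doc) are supplied by the
        caller for the specific source system and index (hypothetical interfaces).
        """
        last_run = _cursors.get(source_name)        # None triggers a full load
        for record in fetch_changed_since(last_run):
            upsert(record["id"], record)
        _cursors[source_name] = datetime.now(timezone.utc)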

Transform Information using the Ontology Mapping

When designing the ETL pipelines, reference the ontology and the ontology-to-data-source mapping to ensure that all information is transformed into the expected format. In most cases, this involves using transformation techniques like object mapping (e.g. Entity Framework) or XSLT to transform information from the source format into a graph data format (e.g. RDF) or into a document format for search (e.g. JSON). This is the first time that all information from a data source is being transformed, so expect some data quality issues: required fields may not always be present, values may not match the expected types, and data standardization issues may surface. Work with your stakeholders and data source owners to determine where and how issues should be addressed.

Transform and enrich your knowledge with a consistent vocabulary to populate a knowledge graph and display that information to users.
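A minimal sketch of that transform step, reusing the hypothetical step-two mapping and rdflib to emit RDF, might look like this; note the guards around fields that may be missing in the source data.

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF, XSD

    EX = Namespace("https://example.org/ontology#")  # hypothetical ontology namespace

    def deliverable_to_rdf(record: dict) -> Graph:
        """Transform one source record into triples per the step-two mapping."""
        g = Graph()
        subject = URIRef(f"https://example.org/deliverable/{record['id']}")
        g.add((subject, RDF.type, EX.Deliverable))
        # Required fields may be missing in practice; route gaps to a review queue
        if record.get("title"):
            g.add((subject, EX.title, Literal(record["title"], datatype=XSD.string)))
        if record.get("modified_on"):
            g.add((subject, EX.dateModified, Literal(record["modified_on"], datatype=XSD.date)))
        return g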

Enrich the Information with Context

One key piece of developing relationships between information from various sources is to leverage NER or an existing taxonomy. As content is pulled into the knowledge graph and search, the metadata fields provided with each type of information may not be enough. Combining information from multiple sources builds a better picture of each information type, but some of the best inputs for similarity reasoning and clustering come from associating content with entities through auto-tagging against a taxonomy, NER, and topic modeling of terms. When designing the ETL pipelines, consider how content may be enriched by adding auto-tagging and NER techniques to the pipelines.
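For illustration, the simplest possible enrichment step is substring matching against taxonomy labels, as in this sketch; the taxonomy shown is invented, and production pipelines would use a proper tagging engine or trained model.

    # A deliberately naive auto-tagger that matches taxonomy labels and synonyms
    # in text; real pipelines typically use a tagging engine or a trained model.
    TAXONOMY = {
        "risk-management": ["risk management", "operational risk"],
        "knowledge-graph": ["knowledge graph", "ontology", "rdf"],
    }

    def auto_tag(text: str) -> list[str]:
        lowered = text.lower()
        return [concept for concept, labels in TAXONOMY.items()
                if any(label in lowered for label in labels)]

    print(auto_tag("This deliverable describes an RDF model for operational risk."))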

Step Five: Implement and Iterate

At this point, the indexed information is within the knowledge graph and search platform. In this step, build out a prioritized feature set based on the search designs from step three. Depending on the selected development stack, it may be beneficial to build an API layer on top of the knowledge graph and search platform and leverage these APIs to pull data into the user interface. In order to get the interface in front of stakeholders quickly, you may need to leverage some of the sample data you created in step two.
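As a sketch of that API layer (assuming FastAPI; the endpoint shape and stubbed clients are illustrative, not a prescribed design), a single endpoint can fan out to both the search index and the graph so the UI only ever calls one service.

    from fastapi import FastAPI

    app = FastAPI()

    # Stub clients for illustration; in practice these wrap the search index
    # and the knowledge graph's SPARQL endpoint.
    def query_search_index(q: str, content_type: str | None) -> list:
        return []

    def query_knowledge_panel(q: str) -> dict | None:
        return None

    @app.get("/api/search")
    def search(q: str, content_type: str | None = None):
        """Single endpoint the UI calls; fans out to search and the graph."""
        return {
            "results": query_search_index(q, content_type),
            "knowledgePanel": query_knowledge_panel(q),
        }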

Developing a user-centric product requires feedback early and often. Validate the designs with both stakeholders and users through demos and user testing. Demos allow stakeholders to give instant feedback on the solution as soon as it is available. For user testing, give users tasks to perform and observe how they complete them. It is important to note where users click, where their eyes are drawn first, and how design choices impact the flow of navigation.

Prioritize and iterate on the user interface based on user feedback and testing.

Make sure to continue to explore the unique features of knowledge graphs. Incorporating new, relevant sources with relationships to existing data can help cover edge cases of search queries that are not yet answered. Highlighting inferences made by traversing the knowledge graph at query time can bring users to content they would not otherwise have discovered.

Always remember: search is a journey.

Finally, iterate on the solution and respond to feedback. Steps one through five are meant to be repeated over and over as:

  • New data sources and information are considered for search;
  • Users ask different questions that require updating the ontology;
  • Designs adapt to feedback and testing to provide a more intuitive user experience;
  • Pipelines are extended to extract more information from the data sources; and
  • Features and design changes are required in the end solution.

Additionally, don’t forget to start small: EK recommends building out an end-to-end system, completing steps one through five for a subset of content and prioritized use cases. Use this first iteration to test capabilities and identify any integration risks or concerns.

Conclusion

These steps ensure that your organization builds the right search solution, creating a knowledge graph that answers your users’ questions and surfaces the results in an intuitive interface. Building search on top of a knowledge graph enables your organization to provide tailored, advanced search features as well as create a foundation of organizational knowledge that can be leveraged for other use cases such as chatbots, recommendation engines, and data analysis.

Interested in expanding your organization’s search to leverage the capabilities of a knowledge graph? Contact us and let’s work together to build a search solution that fits your organization’s needs.

The post Five Steps to Implement Search with a Knowledge Graph appeared first on Enterprise Knowledge.

]]>