graph database Articles - Enterprise Knowledge
http://enterprise-knowledge.com/tag/graph-database/

Enterprise Knowledge Graphs: The Importance of Semantics (May 23, 2024)
https://enterprise-knowledge.com/enterprise-knowledge-graphs-the-importance-of-semantics/

Heather Hedden, Senior Consultant at Enterprise Knowledge, presented “Enterprise Knowledge Graphs: The Importance of Semantics” on May 9, 2024, at the annual Data Summit in Boston. 

In her presentation, Hedden describes the components of an enterprise knowledge graph and provides further insight into the semantic layer – or knowledge model – component, which includes an ontology and controlled vocabularies, such as taxonomies, for controlled metadata. While data experts tend to focus on the graph database components (an RDF triple store or a labeled property graph), Hedden emphasizes that they should not overlook the importance of the semantic layer.

Explore the presentation to learn:

  • The definition and benefits of an enterprise knowledge graph
  • The components of a knowledge graph
  • The fundamentals of graph databases
  • The basic features of taxonomies and ontologies
  • The role of taxonomies and ontologies in knowledge graphs
  • How an enterprise knowledge graph is built

The Top 3 Ways to Implement a Semantic Layer (March 12, 2024)
https://enterprise-knowledge.com/the-top-3-ways-to-implement-a-semantic-layer/

Over the last decade, we have seen some of the most exciting innovations emerge within the enterprise knowledge and data management spaces. Those innovations with real staying power have proven to drive business outcomes and prioritize intuitive user engagement. Within this list are the semantic layer (for breaking the silos between knowledge and data) and, of course, generative AI (a topic that is often top of mind on today’s strategic roadmaps). Both have one thing in common: they are showing promise in addressing the age-old challenge of unlocking business insights from organizational knowledge and data, without the complexities of expensive data, system, and content migrations.

In 2019, Gartner published research emphasizing the end of “a single version of the truth” for data and knowledge management, predicting that by 2026, “active metadata” will power over 50% of BI and analytics tools and solutions, providing a structured and consistent approach to connecting, rather than consolidating, data.

By employing semantic components and standards (through metadata, business glossaries, taxonomy/ontology, and graph solutions), a semantic layer arms organizations with a framework to aggregate and connect siloed data/content, explicitly provide business context for data, and serve as the layer for explainable AI. Once connected, independent business units can use the organization’s semantic layer to locate and work with not only enterprise data, but their own, unit-specific data as well.

Incorporating a semantic layer into enterprise architecture is not just a theoretical concept; it is a practical enhancement that transforms how organizations harness their data. Over the last ten years, we’ve worked with a diverse set of organizations to design and implement the components of a semantic layer. Many organizations we work with support a data architecture that is based on relational databases, data warehouses, and/or a wide range of content management, cloud, or hybrid cloud applications and systems that drive data analysis and analytics capabilities. These models do not mean that organizations need to start from scratch or overhaul a working enterprise architecture in order to adopt a semantic layer. On the contrary, it is more effective to shift the focus of metadata and data modeling efforts toward adding the models and standards that capture business meaning and context, which provides the least disruptive starting point.

Though we’ve been implementing the individual components for over a decade, it is only in the last couple of years that we’ve been integrating them all to form a semantic layer. The maturity of approaches, technologies, and awareness has combined with organizations’ growing needs and the AI revolution to create this opportunity now.

In this article, I explore the three most common approaches we see for weaving this data and knowledge layer into the fabric of enterprise architecture, highlighting the applications and organizational considerations for each.

1. A Metadata-First Logical Architecture: Using Enterprise Semantic Layer Solutions

This is the most common and scalable model we see across various industries and use cases for enterprise-wide applications. 

Architecture 

Implementing a semantic layer through a metadata-first logical architecture involves creating a logical layer that abstracts the underlying data sources by focusing on metadata. This approach establishes an organizational logical layer through standardized definitions and governance at the enterprise level while allowing for additional, decentralized components and solutions to be “pushed,” “published,” or “pulled from” specific business units, use cases, and systems/applications at a set cadence. 

[Image: Semantic Layer Architecture]

Pros

Using middleware solutions like a data catalog or an ontology/graph storage, organizations are able to create a metadata layer that abstracts the underlying complexities, offering a unified view of data in real time based on metadata only. This allows organizations to abstract access, ditch application-centric approaches, and analyze data without the need for physical consolidation. This model effectively leverages the capabilities of standalone systems or applications to manage semantic layer components (such as metadata, taxonomies, glossaries, etc.) while providing centralized storage for semantic components to create a shared, enterprise semantic layer. This approach ensures consistency in core or shared data definitions to be managed at the enterprise level while providing the flexibility for individual teams to manage their unique secondary and group-level semantic data requirements.
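
As a minimal illustration of this metadata-first pattern, the sketch below registers datasets in a central RDF metadata catalog while the underlying data stays in its source systems. It assumes the open-source rdflib library; the namespace, source systems, and endpoints are illustrative placeholders, not any specific vendor’s schema.

```python
# A metadata-only catalog: register where data lives, never the data itself.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/semantic-layer/")

catalog = Graph()
catalog.bind("ex", EX)

# Register datasets by metadata only; the data stays in the source systems.
for dataset, system, endpoint in [
    ("CustomerMaster", "CRM", "jdbc://crm-prod/customers"),
    ("ProductCatalog", "PIM", "https://pim.example.org/api/products"),
]:
    node = EX[dataset]
    catalog.add((node, RDF.type, EX.Dataset))
    catalog.add((node, EX.managedBy, Literal(system)))
    catalog.add((node, EX.sourceEndpoint, Literal(endpoint)))

# A unified, metadata-based view: which system owns each dataset, and where?
query = """
SELECT ?dataset ?system ?endpoint WHERE {
    ?dataset a ex:Dataset ;
             ex:managedBy ?system ;
             ex:sourceEndpoint ?endpoint .
}"""
for row in catalog.query(query, initNs={"ex": EX}):
    print(row.dataset, row.system, row.endpoint)
```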

Cons

Implementing a semantic layer as a metadata architecture or logical layer across enterprise systems requires planning in phases and incremental development to maintain cohesion and prevent fragmentation of shared metadata and semantic components across business groups and systems. Additionally, depending on the selected approach for synchronizing the layer with downstream/upstream applications (push vs. pull), data orchestration and ETL pipelines will need to account for centralized vs. decentralized orchestration to ensure ongoing alignment.

Best Suited For

This is the approach we deploy most often, and it is well suited for organizations that want to balance standardization with the need for business-unit- or application-level agility in data processing and operations across different parts of the business.

2. Built-for-Purpose Architecture: Individual Tools with Semantic Capabilities

This model allows for greater flexibility and autonomy at the business unit or functional level. 

Architecture 

This architecture approach is a distributed model that leverages each standalone system’s or application’s capabilities to own semantic layer components – without a connected technical framework or governance structure at the enterprise level for shared semantics. With this approach, organizations typically identify establishing semantic standards as a strategic initiative, but each individual team or department (marketing, sales, product, data teams, etc.) is responsible for creating, executing, and managing its own semantic components (metadata, taxonomies, glossaries, graph, etc.), tailored to its specific needs and requirements.

[Image: Semantic Layer Architecture]

Most knowledge and data solutions such as content or document management systems (CMS/DMS), digital asset management systems (DAMs), customer relationship management (CRM) platforms, and data analytics/BI dashboards (such as Tableau and Power BI) have inherent capabilities to manage simple semantic components (although with varying levels of maturity and feature flexibility). This decentralized architecture results in the implementation of multiple system-level semantic layers. Take SharePoint, an enterprise document and content collaboration platform, as an example. For organizations in the early stages of growing their semantic capabilities, we leverage SharePoint’s Term Store for structuring metadata and managing taxonomies, which allows teams to create a unified language, fostering consistency across documents, lists, and libraries. This helps with information retrieval and also enhances collaboration by ensuring a shared understanding of key terms. Salesforce, a renowned CRM platform, likewise offers semantic capabilities that enable teams across sales, marketing, and customer service to define and interpret customer data consistently across various modules.

Pros

This decentralized model promotes agility and empowers business units to leverage their existing platforms (that are built-for-purpose) as not just data/content repositories but as dynamic sources of context and alignment, driving consistent understanding of shared data and knowledge assets for specific business functions.

Cons

However, this decentralized approach typically forces users who need a cohesive view of organizational content and data to piece it together through separate interfaces. Data governance teams or content stewards are also likely to manage each system independently. This leads to data silos, “semantic drift,” and inconsistency in data definitions and governance (where duplication and data quality issues arise). This ultimately results in misalignment between business units, as they may interpret data elements differently, leading to confusion and potential inaccuracies.

Best Suited For

This approach is particularly advantageous for organizations with diverse business units or teams that operate independently. It empowers business users to have more control over their data definitions and modeling and allows for quicker adaptation to evolving business needs, enabling business units to respond swiftly to changing requirements without relying on a centralized team. 

3. A Centralized Architecture: Within an Enterprise Data Warehouse (EDW) or Data Lake (DL)

This structured environment simplifies data engineering and ensures a consistent and centralized semantic layer specifically for analytics and BI use cases.

Architecture

Organizations that are looking to create a single, unified representation of their core organizational domains develop a semantic layer architecture that serves as the authoritative source for shared data definitions and business logic within a centralized architecture – particularly within an Enterprise Data Warehouse or Data Lake. This model makes it easier to build the semantic layer since data is already in one place, and cloud-based data warehousing platforms (e.g., Amazon Redshift, Google BigQuery, Snowflake, Azure Synapse, Databricks) can serve as a “centralized” location for semantic layer components.

Building a semantic layer within an EDW/DL involves consolidating and ingesting data from various sources into a centralized repository, identifying key data sources to be ingested, defining business terms, establishing relationships between different datasets, and mapping the semantic layer to the underlying data structures to create a unified and standardized interface for data access. 
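
As a simple illustration of the term-definition and mapping steps, the sketch below renders one governed business term as a SQL view over an assumed warehouse table. This is a hypothetical example, not a specific platform’s API; the table, columns, and SQL dialect (DATEADD-style functions) are all assumptions.

```python
# Map a governed business term to underlying warehouse structures,
# then render it as a consumable view in a "semantic" schema.
business_terms = {
    "active_customer": {
        "definition": "A customer with at least one order in the last 12 months",
        "source": "warehouse.orders",                                   # assumed table
        "logic": "MAX(order_date) >= DATEADD(month, -12, CURRENT_DATE)",  # assumed dialect
    },
}

def term_to_view_sql(name: str, term: dict) -> str:
    """Render one business term as a SQL view over the underlying table."""
    return (
        f"-- {term['definition']}\n"
        f"CREATE OR REPLACE VIEW semantic.{name} AS\n"
        f"SELECT customer_id FROM {term['source']}\n"
        f"GROUP BY customer_id HAVING {term['logic']};"
    )

print(term_to_view_sql("active_customer", business_terms["active_customer"]))
```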

[Image: Semantic Layer Architecture]

Pros

This architecture is a common implementation approach we support, specifically among dedicated data management, data analytics, and BI groups that consistently ingest data, set the implementation processes for changes to data structures, and enforce business rules through dedicated pipelines (ETL/APIs) for governance across enterprise data.

Cons

The core consideration here, and the one that usually suffers, is collaboration between business and data teams. That collaboration is pivotal during the implementation process: it guides investment in the right tools and solutions with semantic modeling capabilities, and it supports the creation of a semantic layer within this centralized landscape.

It is important to ensure that the semantic layer reflects the actual needs and perspectives of end users. Regular feedback loops and iterative refinements are essential for creating a model that evolves with the dynamic nature of business requirements. Adopting these solutions within this environment will enable the effective definition of business concepts, hierarchies, and relationships, allowing for translation of technical data into business-friendly terms.

Another important aspect of this type of centralized model is that it depends on data being consolidated or co-located, and it requires upfront investment of resources and time to design and implement the layer comprehensively. As such, it is important to start small by focusing on specific business use cases and a relevant scope of knowledge/data sources and foundational models that are highly visible and focused on business outcomes. This allows the organization to create a foundational model that can expand incrementally across the rest of the organization’s data and knowledge assets.

Best Suited For

We have seen this approach be particularly beneficial for large enterprises with complex but shared data requirements and a need for stringent knowledge and data governance and compliance rules: specifically, organizations that produce data products and need to control the data and knowledge assets shared internally or externally on a regular basis. This includes, but is not limited to, financial institutions, healthcare organizations, bioengineering firms, and retail companies.

Closing

A well-implemented semantic layer is not merely a technical necessity but a strategic asset for organizations aiming to harness the full potential of their knowledge and data assets, as well as have the right foundations in place to make AI efforts successful. The choice of how to architect and implement a semantic layer depends on the specific needs, size, and structure of the organization. When considering this solution, the core decision really comes down to striking the right balance between standardization and flexibility, in order to ensure that your semantic layer serves as an effective enabler for knowledge-driven decision making across the organization. 

Organizations that invest in an enterprise architecture through the metadata layer, and that rely on experts with modeling experience anchored in semantic web standards, find it the most flexible and scalable approach. As such, they are better positioned to abstract their data from vendor lock-in and to ensure the interoperability needed to navigate the complexities of today’s technologies and future evolutions.

For many organizations embarking on a semantic layer initiative, failing to understand and plan for a solid technical architecture and a phased implementation approach leads to unplanned investments or outright failure. If you are looking to get started and learn more about how other organizations are approaching scale, read more from our case studies or contact us if you have specific questions.

Recommendation Engine Automatically Connecting Learning Content & Product Data (September 19, 2023)
https://enterprise-knowledge.com/recommendation-engine-automatically-connecting-learning-content-product-data/


The Challenge

A bioscience technology provider – and a leader in scientific research and solutions – identified a pivotal challenge within their digital ecosystem, particularly on their public-facing e-commerce website. While the platform held an extensive reservoir of both product information and associated educational content, the content and data existed disjointedly (spread across more than five systems). As a result, their search interface failed to offer users a holistic and enriching experience. A few primary issues at hand were:

  • The search capability was largely driven by keywords, limiting its potential to be actionable.
  • The platform’s search functionality didn’t seamlessly integrate all available resources, leading to underutilized assets and a compromised user experience.
  • The painstaking manual process of collating content posed internal challenges in governance and hindered scalability.
  • In the absence of a cohesive content classification system, there was a disjunction between product information and corresponding educational content.
  • Inconsistencies plagued the lifecycle management of marketing content.
  • The array of platforms, managed by different product teams, exposed alignment challenges and prevented a unified user experience.

From a business perspective, the challenges were even more dire. The company faced potential revenue losses as users could not gain enough insight to make buying decisions. The user experience became frustrating due to irrelevant content and inefficient searches, manual processes limited employees, and data-driven decision-making about the value of the site’s content was impeded; as a result, both employees and customers resorted to Google searches that routed them back to the site to find what they needed.

The company engaged EK to help bridge the gap between product data and marketing and educational content to ultimately improve the search experience on their platform. 

The Solution

Assessing Current Content and Asset Capabilities at Scale

EK commenced its engagement by comprehensively assessing the company’s current content and asset capabilities. This deep dive included a data mapping and augmented corpus analysis effort into the content and technologies that power their website, such as Adobe AEM (marketing content), a Learning Management System (LMS) with product-related educational content, a Product Information Management (PIM) solution with over 70,000 products, and Salesforce for storing customer data. This provided a clear picture of the existing content and data landscape.

A Semantic Data Model 

With a deeper understanding of the content’s diversity and the need for efficient classification, EK defined and implemented a robust taxonomy and ontology system. This provided a structured way to classify and relate content, making it more discoverable and actionable for users. To tangibly demonstrate the potential of knowledge graphs, EK implemented a proof of concept (POC), which aimed to bridge the silos between the different systems, allowing for a more unified and cohesive content experience that connected product and marketing information.

Integrated Data Resources and Knowledge Graph Embeddings

EK utilized an integrated data set to counter data fragmentation across different platforms. A more cohesive content resource was built by combining Adobe AEM and LMS data with manually curated data and extracted information from the rendered website. However, the critical leap came when the entire knowledge graph, which encapsulated this unified data set, was loaded into memory. This in-memory knowledge graph paved the way for real-time processing and analysis, which is essential for generating meaningful embeddings.
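
A minimal sketch of this kind of graph-to-embedding step is shown below, using the open-source networkx and node2vec packages; the case study does not name EK’s actual tooling, and the node names are illustrative stand-ins for products and content.

```python
# Turn an in-memory graph into node embeddings via random walks.
import networkx as nx
from node2vec import Node2Vec

g = nx.Graph()
g.add_edges_from([
    ("product:123", "doc:assay-guide"),   # product described by a guide
    ("product:123", "course:intro-pcr"),  # product linked to a course
    ("course:intro-pcr", "doc:assay-guide"),
])

# Random-walk-based embeddings over the knowledge graph.
node2vec = Node2Vec(g, dimensions=32, walk_length=10, num_walks=50, workers=1)
model = node2vec.fit(window=5, min_count=1)

vector = model.wv["product:123"]             # embedding for one node
print(model.wv.most_similar("product:123"))  # nearest neighbors by cosine
```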

Similarity Index and Link Classifier: Two-Fold Search Enhancement

  • Similarity Index: EK’s Enterprise AI and Search experts worked together to convert the in-memory knowledge graph into vector embeddings. These embeddings, which encode the graph’s intricate data relationships, were harnessed to power a similarity index that offers content recommendations rooted in contextual relevance and similarity metrics.
  • Link Classifier: Building upon the embeddings, EK introduced a machine learning (ML) classifier, meticulously trained to discern patterns and relationships within the embeddings and establish connections between products and content. This endowed the system with the capability to recommend content corresponding to a user’s engagement with a product or related content, transforming the user journey with timely and pertinent content suggestions. (A sketch of both enhancements follows this list.)
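
The sketch below illustrates both enhancements with scikit-learn, using synthetic vectors as stand-ins for the real product and content embeddings described above; it is an assumption-laden illustration, not EK’s production code.

```python
# Two enhancements over node embeddings: a similarity index and a link classifier.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
content_vecs = rng.normal(size=(100, 32))   # embeddings for 100 content items

# 1) Similarity index: nearest content items by cosine distance.
index = NearestNeighbors(n_neighbors=5, metric="cosine").fit(content_vecs)
distances, neighbors = index.kneighbors(content_vecs[:1])

# 2) Link classifier: does a (product, content) pair deserve an edge?
pairs = rng.normal(size=(500, 64))          # concatenated product+content vectors
labels = rng.integers(0, 2, size=500)       # 1 = known link, 0 = no link
clf = LogisticRegression(max_iter=1000).fit(pairs, labels)
link_probability = clf.predict_proba(pairs[:1])[:, 1]
print(neighbors, link_probability)
```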

ML-Infused User Experience Enhancement

Venturing beyond conventional methodologies, EK incorporated ML, knowledge graphs, taxonomy, and ontology to redefine the user experience. This allowed users to navigate and discover important content through an ML-powered content discovery system, yielding suggestions that resonated with their needs and browsing history.

Unified Platform Management via Predictive Insights

Addressing the multifaceted challenge of various teams steering different platforms, EK integrated the machine learning classifier with predictive insights. This fusion empowered teams with the foresight to gauge user preferences, allowing them to align platform features and fostering a cohesive and forward-looking digital landscape.

Search Dashboard Displaying ML-based Results

To conclude the engagement, EK presented a search dashboard. Designed to exhibit two distinct types of results (similarity index and link classifier), the dashboard served as a window for the organization to witness and evaluate the dual functionalities. The underlying intent was to give the e-commerce website’s backend team avenues to elevate its search capabilities through a comparative view of multiple ML-based systems.

The EK Difference

EK’s hallmark is rooted in the proficiency of advanced AI and knowledge graph technologies, as well as our profound commitment to client relationships. Working closely with the company’s content and data teams, EK displayed a robust understanding of the technological necessities and the organizational dynamics at play. Even when the level of effort and need from the solution extended beyond the initial scope of work, EK’s flexible approach allowed for open dialogue and iterative development and value demonstration. This ensured that the project’s progression aligned closely with the evolving needs of our client.

Recognizing the intricacy of the project and the importance of a well-documented process, EK meticulously enhanced the documentation of both the delivery process and development. This created transparency and ensured that all the resources needed to carry forward, modify, or scale the implemented solution are in place for the future.

Moreover, given the complexity and nuances involved in such large-scale implementations, EK provided a repeatable framework to validate AI results with stakeholders and maintain integrity and explainability of solutions with human-in-the-loop development throughout the engagement. This was achieved through iterative sessions, ensuring the final system met technical benchmarks and resonated with the company’s organizational context and language.

The Results

The engagement equipped the organization with a state-of-the-art, context-based recommendation system specifically tailored for their vast and diverse digital ecosystem. This solution drastically improved content discoverability, relevance, and alignment, fundamentally enhancing the user experience on their product website.

The exploratory nature of the project further unveiled opportunities for additional enhancements, particularly in refining the data, optimizing the system, and exposing areas where the firm had gaps in content creation or educational materials as they relate to products. Other notable results include:

  • Automated framework to standardize metadata across systems for over 70,000 product categories;
  • A Proof of Concept (POC) that bridged content silos across 4+ different systems, demonstrating the potential of knowledge graphs;
  • A machine-learning classifier that expedited content aggregation and metadata application process through automation; and
  • Increased user retention and better product discovery, leading to 6 figures in closed revenue.

What is A Data Fabric Architecture and What Are The Design Considerations? (August 31, 2023)
https://enterprise-knowledge.com/what-is-a-data-fabric-architecture-and-what-are-the-design-considerations/

[Image: Components of the data fabric architecture]

Introduction

In today’s data-driven world, effective data management is crucial for businesses to remain competitive. A modern approach to data management is the use of data fabric architecture. A data fabric is a data management solution that connects and manages data in a federated way, employing a logical data architecture that captures connections relevant to the business. Data fabrics help businesses make sense of their data by organizing it in a domain-centric way without physically moving data from source systems. What makes this possible is a shift in focus to metadata as opposed to the data itself. At a high level, a semantic data fabric leverages a knowledge graph as an abstraction architecture layer to provide connectivity between diverse metadata. The knowledge graph enriches metadata by aggregating, connecting, and storing relationships between unstructured and structured data in a standardized, domain-centric format. Using a graph-based data structure helps the business embed its data with context, drive information discovery and inference, and lay a foundation for scale.

Unlike a monolithic solution, a data fabric facilitates the alignment of different toolsets to enable domain-centric, integrated data as a service to multiple downstream applications. A data fabric architecture consists of five main components:

  1. A data/metadata model
  2. Entity extraction
  3. Relationship extraction
  4. Data pipeline orchestration
  5. Persistent graph data storage

While there are a number of approaches to designing all of these components, there are best practices to ensure the quality and scalability of a data fabric. This blog post will enumerate the approaches for each architectural component, discuss how to achieve a data fabric implementation from a technical approach and tooling perspective that suits a wide variety of business needs, and ultimately detail how data fabrics support the development of artificial intelligence (AI).

Data/Metadata Model

Data models – specifically, ontologies and taxonomies – play a vital role in building a data fabric architecture. An ontology is a central aspect of a data fabric, defining the concepts, attributes, and relationships in a domain, encoded in a machine- and human-readable graph format. Similarly, a taxonomy is essential for metadata management in a data fabric, storing extracted entities and defining controlled vocabularies for core business domains like products, business lines, services, and skills. By creating relationships between domains and data, businesses can help users find insights and discover content more easily. Therefore, to effectively manage taxonomies and ontologies, business owners need a Taxonomy/Ontology Management System (TOMS) that provides a user-friendly platform and interface. A good TOMS should:

  • Help users build data models that follow common standards like RDF (Resource Description Framework), OWL (Web Ontology Language), and SKOS (Simple Knowledge Organization System); 
  • Let users configure the main components of a data model such as classes, relationships, attributes, and labels that define the concepts, connections, and properties in the domain; 
  • Add metadata about the data model itself through annotations, such as its name, description, version, creator, etc.;
  • Support automated governance, including quality checks for errors;
  • Allow for easy portability of the data model in different ways, serving multiple enterprise use cases; and
  • Allow users to link to external data models that already exist and can be reused.

Organizations that do not place their data modeling and management at the forefront of their data fabric introduce the risk of scalability issues, limited user-friendly schema views, and hampered utilization of linked open data. Furthermore, the absence of formal metadata management poses a risk of inadequate alignment with business needs and hinders flexible information discovery within the data fabric. There are different ways of creating and using data models with a TOMS to avoid these risks. One way is to use code or scripts to generate and validate the data model based on the rules and requirements of the domain. Using subject matter expertise input helps to further validate the data model and confirm that it aligns with business needs. 
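
For example, a script-driven approach along these lines might generate a small ontology and run a basic governance check with the open-source rdflib library; the ex: class and property names below are illustrative assumptions.

```python
# Generate a tiny ontology and sanity-check it; a production TOMS
# would add far richer governance than this single assertion.
from rdflib import Graph, Literal, Namespace, OWL, RDF, RDFS

EX = Namespace("http://example.org/ontology/")
g = Graph()
g.bind("ex", EX)

# Classes and one relationship (object property) with domain and range.
g.add((EX.Product, RDF.type, OWL.Class))
g.add((EX.BusinessLine, RDF.type, OWL.Class))
g.add((EX.sells, RDF.type, OWL.ObjectProperty))
g.add((EX.sells, RDFS.domain, EX.BusinessLine))
g.add((EX.sells, RDFS.range, EX.Product))
g.add((EX.sells, RDFS.label, Literal("sells")))

# A simple automated check: every property should declare a domain and range.
for prop in g.subjects(RDF.type, OWL.ObjectProperty):
    assert (prop, RDFS.domain, None) in g, f"{prop} is missing a domain"
    assert (prop, RDFS.range, None) in g, f"{prop} is missing a range"

print(g.serialize(format="turtle"))
```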

Entity Extraction 

One of the functions of building your data fabric is to perform entity extraction. This is the process of identifying and categorizing named entities in both structured and unstructured data, such as person names, locations, organizations, dates, etc. Entity extraction enriches the data with additional information and enables semantic analysis. Identifying Named Entity Recognition (NER) tools and performing text preprocessing (e.g., tokenization, stop words elimination, coreference resolution) is recommended before determining an entity extraction approach, of which there are several: rule-based, machine learning-based, or a hybrid of both. 

  • Rule-based approaches rely on predefined rules that use syntactic and lexical cues to extract entities. They require domain expertise to develop and maintain, and may not adapt well to new or evolving data. 
  • Machine learning-based approaches use deep learning models that can learn complex patterns in the data and extrapolate to unseen cases. However, they may require large amounts of labeled data and computational resources to train and deploy. 
  • Hybrid approaches (Best Practice) combine rule-based and machine learning-based methods to leverage the strengths of both. Hybrid approaches are recommended for businesses that foresee expanding their data fabric solutions (a sketch follows this list).
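
A minimal hybrid sketch using spaCy is shown below: a pretrained statistical NER model with a rule-based EntityRuler layered on top. It assumes the en_core_web_sm model is installed, and the product pattern and sample sentence are hypothetical.

```python
# Hybrid entity extraction: statistical NER plus rule-based patterns.
import spacy

nlp = spacy.load("en_core_web_sm")  # pretrained statistical NER

# Rule-based layer: domain terms the statistical model won't know.
ruler = nlp.add_pipe("entity_ruler", before="ner")
ruler.add_patterns([
    {"label": "PRODUCT", "pattern": "Contoso Ledger"},  # hypothetical product
    {"label": "ORG", "pattern": [{"LOWER": "enterprise"}, {"LOWER": "knowledge"}]},
])

doc = nlp("Jane Doe at Enterprise Knowledge demoed Contoso Ledger in Boston.")
for ent in doc.ents:
    print(ent.text, ent.label_)  # PERSON, ORG, PRODUCT, GPE entities
```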

Relationship Extraction 

Relationship extraction is the process of identifying and categorizing semantic relationships that occur between two entities in text, such as who works for whom, what business line sells which product, what is located in what place, and so on. Relationship extraction helps construct a knowledge graph that represents the connections and interactions among entities, which enables semantic analysis and reasoning. However, relationship extraction can be challenging due to the diversity and complexity of natural language. There are, again, multiple approaches, including rule-based, machine learning-based, or hybrid.

  • Rule-based approaches rely on predefined rules that use word-sequence patterns and dependency paths in sentences to extract relationships. They require domain expertise to develop and maintain, and they may not capture all the possible variations and nuances of natural language. 
  • One machine learning approach is to use an n-ary classifier that assigns a probability score to each possible relationship between two entities and selects the highest one. This supports capturing the variations and nuances of natural language and handling complex and ambiguous cases. However, machine learning approaches may require large amounts of labeled data and computational resources to train and deploy. 
  • Hybrid approaches (Best Practice) employ a combination of ontology-driven relationship extraction and machine learning approaches. Ontology-driven relationship extraction uses a predefined set of relationships that are relevant to the domain and the task. This helps avoid generating a sparse relationship matrix that results in a non-traversable knowledge graph (see the sketch after this list).
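
The sketch below illustrates the ontology-driven idea in plain Python: the ontology whitelists which relationship types are permitted between entity types, and only those candidate pairs reach a (stubbed) ML scorer. All names and the scoring stub are illustrative assumptions.

```python
# Ontology-driven relationship extraction: the ontology constrains candidates.
ALLOWED_RELATIONS = {
    ("PERSON", "ORG"): ["works_for"],
    ("BUSINESS_LINE", "PRODUCT"): ["sells"],
    ("ORG", "GPE"): ["located_in"],
}

def score_relation(sentence: str, head: str, tail: str, relation: str) -> float:
    """Stub for an n-ary classifier; in practice, a trained model goes here."""
    return 1.0 if relation.replace("_", " ") in sentence.lower() else 0.0

def extract_relations(sentence, entities):
    """entities: list of (text, type) pairs found by the NER step."""
    found = []
    for head, head_type in entities:
        for tail, tail_type in entities:
            # Only score pairs the ontology permits.
            for rel in ALLOWED_RELATIONS.get((head_type, tail_type), []):
                if score_relation(sentence, head, tail, rel) > 0.5:
                    found.append((head, rel, tail))
    return found

print(extract_relations(
    "Jane Doe works for Acme Corp.",
    [("Jane Doe", "PERSON"), ("Acme Corp", "ORG")],
))
```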

Data Pipeline Orchestration

Data pipeline orchestration is the driving force of data fabric creation that brings all the components together. This is the process of integrating data sources with two or more applications or services to populate the knowledge graph initially and update it regularly thereafter. It involves coordinating and scheduling various tasks, such as data extraction, transformation, loading, validation, and analysis, and helps ensure data quality, consistency, and availability across the knowledge graph. Data pipeline orchestration can be performed using different approaches, such as a manual implementation, an open source orchestration engine, or using a vendor-specific orchestration engine / cloud service provider. 

  • A manual approach involves executing each step of the workflow manually, which is time-consuming, error-prone, and costly. 
  • An open source orchestration engine approach involves managing ETL pipelines as directed acyclic graphs (DAGs) that define the dependencies and order of execution of each task. This helps automate and monitor the workflow and handle failures and retries. Open source orchestration engines may require installation and configuration, and businesses need to take into account the required features and integrations before opting to use one (a sample DAG follows this list).
  • Third-party vendors or cloud service providers can leverage the existing infrastructure and services and provide scalability and reliability. However, vendor specific orchestration engines / cloud service providers may have limitations in terms of customization and portability.
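
For instance, a minimal DAG in Apache Airflow (assuming Airflow 2.4+; the task bodies, schedule, and dates are placeholder assumptions) might look like this:

```python
# A three-step knowledge graph refresh pipeline as an Airflow DAG.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(): ...     # pull data from source systems
def transform(): ...   # map entities/relationships to the ontology
def load_graph(): ...  # write triples into the graph store

with DAG(
    dag_id="knowledge_graph_refresh",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load_graph", python_callable=load_graph)
    t1 >> t2 >> t3  # dependencies form a directed acyclic graph
```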

Persistent Graph Data Storage 

One of the central ideas behind a data fabric is the ability to store metadata and core relationships centrally while connecting to source data in a federated way. This manage-in-place approach enables data discovery and integration without moving or copying data. Persistent graph data storage is the glue that brings all the components together, storing extracted entities and relationships according to the ontology, and persisting the connected data for use in any downstream applications. A graph database helps preserve the semantic relationships among the data and enable efficient querying and analysis. However, not all graph databases are created equal. When selecting a graph database, there are four key characteristics to consider: graph databases should be standards-based, ACID compliant, widely used, and explorable via a UI.

  • Standards-based involves making sure the graph database follows a common standard, such as RDF (Resource Description Framework), to ensure interoperability so that it is easier to transition from one tool to another. 
  • ACID compliant means the graph database ensures Atomicity, Consistency, Isolation, and Durability of the data transactions, which protects the data from infrastructure failures. 
  • Widely used means the database has strong user and community support, ensuring that developers will have access to good documentation and feedback.
  • Explorable via a UI supports verification by experts to ensure data quality and alignment with domain and use case needs. 

Some of the common approaches for graph databases are using RDF-based graph databases, labeled property graphs, or custom implementations. 

  • RDF-based graph databases use RDF, the standard model for representing and querying data. 
  • Labeled property graph databases use nodes, edges, and properties as the basic elements for storing and querying data (see the comparison sketch below). 
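
To make the contrast concrete, the sketch below stores one fact as RDF triples and queries it with SPARQL via rdflib, with an equivalent labeled-property-graph query (Cypher) shown as a comment; all names are illustrative.

```python
# The same fact, "customer42 purchased productA", in the RDF model.
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/")
g = Graph()
g.add((EX.customer42, RDF.type, EX.Customer))
g.add((EX.customer42, EX.purchased, EX.productA))

# RDF: query with SPARQL.
for row in g.query(
    "SELECT ?p WHERE { ?c a ex:Customer ; ex:purchased ?p . }",
    initNs={"ex": EX},
):
    print(row.p)

# Labeled property graph equivalent (e.g., Cypher), for comparison:
# MATCH (c:Customer)-[:PURCHASED]->(p:Product) RETURN p
```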

Data Fabric Architecture for AI

A mature data fabric architecture built upon the preceding standards plays a pivotal role in supporting the development of artificial intelligence (AI) for businesses by providing a solid foundation for harnessing the power of data. The data fabric’s support for data exploration, data preparation, and seamless integration empowers businesses to harness the transformative and generative power of AI.

By leveraging an existing data fabric architecture, businesses can seamlessly integrate structured and unstructured data, capturing the relationships between them within a standardized, domain-centric format. With the help of the knowledge graph at the core of the data fabric, businesses can empower AI algorithms to discover patterns, make informed decisions, and generate valuable insights by traversing and navigating the graph. This capability allows AI models to uncover valuable insights that are not immediately apparent in isolated data silos. 

Furthermore, data fabrics facilitate the process of data preparation and feature engineering, which are crucial steps in AI development. The logical architecture of the data fabric allows for efficient data transformation, aggregation, and enrichment. By streamlining the data preparation pipeline, AI practitioners can focus more on modeling and algorithm development, accelerating the overall AI development process. AI models often need continuous refinement and adaptation, and data fabrics enable seamless integration of new data sources and updates to ensure that AI models have the most up-to-date information.

Conclusion

A data fabric is a modern approach to data management that is crucial for businesses to remain competitive in a data-driven world. However, a data fabric is not a monolithic solution and the supporting architecture and technical approach can vary based on the state of sources, supporting use cases, and existing tooling at an organization. It’s important to prove out the value of the solutions before investing in costly tool procurement. We recommend starting small and iterating, beginning with a targeted domain in mind and sample source systems to lay a foundation for an enterprise data fabric. Once a data fabric has been established, businesses can unlock the full potential of their data assets, enabling AI algorithms to make intelligent predictions, discover hidden insights, and drive valuable business outcomes. Looking for a kick-start to get your solutions off the ground? Contact us to get started.

Knowledge Portal for a Global Investment Firm (April 4, 2023)
https://enterprise-knowledge.com/knowledge-portal-for-a-global-investment-firm/


The Challenge

A major investment firm that manages over 250 billion USD in assets in a variety of industries across the globe engaged Enterprise Knowledge (EK) to fix their siloed information and business practices by designing and implementing a Knowledge Portal. 

Siloed data scattered across multiple systems resulted in investment professionals wasting valuable time searching for the knowledge assets required to make fast, complete, and informed decisions. The firm manages a diverse portfolio of global assets and investments, with over 50,000 employees. Detailed records of these business deals existed as an incongruous mix of structured and unstructured information located across multiple repositories. Even gaining access to much of this information required awareness that it existed, as well as knowledge of whom to contact to be granted permissions.

To fill knowledge gaps caused by misplaced or inaccessible content, investment teams also commissioned research reports and studies to support their decision-making processes. However, these reports were seldom shared across the organization and, in fact, were often duplicated across teams. Additionally, since investment records were siloed based on division and investment types, the firm was not leveraging the vast expertise of its employees. 

The firm recognized it needed a centralized way to find, view, and share its knowledge assets and connect staff to experts. The solution required improved visibility across data resources, access management practices, and the ability to connect with expert staff.

The Solution

EK designed, developed, and deployed an Enterprise Knowledge Portal, leveraging a suite of best-of-breed technologies. EK first conducted business case refinement sessions to understand, in depth, the problems that the Knowledge Portal needed to solve and its benefits to the firm, defining a series of personas, use cases, and user journeys to help prioritize key features along an Agile development plan. EK then developed the Agile roadmap for an MVP solution that would offer immediate value to the firm’s staff and business ventures while proving the value of the Knowledge Portal, as well as a follow-on backlog of features for an enhanced solution that would add to the value of the foundational model and be delivered iteratively.

Over the next year, EK worked side by side with the client’s business and IT groups, as well as other third-party vendors, in order to iteratively develop and test the solution. The MVP was delivered on time, and EK is now continuing development of the system, adding additional features and back-end sources in order to enrich the overall wealth of knowledge, information, and data within the system.

Overall, the Knowledge Portal delivers several first-of-its-kind features for the organization, including:

  • Integration of structured and unstructured data, not just as links but as displayed results that merge source materials for easy comprehension, analysis, and user action;
  • Ability to understand and align complex security models, displaying only that content that should be accessible to each individual;
  • Machine Learning and AI to provide highly customized views, automatically assembled based on the user; and
  • Integration of all types of information with people, enabling individuals to find experts across the enterprise in a way that forms new connections and identifies new opportunities for collaboration.

Of specific note, the portal’s search application leverages a graph database modeled to integrate information from an extensive network of internal data sources for delivery in a single search result. For example, a single investment might combine information from as many as 12 different systems. In addition to the graph database, the portal leverages an insights engine powered by Artificial Intelligence (AI) that unifies siloed data and detects trends across repositories. The graph database and insights engine alike are powered by a semantic layer that maintains the relationships users can leverage to traverse data sets across the enterprise, enhancing the findability of relevant content regardless of a user’s business role. Additionally, EK mapped user roles to access and permissions to refine the firm’s access controls, streamlining navigation of the firm’s data, thereby reducing reliance on staff to determine levels of access and increasing the efficiency of knowledge discovery.
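
Conceptually, the single-result assembly works like the simplified sketch below, which merges records from several hypothetical systems keyed by a shared investment identifier; the real portal resolves its sources through the semantic layer and security model rather than hard-coded stubs.

```python
# Merge records about one investment from multiple (stubbed) source systems.
def fetch_from_sources(investment_id: str) -> list[dict]:
    # In the real portal, each source is resolved through the semantic layer.
    return [
        {"source": "crm",     "investment_id": investment_id, "owner": "J. Doe"},
        {"source": "finance", "investment_id": investment_id, "aum_usd": 1.2e9},
        {"source": "docs",    "investment_id": investment_id, "deal_memo": "..."},
    ]

def unified_result(investment_id: str) -> dict:
    merged: dict = {"investment_id": investment_id, "sources": []}
    for record in fetch_from_sources(investment_id):
        merged["sources"].append(record.pop("source"))
        merged.update(record)  # later systems may enrich earlier fields
    return merged

print(unified_result("INV-0042"))
```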

In support of the advanced technology and ongoing system enhancement, EK also focused on several key foundational KM topics to ensure the long-term success and adoption of the Knowledge Portal. These included content governance and a wide-ranging content cleanup, migration, and enhancement effort, taxonomy and ontology design accompanied by a tagging strategy, change and communications, and content type design. These activities and deliverables ensured that the content and data integrated within the Knowledge Portal could be trusted, was easily consumable both by humans and machines, and would be maintained and further improved moving forward. Furthermore, the accompanying communications and education plan delivered an engaged and aware user base, ready to get value from the new tool.

The EK Difference

EK delivered every aspect of the Knowledge Portal solution using its own staff, deployed across three continents in order to support the client’s global needs. EK brought a broad range of internal experts to bear for this initiative, including knowledge management experts, software engineers, solution architects, change and organizational design experts, taxonomists, ontologists, content strategists, and UI/UX designers and developers. This unique assortment of experts collaborated on every element of the initiative to ensure it leveraged EK’s advanced methodologies and best practices and that the business stakeholders were engaged, aligned, and supportive of the new system.


This effort was also run leveraging true Agile principles to reduce risk and optimize stakeholder engagement and comprehension of the complex initiative. EK’s team of consultants and engineers expertly applied Agile to quickly adapt to unforeseen changes and roadblocks in development. As a result, rather than talking about the Knowledge Portal, we were able to show early prototypes, spawning a wealth of end-user understanding and feedback from the first months of the project.

The Results

The Knowledge Portal consolidated the firm’s vast intellectual resources in a single searchable space, arming investment professionals with easy access to valuable information and connections to experienced staff in ways that had never before been possible. EK is continuing to iterate to add additional features and sources, but the results are already being felt by the organization. Key performance indicators and milestones to date include:

  • Strong adoption, with overall user counts increasing and extremely high retention of all users.
  • Less time spent looking for information or recreating organizational knowledge, resulting in overall higher productivity and employee satisfaction.
  • Faster upskilling of new hires and junior staff, with more junior staffers reporting an ability to complete tasks without waiting for guidance from others.
  • Less redundant acquisitions of external research and data sets.

With additional iterations of the Knowledge Portal planned for release over the next two years, the organization continues to partner with EK and invest in the tool as a transformative solution for the organization.


Building an Innovative Learning Ecosystem at Scale with Graph Technologies (November 28, 2022)
https://enterprise-knowledge.com/building-an-innovative-learning-ecosystem-at-scale-with-graph-technologies/

Todd Fahlberg, a Portfolio Manager for Enterprise Knowledge, and Amber Simpson, a Senior Manager at Walmart Academy, presented “Building an Innovative Learning Ecosystem at Scale with Graph Technologies” on November 9, 2022, at the KMWorld Conference in Washington, DC. In this presentation, Fahlberg and Simpson share how they are making it easier for Walmart’s learning organization to manage content used by 2.4 million global associates with a custom Digital Library. The presentation provides insight into the challenges they faced and the lessons they learned along the way, in addition to their approach to designing and implementing the Digital Library. Todd and Amber also detail how and why they used graph technologies to ensure their solution can continue to scale to meet the needs of Walmart’s massive workforce and evolving business needs.

Elevating Your Point Solution to an Enterprise Knowledge Graph (November 16, 2022)
https://enterprise-knowledge.com/elevating-your-point-solution-to-an-enterprise-knowledge-graph/

I am fortunate to be able to speak with many vendors in the Graph space, as well as company executives and leaders in IT and KM departments around the world. So many of these people are excited about the power of knowledge graphs and the graph databases that power them. They want to know how to turn their point solution into an enterprise-wide knowledge graph powering AI solutions and solving critical problems for their clients or their companies. I have answered this question enough times that I thought I would share it in a blog post for others to learn.

Knowledge graphs are new and exciting tools. They provide a different way of managing information and can be used to solve a wide range of problems. Early adopters of this technology typically start with a small, targeted solution to “try it out.” This is a smart way to learn about any new technology, but all too often the project stops at a point solution or becomes pigeonholed for solving one problem when it can be used to solve so many more. The organizations that can grow and expand their graph solution have three things in common:

  • A backlog of use cases,
  • An enterprise ontology, and
  • Marketing and change management.

Knowledge graphs can solve many different types of problems. They can be recommendation engines, search enhancers, AI engines, data fabrics, or knowledge portals. That first solution that an organization picks only does one of these things, and it may also be targeted to just one department or one problem. This is a great way to start, but it can also lead to a stovepipe solution that misses some of the real power of graphs. 

When we start knowledge graph projects with new clients, we always run a workshop with business users from across the organization. During this workshop, we share examples of what can be done with knowledge graphs and help them identify a backlog of use cases that their new knowledge graph can solve. This approach creates excitement for the new technology and gives the project team and the business a vision for how to add to what was built as part of the first solution. Once the first solution is effectively launched, the organization has a roadmap for what is next. If you have already launched your solution and do not have a backlog of use cases, that is okay. You can host a graph workshop at any time to create a list of the next projects. The most important thing is to get that backlog in place and begin to share it with your leadership team so that they can budget for the next project.

The structure of a graph is defined by an ontology. Think of an ontology as a model describing the information assets of the business and how they fit together. Graph databases are easy to change, so organizations can get started with simple knowledge graphs that solve targeted problems without an ontologist. The problem is that the solution will be designed to solve a specific problem rather than being aligned with the business as a whole. A good ontologist will design a model that both solves the initial problem being addressed and aligns with the larger business model of the organization. For example, a manufacturing company may have products, customers, factories, parts, employees, and designs. Its search could be augmented with a simple knowledge graph that describes parts. An ontologist would use this opportunity to model the relationships among all of the organization’s entities up front. This more inclusive approach would allow for a wider range of search results and could serve as the baseline for a number of other projects. This same graph could fuel a recommendation service or chatbot for customers. It could also be used as the map for the organization’s data elements to create a data fabric that simplifies the way people access data within the organization. One graph, properly designed, can easily expand to become the enterprise backbone for a number of different enterprise-centric applications.
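
As a rough sketch of what such an up-front model might look like, the rdflib example below encodes the manufacturing entities and relationships described above; the class and property names are illustrative assumptions, not a prescribed enterprise ontology.

```python
# A starter enterprise ontology for the manufacturing example.
from rdflib import Graph, Namespace, OWL, RDF, RDFS

MFG = Namespace("http://example.org/manufacturing/")
g = Graph()
g.bind("mfg", MFG)

for cls in ["Product", "Customer", "Factory", "Part", "Employee", "Design"]:
    g.add((MFG[cls], RDF.type, OWL.Class))

# Relationships connecting the whole business, not just the parts use case.
for prop, domain, rng in [
    ("madeIn",    "Product",  "Factory"),
    ("contains",  "Product",  "Part"),
    ("basedOn",   "Product",  "Design"),
    ("purchases", "Customer", "Product"),
    ("worksAt",   "Employee", "Factory"),
]:
    g.add((MFG[prop], RDF.type, OWL.ObjectProperty))
    g.add((MFG[prop], RDFS.domain, MFG[domain]))
    g.add((MFG[prop], RDFS.range, MFG[rng]))

print(len(g), "triples in the starter enterprise ontology")
```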

Building a backlog of use cases and creating a proper ontology help ensure that there is a framework and plan to grow. The final challenge in turning a point solution into an enterprise knowledge graph has to do with marketing the solution. Knowledge graphs and graph databases are still new, and the number of things they can do is very broad (see Using Knowledge Graph Data Models to Solve Real Business Problems). As a result, executives often do not know what to do with knowledge graphs. It is important to set success criteria for your point solution and regularly communicate the value it adds to the business. This brings attention to the solution and opens the door for discussions about expanding the knowledge graph. Once you have the executives’ attention, educate them on what knowledge graphs can do through industry literature and the backlog of use cases that you have already gathered. This will allow executives to see how they can get even greater value from their investment and drive more funding for your knowledge graph.

Knowledge graphs are powerful information management tools that are only now becoming fully understood. The leading graph database vendors offer free downloads of their software so that organizations can start to understand the true power of these tools. Unfortunately, too often these downloads are used only for small projects that disappear over time. The simple steps I have described above can pave the way to turn your initial project into an enterprise platform powering numerous, critical Artificial Intelligence solutions.

Learn more about how we enable this for our clients by contacting us at info@enterprise-knowledge.com.

The post Elevating Your Point Solution to an Enterprise Knowledge Graph appeared first on Enterprise Knowledge.

Learning 360: Crafting a Comprehensive View of Learning Content Using a Graph https://enterprise-knowledge.com/learning-360-crafting-a-comprehensive-view-of-learning-content-using-a-graph/ Wed, 03 Aug 2022 13:00:44 +0000

Chris Marino, a Principal Solution Consultant at Enterprise Knowledge (EK), was a featured speaker at this year’s Data Architecture Online event organized by Dataversity. Marino presented his webinar “Learning 360: Crafting a Comprehensive View of Learning Content Using a Graph” on July 20, 2022. In his presentation, Marino took participants through the entire Graph development process, including planning, designing, and developing the new tool, highlighting benefits to the organization and lessons learned throughout the process.

The post Learning 360: Crafting a Comprehensive View of Learning Content Using a Graph appeared first on Enterprise Knowledge.

Where Does a Knowledge Graph Fit Within the Enterprise? https://enterprise-knowledge.com/where-does-a-knowledge-graph-fit-within-the-enterprise/ Thu, 21 Apr 2022 16:19:33 +0000

Our clients often assume that building a knowledge graph requires that all data be managed in a single place for it to be effective. That is not the case. There are a variety of ways that organizations can solve for their knowledge-first and relationship-based use cases while maintaining aspects of their existing data architecture. In this way, graph data is not a “one size fits all” solution. The spectrum of leveraging graph data models spans from using a graph database as the primary data storage to using an ontology model as the blueprint for a relational data schema.

Graph Database as Primary Storage: All data and the ontology are stored within the graph database, which ingests all relevant source data and enables inference and reasoning capabilities.

A graph database contains several entities (Customer, Product A, Product B, Style) and their relationships to each other.
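As a rough illustration of this first pattern, the sketch below loads an ontology and extracted source data into a single graph and queries across both, again using rdflib; the file names, namespace, and property names are hypothetical.

```python
from rdflib import Graph

# One graph holds both the ontology and the ingested source data
g = Graph()
g.parse("ontology.ttl", format="turtle")    # hypothetical ontology file
g.parse("crm_export.ttl", format="turtle")  # hypothetical extract from a CRM

# With the model and the data co-located, a single query spans both
results = g.query("""
    PREFIX ex: <https://example.com/ontology/>
    SELECT ?customer ?product WHERE {
        ?product a ex:Product ;
                 ex:purchasedBy ?customer .
    }
""")
for row in results:
    print(row.customer, row.product)
```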

Graph Database as Relationship Management and Taxonomy Integration: Relationships between core concepts and content metadata (like taxonomy tags on documents) are stored within the graph, but actual content and descriptive metadata are stored within other systems and are connected to the graph via virtualization.

The graph database is connected to CRM, PIM, and taxonomy management systems, each containing data on entities such as Customer, Product A, Product B, and Style.

Graph Data Model (Ontology) as Relational Data Schema: The ontology model serves as an entity-relationship diagram (ERD) that sets the "vision" for how to connect and leverage data stored in a relational database.

A relational database contains the entities (Customer, Product, Style) and their associated data. Each entity is then linked to the others through defined relationships.
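To picture this third pattern, here is a minimal sketch (with hypothetical table and column names) of how an ontology's classes could become tables and its relationships could become foreign keys or join tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Each ontology class becomes a table...
cur.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE style (id INTEGER PRIMARY KEY, label TEXT)")

# ...and each ontology relationship becomes a foreign key or a join table
cur.execute("""
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        name TEXT,
        style_id INTEGER REFERENCES style(id)  -- e.g. an ex:hasStyle relationship
    )
""")
cur.execute("""
    CREATE TABLE purchase (  -- e.g. an ex:purchasedBy relationship
        customer_id INTEGER REFERENCES customer(id),
        product_id INTEGER REFERENCES product(id)
    )
""")
conn.commit()
```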

Organizations can see the value of capturing relationships in a machine-readable way, even when not all of the data relevant to the use case is captured in a graph database. The model that makes sense for your organization and your use case depends on factors including:

  • Restrictiveness of source data systems; 
  • Volume and scale of data;
  • Enterprise architecture maturity; 
  • Inference and reasoning needs; and
  • Integration needs with downstream systems.

At EK, we design graph-based architectures in a way that leverages your organization’s specifications and conventions, while introducing best practices and standards from the industry. Looking to get started? Contact us.

The post Where Does a Knowledge Graph Fit Within the Enterprise? appeared first on Enterprise Knowledge.

Expert Analysis: Does My Organization Need a Graph Database? https://enterprise-knowledge.com/expert-analysis-does-my-organization-need-a-graph-database/ Fri, 14 Jan 2022 15:00:16 +0000

As EK works with our clients to integrate knowledge graphs into their technical ecosystems, client stakeholders often ask, “Why should we leverage knowledge graphs?” and more specifically, “Do we need a graph database?” Our consultants then collaborate with stakeholders to weigh the pros and cons of using a knowledge graph and graph database to solve their findability and discoverability use cases. In this blog, two of our senior technical consultants, Bess Schrader and James Midkiff, answer common questions about knowledge graphs and graph databases, focusing on how to best fit them into your organization’s ecosystem without overengineering the solution.

Why should I leverage knowledge graphs?

Bess Schrader

Knowledge graphs model information the way human beings think about it, making data easier to organize, store, and logically find. This breaks down silos between technical users and business users, reduces ambiguity about what data and information mean, and makes knowledge more sustainable and accessible. A knowledge graph is the implementation of an ontology, a critical design component for understanding your organization.

Many graph databases also support inference, allowing you to explore previously uncaptured relationships in your data, based on logic developed in your ontology. This reasoning capability can be an incredibly powerful tool, helping you gain insights from your business logic.
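As a small example of this reasoning capability, the sketch below uses rdflib with the owlrl package to materialize an RDFS inference; the namespace, properties, and facts are illustrative assumptions. The ontology states that being CEO of a company implies working for it, so the "works for" fact can be inferred without ever being asserted.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import RDFS
from owlrl import DeductiveClosure, RDFS_Semantics  # pip install owlrl

EX = Namespace("https://example.com/ontology/")
g = Graph()

# Ontology logic: being CEO of a company implies working for it
g.add((EX.ceoOf, RDFS.subPropertyOf, EX.worksFor))

# Instance data captures only the specific fact
g.add((EX.alice, EX.ceoOf, EX.acmeCorp))

# Materialize the triples the ontology implies
DeductiveClosure(RDFS_Semantics).expand(g)

print((EX.alice, EX.worksFor, EX.acmeCorp) in g)  # True: inferred, never asserted
```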

James Midkiff

Knowledge graphs are a concept, a way of thinking, and they aren't necessarily tied to a graph database. Even if you are against adopting a graph database, you should design an ontology for your organization's data to visualize, and align with, how you and your colleagues think. Modeling your organization gives you the complete view and vision for how to best leverage your organization's content. This vision is your knowledge graph: an innovative way for your organization to tackle its latest data problems. However, the ontology doesn't have to be implemented in a graph database. The technical implementation should be built with technologies that efficiently support the use cases and are easy to maintain.

Does my use case require a graph database?

Bess Schrader

Any organization that wants to map its internal data to external data would benefit from a graph. If your use case includes publishing your data and connecting to other data sets, a knowledge graph and graph database (particularly one that uses the Resource Description Framework, or RDF) are the way to go to ensure the data is flexible and interoperable. Even if you do not intend to connect or publish data, storing robust definitions alongside data in a graph is one of the best ways to ensure that the meaning behind fields is not lost. With the addition of RDF* (RDF-star), the expressivity of a graph for describing organizational data is unmatched by other data formats.

When your ontology and instance data are all in the same place (a graph database), technical and non-technical users alike can always determine what a given field is supposed to mean. This ensures that your data is sustainable and maintainable. For example, many organizations use acronyms or abbreviations when setting up relational databases or nested data structures like JSON or XML. However, the definition and usage notes for these fields are often not included alongside the data itself. This leads to situations where data consumers and developers may find, for example, a field called “pqf” in a SQL table or JSON file created several years ago by a former employee. If no one at the organization knows what “pqf” means or what downstream systems might be using this field, this data becomes an unusable maintenance burden.

However, using well-formed ontologies and RDF knowledge graphs, this property “pqf” would be a “first-class citizen” with its own properties, including a label (“Prequalified for Funding”) and definition (“This field indicates whether a customer has been prequalified for a financial product. A value of ‘true’ indicates that the customer has been preapproved”), explaining what the property means and how it should be used. This reduces ambiguity and confusion for both developers and data consumers.
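A minimal sketch of what that could look like in rdflib follows; the namespace is hypothetical, and SKOS is just one common choice for storing definitions.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS, SKOS

EX = Namespace("https://example.com/ontology/")
g = Graph()

# The property itself is documented right alongside the data
g.add((EX.pqf, RDF.type, RDF.Property))
g.add((EX.pqf, RDFS.label, Literal("Prequalified for Funding", lang="en")))
g.add((EX.pqf, SKOS.definition, Literal(
    "Indicates whether a customer has been prequalified for a financial "
    "product. A value of 'true' means the customer has been preapproved.",
    lang="en")))

# Instance data uses the documented property
g.add((EX.customer123, EX.pqf, Literal(True)))
```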

James Midkiff

A majority of knowledge graph use cases involve information discovery and search. Graph databases are flexible, allowing you to easily adapt the model as new data and use cases are considered. Additionally, graphs make it painless to aggregate data from separate sources and combine it into a single view of an employee, topic, or other important entity at your organization. Below are some questions to ask when weighing this decision.

  • Does the use case require data model flexibility and are the use cases going to adapt or change over time?
  • Do you need to combine data from multiple sources into a single view?
  • Do you need to be able to search for multiple types of information simultaneously?

If you answer yes to any of the above, a graph database is a good fit. Some use cases, however, do not require cross-entity examination (i.e., asking questions across relationships) or involve calculations that are not easily performed in a graph. In those cases, you should not invest in learning and implementing a graph database. As an alternative, you can create a flexible model inside a NoSQL database and provide search functionality via a search engine, and you can run network-based and machine learning calculations in your programming language of choice after a small data transformation. As stated previously, implementations should be driven largely by the use cases they support now and will support in the future.

I’m nervous about migrating to a new data format. Why should my team learn about and invest in graph database technologies?

Bess Schrader

In addition to the advantages described above, one major benefit of using RDF-compliant graph databases is that they're built on standards, including RDF and SPARQL, that the W3C has maintained for more than two decades to promote the long-term growth of the web. In other words, RDF is not a trendy new format that may disappear in five years, so you can be confident when investing in learning the technology. The use of standards provides freedom from proprietary vendor tools, enabling you to create, move, integrate, and share your data between different standards-compliant software. Using semantic web standards also enables you to seamlessly connect your content and data to a taxonomy (whether internal or external), as most taxonomies are created and stored in an RDF format.

Similarly, SPARQL, the RDF query language, is based on pattern matching and can be easier for non-technical users to learn than more complex programming languages. SPARQL also allows for federated queries, which let a user query across multiple knowledge graphs stored in different graph databases, as long as the databases are RDF-compliant. Using federated queries, you could query your own data (e.g., a dataset of articles discussing the stock market and finances) in combination with public knowledge graphs like Wikidata, the free and openly accessible RDF knowledge graph behind Wikipedia. This would allow you to take any article that mentions a stock ticker symbol, follow that symbol to the corresponding Wikidata entry, and pull back the size and industry of the organization the ticker refers to. You could then filter your articles by industry or company size without needing to gather that information yourself. In other words, federated queries let you query beyond the bounds of your own organization's knowledge.
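Below is a hedged sketch of such a federated query using rdflib. It assumes your query engine supports the SPARQL 1.1 SERVICE keyword, that local articles carry a hypothetical ex:mentionsTicker property, and that the Wikidata property IDs used here (wdt:P249 for ticker symbol, wdt:P452 for industry) are verified before use.

```python
from rdflib import Graph

g = Graph()
g.parse("articles.ttl", format="turtle")  # hypothetical local article data

# Follow each ticker symbol out to Wikidata and pull back the company's industry
query = """
PREFIX ex:  <https://example.com/ontology/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?article ?industry WHERE {
    ?article ex:mentionsTicker ?ticker .
    SERVICE <https://query.wikidata.org/sparql> {
        ?company wdt:P249 ?ticker ;    # ticker symbol (assumed property ID)
                 wdt:P452 ?industry .  # industry (assumed property ID)
    }
}
"""
for row in g.query(query):
    print(row.article, row.industry)
```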

James Midkiff

Many organizations do not need to externally share the knowledge graph data they create. The data may support externally facing use cases such as chatbots, search, and knowledge panels, and this is usually more than sufficient to meet an organization's knowledge graph needs. Taxonomies can be transformed and imported into any relational or NoSQL database, much as other data formats are translated into RDF when building a graph. While graph databases can make this connection more seamless, they are by no means the only way to implement a taxonomy. Relational and NoSQL databases are more commonly used, making it easier to find the skill sets needed to implement and maintain them. And with so many developers accustomed to query languages like SQL, the pattern-based nature of SPARQL can be difficult to learn and adopt.

To be clear, graph databases are an investment. They’re a shift in how we approach and integrate data, which can lead to some adoption costs. However, they can also bring advantages to an organization in addition to what Bess mentioned above. 

  • Comprehensive, Connected Data – Graphs provide descriptive data models and the ability to query and combine multiple graphs painlessly, without the join tables, intermediary schemas, or rules that relational databases often require.
  • Extendable Foundation – Knowledge graphs and graph databases enable the reuse of existing information as well as the flexibility to add more types of data, properties, and relationships to implement new use cases with minimal effort. 
  • Lower Costs – The upfront investment (licensing fees, the cost of migrating data, and the cost of hiring or growing the appropriate skill sets) can balance out in the long term given the flexibility to adapt the data model with evolving data and use cases.

Graph technologies are important to consider when building for the future and scale of data at your organization.

Conclusion

Like any major data architecture component, graph databases have their fair share of both pros and cons, and the choice to use them will ultimately come down to what fits the needs of each organization. If you’re looking for help in determining whether a graph database is a good fit for your organization, contact us.

The post Expert Analysis: Does My Organization Need a Graph Database? appeared first on Enterprise Knowledge.
