machine learning Articles - Enterprise Knowledge
https://enterprise-knowledge.com/tag/machine-learning/

Beyond Traditional Machine Learning: Unlocking the Power of Graph Machine Learning
https://enterprise-knowledge.com/beyond-traditional-machine-learning-unlocking-the-power-of-graph-machine-learning/
Traditional machine learning (ML) workflows have proven effective in a wide variety of use cases, from image classification to fraud detection. However, traditional ML leaves relationships between data points to be inferred by the model, which can limit its ability to fully capture the complex structures within the data. In enterprise environments, where data often spans multiple, interwoven systems—such as customer relations, supply chains, and product life cycles—traditional ML approaches can fall short by missing or oversimplifying relationships that drive critical insights into customer behavior, product interactions, and risk factors. In contrast, graph approaches allow these relationships to be explicitly represented, enabling a more comprehensive analysis of complex networks.

Graph machine learning (Graph ML) offers a new paradigm for handling the complexities of real-world data, which often exists in interconnected networks. For example, Graph ML can be leveraged to build highly effective recommender systems that identify critical connections and enhance decision-making. Unlike traditional ML, which treats data as independent observations, Graph ML captures the interactions and connections between data points, revealing patterns that are invisible to traditional methods. Recognizing the pivotal role of graph technologies in driving innovation across data analytics, data professionals are increasingly optimizing their workflows to harness these powerful tools. But why should data professionals care about Graph ML? By understanding these differences and leveraging graph structures, data professionals can unlock new predictive capabilities that were previously out of reach. Whether you're aiming to enhance fraud detection, optimize recommendation systems, or improve social network analysis, Graph ML is an increasingly valuable tool in the data scientist's toolkit.

In this blog, we will explore the unique advantages that Graph ML offers over traditional approaches. We'll dive into graph-specific considerations throughout each step of the machine learning process, from pre-processing to model evaluation, and provide expert advice for effectively integrating graph techniques into your ML workflow. While standard machine learning processes can answer simple use cases such as image classification, basic customer churn prediction, or straightforward regression analysis, graph machine learning allows you to tackle richer, network-driven scenarios, including fraud detection through network anomaly patterns, sophisticated recommendation engines built on user-item graphs, and social network influence analysis. If you haven't yet built a graph for your organization, here are the high-level steps: identify the entities and relationships within your use case, build a graph schema, and load your data into a graph database. For more in-depth guidance, see this detailed guide on developing an enterprise-level graph. This process often includes breaking down your data into triples (subject-predicate-object) and representing the connections between nodes through methods like adjacency matrices, embeddings, or random walks.
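
To make the triple representation above concrete, here is a minimal sketch using Python's rdflib library; the entities, predicates, and namespace are invented purely for illustration and are not drawn from any specific dataset.

```python
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF

# Hypothetical namespace for an example enterprise domain
EX = Namespace("http://example.org/")

g = Graph()

# Each statement is a subject-predicate-object triple
g.add((EX.customer_42, RDF.type, EX.Customer))
g.add((EX.product_7, RDF.type, EX.Product))
g.add((EX.customer_42, EX.purchased, EX.product_7))
g.add((EX.product_7, EX.belongsToCategory, EX.lab_reagents))
g.add((EX.customer_42, EX.locatedIn, Literal("Arlington, VA")))

# Query the graph for everything we know about customer_42
for predicate, obj in g.predicate_objects(subject=EX.customer_42):
    print(predicate, obj)
```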

Understanding the ML Development Lifecycle

Machine Learning (ML) Development Lifecycle

Take a moment to review the ML development lifecycle wheel above. The wheel is divided into five distinct sections: Pre-Processing, Train-Test Split, Model Training, Model Evaluation, and Document and Iterate. Below, we start with Pre-Processing, where we transform raw data into a graph structure, extract critical graph features, and apply compression techniques to manage complexity. Each subsequent section will build on these foundations by detailing the specific approaches and methodologies used in Train-Test Split, Model Training, and Model Evaluation. The wheel serves as our roadmap for understanding how Graph Machine Learning allows for deeper insights from complex, interconnected data.

Step 1: Pre-Processing

Graph Conversion

Business Value: In traditional ML, raw data is processed as independent feature vectors–meaning models often miss the relationships between entities and can’t leverage network effects. In contrast, graph conversion allows for the systemic mapping of raw data into a structured network of entities and relationships, revealing new insights and perspectives.

The first step in Pre-Processing is graph conversion. Graph conversion is the process of transforming unstructured, semi-structured, or structured data into a graph model where individual entities become nodes and the connections, or relationships, between them are represented as edges. This conversion lays the groundwork for advanced graph analysis by explicitly modeling the relationships within the data, rather than leaving all of the connections to be inferred.
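
As a hedged illustration of graph conversion, the sketch below uses the open-source networkx library to turn an invented table of purchase records into a graph in which customers and products become nodes and each purchase becomes an edge; the column names and values are assumptions for demonstration only.

```python
import pandas as pd
import networkx as nx

# Hypothetical raw data: one row per purchase event
transactions = pd.DataFrame({
    "customer": ["alice", "alice", "bob", "carol"],
    "product":  ["p100", "p200", "p100", "p300"],
    "amount":   [250.0, 80.0, 250.0, 40.0],
})

# Graph conversion: entities become nodes, purchases become edges
G = nx.from_pandas_edgelist(
    transactions,
    source="customer",
    target="product",
    edge_attr="amount",
)

# The relational structure is now explicit rather than implied
print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")
print("Products purchased by alice:", list(G.neighbors("alice")))
```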

This foundational graph conversion not only organizes the raw data into a clear structure but also enables the extrapolation of clusters, central nodes, and intricate, multi-hop relationships. This structured representation not only enhances the clarity of data analysis, but also establishes a foundation for scalable predictive modeling and clearer understanding of intricate linkages. This base sets the stage for the next step of Pre-Processing: Graph Feature Extraction.

Graph Feature Extraction

Business Value: Conventional feature extraction methods treat each data point in isolation, often missing how entities connect in a network. Graph features capture both individual data attributes and relational patterns, allowing models to assess influence, connectivity, and community dynamics, providing a richer context compared to traditional feature extraction.

Graph-specific feature extraction captures not only individual data point attributes but also the relationships and structural patterns between data points that traditional methods miss. Graph features, such as Degree Centrality and Betweenness Centrality, reveal the importance of a node within the overall network, allowing models to predict how influential or well-connected an entity is in relation to others. 

Features like PageRank Scores help in ranking nodes based on their connectivity and importance, making them especially useful in recommendation systems and fraud detection, where influence and connectivity play a key role. Clusters and community detection features capture groups of interconnected nodes, enabling tasks like identifying suspicious behavior within certain groups or detecting communities in social networks. These rich, interconnected features allow Graph ML models to make predictions based on the broader context, not just isolated points, giving them a deeper understanding of the data’s inherent relationships. This comprehensive feature extraction naturally leads into the next step in Pre-Processing: Compression, where we streamline the data while preserving these critical relational insights.
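
The sketch below shows how these structural features might be extracted with networkx on a small built-in example graph; in practice the resulting scores would be joined to each node's other attributes before modeling, and the specific features chosen would depend on the use case.

```python
import networkx as nx

# Toy graph standing in for an enterprise network
G = nx.karate_club_graph()

# Node-level structural features
degree = nx.degree_centrality(G)             # how connected each node is
betweenness = nx.betweenness_centrality(G)   # how often a node bridges others
pagerank = nx.pagerank(G)                    # influence based on connectivity

# Community membership as a categorical feature
communities = nx.algorithms.community.greedy_modularity_communities(G)
community_id = {n: i for i, comm in enumerate(communities) for n in comm}

# Assemble a per-node feature table for downstream models
features = {
    n: {
        "degree": degree[n],
        "betweenness": betweenness[n],
        "pagerank": pagerank[n],
        "community": community_id[n],
    }
    for n in G.nodes
}
print(features[0])
```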

Graph Feature Extraction through Degree Centrality

Compression

Business Value: Graph compression preserves structural relationships while reducing complexity, enabling efficient analysis without sacrificing key insights embedded in the graph’s intricate connections.

Compression is used to reduce the size, complexity, and redundancy of a graph while ensuring its structure and information are preserved. In traditional ML, dimensionality reduction methods like PCA or feature selection are used to reduce data complexity, but these methods overlook the relational structure between entities. In contrast, graph compression techniques, such as node embeddings, graph pruning, and adjacency matrix compression, preserve the graph’s inherent connections and patterns while simplifying the data. Node embeddings, in particular, are a powerful way to represent nodes as feature-rich vectors, capturing both the attributes of a node and its relational context within the graph. 
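
As one hedged example of producing node embeddings, the sketch below follows a DeepWalk-style approach: generate short random walks over the graph and feed them to gensim's Word2Vec so that each node is compressed into a dense vector. The walk length, number of walks, and embedding dimensions are arbitrary illustrative choices rather than recommendations.

```python
import random
import networkx as nx
from gensim.models import Word2Vec

G = nx.karate_club_graph()

def random_walk(graph, start, length=10):
    """Generate one random walk, treating node IDs as 'words'."""
    walk = [str(start)]
    current = start
    for _ in range(length - 1):
        neighbors = list(graph.neighbors(current))
        if not neighbors:
            break
        current = random.choice(neighbors)
        walk.append(str(current))
    return walk

# Build a 'corpus' of walks: each walk is a sentence of node IDs
walks = [random_walk(G, node) for node in G.nodes for _ in range(20)]

# Compress each node into a 32-dimensional embedding vector
model = Word2Vec(walks, vector_size=32, window=5, min_count=1, sg=1, epochs=5)

print(model.wv["0"][:5])                    # first few dimensions of node 0's vector
print(model.wv.most_similar("0", topn=3))   # structurally similar nodes
```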

Compression is an essential step in Graph ML because graphs often contain far more intricate details about the relationships between entities, which require high computing power to analyze. Compression helps reduce the noise and irrelevant connections that can distract the model, allowing it to focus on the most critical relationships within the graph. This ability to retain the most meaningful structural information while reducing computational overhead gives Graph ML an edge over traditional methods, which may lose key insights during dimensionality reduction. With Compression, the Pre-Processing phase is complete, setting a clear and efficient foundation as we move into Step 2: Train-Test Split.

Compression through Embeddings

Step 2: Train-Test Split

Subgraph Sampling

Business Value: Basic train-test splitting methods sample instances without regard to connectivity, which can sever critical network links, so subgraph sampling ensures the test set reflects the overall graph structure, allowing models to learn and generalize from complex relationships present in real-world data.

Subgraph sampling is an essential part of graph machine learning, as it ensures the test set is representative of the entire graph structure by sampling subgraphs that reflect the entities and relationships in the overall graph. In traditional ML, splitting data into training and test sets is straightforward because data points are often independent, but graph data introduces dependencies between nodes. Complex graph data captures interconnected relationships such as communities, hierarchies, and long-range dependencies that traditional models would overlook. Subgraph sampling preserves these relationships, enabling the model to learn from the complex structures and generalize better to unseen data. By capturing these dependencies in the train-test split, the model maintains a more complete understanding of how entities interact, allowing it to make better predictions in cases where the relationships between data points are key, such as social network analysis or fraud detection. This careful sampling also highlights the need to address potential overlaps in relationships, which leads us into the next critical consideration: Link Leakage.
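
A minimal sketch of one simple subgraph-sampling strategy is shown below: hold out the local neighborhoods around a few randomly chosen seed nodes as a test subgraph so that the held-out data keeps its surrounding structure intact. Production pipelines typically use more sophisticated samplers; the seeds and neighborhood radius here are illustrative assumptions.

```python
import random
import networkx as nx

G = nx.karate_club_graph()

# Choose a handful of seed nodes and take their local neighborhoods
random.seed(42)
seeds = random.sample(list(G.nodes), 3)
test_nodes = set()
for seed in seeds:
    test_nodes |= set(nx.ego_graph(G, seed, radius=1).nodes)

train_nodes = set(G.nodes) - test_nodes

# Induced subgraphs preserve the relationships inside each split
G_test = G.subgraph(test_nodes).copy()
G_train = G.subgraph(train_nodes).copy()

print("train:", G_train.number_of_nodes(), "nodes /", G_train.number_of_edges(), "edges")
print("test: ", G_test.number_of_nodes(), "nodes /", G_test.number_of_edges(), "edges")
```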

Train-Test Split with Proportional Representation

Link Leakage

Business Value: Random or node-based splitting can place connected nodes across sets, allowing information to leak via edges. Edge-based splitting prevents information leakage between training and test sets, preserving the integrity of graph relationships and delivering reliable, unbiased predictions.

Link leakage occurs when connections between nodes in the training data indirectly reveal information about the test data. Traditional ML doesn’t face this issue because data points are independent, but in graph ML, relationships between nodes can lead to unintended overlap between the training and test sets. To mitigate this, consider splitting the data by edges to ensure that the test set remains independent of the training set’s connections. Splitting on edges maintains the graph’s inherent relational information, which is a crucial advantage of graph data. This method allows the model to learn from the complex interdependencies within the graph, leading to more accurate predictions in real-world applications like fraud detection or recommendation systems. It also helps avoid biases that may arise from overlapping connections, enhancing the overall reliability of the model. This edge-based approach is vital in graph ML, where traditional methods fall short in addressing these complex dependencies. With a robust solution for link leakage in place, we are now ready to transition into the next major phase: Step 3, Model Training.
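
Before moving on, here is a hedged sketch of an edge-based split for a link-prediction task: positive edges are divided into train and test sets, the held-out edges are removed from the training graph so their existence cannot leak through, and random non-edges are sampled as negatives. It is a simplified outline rather than a production-ready procedure.

```python
import random
import networkx as nx

G = nx.karate_club_graph()
random.seed(0)

# Split existing edges (positives) into train and test
edges = list(G.edges())
random.shuffle(edges)
split = int(0.8 * len(edges))
train_edges, test_edges = edges[:split], edges[split:]

# Remove held-out edges from the training graph so they cannot leak
G_train = G.copy()
G_train.remove_edges_from(test_edges)

# Sample node pairs with no edge at all as negative examples
non_edges = list(nx.non_edges(G))
negatives = random.sample(non_edges, len(test_edges))

print(len(train_edges), "training edges")
print(len(test_edges), "held-out positive pairs,", len(negatives), "negative pairs")
```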

Comparing Train-Test Split Methods

Step 3: Model Training

Business Value: Conventional ML models treat instances independently and can’t model dependencies across entities, so graph-specific algorithms capture complex dependencies and relationships that traditional ML models often overlook, enabling deeper insights and more accurate predictions for tasks driven by connections.

Using algorithms designed specifically for graph data allows you to fully leverage the unique relationships and structures present in graph data, such as the connections between nodes, the importance of specific relationships, and the overall topology of the graph. Traditional ML models, such as decision trees or linear regression, assume that data points are independent and often fail to capture complex dependencies. In contrast, graph algorithms—like node classification, edge prediction, community detection, and anomaly detection—are built to capture the interdependencies between nodes, edges, and their neighbors. These algorithms can uncover patterns and dependencies that are hidden from traditional approaches, such as identifying key influencers in a network or detecting anomalies based on unusual connections between entities.
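
As one simple, hedged example of training on graph structure, the sketch below performs node classification by feeding structural features (centrality and PageRank scores) into a scikit-learn classifier on networkx's built-in karate club graph. Dedicated graph learning frameworks such as PyTorch Geometric or DGL are the more common choice for large graphs and GNNs; this is only a minimal illustration of the idea.

```python
import networkx as nx
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

G = nx.karate_club_graph()

# Structural features for each node
degree = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)
pagerank = nx.pagerank(G)

X = [[degree[n], betweenness[n], pagerank[n]] for n in G.nodes]
# Label: which community ('club') each member ultimately joined
y = [G.nodes[n]["club"] for n in G.nodes]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```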

By utilizing graph algorithms, you can gain deeper insights and make more accurate predictions, especially in tasks where relationships between entities play a critical role, such as fraud detection, recommendation systems, or social network analysis. These insights, driven by the relational data that graph models are designed to exploit, give graph ML a clear advantage when interactions between entities drive outcomes. Following model training, it is essential to evaluate the performance of these specialized models.

Use Cases for Graph Algorithms

Step 4: Model Evaluation

Business Value: Standard evaluation metrics measure predictions independently and ignore graph structure. However, graph-specific metrics offer a more nuanced assessment of graph model performance, capturing structural relationships that traditional metrics overlook.

While common performance metrics apply broadly to most graph ML use cases, there are also specialized metrics for graph ML—such as Normalized Mutual Information (NMI), Adjusted Rand Index (ARI), and Modularity. Traditional ML evaluation metrics like accuracy or F1-score work well for independent data points, but they don’t fully capture the nuances of graph structures, such as community detection or link prediction. Graph-specific performance metrics provide a more nuanced evaluation of models, ensuring that the unique aspects of graph structures are effectively measured and optimized. When you are evaluating a graph ML model, you are able to determine model performance with enhanced structural awareness, contextual evaluation, and imbalanced data handling—areas where traditional ML metrics often fall short.
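
A minimal sketch of computing the graph-specific metrics named above is shown below, assuming we have detected communities to compare against ground-truth labels; NMI and ARI come from scikit-learn, and modularity from networkx.

```python
import networkx as nx
from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score

G = nx.karate_club_graph()

# 'Predicted' communities from a detection algorithm
detected = nx.algorithms.community.greedy_modularity_communities(G)
pred = {n: i for i, comm in enumerate(detected) for n in comm}

# Ground-truth labels (the club each member actually joined)
truth = {n: G.nodes[n]["club"] for n in G.nodes}

nodes = sorted(G.nodes)
y_true = [truth[n] for n in nodes]
y_pred = [pred[n] for n in nodes]

print("NMI:       ", normalized_mutual_info_score(y_true, y_pred))
print("ARI:       ", adjusted_rand_score(y_true, y_pred))
print("Modularity:", nx.algorithms.community.modularity(G, detected))
```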

Comparing Graph Performance Metrics

Graph ML Solution Components

To implement Graph ML successfully, an organization needs a cohesive set of features that support the entire graph workflow. At a minimum, you must have: 

(1) a scalable graph storage layer that can ingest and index heterogeneous data sources (including batch and streaming updates) while enforcing a flexible schema;

(2) a pre-processing engine capable of automatically extracting and managing entity and relationship attributes (e.g., generating node and edge-level features); 

(3) integrated support for generating and storing graph embeddings and/or handcrafted graph features (such as centrality scores, community assignments, or path-based statistics);

(4) a library of graph algorithms and GNN (Graph Neural Network) frameworks that can train on large-scale graphs, ideally with GPU-acceleration and distributed compute options;

(5) real-time inference capabilities that preserve graph connectivity (so predictions like link-forecasting or node classification remain aware of the surrounding network);

(6) visualization and exploration tools that let data teams inspect subgraphs, feature distributions, and model explainability outputs; and

(7) robust security, access controls, and lineage tracking to ensure data governance across the graph pipeline.

Case Studies – Adapt and Overcome

Bioscience Technology Provider Case Study 

With the methodology now established, let's take a look at a real-world situation. A leading bioscience technology provider's e-commerce platform struggled to connect its 70,000 products and related educational content, each siloed across more than five different systems, using only keyword search, so we applied the same Graph ML workflow outlined above to bridge those gaps. We ingested data from their various platforms into an in-memory knowledge graph, generated vector embeddings to capture content relationships, and trained a custom link-prediction model (sampling known product-content links rather than enforcing link-leakage controls) to infer new connections. The resulting similarity-index and link-classifier views were delivered via an interactive dashboard, validated through human-in-the-loop sessions, and backed by comprehensive documentation and a repeatable AI validation framework. While we skipped graph-specific metrics (favoring standard ML measures like AUC, precision, and recall) to accelerate delivery, this guideline-driven approach demonstrates how the techniques in this blog can be pragmatically adapted to real-world constraints.

Occupational Safety Government Agency

Another applied use case revolves around an occupational safety government agency. Enterprise Knowledge prototyped a semantic recommender that could infer potential workplace hazards from diverse site features–taxonomy values, structured historical datasets, and thousands of unstructured incident reports–so planners could rapidly assess risks and compliance requirements. We began by designing a custom taxonomy and ontology to model construction site elements and regulations, then processed data with zero-shot NLP on a distributed GPU cluster and loaded everything into an RDF knowledge graph. From there we generated vector embeddings and trained a custom edge-prediction classifier to link scenarios to likely risks, deploying the results in a cloud-hosted web app. Guided by performance metrics on a held-out test set, each step was iteratively refined through user feedback and expert workshops. EK maintained continuous collaboration with agency experts through regular design sessions and concluded with a detailed technical walkthrough to ensure transparency and client buy-in. Backed by detailed documentation and a clear roadmap for expanding feedback loops and analysis dimensions, this solution underscores that the Graph ML lifecycle is a flexible framework; teams should tailor or simplify steps as needed to align with real-world constraints like timelines, data availability, and resource limits.

Conclusion

Graph machine learning offers a transformative approach to working with complex, interconnected datasets. By leveraging graph-specific techniques like feature extraction, compression, and graph algorithms, you can unlock deeper insights that go beyond what traditional ML can achieve. Whether it’s in community detection, fraud prevention, or recommendation systems, graph ML provides a way to model relationships and structures that traditional methods often miss. As we move toward a future where graph technologies are increasingly integrated into data workflows, it’s clear that understanding and applying these methods can lead to more accurate predictions, better decision-making, and ultimately, a competitive edge. If you’re interested in unlocking the potential of graph ML for your organization, contact EK to learn more!

 

Enhancing Insurance Fraud Detection through Graph-Based Link Analysis
https://enterprise-knowledge.com/insurance-fraud-detection-through-graph-link-analysis/

The Challenge

Technology is increasingly used as both a force for good and as a means to exploit vulnerabilities that greatly damage organizations – whether financially, reputationally, or through the release of classified information. Consequently, efforts to combat fraud must evolve to become more sophisticated with each passing day. The field of fraud analytics is rapidly emerging and, over the past 10 years, has expanded to include graph analytics as a critical method for detecting suspicious behavior.

In one such application, a national agency overseeing insurance claims engaged EK to advise on developing and implementing graph-based analytics to support fraud detection. The agency had a capable team of data scientists, program analysts, and engineers focused on identifying suspicious activity among insurance claims, such as:

  • Personal information being reused across multiple claims;
  • Claims being filed under the identities of deceased individuals; or
  • Individuals claiming insurance from multiple locations. 

However, they were reliant on relational databases to accomplish these tasks. This made it difficult for program analysts to identify subtle connections between records in tabular format, with data points often differing by just a single digit or character. Additionally, while the organization was effective at flagging anomalies and detecting potentially suspicious behavior, they faced challenges relating to legacy software applications and limited traditional data analytics processes. 

EK was engaged to provide the agency with guidance on standing up graph capabilities. This graph-based solution would transform claim information into interconnected nodes, revealing hidden relationships and patterns among potentially fraudulent claims. In addition, EK was asked to build the agency’s internal expertise in graph analytics by sharing the methods and processes required to uncover deeper, previously undetectable patterns of suspicious behavior.

The Solution

To design a solution suitable for the agency’s production environment, EK began by assessing the client’s existing data infrastructure and analytical capabilities. Their initial cloud solution featured a relational database, which EK suggested extending with a graph database through the same cloud computing platform vendor for easy integration. Additionally, to identify suspicious connections between claims in a visual format, EK recommended an approach for the agency to select and integrate a link analysis visualization tool. These tools are crucial to a link analysis solution and allow for the graphical visualization of entities alongside behavior detection features that identify data anomalies, such as timeline views of relationship formation. EK made this recommendation using a custom and proprietary tooling evaluation matrix that facilitates informed decision-making based on a client’s priority factors. Once the requisite link analysis components were identified, EK delivered a solution architecture with advanced graph machine learning functionality and an intuitive user experience that promoted widespread adoption among technical and nontechnical stakeholders alike.

EK also assessed the agency’s baseline understanding of graphical link analysis and developed a plan for upskilling existing data scientists and program analysts on the foundations of link analysis. Through a series of primer sessions, EK’s subject matter experts introduced key concepts such as knowledge graphs, graph-based link analysis for detecting potentially suspicious behavior, and the underlying technology architecture required to instantiate a fully functional solution at the agency.

Finally, EK applied our link analysis experience to address client challenges by laying out a roadmap and implementation plan that detailed challenges along with proposed solutions to overcome them. This took the form of 24 separate recommendations and the delivery of bespoke materials meant to serve as quick-start guides for client reference.

The EK Difference

A standout feature of this project is its novel, generalizable technical architecture:

During the course of the engagement, EK relied on its deep expertise in unique domains such as knowledge graph design, cloud-based SaaS architecture, graph analytics, and graph machine learning to propose an easily implementable solution. To support this goal, EK developed an architecture recommendation that prompted as few modifications to existing programs and processes as possible. With the proposed novel architecture utilizing the same cloud platform that already hosted client data, the agency could implement the solution in production with minimal effort.

Furthermore, EK adapted a link analysis maturity benchmark and tool evaluation matrix to meet the agency’s needs and ensure that all solutions were aligned with the agency’s goal. Recognizing that no two clients face identical challenges, EK delivered a customized suite of recommendations and supporting materials that directly addressed the agency’s priorities, constraints, and long-term needs for scale.

The Results

Through this engagement, EK provided the agency with the expertise and tools necessary to begin constructing a production-ready solution that will:

  • Instantiate claims information into a knowledge graph;
  • Allow users to graphically explore suspicious links and claims through intuitive, no-code visualizations;
  • Alert partner agencies and fraud professionals to suspicious activity using graph-based machine learning algorithms; and
  • Track changes in data over time by viewing claims through a temporal lens.

In parallel, key agency stakeholders gained practical skills related to knowledge graphs, link analysis, and suspicious behavior detection using graph algorithms and machine learning, significantly enhancing their ability to address complex insurance fraud cases and support partner agency enforcement efforts.

Interested in strengthening your organization’s fraud detection capabilities? Want to learn what graph analytics can do for you? Contact us today!


Choosing the Right Approach: LLMs vs. Traditional Machine Learning for Text Summarization
https://enterprise-knowledge.com/choosing-the-right-approach-llms-vs-traditional-machine-learning-for-text-summarization/
In an era where natural language processing (NLP) tools are becoming increasingly sophisticated and accessible, many look to automate text-related processes such as recognition, summarization, and generation to save crucial time and effort. Currently, both machine learning (ML) models and large language models (LLMs) are being used extensively for NLP. The choice of model depends on various factors, including client needs and consultant team capabilities. Summarization through machine learning has come a long way over the years and is now an extremely viable and attractive option for those looking to automate natural language processing.

In this blog, I will dive into the history of NLP and compare and contrast LLMs, machine learning models, and summarization methods. Additionally, I will speak to a recent project in which a government agency tasked EK with summarizing thousands of text responses to a survey. Drawing on the summarization methods and considerations covered below, I will then explain EK's choice between traditional machine learning methods and LLMs for this project, as well as considerations to keep in mind when deciding on a summarization method for a given use case, including when sensitive data is involved.

The History of Natural Language Processing 

Natural language processing has been a relevant concept in computing since the days of Alan Turing, who defined the well-known Turing test in his famous 1950 article, “Computing Machinery and Intelligence.” The test was designed to measure a computer’s ability to impersonate a human in a real-time written conversation, such that a human would be unable to distinguish whether or not they were speaking with another human or a computer; over 70 years later, computers are still advancing to reach that point. 

In 1954, the first successful attempt at an implementation of NLP was conducted by Georgetown University and IBM, where a computer used punch card code to automatically translate a batch of more than 60 Russian sentences into English. While this was an extremely controlled experiment, in the 1960s, ELIZA, one of the first “chatterbots,” was able to parse users’ sentences and output sensical and contextually appropriate sentences. However, ELIZA used pattern matching and substitution to appear like it understood prompts, as it was unable to truly understand them and provided canned responses to prompts that were unusual or nonstandard. 

In the following two decades, NLP models mainly consisted of hand-written rulesets that machines relied on to understand input and produce relevant output, which were quite effortful for computer scientists to implement. Throughout the 1990s and 2000s, these were replaced with statistical models, thanks to the advent and propagation of machine learning and hardware that could support more complex computing. These statistical models were much more powerful and able to engage with and manipulate more data, but introduced more ambiguity due to the lack of concrete rules. Starting with machine translation models that learned to translate between languages from bilingual sets of the same text using statistical machine translation, machines began to develop deeper text understanding, processing, and generation skills.

The most recent iterations of NLP have been based on transformer machine learning models, allowing for deep learning and domain-specific training, so that NLP can be customized more easily to a client use case. These attention mechanism-based models were first proposed as an initial method for modern artificial intelligence use cases in 2017, when eight computer scientists working at Google wrote the paper “Attention Is All You Need,” publicizing the transformer architecture for the first time, which has been used in models such as OpenAI’s ChatGPT and other large language models to great success. These models were the starting point for Generative AI, which for the first time allows computers to synthesize new content, rather than simply classifying, summarizing, or otherwise modifying existing content. Today, these models have taken the field of machine learning by storm, and have led to the current “AI boom,” or “AI spring.”

Abstractive vs. Extractive Summarization

There are two key types of NLP summarization techniques: extractive summarization and abstractive summarization. Understanding these methods and the models that employ them is essential for selecting the right tool for your text summarization needs. Let’s delve deeper into each type, explore the models used, and use a running example to illustrate how they work.

Extractive summarization involves selecting significant sentences or phrases directly from the source text to form a summary. The model ranks sentences based on predefined criteria, such as keyword frequency or sentence position, and then extracts the top-ranking sentences without modifying them. As an example, consider the following text:

“The rapid advancement of artificial intelligence (AI) is reshaping industries globally. Businesses are leveraging AI to optimize operations, enhance customer experiences, and drive innovation. However, the integration of AI technologies comes with challenges, including ethical considerations and the need for substantial investment.”

An extractive summarization model might produce the following summary:

“The rapid advancement of AI is reshaping industries globally. Businesses are leveraging AI to optimize operations. The integration of AI technologies comes with challenges, including ethical considerations.”

For most of the history of NLP, models have been extractive – two examples are the Natural Language Toolkit (NLTK) and the Bidirectional Encoder Representations from Transformers (BERT) model, arguably one of the most advanced extractive models. NLTK is a more basic toolkit that relies on frequency analysis and position-based ranking to identify key sentences to extract. While NLTK provides a straightforward approach, its summaries may lack coherence if the extracted sentences don't flow naturally when combined. BERT's ability to grasp nuanced meanings makes it more effective than basic frequency-based methods, but it still relies on extracting existing sentences.
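
To ground the extractive approach, here is a minimal frequency-based sketch using NLTK: score each sentence by how often its non-stopword terms appear in the text and return the top-ranked sentences verbatim. It is a simplified outline of how toolkit-based extractive summarizers work, not a production implementation.

```python
import nltk
from collections import Counter
from nltk.corpus import stopwords
from nltk.tokenize import sent_tokenize, word_tokenize

# Tokenizer and stopword resources (newer NLTK versions also ship punkt_tab)
nltk.download("punkt", quiet=True)
nltk.download("punkt_tab", quiet=True)
nltk.download("stopwords", quiet=True)

text = (
    "The rapid advancement of artificial intelligence (AI) is reshaping industries globally. "
    "Businesses are leveraging AI to optimize operations, enhance customer experiences, and drive innovation. "
    "However, the integration of AI technologies comes with challenges, including ethical considerations "
    "and the need for substantial investment."
)

stop_words = set(stopwords.words("english"))
words = [w.lower() for w in word_tokenize(text) if w.isalpha() and w.lower() not in stop_words]
freq = Counter(words)

# Score each sentence by the frequencies of the words it contains
sentences = sent_tokenize(text)
scores = {s: sum(freq[w.lower()] for w in word_tokenize(s) if w.isalpha()) for s in sentences}

# Extract the top two sentences, keeping their original order
top = sorted(sorted(scores, key=scores.get, reverse=True)[:2], key=sentences.index)
print(" ".join(top))
```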

Abstractive summarization generates new sentences that capture the essence of the source text, potentially using words and phrases not found in the original content. This approach mimics human summarization by paraphrasing and condensing information.

Using the same original text, an abstractive summarization model might produce:

“AI is rapidly transforming global industries by optimizing business operations and enhancing customer experiences. Despite its benefits, adopting AI presents ethical challenges and requires significant investment.”

In this summary, the model has rephrased the content, combining ideas from multiple sentences into a coherent and concise overview. The models used for abstractive summarization might be a little more familiar to you.

An example of an abstractive model is the Bidirectional and Auto-Regressive Transformer (BART) model, which is trained on a large dataset of text and, once given a prompt, creates a summary of the prompt using words and phrases outside of the input. BART is a sequence-to-sequence model that combines the bidirectional encoder of BERT with a decoder similar to GPT’s autoregressive models. BART is trained by corrupting text (e.g., removing or scrambling words) and learning to reconstruct the original text. This denoising process enables it to generate coherent and contextually relevant summaries. It excels at tasks requiring the generation of new text, making it suitable for abstractive summarization. BART effectively bridges the gap between extractive models like BERT and fully generative models, providing more natural summaries.
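
The snippet below is a hedged sketch of abstractive summarization using a pre-trained BART checkpoint through the Hugging Face transformers pipeline; the model name and length parameters are common illustrative choices, and the generated summary will differ from the hand-written example above.

```python
from transformers import pipeline

text = (
    "The rapid advancement of artificial intelligence (AI) is reshaping industries globally. "
    "Businesses are leveraging AI to optimize operations, enhance customer experiences, and drive innovation. "
    "However, the integration of AI technologies comes with challenges, including ethical considerations "
    "and the need for substantial investment."
)

# BART fine-tuned for summarization; it generates new sentences rather than extracting them
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

summary = summarizer(text, max_length=60, min_length=20, do_sample=False)
print(summary[0]["summary_text"])
```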

LLMs also perform abstractive summarization, as they “fill in the blanks” based on massive sets of training data. While LLMs provide the most comprehensive and elaborate human-like summaries, they are prone to “hallucinations,” where they output unrelated or nonsensical text. Furthermore, there are other concerns with using LLMs in an enterprise setting such as privacy and security, which should be considered when working with sensitive data. 

Functional Use of LLMs for Summarization

Recently, a large government agency presented EK with a request to conduct and analyze a survey with the goal of gauging employee sentiment on the current state of their data landscape, in order to understand how to improve their data management processes organization-wide. This survey involved data from over 1,200 employees nationwide, and employed the use of multiple-choice questions, “select all that apply” questions, and most notably, 41 free-response questions. While free-response questions allow respondents to provide a much deeper level of thought and insight into a topic or issue, they can present issues when attempting to gauge a sentiment or identify a consensus among answers. To address this, EK created a plan of how best to summarize numerous, varied text responses without expending manual effort in reading thousands of lines of text. This led to the consideration of both machine learning models and LLMs which can capably perform summarization tasks, saving consultants time and effort best spent elsewhere. 

EK prepared to analyze the survey results from this project by seeking to extract meaningful summaries of more than simply a list of words or a key quote – to deliver sentences and paragraphs that captured the sentiments of a survey question's responses, including respondents' multiple emotions or points of view. For this purpose, extractive summarization models were not a good fit – even with stopwords removed, NLTK did not provide enough keywords to provide a complete description of what respondents indicated in their responses, and BERT's extractive approach could not accurately synthesize coherent responses from answers that varied from sentence to sentence. As such, EK found that abstractive summarization tools were more suitable for this survey analysis. Abstractive summarization allowed us to gather sentiment from multiple viewpoints without "chopping up" the text directly. This allowed us to create a polished and readable final product that was more than a set of quotations.

One key issue in our use case was that LLMs hosted by a provider through the Internet are prone to data leaks and unwanted data retention, where sensitive information becomes part of the LLM’s training set. A data breach affecting one of these provider/host companies can jeopardize proprietary client information, release sensitive personal information, completely upend months of hard work, and even expose companies to legal liability.

To securely automate the sentiment analysis of our client's data, EK used Ollama, a tool that allows various LLMs to be downloaded locally behind a firewall and run using the computer's CPU/GPU processing power. Ollama features a large selection of LLMs to choose from, including the latest model from Meta AI, Llama, which we chose to use for our project.
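
The sketch below illustrates how a locally hosted model can be called through Ollama's HTTP API, which listens on localhost:11434 by default, so that survey text never leaves the machine. The model name, prompt, and sample responses are illustrative assumptions rather than the exact configuration used on this engagement.

```python
import requests

# Invented example responses standing in for real survey data
survey_responses = [
    "Data is scattered across too many systems and hard to find.",
    "I spend hours each week recreating reports that already exist elsewhere.",
    "Access requests take too long to be approved.",
]

prompt = (
    "Summarize the main sentiments in the following survey responses in 2-3 sentences:\n\n"
    + "\n".join(f"- {r}" for r in survey_responses)
)

# Ollama runs entirely on local hardware; no text is sent to an external provider
response = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": prompt, "stream": False},
    timeout=120,
)
print(response.json()["response"])
```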

Pro and Con list describing Machine Learning Models vs. Large Language Models

Based on this set of pros and cons and the context of this government project, EK chose LLMs for their superior ability at producing an output more similar to a final product and their ability to combine multiple similar viewpoints into one summary while being able to select the most common sentiments and present them as separate ideas. 

Outcomes and Conclusion

Through this engagement with EK, the large federal agency received insights from the locally hosted instance of Llama that gave key stakeholders a clear view of the textual responses from over 1,200 respondents. Seeing the answers to 41 free-response questions boiled down to key summaries and actionable insights allowed the agency to identify areas of focus for its data management improvement efforts. From the areas of improvement identified through summarization, the agency was able to prioritize the technical facets of its data landscape that were must-haves in future tooling solutions, as well as areas for more immediate organizational change to garner engagement and buy-in.

Free-text responses can be difficult to process and summarize, especially when filled with various distinct meanings and sentiments. While machine learning models excel at more basic sentiment and keyword analysis, the advanced language understanding power behind an LLM allows for coherent, nuanced, and comprehensive summaries to be formed, capturing multiple viewpoints and presenting them coherently. For this engagement, a locally hosted and secure LLM turned out to be the right choice, as EK was able to deliver survey results that were concise, accurate, and informative. 

If you’re ready to unlock the full potential of advanced NLP tools—whether through traditional machine learning models or cutting-edge LLMs—Enterprise Knowledge can guide you every step of the way. Contact us at info@enterprise-knowledge.com to learn how we can help your organization streamline processes, gain actionable insights, and make more informed decisions faster!

Graph Machine Learning Recommender POC for Public Safety Agency
https://enterprise-knowledge.com/graph-machine-learning-recommender-poc-for-public-safety-agency/

The Challenge

A government agency responsible for regulating and enforcing occupational safety sought to build a content recommender proof-of-concept (POC) that leverages semantic technologies to model the relevant workplace safety domains. The agency aimed to optimize project planning and construction site design by centralizing information from siloed and unstructured sources and extracting a comprehensive view of potential safety risks and related regulations. Automatically connecting and surfacing this information in a single location via the recommender would serve to minimize time currently spent searching for content and limit burdensome manual efforts, ultimately improving risk awareness and facilitating data-driven decision-making for risk mitigation and regulatory adherence. 

The Solution

The agency partnered with EK to develop a knowledge graph-powered semantic recommendation engine with a custom front-end. Based on the use case we refined for construction site project planners, we redesigned the agency's applicable taxonomies and developed an ontology that defined relationships to model the recommendation journey from the user's inputs of construction site elements to the expected outputs of risks and regulations. With data loaded into the graph from taxonomy values and structured historical data, EK leveraged machine learning (ML) and natural language processing (NLP) techniques to extract data from the agency's large volume of unstructured data and generate risk recommendations from user input combinations. EK iterated upon these processes to enrich the data and fine-tune the risk prediction models to achieve even more accurate results. Then, based on low-fidelity wireframes collaboratively developed and validated by the client, EK's software engineers created an interactive front-end for users to view the results and provide feedback, and ultimately deployed the application on cloud infrastructure.

Lastly, in addition to the design and development of the initial POC, EK collaborated closely with the client to assess future uses for the application, as well as methods for improving performance and utility. Potential paths for improving the application include developing user feedback mechanisms, expanding the dimensions of analysis for work sites, and expanding the scope of the application to support additional use cases. EK provided the agency with clear recommendations for next steps and paths forward to build upon the POC and further optimize construction site design and planning. 

The EK Difference

EK employed its extensive experience in taxonomy design, ontology design, and data science with specific expertise in the development of recommender systems to capture and model the semantic content of the construction safety domain. Throughout the engagement, EK prioritized close collaboration with the client’s core project team and involved their subject matter experts and stakeholders in taxonomy, ontology, and wireframe design sessions, iteratively soliciting their feedback and domain knowledge to ensure the final product would properly reflect the language and subject matter for the agency’s use case. EK also provided transparency into the development of the recommender, providing thorough technical walkthroughs of the solution. This ensured the agency had all the knowledge required to make informed decisions regarding next steps to scale the solution following the end of our engagement.

The Results

The graph-powered recommender solution delivered at the end of the engagement was a compelling POC for the client to consider for long-term application and scale. The recommendation engine provided coherent recommendations in a centralized location to reduce manual efforts for end users and displayed related regulations and supporting metrics to facilitate context-based, data-driven decision-making for construction site planners at the agency. The tailored roadmap to refine and expand the solution offered clear guidance for further data and system improvements to increase the overall utility of the recommender. With this POC and the accompanying roadmap, the agency has a tangible and effective solution with a path to scale to achieve widespread buy-in from across the organization and address more complex use cases in order to maximize the value of the recommender. 

This project was an example of EK’s Knowledge Graph Accelerator offering, delivering the POC to the client in 4 months. 

Recommendation Engine Automatically Connecting Learning Content & Product Data
https://enterprise-knowledge.com/recommendation-engine-automatically-connecting-learning-content-product-data/

The Challenge

A bioscience technology provider – and a leader in scientific research and solutions – identified a pivotal challenge within their digital ecosystem, particularly on their public facing e-commerce website. While the platform held an extensive reservoir of both product information and associated educational content, the content and data existed disjointedly (spread across more than five systems). As a result, their search interface failed to offer users a holistic and enriching experience. A few primary issues at hand were:

  • The search capability was largely driven by keywords, limiting its potential to be actionable.
  • The platform’s search functionality didn’t seamlessly integrate all available resources, leading to underutilized assets and a compromised user experience.
  • The painstaking manual process of collating content posed internal challenges in governance and hindered scalability.
  • In the absence of a cohesive content classification system, there was a disjunction between product information and corresponding educational content.
  • Inconsistencies plagued the lifecycle management of marketing content.
  • The array of platforms, managed by different product teams, exposed alignment challenges and prevented a unified user experience.

From a business perspective, the challenges were even more dire. The company faced potential revenue losses as users couldn't gain enough insight to make buying decisions. The user experience became frustrating due to irrelevant content and inefficient searches, while manual processes limited employees and impeded data-driven decision-making regarding the value of the site's content; as a result, both employees and customers resorted to Google searches that routed them back to the site to find what they needed.

The company engaged EK to help bridge the gap between product data and marketing and educational content to ultimately improve the search experience on their platform. 

The Solution

Assessing Current Content and Asset Capabilities at Scale

EK commenced its engagement by comprehensively assessing the company’s current content and asset capabilities. This deep dive included a data mapping and augmented corpus analysis effort into the content and technologies that power their website, such as Adobe AEM (marketing content), a Learning Management System (LMS) with product-related educational content, a Product Information Management (PIM) solution with over 70,000 products, and Salesforce for storing customer data. This provided a clear picture of the existing content and data landscape.

A Semantic Data Model 

With a deeper understanding of the content’s diversity and the need for efficient classification, EK defined and implemented a robust taxonomy and ontology system. This provided a structured way to classify and relate content, making it more discoverable and actionable for users. To tangibly demonstrate the potential of knowledge graphs, EK implemented a POC. This POC aimed to bridge the silos between the different systems, allowing for a more unified and cohesive content experience that connected product and marketing information.

Integrated Data Resources and Knowledge Graph Embeddings

EK utilized an integrated data set to counter data fragmentation across different platforms. A more cohesive content resource was built by combining Adobe AEM and LMS data with manually curated data and extracted information from the rendered website. However, the critical leap came when the entire knowledge graph, which encapsulated this unified data set, was loaded into memory. This in-memory knowledge graph paved the way for real-time processing and analysis, which is essential for generating meaningful embeddings.

Similarity Index and Link Classifier: Two-Fold Search Enhancement

  • Similarity Index: EK’s Enterprise AI and Search experts worked together to convert the in-memory knowledge graph into vector embeddings. These embeddings, teeming with intricate data relationships, were harnessed to power a similarity index; this index stands as a testament to AI’s potential, offering content recommendations rooted in contextual relevance and similarity metrics.
  • Link Classifier: Building upon the embeddings, EK introduced a machine learning (ML) classifier. This tool was meticulously trained to discern patterns and relationships within the embeddings, establishing connections between products and content. Consequently, the system was endowed with the capability to recommend content corresponding to a user’s engagement with a product or related content. This transformed the user journey, enriching it with timely and pertinent content suggestions.
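
As a hedged illustration of the similarity-index idea above, the sketch below treats a handful of invented vectors as if they were embeddings produced from the knowledge graph and ranks content by cosine similarity to a product's vector; the names, values, and dimensionality are assumptions for demonstration only.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Stand-in embeddings (in practice these would come from the knowledge graph)
embeddings = {
    "product:pcr-kit":        np.array([0.9, 0.1, 0.3, 0.0]),
    "article:pcr-protocol":   np.array([0.8, 0.2, 0.4, 0.1]),
    "video:cell-culture-101": np.array([0.1, 0.9, 0.2, 0.3]),
    "guide:reagent-storage":  np.array([0.7, 0.1, 0.5, 0.2]),
}

query = "product:pcr-kit"
names = [k for k in embeddings if k != query]
matrix = np.vstack([embeddings[k] for k in names])

# Rank candidate content by similarity to the product's embedding
scores = cosine_similarity(embeddings[query].reshape(1, -1), matrix)[0]
for name, score in sorted(zip(names, scores), key=lambda x: -x[1]):
    print(f"{name}: {score:.3f}")
```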

ML-Infused User Experience Enhancement

Venturing beyond conventional methodologies, EK incorporated ML, knowledge graphs, taxonomy, and ontology to redefine the user experience. This allowed users to navigate and discover important content through an ML-powered content discovery system, yielding suggestions that resonated with their needs and browsing history.

Unified Platform Management via Predictive Insights

Addressing the multifaceted challenge of various teams steering different platforms, EK integrated the machine learning classifier with predictive insights. This fusion empowered teams with the foresight to gauge user preferences, allowing them to align platform features and fostering a cohesive and forward-looking digital landscape.

Search Dashboard Displaying ML-based Results

Concluding the engagement, EK presented a search dashboard. This dashboard, designed to exhibit two distinct types of results – similarity index and link classifier – served as a window for the organization to witness and evaluate the dual functionalities. The underlying intent was to grant their e-commerce website backend avenues to elevate their search capabilities, giving them a comparative view of multiple ML-based systems.

The EK Difference

EK’s hallmark is rooted in the proficiency of advanced AI and knowledge graph technologies, as well as our profound commitment to client relationships. Working closely with the company’s content and data teams, EK displayed a robust understanding of the technological necessities and the organizational dynamics at play. Even when the level of effort and need from the solution extended beyond the initial scope of work, EK’s flexible approach allowed for open dialogue and iterative development and value demonstration. This ensured that the project’s progression aligned closely with the evolving needs of our client.

Recognizing the intricacy of the project and the importance of a well-documented process, EK meticulously enhanced the documentation of both the delivery process and development. This created transparency and ensured that all the resources needed to carry forward, modify, or scale the implemented solution are in place for the future.

Moreover, given the complexity and nuances involved in such large-scale implementations, EK provided a repeatable framework to validate AI results with stakeholders and maintain integrity and explainability of solutions with human-in-the-loop development throughout the engagement. This was achieved through iterative sessions, ensuring the final system met technical benchmarks and resonated with the company’s organizational context and language.

The Results

The engagement equipped the organization with a state-of-the-art, context-based recommendation system specifically tailored for their vast and diverse digital ecosystem. This solution drastically improved content discoverability, relevance, and alignment, fundamentally enhancing the user experience on their product website.

The exploratory nature of the project further unveiled opportunities for additional enhancements, particularly in refining the data, optimizing the system, and exposing areas where the firm had gaps in content creation or educational materials as they relate to products. Other notable results include:

  • Automated framework to standardize metadata across systems for over 70,000 product categories;
  • A Proof of Concept (POC) that bridged content silos across 4+ different systems, demonstrating the potential of knowledge graphs;
  • A machine-learning classifier that expedited content aggregation and metadata application process through automation; and
  • Increased user retention and better product discovery, leading to 6 figures in closed revenue.
Knowledge Portal for a Global Investment Firm https://enterprise-knowledge.com/knowledge-portal-for-a-global-investment-firm/ Tue, 04 Apr 2023 15:36:19 +0000 https://enterprise-knowledge.com/?p=17924 The Challenge A major investment firm that manages over 250 billion USD in assets in a variety of industries across the globe engaged Enterprise Knowledge (EK) to fix their siloed information and business practices by designing and implementing a Knowledge … Continue reading


The Challenge

A major investment firm that manages over 250 billion USD in assets in a variety of industries across the globe engaged Enterprise Knowledge (EK) to fix their siloed information and business practices by designing and implementing a Knowledge Portal. 

Siloed data scattered across multiple systems resulted in investment professionals wasting valuable time searching for the knowledge assets required to make fast, complete, and informed decisions. The firm manages a diverse portfolio of global assets and investments, with over 50,000 employees. Detailed records of these business deals existed as an incongruous mix of structured and unstructured information located across multiple repositories. Even gaining access to much of this information required awareness that it existed, as well as knowledge of whom to contact to be granted permissions.

To fill knowledge gaps caused by misplaced or inaccessible content, investment teams also commissioned research reports and studies to support their decision-making processes. However, these reports were seldom shared across the organization and, in fact, were often duplicated across teams. Additionally, since investment records were siloed based on division and investment types, the firm was not leveraging the vast expertise of its employees. 

The firm recognized it needed a centralized way to find, view, and share its knowledge assets and connect staff to experts. The solution required improved visibility across data resources, access management practices, and the ability to connect with expert staff.

The Solution

EK designed, developed, and deployed an Enterprise Knowledge Portal, leveraging a suite of best-of-breed technologies. EK first conducted business case refinement sessions to understand, in depth, the problems the Knowledge Portal needed to solve and its benefits to the firm, defining a series of personas, use cases, and user journeys to help prioritize key features along an Agile development plan. EK then developed the Agile roadmap for an MVP solution that would offer immediate value to the firm’s staff and business ventures while proving the value of the Knowledge Portal, as well as a follow-on backlog of features for an enhanced solution to be delivered iteratively on top of that foundation.

Over the next year, EK worked side by side with the client’s business and IT groups, as well as other third-party vendors, in order to iteratively develop and test the solution. The MVP was delivered on time, and EK is now continuing development of the system, adding additional features and back-end sources in order to enrich the overall wealth of knowledge, information, and data within the system.

Overall, the Knowledge Portal delivers several first-of-its-kind features for the organization, including:

  • Integration of structured and unstructured data, not just as links but as displayed results that merge source materials for easy comprehension, analysis, and user action;
  • Ability to understand and align complex security models, displaying only the content that should be accessible to each individual;
  • Machine Learning and AI to provide highly customized views, automatically assembled based on the user; and
  • Integration of all types of information with people, enabling individuals to find experts across the enterprise in a way that forms new connections and identifies new opportunities for collaboration.

Of specific note, the portal’s search application leverages a graph database modeled to integrate information from an extensive network of internal data sources and deliver it in a single search result. For example, a single investment might combine information from as many as 12 different systems. In addition to the graph database, the portal leverages an insights engine powered by Artificial Intelligence (AI) that unifies siloed data and detects trends across repositories. Both the graph database and the insights engine are powered by a semantic layer that maintains the relationships users can follow to traverse data sets across the enterprise, enhancing the findability of relevant content regardless of a user’s business role. Additionally, EK mapped user roles to access and permissions to refine the firm’s access controls, streamlining navigation of the firm’s data, reducing reliance on staff to determine levels of access, and increasing the efficiency of knowledge discovery.
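As a purely illustrative sketch of how a semantic layer can stitch records from several systems into one result, the snippet below builds a tiny RDF graph in which facts about a single investment arrive from different sources and are assembled by one query. The namespace, properties, and data are invented for this example and do not reflect the firm’s actual model or technology stack.

```python
from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/portal/")
g = Graph()

# Facts about the same investment, as they might arrive from different source systems.
inv = URIRef(EX["investment/123"])
g.add((inv, EX.dealName, Literal("Acme Logistics Acquisition")))   # deal management system
g.add((inv, EX.sector, Literal("Transportation")))                 # research repository
g.add((inv, EX.dealLead, URIRef(EX["person/jdoe"])))               # expertise/HR system
g.add((URIRef(EX["person/jdoe"]), EX.fullName, Literal("J. Doe")))

# One query traverses the relationships to deliver a single merged search result.
query = """
PREFIX ex: <http://example.org/portal/>
SELECT ?name ?sector ?leadName WHERE {
    ?inv ex:dealName ?name ;
         ex:sector ?sector ;
         ex:dealLead ?lead .
    ?lead ex:fullName ?leadName .
}
"""
for row in g.query(query):
    print(f"{row.name} | {row.sector} | deal lead: {row.leadName}")
```

The same pattern – one graph traversal assembling fields that originate in many systems – is what allows a single record to present information drawn from multiple sources as one coherent result.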

In support of the advanced technology and ongoing system enhancement, EK also focused on several key foundational KM topics to ensure the long-term success and adoption of the Knowledge Portal. These included content governance and a wide-ranging content cleanup, migration, and enhancement effort, taxonomy and ontology design accompanied by a tagging strategy, change and communications, and content type design. These activities and deliverables ensured that the content and data integrated within the Knowledge Portal could be trusted, was easily consumable both by humans and machines, and would be maintained and further improved moving forward. Furthermore, the accompanying communications and education plan delivered an engaged and aware user base, ready to get value from the new tool.

The EK Difference

EK delivered every aspect of the Knowledge Portal solution using its own staff, deployed across three continents in order to support the client’s global needs. EK brought a broad range of internal experts to bear for this initiative, including knowledge management experts, software engineers, solution architects, change and organizational design experts, taxonomists, ontologists, content strategists, and UI/UX designers and developers. This unique assortment of experts collaborated on every element of the initiative to ensure it leveraged EK’s advanced methodologies and best practices and that the business stakeholders were engaged, aligned, and supportive of the new system.


This effort was also run according to true Agile principles to reduce risk and optimize stakeholder engagement and comprehension of the complex initiative. EK’s team of consultants and engineers expertly applied Agile to adapt quickly to unforeseen changes and roadblocks in development. As a result, rather than talking about the Knowledge Portal, we were able to show early prototypes, generating a wealth of end-user understanding and feedback from the first months of the project.

The Results

The Knowledge Portal consolidated the firm’s vast intellectual resources in a single searchable space, arming investment professionals with easy access to valuable information and connections to experienced staff in ways that had never before been possible. EK is continuing to iterate to add additional features and sources, but the results are already being felt by the organization. Key performance indicators and milestones to date include:

  • Strong adoption, with overall user counts increasing and extremely high retention of all users.
  • Less time spent looking for information or recreating organizational knowledge, resulting in overall higher productivity and employee satisfaction.
  • Faster upskilling of new hires and junior staff, with more junior staffers reporting an ability to complete tasks without waiting for guidance from others.
  • Fewer redundant acquisitions of external research and data sets.

With additional iterations of the Knowledge Portal planned for release over the next two years, the organization continues to partner with EK and invest in the tool as a transformative solution for the organization.

Enterprise Knowledge Playing Unprecedented Role at KMWorld 2021 https://enterprise-knowledge.com/ek-speaking-at-kmworld-2021/ Wed, 03 Nov 2021 13:00:01 +0000 https://enterprise-knowledge.com/?p=13889 Enterprise Knowledge (EK) is playing a central role at this year’s KMWorld Conference. Continuing EK’s principles of thought leadership and industry guidance, EK is playing an unprecedented role at KMWorld, the world’s leading KM conference. This year, EK experts are … Continue reading

Enterprise Knowledge (EK) is playing a central role at this year’s KMWorld Conference, the world’s leading KM conference. Continuing EK’s principles of thought leadership and industry guidance, EK experts are delivering an unprecedented thirteen sessions across KMWorld and its related events, including Taxonomy Boot Camp, Enterprise Search and Discovery, and Text Analytics.

The virtual conference runs from November 15-18th, with preceding workshops delivered on the 12th. The conference will provide practical advice, inspiring thought leadership, and access to in-depth training and workshops on how KM and related disciplines can provide value for your organization and transform your business. This year’s conference theme, Knowledge Sharing in the Age of New Technologies, focuses on culture, people processes, and the many different types of technologies supporting organizations as they excel in their industries.

To continue our tradition of thought leadership, and to add a social and interactive element otherwise missing from many virtual conferences over the last two years, EK will also host an open live stream reception on EK’s YouTube channel on Monday, November 15th, from 5-7pm. Over the course of two hours, EK’s CEO Zach Wahl and EK Consultant Adam Eltarhoni will speak with each of EK’s KMWorld presenters as well as assorted other guests. All KMWorld attendees, as well as the wider Knowledge Management community, will be able to join the session, ask questions, and participate in the conversation via chat.

On the final day of KMWorld, following the closing keynote, Wahl will also present a live version of Knowledge Cast, the number one KM Podcast in the world, as ranked by Feedspot. This special episode of Knowledge Cast will include several KMWorld attendees sharing a live discussion on the themes from this year’s conference. In addition to EK’s prominent speaking roles and other thought leadership, EK is serving as a sponsor at the conference for the eighth year in a row. 

The full list of EK speakers and topics is below. Register for the conference here

  • 11/12/2021 9:00–12:00 – Sara Mae O’Brien-Scott, Semantic Engineering Consultant and Zachary Wahl, CEO – Taxonomy 101
  • 11/16/2021 2:00–2:45 – Jenni Doughty, Senior Solutions Consultant, Taxonomy & Ontology Design and Megan Salerno, Knowledge Management Consultant – Virtual Tools & Techniques to Promote a User-Centric Taxonomy Design
  • 11/16/2021 4:00–4:45 – Joe Hilger, COO – Implementing Search in the New World of AI & ML
  • 11/16/2021 4:00–4:45 – Polly Alexander, Director, Knowledge Management, HealthStream, Inc.; Sara Nash, Technical Consultant, Data and Information Management – Enabling KM in Health Enterprises
  • 11/17/2021 12:45–1:45 – Laurie Gray, VP, Customer Experience and Design, RGP; Tatiana Cakici, Senior KM Consultant – Taxonomy Case Studies: RGP and Health Education England
  • 11/17/2021 12:45–1:45 – Helmut Nagy, COO, Semantic Web Company and Joe Hilger, COO – Enriching Knowledge Graphs – A Two-Way Street
  • 11/17/2021 2:00–2:45 – Ann Bernath, Software Systems Engineer, NASA Jet Propulsion Laboratory (JPL); Bess Schrader, Senior Consultant; and Daria Topousis, Software Systems Engineer, NASA Jet Propulsion Laboratory (JPL) – Institutional Knowledge Graph: Leveraging Semantic Tech
  • 11/17/2021 3:00–3:45 – Liz White, Senior KM Analyst – Understanding Your Users Through UX Design
  • 11/17/2021 4:00–4:45 – Liz White, Senior KM Analyst – Maximizing KM Value With UX & Knowledge Graphs
  • 11/17/2021 4:00–4:45 – Zachary Wahl, CEO – Stump the Taxonomist/Ontologist: Q&A with Experts!
  • 11/17/2021 5:30–7:00 – Guillermo Galdamez, Senior Consultant – Crafting & Selling a KM Strategy to Your Organization
  • 11/18/2021 12:30–1:30 – Aylin Cetin, Senior Analyst and Instructional Designer; Cari Kreshak, Learning Experience Manager, National Park Service – Learning & Culture for Better KM
  • 11/18/2021 2:45–3:30 – Amber Simpson, Senior Manager, Learning & Development, Walmart and Todd Fahlberg, Senior KM Consultant – CM, Digital Workplaces, & Information Architecture

About Enterprise Knowledge 

Enterprise Knowledge (EK) is a services firm that integrates Knowledge Management, Information and Data Management, Information Technology, and Agile Approaches to deliver comprehensive solutions. Our mission is to form true partnerships with our clients, listening and collaborating to create tailored, practical, and results-oriented solutions that enable them to thrive and adapt to changing needs. At the heart of these services, we always focus on working alongside our clients to understand their needs, ensuring we can provide practical and achievable solutions on an iterative, ongoing basis. Visit our website to see how optimizing your knowledge and data management will impact your organization. 

EK to Host an Executive Masterclass at Squirro’s AI Week 2021 https://enterprise-knowledge.com/ek-to-host-an-executive-masterclass-at-squirros-ai-week-2021/ Tue, 28 Sep 2021 20:10:52 +0000 https://enterprise-knowledge.com/?p=13657 Two members of the Enterprise Knowledge (EK) team will lead an Executive Masterclass at Squirro’s inaugural Artificial Intelligence (AI) Week 2021, held virtually from October 5th to the 8th. Squirro’s AI Week 2021 is an online event designed to showcase … Continue reading


Two members of the Enterprise Knowledge (EK) team will lead an Executive Masterclass at Squirro’s inaugural Artificial Intelligence (AI) Week 2021, held virtually from October 5th to the 8th. Squirro’s AI Week 2021 is an online event designed to showcase AI best practices and provide interactive masterclasses that enable business and IT leaders to improve their workflow outputs, decision intelligence, and customer success using AI and Machine Learning (ML).

Chris Marino, Principal Solution Consultant, and Sara Nash, Technical Consultant, will co-present their Executive Masterclass for interested business and IT team leaders. They will discuss several use cases where EK has successfully leveraged AI to improve their clients’ productivity. EK’s experts will outline the foundations of AI, explain how EK successfully implemented AI technologies, and provide lessons learned so that attendees can prepare to implement AI within their organization.

Join EK’s Executive Masterclass on Thursday, October 7th, 2021 from 10:00am-12:30pm EDT. Apply for the class here.

 

EK Again Listed on KMWorld’s AI 50 Leading Companies https://enterprise-knowledge.com/ek-again-listed-on-kmworlds-ai-50-leading-companies/ Fri, 09 Jul 2021 19:59:07 +0000 https://enterprise-knowledge.com/?p=13483 Enterprise Knowledge (EK) has been listed on KMWorld’s 2021 list of leaders in Artificial Intelligence, the “AI 50: The Companies Empowering Intelligent Knowledge Management.” This is the second year in a row EK has been included. To help spotlight innovation … Continue reading

Enterprise Knowledge (EK) has been listed on KMWorld’s 2021 list of leaders in Artificial Intelligence, the “AI 50: The Companies Empowering Intelligent Knowledge Management.” This is the second year in a row EK has been included. To help spotlight innovation in knowledge management, KMWorld developed the annual KMWorld AI 50, a list of vendors that are helping their customers excel in an increasingly competitive marketplace by imbuing products and services with intelligence and automation.

EK is one of the few dedicated consultancies to make the list, offering end-to-end technology selection, strategy, design, implementation, and support services for the full range of Enterprise AI components, including knowledge graphs, natural language processing, ontologies, and machine learning tools.

“A spectrum of AI technologies, including machine learning, natural language processing, and workflow automation, is increasingly being deployed by sophisticated organizations,” stated KMWorld Group Publisher Tom Hogan, Jr.  “Their goal is simple. These organizations seek to excel in an increasingly competitive marketplace by improving decision making, enhancing customer interactions, supporting remote workers, and streamlining their processes. To showcase knowledge management solution providers that are imbuing their offerings with intelligence and automation, KMWorld created the ‘AI 50: The Companies Empowering Intelligent Knowledge Management.’ ”

Lulit Tesfaye, EK’s Practice Leader for Data and Information Management, shared, “Given our continued leadership in this space, and the growth of our team and its capabilities, I’m proud to be recognized in this way. We are increasingly seeing our work with customers grow from initial assessments and prototypes into enterprise engagements that are transforming the way they do business. I’m proud to be leading in this exciting space.”

EK CEO Zach Wahl added, “Thanks to KMWorld for this recognition. KM and AI are increasingly coming together, and we’re pleased to be leading organizations in their transformations to intelligent knowledge organizations.”

To read more about the recognition, visit Lulit’s AI Spotlight article on KMWorld and explore EK’s knowledge base for the latest thought leadership.

About Enterprise Knowledge

Enterprise Knowledge (EK) is a services firm that integrates Knowledge Management, Information Management, Information Technology, and Agile Approaches to deliver comprehensive solutions. Our mission is to form true partnerships with our clients, listening and collaborating to create tailored, practical, and results-oriented solutions that enable them to thrive and adapt to changing needs.

About KMWorld

KMWorld is the leading information provider serving the Knowledge Management systems market and covers the latest in Content, Document and Knowledge Management, informing more than 21,000 subscribers about the components and processes – and subsequent success stories – that together offer solutions for improving business performance.

KMWorld is a publishing unit of Information Today, Inc.

AI Beyond a Prototype https://enterprise-knowledge.com/beyond-ai-prototypes/ Tue, 11 May 2021 16:00:36 +0000 https://enterprise-knowledge.com/?p=13169 How to take an AI Project Beyond a Prototype Before going “all in,” we often advise our clients to first understand and quickly validate the value proposition for adopting advanced Artificial Intelligence (AI) and Machine learning (ML) solutions within their … Continue reading

How to take an AI Project Beyond a Prototype

Before going “all in,” we often advise our clients to first understand and quickly validate the value proposition for adopting advanced Artificial Intelligence (AI) and Machine Learning (ML) solutions within their organization by engaging in an AI prototype or pilot project. Conducting such targeted experimentation not only gives the enterprise a safe way to validate that AI and ML solutions will solve real problems, but also provides a design foundation for the key AI elements required for the roadmap and supports long-term change management by showing immediate, incremental benefits and building interest.

Without the appropriate guidance and strategy, AI efforts may still get stalled right after a prototype or proof of concept, regardless of how successful these initial efforts may have been. 

Although 84% of executives see the value and agree that they need to integrate and scale AI within their business processes, only 16% of them say that they have actually moved beyond the experimentation phase.

Informed mainly by the diverse set of organizational realities and AI projects we have delivered, below I explore the common roadblocks I see in moving from prototype to enterprise, and provide a selection of approaches that I have found helpful in scaling enterprise AI efforts.

1. Understand that AI projects have unique life cycles

In software delivery, Agile and DevOps continue to serve as successful frameworks for iterative delivery, getting the product closer to the end user or customer and ultimately delivering immediate value. However, Enterprise AI efforts have surfaced the need to revisit Agile delivery within the context of AI and ML processes. For the sponsoring organization and the teams involved, this means that any project management or delivery approach will need to balance the predictable nature of software programming with facilitation and ongoing education about expected machine outcomes for the end user and subject matter expert (SME), as well as the unpredictable number of experimental data ingestion and model refinement cycles required for AI deliverables.

Enterprise AI projects typically have a number of workstreams or task areas that need to run in parallel. These include use case definition, information architecture, data mapping and modeling, integration and pipeline development, the data science workstream where multiple Machine Learning (ML) processes are running, and, of course, the software engineering required to connect with downstream or upstream applications that will render the solution to end users. With all these variables at play, the following approaches help to build a more AI-centric delivery framework:

  • Sprints for data teams are different: While software programming or development is focused on predefined applications or features, the primary focuses for data science and machine learning tasks are analysis, modeling, cleaning, and exploration. In other words, the data is the center of the universe, and the exploration process is what determines the outcome or the features being delivered. Findings from the machine and data exploration phase can send the project back to the planning phase. As such, the data workstream doesn’t necessarily need to work within the same sprint as the development team.

[Figure: AI Project Delivery Iterations – the iterative design process for an AI prototype, from discovery, design, and ideation, to data/ML exploration sprints, to testing and review.]

  • Embed research or “spike” sprints to create room for understanding and data exploration: Unlike humans, machines need to go through diverse sets of data to understand the context within which they are being applied at your organization (a knowledge graph significantly helps in this process) and to align that context with your expected results. This process requires stages of understanding, analysis, and research to identify relevant data. Do your AI projects plan for this research?
  • Embrace testing and quality assurance (QA) from the start: Testing in AI/ML is not limited to the model itself. Ensuring that data quality stays sufficient to serve the use cases, and having the right entry-point checks in place to detect potential data collection errors, are foundational steps before starting on the model (a minimal sketch of such entry-point checks follows this list). Additionally, the QA process in AI and ML projects should account for the ability to test integration points as well as any peripheral systems and processes that serve as inputs or outputs to the model itself. Over time, the integration and automation process used to keep updating and retraining the model becomes another area that will itself require automation.
  • Prepare for organizational impact: When it comes down to implementation, some projects are inherently too big for the project team alone; imagine replacing legacy applications with AI technology and models, for instance. Supporting organization-wide processes need to be in place to ensure your model and delivery are supported throughout strategy, implementation, and adoption, and more players need to be involved than the project team itself.
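To make the entry-point checks described above more tangible, here is a minimal sketch of the kind of validation that might run before any training data reaches the model. The column names, thresholds, and file path are hypothetical and would need to be tailored to a real pipeline.

```python
import pandas as pd

# Hypothetical entry-point checks for a training dataset before model work begins.
REQUIRED_COLUMNS = {"doc_id", "text", "label"}
MAX_NULL_FRACTION = 0.02       # illustrative threshold
MIN_LABEL_COUNT = 50           # illustrative minimum examples per class

def validate_training_data(df: pd.DataFrame) -> list[str]:
    """Return a list of data-quality problems; an empty list means the data passes."""
    problems = []

    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        problems.append(f"Missing required columns: {sorted(missing)}")
        return problems  # further checks depend on these columns

    null_fraction = df[list(REQUIRED_COLUMNS)].isna().mean().max()
    if null_fraction > MAX_NULL_FRACTION:
        problems.append(f"Null fraction {null_fraction:.1%} exceeds {MAX_NULL_FRACTION:.1%}")

    if df["doc_id"].duplicated().any():
        problems.append("Duplicate doc_id values found")

    label_counts = df["label"].value_counts()
    rare = label_counts[label_counts < MIN_LABEL_COUNT]
    if not rare.empty:
        problems.append(f"Labels with too few examples: {rare.to_dict()}")

    return problems

# Example: run the checks and fail fast before any modeling starts.
df = pd.read_csv("training_data.csv")  # hypothetical input file
issues = validate_training_data(df)
if issues:
    raise ValueError("Data collection errors detected: " + "; ".join(issues))
```

Running checks like these at the point of ingestion keeps data problems from surfacing later as mysterious model behavior.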

2. Know what is really being delivered

For machine learning and AI, the product is the algorithm, or the model, not necessarily the accuracy of the results. In other words, if the model is right and is given the right data, it will deliver the intended results; otherwise, garbage in, garbage out. Understanding this dynamic is key when defining acceptance criteria and your minimum viable product. Additionally, leveraging UI/UX resources and wireframing sessions helps explain what the AI tool really is and sets expectations around what it can help stakeholders achieve before they test the tool.

    • AI scope is mostly driven by two factors, use cases and available data: To narrow down AI/ML opportunities and define the initial delivery requirements, we combine top-down discovery and ideation sessions with end users and subject matter experts (SMEs) with bottom-up mapping and review of content, data, and systems. As the project progresses, there will almost always be new developments, findings, and challenges that arise. The key to successfully defining what is really being delivered is building the required flexibility into iteration cycles and update loops, so that end users and SMEs can regularly review exploratory results from the data and ML workstream and provide the context and domain knowledge needed to refine the model based on available datasets.
    • Plan for diverging stakeholder opinions: Machine learning models are better than a human at browsing through thousands of content items and finding recommendations that organizational SMEs may not have thought of. However, your current data may not necessarily capture the topics or the “aboutness” of how your data is used. Encouraging non-technical stakeholders to provide context by participating in the ideation and the acceptance criteria development process is key. You need SMEs to help create a rich semantic layer that captures key business facts and context. However, your stakeholders or SMEs may have their own tacit knowledge and memory of your organization’s content to say what’s good or bad when it comes to your project results. What if the machine uncovers better content for search results that everyone may have forgotten about? And remember, missing results are not necessarily bad because they can help identify the content or data your organization is currently missing.
    • Defining KPIs or ROI for AI projects is an iterative process: It is important to build in the ability to verify that the right solution is being developed and that it is effective. The definitions of the use case, acceptance criteria, and gold standard typically serve as introductory benchmarks to determine how to measure the impact of the solution and its overall success and return (see the evaluation sketch after this list). However, as more training data is added, the model is continually updated and can change significantly over time. Thus, it is important to understand that the initial KPIs will usually carry assumptions that are validated and updated as the solution is incrementally developed and tested. It is also critical to have baseline data in order to compare outcomes with and without ML/AI. Because setting KPIs is a journey, it really boils down to planning for and setting up the right governance and monitoring processes to support continuous re-training of the model and to measure impact frequently.
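As a small illustration of how a gold standard and a baseline can anchor those KPI conversations, the sketch below scores two hypothetical result sets – a keyword-search baseline and the ML-driven recommender – against an SME-curated gold standard using precision and recall. All of the item identifiers are invented.

```python
def precision_recall(predicted: set[str], gold: set[str]) -> tuple[float, float]:
    """Compare a set of predicted items against a gold-standard set."""
    if not predicted or not gold:
        return 0.0, 0.0
    true_positives = len(predicted & gold)
    return true_positives / len(predicted), true_positives / len(gold)

# Hypothetical gold standard curated by SMEs for one query or product.
gold_standard = {"guide-14", "datasheet-2", "video-31", "faq-7"}

# Baseline (e.g., keyword search) vs. the ML-driven recommender.
baseline_results = {"guide-14", "press-release-9", "faq-7"}
model_results = {"guide-14", "datasheet-2", "video-31", "blog-5"}

for name, results in [("baseline", baseline_results), ("model", model_results)]:
    p, r = precision_recall(results, gold_standard)
    print(f"{name}: precision={p:.2f}, recall={r:.2f}")
```

Re-running this comparison as the model is retrained gives a simple, repeatable way to track whether the solution is actually improving on the baseline.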

3. Plan for ancillary (potentially hidden) costs

This is one of the primary areas where AI projects encounter a reality check. If not planned for, these hidden costs can take many forms and cause significant delays or completely stall projects. The following are some of the most common items to consider when planning to scale AI efforts:

  • Size and quality of the right data: AI and ML models learn from large amounts of training data, and in general the larger the dataset, the better the model and its results will perform. This required volume of data introduces challenges, including the need to aggregate and merge data from multiple sources with different security constraints and diverse formats (structured, unstructured, video files, text, images, etc.). It also affects where and how your data and AI project teams spend most of their time, i.e., preparing data for analysis as opposed to building models and developing results. One of the most helpful ways to make such datasets easier to manage is to enhance them with rich, descriptive metadata (see next item) and a data knowledge graph.
  • Data preparation and labeling (taxonomies / metadata): Most organizations do not have labeled data readily available for effective execution of ML and AI projects. If this is not planned or staffed properly, the majority of your resources will be spent annotating or labeling training data. Because this step requires domain knowledge and the use of standards and best practices in knowledge organization systems, organizations will have to invest in formal, standards-based semantic expertise and hybrid automation in order to maintain data quality and consistency across sources (a simple labeling sketch follows this list).
  • Licenses and tools: One of the most common misconceptions about Enterprise AI implementations, and a reason why many AI projects fail, is the assumption that AI is a “single technology” solution. Organizations looking to “plug and play” AI, or who want to experiment with a variety of open source tools, need to reset their expectations and plan for the requirements and actual cost of using these tools, as costs can add up quickly. AI solutions range from data management and orchestration capabilities to metadata storage and, depending on the use case, the ability to push ML model results to upstream or downstream applications.
  • Project team expertise (or lack thereof): Experienced data scientists are required to effectively handle most of the machine learning and AI projects, especially when it comes to defining the success criteria, final delivery, scale, and continuous improvement of the model. Overlooking this foundational need could result in even more costly outcomes, or wasted efforts after producing misleading results or results that aren’t actionable or insightful.
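As one simple illustration of the hybrid automation mentioned above for data labeling, the sketch below applies taxonomy terms to documents via synonym matching and routes weaker matches to human review. The taxonomy, synonyms, and threshold are hypothetical placeholders for a real knowledge organization system.

```python
# Minimal hybrid labeling sketch: auto-tag obvious matches, queue the rest for SMEs.
TAXONOMY = {
    "machine-learning": ["machine learning", "ml model", "classifier"],
    "knowledge-graph": ["knowledge graph", "ontology", "graph database"],
    "search": ["search engine", "findability", "query"],
}

def suggest_tags(text: str) -> dict[str, list[str]]:
    """Return taxonomy terms whose synonyms appear in the text, with the matched phrases."""
    text_lower = text.lower()
    matches = {}
    for term, synonyms in TAXONOMY.items():
        hits = [s for s in synonyms if s in text_lower]
        if hits:
            matches[term] = hits
    return matches

def route_document(doc_id: str, text: str, auto_threshold: int = 2):
    """Auto-apply tags with enough evidence; send weaker matches to human review."""
    suggestions = suggest_tags(text)
    auto = {t for t, hits in suggestions.items() if len(hits) >= auto_threshold}
    review = {t for t in suggestions if t not in auto}
    return {"doc_id": doc_id, "auto_tags": auto, "needs_review": review}

print(route_document(
    "doc-001",
    "This article explains how a knowledge graph and ontology improve search engine findability.",
))
```

Even a simple routing scheme like this keeps SMEs focused on the ambiguous cases rather than labeling every record by hand.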

Closing

The approaches for enabling rapid delivery of AI and driving its adoption continue to evolve. However, the challenges of scale remain, attributable to many factors: selecting the right project management and delivery framework, acquiring the right solutions, instituting foundational data management and governance practices, and finding, hiring, and retaining people with the right skill sets. Ultimately, enterprise leaders need to understand how AI and Machine Learning work and what AI really delivers for the organization. The good news is that, if built on the right foundations, a given AI solution can be reused for multiple use cases, connect diverse data sources, cross organizational silos, and continue to deliver on the hype.

How’s your organization tracking? Find out if your organization has the right foundations to take AI to production or email us to learn more about our experience and how we can help.
