Expert Analysis: Top 5 Considerations When Building a Modern Knowledge Portal

Knowledge Portals aggregate and present various types of content – including unstructured content, structured data, and connections to people and enterprise resources. This facilitates the creation of new knowledge and discovery of existing information.

The following article highlights five key factors that design and implementation teams should consider when building a Knowledge Portal for their organizations.

Kate Erfle and Gui Galdamez

Sources of Truth

Guillermo Galdamez

We define ‘sources of truth’ as the various systems responsible for generating data, recording transactions, or storing key documents about the vital business processes of an organization. These systems are fundamental to the day-to-day operations and long-term strategic objectives of the business. 

In a modern enterprise, the systems supporting diverse business processes can number in the dozens, if not hundreds, depending on the organization’s size. However, from the business perspective of a Knowledge Portal implementation, it is critical to prioritize integrations with each source based on appropriate criteria. Drawing from our experience, we’ve identified three key factors that Knowledge Portal leaders should consider:

  • Business value. The source must contain data that is fundamental to both the business and to the Knowledge Portal’s objectives, aligning with the users’ expectations.
  • Data readiness. The data within the source should be in a state ready for ingestion and aggregation (more on this in the next section). 
  • Technical readiness. It may sound obvious, but the source systems need to be capable of providing data to the Knowledge Portal. In some cases, a source system might be under active development (and not yet operational), or it might have limited functionality for exporting the necessary data for the Knowledge Portal’s use cases.

Kate Erfle

A Knowledge Portal should draw from well-defined information sources that are recognized as being authoritative and trusted. A Knowledge Portal isn’t intended to act as the “source of truth” itself, but rather to aggregate and meaningfully connect data sources and repositories, providing a cohesive “view of truth.”

As Guillermo pointed out, there are several key data and technical readiness factors to consider when integrating source systems within a Knowledge Portal ecosystem. For a successful implementation, the source systems should meet the following technical criteria:

  • The data must be clean, consistent, and standardized (more on this in the next section).
  • The data should be accessible in a standard, compatible format, either via an API or manual export.
  • The data must be protected by necessary and appropriate security measures, or it should provide data points that can be used to implement and enforce these security measures.

Once a data source meets the established criteria for quality, import/export capabilities, and security, it becomes eligible for integration with the Knowledge Portal. Within the portal, users may be able to create or update content, but the data source remains its own “source of truth.” Any changes made within the Knowledge Portal should therefore be written back to the corresponding source system, and the design and implementation of the Portal must account for the impact of user actions so that the source systems’ data remains consistent, accurate, and reliable.

Information Quality


Guillermo Galdamez

One of the most common issues I encounter when talking to our clients is the perception that their data and unstructured content are unreliable. This could be due to various issues: the data might be incomplete, duplicative, outdated, or just plain wrong. As a result, employees can spend hours not only searching for information and data but also tracking down people who can confirm its reliability and usability.

In discussing content and data quality, one of the foundational steps is taking inventory of the ‘stuff’ contained within your prioritized sources of truth. Though the maxim “You can’t manage what you can’t measure” has often sparked debate about its merits, this is one occasion where it is notably relevant. It is important for the implementation team, as well as the business, to have visibility into the data and content it intends to ingest and display through the Knowledge Portal. Performing a content analysis is key to providing the Knowledge Portal team with the information they need to ensure that the information provided by the Knowledge Portal is consistent, reliable, and timely.

A content inventory and audit often reveal areas where data and content need to be remediated, migrated, archived, or disposed of. The Knowledge Portal team should take this opportunity to perform data and content cleanup. During development, the Knowledge Portal team can collaborate with various teams to improve data and content quality. Even following its launch, the Portal, by aggregating and presenting information in new ways, can reveal gaps or inconsistencies across its sources. It is helpful to define feedback mechanisms between users, the Knowledge Portal team, and data and content owners so that instances where data and content need maintenance can be addressed.

Gaining and sustaining user trust is crucial for Knowledge Portals. Users will continuously visit the Portal as long as they perceive that it solves their previous challenges. If the Portal becomes a new ‘junk drawer’ for data, engagement will decline rapidly. To avoid this, implement a strong change management and communications strategy to continually remind users about the Portal’s capabilities and value.

Kate Erfle

Maintaining high-quality data and content is crucial for a Knowledge Portal’s success. As Guillermo stated, the implementation phase of a Knowledge Portal offers the perfect opportunity for data cleanup.

To begin, it’s important for individual system owners and administrators to do what is feasible within their systems to ensure high-quality data. Before it’s provided to the Portal, several transformation and cleaning steps can be applied directly to the source system data. The Knowledge Portal implementation team should collaborate closely with the various data repository teams to ensure the required data fields are standardized, cleaned, and validated before being exported. By working together, these teams can assess the current state of the data, identify missing fields, spot discrepancies, and address inconsistencies.

If the data from source systems still contains imperfections, a few remediation strategies can be applied to prepare it for integration (a brief sketch of the first two follows the list):

  • Removing Placeholder or Dummy Data: If the data source team is unable to remediate placeholder or dummy data, the Portal team can compile a list of these “dummy values” and remove them entirely. Displaying a field as “empty” is preferable to showing a fake or false value.
  • Normalizing Terms with a Controlled Vocabulary: In cases where the source system lacks a controlled vocabulary, the Portal team can align certain data fields with the Knowledge Portal’s taxonomy and ontology. This involves using synonyms to map various representations of the same concept to a single preferred term.
  • Enforcing Data Standards through APIs: The Portal team’s APIs can be configured to expect and enforce specific data standards, models, and types, ensuring that only accurately conforming data is ingested and displayed to the end user. Such enforcement can also highlight required fields and alert data teams when essential data is missing, which increases visibility into the underlying issues associated with bad data.
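
To make the first two strategies concrete, here is a minimal Python sketch; the placeholder values and controlled vocabulary below are invented for illustration and would, in practice, come from your own data profiling and taxonomy:

from typing import Optional

# Hypothetical placeholder values and controlled vocabulary, for illustration only
DUMMY_VALUES = {"N/A", "TBD", "placeholder", "xxx"}
CONTROLLED_VOCABULARY = {
    "hr": "Human Resources",
    "human resources": "Human Resources",
    "people ops": "Human Resources",
}

def clean_field(raw_value: str) -> Optional[str]:
    """Drop placeholder values and normalize the rest against the controlled vocabulary."""
    value = raw_value.strip()
    if not value or value in DUMMY_VALUES:
        return None  # displaying an empty field is preferable to showing a fake value
    return CONTROLLED_VOCABULARY.get(value.lower(), value)

print(clean_field("people ops"))  # Human Resources
print(clean_field("TBD"))         # None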

Guillermo emphasized the importance of remedying data issues to build and maintain user trust and buy-in. Effectively addressing bad data is also critical to avoid significant issues:

  • Preventing Unauthorized Access to Information: Without proper security measures and clear definitions of user identities and access rights, there’s a high risk of sensitive information being exposed. The data needs to clearly indicate who should be granted access, and users need to be uniquely and consistently identifiable across systems.
  • Ensuring Full Functionality of the Knowledge Portal: If data is incomplete or untrustworthy, it impedes the use of advanced capabilities and functionalities of the Knowledge Portal. Reliable data is foundational for seeing its full potential.

Business Meaning and Context


Guillermo Galdamez

As mentioned earlier, Knowledge Portals aggregate information from diverse sources and present it to users, introducing a new capability to the organization. It’s essential for the Knowledge Portal team to fully understand the data and information being presented to the users. This includes knowing its significance and business value, its origin, how it is generated, and its connection to other business processes. Keep in mind that this information is seldom presented to users all at once, so they will likely face a learning curve to utilize the Knowledge Portal effectively. This challenge can be mitigated through thoughtful design, change management, training, and communication. 

Designs for a Knowledge Portal need to strategically organize different information elements. This involves not only prioritizing these elements based on relative importance, but also ensuring they align with business logic and are linked to related data, information, and people. In other words, the design needs to be understandable to all intended users at a single glance. Achieving this requires clear, prioritized use cases tailored to the Knowledge Portal’s audiences, combined with thorough user research that informs user expectations. With this knowledge, it becomes easier to design with user needs and objectives in mind so that the Portal fits more seamlessly into users’ daily workflows and activities.

Effective change management, training, and communications help reinforce the purpose and the value of a Knowledge Portal, which might not always be intuitive to everyone across the organization; some users may be resistant to change, preferring to stick to familiar routines. It’s crucial for the Knowledge Portal team to understand these users’ motivations, their hesitations, and what they value. Clearly articulating the individual benefits users will gain from the Portal, setting clear expectations, and providing guidance on using the Portal successfully are crucial for new users to adopt it and appreciate its value in their work.

Kate Erfle

It is essential to provide context to the information available on the portal, especially within a specific business or industry setting. This involves adding metadata, descriptions, and categorizations to data, allowing siloed, disconnected information to be associated and helping users discover content relevant to their needs quickly and efficiently. 

A robust metadata system and a well-defined taxonomy can aid in organizing and presenting content in a meaningful way. It’s important to evaluate the current state of existing taxonomies and controlled vocabularies across each source system, as well as to assess the prevalence and consistency of metadata applied to content within these systems. These evaluations help determine the level of effort required to standardize and connect content effectively. To obtain the full benefits of a Knowledge Portal–creating an Enterprise 360 view of the organization’s assets, knowledge, and data–this content needs to be well-defined, categorized, and described.

Security and Governance


Guillermo Galdamez

One of the most common motivations driving the implementation of Knowledge Portals is the user’s need to quickly find specific information required for their work. However, users often overlook the equally important aspect of securing this information. 

Often, information is shared through unsecured channels like email, chat, or other common communication methods at users’ disposal. This approach places the responsibility entirely on the sender to ascertain and decide if a recipient is authorized to receive the information. Sometimes senders mistakenly send information to the wrong person, or they may need additional time to verify the recipient’s access rights. Furthermore, senders may need to redact parts of the information that the recipient isn’t permitted to see, which adds another time-consuming step. 

The Knowledge Portal implementation must address this organizational challenge. At times, the Knowledge Portal team will need to guide the organization in establishing a clear framework for access control. This is especially necessary when the Knowledge Portal creates new types of information and data by aggregating, repackaging, and delivering them to users.

Kate Erfle

Security and governance are paramount in the construction of a Knowledge Portal. They profoundly influence various implementation details and are critical for ensuring the confidentiality, integrity, and availability of information within the portal.

The first major piece of security and governance is user authentication, which involves verifying a user’s identity. Several options for implementing user authentication include traditional username and password, Multi-Factor Authentication (MFA), and Single Sign-On (SSO). These choices will be influenced by the existing authentication and identity management systems in use within the client organization. Solidifying these design decisions early in the architecting process is critical as they affect many facets of the portal’s implementation.

The second major piece of security and governance is user authorization, which involves granting users permission to access specific resources based on their identity, as established through user authentication. Multiple authorization models may be necessary based on the level of fine-grained access control required. Popular models include: 

  • Role-Based Access Control (RBAC): This model involves defining roles (e.g., admin, user, manager) and assigning specific permissions to each. Users are then assigned to these roles, which determine their access level.
  • Attribute-Based Access Control (ABAC): In this model, access decisions are based on user attributes (e.g., department, location, job title), with policies that specify the conditions for access.

Depending on the organization’s use case, one or a combination of these may be used to manage user access and ensure sensitive data is secured. The difficulty and complexity of the implementation will be directly correlated with the current and target state of identity and security management across the organization, as well as the breadth and depth of data classification applied to the organization’s data.
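
As a simplified illustration of how these two models differ in code, consider the sketch below; the roles, attributes, and policy are hypothetical examples rather than a prescribed implementation:

# Hypothetical roles, permissions, and attributes, for illustration only
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "manager": {"read", "write"},
    "user": {"read"},
}

def rbac_allows(role: str, action: str) -> bool:
    """RBAC: a user's permissions are determined entirely by their assigned role."""
    return action in ROLE_PERMISSIONS.get(role, set())

def abac_allows(user_attrs: dict, resource_attrs: dict) -> bool:
    """ABAC: a policy compares user attributes with resource attributes."""
    # Example policy: users may only access documents owned by their own department
    return user_attrs.get("department") == resource_attrs.get("department")

print(rbac_allows("manager", "delete"))                                   # False
print(abac_allows({"department": "Finance"}, {"department": "Finance"}))  # True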

Information Seeking and Action


Guillermo Galdamez

Knowledge Portal users will approach their quest for information in a variety of ways. Users may prefer to browse through content during exploratory sessions, or they may leverage search when they know precisely what they need. Often, users employ a combination of these approaches depending on their specific needs for data or content. 

For instance, in a recent Knowledge Portal project, our user research revealed that individuals rarely searched for documents directly. Instead, they searched for various business entities and then browsed through related documents. This prompted the team to reevaluate the prioritization of documents in search results and the necessary data points that should be displayed alongside these documents to provide meaningful context.

In summary, having a strong user research strategy is essential to understand what type of data and information users are seeking, their reasons for needing it, their subsequent use of it, and how this supports the broader organization’s processes and objectives.

Kate Erfle

Knowledge Portals are designed to provide users with access to a broader range of information and resources than is available in any single source system, and they should help users both find necessary information and take meaningful actions based on that information.

Information Seeking Involves:

  • Search Functionality: A robust search engine matches user queries to the most relevant content. This involves keyword relevance, search and ranking algorithms, and user-specific parameters. Tailoring these parameters to the organization’s specific business use cases improves search accuracy. The incorporation of taxonomies and ontologies for content categorization, tagging, and filtering further refines search results, aligning them with organization-specific terminology, and enables users to sift through results using familiar business vernacular.
  • Browsing and Navigation: Well-structured categories, facets, menus, and user-friendly navigation features help users discover not just the information they directly seek, but also related, relevant content they may not have anticipated. This can be done through various interfaces, including mobile applications, enhancing the portal’s accessibility and user interaction.
  • Dynamic Content Aggregation and Personalization: A standout feature of Knowledge Portals is their ability to aggregate data from various sources into a single page, which can be highly personalized. For instance, a project aggregator page might include sections on related projects, prioritizing those relevant to the user’s department.

Action Involves:

  • Integration with Source Systems or Applications: Providing seamless links to source systems within the Knowledge Portal allows users to easily find content and perform CRUD (Create, Read, Update, Delete) operations on the original content.
  • Task Support: Tools for document generation, data visualization, workflow automation, and more, assist users in their daily tasks and enable them to make the most of source data and optimize business processes.
  • Learning and Performance Support: Dynamic content and interactive features encourage users to actively engage with content, which strengthens their comprehension and absorption of information.
  • Feedback Mechanism: Enabling users to contribute feedback on content and documents within the portal fosters continuous improvement and enhances the portal’s effectiveness over time.

Closing

The business and technical considerations outlined here are essential for creating a Knowledge Portal that intuitively delivers information to its users. Keep in mind that these considerations are interconnected, and a well-designed Knowledge Portal should strike a balance between them to provide users with a seamless and enriching experience. Should your organization aspire to implement a Knowledge Portal, our team of experts can guide you through these challenges and intricacies, ensuring a successful deployment.

Knowledge Portal Architecture Explained

In today’s data-driven world, the need for efficient knowledge management and dissemination has never been more critical. Users are faced with an overwhelming amount of content and information, and thus need an efficient, intuitive, and structured way to retrieve it. Additionally, organizational knowledge is often inconsistent, incomplete, and dispersed among various systems.

The solution? A Knowledge Portal: a dynamic and interconnected system designed to transform the way we manage, access, and leverage knowledge. This provides users with a comprehensive Enterprise 360 view of all of the information they need to successfully do their jobs. At its core, a Knowledge Portal consists of five components: Web UI, API Layer, Enterprise Search Engine, Knowledge Graph, and Taxonomy/Ontology Management System. 

A Knowledge Portal consists of five main components, described below:

  1. Web UI: Provides users with a way to interact with the portal’s content, incorporating features such as search functionality, aggregation pages, and navigation menus.
  2. API Layer: Serves consolidated, streamlined information to the Web UI via various endpoints, and enables other client applications to integrate with and consume the connected, cleaned Knowledge Portal content.
  3. Enterprise Search Engine: Indexes and retrieves relevant information to display in the Knowledge Portal based on user queries, allowing relevant results from all integrated enterprise repositories to be discovered in the Portal.
  4. Knowledge Graph: Represents the structure and connections of the organization’s knowledge, capturing concepts, entities, attributes, and their relationships in a graph database format. Enhances search results by providing contextual information and connected content.
  5. Taxonomy and Ontology Manager: Defines and maintains controlled vocabularies, taxonomies, and ontologies, which allow for consistent and relevant metadata tagging and content organization. Ensures search precision and accuracy.

The diagram below displays how these five components interact within the context of an Enterprise Knowledge Portal implementation.

This diagram displays how the components of a Knowledge Portal interact with one another. At the bottom of the diagram, there are various data repositories, content management systems, and other enterprise data stores. Content from these repositories will be indexed by the Enterprise Search Engine and categorized/tagged by the Taxonomy and Ontology Manager. The tagged/categorized content will be ingested into the Knowledge Graph where it can be associated and linked to more organizational knowledge. The search engine can also index content from the Knowledge Graph. Then the backend API layer exposes and serves this tagged, indexed content from the Search Engine and Knowledge Graph. The API layer can be leveraged by various existing or future client applications. For the Knowledge Portal specifically, the API Layer serves content to the Knowledge Portal Web UI, which ultimately provides the end user an Enterprise 360 view of their organization’s content and knowledge.
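
To make the interaction between the API layer, the search engine, and the knowledge graph more concrete, here is a minimal, hypothetical Python sketch; the entities, function names, and stubbed search call are illustrative assumptions, not a reference implementation of a Knowledge Portal:

from rdflib import Graph, Literal, Namespace

EG = Namespace("http://example.com/")

# A tiny in-memory stand-in for the portal's knowledge graph
graph = Graph()
graph.add((EG["Project_Alpha"], EG["title"], Literal("Project Alpha")))
graph.add((EG["Project_Alpha"], EG["hasOwner"], EG["Lisa_Lu"]))

def search_engine_results(query: str) -> list:
    """Stand-in for the Enterprise Search Engine; a real deployment would query
    Elasticsearch, Solr, or a comparable index here."""
    return [EG["Project_Alpha"]] if "alpha" in query.lower() else []

def portal_api_search(query: str) -> list:
    """Illustrative API-layer endpoint: merge search hits with knowledge graph
    context before returning them to the Web UI."""
    response = []
    for entity in search_engine_results(query):
        # Enrich each search hit with connected facts from the knowledge graph
        context = {str(p): str(o) for _, p, o in graph.triples((entity, None, None))}
        response.append({"entity": str(entity), "context": context})
    return response

print(portal_api_search("alpha"))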

Collectively, these components create a unified platform, empowering both organizations and individuals to discover information, break down organizational silos, and make informed decisions. 

EK has expertise in Knowledge Portal implementations, and we would love to help you take the next step on your knowledge management journey. Please contact us for more information.

Special thank you to Adam Eltarhoni for his contributions to this infographic! 

Transforming Tabular Data into Personalized, Componentized Content using Knowledge Graphs in Python

My colleagues Joe Hilger and Neil Quinn recently wrote blogs highlighting the benefits of leveraging a knowledge graph in tandem with a componentized content management system (CCMS) to curate personalized content for users. Hilger set the stage by explaining the business value of a personalized digital experience and the logistics of these two technologies supporting one another to create it. Quinn made these concepts more tangible by processing sample data into a knowledge graph in Python and querying the graph to find tailored information for a particular user. This post will again show the creation and querying of a knowledge graph in Python; however, the same sample data will now be sourced from external CSV files.

A Quick Note on CSVs

CSV files, or comma-separated values files, are widely used to store tabular data. If your company uses spreadsheet applications, such as Microsoft Excel or Google Sheets, or relational databases, then it is likely you have encountered CSV files before. This post will help you take the CSV-formatted data that already exists throughout your company, transform it into a usable knowledge graph, and resurface relevant pieces of information to users in a CCMS. Although this example uses CSV files as the tabular dataset format, the same principles apply to Excel sheets and SQL tables alike.

Aggregating Data

The diagram below is a visual model of the knowledge graph we will create from data in our example CSV files.

Diagram showing customers, products and parts

In order to populate this graph, just as in Quinn’s blog, we will begin with three sets of data about:

  • Customers and the products they own
  • Products and the parts they are composed of
  • Parts and the actions that need to be taken on them

This information is stored in three CSV files, Customer_Data.csv, Product_Data.csv and Part_Data.csv:

Customers

Customer ID   Customer Name   Owns Product
1             Stephen Smith   Product A
2             Lisa Lu         Product A

Products

Product ID   Product Name   Composed of
1            Product A      Part X
1            Product A      Part Y
1            Product A      Part Z

Parts

Part ID   Part Name   Action
1         Part X
2         Part Y
3         Part Z      Recall

To create a knowledge graph from these tables, we will need to:

  • Read the data tables from our CSV files into DataFrames (an object representing a 2-D data structure, such as a spreadsheet or table)
  • Transform the DataFrames into RDF triples and add them to the graph

In order to accomplish these two tasks, we will be utilizing two Python libraries. Pandas, a data analysis and manipulation library, will help us read our CSV files into DataFrames, and rdflib, a library for working with RDF data, will allow us to create RDF triples from the data in our DataFrames.

Reading CSV Data

This first task is quite easy to accomplish using pandas. Pandas has a read_csv method for ingesting CSV data into a DataFrame. For this use case, we only need to provide two parameters: the CSV’s file path and the number of rows to read. To read the Customers table from our Customer_Data.csv file:

import pandas as pd

customer_data = pd.read_csv("Customer_Data.csv", nrows=2)

The value of customer_data is:

       Customer ID      Customer Name     Owns Product
0                1      Stephen Smith        Product A
1                2            Lisa Lu        Product A

We repeat this process for the Products and Parts files, altering the filepath_or_buffer and nrows parameters to reflect the respective file’s location and table size.
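
For reference, the corresponding calls for the other two files (assuming the file names shown earlier and the three-row tables above) would look something like this, producing the product_data and part_data DataFrames used later on:

product_data = pd.read_csv("Product_Data.csv", nrows=3)
part_data = pd.read_csv("Part_Data.csv", nrows=3)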

Tabular to RDF

Now that we have our tabular data stored in DataFrame variables, we are going to use rdflib to create subject-predicate-object triples for each column/row entry in the three DataFrames. I would recommend reading Quinn’s blog prior to this one as I am following the methods and conventions that he explains in his post. 

Utilizing the Namespace module will provide us a shorthand for creating URIs, and the create_eg_uri method will url-encode our data values.

from rdflib import Namespace, URIRef
from urllib.parse import quote

EG = Namespace("http://example.com/")

def create_eg_uri(name: str) -> URIRef:
    """Take a string and return a valid example.com URI"""
    quoted = quote(name.replace(" ", "_"))
    return EG[quoted]

The columns in our data tables will need to be mapped to predicates in our graph. For example, the Owns Product column in the Customers table will map to the http://example.com/owns predicate in our graph. We must define the column to predicate mappings for each of our tables before diving into the DataFrame transformations. Additionally, each mapping object contains a “uri” field which indicates the column to use when creating the unique identifier for an object.

customer_mapping = {
    "uri": "Customer Name",
    "Customer ID": create_eg_uri("customerId"),
    "Customer Name": create_eg_uri("customerName"),
    "Owns Product": create_eg_uri("owns"),
}

product_mapping = {
    "uri": "Product Name",
    "Product ID": create_eg_uri("productId"),
    "Product Name": create_eg_uri("productName"),
    "Composed of": create_eg_uri("isComposedOf"),
}

part_mapping = {
    "uri": "Part Name",
    "Part ID": create_eg_uri("partId"),
    "Part Name": create_eg_uri("partName"),
    "Action": create_eg_uri("needs"),
}

uri_objects = ["Owns Product", "Composed of", "Action"]

The uri_objects variable created above indicates which columns from the three data tables should have their values parsed as URI References, rather than Literals. For example, Composed of maps to a Part object. We want to make the <Part> object in the triple EG:Product_A EG:isComposedOf <Part> a URI pointing to a particular Part, not just the string name of the Part. The Product Name column, by contrast, creates triples such as EG:Product_A EG:productName “name”, where “name” is simply a string, i.e. a Literal, and not a reference to another object.

Now, using all of the variables and methods declared above, we can begin the translation from DataFrame to RDF. For the purposes of this example, we create a global graph variable and a reusable translate_df_to_rdf function which we will call for each of the three DataFrames. With each call to the translate function, all triples for that particular table are added to the graph.

from rdflib import URIRef, Graph, Literal
import pandas as pd

graph = Graph()

def translate_df_to_rdf(customer_data, customer_mapping):
    # Counter variable representing current row in the table
    i = 0
    num_rows = len(customer_data.index)

    # For each row in the table
    while i < num_rows:
        # Create URI subject for triples in this row using ‘Name’ column
        name = customer_data.loc[i, customer_mapping["uri"]]
        row_uri = create_eg_uri(name)

        # For each column/predicate mapping in mapping dictionary
        for column_name, predicate in customer_mapping.items():
            # Skip the "uri" entry; it names the subject column, not a predicate
            if column_name == "uri":
                continue

            # Grab the value at this specific row/column entry
            value = customer_data.loc[i, column_name]

            # Strip extra characters from value
            if isinstance(value, str):
                value = value.strip()

            # Check if the value exists
            if not pd.isnull(value):
                # Determine if object should be a URI or Literal
                if column_name in uri_objects:
                    # Create URI object and add triple to graph
                    uri_value = create_eg_uri(value)
                    graph.add((row_uri, predicate, uri_value))
                else:
                    # Create Literal object and add triple to graph
                    graph.add((row_uri, predicate, Literal(value)))
        i = i + 1

In this case, we make three calls to translate_df_to_rdf, one for each DataFrame:

translate_df_to_rdf(customer_data, customer_mapping)
translate_df_to_rdf(product_data, product_mapping)
translate_df_to_rdf(part_data, part_mapping)

Querying the Graph

Now that our graph is populated with the Customers, Products, and Parts data, we can query it for personalized content of our choosing. So, if we want to find all customers who own products that are composed of parts that need a recall, we can create and use the same query from Quinn’s previous blog:

sparql_query = """SELECT ?customer ?product
WHERE {
  ?customer eg:owns ?product .
  ?product eg:isComposedOf ?part .
  ?part eg:needs eg:Recall .
}"""

results = graph.query(sparql_query, initNs={"eg": EG})
for row in results:
    print(row)

As you would expect, the results printed in the console are two ?customer ?product pairings:

(rdflib.term.URIRef('http://example.com/Stephen_Smith'), rdflib.term.URIRef('http://example.com/Product_A'))
(rdflib.term.URIRef('http://example.com/Lisa_Lu'), rdflib.term.URIRef('http://example.com/Product_A'))
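
If a more human-readable list is needed, for example to hand off to a notification workflow in the CCMS, a small post-processing step (purely illustrative) can strip the namespace from each URI:

for customer, product in results:
    customer_name = str(customer).replace(str(EG), "").replace("_", " ")
    product_name = str(product).replace(str(EG), "").replace("_", " ")
    print(f"Notify {customer_name} about the recall affecting {product_name}")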

Summary

By transforming our CSV files into RDF triples, we created a centralized, connected graph of information, enabling the simple retrieval of very granular and case-specific data. In this case, we simply traversed the relationships in our graph between Customers, Products, Parts, and Actions to determine which Customers needed to be notified of a recall. In practice, these concepts can be expanded to meet any personalization needs for your organization.

Knowledge Graphs are an integral part of serving up targeted, useful information via a Componentized Content Management System, and your organization doesn’t need to start from scratch. CSVs and tabular data can easily be transformed into RDF and aggregated as the foundation for your organization’s Knowledge Graph. If you are interested in transforming your data into RDF and want help planning or implementing a transformation effort, contact us here.
