The Developer’s Guide: How to Build a Knowledge Graph
Table of Contents
MATCH Clause
Next Steps
Supply Chain
Entity Resolution
GenAI
This guide walks you through everything you need to know to build your first knowledge graph. You’ll learn core concepts and how to think about modeling data with relationships. Then, you’ll set up your own knowledge graph and start querying it to answer questions that you can’t answer in a “traditional database.”
Figure 1. Knowledge graph example
Knowledge graphs surface hidden patterns through connections in data. For instance, a medication manufacturer depends on a supplier network for product components. A knowledge graph could reveal that several key suppliers are located in a hurricane-prone region, a risk that might go unnoticed in a traditional database.
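To make the supplier example concrete, here is a minimal in-memory sketch in Python of following "depends on" relationships from a manufacturer to find suppliers in a hurricane-prone region. The company names, regions, and the `suppliers_at_risk` helper are all hypothetical and only illustrate the traversal idea; this is not how a graph database stores or queries data.

```python
from collections import deque

# Hypothetical toy data: each company maps to the suppliers it depends on.
DEPENDS_ON = {
    "MedCo": ["CapsuleWorks", "ActiveChem"],
    "CapsuleWorks": ["GelatinCo"],
    "ActiveChem": ["SolventSupply"],
    "GelatinCo": [],
    "SolventSupply": [],
}

# Hypothetical region property for each supplier.
REGION = {
    "CapsuleWorks": "Midwest",
    "ActiveChem": "Gulf Coast",
    "GelatinCo": "Gulf Coast",
    "SolventSupply": "Pacific Northwest",
}

# Assumption for this sketch: which regions count as hurricane-prone.
HURRICANE_PRONE = {"Gulf Coast"}


def suppliers_at_risk(manufacturer):
    """Breadth-first walk over DEPENDS_ON, collecting suppliers in risky regions."""
    seen = {manufacturer}
    queue = deque([manufacturer])
    at_risk = []
    while queue:
        current = queue.popleft()
        for supplier in DEPENDS_ON.get(current, []):
            if supplier in seen:
                continue
            seen.add(supplier)
            queue.append(supplier)
            if REGION.get(supplier) in HURRICANE_PRONE:
                at_risk.append(supplier)
    return at_risk


print(suppliers_at_risk("MedCo"))  # ['ActiveChem', 'GelatinCo']
```

Note that GelatinCo is an indirect supplier: it only shows up because the walk follows relationships more than one step away from the manufacturer.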
Figure 2. Healthcare example: nodes
Labels identify nodes by role or type, serving as a classifier or tag that defines their function or purpose in your domain. They add semantic meaning to nodes, making the graph more intuitive to understand and query. When you specify a label in your query, it helps the graph database find the type of node you’re looking for.
In the healthcare example, the “Patient” node could have properties like name, date of birth, and contact information, while the “Disease” node could include properties like name and description.
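As a rough illustration of labels and properties, the Python sketch below models nodes as small objects with a set of labels and a dictionary of properties, then selects nodes by label. The `Node` class, the `nodes_with_label` helper, and the sample patients are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with one or more labels and a dictionary of properties."""
    labels: set
    properties: dict = field(default_factory=dict)


# Hypothetical sample nodes for the healthcare example.
nodes = [
    Node({"Patient"}, {"name": "Jane Doe", "dateOfBirth": "1980-04-12",
                       "contact": "jane@example.com"}),
    Node({"Patient"}, {"name": "John Roe", "dateOfBirth": "1975-09-30",
                       "contact": "john@example.com"}),
    Node({"Disease"}, {"name": "Asthma",
                       "description": "Chronic inflammation of the airways"}),
]


def nodes_with_label(all_nodes, label):
    """Return only the nodes carrying the given label."""
    return [n for n in all_nodes if label in n.labels]


patients = nodes_with_label(nodes, "Patient")
print([p.properties["name"] for p in patients])  # ['Jane Doe', 'John Roe']
```

Filtering by label first narrows the search to one type of node, which is the same role a label plays when you include it in a query.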
easier to understand and enables more efficient querying, analysis, and inference across different levels of detail.
In the healthcare knowledge graph, diseases could be organized into categories (such as cardiovascular or respiratory diseases), while patients could be grouped by risk factors or age ranges. This structure enables analysis at various levels, from individual patient-disease relationships to broader population health trends.
another layer: unstructured data. This is where the magic starts to happen in a knowledge graph: the ability to add new and different types of data and then query relationships across all the data.
Sign Up for a Neo4j Account
You’ll build your knowledge graph on the cloud-hosted, fully managed Neo4j AuraDB Graph Database. Neo4j stores data as nodes and relationships, supports the Cypher graph query language, and offers tools for data visualization, data science, and data connectors. Before using AuraDB, you’ll need an account.
haven’t already, navigate to the Aura Console and log in. Then:
2. You’ll see a list of instance types, with the Professional tier (center option) highlighted by default. AuraDB Free is a great way to start learning and exploring knowledge graphs. When you’re ready to move to production-quality, high-performance applications in the cloud, you can progress to AuraDB Professional. We’ll use the Free instance for our knowledge graph, so click the Select button at the bottom of the Free tier.
complete, you can move to the next section, where you’ll design a graph data model for importing data.
Figure 10. Data import screen
Click New data source in the middle:
To start designing the model, click Add node label:
1. Type Customer as the label in the Name field. This label will identify the type of entity these nodes represent in the graph.
Figure 25. Create “Product” node screen
Next, create a new relationship type by drawing a line from the “Order” node to the “Product” node. Name this new relationship type “ORDERS”:
Figure 27. Create “Supplier” node type screen
Your knowledge graph model now has four node types and three relationship types, but it still lacks an organizing principle. In the Northwind example, your organizing principle could be a product hierarchy that streamlines product group searches. As another option, you could choose a process-based principle around the order fulfillment stages to optimize the supply chain and delivery network.
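To show how a product hierarchy can streamline product group searches, here is a small Python sketch. It assumes each product or category has at most one parent, stored as a PART_OF link, and the sample products, the “Food and Drink” parent group, and the helper names are hypothetical illustrations rather than the guide's data model.

```python
# Hypothetical PART_OF links: product -> category, category -> parent group.
PART_OF = {
    "Chai": "Beverages",
    "Ipoh Coffee": "Beverages",
    "Tofu": "Produce",
    "Beverages": "Food and Drink",
    "Produce": "Food and Drink",
}

# Hypothetical set of leaf items that are products rather than categories.
PRODUCTS = {"Chai", "Ipoh Coffee", "Tofu"}


def belongs_to(item, group):
    """Follow PART_OF links upward from item until group is found or the chain ends."""
    current = PART_OF.get(item)
    while current is not None:
        if current == group:
            return True
        current = PART_OF.get(current)
    return False


def products_in(group):
    """Return all products that sit anywhere below the given group."""
    return sorted(p for p in PRODUCTS if belongs_to(p, group))


print(products_in("Beverages"))       # ['Chai', 'Ipoh Coffee']
print(products_in("Food and Drink"))  # ['Chai', 'Ipoh Coffee', 'Tofu']
```

The same search works at any level of the hierarchy, which is the practical benefit of choosing an organizing principle up front.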
• categories.csv
• customers.csv
• order-details.csv
• orders.csv
• products.csv
• suppliers.csv
Figure 31. Drag & Drop and browse support CSV selection screen
Here’s what you should see after uploading the files (properties collapsed):
Next, scroll down to the Properties section. It shows a list of the properties you defined earlier in your graph model. Since you used the same naming convention as the GitHub repository, the property names in your graph model will match the field names in the CSV files, which simplifies the mapping process. Though you can map each property from your model to the CSV field manually using the drop-downs, a simpler option is to click Map from table just above the properties, choose which columns from the CSV files to map, and click Confirm:
Figure 35. Complete mappings screen
After completing the mapping process for all elements of your knowledge graph, you’re ready to populate the database.
Click the Run import button to load your data:
This action starts the import process. You’ll see a progress bar indicating the status of the import. Once complete, a pop-up window will display the import results. The window provides a quick overview of the import process outcome and lets you verify whether the data was successfully imported into your knowledge graph.
Figure 37. Import results screen
Click the X to close the pop-up window.
Now that the data is imported into the database, you can use queries to understand behaviors and patterns in the data.
Cypher expresses graph patterns in a way that resembles how they’re drawn on a whiteboard. For instance, a statement like “customer orders product” can be represented in Cypher as:
(c:Customer)-[r:ORDERS]->(p:Product)
The next sections cover the three most important Cypher clauses you’ll need to write queries and interact with your knowledge graph:
1. MATCH finds and returns the nodes or patterns specified.
2. CREATE adds new nodes or patterns specified to the graph.
MATCH Clause
MATCH (p:Product)-[rel:PART_OF]->(c:Category {categoryName: "Beverages"})
RETURN p, rel, c;
• MATCH (p:Product)-[rel:PART_OF]->(c:Category {categoryName: "Beverages"}) finds “Product” nodes (variable p) that have a PART_OF relationship (variable rel) to a “Category” node (variable c). The “Category” node is filtered to only match where categoryName is “Beverages.”
• RETURN p, rel, c specifies the data to be returned from the matched pattern: products (p), PART_OF relationships (rel), and categories (c):
Figure 41. Sample output screen
// Customers who ordered "Ipoh Coffee", with their order counts and order IDs
MATCH (c:Customer)-[r1:PURCHASED]->(o:Order)
      -[r2:ORDERS]->(p:Product {productName: "Ipoh Coffee"})
RETURN c.companyName, COUNT(o) AS orders, collect(o.orderID)
ORDER BY orders DESC;
Figure 46. Seven graphs of the enterprise
The next section explores a few use cases of knowledge graphs to illustrate these benefits in practice.
Supply Chain
Effective supply chain management requires understanding relationships between suppliers, distributors, warehouses, transportation logistics, raw materials, products, etc. A knowledge graph is a natural way to model and store this kind of information because the connections between different pieces of data are numerous, complex, and (often) constantly changing. Because of these characteristics, a knowledge graph provides a strong foundation for supply chain optimization, contingency planning, and risk management.
The benefits of using a knowledge graph for supply chain insights include:
• The impact of a supply chain disruption can be easily found by following the relationships downstream from the disruption.
• Graph algorithms, such as shortest path, can help to optimize delivery routes and sourcing strategies for time, cost, or other metrics.
• Graph queries can quickly identify choke points in a supply chain, which provides an opportunity to find alternative suppliers, transportation routes, etc. to mitigate the risk at that critical point in the network.
Supply chains work well as knowledge graphs because they consist of multiple complex stages, inputs, outputs, and connection points. Working with this data as a graph rather than in tables is much more intuitive and allows for better insights.
Entity Resolution
Entity resolution is the process of identifying whether multiple records are referencing the same real-world entity. In its simplest form, you can perform entity resolution with hand-crafted queries to compare key identifying attributes according to a company’s business rules. However, this approach takes a lot of effort to write code and a lot of time to run the comparisons.
With a knowledge graph, you can accelerate the development and runtime requirements of entity resolution. Storing data as a knowledge graph has several advantages over other approaches:
• Shared identifiers or attributes can be easily discovered by modeling them as separate nodes in the knowledge graph. Modeling in this way makes it clear when two entities share common information and are candidates for merging.
• Graph algorithms, such as weakly connected components, can segment the knowledge graph into separate communities, where there are no shared connections between the data. This helps reduce the number of comparisons needed in entity resolution because nodes in separate communities don’t need to be compared.
• Knowledge graphs speed up transitive comparisons, which are needed to identify when more than two digital entities represent the same real-world entity.
To learn more about using a knowledge graph for entity resolution, see “Graph Data Science Use Cases: Entity Resolution.”
GenAI
Despite LLMs’ impressive ability to produce contextually relevant outputs, they have significant weaknesses. They lack access to real-time data, and they can’t incorporate private or proprietary information not included in their training set. Furthermore, responses are unverified and don’t
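As a rough illustration of the connected-components idea described under Entity Resolution, the Python sketch below treats each record and each identifier value (email, phone) as a node, links records to their identifiers, and groups records that end up in the same component. It uses a simple union-find structure as a stand-in for a graph algorithm library; the records, field names, and the `candidate_groups` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical customer records; sharing an identifier value links two records.
RECORDS = {
    "r1": {"email": "ana@example.com", "phone": "555-0100"},
    "r2": {"email": "ana.s@example.com", "phone": "555-0100"},
    "r3": {"email": "ana.s@example.com", "phone": "555-0199"},
    "r4": {"email": "bo@example.com", "phone": "555-0142"},
}


def find(parent, x):
    """Return the representative of x's component, shortening the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the components containing a and b."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_b] = root_a


def candidate_groups(records):
    """Group record IDs that are connected through shared identifier nodes."""
    parent = {}
    for record_id, attrs in records.items():
        parent.setdefault(record_id, record_id)
        for key, value in attrs.items():
            identifier = (key, value)  # identifier modeled as its own node
            parent.setdefault(identifier, identifier)
            union(parent, record_id, identifier)

    groups = defaultdict(list)
    for record_id in records:
        groups[find(parent, record_id)].append(record_id)
    return sorted(sorted(group) for group in groups.values())


print(candidate_groups(RECORDS))  # [['r1', 'r2', 'r3'], ['r4']]
```

Records r1 and r3 share no identifier directly, yet they land in the same group through r2, which is the transitive case the section mentions. Record r4 sits in its own component, so it never needs to be compared with the others.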
Concluding Thoughts and Further Learning
Because knowledge graphs represent information as an interconnected network of entities and relationships, they reflect the complex, context-dependent nature of real-world information. Structuring data in this way allows you to model reality with remarkable fidelity, capturing nuances across and within domains that siloed data structures often miss. You can highlight connections and insights that aren’t possible with traditional data structures. A flexible structure also makes it easy to integrate new data from various sources without disrupting existing relationships.
In this guide, you learned how to create a knowledge graph from scratch and how to obtain insights from it using the Cypher graph query language. You also learned about the role the knowledge graph plays in certain domains, like supply chains, entity resolution, and GenAI.
Here are some immediate steps you can take to build upon this foundational knowledge:
• Use your Neo4j instance to experiment with different data models and explore complex queries.
• Complete some of the free self-paced courses on graph concepts and techniques in GraphAcademy.
• Join the Neo4j Community to get support and insights from fellow graph developers and enthusiasts.
Acknowledgements
This guide was developed with contributions from technical subject matter experts who helped ensure accuracy
and clarity of the content. Special thanks to Jennifer Reif, John Stegeman, and Damaso Sanoja for their technical
expertise and contributions to this developer guide.