Create your first graph

Once you have the Kùzu CLI or your preferred client library installed, you can define a graph schema and begin inserting data into the database using Cypher. The example below uses a graph schema with two node types, User and City, and two relationship types, Follows and LivesIn. The dataset in CSV format can be found here.
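If you do not have the dataset at hand, the following is a minimal sketch that writes four small CSV files with the file names used by the COPY statements in the quick start. The Follows edges are taken from the query output shown on this page; the ages, city names, populations, and LivesIn edges are illustrative placeholders and may differ from the official dataset. The helper name write_demo_csvs is hypothetical, and the files are written without a header row on the assumption that a plain COPY ... FROM reads every line as data.

```python
import csv
from pathlib import Path

# Follows edges come from the query output shown on this page.
FOLLOWS = [
    ("Adam", "Karissa", 2020),
    ("Adam", "Zhang", 2020),
    ("Karissa", "Zhang", 2021),
    ("Zhang", "Noura", 2022),
]
# Everything below is an illustrative placeholder (assumption), not the
# official dataset: ages, cities, populations, and who lives where.
USERS = [("Adam", 30), ("Karissa", 40), ("Zhang", 50), ("Noura", 25)]
CITIES = [("Waterloo", 150000), ("Kitchener", 200000), ("Guelph", 75000)]
LIVES_IN = [
    ("Adam", "Waterloo"),
    ("Karissa", "Waterloo"),
    ("Zhang", "Kitchener"),
    ("Noura", "Guelph"),
]


def write_demo_csvs(data_dir: str = "./data") -> list[Path]:
    """Write one header-less CSV per table and return the paths written."""
    out = Path(data_dir)
    out.mkdir(parents=True, exist_ok=True)
    tables = {
        "user.csv": USERS,          # columns: name, age
        "city.csv": CITIES,         # columns: name, population
        "follows.csv": FOLLOWS,     # columns: from user, to user, since
        "lives-in.csv": LIVES_IN,   # columns: from user, to city
    }
    written = []
    for filename, rows in tables.items():
        path = out / filename
        with path.open("w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(rows)
        written.append(path)
    return written
```

Calling write_demo_csvs("./data") once before running the quick start script creates the files at the paths that script expects.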

Quick start

Because Kùzu is an embedded database, there are no servers to set up — you can simply import the kuzu module in your preferred client library and begin interacting with the database. Cypher queries can be passed as string literals to the execute (or equivalent) method in the respective client, or run directly in the Kùzu CLI shell.

main.py
import kuzu


def main() -> None:
    # Initialize database
    db = kuzu.Database("./demo_db")
    conn = kuzu.Connection(db)

    # Create schema
    conn.execute("CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name))")
    conn.execute("CREATE NODE TABLE City(name STRING, population INT64, PRIMARY KEY (name))")
    conn.execute("CREATE REL TABLE Follows(FROM User TO User, since INT64)")
    conn.execute("CREATE REL TABLE LivesIn(FROM User TO City)")

    # Insert data
    conn.execute('COPY User FROM "./data/user.csv"')
    conn.execute('COPY City FROM "./data/city.csv"')
    conn.execute('COPY Follows FROM "./data/follows.csv"')
    conn.execute('COPY LivesIn FROM "./data/lives-in.csv"')

    # Execute Cypher query
    response = conn.execute(
        """
        MATCH (a:User)-[f:Follows]->(b:User)
        RETURN a.name, b.name, f.since;
        """
    )
    # Print each result row as it is fetched
    while response.has_next():
        print(response.get_next())


if __name__ == "__main__":
    main()

Result:

['Adam', 'Karissa', 2020]
['Adam', 'Zhang', 2020]
['Karissa', 'Zhang', 2021]
['Zhang', 'Noura', 2022]

The approach shown above fetches the query results one row at a time, with each row returned as a Python list. See below for more output options for Python.
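Positional lists can be awkward once a query returns many columns. As a rough sketch, the rows can be paired with column names to produce dictionaries. The helper rows_as_dicts and the FakeResult class are hypothetical: FakeResult only mimics the has_next() and get_next() methods used above so that the example runs without a database, and the column names are passed in by hand.

```python
from typing import Any, Iterator


class FakeResult:
    """Hypothetical stand-in that mimics has_next()/get_next() for this demo."""

    def __init__(self, rows: list[list[Any]]) -> None:
        self._rows = list(rows)
        self._pos = 0

    def has_next(self) -> bool:
        return self._pos < len(self._rows)

    def get_next(self) -> list[Any]:
        row = self._rows[self._pos]
        self._pos += 1
        return row


def rows_as_dicts(result: Any, columns: list[str]) -> Iterator[dict[str, Any]]:
    """Yield each remaining row as a dict keyed by the given column names."""
    while result.has_next():
        yield dict(zip(columns, result.get_next()))


# Rows copied from the result printed above.
demo_rows = [
    ["Adam", "Karissa", 2020],
    ["Adam", "Zhang", 2020],
    ["Karissa", "Zhang", 2021],
    ["Zhang", "Noura", 2022],
]
records = list(rows_as_dicts(FakeResult(demo_rows), ["a.name", "b.name", "f.since"]))
print(records[0])  # {'a.name': 'Adam', 'b.name': 'Karissa', 'f.since': 2020}
```

With a real connection, the object returned by conn.execute(...) would take the place of FakeResult, since the helper relies only on the two methods shown in the quick start.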

Pandas

You can also pass the results of a Cypher query to a Pandas DataFrame for downstream tasks. This assumes that pandas is installed in your environment.

# pip install pandas
response = conn.execute(
    """
    MATCH (a:User)-[f:Follows]->(b:User)
    RETURN a.name, b.name, f.since;
    """
)
print(response.get_as_df())

Result:

    a.name   b.name  f.since
0     Adam  Karissa     2020
1     Adam    Zhang     2020
2  Karissa    Zhang     2021
3    Zhang    Noura     2022

Polars

Polars is another popular DataFrame library for Python users, and you can process the results of a Cypher query in much the same way as with Pandas. This assumes that polars is installed in your environment.

# pip install polars
response = conn.execute(
    """
    MATCH (a:User)-[f:Follows]->(b:User)
    RETURN a.name, b.name, f.since;
    """
)
print(response.get_as_pl())

Result:

shape: (4, 3)
┌─────────┬─────────┬─────────┐
│ a.name  ┆ b.name  ┆ f.since │
│ ---     ┆ ---     ┆ ---     │
│ str     ┆ str     ┆ i64     │
╞═════════╪═════════╪═════════╡
│ Adam    ┆ Karissa ┆ 2020    │
│ Adam    ┆ Zhang   ┆ 2020    │
│ Karissa ┆ Zhang   ┆ 2021    │
│ Zhang   ┆ Noura   ┆ 2022    │
└─────────┴─────────┴─────────┘

Arrow Table

You can also use the PyArrow library to work with Arrow Tables in Python. This assumes that pyarrow is installed in your environment. This approach is useful when you need to interoperate with other systems that use Arrow as a backend. In fact, the get_as_pl() method shown above for Polars materializes a pyarrow.Table under the hood.

# pip install pyarrow
response = conn.execute(
    """
    MATCH (a:User)-[f:Follows]->(b:User)
    RETURN a.name, b.name, f.since;
    """
)
print(response.get_as_arrow())

Result:

pyarrow.Table
a.name: string
b.name: string
f.since: int64
----
a.name: [["Adam","Adam","Karissa","Zhang"]]
b.name: [["Karissa","Zhang","Zhang","Noura"]]
f.since: [[2020,2020,2021,2022]]
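The Arrow output is column-oriented: each field holds all of its values together, whereas the first example on this page printed one list per row. The following plain-Python sketch shows how the two layouts relate by transposing the rows into columns. It is only an illustration of the layout, not how Kùzu or PyArrow builds the table internally, and the helper name to_columns is hypothetical.

```python
from typing import Any


def to_columns(names: list[str], rows: list[list[Any]]) -> dict[str, list[Any]]:
    """Transpose row-oriented results into one list of values per column."""
    columns: dict[str, list[Any]] = {name: [] for name in names}
    for row in rows:
        for name, value in zip(names, row):
            columns[name].append(value)
    return columns


# Rows copied from the row-by-row result at the top of this page.
names = ["a.name", "b.name", "f.since"]
rows = [
    ["Adam", "Karissa", 2020],
    ["Adam", "Zhang", 2020],
    ["Karissa", "Zhang", 2021],
    ["Zhang", "Noura", 2022],
]
columns = to_columns(names, rows)
print(columns["f.since"])  # [2020, 2020, 2021, 2022]
```

The per-column lists match the values shown in the Arrow Table output above.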