Kùzu implements a structured property graph model and requires a pre-defined schema.
Schema definition involves declaring node and relationship tables and their associated properties.
Each property is strongly typed: its type must be declared explicitly.
Every node table must define a primary key.
Relationship tables do not require a primary key.
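As a rough sketch of what such a schema can look like from Python, the snippet below declares the User, City, Follows, and LivesIn tables used in the quick start further down. The property names and types (name, age, population, since) are assumptions made for illustration, and `SCHEMA_STATEMENTS`, `apply_schema`, and `demo` are hypothetical names rather than part of Kùzu's API. Only `kuzu.Database`, `kuzu.Connection`, and `Connection.execute` come from the Python client.

```python
# DDL for a small social graph. Property names and types are assumptions
# chosen for illustration.
SCHEMA_STATEMENTS = [
    # Node tables: every property is typed and a primary key is declared.
    "CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name))",
    "CREATE NODE TABLE City(name STRING, population INT64, PRIMARY KEY (name))",
    # Relationship tables: endpoints are named, no primary key is needed.
    "CREATE REL TABLE Follows(FROM User TO User, since INT64)",
    "CREATE REL TABLE LivesIn(FROM User TO City)",
]


def apply_schema(conn):
    """Run each DDL statement on a Kùzu connection (hypothetical helper)."""
    for statement in SCHEMA_STATEMENTS:
        conn.execute(statement)


def demo():
    """Create the schema in an in-memory database (requires the kuzu package)."""
    import kuzu  # imported here so apply_schema stays usable without kuzu

    db = kuzu.Database(":memory:")
    conn = kuzu.Connection(db)
    apply_schema(conn)
    return conn
```

Calling `demo()` returns a connection whose database contains the four empty tables, ready for data to be inserted.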
Persistence
Kùzu supports both on-disk and in-memory modes of operation. The mode is determined
when the database is created, as explained below.
On-disk database
If you specify a database path when creating your database, for example, ./demo_db, Kùzu
opens in on-disk mode. In this mode, Kùzu persists all data to disk at the specified
path. All transactions are logged in the Write-Ahead Log (WAL), and the logged changes are merged into the
database files during checkpoints.
In-memory database
If you omit the database path when creating your database, specify it as an empty string
"", or explicitly specify the path as :memory:, Kùzu opens in in-memory mode.
In this mode, there are no writes to the WAL, and no data is persisted to
disk. All data is lost when the process finishes.
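The path rule for choosing between the two modes can be summarized in a short Python sketch. `uses_in_memory_mode` and `open_database` are hypothetical helpers written for this illustration. The sketch assumes the Python client's `kuzu.Database` constructor, which takes the database path as its first argument.

```python
def uses_in_memory_mode(path=None):
    """Return True when `path` selects in-memory mode.

    Mirrors the rule described above: an omitted path, an empty string,
    or ":memory:" means in-memory; any other path means on-disk.
    """
    return path in (None, "", ":memory:")


def open_database(path=None):
    """Open a Kùzu database in the mode implied by `path` (requires kuzu)."""
    import kuzu  # imported here so uses_in_memory_mode works without kuzu

    if uses_in_memory_mode(path):
        # Nothing is written to disk; data is lost when the process exits.
        return kuzu.Database(":memory:")
    # Data is persisted under `path`, for example "./demo_db".
    return kuzu.Database(path)
```

For example, `open_database("./demo_db")` would open an on-disk database, while `open_database()` would open an in-memory one.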
Quick start
To create your first graph, ensure that you have installed the Kùzu CLI or your preferred client API
as per the instructions in the Installation section.
The example below uses a graph schema with two node types, User and City, and two relationship types, Follows and LivesIn.
The dataset in CSV format can be found here.
Because Kùzu is an embedded database, there are no servers to set up. You can simply import
the kuzu module in your preferred client library and begin interacting with the database
directly. The examples below demonstrate how to create a graph schema and insert data
into an on-disk database.
You can do the same using an in-memory database by omitting the database path, specifying
an empty string "", or specifying :memory: in your client API of choice.
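A minimal Python sketch of that workflow follows. It assumes the kuzu package is installed and that ./demo_db does not already contain these tables. Instead of loading the CSV dataset, it inserts a few placeholder rows with Cypher CREATE statements. Those rows, the property names, and the helper names `fetch_all` and `main` are illustrative assumptions.

```python
SETUP_STATEMENTS = [
    # Schema: two node tables and two relationship tables.
    "CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name))",
    "CREATE NODE TABLE City(name STRING, population INT64, PRIMARY KEY (name))",
    "CREATE REL TABLE Follows(FROM User TO User, since INT64)",
    "CREATE REL TABLE LivesIn(FROM User TO City)",
    # Placeholder rows (not the CSV dataset mentioned above).
    "CREATE (:User {name: 'Adam', age: 30})",
    "CREATE (:User {name: 'Karissa', age: 40})",
    "CREATE (:City {name: 'Waterloo', population: 150000})",
    "MATCH (a:User), (b:User) WHERE a.name = 'Adam' AND b.name = 'Karissa' "
    "CREATE (a)-[:Follows {since: 2020}]->(b)",
    "MATCH (u:User), (c:City) WHERE u.name = 'Adam' AND c.name = 'Waterloo' "
    "CREATE (u)-[:LivesIn]->(c)",
]

QUERY = "MATCH (a:User)-[f:Follows]->(b:User) RETURN a.name, f.since, b.name"


def fetch_all(result):
    """Drain a query result into a list of lists, one inner list per row."""
    rows = []
    while result.has_next():
        rows.append(result.get_next())
    return rows


def main(db_path="./demo_db"):
    """Create the schema, insert the rows, and return the query output."""
    import kuzu  # imported here so fetch_all stays usable without kuzu

    db = kuzu.Database(db_path)  # a non-empty path selects on-disk mode
    conn = kuzu.Connection(db)
    for statement in SETUP_STATEMENTS:
        conn.execute(statement)
    return fetch_all(conn.execute(QUERY))
```

Passing `":memory:"` as `db_path` runs the same steps against an in-memory database.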
The approach shown above returned a list of lists containing query results. See below for more
output options for Python.
Pandas
You can also pass the results of a Cypher query to a Pandas DataFrame
for downstream tasks. This assumes that pandas is installed in your environment.
Polars
Polars is another popular DataFrame library for Python users, and you
can process the results of a Cypher query in much the same way as with Pandas. This assumes
that polars is installed in your environment.
Arrow Table
You can also use the PyArrow library to work with
Arrow Tables in Python. This assumes that pyarrow is installed in your environment. This
approach is useful when you need to interoperate with other systems that use Arrow as a backend.
In fact, the get_as_pl() method shown above for Polars materializes a pyarrow.Table under the hood.
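The three output options above can be sketched side by side in Python. `run_query_as` and `demo` are hypothetical helpers written for this illustration. `get_as_df()`, `get_as_pl()`, and `get_as_arrow()` are result methods of the Python client, and each needs the corresponding library (pandas, polars, pyarrow) to be installed. The `chunk_size` of 1000 and the sample query are arbitrary choices.

```python
def run_query_as(conn, query, fmt="pandas"):
    """Execute `query` and return its result in the requested format."""
    # Execute the query afresh on every call so each conversion starts
    # from a full result.
    result = conn.execute(query)
    if fmt == "pandas":
        return result.get_as_df()
    if fmt == "polars":
        return result.get_as_pl()
    if fmt == "arrow":
        return result.get_as_arrow(chunk_size=1000)
    raise ValueError(f"unknown format: {fmt!r}")


def demo(fmt="pandas"):
    """Run a schema-free query in an in-memory database (requires kuzu)."""
    import kuzu  # imported here so run_query_as stays usable without kuzu

    conn = kuzu.Connection(kuzu.Database(":memory:"))
    query = "UNWIND [1, 2, 3] AS x RETURN x, x * 2 AS doubled"
    return run_query_as(conn, query, fmt)
```

Routing every conversion through one function keeps the query text in a single place while letting downstream code pick whichever format it needs.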
Result:
Kùzu’s Java client library is available on Maven Central. You can add the following snippet to your pom.xml to get it installed:
Alternatively, if you are using Gradle, you can add the following snippet to your build.gradle file to include Kùzu’s Java client library:
For Groovy DSL:
For Kotlin DSL:
Below is an example Gradle project structure for a simple Java application that creates a graph schema and inserts
data into the database for the given example.
The minimal build.gradle contains the following configurations:
The Main.java contains the following code:
To execute the example, navigate to the project root directory and run the following command:
Result:
When installed via Cargo, the kuzu crate by default builds and statically links Kùzu’s C++
library from source. You can also link against the dynamic release libraries (see the Rust
crate docs for details).
The main.rs file contains the following code:
Result:
The Kùzu C++ client is distributed as so/dylib/dll+lib library files along with a header file (kuzu.hpp).
Once you’ve downloaded and extracted the C++ files into a directory, it’s ready to use without
any additional installation. You just need to specify the library search path for the linker.
In the following example, we assume that the so/dylib/dll+lib, the header file, the CSV files, and
the cpp code file are all under the same directory as follows:
The main.cpp file contains the following code:
Compile and run main.cpp. Since we did not install libkuzu as a system library, we need to
override the linker search path to correctly compile the C++ code and run the compiled program.
On Linux:
On macOS:
On Windows, the library file is passed to the compiler directly and the current directory is used
automatically when searching for kuzu_shared.dll at runtime:
Result:
The Kùzu C API shares the same so/dylib library files with the C++ API and can be used by
including the C header file (kuzu.h).
In this example, we assume that the so/dylib, the header file, the CSV files, and the C code file
are all under the same directory:
The file main.c contains the following code:
Compile and run main.c. Since we did not install libkuzu as a system library, we need to
override the linker search path to correctly compile the C code and run the compiled program.
On Linux:
On macOS:
On Windows, the library file is passed to the compiler directly and the current directory is used
automatically when searching for kuzu_shared.dll at runtime:
Result:
When using the Kùzu CLI’s shell, you can create an on-disk database by specifying a path after
the kuzu command in the terminal.
You can create an in-memory database by omitting the path entirely, and just calling kuzu:
Then, enter the following Cypher statements, separated by semicolons. Note that you must
end each query statement with a semicolon in the shell; otherwise, it will not be parsed
correctly and will fail to execute.