Import data
There are multiple ways to import data in Kùzu. The only prerequisite for inserting data into a database is that you first create a graph schema, i.e., the structure of your node and relationship tables.
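As a minimal sketch of that prerequisite, the statements below define one node table and one relationship table. The table and property names (User, Follows, name, age, since) are hypothetical and chosen only for illustration.

```cypher
// Hypothetical schema: a User node table keyed by name,
// and a Follows relationship table connecting users.
CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name));
CREATE REL TABLE Follows(FROM User TO User, since INT64);
```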
For small graphs (a few thousand nodes), the CREATE and MERGE Cypher clauses can be used to insert nodes and relationships. These are similar to SQL’s INSERT statements, but bear in mind that they are slower than the bulk import options shown below. The CREATE/MERGE clauses are intended for small additions or updates on a sporadic basis.
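A small sketch of such one-off insertions might look like the following. It assumes a hypothetical User node table (name STRING primary key, age INT64) and a Follows relationship table (since INT64) already exist; none of these names come from the original text.

```cypher
// CREATE always inserts a new node.
CREATE (u:User {name: 'Alice', age: 35});

// MERGE inserts the node only if no User named 'Bob' exists yet;
// ON CREATE SET applies only when the node is newly created.
MERGE (u:User {name: 'Bob'}) ON CREATE SET u.age = 42;

// Relationships are created between nodes that were matched first.
MATCH (a:User {name: 'Alice'}), (b:User {name: 'Bob'})
CREATE (a)-[:Follows {since: 2021}]->(b);
```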
In general, the recommended approach for larger graphs of millions of nodes and beyond is to use COPY FROM rather than creating or merging nodes one by one. This is the fastest way to bulk insert data into Kùzu.
COPY FROM CSV
The COPY FROM command is used to bulk import data from a CSV file into a node or relationship table. See the linked card below for more information and examples.
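As a rough illustration, a CSV import could be sketched as follows. The file names and the User and Follows tables are assumptions made for this example, and the column order in each file is assumed to match the table definition.

```cypher
// Assumes hypothetical tables created beforehand, e.g.:
//   CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name));
//   CREATE REL TABLE Follows(FROM User TO User, since INT64);

// Node table: user.csv is assumed to have a header row (name,age).
COPY User FROM "user.csv" (header=true);

// Relationship table: the first two columns of follows.csv are assumed
// to hold the primary keys of the source and destination User nodes.
COPY Follows FROM "follows.csv" (header=true);
```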
COPY FROM Parquet
Similar to CSV, the COPY FROM command is used to bulk import data from a Parquet file into a node or relationship table. See the linked card below for more information and examples.
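A Parquet import might be sketched the same way. The file names and tables here are hypothetical, and the Parquet columns are assumed to line up with the table's properties.

```cypher
// Assumes the hypothetical User and Follows tables already exist.
// Parquet files carry their own schema, so no header option is given here.
COPY User FROM "user.parquet";
COPY Follows FROM "follows.parquet";
```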
COPY FROM NumPy
Importing from NumPy is a specific use case that allows you to import numeric data from a NumPy file into a node table.
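The sketch below shows one way this could look, assuming each .npy file holds a single column of the node table and that the files are listed in the same order as the table's properties. The Paper table, its properties, and the file names are all hypothetical.

```cypher
// Hypothetical node table with a numeric id and a fixed-size feature vector.
CREATE NODE TABLE Paper(id INT64, feat FLOAT[128], PRIMARY KEY (id));

// Each .npy file supplies one column, in table-property order.
COPY Paper FROM ("node_id.npy", "node_feat_f32.npy") BY COLUMN;
```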
COPY FROM subquery results
You can copy data from a subquery’s results into a node or relationship table. This is useful when you need to modify existing data and re-insert it into the database, or if you want to copy data from a LOAD FROM scan operation on another data format.
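Both cases can be sketched as below. The User and Person tables, their properties, and the file name are assumptions for illustration; Person is assumed to be a separate node table with the same two properties (name, age).

```cypher
// Case 1: copy (optionally transformed) data from an existing table.
COPY Person FROM (MATCH (u:User) RETURN u.name, u.age);

// Case 2: scan a file with LOAD FROM, filter it, and copy the result.
// This would be run instead of case 1, not after it, to avoid duplicate keys.
COPY Person FROM (
    LOAD FROM "user.csv" (header=true)
    WHERE age > 30
    RETURN name, age
);
```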
COPY FROM DataFrames
You can copy data from a Pandas or Polars DataFrame into a node or relationship table. This is useful when you are already doing your data transformations in either Pandas or Polars DataFrames and want to directly insert results from their columns into a Kùzu table.
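A minimal sketch of the statement involved is shown below. It assumes the query is issued through the Python client (for example via a connection's execute method) while a DataFrame variable named df is in scope, and that its columns match the hypothetical User table in order and type.

```cypher
// Assumes: CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name));
// df is a hypothetical Pandas or Polars DataFrame with columns (name, age),
// visible in the Python scope from which this statement is executed.
COPY User FROM df;
```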