Import data

There are multiple ways to import data in Kùzu. The only prerequisite for inserting data into a database is that you first create a graph schema, i.e., the structure of your node and relationship tables.
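As a minimal sketch of what such a schema might look like, the statements below define one node table and one relationship table. The table and property names (User, Follows, name, age, since) are hypothetical and chosen only for illustration.

```cypher
// Hypothetical node table: each User is identified by its name.
CREATE NODE TABLE User(name STRING, age INT64, PRIMARY KEY (name));
// Hypothetical relationship table connecting User nodes to other User nodes.
CREATE REL TABLE Follows(FROM User TO User, since INT64);
```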

For small graphs (a few thousand nodes), the CREATE and MERGE Cypher clauses can be used to insert nodes and relationships. These are similar to SQL’s INSERT statements, but bear in mind that they are slower than the bulk import options shown below. The CREATE/MERGE clauses are intended to do small additions or updates on a sporadic basis.
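A small illustration of the two clauses, assuming a hypothetical User node table with a name primary key and an age property:

```cypher
// Assumes a hypothetical table: User(name STRING, age INT64, PRIMARY KEY (name)).
// CREATE inserts a new node; it fails if a User with this primary key already exists.
CREATE (u:User {name: "Alice", age: 35});
// MERGE matches an existing User named "Bob", or creates one if none exists.
// ON CREATE SET only applies when the node is newly created.
MERGE (u:User {name: "Bob"}) ON CREATE SET u.age = 42;
```

Each statement touches a single node, which is why this style suits occasional updates rather than large loads.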

For larger graphs (millions of nodes and beyond), the recommended approach is to use COPY FROM rather than creating or merging nodes one by one. This is the fastest way to bulk insert data into Kùzu.

COPY FROM CSV

The COPY FROM command is used to bulk import data from a CSV file into a node or relationship table. See the linked card below for more information and examples.
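A rough sketch of a CSV import follows. The file names and tables are hypothetical, and the example assumes the CSV columns appear in the same order as the table properties.

```cypher
// Assumes hypothetical tables User(name, age) and Follows(FROM User TO User, since).
// header=true tells the reader to skip the first line of the file.
COPY User FROM "user.csv" (header=true);
// For a relationship table, the first two columns hold the primary keys
// of the source and destination nodes, followed by any properties.
COPY Follows FROM "follows.csv" (header=true);
```

Node tables are copied first so that the relationship rows can resolve their endpoints.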

COPY FROM Parquet

Similar to CSV, the COPY FROM command is used to bulk import data from a Parquet file into a node or relationship table. See the linked card below for more information and examples.
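The Parquet form can be sketched the same way. The file names and tables below are hypothetical; because Parquet files carry their own schema, no header option is shown.

```cypher
// Assumes hypothetical tables User(name, age) and Follows(FROM User TO User, since).
COPY User FROM "user.parquet";
COPY Follows FROM "follows.parquet";
```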

COPY FROM NumPy

Importing from NumPy is a specific use case that allows you to import numeric data from a NumPy file into a node table.
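As a hedged sketch, a NumPy import might pair one .npy file with each column of the node table. The Paper table, its property names, the vector width of 128, and the file names are all assumptions made for illustration.

```cypher
// Hypothetical node table with an integer ID and a fixed-size feature vector.
CREATE NODE TABLE Paper(id INT64, feat FLOAT[128], PRIMARY KEY (id));
// One .npy file per column, listed in the same order as the table properties.
COPY Paper FROM ("node_id.npy", "node_feat.npy") BY COLUMN;
```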

COPY FROM subquery results

You can copy data from a subquery’s results into a node or relationship table. This is useful when you need to modify existing data and re-insert it into the database, or if you want to copy data from a LOAD FROM scan operation on another data format.
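The two cases described here could look roughly like the following. The User, Person, and Adult tables and the people.parquet file are hypothetical; the two COPY statements are independent alternatives, not steps in a sequence.

```cypher
// Assumes hypothetical tables User(name, age), Person(name, age) and Adult(name, age),
// each with name as the primary key.
// Case 1: re-insert a filtered view of existing nodes into another table.
COPY Person FROM (MATCH (u:User) WHERE u.age > 30 RETURN u.name, u.age);
// Case 2: scan a file with LOAD FROM, filter it, and copy the result.
COPY Adult FROM (LOAD FROM "people.parquet" WHERE age >= 18 RETURN name, age);
```

The columns returned by the subquery are assumed to line up, in order, with the properties of the target table.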

COPY FROM DataFrames

You can copy data from a Pandas or Polars DataFrame into a node or relationship table. This is useful when you are already doing your data transformations in either Pandas or Polars DataFrames and want to directly insert results from their columns into a Kùzu table.
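In statement form, a DataFrame import might look like the sketch below. It assumes the statement is issued through the Python client and that a variable named df (a hypothetical Pandas or Polars DataFrame with name and age columns) is in scope where the query is executed.

```cypher
// Assumes a hypothetical table: Person(name STRING, age INT64, PRIMARY KEY (name)).
// df is not a file path: it refers to a DataFrame variable in the calling Python scope.
COPY Person FROM df;
```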