Import data from CSV files

You can bulk import data to node and relationship tables from CSV files using the COPY FROM command. It is highly recommended to use COPY FROM if you are creating large databases. You can use COPY FROM to import data into an empty table or to append data to an existing table.

The CSV import configuration can be set manually by specifying parameters inside ( ) at the end of the COPY FROM clause. The table below lists the supported configuration parameters:

Any option that is a Boolean can be enabled or disabled in multiple ways.

You can write true or 1 to enable the option, and false or 0 to disable it.

The Boolean value can also be omitted (e.g., by only passing (HEADER)), in which case true is assumed.

The assignment operator = can also be replaced by a space.

| Parameter | Description | Default Value |
|---|---|---|
| HEADER | Whether the first line of the CSV file is the header. Can be true or false. | false |
| DELIM or DELIMITER | Character that separates the columns in a line. | , |
| QUOTE | Character that starts a string quote. | " |
| ESCAPE | Character used within string quotes to escape QUOTE and other characters, e.g., a line break. See the important note below about line breaks. | \ |
| SKIP | Number of rows to skip from the input file. | 0 |
| PARALLEL | Whether to read CSV files in parallel. | true |

The example below specifies that the CSV delimiter is | and that the file has a header row.

COPY User FROM "user.csv" (HEADER=true, DELIM="|");
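To make the HEADER and DELIM settings concrete, here is a minimal Python sketch that produces a file in the shape the statement above expects: a header line followed by |-separated rows. It uses only the standard csv module and does not call Kùzu. The helper name write_pipe_csv and the sample rows are illustrative assumptions.

```python
import csv
import io


def write_pipe_csv(rows, header, out):
    """Write rows to `out` as |-delimited CSV, preceded by one header line."""
    writer = csv.writer(out, delimiter="|", quotechar='"', lineterminator="\n")
    writer.writerow(header)   # corresponds to HEADER=true
    writer.writerows(rows)    # each field separated by DELIM="|"


# Write to an in-memory buffer; pass an open file object to write user.csv instead.
buf = io.StringIO()
write_pipe_csv(
    [("Adam", 30, "2020-06-22"), ("Karissa", 40, "2019-05-12")],
    ("name", "age", "reg_date"),
    buf,
)
pipe_csv_text = buf.getvalue()
print(pipe_csv_text)
```

A file written this way has to be loaded with both options set. Without DELIM="|", each line would be read as a single column under the default , delimiter.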

Import to node table

Create a node table User as follows:

CREATE NODE TABLE User(name STRING, age INT64, reg_date DATE, PRIMARY KEY (name));

The CSV file user.csv contains the following fields:

name,age,reg_date
Adam,30,2020-06-22
Karissa,40,2019-05-12
...

The following statement loads user.csv into the User table.

COPY User FROM "user.csv" (header=true);
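Before running a bulk import, it can help to confirm that every row in the file matches the column types declared for the table. The sketch below is a pre-check in plain Python, not part of Kùzu. It parses the sample rows as STRING, INT64, and DATE, and assumes dates are written as YYYY-MM-DD, as in the sample. The function name parse_user_rows is hypothetical.

```python
import csv
import io
from datetime import date

# Sample content mirroring the rows shown for user.csv (elided rows are not included).
SAMPLE = "name,age,reg_date\nAdam,30,2020-06-22\nKarissa,40,2019-05-12\n"

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def parse_user_rows(text):
    """Parse CSV text into (name, age, reg_date) tuples.

    Raises ValueError if age is not an integer in the INT64 range
    or reg_date is not an ISO-formatted date.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        age = int(record["age"])
        if not INT64_MIN <= age <= INT64_MAX:
            raise ValueError(f"age out of INT64 range: {age}")
        rows.append((record["name"], age, date.fromisoformat(record["reg_date"])))
    return rows


users = parse_user_rows(SAMPLE)
print(users)
```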

Import to relationship table

When loading into a relationship table, Kùzu assumes the first two columns in the file are:

  • FROM Node Column: The primary key of the FROM nodes.
  • TO Node Column: The primary key of the TO nodes.

The rest of the columns correspond to relationship properties.

Create a relationship table Follows using the following Cypher query:

CREATE REL TABLE Follows(FROM User TO User, since DATE);

The data comes from the CSV file follows.csv, shown below:

Adam,Karissa,2010-01-30
Karissa,Michelle,2014-01-30
...

The following statement loads the follows.csv file into the Follows table.

COPY Follows FROM "follows.csv";

Because the CSV file has no header row, the HEADER parameter is not set.
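Since the first two columns of a relationship file are read as primary keys of the FROM and TO nodes, one useful pre-check is to list any key that does not appear in the node file. The sketch below does this in plain Python without calling Kùzu. The function name missing_endpoints is hypothetical, and only the rows visible in the excerpts above are used, so Michelle is reported here simply because the elided rows of user.csv are not included.

```python
import csv
import io

# Only the rows shown in the excerpts; the "..." rows are not reproduced.
USERS = "name,age,reg_date\nAdam,30,2020-06-22\nKarissa,40,2019-05-12\n"
FOLLOWS = "Adam,Karissa,2010-01-30\nKarissa,Michelle,2014-01-30\n"


def missing_endpoints(users_text, follows_text):
    """Return (line_number, key) pairs for FROM/TO keys absent from the node file."""
    known = {row["name"] for row in csv.DictReader(io.StringIO(users_text))}
    missing = []
    # The relationship file has no header, so every line is a data row.
    for line_no, row in enumerate(csv.reader(io.StringIO(follows_text)), start=1):
        for key in row[:2]:  # column 1 = FROM key, column 2 = TO key
            if key not in known:
                missing.append((line_no, key))
    return missing


dangling = missing_endpoints(USERS, FOLLOWS)
print(dangling)
```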

To skip the first 3 lines of the CSV file, you can use the SKIP parameter as follows:

COPY Follows FROM "follows.csv" (SKIP=3);

Import multiple files to a single table

It is common practice to divide a large CSV file into several smaller files for cleaner data management. Kùzu can read multiple files with the same structure, consolidating their data into a single node or relationship table. You can specify that multiple files are loaded in the following ways:

Glob pattern

This is similar to the Unix glob pattern, where you specify file paths that match a given pattern. The following wildcard characters are supported:

| Wildcard | Description |
|---|---|
| * | match any number of any characters (including none) |
| ? | match any single character |
| [abc] | match any one of the characters enclosed within the brackets |
| [a-z] | match any one of the characters within the range |

COPY User FROM "User*.csv";
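To get a feel for which file names each wildcard selects, the sketch below uses Python's fnmatch module, which supports the same four forms. This is only an approximation for previewing patterns: Kùzu's own glob matching may differ in details such as case sensitivity. The helper name preview_matches and the candidate file names are illustrative.

```python
import fnmatch


def preview_matches(pattern, filenames):
    """Return the sorted names that match `pattern` (case-sensitive)."""
    return sorted(name for name in filenames if fnmatch.fnmatchcase(name, pattern))


candidates = ["User0.csv", "User1.csv", "User10.csv", "Users.csv", "user0.csv", "Follows.csv"]

star = preview_matches("User*.csv", candidates)          # * : any run of characters
single = preview_matches("User?.csv", candidates)        # ? : exactly one character
bracket = preview_matches("User[01].csv", candidates)    # [01] : one of the listed characters
char_range = preview_matches("User[0-9].csv", candidates)  # [0-9] : one character in the range

for label, result in [("*", star), ("?", single), ("[01]", bracket), ("[0-9]", char_range)]:
    print(label, result)
```

Note that User*.csv also picks up Users.csv in this preview, so a narrower pattern such as User[0-9].csv may be the safer choice when unrelated files share the prefix.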

List of files

Alternatively, you can just specify a list of files to be loaded.

COPY User FROM ["User0.csv", "User1.csv", "User2.csv"];
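When the file list is assembled by a script, the statement text can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper copy_from_files is hypothetical, it only builds a string and does not execute anything, and it assumes that JSON-style double-quoted strings are acceptable for simple file names like the ones above. It also drops repeated paths so that the same file is not listed twice.

```python
import json


def copy_from_files(table, paths):
    """Build a COPY ... FROM [...] statement string for a list of file paths."""
    unique = list(dict.fromkeys(paths))  # drop duplicates, keep first-seen order
    if not unique:
        raise ValueError("at least one file path is required")
    # json.dumps renders the list as ["a", "b", ...] with double-quoted strings.
    file_list = json.dumps(unique, ensure_ascii=False)
    return f"COPY {table} FROM {file_list};"


statement = copy_from_files("User", ["User0.csv", "User1.csv", "User1.csv", "User2.csv"])
print(statement)
```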