Import data from CSV files
You can bulk import data into node and relationship tables from CSV files using the COPY FROM command. Using COPY FROM is highly recommended if you are creating large databases. You can use COPY FROM to import data into an empty table or to append data to an existing table.
The CSV import configuration can be set manually by specifying parameters inside ( ) at the end of the COPY FROM clause. The following table shows the supported configuration parameters:
Parameter | Description | Default Value |
---|---|---|
HEADER | Whether the first line of the CSV file is the header. Can be true or false. | false |
DELIM | Character that separates the columns in a line. | , |
QUOTE | Character to start a string quote. | " |
ESCAPE | Character used within a quoted string to escape QUOTE and other characters, e.g., a line break. See the important note about line breaks below. | \ |
SKIP | Number of rows to skip at the start of the input file. | 0 |
PARALLEL | Whether to read CSV files in parallel. Can be true or false. | true |
The example below specifies that the CSV delimiter is | and that the file has a header row.
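A minimal sketch of such a statement, assuming a node table named User already exists and that user.csv is a pipe-delimited file whose first line is a header:

```cypher
// Assumption: the User table exists and user.csv uses | as its column separator.
// HEADER=true treats the first line as column names rather than data.
COPY User FROM "user.csv" (HEADER=true, DELIM="|");
```

Parameters that are not listed, such as QUOTE and ESCAPE here, keep the default values from the table above.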
Import to node table
Create a node table User as follows:
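A sketch of what such a definition could look like. The column names and types (name, age, reg_date) are assumptions chosen for illustration, not a required schema:

```cypher
// Hypothetical schema: name is the primary key, age and reg_date are ordinary properties.
CREATE NODE TABLE User(name STRING, age INT64, reg_date DATE, PRIMARY KEY (name));
```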
The CSV file user.csv contains the following fields:
The following statement loads user.csv into the User table.
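A minimal sketch of the load, assuming user.csv is comma-separated and begins with a header line (for example name,age,reg_date, matching the hypothetical schema of the User table):

```cypher
// Assumption: the first line of user.csv is a header, so HEADER=true skips it as data.
// DELIM is omitted because a comma is the default separator.
COPY User FROM "user.csv" (HEADER=true);
```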
Import to relationship table
When loading into a relationship table, Kùzu assumes the first two columns in the file are:
- FROM Node Column: The primary key of the FROM nodes.
- TO Node Column: The primary key of the TO nodes.
The rest of the columns correspond to relationship properties.
Create a relationship table Follows using the following Cypher query:
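A sketch of such a query, assuming Follows connects User nodes to User nodes and carries a single hypothetical property named since:

```cypher
// Assumption: both endpoints are User nodes; `since` is an illustrative relationship property.
CREATE REL TABLE Follows(FROM User TO User, since DATE);
```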
The data is read from the CSV file follows.csv shown below:
The following statement loads the follows.csv file into the Follows table.
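A minimal sketch of the load, assuming each line of follows.csv holds the FROM node's primary key, the TO node's primary key, and then the relationship properties, in that order:

```cypher
// Assumed line layout: <FROM user key>,<TO user key>,<since date>
// No parameters are given, so HEADER stays at its default of false.
COPY Follows FROM "follows.csv";
```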
Note that the CSV file has no header row, so the HEADER parameter is not set.
To skip the first 3 lines of the CSV file, you can use the SKIP parameter as follows:
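A sketch of the SKIP parameter in use. The table and file names (User, user.csv) are assumptions carried over from the earlier sections:

```cypher
// Ignore the first 3 lines of user.csv and import the remaining rows.
COPY User FROM "user.csv" (SKIP=3);
```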
Import multiple files to a single table
It is common practice to divide a large CSV file into several smaller files for cleaner data management. Kùzu can read multiple files with the same structure, consolidating their data into a single node or relationship table. You can specify that multiple files are loaded in the following ways:
Glob pattern
This is similar to the Unix glob pattern, where you specify file paths that match a given pattern. The following wildcard characters are supported:
Wildcard | Description |
---|---|
* | match any number of any characters (including none) |
? | match any single character |
[abc] | match any one of the characters enclosed within the brackets |
[a-z] | match any one of the characters within the range |
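A sketch of a glob-based import, assuming a set of hypothetical files such as User0.csv, User1.csv, and User2.csv that all share the column layout of the User table:

```cypher
// The * wildcard matches any run of characters, so every file named User<anything>.csv is loaded.
COPY User FROM "User*.csv";
```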
List of files
Alternatively, you can just specify a list of files to be loaded.
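A sketch of the list form, again assuming hypothetical file names. The files are given as a bracketed, comma-separated list of paths:

```cypher
// Assumption: the three files exist and share the same structure as the User table.
COPY User FROM ["User0.csv", "User1.csv", "User2.csv"];
```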