Create table
As a first step to creating your database, you need to define your node and directed relationships.
In the property graph model, nodes and relationships have labels. In Kùzu, every node or
relationship can have one label. The node and relationships and the predefined properties on them are
defined through CREATE NODE TABLE
and CREATE REL TABLE
statements.
The choice of using the term “table” over “label” is intentional and explained below.
Create a node table
To create a node table, use the CREATE NODE TABLE
statement as shown below:
CREATE NODE TABLE User (name STRING, age INT64 DEFAULT 0, reg_date DATE, PRIMARY KEY (name))
Alternatively, you can specify the keyword PRIMARY KEY
immediately after the column name, as follows:
CREATE NODE TABLE User (name STRING PRIMARY KEY, age INT64 DEFAULT 0, reg_date DATE)
The above statements adds a User
table to the catalog of the system with three properties: name
, age
, and reg_date
,
with the primary key being set to the name
property in this case.
The name of the node table, User
, specified above will serve as the “label” which we want to query
in Cypher, for example:
MATCH (a:User) RETURN *
Primary key
Kùzu requires a primary key column for node table which can be either a STRING
, numeric, DATE
, or BLOB
property of the node. Kùzu will generate an index to do quick lookups on the primary key (e.g., name
in the above example). Alternatively, you can use the SERIAL
data type to generate an auto-increment column as primary key.
Default value
Each property in a table can have a default value. If not specified, the default value is NULL
.
CREATE NODE TABLE User (name STRING, age INT64 DEFAULT 0, reg_date DATE, PRIMARY KEY (name))
In the above example, the age
property is set to a default value of 0
rather than NULL
. The
name
and reg_date
properties do not have default values, so they will be NULL
if not provided
during data insertion.
Create a relationship table
Once you create node tables, you can define relationships between them using the CREATE REL TABLE
statement.
The following statement adds to the catalog a Follows
relationship table between User
and User
with one date
property on the relationship.
CREATE REL TABLE Follows(FROM User TO User, since DATE);
Defining a rel table with multiple node table pairs is also possible. The following statement adds a Knows
relationship table between two node table pairs:
User
andUser
User
andCity
CREATE REL TABLE Knows(FROM User TO User, FROM User TO City);
Bulk insert to relationship table with multiple from-to pairs
Internally, a relationship table with multiple from-to pairs creates a child table per from-to pair. In the example above, the following two children tables are created internally
Knows_User_UserKnows_User_City
When bulk insert into a relationship table with multiple from-to pairs, user need to specify which child table to insert through from, to
options. For exmaple, the following two statements will bulk insert into Knows
relationship table.
Copy Knows FROM 'knows_user_user.csv' (from='User', to='User');Copy Knows FROM 'knows_user_city.csv' (from='User', to='City');
Alternatively, you can also directly copy into a child table. Though this approach is not recommended and will be deprecated in the future release.
Copy Knows_User_User FROM 'knows_user_user.csv';Copy Knows_User_City FROM 'knows_user_city.csv';
Relationship Multiplicities
For any relationship label E, e.g., , by default there can be multiple relationships from any node v both in the forward and backward direction. In database terminology, relationships are by default many-to-many. For example in the first Follows
example above: (i) any User
node v
can follow multiple User
nodes; and (ii) be followed by multiple User
nodes. You can also constrain the multiplicity to at most 1 (we don’t yet support “exactly 1” semantics as you may be used to via foreign key constraints in relational systems) in either direction.
Below are a few examples:
CREATE REL TABLE LivesIn(FROM User TO City, MANY_ONE)
The DDL shown above indicates that LivesIn
has n-1
multiplicity. This command enforces an additional constraint that each User
node v
might live in at most one City
node (assuming our database has City
nodes). It does not put any constraint in the “backward” direction, i.e., there can be multiple User
s living in the same City
. As another example to explain the semantics of multiplicity constraints in the presence of multiple node labels, consider the following:
CREATE REL TABLE Likes(FROM Pet TO User, ONE_MANY)
The DDL above indicates that Likes
has 1-to-n multiplicity. This DDL command enforces the constraint that each User
node v
might be Liked
by one Pet
node. It does not place any constraints in the forward direction, i.e., each Pet
node might know multiple User
s.
In general in a relationship E
’s multiplicity, if the “source side” is ONE
, then for each node v
that can be the destination of E
relationships, v
can have at most one backward edge. If the “destination side” is ONE
, then each node v
that can be the source of E
relationships, v
can have at most one forward edge.
Create relationship table group [Deprecated]
You can use relationship table groups to gain added flexibility in your data modelling, by defining a relationship table with multiple node table pairs. This is done via the CREATE REL TABLE GROUP
statement. This has a similar syntax to CREATE REL TABLE
, but uses multiple FROM ... TO ...
clauses. Internally, a relationship table group defines a relationship table for each FROM ... TO ...
block. Any query to a relationship table group is treated as a query on the union of all relationship tables in the group.
CREATE REL TABLE GROUP Knows (FROM User To User, FROM User to City, year INT64);
The statement above creates a Knows_User_User
rel table and a Knows_User_City
rel table. And a Knows
rel table group refering these two rel tables.
A relationship table group can be used as a regular relationship table for querying purposes.
MATCH (a:User)-[:Knows]->(b) RETURN *;
The query above is equivalent to the following:
MATCH (a:User)-[:Knows_User_User|:Knows_User_city]->(b) RETURN *;
As you can imagine, the more relationships you want to selectively query on, the more useful relationship table groups become.
Create table if not exists
If the given table name already exists in the database, Kùzu throws an exception when you try to create it.
To avoid the exception being raised, use the IF NOT EXISTS
clause. This tells Kùzu to do nothing when
the given table name already exists in the database.
Example:
CREATE NODE TABLE IF NOT EXISTS UW(ID INT64, PRIMARY KEY(ID))
This query tells Kùzu to only create the UW
table if it doesn’t exist.
The same applies to relationship tables as well.