Create table
As a first step to creating your database, you need to define your node and directed relationships.
In the property graph model, nodes and relationships have labels. In Kùzu, every node or
relationship can have one label. The node and relationships and the predefined properties on them are
defined through CREATE NODE TABLE
and CREATE REL TABLE
statements.
The choice of using the term “table” over “label” is intentional and explained below.
Create a node table
To create a node table, use the CREATE NODE TABLE
statement as shown below:
Alternatively, you can specify the keyword PRIMARY KEY
immediately after the column name, as follows:
The above statements adds a User
table to the catalog of the system with three properties: name
, age
, and reg_date
,
with the primary key being set to the name
property in this case.
The name of the node table, User
, specified above will serve as the “label” which we want to query
in Cypher, for example:
Primary key
Kùzu requires a primary key column for node table which can be either a STRING
, numeric, DATE
, or BLOB
property of the node. Kùzu will generate an index to do quick lookups on the primary key (e.g., name
in the above example). Alternatively, you can use the SERIAL
data type to generate an auto-increment column as primary key.
Default value
Each property in a table can have a default value. If not specified, the default value is NULL
.
In the above example, the age
property is set to a default value of 0
rather than NULL
. The
name
and reg_date
properties do not have default values, so they will be NULL
if not provided
during data insertion.
Create a relationship table
Once you create node tables, you can define relationships between them using the CREATE REL TABLE
statement.
The following statement adds to the catalog a Follows
relationship table between User
and User
with one date
property on the relationship.
Relationship Multiplicities
For any relationship label E, e.g., , by default there can be multiple relationships from any node v both in the forward and backward direction. In database terminology, relationships are by default many-to-many. For example in the first Follows example above: (i) any User node v can follow multiple User nodes; and (ii) be followed by multiple User nodes. You can also constrain the multiplicity to at most 1 (we don’t yet support exactly 1 semantics as in foreign key constraints in relational systems) in either direction. You can restrict the multiplicities for two reasons:
- Constraint: Multiplicities can serve as constraints you would like to enforce (e..g, you want Kùzu to error if an application tries to add a second relationship of a particular label to some node)
- Performance: Kùzu can store 1-to-1, many-to-1, or 1-to-many relationships (explained momentarily) in more efficient/compressed format, which is also faster to scan.
You can optionally declare the multiplicity of relationships by adding MANY_MANY
, ONE_MANY
, MANY_ONE
, or ONE_ONE
clauses to the end of the CREATE REL TABLE
command.
Below are a few examples:
The DDL shown above indicates that LivesIn
has n-1 multiplicity. This command enforces an additional constraint that each User
node v
might live in at most one City
node (assuming our database has City
nodes). It does not put any constraint in the “backward” direction, i.e., there can be multiple User
s living in the same City
. As another example to explain the semantics of multiplicity constraints in the presence of multiple node labels, consider the following:
The DDL above indicates that Likes
has 1-to-n multiplicity. This DDL command enforces the constraint that each User
node v
might be Liked
by one Pet
node. It does not place any constraints in the forward direction, i.e., each Pet
node might know multiple User
s.
In general in a relationship E
’s multiplicity, if the “source side” is ONE
, then for each node v
that can be the destination of E
relationships, v
can have at most one backward edge. If the “destination side” is ONE
, then each node v
that can be the source of E
relationships, v
can have at most one forward edge.
Create relationship table group
You can use relationship table groups to gain added flexibility in your data modelling, by defining a relationship table with multiple node table pairs. This is done via the CREATE REL TABLE GROUP
statement. This has a similar syntax to CREATE REL TABLE
, but uses multiple FROM ... TO ...
clauses. Internally, a relationship table group defines a relationship table for each FROM ... TO ...
block. Any query to a relationship table group is treated as a query on the union of all relationship tables in the group.
The statement above creates a Knows_User_User rel table and a Knows_User_City rel table. And a Knows rel table group refering these two rel tables.
Output:
A relationship table group can be used as a regular relationship table for querying purposes.
The query above is equivalent to the following:
As you can imagine, the more relationships you want to selectively query on, the more useful relationship table groups become.
Create table if not exists
If the given table name already exists in the database, Kùzu throws an exception when you try to create it.
To avoid the exception being raised, use the IF NOT EXISTS
clause. This tells Kùzu to do nothing when
the given table name already exists in the database.
Example:
This query tells Kùzu to only create the UW
table if it doesn’t exist.
The same applies to relationship tables as well.