MATCH is the clause where you define a “graph pattern”, i.e., a join of node or relationship records,
to find in the database [1]. MATCH is often accompanied by WHERE (equivalent to SQL’s WHERE clause)
to define additional predicates on the matched patterns.
We will use the example database for demonstration, whose schema and data import commands are given here.
Match nodes
Match nodes with a single label
The query below matches variable “a” to nodes with label User and returns “a”, which
is an openCypher shortcut for returning all properties of the node that “a” matches, together with its label and internal ID.
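A minimal sketch of such a query, assuming the example database's User node table:

```cypher
// Bind "a" to every node with label User and return the whole node.
MATCH (a:User)
RETURN a;
```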
Output:
Match nodes with multiple labels
The query below matches variable “a” to nodes with label User or label City. “RETURN a” returns all properties of the node together with its label and internal ID. Properties that do not exist in a label are returned as NULL (e.g., “population” does not exist in “User”). Properties that exist in multiple labels are expected to have the same data type (e.g., “name” has the STRING data type in both “User” and “City”).
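The multi-label form described above can be written as follows, assuming the User and City node tables of the example database:

```cypher
// "a" binds to nodes labelled either User or City.
MATCH (a:User:City)
RETURN a;
```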
Output:
Match nodes with any label
The query below matches variable “a” to nodes with any label. In the example database, it is equivalent to MATCH (a:User:City) RETURN a;.
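Omitting the label gives the any-label form, sketched here:

```cypher
// No label on "a": every node in the database is a candidate.
MATCH (a)
RETURN a;
```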
Output:
Match relationships
Match directed relationships with a label
Similar to binding variables to node records, you can bind variables to relationship records and return them. You can specify the direction of a relationship with <- or ->. The following query finds all “a” Users that follow a “b” User through an outgoing relationship from “a”, and returns the name of “a”, the relationship “e”, and the name of “b”, where “e” matches the relationship from “a” to “b”.
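A sketch of this query, assuming the Follows relationship table connects User nodes and that User has a name property:

```cypher
// "e" binds to each Follows relationship going out of "a" into "b".
MATCH (a:User)-[e:Follows]->(b:User)
RETURN a.name, e, b.name;
```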
Output:
The following query matches the same relationships through a relationship that is incoming to “a” (so “a” and “b” are swapped in the output):
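Reversing the arrow is enough to express this. A sketch under the same schema assumptions:

```cypher
// The arrow now points into "a", so "b" is the follower.
MATCH (a:User)<-[e:Follows]-(b:User)
RETURN a.name, e, b.name;
```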
Output:
Match relationships with multiple labels
Similar to matching nodes with multiple labels, you can bind variables to relationships with multiple labels. The query below finds all “a” Users that either Follows a “b” User or LivesIn a “b” City.
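One way to write this, assuming the Follows and LivesIn relationship tables of the example database; the `|` separator between relationship labels follows openCypher:

```cypher
// "e" may be a Follows or a LivesIn relationship,
// so "b" must be allowed to be a User or a City.
MATCH (a:User)-[e:Follows|:LivesIn]->(b:User:City)
RETURN a.name, e, b.name;
```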
Output:
Match relationships with any label
Similar to matching nodes with any label, you can bind variables to relationships with any label by not specifying a label. The query below finds all relationships in the database.
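A sketch of the any-label relationship pattern:

```cypher
// Neither the endpoints nor the relationship carry a label.
MATCH ()-[e]->()
RETURN e;
```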
Output:
Match undirected relationships
You can match a relationship in both directions by not specifying a relationship direction (i.e., -). The following query finds all “b” users who either follow or are followed by “Karissa”.
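The undirected pattern could look like this, assuming “Karissa” is stored in the name property of User:

```cypher
// A plain "-" on both sides matches Follows in either direction.
MATCH (a:User)-[e:Follows]-(b:User)
WHERE a.name = 'Karissa'
RETURN b.name;
```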
Output:
Omit binding variables to nodes or relationships
You can also omit binding a variable to a node or relationship in your graph patterns if
you will not use it anywhere else in your query (e.g., in WHERE or RETURN). For example, the query below matches 2-hop paths to find the cities of the Users that “a” follows.
Because we do not need to return the Users that “a” follows or the properties
of the Follows and LivesIn edges that form these 2-hop paths, we can omit giving variable names to them.
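A sketch of such a 2-hop query. The returned columns are an illustrative choice, not taken from the original example:

```cypher
// The intermediate User and both relationships are left unnamed
// because nothing else in the query refers to them.
MATCH (a:User)-[:Follows]->(:User)-[:LivesIn]->(c:City)
RETURN a.name, c.name;
```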
Output:
Match multiple patterns
Although paths can be matched in a single pattern, some patterns, in particular
cyclic patterns, require specifying multiple patterns/paths that form the pattern.
These multiple paths are separated by a comma. The following is a (directed) triangle
query and returns the only triangle in the database between Adam, Karissa, and Zhang.
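A sketch of a directed triangle query written as two comma-separated paths, assuming the Follows relationship table between User nodes:

```cypher
// First path: a -> b -> c. Second path closes the triangle with a -> c.
// Labels for "a" and "c" are given only on the first path.
MATCH (a:User)-[:Follows]->(b:User)-[:Follows]->(c:User), (a)-[:Follows]->(c)
RETURN a.name, b.name, c.name;
```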
Output:
Note that in the query node variables a and c appear twice, once on each of the 2 paths
in the query. In such cases, their labels need to be specified only the first time they appear
in the pattern. In the above query a and c’s labels are defined on the first/left path,
so you don’t have to specify them on the right path (though you still can).
Equality predicates on node/relationship properties
The WHERE clause is the main clause for specifying arbitrary predicates on the nodes and relationships in your patterns (e.g., a.age < b.age, where “a” and “b” bind to User nodes).
As syntactic sugar, openCypher allows equality predicates to be matched on
nodes and edges using the {prop1 : value1, prop2 : value2, ...} syntax.
A pattern written this way is equivalent to one that states the same equality predicates in the WHERE clause,
and both forms produce the same output.
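The two equivalent forms can be sketched side by side. The property values (2020 and 'Zhang') are illustrative assumptions, as is the since property being an integer year:

```cypher
// Inline form: equality predicates written inside the pattern.
MATCH (a:User)-[e:Follows {since: 2020}]->(b:User {name: 'Zhang'})
RETURN a, e.since, b.name;

// Equivalent form: the same predicates stated in WHERE.
MATCH (a:User)-[e:Follows]->(b:User)
WHERE e.since = 2020 AND b.name = 'Zhang'
RETURN a, e.since, b.name;
```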
Match recursive relationships
You can also find recursive relationships (that are of variable length) between node records.
Specifically, you can find variable-length
connections between nodes by specifying a hop range in the relationship pattern,
e.g., -[:Label*min..max]->, where min and max specify the minimum and the maximum number of hops [2].
The following query finds all Users that “Adam” follows within 1 to 2 hops and returns their names as well as the length of each path.
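A sketch of this query. It assumes the length function applies to the recursive relationship variable; the num_hops alias is illustrative:

```cypher
// "e" binds to a chain of 1 or 2 Follows relationships starting at Adam.
MATCH (a:User)-[e:Follows*1..2]->(b:User)
WHERE a.name = 'Adam'
RETURN b.name, length(e) AS num_hops;
```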
Output:
Karissa is found through Adam -> Karissa
Zhang is found through Adam -> Zhang and Adam -> Karissa -> Zhang
Noura is found through Adam -> Zhang -> Noura
Similar to matching relationships, you can match undirected recursive relationships or recursive relationships with multiple labels.
The following query finds all nodes other than “Noura” that connect to “Noura” in either direction through any relationship in exactly 2 hops.
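A sketch of this query, assuming both User and City have a name property so that the unlabelled node “b” can be filtered and returned by name:

```cypher
// No relationship label and no direction: any relationship, either way,
// with the hop range fixed to exactly 2.
MATCH (a:User)-[e*2..2]-(b)
WHERE a.name = 'Noura' AND b.name <> 'Noura'
RETURN b.name, length(e) AS num_hops;
```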
Output:
Adam is found through Noura <- Zhang -> Adam
Karissa is found through Noura <- Zhang -> Karissa
Kitchener is found through Noura <- Zhang -> Kitchener
Return recursive relationships
A recursive relationship has the logical data type RECURSIVE_REL and is physically represented as STRUCT{LIST[NODE], LIST[REL]}. Returning a recursive relationship returns all properties of the nodes and relationships it contains.
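Returning the recursive relationship variable itself might look like the following. The filter on “Adam” is an illustrative assumption:

```cypher
// "e" is returned as a whole, i.e. as its list of nodes and list of relationships.
MATCH (a:User)-[e:Follows*1..2]->(b:User)
WHERE a.name = 'Adam'
RETURN b.name, e;
```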
Output:
By default, recursive relationships follow WALK semantics, in which nodes and relationships can be visited repeatedly.
Kùzu also supports TRAIL and ACYCLIC semantics, which can be specified inside the recursive pattern after *.
A TRAIL is a walk in which all relationships are distinct.
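A sketch of a TRAIL query. The endpoints, the 4-hop range, and the returned columns are illustrative assumptions:

```cypher
// TRAIL placed after "*": no relationship may repeat within a matched path.
MATCH (a:User)-[e:Follows* TRAIL 4..4]-(b:User)
WHERE a.name = 'Zhang' AND b.name = 'Noura'
RETURN nodes(e), rels(e);
```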
Output:
The example above doesn’t include any recursive relationship that contains a repeated relationship internal ID.
An ACYCLIC path is a walk in which all nodes are distinct.
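The ACYCLIC variant differs only in the keyword. As with the TRAIL sketch, the endpoints and hop range here are illustrative assumptions:

```cypher
// ACYCLIC placed after "*": no node may repeat within a matched path.
MATCH (a:User)-[e:Follows* ACYCLIC 4..4]-(b:User)
WHERE a.name = 'Zhang' AND b.name = 'Noura'
RETURN nodes(e), rels(e);
```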
Output:
The example above doesn’t include any recursive relationship that contains a repeated node.
Filter recursive relationships
We also support running predicates on recursive relationships to constrain the relationships being traversed.
The following query finds the names of users reachable from Adam within 1 to 2 Follows hops, together with the number of such paths, where every intermediate user is older than 45 and every intermediate relationship started before 2022.
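A sketch of such a filtered traversal. The variable names r and n and the num_paths alias are illustrative; the first variable stands for intermediate relationships and the second for intermediate nodes:

```cypher
// The predicate after "|" is checked on every intermediate relationship (r)
// and every intermediate node (n) along the path.
MATCH p = (a:User)-[:Follows*1..2 (r, n | WHERE r.since < 2022 AND n.age > 45)]->(b:User)
WHERE a.name = 'Adam'
RETURN b.name, COUNT(*) AS num_paths;
```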
Output:
Our filter grammar is similar to that used by Memgraph,
for example, in Cypher list comprehensions. The first variable represents intermediate relationships and the second one represents intermediate nodes.
Project properties of intermediate nodes/relationships
You can project a subset of properties for the intermediate nodes and relationships that bind within a recursive
relationship. You can define the projection list of intermediate nodes and relationships within two curly brackets {}{} at
the end of the recursive relationship. The first {} is used for projecting relationship properties and the second {} for
node properties. Currently, Kùzu only supports projecting properties directly, not expressions that use
the properties. Projecting properties of intermediate nodes and relationships can improve both performance and memory footprint.
Below is an example that projects only the since property of the intermediate relationships and the name property of the
intermediate nodes that bind to the variable-length relationship pattern e. Readers can assume
that the Follows relationship table has properties other than since for our purposes (in our running example, the User nodes already have a second property, age, which is removed from the output).
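A sketch of a projected recursive pattern. The exact placement of the two projection lists after the filter, and the filter itself, are assumptions based on the description above and should be checked against the Kùzu version in use:

```cypher
// First {}: properties kept on intermediate relationships.
// Second {}: properties kept on intermediate nodes.
MATCH (a:User)-[e:Follows*1..2 (r, n | WHERE r.since > 2020 | {r.since}, {n.name})]->(b:User)
RETURN nodes(e), rels(e);
```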
Returns:
As can be seen in the output, the nodes that bind to e contain only the name property and the relationships that
bind to e contain only the since property.
Single shortest path
On top of variable-length relationships, you can search for a single shortest path by specifying the SHORTEST keyword in the relationship pattern, e.g., -[:Label* SHORTEST 1..max].
The following query finds a shortest path between Adam and any city and returns the city name as well as the length of the path.
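A sketch of a single-shortest-path query. The upper bound of 4 hops and the num_hops alias are illustrative assumptions:

```cypher
// SHORTEST after "*": keep one shortest path per reachable City.
MATCH (a:User)-[e* SHORTEST 1..4]->(b:City)
WHERE a.name = 'Adam'
RETURN b.name, length(e) AS num_hops;
```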
Output:
All shortest paths
You can also search for all shortest paths with the ALL SHORTEST keyword, e.g., -[:Label* ALL SHORTEST 1..max].
The following query finds all shortest paths between Zhang and Waterloo.
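A sketch of an all-shortest-paths query. The undirected pattern, the 3-hop upper bound, and the node labels are illustrative assumptions:

```cypher
// ALL SHORTEST after "*": keep every path of minimal length.
MATCH p = (a:User)-[* ALL SHORTEST 1..3]-(b:City)
WHERE a.name = 'Zhang' AND b.name = 'Waterloo'
RETURN p;
```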
Output:
Named paths
Kùzu treats paths as first-class citizens, so you can assign a named variable to a path (i.e., a connected graph) and use it later on.
The following query returns all paths between Adam and Karissa.
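A sketch of a named-path query, assuming the paths of interest are single Follows hops:

```cypher
// "p" names the whole matched path, nodes and relationship included.
MATCH p = (a:User)-[:Follows]->(b:User)
WHERE a.name = 'Adam' AND b.name = 'Karissa'
RETURN p;
```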
Output:
Named paths can also be assigned to recursive graph patterns.
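For instance, a path variable can cover a pattern that contains a variable-length part. The pattern and the filter below are illustrative assumptions:

```cypher
// "p" spans the recursive Follows part and the final LivesIn hop.
MATCH p = (a:User)-[:Follows*1..2]->(:User)-[:LivesIn]->(:City)
WHERE a.name = 'Adam'
RETURN p;
```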
Output:
Multiple named paths can appear in a single MATCH clause.
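A sketch with two named paths that share the node “b”. The patterns and the filter are illustrative assumptions:

```cypher
// p1 and p2 are separate path variables joined on "b".
MATCH p1 = (a:User)-[:Follows]->(b:User), p2 = (b)-[:LivesIn]->(:City)
WHERE a.name = 'Adam'
RETURN p1, p2;
```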
Output:
Extracting nodes and relationships from a path
A named path has the logical data type RECURSIVE_REL. You can access nodes and relationships within a named path through nodes(p) and rels(p) function calls.
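A sketch using the nodes(p) and rels(p) calls named above. The pattern and filter are illustrative assumptions:

```cypher
// nodes(p) yields the list of nodes on the path, rels(p) the list of relationships.
MATCH p = (a:User)-[:Follows]->(b:User)
WHERE a.name = 'Adam' AND b.name = 'Karissa'
RETURN nodes(p), rels(p);
```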
Output:
More recursive relationship functions can be found here.
Footnotes
[1] MATCH is similar to the FROM clause of SQL, where the list of tables that need to be joined is specified.
[2] The maximum number of hops is set to 30 if omitted. You can change this configuration through a SET statement.