Strongly Connected Components
A strongly connected component (SCC) in a directed graph is a maximal subgraph where every pair of
vertices is mutually reachable - meaning for any two nodes u
and v
in the component, there exists a
directed path from u
to v
and a directed path from v
to u
.
This mutual reachability property means that nodes within an SCC form a cycle where they can all
reach each other through directed paths, and no additional nodes can be added to the component while
maintaining this property.
Kuzu implements two types of SCC algorithms:
- Parallel BFS-based coloring algorithm
- DFS-based Kosaraju’s algorithm
You can choose between either of the two algorithms, depending on your use case. See below for more details.
Syntax
// BFS-based algorithmCALL strongly_connected_components( graph_name, maxIterations := 100)RETURN node, group_id
Alias: scc
// DFS-based algorithm (Kosaraju's algorithm)CALL strongly_connected_components_kosaraju( graph_name)RETURN node, group_id
Alias: scc_ko
Parameters
Parameter | Type | Optional | Default | Description |
---|---|---|---|---|
graph_name | STRING | No | - | Name of the projected graph to run the algorithm on |
maxIterations | INT64 | Yes | 100 | Maximum number of iterations to run (This parameter is only for BFS-based algorithm) |
Return
Column | Type | Description |
---|---|---|
node | NODE | Node object |
group_id | INT64 | The ID of the SCC that the node belongs to |
Example
Define schema
CREATE NODE TABLE Node(id STRING PRIMARY KEY);CREATE REL TABLE Edge(FROM Node to Node);
Insert nodes and edges
CREATE (u0:Node {id: 'A'}), (u1:Node {id: 'B'}), (u2:Node {id: 'C'}), (u3:Node {id: 'D'}), (u4:Node {id: 'E'}), (u5:Node {id: 'F'}), (u6:Node {id: 'G'}), (u7:Node {id: 'H'}), (u8:Node {id: 'I'}), (u0)-[:Edge]->(u1), (u1)-[:Edge]->(u2), (u1)-[:Edge]->(u4), (u2)-[:Edge]->(u2), (u2)-[:Edge]->(u5), (u3)-[:Edge]->(u1), (u3)-[:Edge]->(u6), (u4)-[:Edge]->(u0), (u4)-[:Edge]->(u5), (u4)-[:Edge]->(u3), (u5)-[:Edge]->(u7), (u6)-[:Edge]->(u4), (u6)-[:Edge]->(u5), (u7)-[:Edge]->(u8), (u8)-[:Edge]->(u5);
We first create a projected graph from the node and edge tables.
CALL project_graph('Graph', ['Node'], ['Edge']);
Then, you can run SCC using either of the two algorithms.
// BFS-based algorithmCALL strongly_connected_components('Graph')RETURN group_id, collect(node.id);
// DFS-based algorithm (Kosaraju's algorithm)CALL strongly_connected_components_kosaraju('Graph')RETURN group_id, collect(node.id);
┌──────────┬──────────────────┐│ group_id │ COLLECT(node.id) ││ INT64 │ STRING[] │├──────────┼──────────────────┤│ 3 │ [G,E,D,A,B] ││ 0 │ [I,F,H] ││ 8 │ [C] │└──────────┴──────────────────┘
Frequently asked questions
What’s the difference between scc
and scc_ko
?
scc
is a parallel BFS-based algorithm. While scc_ko
is a DFS-based algorithm.
Due to the nature of DFS, scc_ko
can only run in single-threaded mode.
Kosaraju’s algorithm is recommended in the following cases:
- If your graph is very sparse, i.e. with a large number of nodes but very small number of edges; or
- If your graph has a very high diameter
How is group_id
assigned?
group_id
is assigned based on Kuzu’s internal node offsets. Currently there is no way to assign group_id
based on node properties.