Skip to content
Blog

Rust tutorial: Analyze a social network

This tutorial will get you started using Kuzu’s Rust API to analyze a dataset.

In this tutorial, we will use the following social network dataset that consists of users and their posts.

Setup

Ensure that you have Rust installed on your system. Create a new Cargo project with Kuzu as a dependency:

Terminal window
cargo new kuzu_social_network
cd kuzu_social_network
cargo add kuzu

Next, download the zipped data and unzip the files.

Terminal window
curl -o tutorial_data.zip https://rgw.cs.uwaterloo.ca/kuzu-test/tutorial/tutorial_data.zip
unzip tutorial_data.zip
rm tutorial_data.zip

Before we can start querying the data, we need to create a new Kuzu database and import the CSV files. Since this is a one-time operation, we will use a separate Cargo target.

Create a new file called create_db.rs inside src/bin:

mkdir src/bin
cp src/main.rs src/bin/create_db.rs

Your directory should now look something like this:

kuzu_social_network
├── Cargo.lock
├── Cargo.toml
├── src
│   ├── bin
│   │   └── create_db.rs
│   └── main.rs
└── tutorial_data
├── node
│   ├── post.csv
│   └── user.csv
└── relation
├── FOLLOWS.csv
├── LIKES.csv
└── POSTS.csv

Test the setup by running the Cargo target programs. Note that the first cargo run will take some time as it compiles Kuzu from scratch. The second and subsequent runs should be faster.

Terminal window
$ cargo run --bin kuzu_social_network
Finished `dev` profile [unoptimized + debuginfo] target(s) in 3m 18s
Running `target/debug/kuzu_social_network`
Hello, world!
$ cargo run --bin create_db
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.14s
Running `target/debug/create_db`
Hello, world!

We are now ready to start the tutorial.

Create the database

Replace the contents of src/bin/create_db.rs with the following code snippet:

use kuzu::{Connection, Database, Error, SystemConfig};
fn main() -> Result<(), Error> {
// Rest of the code goes here.
Ok(())
}

Create and connect to an empty Kuzu database:

let db = Database::new("./social_network_db", SystemConfig::default())?;
let conn = Connection::new(&db)?;

This will create a database directory ./social_network_db where the imported csv data will be stored.

Kuzu also supports an in-memory mode where the database is active only during the execution of the Rust program. For more information, refer to the example in the create your first graph section.

Define schema

The first step in building a Kuzu graph is to define a schema. For our example dataset, we need the following node and relationships tables:

conn.query(
"CREATE NODE TABLE User(
user_id INT64 PRIMARY KEY,
username STRING,
account_creation_date DATE
)",
)?;
conn.query(
"CREATE NODE TABLE Post(
post_id INT64 PRIMARY KEY,
post_date DATE,
like_count INT64,
retweet_count INT64
)",
)?;
conn.query(
"CREATE REL TABLE FOLLOWS(
FROM User TO User
)",
)?;
conn.query(
"CREATE REL TABLE POSTS(
FROM User TO Post
)",
)?;
conn.query(
"CREATE REL TABLE LIKES(
FROM User TO Post
)",
)?;

Import CSV data

We can now import the CSV data into the node and relationship tables:

conn.query("COPY User FROM './tutorial_data/node/user.csv'")?;
conn.query("COPY Post FROM './tutorial_data/node/post.csv'")?;
conn.query("COPY FOLLOWS FROM './tutorial_data/relation/FOLLOWS.csv'")?;
conn.query("COPY POSTS FROM './tutorial_data/relation/POSTS.csv'")?;
conn.query("COPY LIKES FROM './tutorial_data/relation/LIKES.csv'")?;

See the documentation on importing data for more details on the parameters for CSV import.

Show table information

Finally, we can check the schema of the newly created tables:

let mut result = conn.query("CALL SHOW_TABLES() RETURN *")?;
println!("{}", result.display());

Running the code

Run create_db.rs to create the Kuzu database and import the CSV files:

Terminal window
$ cargo run --bin create_db
id|name|type|database name|comment
0|User|NODE|local(kuzu)|
2|FOLLOWS|REL|local(kuzu)|
1|Post|NODE|local(kuzu)|
3|POSTS|REL|local(kuzu)|
4|LIKES|REL|local(kuzu)|

The CSV data has been copied into a new Kuzu database in ./social_network_db.

Query the graph

We can now start querying the graph to answer some questions about the social network.

Replace the contents of src/main.rs with the following code snippet:

use kuzu::{Connection, Database, Error, SystemConfig, Value};
fn main() -> Result<(), Error> {
let db = Database::new("./social_network_db", SystemConfig::default())?;
let conn = Connection::new(&db)?;
// Rest of the code goes here.
Ok(())
}

Then for each the following queries, type in the code and run the program as:

Terminal window
cargo run --bin kuzu_social_network

Q1: Which user has the most followers? And how many followers do they have?

First, we use a simple MATCH query to list some users who are folowed by some other user:

let mut result1 = conn.query(
"MATCH (u1:User)-[f:FOLLOWS]->(u2:User)
RETURN u2.username
LIMIT 5",
)?;
println!("{}", result1.display());

Returns:

u2.username
coolwolf752
stormfox762
stormninja678
darkdog878
brightninja683

For each relation FOLLOWS, this will simply return the followee’s username of that relation. Next, we will improve on this query by using aggregation.

Let’s count the number of appearances of a followee by using the count() function:

let mut result2 = conn.query(
"MATCH (u1:User)-[f:FOLLOWS]->(u2:User)
RETURN u2.username, count(u2) as follower_count
LIMIT 5",
)?;
println!("{}", result2.display());

Returns

u2.username|follower_count
coolwolf752|3
stormfox762|5
stormcat597|2
brightninja683|5
stormqueen831|4

This is a lot more useful! We can now clearly see how many followers each user has.

Lastly, we use ORDER BY and LIMIT to order the usernames by their followers count, and return only the first entry. This is the user we are looking for, including the count of how many followers that user has:

let mut result3 = conn.query(
"MATCH (u1:User)-[f:FOLLOWS]->(u2:User)
RETURN u2.username, count(u2) as follower_count
ORDER BY follower_count DESC
LIMIT 1",
)?;
println!("{}", result3.display());

Returns:

u2.username|follower_count
stormninja678|6

Q2: What is the shortest path between two users?

We might also be interested in finding the shortest path between two users. For example, how many follows does it take to get from user silentguy245 to epicwolf202?

  1. We can query this by using recursive match to find shortest path length:
let mut result4 = conn.query(
"MATCH p = (u1:user)-[f:FOLLOWS* SHORTEST 1..4]->(u2:User)
WHERE u1.username = 'silentguy245'
AND u2.username = 'epicwolf202'
RETURN p",
)?;
println!("{}", result4.display());

Returns:

{_NODES: [{_ID: 0:19, _LABEL: User, user_id: 20, username: silentguy245, account_creation_date: 2022-10-11},{_ID: 0:14, _LABEL: User, user_id: 15, username: mysticwolf198, account_creation_date: 2021-01-04},{_ID: 0:0, _LABEL: User, user_id: 1, username: epicwolf202, account_creation_date: 2022-09-09}], _RELS: [(0:19)-{_LABEL: FOLLOWS, _ID: 2:64}->(0:14),(0:14)-{_LABEL: FOLLOWS, _ID: 2:47}->(0:0)]}

This will return a result of the RECURSIVE_REL datatype, which as you can see is difficult to interpret.

To make this more useful, we will collect just the usernames in the path by using a recursive relationship function

let mut result5 = conn.query(
"MATCH p = (u1:user)-[f:FOLLOWS* SHORTEST 1..4]->(u2:User)
WHERE u1.username = 'silentguy245'
AND u2.username = 'epicwolf202'
RETURN properties(nodes(p), 'username')",
)?;
println!("{}", result5.display());

Returns:

PROPERTIES(NODES(p),username)
[silentguy245,mysticwolf198,epicwolf202]

This is much easier to read! We can now clearly see the shortest path between the two users with one user, mysticwolf198, in between them.

Q3: How many 3-hop paths exist between userA and userB that passes through userC?

To answer this query, we will follow a similar approach as Q1. We first count the number of 3-hop queries:

let mut result6 = conn.query(
"MATCH (u1:User)-[f1:FOLLOWS]->(u2:User)-[f2:Follows]->(u3:User)-[f3:FOLLOWS]->(u4:User)
RETURN count(u4)",
)?;
println!("{}", result6.display());

Returns:

COUNT(u4._ID)
667

This tells us that there are 667 paths of length 3 in the graph. Next, we use a prepared statement to construct queries involving particular users.

Run a prepared statement

We will now modify the query to only match paths with userA, userB, and userC. Here, we will use prepared statements to separate the query from its parameters. We arbitrarily choose userA to be epicwolf202, userB to be stormcat597, and userC to be stormfox762.

let query =
"MATCH (u1:User)-[f1:FOLLOWS]->(u2:User)-[f2:Follows]->(u3:User)-[f3:FOLLOWS]->(u4:User)
WHERE u1.username= $usersrc AND u4.username= $userdst AND (u2.username= $userint OR u3.username= $userint)
RETURN count(u4)";
let mut prepared_query = conn.prepare(query)?;
let u1 = "epicwolf202";
let u2 = "stormcat597";
let u3 = "stormfox762";
let params = vec![
("usersrc", Value::String(u1.to_string())),
("userdst", Value::String(u2.to_string())),
("userint", Value::String(u3.to_string())),
];
let mut result7 = conn.execute(&mut prepared_query, params)?;
println!("{}", result7.display());

Returns:

COUNT(u4._ID)
1

This tells us that there is exactly one path of length 3 between the three users specified.

Summary

In this tutorial, we’ve shown how to use Kuzu’s Rust API to import CSV data as a graph and query it using Cypher.

You’re now ready to import your own datasets into Kuzu and query them using Kuzu’s Rust API!

Code

Here’s the full code from the tutorial that you can use to follow along. Reminder that you can run the code as follows:

Terminal window
# Copy data from CSV files to the database
cargo run --bin create_db
# Run queries on the database
cargo run --bin kuzu_social_network
use kuzu::{Connection, Database, Error, SystemConfig};
fn main() -> Result<(), Error> {
let db = Database::new("./social_network_db", SystemConfig::default())?;
let conn = Connection::new(&db)?;
conn.query(
"CREATE NODE TABLE User(
user_id INT64 PRIMARY KEY,
username STRING,
account_creation_date DATE
)",
)?;
conn.query(
"CREATE NODE TABLE Post(
post_id INT64 PRIMARY KEY,
post_date DATE,
like_count INT64,
retweet_count INT64
)",
)?;
conn.query(
"CREATE REL TABLE FOLLOWS(
FROM User TO User
)",
)?;
conn.query(
"CREATE REL TABLE POSTS(
FROM User TO Post
)",
)?;
conn.query(
"CREATE REL TABLE LIKES(
FROM User TO Post
)",
)?;
conn.query("COPY User FROM './tutorial_data/node/user.csv'")?;
conn.query("COPY Post FROM './tutorial_data/node/post.csv'")?;
conn.query("COPY FOLLOWS FROM './tutorial_data/relation/FOLLOWS.csv'")?;
conn.query("COPY POSTS FROM './tutorial_data/relation/POSTS.csv'")?;
conn.query("COPY LIKES FROM './tutorial_data/relation/LIKES.csv'")?;
let mut result = conn.query("CALL SHOW_TABLES() RETURN *")?;
println!("{}", result.display());
Ok(())
}