Rust tutorial: Analyze a social network
This tutorial will get you started using Kuzu’s Rust API to analyze a dataset.
In this tutorial, we will use the following social network dataset that consists of users and their posts.
Setup
Ensure that you have Rust installed on your system. Create a new Cargo project with Kuzu as a dependency:
cargo new kuzu_social_networkcd kuzu_social_networkcargo add kuzu
Next, download the zipped data and unzip the files.
curl -o tutorial_data.zip https://rgw.cs.uwaterloo.ca/kuzu-test/tutorial/tutorial_data.zipunzip tutorial_data.ziprm tutorial_data.zip
Before we can start querying the data, we need to create a new Kuzu database and import the CSV files. Since this is a one-time operation, we will use a separate Cargo target.
Create a new file called create_db.rs
inside src/bin
:
mkdir src/bincp src/main.rs src/bin/create_db.rs
Your directory should now look something like this:
kuzu_social_network├── Cargo.lock├── Cargo.toml├── src│ ├── bin│ │ └── create_db.rs│ └── main.rs└── tutorial_data ├── node │ ├── post.csv │ └── user.csv └── relation ├── FOLLOWS.csv ├── LIKES.csv └── POSTS.csv
Test the setup by running the Cargo target programs. Note that the first
cargo run
will take some time as it compiles Kuzu from scratch.
The second and subsequent runs should be faster.
$ cargo run --bin kuzu_social_network Finished `dev` profile [unoptimized + debuginfo] target(s) in 3m 18s Running `target/debug/kuzu_social_network`Hello, world!
$ cargo run --bin create_db Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.14s Running `target/debug/create_db`Hello, world!
We are now ready to start the tutorial.
Create the database
Replace the contents of src/bin/create_db.rs
with the following code snippet:
use kuzu::{Connection, Database, Error, SystemConfig};
fn main() -> Result<(), Error> { // Rest of the code goes here.
Ok(())}
Create and connect to an empty Kuzu database:
let db = Database::new("./social_network_db", SystemConfig::default())?;let conn = Connection::new(&db)?;
This will create a database directory ./social_network_db
where the imported csv data will be stored.
Kuzu also supports an in-memory mode
where the database is active only during the execution of the Rust program. For more information, refer to the example in the create your first graph section.
Define schema
The first step in building a Kuzu graph is to define a schema. For our example dataset, we need the following node and relationships tables:
conn.query( "CREATE NODE TABLE User( user_id INT64 PRIMARY KEY, username STRING, account_creation_date DATE )",)?;conn.query( "CREATE NODE TABLE Post( post_id INT64 PRIMARY KEY, post_date DATE, like_count INT64, retweet_count INT64 )",)?;conn.query( "CREATE REL TABLE FOLLOWS( FROM User TO User )",)?;conn.query( "CREATE REL TABLE POSTS( FROM User TO Post )",)?;conn.query( "CREATE REL TABLE LIKES( FROM User TO Post )",)?;
Import CSV data
We can now import the CSV data into the node and relationship tables:
conn.query("COPY User FROM './tutorial_data/node/user.csv'")?;conn.query("COPY Post FROM './tutorial_data/node/post.csv'")?;conn.query("COPY FOLLOWS FROM './tutorial_data/relation/FOLLOWS.csv'")?;conn.query("COPY POSTS FROM './tutorial_data/relation/POSTS.csv'")?;conn.query("COPY LIKES FROM './tutorial_data/relation/LIKES.csv'")?;
See the documentation on importing data for more details on the parameters for CSV import.
Show table information
Finally, we can check the schema of the newly created tables:
let mut result = conn.query("CALL SHOW_TABLES() RETURN *")?;println!("{}", result.display());
Running the code
Run create_db.rs
to create the Kuzu database and import the CSV files:
$ cargo run --bin create_dbid|name|type|database name|comment0|User|NODE|local(kuzu)|2|FOLLOWS|REL|local(kuzu)|1|Post|NODE|local(kuzu)|3|POSTS|REL|local(kuzu)|4|LIKES|REL|local(kuzu)|
The CSV data has been copied into a new Kuzu database in ./social_network_db
.
Query the graph
We can now start querying the graph to answer some questions about the social network.
Replace the contents of src/main.rs
with the following code snippet:
use kuzu::{Connection, Database, Error, SystemConfig, Value};
fn main() -> Result<(), Error> { let db = Database::new("./social_network_db", SystemConfig::default())?; let conn = Connection::new(&db)?;
// Rest of the code goes here.
Ok(())}
Then for each the following queries, type in the code and run the program as:
cargo run --bin kuzu_social_network
Q1: Which user has the most followers? And how many followers do they have?
First, we use a simple MATCH
query to list some users who are folowed by some other user:
let mut result1 = conn.query( "MATCH (u1:User)-[f:FOLLOWS]->(u2:User) RETURN u2.username LIMIT 5",)?;println!("{}", result1.display());
Returns:
u2.usernamecoolwolf752stormfox762stormninja678darkdog878brightninja683
For each relation FOLLOWS
, this will simply return the followee’s username of that relation. Next, we will improve on this query by using aggregation.
Let’s count the number of appearances of a followee by using the count()
function:
let mut result2 = conn.query( "MATCH (u1:User)-[f:FOLLOWS]->(u2:User) RETURN u2.username, count(u2) as follower_count LIMIT 5",)?;println!("{}", result2.display());
Returns
u2.username|follower_countcoolwolf752|3stormfox762|5stormcat597|2brightninja683|5stormqueen831|4
This is a lot more useful! We can now clearly see how many followers each user has.
Lastly, we use ORDER BY
and LIMIT
to order the usernames by their followers count, and
return only the first entry. This is the user we are looking for, including the count of how many followers that user has:
let mut result3 = conn.query( "MATCH (u1:User)-[f:FOLLOWS]->(u2:User) RETURN u2.username, count(u2) as follower_count ORDER BY follower_count DESC LIMIT 1",)?;println!("{}", result3.display());
Returns:
u2.username|follower_countstormninja678|6
Q2: What is the shortest path between two users?
We might also be interested in finding the shortest path between two users.
For example, how many follows does it take to get from user silentguy245
to epicwolf202
?
- We can query this by using recursive match to find shortest path length:
let mut result4 = conn.query( "MATCH p = (u1:user)-[f:FOLLOWS* SHORTEST 1..4]->(u2:User) WHERE u1.username = 'silentguy245' AND u2.username = 'epicwolf202' RETURN p",)?;println!("{}", result4.display());
Returns:
{_NODES: [{_ID: 0:19, _LABEL: User, user_id: 20, username: silentguy245, account_creation_date: 2022-10-11},{_ID: 0:14, _LABEL: User, user_id: 15, username: mysticwolf198, account_creation_date: 2021-01-04},{_ID: 0:0, _LABEL: User, user_id: 1, username: epicwolf202, account_creation_date: 2022-09-09}], _RELS: [(0:19)-{_LABEL: FOLLOWS, _ID: 2:64}->(0:14),(0:14)-{_LABEL: FOLLOWS, _ID: 2:47}->(0:0)]}
This will return a result of the RECURSIVE_REL
datatype, which as you can see is difficult to interpret.
To make this more useful, we will collect just the usernames in the path by using a recursive relationship function
let mut result5 = conn.query( "MATCH p = (u1:user)-[f:FOLLOWS* SHORTEST 1..4]->(u2:User) WHERE u1.username = 'silentguy245' AND u2.username = 'epicwolf202' RETURN properties(nodes(p), 'username')",)?;println!("{}", result5.display());
Returns:
PROPERTIES(NODES(p),username)[silentguy245,mysticwolf198,epicwolf202]
This is much easier to read! We can now clearly see the shortest path between the two users with one
user, mysticwolf198
, in between them.
Q3: How many 3-hop paths exist between userA and userB that passes through userC?
To answer this query, we will follow a similar approach as Q1. We first count the number of 3-hop queries:
let mut result6 = conn.query( "MATCH (u1:User)-[f1:FOLLOWS]->(u2:User)-[f2:Follows]->(u3:User)-[f3:FOLLOWS]->(u4:User) RETURN count(u4)",)?;println!("{}", result6.display());
Returns:
COUNT(u4._ID)667
This tells us that there are 667 paths of length 3 in the graph. Next, we use a prepared statement to construct queries involving particular users.
Run a prepared statement
We will now modify the query to only match paths with userA, userB, and userC.
Here, we will use prepared statements to separate the query from its parameters.
We arbitrarily choose userA to be epicwolf202
, userB to be stormcat597
, and userC to be stormfox762
.
let query = "MATCH (u1:User)-[f1:FOLLOWS]->(u2:User)-[f2:Follows]->(u3:User)-[f3:FOLLOWS]->(u4:User) WHERE u1.username= $usersrc AND u4.username= $userdst AND (u2.username= $userint OR u3.username= $userint) RETURN count(u4)";let mut prepared_query = conn.prepare(query)?;let u1 = "epicwolf202";let u2 = "stormcat597";let u3 = "stormfox762";let params = vec![ ("usersrc", Value::String(u1.to_string())), ("userdst", Value::String(u2.to_string())), ("userint", Value::String(u3.to_string())),];let mut result7 = conn.execute(&mut prepared_query, params)?;println!("{}", result7.display());
Returns:
COUNT(u4._ID)1
This tells us that there is exactly one path of length 3 between the three users specified.
Summary
In this tutorial, we’ve shown how to use Kuzu’s Rust API to import CSV data as a graph and query it using Cypher.
You’re now ready to import your own datasets into Kuzu and query them using Kuzu’s Rust API!
Code
Here’s the full code from the tutorial that you can use to follow along. Reminder that you can run the code as follows:
# Copy data from CSV files to the databasecargo run --bin create_db
# Run queries on the databasecargo run --bin kuzu_social_network
use kuzu::{Connection, Database, Error, SystemConfig};
fn main() -> Result<(), Error> { let db = Database::new("./social_network_db", SystemConfig::default())?; let conn = Connection::new(&db)?;
conn.query( "CREATE NODE TABLE User( user_id INT64 PRIMARY KEY, username STRING, account_creation_date DATE )", )?; conn.query( "CREATE NODE TABLE Post( post_id INT64 PRIMARY KEY, post_date DATE, like_count INT64, retweet_count INT64 )", )?; conn.query( "CREATE REL TABLE FOLLOWS( FROM User TO User )", )?; conn.query( "CREATE REL TABLE POSTS( FROM User TO Post )", )?; conn.query( "CREATE REL TABLE LIKES( FROM User TO Post )", )?;
conn.query("COPY User FROM './tutorial_data/node/user.csv'")?; conn.query("COPY Post FROM './tutorial_data/node/post.csv'")?; conn.query("COPY FOLLOWS FROM './tutorial_data/relation/FOLLOWS.csv'")?; conn.query("COPY POSTS FROM './tutorial_data/relation/POSTS.csv'")?; conn.query("COPY LIKES FROM './tutorial_data/relation/LIKES.csv'")?;
let mut result = conn.query("CALL SHOW_TABLES() RETURN *")?; println!("{}", result.display());
Ok(())}
use kuzu::{Connection, Database, Error, SystemConfig, Value};
fn main() -> Result<(), Error> { let db = Database::new("./social_network_db", SystemConfig::default())?; let conn = Connection::new(&db)?;
let mut result1 = conn.query( "MATCH (u1:User)-[f:FOLLOWS]->(u2:User) RETURN u2.username LIMIT 5", )?; println!("{}", result1.display());
let mut result2 = conn.query( "MATCH (u1:User)-[f:FOLLOWS]->(u2:User) RETURN u2.username, count(u2) as follower_count LIMIT 5", )?; println!("{}", result2.display());
let mut result3 = conn.query( "MATCH (u1:User)-[f:FOLLOWS]->(u2:User) RETURN u2.username, count(u2) as follower_count ORDER BY follower_count DESC LIMIT 1", )?; println!("{}", result3.display());
let mut result4 = conn.query( "MATCH p = (u1:user)-[f:FOLLOWS* SHORTEST 1..4]->(u2:User) WHERE u1.username = 'silentguy245' AND u2.username = 'epicwolf202' RETURN p", )?; println!("{}", result4.display());
let mut result5 = conn.query( "MATCH p = (u1:user)-[f:FOLLOWS* SHORTEST 1..4]->(u2:User) WHERE u1.username = 'silentguy245' AND u2.username = 'epicwolf202' RETURN properties(nodes(p), 'username')", )?; println!("{}", result5.display());
let mut result6 = conn.query( "MATCH (u1:User)-[f1:FOLLOWS]->(u2:User)-[f2:Follows]->(u3:User)-[f3:FOLLOWS]->(u4:User) RETURN count(u4)", )?; println!("{}", result6.display());
let query = "MATCH (u1:User)-[f1:FOLLOWS]->(u2:User)-[f2:Follows]->(u3:User)-[f3:FOLLOWS]->(u4:User) WHERE u1.username= $usersrc AND u4.username= $userdst AND (u2.username= $userint OR u3.username= $userint) RETURN count(u4)"; let mut prepared_query = conn.prepare(query)?; let u1 = "epicwolf202"; let u2 = "stormcat597"; let u3 = "stormfox762"; let params = vec![ ("usersrc", Value::String(u1.to_string())), ("userdst", Value::String(u2.to_string())), ("userint", Value::String(u3.to_string())), ]; let mut result7 = conn.execute(&mut prepared_query, params)?; println!("{}", result7.display());
Ok(())}