Unity Catalog extension
This extension lets you scan Delta tables registered in a Unity Catalog directly, using the LOAD FROM statement.
Usage
INSTALL unity_catalog;
LOAD unity_catalog;
Set up a Unity Catalog server
First, set up the open-source version of Unity Catalog:
git clone https://github.com/unitycatalog/unitycatalog.git
cd unitycatalog
bin/start-uc-server
Attach to a Unity Catalog
ATTACH [CATALOG_NAME] AS [alias] (dbtype UC_CATALOG)
CATALOG_NAME
: The name of the catalog to attach to in the Unity Catalog.
alias
: Database alias to use in Kuzu. If not provided, the catalog name will be used. When attaching multiple databases, it’s recommended to use aliases.
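To make the syntax above concrete, here is a minimal Python sketch that assembles the ATTACH statement as a string. The helper name build_attach_statement is hypothetical and not part of Kuzu. It assumes the catalog name is written as a quoted string, as in the examples on this page, and that omitting the alias simply drops the AS clause.

```python
from typing import Optional


def build_attach_statement(catalog_name: str, alias: Optional[str] = None) -> str:
    """Build an ATTACH statement for a Unity Catalog (hypothetical helper)."""
    # Reject quotes rather than guess at escaping rules for string literals.
    if "'" in catalog_name:
        raise ValueError("catalog name must not contain single quotes")
    # The alias is optional; without it, the catalog name is used as the alias.
    alias_clause = f" AS {alias}" if alias else ""
    return f"ATTACH '{catalog_name}'{alias_clause} (dbtype UC_CATALOG);"


if __name__ == "__main__":
    print(build_attach_statement("unity", alias="unity"))
    print(build_attach_statement("unity"))
```

Generating the statement this way is mainly useful when attaching several catalogs in a script, where giving each one an explicit alias avoids name clashes.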
Unity Catalog to Kuzu type mapping
The table below shows the mapping from Unity Catalog types to Kuzu types:
Data type in Unity Catalog | Corresponding data type in Kuzu |
---|---|
BOOLEAN | BOOLEAN |
BYTE | Unsupported |
SHORT | INT16 |
INT | INT32 |
LONG | INT64 |
DOUBLE | DOUBLE |
FLOAT | FLOAT |
DATE | DATE |
TIMESTAMP | TIMESTAMP |
TIMESTAMP_NTZ | Unsupported |
STRING | STRING |
BINARY | Unsupported |
DECIMAL | DECIMAL |
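
The mapping above can also be expressed as a lookup table, which is handy when you need to generate a matching Kuzu schema for an external table. The following Python sketch is illustrative only: the names UC_TO_KUZU, kuzu_type_for and create_node_table_ddl are hypothetical, and parameterized types such as DECIMAL with precision and scale are not handled.

```python
from typing import Dict, Optional

# Unity Catalog type -> Kuzu type, copied from the table above.
# None marks a type the table lists as unsupported.
UC_TO_KUZU: Dict[str, Optional[str]] = {
    "BOOLEAN": "BOOLEAN",
    "BYTE": None,
    "SHORT": "INT16",
    "INT": "INT32",
    "LONG": "INT64",
    "DOUBLE": "DOUBLE",
    "FLOAT": "FLOAT",
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMP",
    "TIMESTAMP_NTZ": None,
    "STRING": "STRING",
    "BINARY": None,
    "DECIMAL": "DECIMAL",
}


def kuzu_type_for(uc_type: str) -> str:
    """Return the Kuzu type for a Unity Catalog type name."""
    key = uc_type.strip().upper()
    if key not in UC_TO_KUZU:
        raise KeyError(f"unknown Unity Catalog type: {uc_type}")
    kuzu_type = UC_TO_KUZU[key]
    if kuzu_type is None:
        raise ValueError(f"Unity Catalog type {key} is unsupported in Kuzu")
    return kuzu_type


def create_node_table_ddl(table: str, columns: Dict[str, str], primary_key: str) -> str:
    """Build a CREATE NODE TABLE statement from {column name: Unity Catalog type}."""
    if primary_key not in columns:
        raise ValueError(f"primary key {primary_key!r} is not a column")
    parts = []
    for name, uc_type in columns.items():
        suffix = " PRIMARY KEY" if name == primary_key else ""
        parts.append(f"{name} {kuzu_type_for(uc_type)}{suffix}")
    return f"CREATE NODE TABLE {table} ({', '.join(parts)});"


if __name__ == "__main__":
    print(create_node_table_ddl("numbers", {"id": "INT", "score": "DOUBLE"}, "id"))
```

Raising an error for unsupported types, rather than silently skipping the column, makes a schema mismatch visible before any data is copied.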
Scan data from a Unity Catalog table
You can use the LOAD FROM statement to scan the numbers table. Note that you need to prefix the external numbers table with the database alias (in our example, unity).
LOAD FROM unity.numbers
RETURN *;
┌────────┬────────────┐
│ as_int │ as_double  │
│ INT32  │ DOUBLE     │
├────────┼────────────┤
│ 564    │ 188.755356 │
│ 755    │ 883.610563 │
│ 644    │ 203.439559 │
│ 75     │ 277.880219 │
│ 42     │ 403.857969 │
│ 680    │ 797.691220 │
│ 821    │ 767.799854 │
│ 484    │ 344.003740 │
│ 477    │ 380.678561 │
│ 131    │ 35.443732  │
│ 294    │ 209.322436 │
│ 150    │ 329.197303 │
│ 539    │ 425.661029 │
│ 247    │ 477.742227 │
│ 958    │ 509.371273 │
└────────┴────────────┘
Use a default Unity Catalog name
After attaching a Unity Catalog, you can make it the default database with the USE statement, so that you don't have to prefix every table name with the catalog alias.
For example, for the Unity Catalog above:
ATTACH 'unity' AS unity (dbtype UC_CATALOG);
USE unity;
LOAD FROM numbers
RETURN *;
Copy a Unity Catalog table into Kuzu
You can use the COPY FROM statement to import data from a Unity Catalog table into Kuzu.
First, create a numbers table in Kuzu with the same schema as the one defined in the Unity Catalog.
CREATE NODE TABLE numbers (id INT32 PRIMARY KEY, score DOUBLE);
Then, copy the data from the external Unity Catalog table to the Kuzu table.
COPY numbers FROM unity.numbers;
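You can also use a subquery to choose explicitly which columns are copied. The subquery must refer to the external table's own column names (as_int and as_double) and return one value for each column of the Kuzu table:
COPY numbers FROM (LOAD FROM unity.numbers RETURN as_int, as_double);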
You can verify that the data has been copied successfully:
MATCH (n:numbers) RETURN n.*;
┌───────┬────────────┐
│ n.id  │ n.score    │
│ INT32 │ DOUBLE     │
├───────┼────────────┤
│ 564   │ 188.755356 │
│ 755   │ 883.610563 │
│ 644   │ 203.439559 │
│ 75    │ 277.880219 │
│ 42    │ 403.857969 │
│ 680   │ 797.691220 │
│ 821   │ 767.799854 │
│ 484   │ 344.003740 │
│ 477   │ 380.678561 │
│ 131   │ 35.443732  │
│ 294   │ 209.322436 │
│ 150   │ 329.197303 │
│ 539   │ 425.661029 │
│ 247   │ 477.742227 │
│ 958   │ 509.371273 │
└───────┴────────────┘
Detach a Unity Catalog
DETACH unity;
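
The steps on this page can be run in sequence from a script. The Python sketch below is one way to do that, under stated assumptions: WORKFLOW and run_workflow are hypothetical names, the statements are the ones shown above, and the connection is any object with an execute(query) method. With the kuzu Python package, that would be a kuzu.Connection created from a kuzu.Database, with a Unity Catalog server running and a numbers table registered in it.

```python
from typing import List, Protocol


class SupportsExecute(Protocol):
    """Anything with an execute(query) method, such as a Kuzu connection."""

    def execute(self, query: str): ...


# The statements from this page, in the order they need to run.
WORKFLOW: List[str] = [
    "INSTALL unity_catalog;",
    "LOAD unity_catalog;",
    "ATTACH 'unity' AS unity (dbtype UC_CATALOG);",
    "CREATE NODE TABLE numbers (id INT32 PRIMARY KEY, score DOUBLE);",
    "COPY numbers FROM unity.numbers;",
    "DETACH unity;",
]


def run_workflow(conn: SupportsExecute) -> None:
    """Execute each statement in order; an error stops the run at that step."""
    for statement in WORKFLOW:
        conn.execute(statement)


class PrintingConnection:
    """Stand-in connection that only prints, so the sketch runs without Kuzu."""

    def execute(self, query: str) -> None:
        print(query)


if __name__ == "__main__":
    run_workflow(PrintingConnection())
```

Detaching at the end keeps the external catalog attached only for as long as the copy needs it; the copied data remains in the Kuzu numbers table.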