Import NumPy
The .npy format is the standard binary file format in NumPy for persisting a single arbitrary NumPy array on disk.
The primary use case for bulk loading NumPy files is to load large node features or vectors that are stored in .npy format. You can use the COPY FROM statement to import a set of *.npy files into a node table.
Import to node table
Consider a Paper table with an id column, a feature column that is an embedding (vector) with 768 dimensions, a year column, and a label column as ground truth. We first define the schema with the following statement:
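A minimal sketch of such a schema, built as a statement string in Python so it can be handed to a client. The column types (INT64, FLOAT[768], DOUBLE), the column name feat, and the choice of id as primary key are assumptions for illustration; match them to the dtypes of your arrays.

```python
# Hypothetical column definitions for the Paper table described above.
# The types are assumptions; align them with the dtypes stored in your .npy files.
PAPER_COLUMNS = [
    ("id", "INT64"),         # node identifier, assumed to be the primary key
    ("feat", "FLOAT[768]"),  # 768-dimensional embedding
    ("year", "INT64"),
    ("label", "DOUBLE"),     # ground-truth label
]


def create_node_table_statement(table, columns, primary_key):
    """Build a CREATE NODE TABLE statement from (name, type) pairs."""
    column_defs = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return f"CREATE NODE TABLE {table}({column_defs}, PRIMARY KEY({primary_key}));"


CREATE_PAPER = create_node_table_statement("Paper", PAPER_COLUMNS, "id")
print(CREATE_PAPER)
```

The resulting string can then be passed to whichever client you use to run statements.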
The raw data is stored in .npy format, where each column is represented as a NumPy array on disk. The files are specified below:
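Before loading, it can help to confirm what each file contains. The sketch below reads only the .npy header (magic string, format version, header length, and a dictionary holding descr and shape), so it stays cheap even for large arrays. The file names in the usage comment are hypothetical, and the check that every column file has the same number of rows is an assumption about what a column-wise import presumably needs.

```python
import ast
import struct


def read_npy_header(path):
    """Return (dtype descriptor, shape) from a .npy file without loading the data."""
    with open(path, "rb") as f:
        if f.read(6) != b"\x93NUMPY":
            raise ValueError(f"{path} is not a .npy file")
        major, _minor = f.read(2)
        # Format version 1.x stores the header length in 2 bytes, later versions in 4.
        length_format = "<H" if major == 1 else "<I"
        raw_length = f.read(struct.calcsize(length_format))
        (header_len,) = struct.unpack(length_format, raw_length)
        # The header is a Python dict literal padded with spaces and a newline.
        header = ast.literal_eval(f.read(header_len).decode("latin1").strip())
    return header["descr"], header["shape"]


def common_row_count(paths):
    """Return the shared row count of all files, or raise if they disagree.

    Assumes every array has at least one dimension.
    """
    counts = {path: read_npy_header(path)[1][0] for path in paths}
    if len(set(counts.values())) != 1:
        raise ValueError(f"row counts differ across files: {counts}")
    return next(iter(counts.values()))


# Usage with hypothetical file names, one file per column:
# common_row_count(["node_id.npy", "node_feat_f32.npy", "node_year.npy", "node_label.npy"])
```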
We can copy the files with the following statement:
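A minimal sketch of such a statement, again built as a string in Python. The file names are hypothetical, and the parenthesised file list with a trailing BY COLUMN clause is an assumption about the COPY FROM syntax that should be checked against the documentation for your version. The helper emits the files in the column order of the table definition, whatever order the mapping was written in.

```python
# Column order as defined in the DDL (hypothetical names for illustration).
PAPER_COLUMN_ORDER = ["id", "feat", "year", "label"]

# Hypothetical .npy files, one per column. The dict order does not matter here.
PAPER_FILES = {
    "year": "node_year.npy",
    "id": "node_id.npy",
    "feat": "node_feat_f32.npy",
    "label": "node_label.npy",
}


def copy_from_npy_statement(table, column_order, files_by_column):
    """Build a COPY statement listing exactly one .npy file per column, in DDL order."""
    missing = [c for c in column_order if c not in files_by_column]
    extra = [c for c in files_by_column if c not in column_order]
    if missing or extra:
        raise ValueError(f"missing files for {missing}, unexpected files for {extra}")
    file_list = ", ".join(f'"{files_by_column[c]}"' for c in column_order)
    # The BY COLUMN clause is an assumed part of the syntax; verify before use.
    return f"COPY {table} FROM ({file_list}) BY COLUMN;"


COPY_PAPER = copy_from_npy_statement("Paper", PAPER_COLUMN_ORDER, PAPER_FILES)
print(COPY_PAPER)
```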
As stated before, the number of *.npy files must equal the number of columns, and the files must be specified in the same order as the columns are defined in the DDL.