Azure extension

The azure extension allows you to scan data from Azure Blob Storage and Azure Data Lake Storage (ADLS).

Usage

Unlike S3 and GCS support, which is provided as a feature of the httpfs extension, Azure support is implemented as a separate extension that you install and load explicitly:

INSTALL azure;
LOAD azure;

Configure the connection

Before reading from or writing to Azure, you must configure the connection using CALL statements.

CALL <OPTION_NAME> = <OPTION_VALUE>;

The following options are supported:

Option                     Description
azure_connection_string    Azure connection string
azure_account_name         Azure storage account name

Alternatively, you can set the following environment variables:

Environment variable       Description
AZURE_CONNECTION_STRING    Azure connection string
AZURE_ACCOUNT_NAME         Azure storage account name

At least one of the options must be set. Generally, azure_connection_string should be used. azure_account_name is only useful for connecting to a container that allows anonymous read access.
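The rule above (set at least one option, and prefer the connection string) can be sketched as a small Python helper that builds the CALL statement. This is an illustrative sketch only: the function name azure_config_statement is hypothetical, and the sample connection string uses the usual Azure key=value layout with placeholder values.

```python
def azure_config_statement(connection_string=None, account_name=None):
    """Build the CALL statement for one Azure option (hypothetical helper).

    Prefers the connection string, and falls back to the account name,
    which only helps for containers that allow anonymous read access.
    """
    if connection_string:
        option, value = "azure_connection_string", connection_string
    elif account_name:
        option, value = "azure_account_name", account_name
    else:
        raise ValueError("set a connection string or an account name")
    # Assumption: quote escaping is out of scope here, so such values are rejected.
    if "'" in value:
        raise ValueError("value must not contain a single quote")
    return f"CALL {option} = '{value}';"


# Placeholder credentials; substitute the values for your own storage account.
statement = azure_config_statement(
    connection_string="DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=PLACEHOLDER"
)
print(statement)
```

The resulting string can be run after LOAD azure; in whichever client you use, for example by passing it to Connection.execute in the Python API.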

Scan data from Azure

The extension supports both the az:// and abfss:// URI schemes. When scanning from ADLS, we recommend abfss://, which performs significantly better than az://.

For example, to scan from a container:

LOAD FROM 'az://container/path/to/file.csv'
RETURN *;

LOAD FROM 'abfss://container/path/to/file.csv'
RETURN *;

You can also use a fully qualified path:

LOAD FROM 'az://account_name.blob.core.windows.net/container/path/to/file.csv'
RETURN *;
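The URI forms shown above follow a simple pattern, which the following Python sketch assembles. The helper names azure_uri and load_query are hypothetical, and the fully qualified form is built only for the az scheme because that is the only variant shown here.

```python
def azure_uri(container, path, scheme="az", account_name=None):
    """Assemble an Azure URI in one of the forms shown above (hypothetical helper)."""
    if scheme not in ("az", "abfss"):
        raise ValueError("scheme must be 'az' or 'abfss'")
    parts = [container.strip("/"), path.lstrip("/")]
    if account_name is not None:
        # Only the az fully qualified form appears in this document.
        if scheme != "az":
            raise ValueError("fully qualified paths are sketched for 'az' only")
        parts.insert(0, f"{account_name}.blob.core.windows.net")
    return f"{scheme}://" + "/".join(parts)


def load_query(uri):
    """Wrap a URI in the LOAD FROM ... RETURN * query used in the examples."""
    return f"LOAD FROM '{uri}'\nRETURN *;"


short_uri = azure_uri("container", "path/to/file.csv")
adls_uri = azure_uri("container", "path/to/file.csv", scheme="abfss")
full_uri = azure_uri("container", "path/to/file.csv", account_name="account_name")
print(load_query(full_uri))
```

Keeping the scheme as a parameter makes it easy to switch a query from az to abfss when the data lives in ADLS.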

Glob data from Azure

You can glob data from Azure just as you would from a local file system.

For example, the following query scans the contents of all files in the tinysnb container whose names match the pattern vPerson*.csv:

LOAD FROM 'az://tinysnb/vPerson*.csv'
RETURN *;
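To see which blob names a pattern such as vPerson*.csv would select, the sketch below applies shell-style wildcard matching with Python's fnmatch module. It assumes the usual glob semantics, where * matches any run of characters including none. It illustrates the pattern only and is not the extension's own matcher, and the candidate file names are made up.

```python
from fnmatch import fnmatchcase

pattern = "vPerson*.csv"

# Hypothetical blob names in the container.
candidates = [
    "vPerson.csv",
    "vPerson2.csv",
    "vPersonNew.csv",
    "vMovies.csv",
    "vPerson.parquet",
]

# Keep only the names the wildcard pattern selects (case-sensitive match).
matched = [name for name in candidates if fnmatchcase(name, pattern)]
print(matched)
```

Here the first three names match, while vMovies.csv fails on the prefix and vPerson.parquet fails on the suffix.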