Azure extension
The azure extension allows you to scan data from Azure Blob Storage and Azure Data Lake Storage (ADLS).
Usage
Unlike S3 and GCS support, which is provided as a feature of the httpfs extension, Azure support is implemented as a separate extension, azure. Install and load it before use:
INSTALL azure;
LOAD azure;
Configure the connection
Before reading from or writing to Azure, you must configure the connection using CALL statements of the following form:
CALL <OPTION_NAME> = <OPTION_VALUE>;
The following options are supported:
Option | Description |
---|---|
azure_connection_string | Azure connection string |
azure_account_name | Azure storage account name |
Alternatively, you can set the following environment variables:
Environment variable | Description |
---|---|
AZURE_CONNECTION_STRING | Azure connection string |
AZURE_ACCOUNT_NAME | Azure storage account name |
At least one of the options must be set. Generally, azure_connection_string should be used. azure_account_name is only useful for connecting to a container that allows anonymous read access.
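As a minimal sketch of the two options above, the statements below follow the CALL <OPTION_NAME> = <OPTION_VALUE>; form. The values are placeholders, the account name mystorageaccount is hypothetical, and passing the value as a quoted string is an assumption.

```cypher
// Authenticated access: replace the placeholder with your own connection string.
CALL azure_connection_string = '<your-connection-string>';

// Anonymous read access only: hypothetical storage account name.
CALL azure_account_name = 'mystorageaccount';
```

In most cases only the first statement is needed; the second is an alternative for containers that allow anonymous reads.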
Scan data from Azure
The extension supports both the az and abfss URI schemes. We recommend using abfss when scanning from ADLS, as it gives much better performance.
For example, to scan from a container:
LOAD FROM 'az://container/path/to/file.csv' RETURN *;
LOAD FROM 'abfss://container/path/to/file.csv' RETURN *;
You can also use a fully qualified path:
LOAD FROM 'az://account_name.blob.core.windows.net/container/path/to/file.csv' RETURN *;
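Putting the steps so far together, a session might look like the following sketch. The connection string is a placeholder, and the container name my-container and the file path are hypothetical; quoting the option value as a string is an assumption.

```cypher
// Install and load the Azure extension.
INSTALL azure;
LOAD azure;

// Configure the connection (placeholder value; substitute your own).
CALL azure_connection_string = '<your-connection-string>';

// Scan a CSV file; abfss is the scheme recommended above for ADLS.
LOAD FROM 'abfss://my-container/path/to/file.csv'
RETURN *;
```

The connection must be configured before the LOAD FROM statement runs, since the scan depends on it.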
Glob data from Azure
You can glob data from Azure just as you would from a local file system.
For example, the following query scans the contents of all files matching the pattern vPerson*.csv.
LOAD FROM "az://tinysnb/vPerson*.csv" RETURN *;