Pangolin's credential vending and REST support extend beyond AWS to Azure Blob Storage (ADLS Gen2) and Google Cloud Storage (GCS).
Install PyIceberg with the adlfs and pyarrow extras:

    pip install "pyiceberg[adlfs,pyarrow]"

When using Azure, the catalog URI remains the same, but the storage properties change.

    from pyiceberg.catalog import load_catalog

    catalog = load_catalog(
        "azure_catalog",
        **{
            "type": "rest",
            "uri": "http://localhost:8080/v1/azure_catalog",
            "token": "YOUR_JWT_TOKEN",
            # Enable Vending
            "header.X-Iceberg-Access-Delegation": "vended-credentials",
            # Optional: Direct Key Access
            # "adls.account-name": "mystorageaccount",
            # "adls.account-key": "YOUR_ACCOUNT_KEY"
        }
    )

| PyIceberg Property | Description | Vended by Pangolin |
|---|---|---|
| `adls.token` | OAuth2 access token for Azure AD authentication. | ✅ Yes (OAuth2 mode) |
| `adls.account-name` | Azure storage account name. | ✅ Yes (all modes) |
| `adls.account-key` | Azure storage account key. | ✅ Yes (account key mode) |
| `adls.container` | Container name within the storage account. | ✅ Yes (all modes) |

Note: When using Pangolin's credential vending, you don't need to provide these properties manually. Pangolin vends them automatically based on your warehouse configuration.
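The choice between vended credentials and direct key access can be captured in a small helper. This is a minimal sketch, not part of Pangolin or PyIceberg: `azure_catalog_properties` is a hypothetical function, and treating the two modes as mutually exclusive is an assumption of the example. The property names are the ones used in the Azure configuration above.

```python
from typing import Dict, Optional


def azure_catalog_properties(
    uri: str,
    token: str,
    account_name: Optional[str] = None,
    account_key: Optional[str] = None,
) -> Dict[str, str]:
    """Build REST catalog properties for an Azure-backed catalog.

    Hypothetical helper: with no storage credentials it requests vended
    credentials, otherwise it passes the storage account key directly.
    """
    props = {"type": "rest", "uri": uri, "token": token}
    if account_name is None and account_key is None:
        # Ask the catalog to vend storage credentials.
        props["header.X-Iceberg-Access-Delegation"] = "vended-credentials"
    elif account_name and account_key:
        # Direct key access: the client supplies its own storage credentials.
        props["adls.account-name"] = account_name
        props["adls.account-key"] = account_key
    else:
        raise ValueError("account_name and account_key must be given together")
    return props


vended = azure_catalog_properties(
    "http://localhost:8080/v1/azure_catalog", "YOUR_JWT_TOKEN"
)
direct = azure_catalog_properties(
    "http://localhost:8080/v1/azure_catalog",
    "YOUR_JWT_TOKEN",
    account_name="mystorageaccount",
    account_key="YOUR_ACCOUNT_KEY",
)
```

Either dictionary could then be passed on as `load_catalog("azure_catalog", **vended)`.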
For GCS, install PyIceberg with the gcsfs and pyarrow extras:

    pip install "pyiceberg[gcsfs,pyarrow]"

Then configure the catalog:

    from pyiceberg.catalog import load_catalog

    catalog = load_catalog(
        "gcp_catalog",
        **{
            "type": "rest",
            "uri": "http://localhost:8080/v1/gcp_catalog",
            "token": "YOUR_JWT_TOKEN",
            # Enable Vending
            "header.X-Iceberg-Access-Delegation": "vended-credentials",
            # Optional: Direct Service Account Access
            # "gcs.project-id": "my-project",
            # "gcs.service-account-key": "/path/to/key.json"
        }
    )

| PyIceberg Property | Description | Vended by Pangolin |
|---|---|---|
| `gcp-oauth-token` | OAuth2 access token for GCP authentication. | ✅ Yes (OAuth2 mode) |
| `gcp-project-id` | GCP project ID. | ✅ Yes (all modes) |
| `gcs.service-account-key` | Path to service account JSON key (client-provided). | ❌ No (client manages) |

Note: When using Pangolin's credential vending, you only need to provide `token` for authentication. Pangolin vends `gcp-oauth-token` and `gcp-project-id` automatically.
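Because the JWT is the only secret the client supplies in vending mode, it may be preferable to read it from the environment rather than hard-coding it. A minimal sketch, assuming a hypothetical environment variable named `PANGOLIN_TOKEN` (this name is an illustration, not something Pangolin defines):

```python
import os
from typing import Dict, Mapping


def gcp_catalog_properties(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build vended-credential properties for the GCS-backed catalog."""
    token = env.get("PANGOLIN_TOKEN")  # hypothetical variable name
    if not token:
        raise RuntimeError("PANGOLIN_TOKEN is not set")
    return {
        "type": "rest",
        "uri": "http://localhost:8080/v1/gcp_catalog",
        "token": token,
        "header.X-Iceberg-Access-Delegation": "vended-credentials",
    }


# An explicit mapping is passed here so the sketch runs without
# depending on the real process environment.
props = gcp_catalog_properties({"PANGOLIN_TOKEN": "YOUR_JWT_TOKEN"})
```

In real use the function would be called without arguments and the result passed on as `load_catalog("gcp_catalog", **props)`.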
MinIO or other S3-compatible storage can be used for testing or private cloud deployments.

    from pyiceberg.catalog import load_catalog

    catalog = load_catalog(
        "minio",
        **{
            "type": "rest",
            "uri": "http://localhost:8080/v1/minio",
            "token": "YOUR_TOKEN",
            # Mandatory for MinIO
            "s3.endpoint": "http://localhost:9000",
            "s3.path-style-access": "true",
            "s3.region": "us-east-1",
        }
    )

| Property | Description |
|---|---|
| `s3.endpoint` | The full HTTP(S) URL of your MinIO server. |
| `s3.path-style-access` | Must be `true` so that MinIO is addressed as `endpoint/bucket/key` instead of `bucket.endpoint/key`. |

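The difference between the two addressing styles can be illustrated with plain URL handling. This sketch is only illustrative: `object_url` is a hypothetical helper, `warehouse` and `data/file.parquet` are made-up bucket and key names, and it does not reflect how PyIceberg builds requests internally.

```python
from urllib.parse import urlsplit, urlunsplit


def object_url(endpoint: str, bucket: str, key: str, path_style: bool) -> str:
    """Return the URL an S3 client would address for bucket/key."""
    parts = urlsplit(endpoint)
    if path_style:
        # endpoint/bucket/key: the bucket is part of the path.
        return urlunsplit((parts.scheme, parts.netloc, f"/{bucket}/{key}", "", ""))
    # bucket.endpoint/key: the bucket becomes a subdomain of the host.
    return urlunsplit((parts.scheme, f"{bucket}.{parts.netloc}", f"/{key}", "", ""))


path_style = object_url("http://localhost:9000", "warehouse", "data/file.parquet", True)
virtual_hosted = object_url("http://localhost:9000", "warehouse", "data/file.parquet", False)
print(path_style)      # http://localhost:9000/warehouse/data/file.parquet
print(virtual_hosted)  # http://warehouse.localhost:9000/data/file.parquet
```

A local MinIO server is typically reachable under a single host name only, so the subdomain form usually does not resolve, which is why path-style access is needed.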
Pangolin allows you to manage catalogs across different clouds from a single control plane. One tenant can have an S3 warehouse for analytics and a GCS warehouse for ML workloads.
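As a rough sketch of such a setup, the snippet below derives client properties for two catalogs served by one Pangolin instance. The catalog names `analytics_s3` and `ml_gcs` are hypothetical, and the `/v1/<catalog>` URI pattern is inferred from the configuration examples earlier in this page rather than taken from a specification.

```python
from typing import Dict

PANGOLIN_BASE_URL = "http://localhost:8080"  # the single control plane


def rest_properties(catalog_name: str, token: str) -> Dict[str, str]:
    """Client properties for one catalog, relying on vended storage credentials."""
    return {
        "type": "rest",
        "uri": f"{PANGOLIN_BASE_URL}/v1/{catalog_name}",
        "token": token,
        "header.X-Iceberg-Access-Delegation": "vended-credentials",
    }


# Hypothetical catalog names: one backed by an S3 warehouse, one by GCS.
catalogs = {
    name: rest_properties(name, "YOUR_JWT_TOKEN")
    for name in ("analytics_s3", "ml_gcs")
}
```

Each entry could then be loaded with `load_catalog(name, **catalogs[name])`. With credential vending, the client-side properties look the same for both clouds because the storage details live in the warehouse configuration.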
- Signer Implementation: Ensure your Pangolin server is built with the appropriate cloud SDK features (`--features azure-oauth` or `--features gcp-oauth`).
- Library Versions: Earlier versions of PyIceberg had limited support for non-S3 backends. We recommend using PyIceberg 0.7.0+ for the best multi-cloud experience.
- Region Consistency: Ensure the `region` or `location` properties in your Pangolin Warehouse configuration match the physical bucket locations to minimize latency and costs.
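
The PyIceberg version recommendation can be checked at startup. A minimal sketch using only the standard library: `meets_minimum` is a hypothetical helper, and its parsing only looks at the leading digits of each version component, so a pre-release such as `0.7.0rc1` is treated the same as `0.7.0`.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def meets_minimum(version: str, minimum: Tuple[int, ...] = (0, 7, 0)) -> bool:
    """Compare the leading numeric components of a version string to a minimum."""
    parts = []
    for piece in version.split(".")[: len(minimum)]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as "0.7" with zeros before comparing.
    parts += [0] * (len(minimum) - len(parts))
    return tuple(parts) >= minimum


installed: Optional[str]
try:
    installed = metadata.version("pyiceberg")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("PyIceberg is not installed")
elif not meets_minimum(installed):
    print(f"PyIceberg {installed} is older than the recommended 0.7.0")
```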