Commit 0c9104d

api and cli updated
1 parent 96a6908 commit 0c9104d

15 files changed

Lines changed: 1411 additions & 267 deletions

docker-compose.emulators.yml

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
version: '3.8'

services:
  # MinIO - S3-compatible storage with STS support
  minio:
    image: minio/minio:latest
    container_name: pangolin-minio
    ports:
      - "9000:9000" # API
      - "9001:9001" # Console
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    command: server /data --console-address ":9001"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 5s
      timeout: 3s
      retries: 3
    volumes:
      - minio-data:/data

  # Azurite - Azure Storage Emulator
  azurite:
    image: mcr.microsoft.com/azure-storage/azurite:latest
    container_name: pangolin-azurite
    ports:
      - "10000:10000" # Blob service
      - "10001:10001" # Queue service
      - "10002:10002" # Table service
    command: azurite-blob --blobHost 0.0.0.0 --blobPort 10000 --loose
    healthcheck:
      test: ["CMD", "nc", "-z", "localhost", "10000"]
      interval: 5s
      timeout: 3s
      retries: 3
    volumes:
      - azurite-data:/data

  # fake-gcs-server - GCS Emulator
  fake-gcs:
    image: fsouza/fake-gcs-server:latest
    container_name: pangolin-fake-gcs
    ports:
      - "4443:4443"
    command: -scheme http -port 4443 -external-url http://localhost:4443
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:4443/storage/v1/b"]
      interval: 5s
      timeout: 3s
      retries: 3

  # oidc-server-mock - OAuth2/OIDC Mock for Azure AD testing
  oidc-mock:
    image: ghcr.io/soluto/oidc-server-mock:latest
    container_name: pangolin-oidc-mock
    ports:
      - "8081:80" # Changed from 8080 to avoid conflict
    environment:
      ASPNETCORE_ENVIRONMENT: Development
      SERVER_OPTIONS_INLINE: |
        {
          "AccessTokenJwtType": "JWT",
          "Discovery": {
            "ShowKeySet": true
          }
        }
      USERS_CONFIGURATION_INLINE: |
        [
          {
            "SubjectId": "test-user",
            "Username": "testuser",
            "Password": "testpassword"
          }
        ]
      CLIENTS_CONFIGURATION_INLINE: |
        [
          {
            "ClientId": "pangolin-client",
            "ClientSecrets": ["secret"],
            "AllowedGrantTypes": ["client_credentials"],
            "AllowedScopes": ["https://storage.azure.com/.default", "openid", "profile"]
          }
        ]
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:80/.well-known/openid-configuration"]
      interval: 5s
      timeout: 3s
      retries: 3

volumes:
  minio-data:
  azurite-data:
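
Once the stack is running, a quick reachability check can confirm that each emulator's published port accepts connections. The sketch below is illustrative and not part of this commit: `EMULATOR_PORTS` only restates the host ports from the compose file above, and `is_port_open` and `unreachable_emulators` are hypothetical helper names. A plain TCP connect is a weaker signal than the HTTP health checks defined in the compose file, so treat it as a smoke test only.

```python
import socket

# Host ports published by docker-compose.emulators.yml (restated from the file above).
EMULATOR_PORTS = {
    "minio": 9000,      # MinIO S3 API
    "azurite": 10000,   # Azurite Blob service
    "fake-gcs": 4443,   # fake-gcs-server
    "oidc-mock": 8081,  # oidc-server-mock (container port 80)
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_emulators(host: str = "localhost", ports: dict = EMULATOR_PORTS) -> list:
    """List the emulator names whose published port does not accept connections."""
    return [name for name, port in ports.items() if not is_port_open(host, port)]
```

After `docker compose -f docker-compose.emulators.yml up -d`, calling `unreachable_emulators()` should return an empty list when all four services are listening.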
Lines changed: 211 additions & 0 deletions
@@ -0,0 +1,211 @@
# Documentation Update Summary - Credential Vending

## Files Requiring Updates

### 1. `/docs/features/security_vending.md` ⚠️ HIGH PRIORITY
**Current State:** Only documents S3/AWS credential vending
**Required Updates:**
- Add section for Azure ADLS Gen2 credential vending
  - OAuth2 mode (`adls.token`)
  - Account key mode (`adls.account-name`, `adls.account-key`)
- Add section for GCP credential vending
  - OAuth2 mode (`gcp-oauth-token`)
  - Service account mode (`gcp-project-id`)
- Update PyIceberg integration examples to cover multiple clouds
- Add warehouse configuration examples for Azure and GCP

**Suggested New Sections:**
````markdown
## Azure ADLS Gen2 Credential Vending

### OAuth2 Mode
```python
# Warehouse configuration
{
    "type": "azure",
    "account_name": "mystorageaccount",
    "container": "data",
    "tenant_id": "...",
    "client_id": "...",
    "client_secret": "..."
}
```

### Account Key Mode
```python
# Warehouse configuration
{
    "type": "azure",
    "account_name": "mystorageaccount",
    "container": "data",
    "account_key": "..."
}
```

## GCP Credential Vending
...
````
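
The two suggested Azure warehouse configurations differ only in which credential fields they carry. As a rough sketch of how documentation examples could tell the modes apart, the hypothetical helper below classifies a configuration by its fields. The function name `azure_auth_mode` and the choice to prefer OAuth2 when both sets of fields are present are assumptions, not behavior confirmed by this commit.

```python
def azure_auth_mode(warehouse: dict) -> str:
    """Classify an Azure warehouse configuration as 'oauth2' or 'account_key'.

    Field names follow the suggested configurations above; the precedence
    (OAuth2 first) is an assumption made for this sketch.
    """
    if warehouse.get("type") != "azure":
        raise ValueError("not an Azure warehouse configuration")

    oauth_fields = ("tenant_id", "client_id", "client_secret")
    if all(warehouse.get(name) for name in oauth_fields):
        return "oauth2"
    if warehouse.get("account_key"):
        return "account_key"
    raise ValueError("no OAuth2 client fields and no account_key present")
```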

---

### 2. `/docs/architecture/signer-trait.md` ⚠️ MEDIUM PRIORITY
**Current State:** Documents old Signer trait, outdated structure
**Required Updates:**
- Replace with new `CredentialSigner` trait documentation
- Update `VendedCredentials` structure
- Document PyIceberg-compatible property names:
  - S3: `s3.access-key-id`, `s3.secret-access-key`, `s3.session-token`
  - Azure: `adls.token`, `adls.account-name`, `adls.account-key`
  - GCP: `gcp-oauth-token`, `gcp-project-id`
- Update implementation examples for all three cloud providers

**New Trait Definition:**
```rust
#[async_trait]
pub trait CredentialSigner: Send + Sync {
    async fn generate_credentials(
        &self,
        resource_path: &str,
        permissions: &[String],
        duration: Duration,
    ) -> Result<VendedCredentials>;

    fn storage_type(&self) -> &str;
}

pub struct VendedCredentials {
    pub prefix: String,
    pub config: HashMap<String, String>, // PyIceberg-compatible properties
    pub expires_at: Option<DateTime<Utc>>,
}
```
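
For readers on the Python side, the shape of `VendedCredentials` can be mirrored as a small dataclass. This is an illustrative sketch only: the Python class and its `is_expired` method are not part of the commit, and treating a missing `expires_at` as "never expires" (for example, static credentials) is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class VendedCredentials:
    """Python mirror of the Rust struct above (illustrative, not the real API)."""

    prefix: str
    config: Dict[str, str] = field(default_factory=dict)  # PyIceberg-compatible properties
    expires_at: Optional[datetime] = None

    def is_expired(self, now: Optional[datetime] = None) -> bool:
        """Return True once `now` has reached `expires_at`.

        Assumption: credentials without an expiry are treated as non-expiring.
        """
        if self.expires_at is None:
            return False
        current = now if now is not None else datetime.now(timezone.utc)
        return current >= self.expires_at
```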

---

### 3. `/docs/pyiceberg/multi_cloud.md` ⚠️ HIGH PRIORITY
**Current State:** Mentions vending but uses incorrect property names
**Required Updates:**
- Update Azure properties section:
  - `adls.token` (for OAuth2 vended credentials)
  - `adls.account-name` (account name)
  - `adls.account-key` (for account key mode)
  - ❌ Remove outdated `adls.connection-string` (not used in vending)
- Update GCP properties section:
  - `gcp-oauth-token` (for vended OAuth2 tokens)
  - `gcp-project-id` (project ID)
  - ❌ Remove `gcs.oauth2.token` (use `gcp-oauth-token`)
- Add complete credential vending examples

**Updated Azure Example:**
```python
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "azure_catalog",
    **{
        "type": "rest",
        "uri": "http://localhost:8080/v1/azure_catalog",
        "token": "YOUR_JWT_TOKEN",
        # Pangolin vends credentials automatically!
        # Properties vended: adls.token, adls.account-name
    }
)
```

---

### 4. `/docs/features/warehouse_management.md` ⚠️ LOW PRIORITY
**Current State:** May only show S3 warehouse examples
**Required Updates:**
- Add Azure warehouse creation example
- Add GCP warehouse creation example
- Document `use_sts` behavior for each cloud provider

---

### 5. `/docs/pyiceberg/auth_vended_creds.md` ⚠️ MEDIUM PRIORITY
**Current State:** Unknown (need to review)
**Likely Updates:**
- Add multi-cloud vended credentials examples
- Update property names to PyIceberg-compatible format

---

## Property Name Reference (PyIceberg Compatible)

### S3 / AWS
| Property | Description | Example |
|----------|-------------|---------|
| `s3.access-key-id` | AWS access key | `AKIAIOSFODNN7EXAMPLE` |
| `s3.secret-access-key` | AWS secret key | `wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY` |
| `s3.session-token` | STS session token | `FwoGZXIvYXdzEBYaD...` |
| `s3.endpoint` | S3 endpoint URL | `http://localhost:9000` |
| `s3.region` | AWS region | `us-east-1` |

### Azure ADLS Gen2
| Property | Description | Example |
|----------|-------------|---------|
| `adls.token` | OAuth2 access token | `eyJ0eXAiOiJKV1QiLCJhbGc...` |
| `adls.account-name` | Storage account name | `mystorageaccount` |
| `adls.account-key` | Storage account key | `Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==` |
| `adls.container` | Container name | `data` |

### GCP Cloud Storage
| Property | Description | Example |
|----------|-------------|---------|
| `gcp-oauth-token` | OAuth2 access token | `ya29.a0AfH6SMBx...` |
| `gcp-project-id` | GCP project ID | `my-project-12345` |
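
The tables above can double as a checklist when reviewing a vended `config` map. The sketch below, offered as one possible approach rather than anything in the codebase, flags keys that do not appear in the reference tables. The provider keys `"s3"`, `"azure"`, and `"gcp"` and the function name `unknown_properties` are hypothetical.

```python
# Property names copied from the reference tables above.
KNOWN_PROPERTIES = {
    "s3": {
        "s3.access-key-id",
        "s3.secret-access-key",
        "s3.session-token",
        "s3.endpoint",
        "s3.region",
    },
    "azure": {
        "adls.token",
        "adls.account-name",
        "adls.account-key",
        "adls.container",
    },
    "gcp": {
        "gcp-oauth-token",
        "gcp-project-id",
    },
}


def unknown_properties(provider: str, config: dict) -> list:
    """Return the sorted config keys that are not listed for the given provider."""
    allowed = KNOWN_PROPERTIES[provider]
    return sorted(key for key in config if key not in allowed)
```

For example, `unknown_properties("azure", {"adls.token": "...", "azure-account-name": "..."})` reports the second key as unrecognized.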

---

## Testing Status

### ✅ Tested and Working
- **S3 Static Credentials:** Tested with MinIO ✅
- **S3 STS Foundation:** MinIO IAM configured ✅
- **Azure Account Key:** Tested with Azurite ✅
- **GCP Service Account:** Tested with fake-gcs-server ✅

### ⚠️ Code Ready, Needs Real Testing
- **Azure OAuth2:** Code complete, needs real Azure AD testing
- **S3 STS AssumeRole:** Needs AWS STS feature flag enabled

---

## Implementation Details

**Files Modified:**
- `pangolin_api/src/credential_signers/azure_signer.rs` - PyIceberg properties
- `pangolin_api/src/credential_signers/gcp_signer.rs` - PyIceberg properties
- `pangolin_api/src/credential_signers/s3_signer.rs` - PyIceberg properties
- `pangolin_api/src/credential_vending.rs` - Factory and helpers
- `pangolin_api/src/signing_handlers.rs` - Refactored (-170 lines)

**Tests:** 28/28 passing
- 4 unit tests
- 15 integration tests
- 5 end-to-end tests
- 4 live emulator tests

---

## Recommended Documentation Update Priority

1. **HIGH:** `security_vending.md` - Add Azure/GCP sections
2. **HIGH:** `multi_cloud.md` - Fix property names
3. **MEDIUM:** `signer-trait.md` - Update trait documentation
4. **MEDIUM:** `auth_vended_creds.md` - Add multi-cloud examples
5. **LOW:** `warehouse_management.md` - Add Azure/GCP warehouse examples

---

## Quick Reference for Documentation Writers

**When documenting Azure credential vending:**
- Use `adls.token` (not `azure-oauth-token`)
- Use `adls.account-name` (not `azure-account-name`)
- Use `adls.account-key` (not `azure-account-key`)

**When documenting GCP credential vending:**
- Use `gcp-oauth-token` (not `gcs.oauth2.token`)
- Use `gcp-project-id` (not `gcs.project-id`)

**All property names must match PyIceberg expectations for compatibility.**
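
When updating older examples, the renames listed in this quick reference can be applied mechanically. The following is a minimal sketch under that assumption: the mapping restates the "use X (not Y)" pairs above, while the names `PREFERRED_NAMES` and `to_preferred_names` are hypothetical and the conflict handling is a design choice made for illustration.

```python
# Old name -> name this summary asks documentation writers to use.
PREFERRED_NAMES = {
    "azure-oauth-token": "adls.token",
    "azure-account-name": "adls.account-name",
    "azure-account-key": "adls.account-key",
    "gcs.oauth2.token": "gcp-oauth-token",
    "gcs.project-id": "gcp-project-id",
}


def to_preferred_names(config: dict) -> dict:
    """Return a copy of config with discouraged property names replaced.

    Raises ValueError if an old and a new name are both present with
    different values, since silently picking one could hide a mistake.
    """
    renamed = {}
    for key, value in config.items():
        new_key = PREFERRED_NAMES.get(key, key)
        if new_key in renamed and renamed[new_key] != value:
            raise ValueError(f"conflicting values for {new_key!r}")
        renamed[new_key] = value
    return renamed
```

Keys that are not in the mapping pass through unchanged, so the helper is safe to run over examples that already use the preferred names.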