| 1 | +# Documentation Update Summary - Credential Vending |
| 2 | + |
| 3 | +## Files Requiring Updates |
| 4 | + |
| 5 | +### 1. `/docs/features/security_vending.md` ⚠️ HIGH PRIORITY |
| 6 | +**Current State:** Only documents S3/AWS credential vending |
| 7 | +**Required Updates:** |
| 8 | +- Add section for Azure ADLS Gen2 credential vending |
| 9 | +  - OAuth2 mode (`adls.token`) |
| 10 | +  - Account key mode (`adls.account-name`, `adls.account-key`) |
| 11 | +- Add section for GCP credential vending |
| 12 | +  - OAuth2 mode (`gcp-oauth-token`) |
| 13 | +  - Service account mode (`gcp-project-id`) |
| 14 | +- Update PyIceberg integration examples to cover AWS, Azure, and GCP |
| 15 | +- Add warehouse configuration examples for Azure and GCP |
| 16 | + |
| 17 | +**Suggested New Sections:** |
| 18 | +````markdown |
| 19 | +## Azure ADLS Gen2 Credential Vending |
| 20 | + |
| 21 | +### OAuth2 Mode |
| 22 | +```python |
| 23 | +# Warehouse configuration |
| 24 | +{ |
| 25 | +    "type": "azure", |
| 26 | +    "account_name": "mystorageaccount", |
| 27 | +    "container": "data", |
| 28 | +    "tenant_id": "...", |
| 29 | +    "client_id": "...", |
| 30 | +    "client_secret": "..." |
| 31 | +} |
| 32 | +``` |
| 33 | + |
| 34 | +### Account Key Mode |
| 35 | +```python |
| 36 | +# Warehouse configuration |
| 37 | +{ |
| 38 | +    "type": "azure", |
| 39 | +    "account_name": "mystorageaccount", |
| 40 | +    "container": "data", |
| 41 | +    "account_key": "..." |
| 42 | +} |
| 43 | +``` |
| 44 | + |
| 45 | +## GCP Credential Vending |
| 46 | +... |
| 47 | +```` |
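
The two Azure warehouse configurations above differ only in which credential fields they carry. As a minimal sketch, assuming the vending mode can be inferred from those fields alone (the source does not say how Pangolin selects it), a documentation example could check a configuration like this. The function name `azure_vending_mode` and the constant `OAUTH2_FIELDS` are hypothetical and not part of Pangolin.

```python
# Hypothetical helper: infer the Azure vending mode from a warehouse config.
# Assumption: the mode is decided purely by which credential fields are set.
OAUTH2_FIELDS = ("tenant_id", "client_id", "client_secret")


def azure_vending_mode(warehouse):
    """Return "oauth2" or "account_key" for an Azure warehouse config dict."""
    if warehouse.get("type") != "azure":
        raise ValueError("not an Azure warehouse configuration")

    present = [name for name in OAUTH2_FIELDS if warehouse.get(name)]
    has_key = bool(warehouse.get("account_key"))

    # OAuth2 mode needs all three service principal fields and no account key.
    if len(present) == len(OAUTH2_FIELDS) and not has_key:
        return "oauth2"
    # Account key mode needs the key and none of the OAuth2 fields.
    if has_key and not present:
        return "account_key"
    raise ValueError("ambiguous or incomplete Azure credential fields")


if __name__ == "__main__":
    print(azure_vending_mode({
        "type": "azure",
        "account_name": "mystorageaccount",
        "container": "data",
        "account_key": "placeholder",
    }))
```

Rejecting mixed or partial configurations is a design choice of this sketch; the actual server may resolve such cases differently.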
| 48 | + |
| 49 | +--- |
| 50 | + |
| 51 | +### 2. `/docs/architecture/signer-trait.md` ⚠️ MEDIUM PRIORITY |
| 52 | +**Current State:** Documents old Signer trait, outdated structure |
| 53 | +**Required Updates:** |
| 54 | +- Replace with new `CredentialSigner` trait documentation |
| 55 | +- Update `VendedCredentials` structure |
| 56 | +- Document PyIceberg-compatible property names: |
| 57 | +  - S3: `s3.access-key-id`, `s3.secret-access-key`, `s3.session-token` |
| 58 | +  - Azure: `adls.token`, `adls.account-name`, `adls.account-key` |
| 59 | +  - GCP: `gcp-oauth-token`, `gcp-project-id` |
| 60 | +- Update implementation examples for all three cloud providers |
| 61 | + |
| 62 | +**New Trait Definition:** |
| 63 | +```rust |
| 64 | +#[async_trait] |
| 65 | +pub trait CredentialSigner: Send + Sync { |
| 66 | +    async fn generate_credentials( |
| 67 | +        &self, |
| 68 | +        resource_path: &str, |
| 69 | +        permissions: &[String], |
| 70 | +        duration: Duration, |
| 71 | +    ) -> Result<VendedCredentials>; |
| 72 | + |
| 73 | +    fn storage_type(&self) -> &str; |
| 74 | +} |
| 75 | + |
| 76 | +pub struct VendedCredentials { |
| 77 | +    pub prefix: String, |
| 78 | +    pub config: HashMap<String, String>, // PyIceberg-compatible properties |
| 79 | +    pub expires_at: Option<DateTime<Utc>>, |
| 80 | +} |
| 81 | +``` |
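
For the PyIceberg-facing pages, it may help to show how a client could represent the same structure. The following is a hypothetical Python mirror of `VendedCredentials`, not code from Pangolin or PyIceberg. It assumes that a missing `expires_at` means the credentials do not expire and that a 60-second safety margin is acceptable; both are illustrative choices.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


@dataclass
class VendedCredentials:
    """Hypothetical client-side mirror of the Rust struct above."""

    prefix: str
    config: Dict[str, str] = field(default_factory=dict)  # e.g. {"adls.token": "..."}
    expires_at: Optional[datetime] = None  # timezone-aware UTC timestamp, if any

    def is_expired(
        self,
        now: Optional[datetime] = None,
        skew: timedelta = timedelta(seconds=60),
    ) -> bool:
        """Report whether the credentials should be treated as expired.

        Assumption: expires_at=None means the credentials never expire.
        The skew treats credentials as expired slightly early.
        """
        if self.expires_at is None:
            return False
        current = now if now is not None else datetime.now(timezone.utc)
        return current + skew >= self.expires_at


if __name__ == "__main__":
    creds = VendedCredentials(
        prefix="placeholder-prefix",
        config={"adls.token": "placeholder", "adls.account-name": "mystorageaccount"},
        expires_at=datetime.now(timezone.utc) + timedelta(hours=1),
    )
    print(creds.is_expired())
```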
| 82 | + |
| 83 | +--- |
| 84 | + |
| 85 | +### 3. `/docs/pyiceberg/multi_cloud.md` ⚠️ HIGH PRIORITY |
| 86 | +**Current State:** Mentions vending but uses incorrect property names |
| 87 | +**Required Updates:** |
| 88 | +- Update Azure properties section: |
| 89 | +  - ✅ `adls.token` (for OAuth2 vended credentials) |
| 90 | +  - ✅ `adls.account-name` (account name) |
| 91 | +  - ✅ `adls.account-key` (for account key mode) |
| 92 | +  - ❌ Remove outdated `adls.connection-string` (not used in vending) |
| 93 | +- Update GCP properties section: |
| 94 | +  - ✅ `gcp-oauth-token` (for vended OAuth2 tokens) |
| 95 | +  - ✅ `gcp-project-id` (project ID) |
| 96 | +  - ❌ Remove `gcs.oauth2.token` (use `gcp-oauth-token`) |
| 97 | +- Add complete credential vending examples |
| 98 | + |
| 99 | +**Updated Azure Example:** |
| 100 | +```python |
| 101 | +catalog = load_catalog( |
| 102 | +    "azure_catalog", |
| 103 | +    **{ |
| 104 | +        "type": "rest", |
| 105 | +        "uri": "http://localhost:8080/v1/azure_catalog", |
| 106 | +        "token": "YOUR_JWT_TOKEN", |
| 107 | +        # Pangolin vends credentials automatically! |
| 108 | +        # Properties vended: adls.token, adls.account-name |
| 109 | +    } |
| 110 | +) |
| 111 | +``` |
| 112 | + |
| 113 | +--- |
| 114 | + |
| 115 | +### 4. `/docs/features/warehouse_management.md` ⚠️ LOW PRIORITY |
| 116 | +**Current State:** May only show S3 warehouse examples |
| 117 | +**Required Updates:** |
| 118 | +- Add Azure warehouse creation example |
| 119 | +- Add GCP warehouse creation example |
| 120 | +- Document `use_sts` behavior for each cloud provider |
| 121 | + |
| 122 | +--- |
| 123 | + |
| 124 | +### 5. `/docs/pyiceberg/auth_vended_creds.md` ⚠️ MEDIUM PRIORITY |
| 125 | +**Current State:** Unknown (needs review) |
| 126 | +**Likely Updates:** |
| 127 | +- Add multi-cloud vended credentials examples |
| 128 | +- Update property names to PyIceberg-compatible format |
| 129 | + |
| 130 | +--- |
| 131 | + |
| 132 | +## Property Name Reference (PyIceberg Compatible) |
| 133 | + |
| 134 | +### S3 / AWS |
| 135 | +| Property | Description | Example | |
| 136 | +|----------|-------------|---------| |
| 137 | +| `s3.access-key-id` | AWS access key | `AKIAIOSFODNN7EXAMPLE` | |
| 138 | +| `s3.secret-access-key` | AWS secret key | `wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY` | |
| 139 | +| `s3.session-token` | STS session token | `FwoGZXIvYXdzEBYaD...` | |
| 140 | +| `s3.endpoint` | S3 endpoint URL | `http://localhost:9000` | |
| 141 | +| `s3.region` | AWS region | `us-east-1` | |
| 142 | + |
| 143 | +### Azure ADLS Gen2 |
| 144 | +| Property | Description | Example | |
| 145 | +|----------|-------------|---------| |
| 146 | +| `adls.token` | OAuth2 access token | `eyJ0eXAiOiJKV1QiLCJhbGc...` | |
| 147 | +| `adls.account-name` | Storage account name | `mystorageaccount` | |
| 148 | +| `adls.account-key` | Storage account key | `Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==` | |
| 149 | +| `adls.container` | Container name | `data` | |
| 150 | + |
| 151 | +### GCP Cloud Storage |
| 152 | +| Property | Description | Example | |
| 153 | +|----------|-------------|---------| |
| 154 | +| `gcp-oauth-token` | OAuth2 access token | `ya29.a0AfH6SMBx...` | |
| 155 | +| `gcp-project-id` | GCP project ID | `my-project-12345` | |
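
The three tables above can also be expressed as data, which makes it easy to check a vended `config` map against them. This is a minimal sketch: the names `KNOWN_PROPERTIES` and `classify_properties` are hypothetical, and the property sets are copied from the tables in this summary rather than from Pangolin's code.

```python
# Property names copied from the reference tables in this summary.
KNOWN_PROPERTIES = {
    "s3": {
        "s3.access-key-id",
        "s3.secret-access-key",
        "s3.session-token",
        "s3.endpoint",
        "s3.region",
    },
    "azure": {
        "adls.token",
        "adls.account-name",
        "adls.account-key",
        "adls.container",
    },
    "gcp": {
        "gcp-oauth-token",
        "gcp-project-id",
    },
}


def classify_properties(config):
    """Split a config dict into per-provider groups plus unrecognized keys."""
    grouped = {provider: {} for provider in KNOWN_PROPERTIES}
    unknown = {}
    for key, value in config.items():
        for provider, names in KNOWN_PROPERTIES.items():
            if key in names:
                grouped[provider][key] = value
                break
        else:
            # No provider table lists this key.
            unknown[key] = value
    return grouped, unknown


if __name__ == "__main__":
    grouped, unknown = classify_properties({
        "adls.token": "placeholder",
        "adls.account-name": "mystorageaccount",
        "gcs.oauth2.token": "placeholder",
    })
    print(sorted(grouped["azure"]), sorted(unknown))
```

Any key that lands in `unknown` is a candidate for the renames listed in the quick reference at the end of this summary.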
| 156 | + |
| 157 | +--- |
| 158 | + |
| 159 | +## Testing Status |
| 160 | + |
| 161 | +### ✅ Tested and Working |
| 162 | +- **S3 Static Credentials:** Tested with MinIO ✅ |
| 163 | +- **S3 STS Foundation:** MinIO IAM configured ✅ |
| 164 | +- **Azure Account Key:** Tested with Azurite ✅ |
| 165 | +- **GCP Service Account:** Tested with fake-gcs-server ✅ |
| 166 | + |
| 167 | +### ⚠️ Code Ready, Needs Real Testing |
| 168 | +- **Azure OAuth2:** Code complete, needs real Azure AD testing |
| 169 | +- **S3 STS AssumeRole:** Needs AWS STS feature flag enabled |
| 170 | + |
| 171 | +--- |
| 172 | + |
| 173 | +## Implementation Details |
| 174 | + |
| 175 | +**Files Modified:** |
| 176 | +- `pangolin_api/src/credential_signers/azure_signer.rs` - PyIceberg properties |
| 177 | +- `pangolin_api/src/credential_signers/gcp_signer.rs` - PyIceberg properties |
| 178 | +- `pangolin_api/src/credential_signers/s3_signer.rs` - PyIceberg properties |
| 179 | +- `pangolin_api/src/credential_vending.rs` - Factory and helpers |
| 180 | +- `pangolin_api/src/signing_handlers.rs` - Refactored (-170 lines) |
| 181 | + |
| 182 | +**Tests:** 28/28 passing |
| 183 | +- 4 unit tests |
| 184 | +- 15 integration tests |
| 185 | +- 5 end-to-end tests |
| 186 | +- 4 live emulator tests |
| 187 | + |
| 188 | +--- |
| 189 | + |
| 190 | +## Recommended Documentation Update Priority |
| 191 | + |
| 192 | +1. **HIGH:** `security_vending.md` - Add Azure/GCP sections |
| 193 | +2. **HIGH:** `multi_cloud.md` - Fix property names |
| 194 | +3. **MEDIUM:** `signer-trait.md` - Update trait documentation |
| 195 | +4. **MEDIUM:** `auth_vended_creds.md` - Add multi-cloud examples |
| 196 | +5. **LOW:** `warehouse_management.md` - Add Azure/GCP warehouse examples |
| 197 | + |
| 198 | +--- |
| 199 | + |
| 200 | +## Quick Reference for Documentation Writers |
| 201 | + |
| 202 | +**When documenting Azure credential vending:** |
| 203 | +- Use `adls.token` (not `azure-oauth-token`) |
| 204 | +- Use `adls.account-name` (not `azure-account-name`) |
| 205 | +- Use `adls.account-key` (not `azure-account-key`) |
| 206 | + |
| 207 | +**When documenting GCP credential vending:** |
| 208 | +- Use `gcp-oauth-token` (not `gcs.oauth2.token`) |
| 209 | +- Use `gcp-project-id` (not `gcs.project-id`) |
| 210 | + |
| 211 | +**All property names must match PyIceberg expectations for compatibility.** |
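
The renames in this quick reference can be checked mechanically while updating the pages. The sketch below scans Markdown text for the outdated names and reports the replacement for each hit. The names `RENAMES` and `find_outdated_names` are hypothetical, and the mapping is taken directly from the list above.

```python
# Outdated name -> name to use instead, taken from the quick reference above.
RENAMES = {
    "azure-oauth-token": "adls.token",
    "azure-account-name": "adls.account-name",
    "azure-account-key": "adls.account-key",
    "gcs.oauth2.token": "gcp-oauth-token",
    "gcs.project-id": "gcp-project-id",
}


def find_outdated_names(text):
    """Return (line_number, outdated_name, replacement) for each occurrence."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for old, new in RENAMES.items():
            if old in line:
                findings.append((line_no, old, new))
    return findings


if __name__ == "__main__":
    sample = "Set `gcs.oauth2.token` in the catalog.\nThen set `adls.token`."
    for line_no, old, new in find_outdated_names(sample):
        print(f"line {line_no}: replace {old} with {new}")
```

This is plain substring matching, so it will also flag pages that mention an outdated name only to warn against it.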