K8SPSMDB-1602: support Workload Identity for GCS backup storage#2315
Conversation
mayankshah1607
left a comment
There was a problem hiding this comment.
Hi, thanks for your PR. I believe we also need to add E2E tests for this
| // When WorkloadIdentity is enabled, skip credential secret loading entirely. | ||
| // PBM will use Application Default Credentials (ADC) provided by GKE Workload Identity. | ||
| useWorkloadIdentity := stg.GCS.Credentials != nil && stg.GCS.Credentials.WorkloadIdentity | ||
|
|
||
| if !useWorkloadIdentity && stg.GCS.CredentialsSecret != "" { |
There was a problem hiding this comment.
Please add coverage for this in pbm_test.go
| // When WorkloadIdentity is true, PBM uses Application Default Credentials (ADC) | ||
| // provided by GKE Workload Identity instead of a credentialsSecret. | ||
| type GCSCredentials struct { | ||
| WorkloadIdentity bool `json:"workloadIdentity,omitempty"` |
There was a problem hiding this comment.
Why does it need to be embedded inside GCSCredentials object?
|
Ready |
| if err := r.disableBalancer(ctx, cluster); err != nil { | ||
| return status, errors.Wrap(err, "disable balancer") | ||
| } |
There was a problem hiding this comment.
what was the rationale behind this change? pbm already disables balancer before restore.
- Add WorkloadIdentity bool field to BackupStorageGCSSpec - Make CredentialsSecret omitempty (not required when using workload identity) - Skip credential secret loading when WorkloadIdentity is true, allowing PBM to use GKE Application Default Credentials - Add unit test for GCS workload identity storage config - Update CRD schemas for all three resource types
26c0506 to
e547891
Compare
This approach assumes someone would be using Workload Identity, but what if they didn't set either The user would be able to create the database without problem, and the error wouldn't be caught until runtime. I prefer that it fails fast to the user instead. |
egegunes
left a comment
There was a problem hiding this comment.
This approach assumes someone would be using Workload Identity, but what if they didn't set either
credentialSecretorworkloadIdentity?The user would be able to create the database without problem, and the error wouldn't be caught until runtime.
I prefer that it fails fast to the user instead.
@TineoC we discussed this internally. although most of us think explicitly enabling workloadIdentity is a better design, we don't have this behavior for s3 storages for the same feature. I think consistency across features is worth preserving.
Could you please remove workloadIdentity field and check if credentialSecret is empty like we do for s3 storages?
|
@TineoC friendly ping |
- Remove explicit workloadIdentity field from API - Follow AWS S3 pattern: empty credentialsSecret triggers ADC fallback (hors feedback) - Remove workloadIdentity from all CRD YAMLs via make generate manifests - Add E2E test: demand-backup-gcs-workload-identity (mayankshah1607 feedback) - Keep PBM-side ADC fallback for when credentials are not provided
There was a problem hiding this comment.
I am not sure this works, don't we need to set WorkloadIdentity: true in the PBM storage conf for the fallback to happen? @TineoC did you have a chance to test this?
There was a problem hiding this comment.
Pull request overview
This pull request aims to enable GCS backups without providing a credentialsSecret (intended for GKE Workload Identity / ADC-based auth) by relaxing the Operator API + CRD requirements and adding unit/e2e coverage around “GCS with no explicit credentials”.
Changes:
- Make
spec.backup.storages[].gcs.credentialsSecretoptional in the Go API types and in generated CRD schemas. - Add a unit test case asserting PBM storage config generation for GCS works with empty credentials (no secret).
- Add a new e2e test scenario and wire it into PR/release e2e run lists.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/psmdb/backup/pbm_test.go | Adds a unit test case for GCS config generation without credentials. |
| pkg/apis/psmdb/v1/psmdb_types.go | Makes BackupStorageGCSSpec.CredentialsSecret omitempty (optional in JSON). |
| config/crd/bases/psmdb.percona.com_perconaservermongodbs.yaml | Removes credentialsSecret from required fields for GCS storage in the main CRD base. |
| config/crd/bases/psmdb.percona.com_perconaservermongodbbackups.yaml | Removes credentialsSecret from required fields for GCS storage in the Backup CRD base. |
| config/crd/bases/psmdb.percona.com_perconaservermongodbrestores.yaml | Removes credentialsSecret from required fields for GCS storage in the Restore CRD base. |
| deploy/crd.yaml | Regenerated CRD manifest reflecting credentialsSecret no longer required. |
| deploy/bundle.yaml | Regenerated bundle reflecting credentialsSecret no longer required. |
| deploy/cw-bundle.yaml | Regenerated cw-bundle reflecting credentialsSecret no longer required. |
| e2e-tests/version-service/conf/crd.yaml | Updates version-service CRD copy to not require credentialsSecret. |
| e2e-tests/run-release.csv | Adds the new WI/ADC GCS e2e test to release runs. |
| e2e-tests/run-pr.csv | Adds the new WI/ADC GCS e2e test to PR runs. |
| e2e-tests/demand-backup-gcs-workload-identity/run | New e2e test runner script for GCS WI/ADC flow. |
| e2e-tests/demand-backup-gcs-workload-identity/conf/some-name.yml | New PSMDB cluster manifest for the WI/ADC e2e test. |
| e2e-tests/demand-backup-gcs-workload-identity/conf/restore.yml | Restore manifest template used by the new e2e test. |
| e2e-tests/demand-backup-gcs-workload-identity/conf/backup-gcs-wi.yml | Backup manifest template used by the new e2e test. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| type BackupStorageGCSSpec struct { | ||
| Bucket string `json:"bucket"` | ||
| Prefix string `json:"prefix,omitempty"` | ||
| CredentialsSecret string `json:"credentialsSecret"` | ||
| CredentialsSecret string `json:"credentialsSecret,omitempty"` | ||
| ChunkSize int `json:"chunkSize,omitempty"` | ||
| Retryer *GCSRetryer `json:"retryer,omitempty"` | ||
| } |
| required: | ||
| - bucket | ||
| - credentialsSecret | ||
| type: object |
| kubectl_bin annotate serviceaccount \ | ||
| --namespace "${namespace}" \ | ||
| "${namespace}-psmdb-db" \ | ||
| "iam.gke.io/gcp-service-account=${GCS_WI_SERVICE_ACCOUNT}" |
| storages: | ||
| gcs-wi: | ||
| type: gcs | ||
| gcs: | ||
| bucket: operator-testing | ||
| prefix: psmdb-demand-backup-gcs-wi |
| required: | ||
| - bucket | ||
| - credentialsSecret | ||
| type: object |
| required: | ||
| - bucket | ||
| - credentialsSecret | ||
| type: object |
- Set WorkloadIdentity: true in GCS credentials when credentialsSecret is empty so PBM uses ADC instead of erroring - Update unit test to expect WorkloadIdentity: true in the no-credentials GCS config
commit: 13ff352 |
|
@TineoC thank you for your contribution! |
Summary
credentials.workloadIdentityfield toBackupStorageGCSSpecenabling GCS backups via GKE Workload Identity without exported service account keysnewGoogleClientto fall back to Application Default Credentials (ADC) when no explicit credentials are providedcredentialsSecretoptional when workload identity is enabledMotivation
GKE environments using Workload Identity (Google's recommended approach) currently cannot use GCS backups because:
credentialsSecretand passesclientEmail/privateKeyto PBMnewGoogleClientrequires these fields and errors with:"clientEmail and privateKey are required for GCS credentials"roles/storage.objectUser, PBM never tries ADCThis is blocking for IL4/FedRAMP environments where exporting service account JSON keys is not permitted.
Closes #2314
Changes
Operator API (
pkg/apis/psmdb/v1/psmdb_types.go)GCSCredentialsstruct withWorkloadIdentity boolfieldCredentials *GCSCredentialstoBackupStorageGCSSpecCredentialsSecrettagomitemptyOperator Backup Logic (
pkg/psmdb/backup/pbm.go)credentials.workloadIdentity: true, skip readingcredentialsSecretPBM GCS Client (
pbm/storage/gcs/google_client.go)ClientEmailandPrivateKeyare both empty, usestoragegcs.NewClient(ctx)(ADC) instead of erroringUsage
Test plan
go test ./pkg/...)credentialsSecretstill works (backward compat)credentials.workloadIdentity: trueon GKE with WI succeeds🤖 Generated with Claude Code