fix: resolve CA client FD leak & enhance kubectl-hlf to generate declarative secretRefs #319

Open
Mau-MR wants to merge 6 commits into hyperledger-bevel:main from Mau-MR:fix/ca-cert-provision-flush

Mau-MR commented May 6, 2026

What this PR does / why we need it:

This PR delivers key reliability fixes and declarative GitOps enhancements to the Hyperledger Bevel Fabric Operator and its corresponding kubectl-hlf CLI plugin:

  1. Race Condition & File Descriptor Leak Fix:
    • The Bug: The operator occasionally failed to initialize the Fabric CA client with the error Failed to get client TLS config: Failed to process certificate from file .... The cause was that the temporary CA certificate file was not flushed and closed before its path was passed to the underlying Fabric-CA SDK library.
    • The Fix: The file is now flushed and closed before the library reads it, which ensures the certificate data is complete and releases the file descriptor.
  2. Disk Leak Fix:
    • The Bug: Temporary certificate files were being generated in the system's root /tmp directory and were never cleaned up, resulting in long-term disk leaks.
    • The Fix: Relocates temporary file generation to the managed caHomeDir where the operator's existing cleanup routine takes care of automatic pruning.
  3. Declarative GitOps & CA Certificate Rotation Support (kubectl-hlf CLI):
    • Enhancement: Previously, the kubectl-hlf CLI commands (for peers, orderers, and identities) imperatively fetched the target CA, read its .status.tlsCert, and embedded it as a static base64-encoded cacert block. This broke declarative/GitOps pipelines (like ArgoCD) because resources could not be template-generated or applied before the CA was fully up and running.
    • The Solution: Updated kubectl hlf peer create, kubectl hlf ordnode create, kubectl hlf identity create, and kubectl hlf identity update to natively output a secretRef pointing to the CA's deterministic <ca-name>--tls-cryptomaterial secret (under key tls.crt).
    • To prevent schema validation issues with the CRD spec, cacert is preserved as an explicit empty string (""). This enables true offline manifest rendering and automatic CA certificate rotation!
  4. Helm Installation Reference Updates:
    • Fix: Corrected the legacy Helm installation repository endpoints in README.md to point to the new officially-managed hyperledger-bevel charts registry (https://hyperledger-bevel.github.io/bevel-operator-fabric/) instead of the deprecated kfsoftware endpoints.

Which issue(s) this PR fixes:

Fixes #318


Special notes for your reviewer:

The core operator changes are centered in controllers/certs/provision_certs.go:

  • Added caCertFile.Close() so the temporary file is closed, and its data complete, before the library reads it.
  • Changed the ioutil.TempFile call to create the file in caHomeDir instead of "" (the system temp directory) so that the existing cleanup routine removes it.

The CLI plugin enhancements are centered in:

  • kubectl-hlf/cmd/peer/create.go
  • kubectl-hlf/cmd/ordnode/create.go
  • kubectl-hlf/cmd/identity/create.go
  • kubectl-hlf/cmd/identity/update.go

How to reproduce the problem exactly:

  1. Revert the changes in controllers/certs/provision_certs.go (remove caCertFile.Close()).
  2. Run the regression test: go test -v ./controllers/certs/ -run TestGetClient_ValidCertFile -count 10.
  3. The test will consistently fail with a "File descriptor leak detected" message (e.g., started with 6, ended with 7).
  4. In production, this manifests as: Failed to get client TLS config: Failed to process certificate from file /tmp/ca-cert....

Does this PR introduce a user-facing change?

The kubectl hlf plugin now generates secretRef-based enrollment configuration for identities as well as for peers and orderers, instead of fetching the CA certificate itself. The plugin should therefore be updated only once the Operator Controller Manager running in the cluster has also been updated: identity enrollment works only with an Operator version that supports the secretRef field introduced in this change.

Fixed a race condition and disk leak in CA client certificate processing. Updated kubectl-hlf CLI generator commands to natively produce declarative secretRef specifications for Peers, Orderers, and Identities, enabling GitOps workflows and seamless CA certificate rotation out of the box.

Additional documentation, usage docs, etc.:

  • Regression Tests: Automated tests in controllers/certs/provision_certs_test.go verify successful CA client initialization and guarantee that file descriptors and temporary files are fully closed and cleaned up.

Mau-MR added 6 commits May 4, 2026 18:34
… leaks

Signed-off-by: Mauricio E. Merida Rivera <mauricio@rubidex.ai>
…oning

Signed-off-by: Mauricio E. Merida Rivera <mauricio@rubidex.ai>
… location

After the project migrated from kfsoftware/hlf-helm-charts to
hyperledger-bevel/bevel-operator-fabric, the release_charts.yml
workflow was updated to publish charts under the new GitHub Pages URL.

This commit updates the README installation instructions to reflect
that change:
- Repo URL: kfsoftware.github.io/hlf-helm-charts → hyperledger-bevel.github.io/bevel-operator-fabric/
- Repo alias: kfs → bevel
- Fixed stray `--` in the helm install command that would cause a parse error

Signed-off-by: Mauricio E. Merida Rivera <mauricio@rubidex.ai>
This updates the kubectl-hlf plugin commands (peer create, ordnode create, identity create, and identity update)
to configure the generated resources with a secretRef referencing the CA's TLS cryptomaterial secret
instead of embedding a raw base64-encoded cacert string.

Benefits:
1. True declarative/GitOps-friendly deployments without needing to look up certs at templating/applying time.
2. Portability and native support for CA certificate rotations.

Signed-off-by: Mauricio E. Merida Rivera <mauricio@rubidex.ai>
Since version 1.14 of the operator, authentication and authorization
for the metrics endpoint have been handled directly by the controller manager
server (using filters.WithAuthenticationAndAuthorization). The `kube-rbac-proxy`
sidecar is deprecated and no longer needed.

This commit removes the remaining Kustomize and Helm configurations for the
`kube-rbac-proxy` to fix the `test-kubectl-plugin.yml` CI pipeline, which was
failing with an `ImagePullBackOff` error when trying to pull the proxy image.

Changes include:
- Removing the proxy sidecar patch from config/default/kustomization.yaml
- Removing proxy-related RBAC and service resources from config/rbac/kustomization.yaml
- Adding tokenreviews and subjectaccessreviews RBAC markers to main.go to ensure make manifests correctly generates the required permissions in config/rbac/role.yaml
- Cleaning up leftover proxy cluster roles and bindings from the Helm chart (chart/hlf-operator/templates/rbac.yaml)

Signed-off-by: Mauricio E. Merida Rivera <mauricio@rubidex.ai>
The previous commit introduced a bug where the generated SecretRef
for the CA TLS used `fabricCA.Name` or `certAuth.Name`. These properties
are populated by `MapClusterCA` with the format `<name>.<namespace>`.
This caused the secret name to incorrectly include the namespace,
resulting in names like `org1-ca.default--tls-cryptomaterial`.

This fix updates the CLI to use the raw Kubernetes object name via
`Item.Name` so that the secret name matches what the CA controller
generates (e.g., `org1-ca--tls-cryptomaterial`).

Signed-off-by: Mauricio E. Merida Rivera <mauricio@rubidex.ai>
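
The naming bug described in the last commit can be reproduced with a small sketch. The tlsSecretName helper is hypothetical; only the name formats and the example values come from the commit message.

```go
package main

import "fmt"

// tlsSecretName is a hypothetical helper that appends the suffix the CA
// controller uses for its TLS secret.
func tlsSecretName(objectName string) string {
	return objectName + "--tls-cryptomaterial"
}

func main() {
	mapped := "org1-ca.default" // <name>.<namespace> form, as described for MapClusterCA
	raw := "org1-ca"            // raw Kubernetes object name

	fmt.Println(tlsSecretName(mapped)) // org1-ca.default--tls-cryptomaterial (no such secret)
	fmt.Println(tlsSecretName(raw))    // org1-ca--tls-cryptomaterial
}
```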

Development

Successfully merging this pull request may close these issues.

Bug: Unclosed temp file descriptor causes intermittent "Failed to process certificate" when using secretRef for CA certificates
