refactor(k8s): organize Helm templates into base/aws/on-prem#5757

Open
aicam wants to merge 3 commits into apache:main from aicam:aws-eks/01-template-reorg

@aicam aicam commented Jun 17, 2026

Contributor

What changes were proposed in this PR?

This is a refactoring of the Helm chart under bin/k8s/ — it reorganizes the flat templates/ directory into a clear, self-documenting layout, and brings the development values overlay back in sync with the source-of-truth values.yaml. It is the first in a planned series of small, non-breaking PRs that make the chart cleanly deployable on AWS/EKS while keeping on-prem/local as the unchanged default.

1. Template folder reorganization (behavior-neutral). Helm renders templates/** recursively, so moving files into subdirectories does not change rendered output. Templates are now grouped two ways:

  • By where they apply: templates/common/ (shared by every deployment), templates/onprem/ (minio-persistence.yaml), and templates/aws/ (a placeholder for AWS-only, value-gated templates added in later PRs; empty for now).
  • Within common/, by component — one subfolder per service holding all of its manifests, e.g. common/access-control-service/, common/gateway/, common/workflow-computing-unit-manager/, common/workflow-computing-unit-pool/, etc.

A templates/README.md documents the convention, and .helmignore is added so *.md/.gitkeep are not loaded as manifests.
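As an illustration of the two-level grouping, the sketch below maps a template path to its scope folder and, for common/, its component subfolder. The function name classify and the file name deployment.yaml are hypothetical and only serve the example; the folder names are the ones used in this description.

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple

# Scope folders as named in this description (assumed to be the only ones).
SCOPES = {"common", "onprem", "aws"}


def classify(template_path: str) -> Tuple[str, Optional[str]]:
    """Return (scope, component) for a path relative to the chart root."""
    parts = PurePosixPath(template_path).parts
    if len(parts) < 3 or parts[0] != "templates" or parts[1] not in SCOPES:
        raise ValueError(f"not under a known scope folder: {template_path}")
    scope = parts[1]
    # Only common/ is split further into one subfolder per component.
    component = parts[2] if scope == "common" and len(parts) > 3 else None
    return scope, component


# Hypothetical file name under a component folder named in the description.
print(classify("templates/common/gateway/deployment.yaml"))
print(classify("templates/onprem/minio-persistence.yaml"))
```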

2. values-development.yaml realigned with values.yaml. The dev overlay had drifted from the source of truth:

  • imageRegistry and imageTag updated from ghcr.io/apache and latest to docker.io/apache and 1.3.0-incubating-SNAPSHOT, matching values.yaml.
  • Added the AUTH_JWT_SECRET dev-default that already exists in values.yaml but was missing from the dev overlay. Without it, creating a Kubernetes computing unit fails with a NoSuchElementException (None.get) in ComputingUnitManagingResource, which reads AUTH_JWT_SECRET from the environment unconditionally.
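One way to catch this kind of drift early is to compare the environment-variable names in the two values files. The sketch below is only an illustration: it assumes the environment variables are stored as a list of name/value entries, and the helper name missing_env_names and the sample entry SOME_OTHER_VAR are hypothetical. If the variables are indeed a list, the check matters because Helm replaces a list from an overlay wholesale rather than merging it entry by entry, so a name absent from the overlay is absent from the rendered chart.

```python
from typing import Dict, List


def missing_env_names(
    base: List[Dict[str, str]], overlay: List[Dict[str, str]]
) -> List[str]:
    """Names defined in the base values but absent from the overlay."""
    overlay_names = {entry["name"] for entry in overlay}
    return [entry["name"] for entry in base if entry["name"] not in overlay_names]


# Sample data with an assumed shape; the values are placeholders.
base_vars = [
    {"name": "SOME_OTHER_VAR", "value": "x"},
    {"name": "AUTH_JWT_SECRET", "value": "<dev-default>"},
]
overlay_vars = [
    {"name": "SOME_OTHER_VAR", "value": "x"},
]

print(missing_env_names(base_vars, overlay_vars))
```

In practice the two lists would be loaded from values.yaml and values-development.yaml with a YAML parser before calling the helper.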

No values were renamed or removed and templates/aws/ is empty, so the default on-prem/local install is unchanged.

Any related issues, documentation, discussions?

Closes #5892 (refactor sub-task).
Part of #5891 — unify AWS (EKS) and on-premise Kubernetes deployment under bin/k8s (parent feature).
Follows the design discussion in #5641.

This is the first of a planned series of incremental, non-breaking PRs to add AWS/EKS deployment support to the chart; later PRs build on this structure (pluggable object storage, node placement, AWS load balancer, computing-unit warm pool, eksctl/runbook). The new bin/k8s/templates/README.md documents the common/aws/onprem and per-component conventions.

How was this PR tested?

This is a chart refactor, so it was verified to be a no-op and then exercised end-to-end:

  1. Render no-op proof: helm template texera bin/k8s -f bin/k8s/values-development.yaml was captured before and after the file moves. The output is identical — 102 resources both times — with the only differences being the # Source: provenance comments (the new paths) and a per-render randomized secret value. helm lint bin/k8s passes.
  2. End-to-end smoke on minikube (on-prem/development profile, published apache images): the reorganized chart installs, all core pods reach Running, the UI loads and authenticates, dataset upload + preview work (file-service + MinIO), and Kubernetes computing-unit creation succeeds and goes green — exercising the AUTH_JWT_SECRET fix.
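The render comparison in step 1 can be approximated with a small script. This is a sketch, not the procedure the author ran: it assumes the two helm template outputs were saved as text, and the helper names strip_provenance and render_diff are hypothetical. It drops the # Source: comment lines before diffing, so that moved files alone produce no diff.

```python
import difflib
from typing import List


def strip_provenance(rendered: str) -> List[str]:
    """Drop the '# Source:' comments, which change whenever a file moves."""
    return [
        line for line in rendered.splitlines() if not line.startswith("# Source:")
    ]


def render_diff(before: str, after: str) -> List[str]:
    """Unified diff of two rendered charts, ignoring provenance comments."""
    return list(
        difflib.unified_diff(
            strip_provenance(before),
            strip_provenance(after),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )


# Hypothetical rendered snippets; only the template path differs.
before = "---\n# Source: texera/templates/gateway.yaml\nkind: Service\n"
after = "---\n# Source: texera/templates/common/gateway/gateway.yaml\nkind: Service\n"

print(render_diff(before, after))
```

Any lines that remain in the diff, such as the per-render randomized secret value, still need to be reviewed by hand.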

No unit tests were added because the change is limited to Helm chart file organization and values, which is validated by the render-diff and helm lint above rather than by JVM unit tests.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

@github-actions (bot) added the docs (Changes related to documentations) and dev labels Jun 17, 2026
@aicam aicam marked this pull request as draft June 17, 2026 18:47
Group the chart's templates by where they apply so the layout makes the
deployment surface obvious at a glance:

  templates/common/  resources every deployment needs (services, gateway,
                     postgres/lakefs/lakekeeper, computing-unit pool, RBAC)
  templates/onprem/  self-hosted-only resources (in-cluster MinIO)
  templates/aws/     AWS/EKS-only resources (added in later PRs; placeholder)

Helm renders templates/** recursively, so this is purely organizational:
`helm template` output is byte-identical to before the move (verified, modulo
the chart's pre-existing per-render random LakeFS keys). A templates/README.md
documents the convention and a .helmignore keeps the doc and .gitkeep
placeholder from being loaded as manifests.

Also bring values-development.yaml back in line with values.yaml (the ground
truth), which had drifted:
  - image source: docker.io/apache + 1.3.0-incubating-SNAPSHOT (was the stale
    ghcr.io/apache + latest), so both value files pull from the same place;
  - add the AUTH_JWT_SECRET entry to texeraEnvVars. values.yaml has it but the
    dev profile omitted it, so the computing-unit manager started without the
    secret and k8s computing-unit creation crashed with NoSuchElementException
    (None.get) in ComputingUnitManagingResource. Adding it (same dev-only
    default as values.yaml) makes CU creation work under the dev profile.

No behavior change to the default (on-prem) install.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@aicam aicam force-pushed the aws-eks/01-template-reorg branch from a1fcbab to bd9fa60 Compare June 22, 2026 16:51
@github-actions

Contributor

Automated Reviewer Suggestions

Based on the git blame history of the changed files, we recommend the following reviewers:

  • Contributors with relevant context: @bobbai00, @aglinxinyuan, @mengw15
    You can notify them by mentioning @bobbai00, @aglinxinyuan, @mengw15 in a comment.

@aicam aicam requested a review from bobbai00 June 22, 2026 18:16

@Ma77Ball Ma77Ball left a comment

Contributor

Overall LGTM!! Please make sure the helm chart and Docker can still be built, but otherwise I think it can be merged and looks like good work.

Comment thread bin/k8s/values-development.yaml Outdated

@Ma77Ball Ma77Ball left a comment

Contributor

@aicam, this might need to be adjusted on Docker or here.

Comment thread bin/k8s/values-development.yaml Outdated

@bobbai00 bobbai00 left a comment

Contributor

Rename onprem to on-prem, common to base. Make sure you also update the README.md under the templates folder accordingly

Comment thread bin/k8s/values-development.yaml Outdated
…-development

- Rename templates/common -> templates/base and templates/onprem ->
  templates/on-prem per review (@bobbai00), and update templates/README.md
  accordingly.
- Revert values-development.yaml to apache/main: drop the
  imageRegistry/imageTag override (@Ma77Ball, @bobbai00) and the hardcoded
  AUTH_JWT_SECRET (@Ma77Ball), keeping this PR a pure mechanical reorg.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@aicam aicam changed the title from "refactor(k8s): organize Helm templates into common/aws/onprem" to "refactor(k8s): organize Helm templates into base/aws/on-prem" Jun 23, 2026
aicam commented Jun 23, 2026

Contributor Author

Thanks for the review @bobbai00 @Ma77Ball — addressed in ca0a4c7:

  • Folder names (@bobbai00): renamed templates/common -> templates/base and templates/onprem -> templates/on-prem, and updated templates/README.md to match.
  • imageRegistry/imageTag (@Ma77Ball, @bobbai00): reverted values-development.yaml back to ghcr.io/apache / latest so the out-of-the-box helm install -f values-development.yaml keeps pulling published images.
  • Hardcoded AUTH_JWT_SECRET (@Ma77Ball): dropped from this PR as well. This keeps PR1 a pure mechanical reorg with zero value/logic changes. I'll open a separate follow-up for an install-time generated dev secret (lookup + randAlphaNum) as you suggested.

Verification: helm template texera bin/k8s renders byte-identical output to the pre-rename commit (5085 lines, only # Source: paths differ), and helm lint is clean. The reorg remains behavior-neutral.
