feat: move GitLab from EKS to EC2 docker-compose by allamand · Pull Request #755 · aws-samples/appmod-blueprints

allamand · 2026-06-23T17:00:20Z

Moves GitLab CE out of the EKS hub cluster onto the IDE EC2 instance via docker-compose, exposed through a CDK-managed NLB + CloudFront distribution.

Changes

Disable gitlab ArgoCD addon (enabled-addons, hub-config, platform.yaml)
Remove GitLab Keycloak SSO client registration (no SSO for EC2 GitLab)
Remove gitlab-nlb + gitlab-distribution from Taskfile.cloudfront.yaml (CloudFront now CDK-managed in platform-engineering-on-eks)
Add gitlab:init-ec2 task: wait for GitLab CE readiness, create root token, user1, repos via GitLab API (replaces kubectl exec into the k8s pod)
Replace k8s Job wait in clone-repos with CloudFront readiness poll
GITLAB_DOMAIN_INT now uses EC2 private IP (private/gitlab-ec2-private-ip) for in-cluster ArgoCD git access
Remove git_token from seed-secrets (seeded by CDK deploy-time Lambda)
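
The readiness waits listed above (GitLab CE readiness in gitlab:init-ec2, the CloudFront readiness poll in clone-repos) follow a retry-until-success pattern. A minimal sketch of that pattern, not the actual task body: `wait_until`, `gitlab_ready`, `GITLAB_URL` and the `/users/sign_in` probe path are all hypothetical names chosen for illustration.

```shell
# Hypothetical helper: retry a probe command until it succeeds or the
# attempts run out. Usage: wait_until <max_attempts> <delay_seconds> <command...>
wait_until() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Hypothetical probe: succeeds once the endpoint answers without an HTTP error.
# GITLAB_URL is an assumed variable, not a name taken from the Taskfile.
gitlab_ready() {
  curl --silent --fail --output /dev/null --max-time 10 "${GITLAB_URL}/users/sign_in"
}

# Example: wait_until 60 10 gitlab_ready
```

The probe is passed as a command, so the same loop could cover both the GitLab wait and the CloudFront poll.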

Companion MR

platform-engineering-on-eks: feat/pr-709-kind-crossplane-gitlab-on-ec2 → feat/pr-709-kind-crossplane

Closes #754

- Disable GitLab ArgoCD addon (enabled-addons, hub-config, platform.yaml registry)
- Remove Keycloak SSO client for GitLab
- Remove gitlab-nlb/gitlab-distribution Taskfile.cloudfront.yaml tasks
- Add gitlab:init-ec2 task: wait for CE readiness, create root token, user1, repos
- Replace k8s Job wait in clone-repos with CF readiness poll
- GITLAB_DOMAIN_INT uses EC2 private IP for in-cluster ArgoCD git access
- Remove git_token from seed-secrets (seeded by CDK at deploy time)

Refs #754

Added: urls, hub:set-overlay-repo, hub:restart-langfuse, hub:wait-for-full-sync, secrets-manager:seed-secrets, secrets-manager:seed-observability, hub:create-mgmt-roles, hub:restart-identity-pods, hub:update, spokes:enable-crossplane/kro, spokes:create-capabilities, spokes:disable-crossplane/kro/all, spokes:seed-provider-identity

Removed: idc:setup from install task (moved to workshop:Taskfile.yaml)


RGDs ekscluster.kro.run and eksclusterwithvpc.kro.run were Inactive because:
1. ACK SecretsManager controller was missing (rg-eks.yaml references secretsmanager.services.k8s.aws/v1alpha1)
2. ESO was not installed on the bootstrap kind cluster (rg-eks.yaml references external-secrets.io/v1)

Fixes:
- Add ACK_SECRETSMANAGER_VERSION=1.3.1 and install it in ack:install
- Add kro:install-eso-bootstrap task (idempotent) before kro:apply-rgds

…dd argoCdCapabilityRoleArn

- Remove spec.argocdCapability block (not in EksclusterWithVpc RGD schema, causes strict decode error)
- Add missing argoCdCapabilityRoleArn field (required by RGD)
- Remove now-unused IDC sed substitution lines


…lity

Port from feature/platform-cluster-kro-ack:
- rg-eks.yaml: add argocdCapability schema object, replace accessEntryArgoCdCapability with conditional argocdCapabilityRole + argocdCapability (EKS Capability with IDC) + argocdCapabilityAccessEntry (all guarded by includeWhen enabled==true)
- rg-eks-vpc.yaml: add argocdCapability schema object, pass through to nested EksCluster
- hub:claim: restore argocdCapability block with IDC sed substitutions


Deletes eksclusters.kro.run and eksclusterwithvpcs.kro.run CRDs on the bootstrap kind cluster. Needed when RGD schema changes remove fields (KRO breaking-change protection blocks the update otherwise). Run: task kind-kro-ack:kro:reset-crds, then task install.


…xist

When the EKS ArgoCD Capability is used, ArgoCD CRDs are already present on the hub cluster. Helm install then fails with 'CRD already exists'. Add a second status check: skip if applications.argoproj.io CRD exists.

…pendency)

Switch spoke-dev from spokes:enable-crossplane + create-capabilities to spokes:enable-kro, matching spoke-prod. Both spokes are now provisioned via the KRO EksclusterWithVpc RGD + ACK controllers.

This eliminates the Crossplane provider credential chicken-and-egg for spoke provisioning entirely: the hub's Crossplane providers are still used for pod-identity/IAM on the hub itself, but spoke EKS clusters no longer depend on them.

Also makes the environment label dynamic (derived from cluster name) so spoke-dev correctly gets 'environment: dev' instead of hardcoded 'prod'.
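
The commit does not say how the environment label is derived from the cluster name. One plausible rule, shown here purely as an assumption, is to take the text after the last hyphen; `environment_label` is a hypothetical helper, not a task from the Taskfile.

```shell
# Hypothetical rule (an assumption, not confirmed by the commit):
# the environment is whatever follows the last hyphen of the cluster name.
environment_label() {
  printf '%s\n' "${1##*-}"
}

environment_label spoke-dev    # prints: dev
environment_label spoke-prod   # prints: prod
```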

The task failed immediately (exit 1) because under 'set -e -o pipefail' (the global Taskfile setting), the kubectl jsonpath that queries unhealthy providers returns non-zero when no items match (empty result). Combined with pipefail, this propagated through the pipe to wc and killed the script on the first loop iteration, even when all providers were healthy.

Fix:
- Split kubectl|wc pipeline: kubectl with '|| true', then wc separately
- Replace '&& exit 0' conditional chain with if/then/break (chains with set -e are a trap: any false part returns non-zero → errexit fires)
- Use 'break' instead of 'exit 0' so cleanup (rm kubeconfig) still runs
- Add '|| true' to the progress printf line (same && chain issue)
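
A minimal sketch of the fixed loop shape, assuming bash. `list_unhealthy` is a hypothetical stand-in for the kubectl query and the function names are illustrative; this is not the task's actual script.

```shell
# Stand-in for the kubectl query (hypothetical): prints nothing and exits
# non-zero, like a query that matches no unhealthy providers.
list_unhealthy() { return 1; }

# Count unhealthy providers without letting an empty result abort the script.
count_unhealthy() {
  local out
  out="$(list_unhealthy || true)"      # tolerate the non-zero exit status
  if [ -z "$out" ]; then
    echo 0
  else
    printf '%s\n' "$out" | wc -l | tr -d ' '
  fi
}

# Poll loop using if/then/break instead of an '&& exit 0' chain, so the
# lines after the loop (the cleanup in the real task) still run.
poll_providers() (
  set -e -o pipefail
  for attempt in 1 2 3; do
    if [ "$(count_unhealthy)" -eq 0 ]; then
      echo "all providers healthy"
      break
    fi
    sleep 1
  done
  echo "cleanup still runs"
)
```

Running `poll_providers` with the stand-in above reaches the final echo, which is the point of using 'break' rather than 'exit 0'.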

…-for-providers

The rm -f at the end of hub:wait-for-providers deletes a kubeconfig that the very next task immediately recreates. It adds no value (the file doesn't go stale within a single install run) and can cause failures if a subsequent task expects it to exist.

…served

Wait for applicationsets/applications.argoproj.io CRDs to be Established before applying root-appset.yaml, to avoid 'no matches for kind ApplicationSet in version argoproj.io/v1alpha1' when the ArgoCD EKS Capability has not yet reconciled after the EksCluster becomes ACTIVE.

…dfrontDomain prereq gate

- kind-crossplane: run cloudfront:sync-domain right after setup-exposure and before hub:seed (mirroring kind-kro-ack) so cloudfrontDomain is written to config.local.yaml + private/cloudfront-domain before hub:seed reads it. Covers install retries where setup-exposure's status guard skips because the <hub>-platform distribution already exists, which previously left hub:seed with an empty ingress_domain_name.
- workshop:pre-install: cloudfrontDomain (platform CloudFront) is created by hub-distribution during install, not a prerequisite. Remove the hard exit 1 on empty cloudfrontDomain and correct messaging (create-config.sh sets cloudfront.gitlabDomain, not cloudfrontDomain).

…rs values

phase2 ran spokes:enable-kro for spoke-dev and spoke-prod concurrently. Both commit+push to the same fleet-config main branch and write the same shared files, so the second push was rejected, its retry rebased two add/add commits into an unresolvable conflict, left a detached HEAD, and failed the whole task install (exit 128).

- install:phase2-parallel now runs spoke-dev then spoke-prod sequentially (the tasks only do git clone/commit/push; EKS provisioning is async via ArgoCD/KRO, so serialising costs ~seconds).
- spokes:enable-kro now deep-merges its cluster block into gitops/fleet/.../kro-clusters/values.yaml via 'yq . *= load(...)' instead of a full 'cat >' overwrite that dropped the sibling spoke's block (last-writer-wins data loss, and the source of the add/add rebase conflict).

…p config.local.yaml storage

The platform CloudFront domain is created mid-install by cloudfront:hub-distribution, but consumers read it via taskfile-level (global) vars that go-task evaluates ONCE at load time, before the distribution exists, so the value was frozen empty. That is why idc:configure got an empty --keycloak-dns and 'urls' printed https:/// even though config.local.yaml and private/cloudfront-domain both held the domain.

Make private/cloudfront-domain the single source of truth (runtime artifact), resolved lazily in each consumer's own task-level vars with an AWS 'Comment==<hub>-platform' fallback. Stop storing/reading it in config.local.yaml entirely.

common/Taskfile.cloudfront.yaml:
- hub-distribution writes only private/cloudfront-domain (drop yq -i config write)
- sync-domain reads private -> AWS (drop config read), writes only private

workshop/Taskfile.yaml (shared by both providers):
- idc:configure KEYCLOAK_DNS is now a self-contained task-level resolver (private -> AWS)
- setup-env CF_DOMAIN resolves lazily; gitlab domain from gitlabDomain -> private file
- top-level CLOUDFRONT_DOMAIN reads private only (informational/pre-install); messaging updated

kind-crossplane + kind-kro-ack:
- hub:seed ingress_domain_name/exposure_mode use a task-level lazy CLOUDFRONT_DOMAIN
- urls fallback resolves inline (private -> AWS) instead of the frozen global
- crossplane hub:update-ingress-domain reads private -> AWS
- top-level CLOUDFRONT_DOMAIN reads private only
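
A sketch of the 'private -> AWS' resolution order described in this commit, evaluated at call time rather than once at load time. `resolve_cloudfront_domain`, `aws_lookup` and `HUB_NAME` are hypothetical names; the JMESPath query is an assumption about how the 'Comment==<hub>-platform' fallback could be expressed, not the Taskfile's actual command.

```shell
# Hypothetical resolver: prefer the runtime artifact file, otherwise run the
# fallback command given as the remaining arguments.
# Usage: resolve_cloudfront_domain <file> <fallback command...>
resolve_cloudfront_domain() {
  local file="$1"
  shift
  if [ -s "$file" ]; then
    cat "$file"
  else
    "$@"
  fi
}

# Assumed AWS fallback: look up the distribution by its Comment field.
# HUB_NAME is an illustrative variable, not taken from the Taskfile.
aws_lookup() {
  aws cloudfront list-distributions \
    --query "DistributionList.Items[?Comment=='${HUB_NAME}-platform'].DomainName | [0]" \
    --output text
}

# Example: resolve_cloudfront_domain private/cloudfront-domain aws_lookup
```

Because the function reads the file each time it is called, a consumer that runs after hub-distribution sees the domain even though it did not exist when the Taskfile was loaded.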

…nt mode

The platform ALB is pre-created as scheme=internal (CloudFront VPC Origin backend), but the platform IngressClassParams hardcoded scheme=internet-facing. ALB scheme is immutable, so when the AWS LBC adopts the ALB by loadBalancerName it refuses to reconcile on the scheme conflict: no listener rules are attached, every platform ingress (keycloak, backstage, grafana, ...) gets no address, and all requests hit the ALB default 404 action. This blocked idc:configure (Keycloak SAML descriptor 404) and made all platform URLs return 404, for both kind-crossplane and kind-kro-ack.

- ingress-class-alb chart: scheme is now templated ({{ .Values.scheme | default internet-facing }})
- core.yaml appset: pass scheme=internal when exposure_mode=cloudfront, internet-facing otherwise

…s to EKS RGD

The kro-ack EksCluster RGD created PodIdentityAssociations for external-secrets, external-dns, adot, policy-reporter, cni-metrics-helper and cloudwatch-agent, but NOT for two service accounts that make AWS calls:

- kube-system/aws-load-balancer-controller-sa: without creds the LBC can't build the ingress model (IMDS fallback fails: 'no EC2 IMDS role found'), attaches no listener rules to the platform ALB, so every platform ingress (keycloak, backstage, grafana, argo-workflows) gets no address and the ALB serves its default 404, which blocked idc:configure (Keycloak SAML descriptor 404).
- keycloak/keycloak-config: the config job runs 'aws secretsmanager create/put-secret-value' to publish keycloak-clients; without creds ESO never syncs it and Backstage/Argo SSO break.

Adds an ACK Policy + Role (pods.eks.amazonaws.com trust) + PodIdentityAssociation for each, modeled on the existing external-dns chain. LBC policy reuses the canonical AWS LBC IAM policy; keycloak-config gets least-privilege Secrets Manager permissions. (Crossplane provider already covers these via crossplane-pod-identity.)

…amp hub provider label

On the kro-ack provider, pod identities are created by the KRO EksCluster RGD (ACK). Crossplane's own providers are not bootstrapped on the kro-ack hub, so the crossplane-pod-identity (pod_identities) app there produces permanently-Degraded CRs (SYNCED=False) and is a latent 409 conflict with the ACK-created associations.

- core.yaml: pod_identities selector now excludes clusters labelled provider=kro-ack (NotIn also matches clusters with no provider label, so crossplane/legacy clusters, including crossplane's KRO-provisioned spoke-prod, which needs the crossplane provider bootstrap identities, are unaffected).
- kind-kro-ack hub:seed writes addons='{"provider":"kro-ack"}' into peeks-hub/config; the fleet-secret ExternalSecret renders the addons key as labels, so the kro-ack hub cluster secret gets provider=kro-ack and pod_identities is not generated for it.

Note: kro-ack spoke secrets still need provider=kro-ack wired through spokes:enable-kro -> EksclusterWithVpc/EksCluster schema -> argocdSecret label (follow-up).

… pod_identities

Threads a provider field from spokes:enable-kro through the KRO claim to the spoke argocd cluster secret label, so kro-ack spokes (like the hub) are excluded from the crossplane pod_identities app (provider NotIn [kro-ack]).

- kind-kro-ack spokes:enable-kro writes provider: kro-ack into the kro-clusters values
- kro-clusters chart maps $cluster.provider -> EksclusterWithVpc.spec.provider
- rg-eks-vpc EksclusterWithVpc schema + passthrough to nested EksCluster
- rg-eks EksCluster schema + argocdSecret label provider: ${schema.spec.provider}

Default is empty, so crossplane-provisioned clusters (and crossplane's KRO-provisioned spoke-prod, whose spokes:enable-kro does not set it) keep provider unset -> NotIn matches -> pod_identities stays enabled for them (preserves crossplane provider bootstrap identities).

…GD (temporary bridge)

Until devlake (and other data resources) are migrated off Crossplane, kro-ack still needs Crossplane working on the hub. But on kro-ack the Crossplane AWS providers have no credentials (provider-aws-iam/eks pod identities are themselves crossplane CRs, a chicken-and-egg problem), so all downstream crossplane roles/PIAs (amp/rds/grafana/devlake) stay SYNCED=False and devlake RDS / AMP / Grafana break.

Seed the two ROOT providers' pod identities via ACK in the EksCluster RGD (which provisions both hub and spokes on kro-ack). Once provider-aws-iam/eks have creds, Crossplane reconciles the rest itself.

- rg-eks.yaml: add crossplaneProviderRole (AdministratorAccess, pods.eks trust) + PodIdentityAssociations for crossplane-system/provider-aws-iam and provider-aws-eks. Gated includeWhen provider=='kro-ack' && enable_crossplane_aws=='true' so it does NOT run on crossplane-provisioned clusters (incl. crossplane's KRO-provisioned spoke-prod, where crossplane-pod-identity already creates these -> avoids 409).
- platform-cluster-kro hub claim: set provider=kro-ack so the gate fires on the hub.

TEMPORARY: remove these resources once the devlake->kro-ack migration lands.

…created ALB (curl 000)

Document the failure where platform URLs time out with curl 000 in cloudfront mode because the CloudFront VPC origin is bound to a deleted ALB ARN (the LBC delete-recreated the ALB, classically due to an IngressClassParams scheme mismatch). Includes the direct-ALB isolation test, the VPC-origin-ARN vs current-ALB-ARN check, the CloudTrail DeleteLoadBalancer lookup, and the fix (scheme=internal to stop churn + recreate/swap the VPC origin to re-point CloudFront).

…s ALB in place (no churn)

Root-cause hardening for the stale-VPC-origin issue: the LBC recreated the platform ALB when its subnet set didn't match what create-alb built, orphaning the CloudFront VPC origin.

- create-alb now selects PRIVATE subnets by internal-elb tag then by MapPublicIpOnLaunch==false (instead of a fragile *private* name tag), and tags the chosen subnets kubernetes.io/role/internal-elb=1 so the AWS LBC discovers the SAME set and adopts the ALB in place instead of calling SetSubnets / recreating it.
- steering/troubleshooting.md: document the implemented prevention (scheme=internal + subnet tagging) and the proposed-but-NOT-implemented cloudfront:sync-vpc-origin reconcile as future hardening (design + wiring), to add only if ALB churn recurs.

…kfile (not RGD)

The hub's crossplane iam/eks providers had no AWS credentials, so crossplane-base's provider roles/PIAs (amp/rds/grafana/devlake) never reconciled -> no AMP/Grafana/RDS -> crossplane-base, observability-aws, grafana-dashboards, devlake all Degraded.

Root cause: crossplane-base declares the downstream provider PIAs but iam/eks are createIdentity:false (chicken-and-egg), and nothing on the kro-ack hub seeds the first credential. The prior RGD-based bootstrap (6f8fb2c) can't fix the hub because the hub's EksclusterWithVpc lives on the transient kind bootstrap cluster and is never reconciled by the hub-EKS RGD, so it can't self-heal an already-running hub.

Fix (mirrors the crossplane/terraform flows): new hub:bootstrap-crossplane-identity task, run after hub:wait-for-sync, that idempotently creates the peeks-hub-crossplane-provider admin role + provider-aws-iam/eks pod identities, then restarts the crossplane provider pods (label pkg.crossplane.io/revision) twice: pass 1 credentials iam/eks, they reconcile the downstream provider roles/PIAs, pass 2 credentials those providers. Idempotent via a status guard (skips once CrossplaneAMPProviderRole exists). RGD bootstrap left in place for spokes.

…moved in bb387b0

The previous commit's strReplace anchored on the repeated 'rm -f private/hub-kubeconfig' boilerplate and inadvertently dropped the spokes:create-capabilities task. Restore it verbatim; the hub:bootstrap-crossplane-identity task and its install wiring are unchanged. Task count back to 55 (was erroneously net-neutral at 54).

… + async spokes + tolerant spoke gate

Replace the imperative EKS Capability creation with declarative Crossplane Capability MRs, make Crossplane spokes provision fully async, and add a non-fatal final gate that verifies spokes before install completes.

- Bump provider-aws-eks (+family/iam/ec2/rds/dynamodb/amp/grafana) v2.5.3->v2.6.1 (registry, kind-crossplane bootstrap, terraform helm.tf). v2.6.1 serves a cluster-scoped eks.aws.upbound.io/v1beta1 Capability, so it composes into the existing legacy composition with no v2/namespaced migration.
- platform-cluster XRD/composition: native Capability MRs (kro/ack/argocd) + capability IAM roles, matching the kro-ack RGD (capabilities.eks.amazonaws.com trust; KRO=AmazonEKSClusterPolicy; ACK=inline AssumeWorkloadRoles+ManageIRSARoles), CEL-gated on spec.capabilities.<type>.enabled; deletePropagationPolicy RETAIN. Adds spec.accountId + spec.capabilities.* to the XRD and claim chart.
- kind-crossplane: spokes:enable-crossplane sets capabilities+accountId; drop the imperative spokes:create-capabilities and its ~19min foreground EKS-ACTIVE wait (spokes now async). Hub claim sets kro/ack/argocd(+IDC); hub:seed waits on the Capability MRs instead of the create-capability.yaml Job. Delete argocd:capability / argocd:delete-capability tasks + create/delete-capability.yaml (per ITERATION_PLAN.md).
- kind-kro-ack: remove the dead, unreferenced spokes:create-capabilities copy.
- workshop:install: add tolerant, non-fatal wait-for-spokes gate before setup-env (EKS ACTIVE -> ArgoCD cluster secrets -> spoke apps synced within tolerance, with backoff nudge), overlapping the ray/IDC/model tail.

…d argocd:delete-capability call in destroy

- README/ITERATION_PLAN/steering: reflect native Capability MRs instead of the create-capability.yaml Job (mark ITERATION_PLAN item 10 done).
- destroy Phase 3: remove the now-broken '- task: argocd:delete-capability' reference (task deleted in prior commit). The AWS-API fallback already force-deletes capabilities before the cluster delete; capability IAM role cleanup retained.

…estart/wait)

hub:bootstrap-crossplane-identity aborted the entire install (exit 201): the wait loop's 'grep -c True' returns exit 1 on zero Ready PIAs, which under go-task's set:[errexit,pipefail] failed the task. It also ran before crossplane-base is deployed (No resources found on the pod restart), making the restart premature.

Strip the task to PIA creation only. Provider pods start after these PIAs exist (credentialed at startup); the phase1 hub:restart-identity-pods task already waits for all PIAs Ready and restarts providers non-fatally.
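
The 'grep -c' behaviour behind this failure is easy to reproduce: with zero matches it still prints 0 but exits 1. The sketch below shows one way to tolerate that under errexit. It is not the fix this commit chose (the commit removed the wait loop), and `count_ready` is a hypothetical helper reading status lines from stdin.

```shell
# Hypothetical helper: count lines containing 'True' on stdin.
# grep -c prints the count even when it is 0, but exits 1 in that case,
# so '|| true' keeps errexit/pipefail from aborting the caller.
count_ready() {
  grep -c 'True' || true
}

printf 'False\nFalse\n' | count_ready        # prints: 0
printf 'True\nFalse\nTrue\n' | count_ready   # prints: 2
```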

…ned spokes

Provisioning a spoke via the KRO path creates the EksclusterWithVpc claim on the HUB, so the HUB's kro capability (peeks-hub-kro-capability-role/KRO) renders the ACK vpc/subnet/eks CRs. The eks-capabilities-rbac ClusterRole/Binding that grants those ACK API groups only targeted enable_kro_manifests (spokes); the hub uses enable_kro_manifests_hub, so it never got the RBAC. The hub's kro cap role is not cluster-admin, so KRO was RBAC-denied:

    vpcs.ec2.services.k8s.aws is forbidden: User .../peeks-hub-kro-capability-role/KRO cannot get resource vpcs in ec2.services.k8s.aws in namespace peeks-spoke-prod

Add eks-capabilities-rbac-hub, gated on enable_kro_manifests_hub, mirroring the kro-manifests/kro-manifests-hub split. No cluster has both labels, so no collision.

…bject delete recovery

Document the 'vpcs.ec2.services.k8s.aws forbidden' error for kro-provisioned spokes (hub kro cap not cluster-admin, eks-capabilities-rbac only on spokes), the eks-capabilities-rbac-hub fix, and the key manual step: KRO won't reconcile over the ACK object left half-created during the denied window; delete it so KRO recreates it cleanly.

KRO-provisioned spokes are created by the hub's ACK EKS capability, not the Crossplane providers, so spoke EKS creation does not depend on phase1 restart / wait-for-providers. Move set-overlay-repo + install:phase2-parallel ahead of the provider bring-up so the ~25min spoke build starts earlier and overlaps with the provider restart, observability seeding, ray image build and idc. kro-ack only; kind-crossplane spoke-dev genuinely uses the Crossplane path and is unchanged.

…usters-kro

The clusters-kro ApplicationSet hardcoded argoCdHubRoleArn/argoCdCapabilityRoleArn to <cluster>-argocd-capability-role (kebab). That's correct on kro-ack (its RGD creates the role kebab-case) but wrong on the crossplane hub, whose ArgoCD capability role is <cluster>-ArgoCDCapabilityRole (PascalCase). A KRO-provisioned spoke on the crossplane hub then built an argocd-role trust policy referencing a non-existent principal, and ACK IAM went terminal:

    MalformedPolicyDocument: Invalid principal in policy: AWS: arn:...:role/peeks-hub-argocd-capability-role

Key the suffix off the hub secret's provider label (kro-ack => kebab; crossplane/no-label => PascalCase) so the trust policy references an existing role.
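
The naming rule can be restated as a small function. The real logic lives in the ApplicationSet template; this shell version only illustrates the mapping, and `capability_role_name` is a hypothetical name.

```shell
# Illustration of the suffix rule (not the ApplicationSet template itself):
# provider label kro-ack => kebab-case role, anything else => PascalCase role.
# Usage: capability_role_name <cluster> [provider-label]
capability_role_name() {
  local cluster="$1" provider="${2:-}"
  if [ "$provider" = "kro-ack" ]; then
    printf '%s-argocd-capability-role\n' "$cluster"
  else
    printf '%s-ArgoCDCapabilityRole\n' "$cluster"
  fi
}

capability_role_name peeks-hub kro-ack   # prints: peeks-hub-argocd-capability-role
capability_role_name peeks-hub           # prints: peeks-hub-ArgoCDCapabilityRole
```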

…Capability enabled

The spoke argocd-role trust policy unconditionally listed <cluster>-argocd-capability-role as a principal, but that role is only created when argocdCapability.enabled==true (hub only). On any spoke (enabled=false) the principal doesn't exist, so IAM rejected the whole trust:

    MalformedPolicyDocument: Invalid principal in policy: AWS: arn:...:role/peeks-spoke-prod-argocd-capability-role

Make the second principal conditional: capability role ARN when enabled (hub), otherwise duplicate argoCdHubRoleArn (IAM dedupes) so there is no dangling principal. Fixes KRO-provisioned spokes on both hubs (surfaced on the crossplane hub where spoke-prod is the KRO path).

…labels

The ApplicationSet clusters generator exposes .metadata.labels as map[string]string, but sprig hasKey expects map[string]interface{} -> 'wrong type for value' template error, leaving clusters-kro Degraded and blocking spoke generation.

Use a plain index lookup instead: index on map[string]string returns "" for a missing key without tripping missingkey=error (which only affects .field access), so crossplane hubs (no provider label) correctly fall through to the PascalCase suffix.

…visions

secrets-manager:seed wrote the config secret's `vpc` property as a bare VPC-id string (--arg vpc '{{.VPC_ID}}'). But the aws-resources ExternalSecret reads that property into peeks-hub-vpc-secret.vpc_data, and the init-env-config Job parses .id/.subnet_ids/.cluster_security_group_id from it to build the vpc-config EnvironmentConfig the devlake RDS composition consumes. A bare string makes every jq lookup empty -> empty vpc-config -> RDS security group gets vpcId=null and the subnet group gets empty subnetIds -> RDS Instance never created -> devlake stuck waiting for its MySQL endpoint secret.

Resolve PRIVATE_SUBNET_IDS + CLUSTER_SG and write `vpc` as the JSON object {id,subnet_ids,cluster_security_group_id} (stored as a JSON string via --arg, matching kind-crossplane). Sole consumer of the property is that ExternalSecret; bare-id consumers use the separate aws_vpc_id metadata field, so no other reader is affected.

… access entry

clusters-crossplane set argoCDRoleArn to <cluster>-argocd-capability-role (kebab), but the crossplane hub's ArgoCD capability role is <cluster>-ArgoCDCapabilityRole (PascalCase). The composition therefore created the spoke's argocd AccessEntry for a non-existent principal, so the hub ArgoCD (connecting as the real PascalCase role) failed with 'failed to verify the access entry' and could not load state / sync any app on the crossplane-provisioned spoke (spoke-dev). Correct to PascalCase; this appset only runs on the crossplane hub.

The eks-capabilities-kro ClusterRole granted ec2/eks/iam/ecr/secretsmanager/dynamodb ACK groups but NOT s3.services.k8s.aws. Creating an S3 bucket via the Backstage ACK/KRO template (workshop module 10.6) was RBAC-denied for KRO/ACK on s3.services.k8s.aws. Add the S3 group (covers both spoke eks-capabilities-rbac and hub eks-capabilities-rbac-hub, same chart).

…lass

Two workshop-breaking blockers found by kro-ack testing:

#3 Backstage 'Invalid GitLab integration config, $GITLAB_CF_DOMAIN is not a valid host': the kro-ack secrets seed wrote the config metadata as a single-quoted jq --arg, so the bash var $GITLAB_CF_DOMAIN was NOT expanded and the cluster-secret gitlab_domain_name annotation held the literal '$GITLAB_CF_DOMAIN'. That flows to the Backstage gitlab_domain_name Helm value -> dynamic-catalog gitlab_hostname -> integrations.gitlab host, breaking EVERY Backstage template. Break out of the single quotes so the real GitLab CloudFront host is substituted (crossplane already did this via jq --arg).

#4 kro AppmodService/RayService hardcoded ingressClassName: alb, but EKS Auto Mode provides the 'platform' IngressClass (controller eks.amazonaws.com/alb) and there is no 'alb' class -> every app Ingress failed with 'ingressClass alb not found', the ALB address never populated, curl to /hello-world etc. failed. Align appmod-service.yaml (x2) and ray-service.yaml to 'platform' (cicd-pipeline.yaml already used it).
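
The quoting problem behind blocker #3 can be reproduced in plain shell. The JSON shape and the example host below are illustrative assumptions, not the actual seed payload.

```shell
GITLAB_CF_DOMAIN="gitlab.example.com"   # example value, not a real host

# Single quotes: the shell does not expand the variable, so the literal
# text $GITLAB_CF_DOMAIN ends up in the payload.
literal='{"gitlab_domain_name":"$GITLAB_CF_DOMAIN"}'

# Breaking out of the single quotes lets the shell substitute the value.
expanded='{"gitlab_domain_name":"'"$GITLAB_CF_DOMAIN"'"}'

printf '%s\n' "$literal" "$expanded"
# prints:
# {"gitlab_domain_name":"$GITLAB_CF_DOMAIN"}
# {"gitlab_domain_name":"gitlab.example.com"}
```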

allamand mentioned this pull request Jun 23, 2026

feat: move GitLab from EKS to EC2 (docker-compose on IDE instance) #754

Open

allamand added agentic-platform branch-to-branch refactor labels Jun 24, 2026

allamand force-pushed the feature/cloudfront-on-agent-platform branch from 695eafe to 2ed034d Compare June 26, 2026 09:04

allamand force-pushed the feature/cloudfront-on-agent-platform-without-gitlab branch from f5ad84b to ea0b1b1 Compare June 26, 2026 09:24

allamand added 23 commits June 26, 2026 11:28

feat: complete kind-kro-ack consolidation

ac60d6c

- Add parallel install phases (phase1/phase2) to kind-kro-ack install task
- Move gitlab:init-ec2 + gitlab:clone-repos to workshop/Taskfile.yaml
- crossplane-system refs in copied tasks left intentional (Crossplane addon on hub EKS)

feat: create-config.sh reads CLUSTER_PROVIDER env var (default: kind-crossplane)

ec3d5f8

refactor: remove argocd:capability job from kind-kro-ack (Capabilities managed declaratively by RGD)

19425f2

fix: restore Kargo ECR pod-identity setup in deploy-kargo.sh (accidentally removed)

17263cb

fix: restore create-ray-models-bucket from deleted Taskfile.workshop.yaml

3cce2f8

fix: restore accidentally removed code from cherry-pick

4a223b4

- Restore argocd login block in ssm-setup-ide-logs.sh
- Restore progressive-app image_name/service name (rollout-demo → progressive-app)

fix: add kind-kro-ack to root Taskfile includes

a3178d9

fix: increase RGD wait timeout 180s→300s and wait individually for better diagnostics

9b0eaad

fix: add kind:kubeconfig task to ensure kubeconfig is present on re-runs

7f9f22a

fix: gitlab:init-ec2 rails runner fully idempotent on re-runs (user exists, project exists)

2504593

fix: gitlab:init-ec2 use save(validate:false); update! fails on complex hash/validation

07711e8

fix: ensure user namespace exists before creating GitLab projects

fbc1edb

fix: gitlab:init-ec2 use Users::CreateService + Devise::Encryptor for proper user+namespace creation

a32e677

fix: gitlab:init-ec2 definitive - namespace via organization_id, unprotect after creation

2baa6d2

fix: restore argoCdCapabilityRoleArn as optional field; KRO rejects CRD with removed property

82ce00a

fix(kind-kro-ack): derive subnet CIDRs from HUB_VPC_CIDR prefix

4b74a8e

Hardcoded 10.0.x.0/24 subnets break when vpcCidr is not 10.0.0.0/16. Add HUB_VPC_PREFIX var (first two octets of HUB_VPC_CIDR) and use it in hub:claim subnet CIDRs.
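
A minimal sketch of the prefix derivation, assuming an IPv4 CIDR input. `vpc_prefix` is a hypothetical helper and the example CIDR and third octet are illustrative; the Taskfile may compute HUB_VPC_PREFIX differently.

```shell
# Hypothetical helper: print the first two octets of an IPv4 CIDR.
vpc_prefix() {
  printf '%s\n' "$1" | cut -d. -f1-2
}

HUB_VPC_CIDR="10.42.0.0/16"                      # example value
HUB_VPC_PREFIX="$(vpc_prefix "$HUB_VPC_CIDR")"   # -> 10.42

# A /24 subnet CIDR built from the prefix instead of a hardcoded 10.0.
echo "${HUB_VPC_PREFIX}.1.0/24"                  # prints: 10.42.1.0/24
```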

fix: remove force push in seed loop - repos are fresh, no need for -f

fb883c9

allamand added 13 commits July 1, 2026 21:43

allamand mentioned this pull request Jul 2, 2026

feat: kind-kro-ack cluster provider (bootstrap hub via ACK+KRO) #642

Draft

allamand added 16 commits July 2, 2026 21:01

allamand wants to merge 201 commits into feature/cloudfront-on-agent-platform from feature/cloudfront-on-agent-platform-without-gitlab