For: Claude Code
Project: A cloud-native CLI tool + backend registry for discovering, installing, and publishing Claude Code skills, with federated marketplace support and admin-controlled external skill whitelisting
Last updated: 2026-03-11
Changes from v1: Two-repo split, K8s-native storage (PostgreSQL + Valkey), OCI artifact skill storage, generic OIDC device flow, Helm chart on ghcr.io, Nebari NebariApp CRD opt-in
- Two-Repo Split
- Architecture Overview
- Repo 1: skillsctl (OSS tool)
  - 3.1 Repository Structure
  - 3.2 Technology Decisions
  - 3.3 Phase 1: Protobuf & ConnectRPC
  - 3.4 Phase 2: Backend Service
  - 3.5 Phase 3: Helm Chart
  - 3.6 Phase 4: CLI
  - 3.7 Phase 5: CI/CD Pipelines
- Repo 2: skillsctl-deploy (org GitOps)
  - 4.1 Repository Structure
  - 4.2 ArgoCD Application Manifests
  - 4.3 Org Values Overlay
  - 4.4 NebariApp CRD
- Federation & Marketplace Management
- Auth Design
- Dogfood Skill
- Outstanding Decisions & Constraints
| Repo | Audience | What it contains |
|---|---|---|
| yourorg/skillsctl | Public / OSS | CLI binary, backend server, Helm chart, proto definitions, CI/CD, docs |
| yourorg/skillsctl-deploy | Internal / private | ArgoCD Application manifests, org-specific Helm values, Keycloak client config, NebariApp CRD overlay |
skillsctl is a generic tool — any team on any K8s cluster with any OIDC provider can deploy it. It ships with no org-specific configuration. The skillsctl-deploy repo is where OpenTeams-specific configuration lives: Keycloak issuer URL, realm, admin group name, ingress hostname, storage class names, and the NebariApp CRD that wires it into the Nebari platform.
This means skillsctl can be open-sourced without leaking internal infrastructure details, and the org repo can reference pinned chart versions from ghcr.io rather than building anything itself.
skillsctl repo (CI)
→ builds + pushes ghcr.io/yourorg/skillsctl-backend:vX.Y.Z
→ pushes oci://ghcr.io/yourorg/charts/skillsctl:vX.Y.Z
skillsctl-deploy repo (ArgoCD)
→ Application manifest references chart vX.Y.Z
→ values.yaml points at Keycloak, sets storage class, enables NebariApp
→ ArgoCD syncs to cluster
┌─────────────────────────────────────────────────────────────────────┐
│ Kubernetes Cluster │
│ (Nebari or plain K8s) │
│ │
│ skillsctl namespace │
│ ├── Deployment: skillsctl-server │
│ │ ├── ConnectRPC API │
│ │ ├── In-memory skill cache (warm from PostgreSQL) │
│ │ ├── Valkey subscriber (cache invalidation pub/sub) │
│ │ └── Federation poller goroutine │
│ │ │
│ ├── StatefulSet: PostgreSQL (via CloudNativePG operator) │
│ │ ├── skills, skill_versions tables │
│ │ ├── marketplaces, federated_skills tables │
│ │ └── audit_log table │
│ │ │
│ ├── Deployment: Valkey │
│ │ └── Cache invalidation pub/sub channel │
│ │ (server publishes on write, all instances invalidate) │
│ │ │
│ └── OCI Registry (external or in-cluster) │
│ └── ghcr.io/yourorg/skills/{name}:{version} │
│ (skill archives stored as OCI artifacts via oras) │
│ │
│ [If Nebari] nebari-operator watches NebariApp CRD: │
│ ├── Envoy Gateway HTTPRoute → skillsctl-server Service │
│ ├── cert-manager TLS certificate │
│ └── Keycloak OIDC SecurityPolicy on the route │
└─────────────────────────────────────────────────────────────────────┘
│ HTTPS / ConnectRPC
┌─────────┴──────────────────────────────────────────────────────────┐
│ skillsctl CLI (Go binary) │
│ linux/amd64, linux/arm64 │
│ darwin/amd64, darwin/arm64 │
│ windows/amd64 │
│ │
│ Auth: Generic OIDC device flow │
│ Issuer URL, client ID: read from ~/.config/skillsctl/config.yaml │
│ Configured by org during onboarding (or pre-set in binary │
│ build via ldflags for org-specific distribution) │
└────────────────────────────────────────────────────────────────────┘
│ Federated fetch (server-side, scheduled)
┌─────────┴──────────────────────────────────────────────────────────┐
│ Whitelisted External Marketplaces │
│ github.com/anthropics/skills │
│ github.com/obra/superpowers │
│ ... (admin-controlled) │
└────────────────────────────────────────────────────────────────────┘
PostgreSQL (CloudNativePG) is the durable source of truth. All skill metadata, versions, marketplace configs, and audit logs live here.
Valkey handles cache invalidation only — not primary storage. When a server instance writes to PostgreSQL (skill publish, marketplace update), it publishes an invalidation message to a Valkey channel. All server instances (there may be multiple replicas) subscribe and drop their in-memory cache entry, triggering a reload from PostgreSQL. This replaces the Firestore real-time listener from v1 with a standard K8s-native pub/sub pattern.
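The invalidate-then-reload behavior described above can be sketched in a few lines of Go. This is a minimal illustration, not the plan's actual registry cache: the `Cache` type, its `load` callback (standing in for a PostgreSQL read), and the string payload are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a hypothetical in-memory skill cache keyed by skill name.
// load stands in for a PostgreSQL read and is an assumption of this sketch.
type Cache struct {
	mu      sync.RWMutex
	entries map[string]string
	load    func(name string) (string, error)
}

func NewCache(load func(string) (string, error)) *Cache {
	return &Cache{entries: make(map[string]string), load: load}
}

// Get returns the cached entry, reloading from the backing store on a miss.
func (c *Cache) Get(name string) (string, error) {
	c.mu.RLock()
	v, ok := c.entries[name]
	c.mu.RUnlock()
	if ok {
		return v, nil
	}
	v, err := c.load(name)
	if err != nil {
		return "", err
	}
	c.mu.Lock()
	c.entries[name] = v
	c.mu.Unlock()
	return v, nil
}

// Invalidate drops one entry; the next Get reloads it from the store.
func (c *Cache) Invalidate(name string) {
	c.mu.Lock()
	delete(c.entries, name)
	c.mu.Unlock()
}

func main() {
	loads := 0
	c := NewCache(func(name string) (string, error) {
		loads++
		return fmt.Sprintf("%s@load%d", name, loads), nil
	})
	first, _ := c.Get("pdf")
	again, _ := c.Get("pdf") // served from memory, no second load
	c.Invalidate("pdf")      // what a subscriber would do on an invalidation message
	fresh, _ := c.Get("pdf") // reloaded from the backing store
	fmt.Println(first, again, fresh, loads)
}
```

Dropping the entry rather than pushing the new value over pub/sub keeps the message small and leaves PostgreSQL as the only source of truth.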
OCI registry stores skill archives. When a developer runs skillsctl push, the server packages the skill as an OCI artifact and pushes it to the configured registry using the oras library. On skillsctl install, the server generates a short-lived pull token and returns the OCI reference + token to the CLI. The CLI pulls directly using the oras client library. This reuses your existing container registry infrastructure — no separate object store needed.
skillsctl/
├── .github/
│ ├── workflows/
│ │ ├── ci-cli.yml # lint → test → build → e2e → release
│ │ ├── ci-backend.yml # lint → test → build → e2e → push image
│ │ ├── ci-chart.yml # helm lint → ct test → push to ghcr.io OCI
│ │ ├── ci-proto.yml # buf lint + breaking check
│ │ └── ci-docs.yml # generate + publish to GitHub Pages
│ └── PULL_REQUEST_TEMPLATE.md
│
├── proto/
│ ├── buf.yaml
│ ├── buf.gen.yaml
│ └── skillsctl/v1/
│ ├── skill.proto
│ ├── registry.proto
│ ├── federation.proto
│ └── auth.proto
│
├── gen/
│ ├── go/skillsctl/v1/
│ └── python/skillsctl/v1/
│
├── backend/
│ ├── cmd/server/main.go
│ ├── internal/
│ │ ├── auth/
│ │ │ ├── middleware.go # OIDC token validation + claims extraction
│ │ │ ├── oidc.go # Generic OIDC provider: JWKS fetch, token verify
│ │ │ └── groups.go # Admin group check via OIDC groups claim
│ │ ├── registry/
│ │ │ ├── service.go
│ │ │ ├── cache.go # In-memory cache + Valkey invalidation subscriber
│ │ │ └── search.go
│ │ ├── federation/
│ │ │ ├── service.go
│ │ │ ├── poller.go
│ │ │ ├── github.go
│ │ │ └── agentskills.go
│ │ ├── store/
│ │ │ ├── postgres.go # PostgreSQL reads/writes via pgx
│ │ │ ├── migrations/ # SQL migration files (goose)
│ │ │ │ ├── 001_initial.sql
│ │ │ │ ├── 002_federation.sql
│ │ │ │ └── 003_audit_log.sql
│ │ │ └── oci.go # OCI artifact push/pull via oras-go
│ │ ├── cache/
│ │ │ └── valkey.go # Valkey pub/sub for cache invalidation
│ │ └── ratelimit/
│ │ └── middleware.go
│ ├── Dockerfile
│ └── .goreleaser-backend.yml
│
├── chart/
│ ├── Chart.yaml
│ ├── values.yaml # All defaults, no org-specific values
│ ├── values.schema.json # JSON Schema validation for values
│ ├── templates/
│ │ ├── deployment.yaml
│ │ ├── service.yaml
│ │ ├── serviceaccount.yaml
│ │ ├── configmap.yaml
│ │ ├── secret.yaml # DB credentials (or external-secrets ref)
│ │ ├── hpa.yaml
│ │ ├── pdb.yaml
│ │ ├── _helpers.tpl
│ │ ├── nebariapp.yaml # Conditionally rendered: nebari.enabled=true
│ │ └── NOTES.txt
│ ├── crds/ # NebariApp CRD (always installed by Helm; unused unless nebari.enabled)
│ │ └── nebariapp-crd.yaml # Copied from nebari-operator release, versioned
│ └── ci/
│ └── test-values.yaml # Values used by chart-testing in CI
│
├── cli/
│ ├── main.go
│ ├── cmd/
│ │ ├── root.go
│ │ ├── explore.go
│ │ ├── install.go
│ │ ├── push.go
│ │ ├── auth.go
│ │ ├── update.go
│ │ ├── marketplace.go
│ │ └── docs.go
│ ├── internal/
│ │ ├── api/client.go
│ │ ├── auth/
│ │ │ ├── device_flow.go # Generic OIDC device flow
│ │ │ └── token_store.go
│ │ ├── oci/
│ │ │ └── pull.go # oras-go client for pulling skill archives
│ │ ├── skill/
│ │ │ ├── package.go
│ │ │ └── validate.go
│ │ ├── selfupdate/update.go
│ │ └── config/config.go
│ └── .goreleaser.yml
│
├── e2e/
│ ├── cli/
│ │ ├── explore_test.go
│ │ ├── install_test.go
│ │ ├── push_test.go
│ │ └── marketplace_test.go
│ └── backend/
│ └── zap-baseline.yaml
│
├── docs/
│ └── .gitkeep
│
├── skills/
│ └── goreleaser/SKILL.md
│
├── Makefile
├── .golangci.yml
├── .gitleaks.toml
├── VERSION
└── README.md
| Concern | Choice | Rationale |
|---|---|---|
| RPC | ConnectRPC | Same as v1 — JSON/HTTP compatible, ZAP-friendly |
| Primary store | PostgreSQL via CloudNativePG operator | Durable, queryable, standard K8s pattern. pgx driver in Go. Goose for migrations. |
| Cache invalidation | Valkey pub/sub | Lightweight, K8s-native, handles multi-replica cache coherence. Not primary storage. |
| Skill archive storage | OCI registry via oras-go | Reuses existing container registry. Content-addressed. No separate object store. |
| CLI auth | Generic OIDC device flow | Configurable issuer URL + client ID. Works with Keycloak, Okta, Dex, etc. |
| Backend auth | OIDC token validation | Validates JWT signature against issuer's JWKS. Checks groups claim for admin role. No provider-specific SDK. |
| Admin role | OIDC groups claim | Admins are members of a group (e.g. skillsctl-admins) that Keycloak includes in the token's groups claim. No separate API call needed at runtime. |
| Helm chart distribution | OCI registry (ghcr.io) | helm install oci://ghcr.io/yourorg/charts/skillsctl --version X.Y.Z |
| Nebari integration | NebariApp CRD, opt-in via nebari.enabled=true | When enabled, renders a NebariApp resource that the nebari-operator picks up for routing + TLS + Keycloak OIDC |
| DB migrations | goose (embedded) | Server runs migrations on startup via goose's embed FS support. No separate init container needed. |
| Container | Scratch final image | Same as v1 — CGO_ENABLED=0, static binary |
| Secrets scanning | Gitleaks | Pre-commit + CI |
| DAST | OWASP ZAP | Post-deploy against running server |
| Release | GoReleaser | CLI binaries + Scoop manifest. Backend container via goreleaser-backend.yml. Helm chart via ci-chart.yml. |
Same buf.yaml and buf.gen.yaml as v1 (Go + Python generation).
The proto service definitions are identical to v1 with one change: all GCP-specific fields are removed (no GCS paths, no Firestore references). The GetDownloadURL RPC now returns an OCI reference plus a short-lived pull token instead of a GCS signed URL:
message GetDownloadURLResponse {
string oci_ref = 1; // e.g. ghcr.io/yourorg/skills/pdf:0.9.1
string pull_token = 2; // short-lived registry token (TTL: 15 min)
string digest = 3; // sha256 digest for verification
}
Breaking change policy: same as v1 (warn pre-1.0, block post-1.0).
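The digest field in GetDownloadURLResponse exists so the client can verify what it pulled. A minimal sketch of that check follows, assuming the digest uses the usual OCI `sha256:<hex>` form; the function names `Digest` and `VerifyDigest` are hypothetical and not part of the plan.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// Digest returns the OCI-style "sha256:<hex>" digest of an archive.
func Digest(archive []byte) string {
	sum := sha256.Sum256(archive)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// VerifyDigest checks archive bytes against an expected "sha256:<hex>" digest.
func VerifyDigest(archive []byte, expected string) error {
	if !strings.HasPrefix(expected, "sha256:") {
		return fmt.Errorf("unsupported digest %q", expected)
	}
	if got := Digest(archive); got != expected {
		return fmt.Errorf("digest mismatch: got %s, want %s", got, expected)
	}
	return nil
}

func main() {
	archive := []byte("skill archive bytes")
	expected := Digest(archive) // in practice this comes from the server response

	fmt.Println(VerifyDigest(archive, expected))                  // matching bytes
	fmt.Println(VerifyDigest([]byte("tampered"), expected) != nil) // altered bytes are rejected
}
```

When pulling by tag, a registry client may not compare the result against a digest it was never given, so an explicit comparison against the server-supplied value is a cheap extra guard.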
// internal/auth/oidc.go
type Config struct {
IssuerURL string // e.g. https://keycloak.example.com/realms/myrealm
ClientID string
AdminGroup string // e.g. "skillsctl-admins"
// AllowedDomain is optional — if set, checks the "email" claim's domain.
// Leave empty to allow any authenticated user from the OIDC provider.
AllowedDomain string
}
type Validator struct {
cfg Config
keySet *jwk.AutoRefresh // auto-refreshes JWKS from issuer
}
func (v *Validator) Validate(ctx context.Context, rawToken string) (*Claims, error) {
// 1. Discover jwks_uri from {IssuerURL}/.well-known/openid-configuration, then fetch + cache the JWKS
// 2. Verify signature, expiry, audience (must include ClientID)
// 3. If AllowedDomain set: check email domain
// 4. Extract groups claim → []string
// 5. Return Claims{Email, Groups, Subject}
}
func (v *Validator) IsAdmin(claims *Claims) bool {
for _, g := range claims.Groups {
if g == v.cfg.AdminGroup {
return true
}
}
return false
}
No external API call is needed for group membership: it comes from the token itself. Keycloak is configured (in the org repo) to include group membership in the groups claim, so admin checks add no extra network round trip.
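Step 3 of Validate (the optional AllowedDomain check) is easy to get subtly wrong with a plain suffix match. The sketch below shows one way to do it; `emailDomainAllowed` is a hypothetical helper, and the case-insensitive comparison is an assumption of this example.

```go
package main

import (
	"fmt"
	"strings"
)

// emailDomainAllowed reports whether email belongs to allowedDomain.
// An empty allowedDomain allows any address, matching the Config comment.
func emailDomainAllowed(email, allowedDomain string) bool {
	if allowedDomain == "" {
		return true
	}
	at := strings.LastIndex(email, "@")
	if at <= 0 || at == len(email)-1 {
		return false // missing local part or missing domain
	}
	// Compare the whole domain, not a suffix, so "evil-example.com"
	// does not pass a check for "example.com".
	return strings.EqualFold(email[at+1:], allowedDomain)
}

func main() {
	fmt.Println(emailDomainAllowed("dev@example.com", "example.com"))
	fmt.Println(emailDomainAllowed("dev@evil-example.com", "example.com"))
	fmt.Println(emailDomainAllowed("anyone@elsewhere.org", ""))
}
```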
-- 001_initial.sql
CREATE TABLE skills (
name TEXT PRIMARY KEY,
description TEXT NOT NULL,
owner TEXT NOT NULL,
tags TEXT[] NOT NULL DEFAULT '{}',
latest_version TEXT NOT NULL,
install_count BIGINT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE skill_versions (
skill_name TEXT NOT NULL REFERENCES skills(name),
version TEXT NOT NULL,
oci_ref TEXT NOT NULL, -- ghcr.io/yourorg/skills/pdf:0.9.1
digest TEXT NOT NULL, -- sha256 content digest
published_by TEXT NOT NULL,
changelog TEXT,
size_bytes BIGINT,
published_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (skill_name, version)
);
CREATE INDEX idx_skills_tags ON skills USING GIN(tags);
CREATE INDEX idx_skills_updated ON skills(updated_at DESC);
-- 002_federation.sql
CREATE TABLE marketplaces (
id TEXT PRIMARY KEY,
display_name TEXT NOT NULL,
source_url TEXT NOT NULL,
type TEXT NOT NULL, -- GITHUB_REPO, AGENTSKILLS_IO
mode TEXT NOT NULL, -- ALL, SPECIFIC
allowed_skills TEXT[] NOT NULL DEFAULT '{}',
sync_interval_minutes INT NOT NULL DEFAULT 60,
last_synced_at TIMESTAMPTZ,
skill_count INT NOT NULL DEFAULT 0,
enabled BOOLEAN NOT NULL DEFAULT TRUE,
added_by TEXT NOT NULL,
added_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE federated_skills (
marketplace_id TEXT NOT NULL REFERENCES marketplaces(id) ON DELETE CASCADE,
name TEXT NOT NULL,
description TEXT,
tags TEXT[] NOT NULL DEFAULT '{}',
latest_version TEXT,
upstream_url TEXT NOT NULL,
allowed BOOLEAN NOT NULL DEFAULT TRUE,
synced_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (marketplace_id, name)
);
-- 003_audit_log.sql
CREATE TABLE audit_log (
id BIGSERIAL PRIMARY KEY,
action TEXT NOT NULL,
actor TEXT NOT NULL,
target_id TEXT NOT NULL,
details JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_audit_log_actor ON audit_log(actor);
CREATE INDEX idx_audit_log_target ON audit_log(target_id);
CREATE INDEX idx_audit_log_time ON audit_log(created_at DESC);
Goose is embedded in the binary and runs migrations at startup:
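// internal/store/migrations.go
// go:embed patterns resolve relative to the package directory and cannot
// reach into another directory from cmd/server, so the embed lives next to the SQL files.
//go:embed migrations/*.sql
var Migrations embed.FS
// cmd/server/main.go
func main() {
	db := mustOpenDB(cfg.DatabaseURL) // *sql.DB, e.g. via pgx's database/sql adapter
	goose.SetBaseFS(store.Migrations)
	if err := goose.SetDialect("postgres"); err != nil {
		log.Fatal(err)
	}
	if err := goose.Up(db, "migrations"); err != nil {
		log.Fatal(err)
	}
	// ... start server
}
Each server instance maintains an in-memory skill cache loaded from PostgreSQL on startup. When any instance writes (publish, update, federation sync), it publishes to a Valkey channel. All instances subscribe and drop the affected cache entry.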
// internal/cache/valkey.go
const invalidationChannel = "skillsctl:cache:invalidate"
type Invalidator struct {
	client valkey.Client // valkey.Client is an interface, so no pointer
cache *registry.Cache
}
// Called after any write to PostgreSQL
func (inv *Invalidator) Publish(ctx context.Context, key string) error {
return inv.client.Do(ctx,
inv.client.B().Publish().Channel(invalidationChannel).Message(key).Build(),
).Error()
}
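// Started as a goroutine on server startup. Receive blocks until ctx is
// cancelled or the connection fails, so the caller should log the returned
// error and resubscribe.
func (inv *Invalidator) Subscribe(ctx context.Context) error {
	return inv.client.Receive(ctx,
		inv.client.B().Subscribe().Channel(invalidationChannel).Build(),
		func(msg valkey.PubSubMessage) {
			inv.cache.Invalidate(msg.Message)
		},
	)
}
// internal/store/oci.go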
import "oras.land/oras-go/v2"
const skillMediaType = "application/vnd.skillsctl.skill.v1.tar+gzip"
func (s *OCIStore) PushSkill(ctx context.Context, name, version string, archive []byte) (digest string, err error) {
ref := fmt.Sprintf("%s/skills/%s:%s", s.registry, name, version)
// Build OCI manifest with skillMediaType layer
// Push using oras.Copy
// Return content digest
}
func (s *OCIStore) GeneratePullToken(ctx context.Context, name, version string) (token string, err error) {
// For ghcr.io: exchange service account credentials for a short-lived
// registry token scoped to pull on the specific repository.
// TTL: 15 minutes.
// Return the token — CLI uses it as Bearer auth when pulling.
}
The Dockerfile is identical to v1 (multi-stage, scratch final image, CGO_ENABLED=0, Trivy scan stage).
Same pattern as v1 — deploy with --set image.tag=sha-<sha> as a second Helm release (or ArgoCD preview), run smoke suite, then update the primary release. In a Nebari/ArgoCD setup this is handled via the org repo: PR updates the image tag, ArgoCD syncs to a canary Application, e2e pass triggers a merge that updates the production Application.
apiVersion: v2
name: skillsctl
description: A Kubernetes-native registry for Claude Code skills
type: application
version: 0.1.0 # chart version, bumped independently of app version
appVersion: 0.1.0 # app version, set by CI from git tag
dependencies:
# CloudNativePG is a cluster-level operator — we don't bundle it.
# Document as a prerequisite. The chart creates a Cluster CR, not the operator.
# No subchart dependencies; this keeps the chart portable.
image:
  repository: ghcr.io/yourorg/skillsctl-backend
  tag: "" # defaults to .Chart.AppVersion
  pullPolicy: IfNotPresent
replicaCount: 2
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    memory: 256Mi
service:
  type: ClusterIP
  port: 8080
# PostgreSQL: uses CloudNativePG Cluster CR
postgres:
  # If enabled, chart creates a CloudNativePG Cluster in the same namespace.
  # Set enabled=false and provide externalDatabaseURL if using an external DB.
  enabled: true
  instances: 1 # bump to 3 for HA
  storage:
    size: 5Gi
    storageClass: "" # leave empty for default
  externalDatabaseURL: "" # used when enabled=false
# Valkey
valkey:
  enabled: true
  storage:
    size: 1Gi
    storageClass: ""
  externalURL: "" # used when enabled=false
# OCI registry for skill archives
oci:
  registry: ghcr.io
  repository: "" # e.g. yourorg/skills; REQUIRED, no default
  # Credentials for the server to push/pull skill archives.
  # Provide as a pre-created K8s secret name, or set credentials directly.
  credentialsSecret: ""
  username: ""
  password: "" # use credentialsSecret in production
# Auth: generic OIDC, no provider-specific defaults
auth:
  oidc:
    issuerURL: "" # REQUIRED, e.g. https://keycloak.example.com/realms/myrealm
    clientID: "" # REQUIRED
    allowedDomain: "" # optional: restrict to email domain
    adminGroup: "skillsctl-admins" # OIDC groups claim value for admins
  pushTokens: {} # map of token→team for write auth, stored in a K8s Secret
# Rate limiting (env vars passed to server)
rateLimit:
  readPerIPPerMin: 100
  writePerTokenPerMin: 10
  adminPerEmailPerMin: 30
# Federation poller
federation:
  pollIntervalSeconds: 60
# Nebari integration: opt-in
nebari:
  enabled: false
  app:
    displayName: "skillsctl"
    subdomain: "skillsctl" # results in skillsctl.{nebari-domain}
    icon: "" # optional URL to icon
    description: "Claude Code skill registry"
    # Keycloak client ID for the Nebari OIDC SecurityPolicy.
    # This is the client the nebari-operator creates in Keycloak.
    keycloakClientID: "" # REQUIRED when nebari.enabled=true
ingress:
  # Used when nebari.enabled=false and you want a plain ingress
  enabled: false
  className: ""
  host: ""
  tls: []
{{- if .Values.nebari.enabled }}
apiVersion: nebari.dev/v1alpha1
kind: NebariApp
metadata:
  name: {{ include "skillsctl.fullname" . }}
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "skillsctl.labels" . | nindent 4 }}
spec:
  displayName: {{ .Values.nebari.app.displayName | quote }}
  subdomain: {{ .Values.nebari.app.subdomain | quote }}
  description: {{ .Values.nebari.app.description | quote }}
  {{- if .Values.nebari.app.icon }}
  icon: {{ .Values.nebari.app.icon | quote }}
  {{- end }}
  service:
    name: {{ include "skillsctl.fullname" . }}
    port: {{ .Values.service.port }}
  auth:
    keycloakClientID: {{ required "nebari.app.keycloakClientID is required when nebari.enabled=true" .Values.nebari.app.keycloakClientID | quote }}
{{- end }}
The NebariApp CRD is copied from a pinned nebari-operator release and committed to the chart's crds/ directory. Helm installs CRDs from crds/ automatically on first install and never upgrades or deletes them afterwards, so CRD updates must be applied manually. The CRD is harmless on non-Nebari clusters: it sits unused unless nebari.enabled=true actually creates a NebariApp resource.
Pin to a specific nebari-operator version and document the update procedure in the chart's CHANGELOG.
The ci-chart.yml workflow:
On PR:
└── helm lint chart/
└── helm template chart/ -f chart/ci/test-values.yaml (dry-run render check)
└── chart-testing (ct lint) — validates against values.schema.json
On merge to main / tag:
└── helm package chart/ --app-version $TAG --version $TAG
└── helm push skillsctl-$TAG.tgz oci://ghcr.io/yourorg/charts
Install command for users:
helm install skillsctl oci://ghcr.io/yourorg/charts/skillsctl \
--version 0.1.0 \
--namespace skillsctl --create-namespace \
  -f my-values.yaml
# ~/.config/skillsctl/config.yaml
api_url: https://skillsctl.example.com
skills_dir: ~/.claude/skills
auth:
  oidc_issuer: https://keycloak.example.com/realms/myrealm
  client_id: skillsctl-cli
  # token cached at ~/.config/skillsctl/credentials.json after login
The auth.oidc_issuer and auth.client_id values are set once during onboarding (skillsctl config set auth.oidc_issuer <url>). The org documents the correct values in their internal README.
Optionally, the org can build a pre-configured binary via GoReleaser ldflags:
# in .goreleaser.yml, for an org-specific build
ldflags:
- -X main.defaultOIDCIssuer=https://keycloak.example.com/realms/myrealm
- -X main.defaultClientID=skillsctl-cli
- -X main.defaultAPIURL=https://skillsctl.openteams.com
This means new devs can run skillsctl auth login immediately without any config step.
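On the Go side, `-X` can only set package-level string variables, so the binary needs matching declarations plus a rule for when a config-file value is also present. The sketch below assumes the config file wins over the baked-in default; that precedence, and the `firstNonEmpty` helper, are assumptions of this example rather than something the plan specifies.

```go
package main

import "fmt"

// Set at build time via -ldflags "-X main.defaultOIDCIssuer=...".
// They stay empty in the generic OSS build.
var (
	defaultOIDCIssuer string
	defaultClientID   string
	defaultAPIURL     string
)

// firstNonEmpty returns the first non-empty value, or "" if all are empty.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

func main() {
	// Hypothetical values read from ~/.config/skillsctl/config.yaml;
	// empty means the user never ran "skillsctl config set".
	cfgIssuer, cfgClientID, cfgAPIURL := "", "", ""

	// Assumed precedence: config file first, then the build-time default.
	fmt.Println("issuer:   ", firstNonEmpty(cfgIssuer, defaultOIDCIssuer, "(unset)"))
	fmt.Println("client id:", firstNonEmpty(cfgClientID, defaultClientID, "(unset)"))
	fmt.Println("api url:  ", firstNonEmpty(cfgAPIURL, defaultAPIURL, "(unset)"))
}
```

Built without ldflags this prints "(unset)" for all three, which is the point at which the CLI would ask the user to run skillsctl config set.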
// cli/internal/auth/device_flow.go
type OIDCConfig struct {
IssuerURL string
ClientID string
}
func (c *OIDCConfig) Login(ctx context.Context) error {
	// 1. Fetch {IssuerURL}/.well-known/openid-configuration
	// to discover device_authorization_endpoint and token_endpoint
	meta, err := fetchOIDCMetadata(ctx, c.IssuerURL)
	if err != nil {
		return err
	}
	// 2. POST to device_authorization_endpoint with client_id + scope
	dc, err := requestDeviceCode(ctx, meta.DeviceAuthorizationEndpoint, c.ClientID)
	if err != nil {
		return err
	}
	// 3. Print user_code + verification_uri
	fmt.Printf("\nOpen: %s\nEnter code: %s\n\n", dc.VerificationURI, dc.UserCode)
	// 4. Poll token_endpoint until user completes or timeout
	tok, err := pollForToken(ctx, meta.TokenEndpoint, c.ClientID, dc)
	if err != nil {
		return err
	}
	// 5. Save to ~/.config/skillsctl/credentials.json
	return saveToken(tok)
}
Works with Keycloak out of the box (Keycloak supports RFC 8628 device flow). It also works with any other OIDC provider that implements the device authorization grant.
// cli/internal/oci/pull.go
import (
	"context"

	"oras.land/oras-go/v2/registry/remote"
	"oras.land/oras-go/v2/registry/remote/auth"
)
func PullSkill(ctx context.Context, ociRef, pullToken, destDir string) error {
	// Configure oras remote with Bearer token auth
	repo, err := remote.NewRepository(ociRef)
	if err != nil {
		return err
	}
	repo.Client = &auth.Client{
		// Scope the credential to the registry host parsed from ociRef
		Credential: auth.StaticCredential(repo.Reference.Registry, auth.Credential{
			AccessToken: pullToken,
		}),
	}
	// Pull the skill layer (application/vnd.skillsctl.skill.v1.tar+gzip)
	// Extract to destDir
	return nil // pull + extract steps are elided in this plan
}
All commands from v1 are unchanged in behavior. The differences are:
- skillsctl auth login uses OIDC device flow against the configured issuer instead of Google OAuth
- skillsctl install pulls from OCI registry via oras instead of a GCS signed URL
- skillsctl config set auth.oidc_issuer <url> and skillsctl config set auth.client_id <id> are new subcommands
Stage 1: Quality → gitleaks, golangci-lint, buf-check
Stage 2: Unit tests
Stage 3: Build matrix (5 targets, parallel)
Stage 4: E2E tests (parallel per platform)
Stage 5: Release via GoReleaser (on tag)
Stage 6: Docs to GitHub Pages
The e2e stage deploys using Helm against a kind cluster (not Cloud Run), making the e2e stage fully self-contained in CI without needing a real cloud environment:
Stage 1: Quality
Stage 2: Unit tests
Stage 3: Container build → Trivy scan → smoke test → push sha-tagged image
Stage 4: E2E + DAST
├── Spin up kind cluster
├── Install CloudNativePG operator
├── helm install skillsctl ./chart -f chart/ci/test-values.yaml
│ (test-values.yaml uses in-cluster Valkey, disables Nebari)
├── Wait for rollout
├── Run e2e/backend/ suite + marketplace tests
├── OWASP ZAP baseline scan
└── Tear down kind cluster
Stage 5: Release (merge to main)
└── goreleaser → ghcr.io :latest + :vX.Y.Z
On PR (paths: chart/**):
├── helm lint chart/
├── helm template (dry-run render)
├── ct lint (validates values.schema.json)
└── kind-based install test (helm install + helm test)
On tag vX.Y.Z:
├── helm package --version $TAG --app-version $TAG
└── helm push skillsctl-$TAG.tgz oci://ghcr.io/yourorg/charts
No OpenTofu in skillsctl. All infra lives in the org GitOps repo. The skillsctl repo is purely application code.
| Secret | Description |
|---|---|
| GHCR_TOKEN | Push containers + Helm chart to ghcr.io |
| SCOOP_BUCKET_TOKEN | PAT for scoop-bucket repo |
| SKILLCTL_TEST_OIDC_TOKEN | Pre-minted OIDC token for a test user (e2e) |
| SKILLCTL_TEST_ADMIN_TOKEN | Pre-minted OIDC token for a test admin (e2e marketplace tests) |
| SKILLCTL_TEST_PUSH_TOKEN | Push token for e2e publish tests |
| SKILLCTL_TEST_OCI_PASSWORD | Registry credentials for e2e kind cluster |
This repo is internal-only. It contains everything org-specific.
skillsctl-deploy/
├── argocd/
│ ├── dev/
│ │ └── application.yaml # ArgoCD Application for dev namespace
│ └── prod/
│ └── application.yaml # ArgoCD Application for prod namespace
│
├── helm-values/
│ ├── base/
│ │ └── values.yaml # Shared values (auth config, OCI registry, etc.)
│ ├── dev/
│ │ └── values.yaml # Dev overrides (lower replicas, scale-to-zero)
│ └── prod/
│ └── values.yaml # Prod overrides (HA postgres, HPA settings)
│
├── keycloak/
│ └── skillsctl-client.yaml # Keycloak client config (realm export fragment)
│ # Defines the skillsctl-cli client with device flow
│ # and the skillsctl-admins group mapping
│
├── scripts/
│ └── seed-marketplaces.sh # One-time script: adds Anthropic + Superpowers
│
└── README.md
# argocd/prod/application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: skillsctl-prod
  namespace: argocd
spec:
  project: default
  # Multi-source Application: the $values reference resolves only under
  # sources, so the chart and the values repo are both listed there and
  # there is no separate single source block.
  sources:
    - repoURL: https://github.com/yourorg/skillsctl-deploy
      targetRevision: main
      ref: values
    - repoURL: ghcr.io/yourorg/charts
      chart: skillsctl
      targetRevision: 0.3.1 # pin to a specific chart version
      helm:
        valueFiles:
          - $values/helm-values/base/values.yaml
          - $values/helm-values/prod/values.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: skillsctl-prod
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
# helm-values/base/values.yaml (shared across dev + prod)
auth:
  oidc:
    issuerURL: https://keycloak.openteams.com/realms/openteams
    clientID: skillsctl-cli
    allowedDomain: openteams.com
    adminGroup: skillsctl-admins
oci:
  registry: ghcr.io
  repository: openteams/skills
  credentialsSecret: skillsctl-oci-credentials
nebari:
  enabled: true
  app:
    displayName: "skillsctl"
    subdomain: "skillsctl"
    description: "Internal Claude Code skill registry"
    keycloakClientID: skillsctl-server
# helm-values/prod/values.yaml
replicaCount: 3
postgres:
  instances: 3
  storage:
    size: 20Gi
    storageClass: ssd
valkey:
  storage:
    size: 2Gi
    storageClass: ssd
image:
  tag: "0.3.1" # bumped by automated PR when new skillsctl release is tagged
The NebariApp CRD is installed by Helm from the chart/crds/ directory in the skillsctl chart. No separate step needed. The nebari.enabled=true value in base values.yaml activates the NebariApp resource, which the nebari-operator picks up to configure:
- An HTTPRoute on the shared Envoy Gateway pointing at the skillsctl Service
- A cert-manager certificate for skillsctl.{nebari-domain}
- A Keycloak OIDC SecurityPolicy (Envoy ExtAuthz) on the route
This means the skillsctl server itself does not handle TLS termination — Envoy Gateway does. The server receives pre-authenticated requests with the OIDC token forwarded as a header. The server still validates the token independently (defense in depth — the server should not trust Envoy blindly).
# keycloak/skillsctl-client.yaml
# Import this into your Keycloak realm via the admin console or Terraform Keycloak provider
clients:
  # CLI client: device flow, public client (no secret)
  - clientId: skillsctl-cli
    publicClient: true
    standardFlowEnabled: false
    directAccessGrantsEnabled: false
    deviceAuthorizationGrantEnabled: true
    defaultScopes: [openid, email, profile, groups]
  # Server client: used by nebari-operator for OIDC SecurityPolicy
  - clientId: skillsctl-server
    publicClient: false
    # clientSecret: managed by nebari-operator
groups:
  - name: skillsctl-admins
    # Add admin users here
# Protocol mapper to include groups in token
protocolMappers:
  - name: groups
    protocol: openid-connect
    protocolMapper: oidc-group-membership-mapper
    config:
      claim.name: groups
      full.path: "false"
      access.token.claim: "true"
      id.token.claim: "true"
Federation and marketplace management is identical to v1 Phase 6, with these storage changes:
- All marketplace state (marketplaces table, federated_skills table, audit_log) is in PostgreSQL instead of Firestore
- The distributed lock for the federation poller uses PostgreSQL advisory locks (pg_try_advisory_lock) instead of a Firestore document
- The real-time cache update uses Valkey pub/sub instead of Firestore snapshot listener
// internal/federation/poller.go
func (p *Poller) acquireLock(ctx context.Context, db *pgx.Conn) (bool, error) {
// PostgreSQL advisory lock — only one instance runs the sync at a time
// Lock key: hash of "federation-poller" (a fixed int64)
var acquired bool
err := db.QueryRow(ctx,
"SELECT pg_try_advisory_lock($1)", federationLockKey).Scan(&acquired)
return acquired, err
}
Pre-seeded marketplaces via scripts/seed-marketplaces.sh, same as v1.
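The acquireLock comment describes federationLockKey as a fixed int64 derived from the string "federation-poller", but does not say how. One possible derivation is sketched below using FNV-1a from the standard library; the choice of hash and the `advisoryLockKey` name are assumptions of this example. Any stable function works as long as every replica computes the same value.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// advisoryLockKey derives a stable int64 from a lock name using FNV-1a.
// pg_try_advisory_lock takes a bigint, which maps to int64 in Go.
func advisoryLockKey(name string) int64 {
	h := fnv.New64a()
	h.Write([]byte(name)) // hash.Hash.Write never returns an error
	return int64(h.Sum64())
}

// federationLockKey is computed once so all call sites share the same value.
var federationLockKey = advisoryLockKey("federation-poller")

func main() {
	fmt.Println("lock key:", federationLockKey)
	fmt.Println("stable:", federationLockKey == advisoryLockKey("federation-poller"))
}
```

Session-level advisory locks are held until released with pg_advisory_unlock or until the connection closes, so the poller should run its sync on the same connection that took the lock.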
Full auth flow for reference:
1. Dev runs: skillsctl auth login
→ CLI fetches OIDC discovery doc from issuerURL
→ CLI POSTs to device_authorization_endpoint
→ CLI prints user_code + verification_uri
→ Dev opens browser, logs into Keycloak, enters code
→ CLI polls token_endpoint, receives access_token + refresh_token
→ Tokens saved to ~/.config/skillsctl/credentials.json
2. Dev runs: skillsctl explore
→ CLI reads access_token from credentials.json
→ If expired: CLI uses refresh_token to get new access_token (transparent)
→ CLI sends: Authorization: Bearer <access_token>
→ Server validates token: signature (JWKS), expiry, audience (clientID), domain
→ Server serves request
3. Dev runs: skillsctl push
→ Same token flow
→ Server additionally checks X-Push-Token header against push_tokens config
4. Admin runs: skillsctl marketplace add
→ Same token flow
→ Server checks groups claim in token for adminGroup value
→ No external API call — group membership is in the token
5. [If Nebari] Browser access to skillsctl web UI (future)
→ Request hits Envoy Gateway
→ No OIDC session → redirect to Keycloak
→ Keycloak auth → token set as cookie
→ Envoy forwards request with token header to skillsctl-server
→ skillsctl-server validates independently (defense in depth)
Same as v1 — skills/goreleaser/SKILL.md in the skillsctl repo, pushed to the registry on first deploy via scripts/seed-marketplaces.sh.
- Keycloak client setup: The keycloak/skillsctl-client.yaml file in the org repo documents the required config, but someone must apply it. Options: manual import via Keycloak admin console, the Terraform Keycloak provider (if your Nebari deployment uses it), or a one-time kcadm.sh script. Decide which is canonical for your org before deployment.
- CloudNativePG operator: Must be installed in the cluster before deploying the skillsctl chart. The chart creates a CloudNativePG Cluster CR but does not install the operator. Document this as a prerequisite in the org README. If it's not already in your Nebari cluster, add it as a separate ArgoCD Application in the skillsctl-deploy repo.
- OCI registry credentials: The skillsctl server needs push/pull access to ghcr.io/openteams/skills. Create a GitHub machine account or use a GitHub App for this. Store credentials as a K8s secret (skillsctl-oci-credentials) created out-of-band (not in the GitOps repo). Reference the secret name in values.yaml.
- Image tag automation: When a new skillsctl release is tagged, the prod values.yaml needs image.tag bumped. Automate this with a GitHub Action in the skillsctl-deploy repo that opens a PR on new skillsctl releases (listen for release events from the skillsctl repo via a workflow_dispatch or repository_dispatch trigger).
- Nebari CRD version pinning: The chart/crds/nebariapp-crd.yaml must be kept in sync with the nebari-operator version running in your cluster. Add a note in chart/CHANGELOG.md whenever it is updated. Mismatched CRD versions are a common silent failure mode.
- OIDC groups claim in Keycloak: Keycloak does not include group membership in tokens by default. The protocol mapper in keycloak/skillsctl-client.yaml must be applied for admin checks to work. Without it, IsAdmin always returns false and skillsctl marketplace commands return PermissionDenied for everyone.
- Pre-configured binary for onboarding: The org can build a custom binary via GoReleaser ldflags that pre-sets oidcIssuer, clientID, and apiURL. This removes the manual skillsctl config set step for new devs. Consider hosting the org-specific binary as a private GitHub Release in the skillsctl-deploy repo, or distributing via an internal Homebrew tap / Scoop bucket that overrides the public one.
- Windows self-update: Use inconshreveable/go-update for the locked-binary swap. Same constraint as v1.
- PostgreSQL backup: CloudNativePG supports WAL archiving to S3-compatible storage. Configure this in the prod values overlay. The dev environment can skip it. Define backup retention policy before going to production.
- Valkey persistence: Valkey is used for pub/sub only, not as a durable store, so AOF/RDB persistence is not required. Configure Valkey with --save "" (no persistence) to reduce I/O and storage overhead. If Valkey restarts, the worst case is that cache entries go stale until the next write invalidates them, which is acceptable.
- Proto versioning + VERSION file: Same as v1. Maintain VERSION file in repo root. Buf breaking-change check reads it.
- Generated code drift: Same as v1. buf generate && git diff --exit-code gen/ in CI. Add gen/** linguist-generated=true to .gitattributes.
- Scoop bucket: Requires a separate public GitHub repo yourorg/scoop-bucket. Create before first GoReleaser run.
- chart-testing (ct) prerequisites: The ci-chart.yml pipeline uses helm/chart-testing-action. It requires a chart/ci/test-values.yaml that provides all required values (oci.repository, auth.oidc.issuerURL, auth.oidc.clientID) with test-safe values pointing at the kind cluster's local services.
End of work plan v2. Implementation order: Proto (Phase 1) → Backend (Phase 2) → Chart (Phase 3) → CLI (Phase 4) → CI/CD (Phase 5) → Federation (§5) → Org repo (§4) → Dogfood (§7). The org repo (§4) cannot be completed until the chart is published from Phase 3.