|
| 1 | +# AGENTS.md |
| 2 | + |
| 3 | +Guidance for AI coding assistants working in `fluxcd/notification-controller`. Read this file before making changes. |
| 4 | + |
| 5 | +## Contribution workflow for AI agents |
| 6 | + |
| 7 | +These rules come from [`fluxcd/flux2/CONTRIBUTING.md`](https://github.com/fluxcd/flux2/blob/main/CONTRIBUTING.md) and apply to every Flux repository. |
| 8 | + |
| 9 | +- **Do not add `Signed-off-by` or `Co-authored-by` trailers with your agent name.** Only a human can legally certify the DCO. |
| 10 | +- **Disclose AI assistance** with an `Assisted-by` trailer naming your agent and model: |
| 11 | + ```sh |
| 12 | + git commit -s -m "Add support for X" --trailer "Assisted-by: <agent-name>/<model-id>" |
| 13 | + ``` |
| 14 | + The `-s` flag adds the human's `Signed-off-by` from their git config — do not remove it. |
| 15 | +- **Commit message format:** Subject in imperative mood ("Add feature X" instead of "Adding feature X"), capitalized, no trailing period, ≤50 characters. Body wrapped at 72 columns, explaining what and why. No `@mentions` or `#123` issue references in the commit — put those in the PR description. |
| 16 | +- **Trim verbiage:** in PR descriptions, commit messages, and code comments. No marketing prose, no restating the diff, no emojis. |
| 17 | +- **Rebase, don't merge:** Never merge `main` into the feature branch; rebase onto the latest `main` and push with `--force-with-lease`. Squash before merge when asked. |
| 18 | +- **Pre-PR gate:** `make tidy fmt vet && make test` must pass and the working tree must be clean after codegen. Commit regenerated files in the same PR. |
| 19 | +- **Flux is GA:** Backward compatibility is mandatory. Breaking changes to CRD fields, status, CLI flags, metrics, or observable behavior will be rejected. Design additive changes and keep older API versions round-tripping. |
| 20 | +- **Copyright:** All new `.go` files must begin with the boilerplate from `hack/boilerplate.go.txt` (Apache 2.0). Update the year to the current year when copying. |
| 21 | +- **Spec docs:** New features and API changes must be documented in `docs/spec/` under the current version — `v1/receivers.md` for Receiver, `v1beta3/alerts.md` and `v1beta3/providers.md` for Alert and Provider. Update the relevant file in the same PR that introduces the change. |
| 22 | +- **Tests:** New features, improvements and fixes must have test coverage. Add unit tests in `internal/controller/*_test.go` and other `internal/*` packages as appropriate. Follow the existing patterns for test organization, fixtures, and assertions. Run tests locally before pushing. |
| 23 | + |
| 24 | +## Code quality |
| 25 | + |
| 26 | +Before submitting code, review your changes for the following: |
| 27 | + |
| 28 | +- **No secrets in logs or events.** Anything derived from a Secret (tokens, passwords, webhook URLs with embedded secrets) must be scrubbed via `fluxcd/pkg/masktoken` before logging or surfacing in conditions. Never log the receiver webhook path at info level — it is effectively a secret. |
| 29 | +- **No unchecked I/O.** Close HTTP response bodies and file handles in `defer` statements. Check and propagate errors from I/O operations. |
| 30 | +- **Bounded request bodies.** Both HTTP servers enforce `maxRequestSizeBytes` (3 MiB). Always read through `io.LimitReader` when extending handlers. Do not introduce new readers without bounds. |
| 31 | +- **Single HTTP client for notifiers.** `internal/notifier/client.go` is the only sanctioned way to make outbound HTTP calls from a notifier. It handles proxies, TLS, retries, and response validation. Do not add your own HTTP client or retry loops. |
| 32 | +- **No hardcoded defaults for security settings.** TLS verification must remain enabled by default; proxy settings come from user-provided secrets. Receiver signature verification must never be short-circuited. |
| 33 | +- **Error handling.** Wrap errors with `%w` for chain inspection. Do not swallow errors silently. Return actionable error messages that help users diagnose the issue without leaking internal state. |
| 34 | +- **No duplicate rate limiting.** The event server already applies a token-bucket rate limiter keyed by event fingerprint. Do not add a second layer of deduplication. |
| 35 | +- **Concurrency safety.** Do not introduce shared mutable state without synchronization. Both HTTP servers and the reconcilers run concurrently. |
| 36 | +- **No panics.** Never use `panic` in runtime code paths. Return errors and let the reconciler or handler handle them gracefully. |
| 37 | +- **Minimal surface.** Keep new exported APIs, flags, and environment variables to the minimum needed. Every export is a backward-compatibility commitment. |
| 38 | + |
| 39 | +## Project overview |
| 40 | + |
| 41 | +notification-controller is the eventing edge of the [Flux GitOps Toolkit](https://fluxcd.io/flux/components/). It reconciles three custom resources under `notification.toolkit.fluxcd.io` — `Provider`, `Alert`, and `Receiver` — and runs two HTTP servers alongside the controller manager: |
| 42 | + |
| 43 | +- An inbound **event sink** (default `:9090`) that ingests `eventv1.Event` payloads from the other Flux controllers and dispatches them to external notifier backends (Slack, MS Teams, PagerDuty, Git commit status, …) according to matching `Alert` resources. |
| 44 | +- An inbound **webhook receiver** (default `:9292`) that translates third-party webhooks (GitHub, GitLab, Harbor, image registries, CDEvents, …) into reconcile requests on Flux objects. |
| 45 | + |
| 46 | +It does not reconcile source or workload state itself. |
| 47 | + |
| 48 | +## Repository layout |
| 49 | + |
| 50 | +- `main.go` — manager entrypoint. Registers the three reconcilers, starts `server.NewEventServer` and `server.NewReceiverServer` as goroutines, wires feature gates, token cache, and workload identity. |
| 51 | +- `api/` — separate Go module (`replace`d from root `go.mod`). CRD Go types for `v1`, `v1beta1`, `v1beta2`, `v1beta3`. Storage versions: `v1` for `Receiver`, `v1beta3` for `Alert` and `Provider`. `zz_generated.deepcopy.go` is generated. |
| 52 | +- `internal/controller/` — reconcilers: `provider_controller.go`, `alert_controller.go`, `receiver_controller.go`, predicates, and the envtest `suite_test.go`. |
| 53 | +- `internal/server/` — the two HTTP servers. `event_server.go` + `event_handlers.go` implement event ingestion, alert matching, filtering, rate limiting (`sethvargo/go-limiter`) and token masking (`fluxcd/pkg/masktoken`). `receiver_server.go` + `receiver_handlers.go` implement webhook verification and dispatch. `provider_commit_status.go` and `provider_change_request.go` handle Git commit-status/MR commenting. |
| 54 | +- `internal/notifier/` — one file per backend plus `notifier.go` (the `Interface` with a single `Post(ctx, event)` method), `factory.go` (provider name → constructor map), `client.go` (shared retryable HTTP client), `util.go`, `markdown.go`. Each backend has a sibling `_test.go`. |
| 55 | +- `internal/features/` — feature-gate registration. |
| 56 | +- `config/` — Kustomize overlays. `config/crd/bases/` holds generated CRDs; `config/manager/`, `config/default/`, `config/rbac/`, `config/samples/`, `config/testdata/`. |
| 57 | +- `hack/` — `boilerplate.go.txt` and `api-docs/` templates for `gen-crd-api-reference-docs`. |
| 58 | +- `docs/spec/` — human-readable specs per API version. `docs/api/` — generated reference docs. |
| 59 | +- `tests/` — `listener/` and `proxy/` helpers for integration tests. |
| 60 | + |
| 61 | +### Notifiers implemented in `internal/notifier/` |
| 62 | + |
| 63 | +Mapped in `factory.go`: |
| 64 | + |
| 65 | +- generic, generic-hmac (the built-in `Forwarder`) |
| 66 | +- slack, discord, rocket, msteams, googlechat, googlepubsub, webex, telegram, lark, matrix, zulip |
| 67 | +- sentry, pagerduty, opsgenie, datadog, grafana, alertmanager |
| 68 | +- github, githubdispatch, githubpullrequestcomment |
| 69 | +- gitlab, gitlabmergerequestcomment |
| 70 | +- gitea, giteapullrequestcomment |
| 71 | +- bitbucket, bitbucketserver, azuredevops |
| 72 | +- azureeventhub, nats, otel |
| 73 | + |
| 74 | +## APIs and CRDs |
| 75 | + |
| 76 | +- Group: `notification.toolkit.fluxcd.io`. Kinds: `Provider`, `Alert`, `Receiver`. |
| 77 | +- Types under `api/<version>/{provider,alert,receiver}_types.go` with constants for every provider/receiver type name (e.g. `SlackProvider`, `GitHubReceiver`). The notifier factory and the receiver validator switch on these constants — keep them in sync when adding a type. |
| 78 | +- `v1beta3` is the current `Alert`/`Provider` storage version; `Receiver` graduated to `v1`. Conversion logic is code-generated. |
| 79 | +- CRD manifests under `config/crd/bases/` and reference docs under `docs/api/<version>/notification.md` are generated by `make manifests` and `make api-docs`. Do not hand-edit. |
| 80 | +- `config/crd/bases/gitrepositories.yaml` is pulled at build time from source-controller via `make download-crd-deps` (for envtest); do not commit changes to it. |
| 81 | + |
| 82 | +## Build, test, lint |
| 83 | + |
| 84 | +All targets in the root `Makefile`. Tool versions pinned via `CONTROLLER_GEN_VERSION`, `GEN_API_REF_DOCS_VERSION`, `SOURCE_VER`. Go version tracks `go.mod`. |
| 85 | + |
| 86 | +- `make tidy` — tidy both the root and `api/` modules. |
| 87 | +- `make fmt` / `make vet` — run in both modules. |
| 88 | +- `make generate` — `controller-gen object` in `api/` to refresh `zz_generated.deepcopy.go`. |
| 89 | +- `make manifests` — regenerate CRDs and RBAC under `config/crd/bases` and `config/rbac`. |
| 90 | +- `make api-docs` — regenerate `docs/api/v1`, `docs/api/v1beta2`, `docs/api/v1beta3` markdown. |
| 91 | +- `make test` — full pre-PR gate. Depends on `tidy generate fmt vet manifests api-docs download-crd-deps install-envtest`, then runs `go test ./...` with envtest assets plus `go test ./...` in `api/`. `GO_TEST_ARGS` for extra flags. |
| 92 | +- `make manager` — builds `bin/manager`. |
| 93 | +- `make run` — runs the controller against `~/.kube/config`. |
| 94 | +- `make install` / `make uninstall` / `make deploy` / `make dev-deploy` / `make dev-cleanup` — cluster workflows (`IMG` variable). |
| 95 | +- `make docker-build` / `make docker-push` — buildx image build (`linux/amd64` default via `BUILD_PLATFORMS`). |
| 96 | + |
| 97 | +## Codegen and generated files |
| 98 | + |
| 99 | +Check `go.mod` and the `Makefile` for current dependency and tool versions. After changing API types, kubebuilder markers, or RBAC comments, regenerate and commit the results: |
| 100 | + |
| 101 | +```sh |
| 102 | +make generate manifests api-docs |
| 103 | +``` |
| 104 | + |
| 105 | +Generated files (never hand-edit): |
| 106 | + |
| 107 | +- `api/*/zz_generated.deepcopy.go` |
| 108 | +- `config/crd/bases/notification.toolkit.fluxcd.io_*.yaml` |
| 109 | +- `config/crd/bases/gitrepositories.yaml` (downloaded from source-controller) |
| 110 | +- `config/rbac/role.yaml` |
| 111 | +- `docs/api/**/*.md` |
| 112 | + |
| 113 | +No load-bearing `replace` directives beyond the standard `api/` local replace. |
| 114 | + |
| 115 | +Bump `fluxcd/pkg/*` modules as a set — version skew breaks `go.sum`. Run `make tidy` after any bump. |
| 116 | + |
| 117 | +## Conventions |
| 118 | + |
| 119 | +- `go fmt`, `go vet`. All exported identifiers get doc comments, including provider-type-name constants. Non-trivial unexported helpers should also be documented. |
| 120 | +- **Adding a notifier.** Add a constant in `api/v1beta3/provider_types.go`, extend the `+kubebuilder:validation:Enum` list on `ProviderSpec.Type`, implement the backend in `internal/notifier/<name>.go` returning the `notifier.Interface`, and add its entry to the `notifiers` map plus a `xxxNotifierFunc` in `internal/notifier/factory.go`. Then regenerate CRDs and docs. |
| 121 | +- **Receiver verification.** `internal/server/receiver_handlers.go::validate` is the single entry point. Each `Receiver.Spec.Type` has its own signature scheme — GitHub HMAC-SHA1/256 over the body with `X-Hub-Signature`, GitLab compares `X-Gitlab-Token`, Bitbucket/Harbor/DockerHub/Quay/Nexus/GCR/ACR/CDEvents each have their own rules. Do not short-circuit these checks. Payloads are capped at `maxRequestSizeBytes = 3 * 1024 * 1024` (3 MiB); reuse the constant. |
| 122 | +- **Webhook path.** Receivers are addressed under `apiv1.ReceiverWebhookPath` (`/hook/`) with a per-resource digest. The path is indexed on `Receiver.Status.WebhookPath` via `WebhookPathIndexKey` — do not bypass the index when looking up receivers. |
| 123 | +- **Event dispatch.** `EventServer.handleEvent` matches each event against all `Alert` resources, applies inclusion/exclusion filters and CEL `eventMetadata`, rate-limits duplicates through the memory store, then calls each notifier's `Post(ctx, event)` with a 15s context timeout. HTTP-based notifiers inherit retry behavior from `internal/notifier/client.go` via `go-retryablehttp`. |
| 124 | +- **Proxy and TLS.** Every HTTP notifier takes an optional `ProxyURL` and `*tls.Config` from `notifierOptions`. Respect both. TLS material comes from `ProviderSpec.CertSecretRef` via `pkg/runtime/secrets`; never disable verification by default. |
| 125 | +- **Service account / workload identity.** Cloud notifiers (Azure Event Hub, Azure DevOps, Google Pub/Sub, GitHub App) use `fluxcd/pkg/auth` with the shared token cache passed through `WithTokenCache` / `WithTokenClient`. |
| 126 | +- **Cross-namespace refs.** Gated by `--no-cross-namespace-refs` (ACL option); both servers honor it when resolving `Alert.Spec.ProviderRef` and `Receiver.Spec.Resources`. |
| 127 | + |
| 128 | +## Testing |
| 129 | + |
| 130 | +- `internal/controller/suite_test.go` bootstraps `envtest` with controller-runtime. `make install-envtest` downloads the binaries into `build/testbin` via `setup-envtest`; `make test` exports `KUBEBUILDER_ASSETS`. |
| 131 | +- Reconciler tests (`alert_controller_test.go`, `provider_controller_test.go`, `receiver_controller_test.go`) drive the envtest API server. Use `testdata/` fixtures when possible. |
| 132 | +- Server tests (`event_server_test.go`, `event_handlers_test.go`, `receiver_handler_test.go`, `receiver_resource_filter_test.go`, `provider_commit_status_test.go`) exercise the HTTP handlers directly with `httptest`. |
| 133 | +- Notifier tests: one `<name>_test.go` per backend standing up an `httptest.Server` (or a fake SDK client for platforms that don't accept a custom URL) and asserting on the captured request. Match this pattern when adding a new notifier.- Run a single test: `make test GO_TEST_ARGS='-run TestSlack'`. |
| 134 | +- The `api/` submodule has its own tests; `make test` runs them in a separate `go test` pass. |
| 135 | + |
| 136 | +## Gotchas and non-obvious rules |
| 137 | + |
| 138 | +- Two Go modules. The root module imports `github.com/fluxcd/notification-controller/api` via a local `replace`. Run `tidy`, `fmt`, `vet` in both when you touch `api/` — the Makefile targets already do this. |
| 139 | +- **CRD enum drift.** `ProviderSpec.Type` and `ReceiverSpec.Type` have explicit `+kubebuilder:validation:Enum` lists. Adding a new constant in Go is not enough — extend the enum marker and rerun `make manifests`, or the new type will be rejected by the API server. |
| 140 | +- `v1beta3` is the storage version for `Alert`/`Provider` and introduced breaking-ish shape changes from `v1beta2` (e.g. `interval` deprecated on `Provider`). New fields go on `v1beta3` (and `v1` for `Receiver`); do not add fields to older versions. |
| 141 | +- The `Alert` reconciler is intentionally minimal — it exists to migrate `v1beta3` alerts and manage finalizers. Actual dispatch happens in `internal/server/event_handlers.go`. Do not move business logic into the reconciler. |
| 142 | +- The receiver webhook path is derived from a SHA-256 digest of the resource and a token from the referenced Secret; it is written to `Receiver.Status.WebhookPath` and indexed with `WebhookPathIndexKey`. |
| 143 | +- The event server applies a token-bucket rate limiter keyed by event fingerprint (`eventKeyFunc`) using `sethvargo/go-limiter` — duplicate events within `--rate-limit-interval` (default 5m) are dropped before notifier dispatch. |
| 144 | +- `--events-addr :9090` must match what source-controller, kustomize-controller, and helm-controller post to via `EVENT_ENDPOINT`; changing the default breaks the GitOps Toolkit wiring. |
| 145 | +- Prefer existing `fluxcd/pkg` helpers (runtime, secrets, auth, cache, masktoken) before adding new top-level dependencies. |
0 commit comments