
FeatBit Experimentation

The A/B testing & experimentation system for FeatBit.

Turn a feature flag into a measured experiment. Every release moves through the full decision loop:

intent → hypothesis → exposure → measurement → analysis → decision → learning

Powered by Bayesian A/B testing, multi-armed bandits, FeatBit feature flags, and an AI-driven workflow scripted by a release-decision Skills catalog.

The coding agent is a first-class user of this system — Skills under skills/ script the workflow phases (CF-01 → CF-08); the web UI is a viewer/editor over the same database. Decisions are deterministic and auditable: every stage transition, hypothesis edit, and analysis result is appended to a per-experiment activity log.

See WHITE_PAPER.md for the product thesis.


Quick Start

Prerequisites

The FeatBit Release Decision Agent is built on top of the FeatBit feature-flag platform — it doesn't replace it. Before deploying RDA you need a running FeatBit instance, because the web app:

  • delegates all authentication to FeatBit (login, workspace, project)
  • reads / writes feature flags through FeatBit's API to drive experiment exposure

SaaS path — sign up at featbit.co and you're done; FeatBit + RDA are bundled.

Self-host path — install FeatBit first from github.com/featbit/featbit (Docker Compose or Helm). Then set FEATBIT_API_URL on the web service in docker/docker-compose.yml (or web.featbit.apiUrl in Helm values) to point RDA's web app at your FeatBit instance — it's a runtime env var, so no rebuild is required.
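A minimal compose override illustrating that change (the service name web and the FEATBIT_API_URL variable come from docker/docker-compose.yml; the hostname and port below are placeholders):

```yaml
services:
  web:
    environment:
      # Point RDA's web app at your self-hosted FeatBit API.
      # The URL is a placeholder — substitute your own FeatBit instance.
      FEATBIT_API_URL: "http://featbit.internal.example.com:5000"
```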

PostgreSQL and ClickHouse are external dependencies: as noted in the Architecture section below, the compose stack and Helm chart do not bundle them. Provision your own instances (or reuse existing ones) and point the connection strings at them; see the deployment guides.

Deployment

1. Try it online at featbit.co

The fastest path is the hosted version: sign up at featbit.co and create an experiment from the dashboard. No install, no infrastructure.

2. Self-host with Docker Compose

Full step-by-step guide: docker/README.md.

3. Self-host with Helm on Kubernetes

For production deployments — autoscaling, ingress, TLS, secret projection — use the umbrella Helm chart:

helm install featbit-rda charts/featbit-rda \
  --namespace featbit --create-namespace \
  -f charts/featbit-rda/examples/aks/values.yaml

Full guide and AKS reference values: charts/README.md.

Run the local Claude Code connector (for Local chat mode)

The web app's chat panel has two modes — Managed (FeatBit-hosted Claude via sandbox0) and Local (your own Claude Code CLI). Local is the default. If you stay on Local mode, run the connector on your machine before opening the chat panel.

Two flags matter:

  • --access-token — required. Bearer token (fbat_…) the agent uses to call the web API. Issue one from the web UI: Env Settings → Agent tokens → Issue token. Plaintext is shown once — copy it then.
  • --sync-api-url — base URL of the web app. Default is https://www.featbit.ai. Set to http://localhost:3000 when running web locally, or to your self-hosted URL.
npx @featbit/experimentation-claude-code-connector \
  --access-token fbat_xxx \
  --sync-api-url http://localhost:3000

Same command on macOS, Linux, and Windows PowerShell — no shell-specific env-var dance. Listens on http://127.0.0.1:3100. Keep the process running while you use the chat panel.

If port 3100 is taken, add --port 4100, then click Change in the chat panel's connector toolbar (or open Env Settings → Local Claude Code connector URL) and paste the new URL. Both inputs share the same per-browser localStorage value.

Run with --help for the full flag reference, or see modules/experimentation-claude-code-connector/README.md for env-var fallbacks and SSE protocol details.

Usage

Once the dashboard is up:

  1. Create an experiment — pick a flag, write the hypothesis, define the primary metric and guardrails.
  2. Roll out the flag in FeatBit; your application emits flag_evaluation and metric_event records to your data backend.
  3. Analyze — click Analyze on the run. The web service pulls per-variant statistics, runs Bayesian A/B (or a Thompson-sampling bandit) in-process, and stores the result on the run row.
  4. Decide — the evidence-analysis phase frames the outcome as CONTINUE / PAUSE / ROLLBACK CANDIDATE / INCONCLUSIVE.
  5. Learn — capture a structured postmortem; the next iteration starts from evidence, not memory.
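Step 2 above, sketched in TypeScript for the default featbit-managed mode. The /api/track path and event names flag_evaluation / metric_event come from this README, but the payload field names here are illustrative assumptions, not the documented track-service schema:

```typescript
// Sketch: emitting exposure and outcome events to track-service.
// Field names are assumptions for illustration — check the actual schema.
type TrackEvent =
  | { type: "flag_evaluation"; flagKey: string; variationId: string; userId: string; timestamp: string }
  | { type: "metric_event"; metricKey: string; value: number; userId: string; timestamp: string };

function evaluationEvent(flagKey: string, variationId: string, userId: string): TrackEvent {
  return { type: "flag_evaluation", flagKey, variationId, userId, timestamp: new Date().toISOString() };
}

function metricEvent(metricKey: string, value: number, userId: string): TrackEvent {
  return { type: "metric_event", metricKey, value, userId, timestamp: new Date().toISOString() };
}

// POST a batch to the ingest endpoint named in this README (/api/track).
async function track(baseUrl: string, events: TrackEvent[]): Promise<void> {
  const res = await fetch(`${baseUrl}/api/track`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(events),
  });
  if (!res.ok) throw new Error(`track failed with HTTP ${res.status}`);
}
```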

Detailed workflow + data-source modes: docs/usage/.
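For intuition on the Bayesian comparison in step 3, here is a minimal sketch — not the engine's actual code, which lives in modules/web. With Beta(1, 1) priors on each variant's conversion rate, the posterior probability that treatment beats control can be estimated by Monte Carlo sampling:

```typescript
// Standard normal sample via Box-Muller.
function gaussian(): number {
  const u = 1 - Math.random();
  const v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Gamma(shape, 1) sample via Marsaglia-Tsang (valid for shape >= 1,
// which holds here because the Beta(1, 1) prior adds 1 to each count).
function gammaSample(shape: number): number {
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    let x: number, v: number;
    do {
      x = gaussian();
      v = 1 + c * x;
    } while (v <= 0);
    v = v * v * v;
    const u = Math.random();
    if (u < 1 - 0.0331 * x ** 4) return d * v;
    if (Math.log(u) < 0.5 * x * x + d * (1 - v + Math.log(v))) return d * v;
  }
}

// Beta(a, b) sample from two Gamma samples.
function betaSample(a: number, b: number): number {
  const x = gammaSample(a);
  const y = gammaSample(b);
  return x / (x + y);
}

interface VariantCounts { conversions: number; exposures: number; }

// P(treatment rate > control rate) under independent Beta posteriors.
function probTreatmentBeats(ctrl: VariantCounts, treat: VariantCounts, draws = 20000): number {
  let wins = 0;
  for (let i = 0; i < draws; i++) {
    const pc = betaSample(1 + ctrl.conversions, 1 + ctrl.exposures - ctrl.conversions);
    const pt = betaSample(1 + treat.conversions, 1 + treat.exposures - treat.conversions);
    if (pt > pc) wins++;
  }
  return wins / draws;
}
```

A decision rule on top of this (e.g. "ship if the probability exceeds 0.95") is what turns the posterior into a CONTINUE / PAUSE verdict.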


Architecture

                ┌──────────────────────────────────────────────┐
                │  modules/web  (Next.js + Prisma)  :3000      │
                │  Dashboard, REST API, Bayesian/Bandit        │
                │  analysis engine                             │
                └──┬───────────────┬───────────────────┬───────┘
                   │               │                   │
       ┌───────────┘               │                   └────────────┐
       ▼                           ▼                                ▼
┌────────────────┐        ┌──────────────────┐          ┌────────────────────┐
│  External      │        │  modules/        │          │  modules/          │
│  PostgreSQL    │        │  track-service   │          │  experimentation-  │
│                │        │  (.NET)  :5050   │          │  claude-code-      │
│  experiments,  │        │                  │          │  connector         │
│  runs, memory, │        │  Optional —      │          │  (npm, runs on     │
│  activity log  │        │  bring your own  │          │  user's machine)   │
└────────────────┘        │  warehouse via   │          │  :3100 loopback    │
                          │  Customer        │          └────────────────────┘
                          │  Managed         │                    ▲
                          │  Endpoint        │                    │
                          └────────┬─────────┘          ┌─────────┴────────┐
                                   ▼                    │  skills/         │
                          ┌──────────────────┐          │  (release-       │
                          │  External        │          │  decision        │
                          │  ClickHouse      │          │  workflow        │
                          │                  │          │  CF-01 → CF-08)  │
                          │  Optional —      │          └──────────────────┘
                          │  flag_evaluations│
                          │  + metric_events │
                          └──────────────────┘
| Component | Stack | Role |
| --- | --- | --- |
| skills/ | Markdown skill catalog | Encodes the eight release-decision phases (CF-01 intent → CF-08 learning). Loaded by the coding agent at runtime; the agent calls the web API to persist state. |
| modules/web | Next.js, Prisma, TypeScript | Dashboard UI, REST API, in-process Bayesian / Thompson-sampling analysis engine, per-project memory store. |
| modules/experimentation-claude-code-connector | Node.js, @anthropic-ai/claude-agent-sdk, Express SSE | Optional npm package the user runs on their own machine to expose their local Claude Code CLI to the web UI's Local Claude Code chat mode. Published on npm; source in this repo. |
| modules/track-service (optional) | .NET, ClickHouse | Event ingest (/api/track) and per-experiment metric query (/api/query/experiment). Skip it entirely if you bring your own data warehouse via Customer Managed Endpoint mode. |
| External PostgreSQL | | Holds Experiment, ExperimentRun, Activity, Message, and project + user memory. Provisioned by you; the chart and compose stack do not include a Postgres container. |
| External ClickHouse (optional) | | Holds flag_evaluations and metric_events. Required only when track-service is in the loop; not needed in Customer Managed Endpoint mode. |

Why track-service and ClickHouse are optional

Every experiment carries a dataSourceMode. The default (featbit-managed) pulls statistics from track-service via /api/query/experiment. The alternative (customer-single / customer-per-metric) calls your own HTTPS endpoint that returns per-variant statistics in a fixed shape — implemented in modules/web/src/lib/stats/customer-endpoint-client.ts and customer-endpoint-fetcher.ts. In customer mode the analysis engine never touches track-service, so neither it nor ClickHouse is required.
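A sketch of what such a customer endpoint might compute, with an assumed response shape — the real contract is defined by customer-endpoint-client.ts, so treat the field names here as illustrative:

```typescript
// Hypothetical per-variant statistics shape returned by a
// Customer Managed Endpoint. Match the real contract in practice.
interface VariantStats {
  variationId: string;
  exposures: number;
  conversions: number;
  mean: number; // conversion rate
}

// Aggregate raw warehouse rows into per-variant statistics.
function aggregate(rows: { variationId: string; converted: boolean }[]): VariantStats[] {
  const byVariant = new Map<string, { exposures: number; conversions: number }>();
  for (const r of rows) {
    const s = byVariant.get(r.variationId) ?? { exposures: 0, conversions: 0 };
    s.exposures++;
    if (r.converted) s.conversions++;
    byVariant.set(r.variationId, s);
  }
  return [...byVariant.entries()].map(([variationId, s]) => ({
    variationId,
    exposures: s.exposures,
    conversions: s.conversions,
    mean: s.exposures ? s.conversions / s.exposures : 0,
  }));
}
```

Serve the aggregated array over HTTPS from your warehouse, and the analysis engine consumes it directly, with no track-service or ClickHouse in the loop.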


Further reading


License: Apache 2.0 — see LICENSE and NOTICE. Forks and Derivative Works must retain the attribution to FeatBit Experimentation per Section 4(d) of the License.
