HostKit

Elixir-native host infrastructure declarations, planning, and runtime control.

HostKit is intended to be used from a normal Mix project with .exs infrastructure files. The DSL compiles to plain inspectable structs; Mix tasks are wrappers around the runtime API.

For naming, block shape, defaults, and reference style, see DSL design guidelines.

Design

  • Core owns systemd/systemdkit persistent units.
  • Core owns unitctl transient runtime primitives.
  • Integrations such as Caddy, Forgejo, object storage, and monitoring are providers.
  • DSL evaluation never applies changes to a host.
  • Planning and rendering are available as runtime APIs.

Example

use HostKit.DSL

project :toys do
  roots source: "/opt/toys/src",
        data: "/srv/toys",
        state: "/var/lib/toys",
        config: "/etc/toys"

  prefixes user: "toys-", unit: "toys-"

  host :elixir_toys, at: "elixir.toys" do
    ssh do
      user "dannote"
      sudo true
    end
  end

  service :exograph do
    account system: true
    storage :data, mode: 0o755
    storage :state, mode: 0o750

    daemon do
      description "Exograph search"
      after_target :network_online
      wants :network_online
      working_directory path(:source)
      exec ["/usr/local/bin/mix", "exograph.index.hex", "--web", "--port", "4200"]
      restart :on_failure
      restart_sec 10

      isolate do
        writable :data
        writable :state
        network :loopback
      end
    end
  end
end

Plans and down plans

Rollback is represented as another HostKit plan. A plan change already carries before and after state, so HostKit can derive a down plan from the exact plan that was applied. Down plans include coverage stats for reversible, explicit no-op, and skipped original changes:

{:ok, plan} = HostKit.plan(project, target: prod)
{:ok, down_plan} = HostKit.down(plan)

HostKit.format_plan(down_plan)
HostKit.apply(down_plan, confirm: true)
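
The derivation described above can be pictured with plain maps. This is a minimal sketch, not HostKit's implementation: the change shape (`resource`, `action`, `old`, `new`) and the `DownPlanSketch` module are hypothetical stand-ins for the before/after state that a real plan change carries, and rollback policies are ignored.

```elixir
defmodule DownPlanSketch do
  # Hypothetical change shape: %{resource: id, action: atom, old: term, new: term}.
  # `old` and `new` stand in for the before/after state of a plan change.

  # Reverse the order of the applied changes and invert each one.
  def down(changes) when is_list(changes) do
    changes
    |> Enum.reverse()
    |> Enum.map(&invert/1)
  end

  # A created resource is deleted on the way down.
  defp invert(%{action: :create, new: new} = change),
    do: %{change | action: :delete, old: new, new: nil}

  # A deleted resource is recreated from its previous state.
  defp invert(%{action: :delete, old: old} = change),
    do: %{change | action: :create, old: nil, new: old}

  # An update swaps its two sides.
  defp invert(%{action: :update, old: old, new: new} = change),
    do: %{change | old: new, new: old}
end

up = [
  %{resource: {:file, "/etc/app/extra"}, action: :create, old: nil, new: "x=1"},
  %{resource: {:file, "/etc/app/config"}, action: :update, old: "a=1", new: "a=2"}
]

IO.inspect(DownPlanSketch.down(up))
```

Because the down plan is computed only from the recorded changes, it needs no second read of the host, which is the property the paragraph above relies on.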

Partial rollback uses the same plan model:

{:ok, down_plan} =
  HostKit.down(plan, only: [{:file, "/etc/gatehouse/config.exs"}])

Command-like operations need semantic down steps because HostKit cannot infer the opposite of an arbitrary command:

command :migrate,
  exec: {"bin/app", ["eval", "App.Release.migrate()"]},
  phase: :before_start,
  down: {"bin/app", ["eval", "App.Release.rollback()"]}

command :warm_cache,
  exec: {"bin/app", ["eval", "App.Cache.warm()"]},
  down: :noop

The down command is emitted as an ordinary command change in the down plan. down: :irreversible records an explicit warning and omits the command from the down plan.

Created resources use conservative rollback policies. File-like resources and symlinks can be deleted by a down plan, but directories are kept unless explicitly opted in:

file "/etc/app/config", content: "..."
symlink "/opt/app/current", to: "/opt/app/releases/20260615"
directory "/tmp/demo", rollback: :delete_if_created
directory "/srv/app", rollback: :keep
account :app, system: true, rollback: :keep
package :caddy, rollback: :keep

Symlink ownership is unmanaged unless owner: or group: is explicitly set. This keeps release/current links reproducible across platforms where changing symlink inode ownership is unsupported or unreliable. When explicit symlink ownership is requested, apply verifies it and fails if the target cannot enforce it.

CLI usage mirrors this:

mix host_kit.plan infra/config.exs --host prod --out up.plan.json
mix host_kit.down up.plan.json --out down.plan.json
mix host_kit.apply --plan down.plan.json --confirm

Run tracking

Tracked applies write minimal run records under the project-configured HostKit runs root:

mix host_kit.apply --track --plan up.plan.json --confirm
mix host_kit.runs --host prod infra/config.exs
mix host_kit.runs --host prod --verbose infra/config.exs
mix host_kit.runs --host prod --latest --verbose infra/config.exs
mix host_kit.down --host prod --run 20260614-101148-demo-up --out down.plan.json infra/config.exs

Run records are intentionally compact: they identify the run, project, direction, timestamp, and applied change statuses. They do not replace plan artifacts; use plan artifacts for inspectable up/down plan contents. When a tracked apply is started from --plan, HostKit copies that up-plan artifact under the runs root and records the copied path so mix host_kit.down --last can work from the tracked run.

Tracked applies also write backup payloads for previous file-like state when that state was captured in the plan. Backup payloads live under hostkit_backups/<run-id>/ or the --backups-root override. mix host_kit.down --last and mix host_kit.down --run RUN_ID rewrite supported previous file-like state to %HostKit.BackupRef{} entries so generated down plans restore from backup payloads instead of embedding prior content. Backup-backed restore currently covers ordinary files plus rendered file resources such as env files, Caddy sites, proxy config, firewall/egress files, and systemd unit files when their previous rendered content was captured. Symlink rollback restores the previous link target directly in the plan. Use mix host_kit.runs --verbose, --latest, or --id RUN_ID to inspect copied plan artifacts and backup payload paths.

Source updates are intentionally not inferred as reversible by default: a previous Git remote/ref may no longer be reachable. Treat source rollback as an explicit lifecycle operation or pair it with a backup/source-bundle strategy.

Run retention is explicit. Use mix host_kit.runs --prune --keep N to remove older run records plus their copied plan artifact and backup payload directories.
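
As a rough illustration of what keeping the newest N runs means, the sketch below splits a list of run ids into kept and pruned sets. It assumes run ids start with a sortable timestamp, as in the example id `20260614-101148-demo-up`; the `RunRetentionSketch` module is hypothetical and does not touch the filesystem.

```elixir
defmodule RunRetentionSketch do
  # Returns {kept, pruned}, keeping the `keep` newest run ids.
  # Assumes ids sort chronologically as strings (timestamp prefix).
  def split(run_ids, keep) when is_integer(keep) and keep >= 0 do
    run_ids
    |> Enum.sort(:desc)
    |> Enum.split(keep)
  end
end

{kept, pruned} =
  RunRetentionSketch.split(
    ["20260612-090000-demo-up", "20260614-101148-demo-up", "20260613-120000-demo-down"],
    2
  )

IO.inspect(kept: kept, pruned: pruned)
```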

Elixir app lifecycle helpers

The Elixir app recipe can emit lifecycle commands for common BEAM deployment operations. Ecto migrations are represented as normal commands with explicit down commands:

elixir_app :shop do
  source github: "acme/shop", path: ".", ref: "main"
  phoenix host: "shop.example.com", secret_key_base: secret_env("SECRET_KEY_BASE")

  ecto release: "Shop.Release"
end

This emits a :before_start migration command that runs through the built release and a matching down command that calls Shop.Release.rollback().

For multiple repos, HostKit emits one ordered command per repo. Down plans reverse that order:

elixir_app :shop do
  source github: "acme/shop", path: ".", ref: "main"
  phoenix host: "shop.example.com", secret_key_base: secret_env("SECRET_KEY_BASE")

  ecto release: "Shop.Release" do
    repo "Shop.Repo"
    repo "Shop.AnalyticsRepo"
  end
end

The default expressions are:

Shop.Release.migrate(Shop.Repo)
Shop.Release.rollback(Shop.Repo)

Use :migrate and :rollback for custom release functions when the defaults do not fit.
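
The ordering rule for multiple repos can be sketched with plain strings. This only illustrates the default expressions and the reversed down order described above; `EctoCommandsSketch` is a hypothetical helper, not part of the recipe.

```elixir
defmodule EctoCommandsSketch do
  # One migrate expression per repo, in declaration order.
  def up(release, repos) do
    Enum.map(repos, fn repo -> "#{release}.migrate(#{repo})" end)
  end

  # Down plans reverse the order: the last migrated repo rolls back first.
  def down(release, repos) do
    repos
    |> Enum.reverse()
    |> Enum.map(fn repo -> "#{release}.rollback(#{repo})" end)
  end
end

repos = ["Shop.Repo", "Shop.AnalyticsRepo"]
IO.inspect(EctoCommandsSketch.up("Shop.Release", repos))
IO.inspect(EctoCommandsSketch.down("Shop.Release", repos))
```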

OTP release artifacts

The OTP release recipe consumes a BEAM-native release artifact manifest written as ETF and expands it into ordinary HostKit resources. The app repository remains responsible for building the Mix release tarball; HostKit remains responsible for accounts, directories, env files, systemd, readiness, planning, apply, and down plans.

ReleaseKit is the reference producer for this manifest format. Applications should configure ReleaseKit artifact defaults and prebuild steps in application config, then run mix release_kit.artifact directly. For example, frontend assets belong in ReleaseKit prebuild steps such as ReleaseKit.Step.Volt, not in HostKit or app-specific artifact wrapper tasks.

Import the recipe explicitly:

use HostKit.DSL, recipes: [HostKit.Recipes.OTPRelease]

Then reference the manifest:

project :example do
  otp_release :demo_app,
    manifest: "_build/prod/demo_app.etf",
    port: 4000,
    base_dir: "/opt/example/demo_app",
    config_dir: "/etc/example/demo_app"
end

Use the :account_home option when an existing service account should keep a home directory outside the release base. Use the :env option to add deployment-specific clear environment variables to the generated service env file without rebuilding the artifact manifest:

otp_release :demo_app,
  manifest: "_build/prod/demo_app.etf",
  account_home: "/var/lib/demo_app/home",
  env: %{"APP_DATA_DIR" => "/srv/demo"}

The manifest is decoded with:

:erlang.binary_to_term(binary, [:safe])

HostKit does not embed release tarball bytes into the plan. The tarball path recorded in the manifest must be available to the target where the generated unpack command runs.
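
To see what the `[:safe]` option buys, the sketch below round-trips a manifest-like term and then tries to decode a binary that names an atom the VM has never seen. The manifest fields and the `ManifestSketch` module are hypothetical; the real schema is defined by the producer.

```elixir
defmodule ManifestSketch do
  # Decodes ETF with :safe, which refuses to create new atoms.
  def decode(binary) when is_binary(binary) do
    {:ok, :erlang.binary_to_term(binary, [:safe])}
  rescue
    ArgumentError -> {:error, :unsafe_or_invalid}
  end
end

# Hypothetical manifest fields, for illustration only.
manifest = %{name: "demo_app", version: "0.1.0", tarball: "/tmp/demo_app.tar.gz"}
IO.inspect(ManifestSketch.decode(:erlang.term_to_binary(manifest)))

# Hand-built ETF: version byte 131, SMALL_ATOM_UTF8_EXT (119), length, name.
# The name is kept as a string so the atom is never created in this VM.
name = "never_seen_atom_hk_9321"
unknown_atom = <<131, 119, byte_size(name), name::binary>>
IO.inspect(ManifestSketch.decode(unknown_atom))
```

The second call returns an error tuple instead of growing the atom table, which is why untrusted or externally produced manifests are decoded this way.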

RPC service bindings

rpc models service-to-service RPC wiring. HostKit owns service names, listener locations, module-level bindings, and local socket access; the runtime RPC protocol owns exact operations, typespecs, and handshakes.

Same-host RPC defaults to Unix sockets instead of TCP ports:

service :catalog do
  daemon do
    listen :rpc, protocol: :rpc
  end

  rpc do
    expose Catalog.API
    expose Catalog.Admin
  end
end

service :web do
  bind :catalog
end

With roots run: "/run/apps", the default RPC socket for catalog is:

/run/apps/catalog/rpc.sock

The provider side uses expose for RPC modules. Do not list every runtime operation in HostKit; SafeRPC or another RPC runtime should describe exact callable functions during handshake.

The caller side uses bind to declare Docker-like service bindings. bind :catalog means the current service may discover and connect to catalog's exposed RPC modules. Use bind :catalog, modules: [Catalog.Admin] only when the caller should narrow the binding metadata to a subset.

HostKit validates RPC bindings during planning:

  • the target service must exist;
  • the target listener must exist;
  • the target service must expose requested modules when a module subset is specified;
  • a service cannot bind itself.
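
The four checks above can be sketched as a single function over plain maps. The data shape (`listeners`, `exposes`), the error tuples, and the `RpcBindingSketch` module are assumptions made for illustration; HostKit's actual validator and diagnostics may differ.

```elixir
defmodule RpcBindingSketch do
  # services: %{name => %{listeners: [atom], exposes: [module]}} (hypothetical shape).
  # opts: :listener (defaults to :rpc) and :modules (optional subset).
  def validate(services, caller, target, opts \\ []) do
    listener = Keyword.get(opts, :listener, :rpc)
    requested = Keyword.get(opts, :modules, [])
    provider = Map.get(services, target)

    cond do
      caller == target ->
        {:error, :self_binding}

      is_nil(provider) ->
        {:error, {:unknown_service, target}}

      listener not in provider.listeners ->
        {:error, {:unknown_listener, listener}}

      (requested -- provider.exposes) != [] ->
        {:error, {:not_exposed, requested -- provider.exposes}}

      true ->
        :ok
    end
  end
end

services = %{
  catalog: %{listeners: [:rpc], exposes: [Catalog.API, Catalog.Admin]},
  web: %{listeners: [], exposes: []}
}

IO.inspect(RpcBindingSketch.validate(services, :web, :catalog))
IO.inspect(RpcBindingSketch.validate(services, :web, :catalog, modules: [Catalog.Internal]))
```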

For each service with RPC bindings, HostKit emits a caller-local SafeRPC binding term under the service runtime directory and injects its path as HOSTKIT_RPC_BINDINGS into the caller's systemd services:

/run/<service>/rpc.etf

With service-scoped runtime roots, the path becomes, for example:

/run/apps/web/rpc.etf

The ETF file contains only bindings for that caller:

%{
  catalog: %{
    listener: :rpc,
    socket: "/run/apps/catalog/rpc.sock",
    upstream: "unix:/run/apps/catalog/rpc.sock",
    modules: [Catalog.API, Catalog.Admin],
    unit: "catalog.service"
  }
}

Consumers read it with:

bindings =
  System.fetch_env!("HOSTKIT_RPC_BINDINGS")
  |> File.read!()
  |> :erlang.binary_to_term([:safe])
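
A caller usually wants one socket out of that term rather than the whole map. The sketch below wraps the read shown above in a hypothetical `BindingsSketch` module and adds a lookup; the temp-file setup only simulates what HostKit would write, and the map follows the example shape from this section.

```elixir
defmodule BindingsSketch do
  # Reads the ETF file whose path is stored in the given environment variable.
  def load(env_var \\ "HOSTKIT_RPC_BINDINGS") do
    env_var
    |> System.fetch_env!()
    |> File.read!()
    |> :erlang.binary_to_term([:safe])
  end

  # Looks up the socket path for one bound service.
  def socket(bindings, service) do
    case bindings do
      %{^service => %{socket: socket}} -> {:ok, socket}
      _ -> {:error, {:not_bound, service}}
    end
  end
end

# Simulate the file and environment variable a caller service would receive.
path = Path.join(System.tmp_dir!(), "hostkit-rpc-bindings-sketch.etf")

File.write!(
  path,
  :erlang.term_to_binary(%{
    catalog: %{listener: :rpc, socket: "/run/apps/catalog/rpc.sock", modules: [Catalog.API]}
  })
)

System.put_env("HOSTKIT_RPC_BINDINGS", path)

bindings = BindingsSketch.load()
IO.inspect(BindingsSketch.socket(bindings, :catalog))
IO.inspect(BindingsSketch.socket(bindings, :billing))
```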

HostKit also derives the local access boundary from bind. The provider RPC socket metadata defaults to the provider service user/group with mode 0660, and the caller service account is added to the provider service group when an account resource is declared for the caller. For example, bind :catalog lets the web service account join the catalog service group so it can open /run/apps/catalog/rpc.sock.

Gatehouse/SafeRPC config can build on the same metadata later.

Use TCP explicitly only when the RPC endpoint must cross a host/container boundary:

daemon do
  listen :rpc, protocol: :rpc, port: 4451, on: :loopback
end

Providers

Providers can contribute DSL modules, resource types, renderers, validators, and read/plan/apply lifecycle operations. Systemd and Unitctl are core primitives, not providers; integrations such as Caddy should be providers.

use HostKit.DSL, providers: [HostKit.Providers.Caddy]

project :demo do
  provider :caddy, HostKit.Providers.Caddy do
    set :sites_dir, "/etc/caddy/sites"
  end

  service :web do
    daemon do
      exec ["/opt/web/bin/server"]
      listen :http, port: 4000
    end

    caddy_site "example.com", path: "web.caddy" do
      encode [:zstd, :gzip]
      reverse_proxy :http
    end
  end
end

Providers should keep generated resources inspectable. For example, the Gatus provider is a thin structured-config helper: it emits an ordinary yaml/2 config resource rather than hiding a daemon or runtime lifecycle.

use HostKit.DSL, providers: [HostKit.Providers.Gatus]

project :demo, providers: [HostKit.Providers.Gatus] do
  service :api do
    file "/srv/api/health.txt", content: "ok"

    monitor :http,
      name: "API",
      group: "demo",
      url: "https://api.example.com/health",
      interval: "1m",
      expect: [status: 200],
      alerts: [:telegram]
  end

  service :monitoring do
    gatus_config path(:config, "gatus.yaml"), owner: "root", group: service_user(), mode: 0o640 do
      web address: "127.0.0.1", port: 8080
      gatus_storage :sqlite, path: path(:state, "gatus.db")

      telegram_alerting token: "${MONITORING_TELEGRAM_BOT_TOKEN}", id: "${MONITORING_TELEGRAM_CHAT_ID}" do
        default_alert enabled: true, "failure-threshold": 3, "success-threshold": 2
      end

      gatus_monitor_endpoints order: ["API"]
    end
  end
end

Instances and nested hosts

Top-level host declarations describe existing connection targets. instance declarations describe lifecycle-managed compute boundaries with backend-selected lifecycle and normal HostKit contents nested inside.

use HostKit.DSL

project :demo do
  instance :demo_vm do
    backend :incus
    image "images:ubuntu/24.04"
    kind :container
    lifecycle :ephemeral

    expose :ssh, host: 2222, guest: 22
    expose :web, host: 18080, guest: 80

    target_host :guest

    host :guest, at: "127.0.0.1" do
      ssh do
        user "root"
        password "hostkit-demo"
        port 2222
        accept_hosts true
      end
    end

    service :web do
      package :caddy

      daemon do
        exec ["/usr/bin/env", "true"]
        listen :http, port: 80
      end
    end
  end
end

The instance owns compute lifecycle metadata (backend, image, kind, lifecycle, expose). The nested host owns connection metadata. Nested services/resources are ordinary HostKit declarations scoped to the instance contents. Plans emit the instance lifecycle resource first, then nested content resources annotated with the nested host target so read/apply operations run through that endpoint. If an instance declares more than one nested host, use target_host :name to choose the endpoint for nested content resources; otherwise HostKit uses the first nested host.
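
The endpoint selection rule at the end of that paragraph can be sketched as a small function. The `TargetHostSketch` module and its error tuples are hypothetical; the source only states the explicit-choice and first-host behaviour, so the error cases here are assumptions.

```elixir
defmodule TargetHostSketch do
  # hosts: nested host names in declaration order.
  # target_host: the explicit `target_host :name` choice, or nil when absent.
  def resolve(hosts, target_host \\ nil)

  def resolve([], _target_host), do: {:error, :no_nested_host}

  # No explicit choice: fall back to the first nested host.
  def resolve([first | _rest], nil), do: {:ok, first}

  # Explicit choice: it must name one of the nested hosts.
  def resolve(hosts, name) do
    if name in hosts, do: {:ok, name}, else: {:error, {:unknown_host, name}}
  end
end

IO.inspect(TargetHostSketch.resolve([:guest, :console]))
IO.inspect(TargetHostSketch.resolve([:guest, :console], :console))
```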

Down plans delete lifecycle :ephemeral instances after their nested content has been rolled back. Persistent instances are intentionally skipped in down plans and reported as warnings rather than destroyed implicitly.

Backend implementations are intentionally separate from the generic DSL. Incus is implemented as a backend for instance, not as a user-facing incus_machine DSL. The Incus backend maps expose declarations to Incus proxy devices.

Backend configuration stays on the backend declaration instead of leaking backend-specific flags into generic plan/apply commands:

instance :demo_vm do
  backend :incus, sudo: true, project: "hostkit"
end

For multi-line configuration, use backend options:

instance :demo_vm do
  backend :incus do
    option :sudo, true
    option :project, "hostkit"
  end
end

Backend authors implement HostKit.Instance.Backend:

  • read/2 returns the observed instance or nil,
  • apply/2 creates/starts/configures/waits for the instance,
  • delete/2 destroys an instance when an ephemeral down plan requests it.

Backends should emit apply events for long-running lifecycle work so CLI and Livebook progress remain mailbox-first.

Host bootstrap packages and mise-managed runtimes

HostKit can install OS packages through the target package manager. The DSL is distribution-neutral by default and can be pinned to a manager when needed.

bootstrap do
  package :ca_certificates
  package :build_essential, as: "build-essential", update: true
end

HostKit can also bootstrap mise and install system-wide tool versions. This is intended for host bootstrap and workspace agents; application services should still prefer packaged release artifacts where possible.

bootstrap do
  mise do
    tool :erlang, "29.0.2"
    tool :elixir, "1.20.1"
  end
end

This applies through the mise CLI contract: it installs the binary with mise.run when missing, then runs mise install --system with MISE_SYSTEM_DATA_DIR set.

Package planning resolves semantic package names through Repology and caches responses in .host_kit/cache/repology for 24 hours by default. Use locks for deterministic apply:

mix host_kit.plan --write-package-lock host_kit.package.lock infra/config.exs
mix host_kit.apply --package-lock host_kit.package.lock --confirm infra/config.exs

Plan/apply artifacts make remote changes inspectable before apply. Prefer declaring the remote host in normal .exs HostKit config and selecting it with --host:

use HostKit.DSL

project :infra do
  host :prod, at: "host.example" do
    ssh do
      user "root"
      identity_file Path.expand("~/.ssh/id_ed25519")
      password secret_env("HOSTKIT_SSH_PASSWORD")
      accept_hosts true
      retry attempts: 3, base_delay: 250, max_delay: 2_000
    end
  end
end

mix host_kit.plan --host prod \
  --package-lock host_kit.package.lock \
  --out host_kit.plan.json infra/config.exs

mix host_kit.apply --host prod \
  --plan host_kit.plan.json --confirm infra/config.exs

ssh retry: ... is an SSH transport policy. It retries connection establishment for transient SSH startup/network failures; it does not blindly rerun arbitrary deployment commands after a command has been sent to the remote host. Use retry: 3 as shorthand for three attempts, retry: false to disable, or keyword options with :attempts, :base_delay/:base_delay_ms, and :max_delay/:max_delay_ms. Retry progress is emitted as apply events and mirrored to Logger for collection.
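
To make the retry options concrete, the sketch below normalizes the documented shorthand forms and computes the waits between connection attempts. The doubling schedule is an assumption (the source names a base and a maximum delay but not the growth curve), and `RetrySketch` is a hypothetical module, not HostKit's transport code.

```elixir
defmodule RetrySketch do
  # Normalizes the documented shapes: false, an integer, or keyword options.
  def normalize(false), do: :disabled
  def normalize(n) when is_integer(n) and n > 0, do: [attempts: n]
  def normalize(opts) when is_list(opts), do: opts

  # Waits (in ms) between attempts: one fewer than the number of attempts.
  # ASSUMPTION: delays double from base_delay and are capped at max_delay.
  def delays(attempts, base_delay, max_delay) when attempts >= 1 do
    base_delay
    |> Stream.iterate(&(&1 * 2))
    |> Stream.map(&min(&1, max_delay))
    |> Enum.take(attempts - 1)
  end
end

IO.inspect(RetrySketch.normalize(3))
IO.inspect(RetrySketch.delays(3, 250, 2_000))
IO.inspect(RetrySketch.delays(6, 250, 2_000))
```

Only connection establishment would be wrapped by such a schedule; a command that has already been sent to the remote host is not rerun.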

Plan artifacts are JSON and intended to be inspectable. They include an artifact version, target metadata, dumped project/resources/changes, source identities, diagnostics, aggregate resource/action statistics, source-location metadata on changes where available, and structured diffs for resources that support semantic review. Structured diffs are generated through HostKit's diff wrapper around JSON Patch concepts; HostKit stores its own stable diff structs rather than exposing the dependency as the artifact contract. Dotenv/INI/YAML resources diff public keys or paths. Templates diff public assign metadata and redacted assign names, not arbitrary rendered text. Secret references are stored as references, not values, for example:

{
  "$type": "struct",
  "module": "Elixir.HostKit.Secret",
  "fields": {
    "source": {
      "$type": "tuple",
      "items": [
        {"$type": "atom", "value": "env"},
        "HOSTKIT_SSH_PASSWORD"
      ]
    }
  }
}

secret_env/1 records an environment-backed secret reference and resolves it only at the control-plane boundary that needs the value. Use it for HostKit's own credentials, such as SSH passwords or future provider API tokens.

Target application environment files use contextual env declarations. Inside service, env :name do ... end declares a managed env file at the service's config path. Inside daemon, env :name attaches that same file to the systemd unit:

service :app do
  env :runtime do
    set :mix_env, :prod
    secret :database_url, env: "DATABASE_URL"
  end

  daemon do
    env :runtime
    exec ["/opt/app/bin/server"]
  end
end

Use dotenv path do ... end when you need an explicit dotenv-format file at a specific path.

Raw SSH flags remain available as an escape hatch: --remote, --user, --port, --identity-file, --password, and --password-env.

For Linux integration testing, use Incus as the lightweight native container/VM backend:

HOSTKIT_INCUS_SUDO=true HOSTKIT_SSH_PUBLIC_KEY=$HOME/.ssh/id_ed25519.pub \
  scripts/incus_integration_vm.sh ensure
HOSTKIT_INCUS_SUDO=true scripts/incus_integration_vm.sh ip

Set HOSTKIT_INCUS_TYPE=vm to launch an Incus VM instead of the default container, and HOSTKIT_INCUS_INSTANCE=name to change the instance name. Run the remote CLI integration against Incus with HOSTKIT_INTEGRATION_TOOL=incus, or against a pre-existing host declared in .exs config with HOSTKIT_INTEGRATION_TOOL=remote HOSTKIT_INTEGRATION_CONFIG=examples/integration_hosts.example.exs.

A real remote validation can use the same host config and a shell-provided secret:

HOSTKIT_SSH_PASSWORD='...' \
HOSTKIT_INTEGRATION_TOOL=remote \
HOSTKIT_INTEGRATION_CONFIG=examples/integration_hosts.example.exs \
mix test test/integration/cli_remote_test.exs --include integration

Project-local DSLs

Use HostKit.ProjectDSL in consuming projects to build local conventions without baking them into HostKit. Load project-local DSL files explicitly through the runtime API or Mix task --require option:

# infra/toys_infra.exs
defmodule ToysInfra do
  use HostKit.ProjectDSL

  root :source, "/opt/toys/src"
  root :data, "/srv/toys"
  root :state, "/var/lib/toys"
  root :config, "/etc/toys"

  prefix :user, "toys-"
  prefix :unit, "toys-"

  defservice :toy_service do
    let :service_user, do: prefixed(:user, service_name())
    let :unit_name, do: prefixed(:unit, service_name()) <> ".service"

    path :source_dir, root(:source), service_name()
    path :data_dir, root(:data), service_name()
    path :state_dir, root(:state), service_name()
    path :config_dir, root(:config), service_name()

    macro :standard_user do
      account service_user(), system: true, home: state_path("home")
    end
  end
end

# infra/config.exs
use HostKit.DSL, providers: [HostKit.Providers.Caddy]
use ToysInfra

project :toys do
  toy_service :exograph do
    standard_user()

    systemd_service unit_name() do
      working_directory source_dir()
      read_write_paths [data_dir(), state_dir(), source_dir()]
    end
  end
end

Runtime API

{:ok, project} = HostKit.load("infra/config.exs", require: ["toys_infra.exs"])
{:ok, plan} = HostKit.plan(project)
#=> %HostKit.Plan{changes: [%HostKit.Change{action: :create, ...}]}

prod = HostKit.Target.ssh(:prod, host: "elixir.toys", user: "dannote", sudo: true)
{:ok, remote_plan} = HostKit.plan(project, target: prod, reader: HostKit.Remote)

HostKit.format_plan(plan)
execution_graph = HostKit.Plan.ExecutionGraph.build(plan)
HostKit.Plan.ExecutionGraph.format(execution_graph)
{:ok, results} = HostKit.apply(plan, dry_run: true)

# Supported apply resources include accounts, directories, files, structured configs,
# templates, symlinks, env files, systemd units, commands, packages, and provider-rendered files.
{:ok, results} = HostKit.apply(plan, confirm: true, sudo: true)

# Command and filesystem operations are routed through a runner boundary.
{:ok, results} = HostKit.apply(plan, confirm: true, runner: HostKit.Runner.Local)

prod = HostKit.Target.ssh(:prod, host: "elixir.toys", user: "dannote", sudo: true)

{:ok, results} = HostKit.apply(plan, target: prod, confirm: true)

{:ok, conn} = HostKit.Runner.SSH.Connection.open(host: "elixir.toys", user: "dannote")
try do
  prod = HostKit.Target.ssh(:prod, runner: {HostKit.Runner.SSH.Connection, conn: conn}, sudo: true)
  {:ok, remote_plan} = HostKit.plan(project, target: prod, reader: HostKit.Remote)
after
  HostKit.Runner.SSH.Connection.close(conn)
end

{:ok, unit} = HostKit.Render.render(project, {:systemd_service, "toys-exograph.service"})

Plans can also be inspected as an execution dependency graph. The graph is derived from active create/update/delete changes and records why ordering exists: declared depends_on, parent directories, owner/group accounts, command source inputs, symlink target paths, systemd timer/service relationships, systemd service file/path references, and systemd readiness checks. It is currently an inspection/debug artifact; future parallel apply can consume the same graph without changing the plan format.

mix host_kit.plan infra/config.exs --host prod --show-graph
mix host_kit.plan infra/config.exs --host prod --graph-format json

The JSON graph output is a JSON-safe map with display labels and HostKit.Resource.dump/1 terms for resource ids; it does not encode raw Elixir structs or embed full before/after resource payloads. See Parallel apply design for how this graph may later feed a bounded scheduler.
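
The idea of an ordering graph whose edges record why the ordering exists can be sketched with Erlang's standard `:digraph` module. The resource ids, edge reasons, and `GraphSketch` module are hypothetical; this is not the `HostKit.Plan.ExecutionGraph` implementation.

```elixir
defmodule GraphSketch do
  # edges: [{dependency, dependent, reason}]; the reason is stored as the edge label.
  # Returns {:ok, order} with dependencies first, or {:error, :cycle}.
  def order(edges) do
    graph = :digraph.new()

    try do
      Enum.each(edges, fn {from, to, reason} ->
        :digraph.add_vertex(graph, from)
        :digraph.add_vertex(graph, to)
        :digraph.add_edge(graph, from, to, reason)
      end)

      # topsort/1 returns false when the graph contains a cycle.
      case :digraph_utils.topsort(graph) do
        false -> {:error, :cycle}
        sorted -> {:ok, sorted}
      end
    after
      # digraphs are backed by ETS tables and must be deleted explicitly.
      :digraph.delete(graph)
    end
  end
end

edges = [
  {{:account, "app"}, {:directory, "/srv/app"}, :owner_account},
  {{:directory, "/srv/app"}, {:file, "/srv/app/config"}, :parent_directory}
]

IO.inspect(GraphSketch.order(edges))
```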

Storage volumes

HostKit models storage as named metadata instead of repeated path strings:

volume =
  HostKit.Storage.volume(:repositories,
    path: "/srv/toys/forgejo/repositories",
    owner: "toys-forgejo",
    group: "toys-forgejo",
    mode: 0o750,
    backup: true
  )

directory HostKit.Storage.directory(volume)
read_write_paths HostKit.Storage.read_write_paths([volume])

Service conventions can derive these paths without project-specific macros and later reuse the same volume metadata for systemd sandboxing, Unitctl transient runtimes, and backups.

project :toys do
  roots data: "/srv/toys", config: "/etc/toys"
  prefixes user: "toys-", unit: "toys-"

  service :forgejo do
    storage :repositories, under: :data, path: "repositories", mode: 0o750, backup: true
    storage :config, under: :config, owner: "root", group: service_user(), writable: false, secret: true

    daemon unit_name() do
      run user: service_user(), read_write_paths: writable_storage_paths()
    end
  end
end

HostKit agent

HostKit can run as a supervised OTP application. The supervision tree currently starts agent state and a monitor worker:

HostKit.Agent.status()
HostKit.Agent.configure(project: project, target: HostKit.Target.local(:prod))
HostKit.Agent.run_plan()
HostKit.Agent.run_monitor()

HostKit can also declare its own outer systemd supervisor unit:

HostKit.Agent.Systemd.service(
  exec_start: ["/opt/host_kit/bin/host_kit", "agent", "--config", "/etc/host_kit/config.exs"]
)

State snapshots can be written for audit/drift history:

HostKit.State.write(plan, "/var/lib/host_kit/state/latest-plan.json")
HostKit.State.read("/var/lib/host_kit/state/latest-plan.json")

This gives a clean two-layer supervision model: OTP inside the BEAM and systemd outside it.

Firewall policy

HostKit can declare project- or host-scoped firewall policy:

firewall do
  allow tcp: 22, from: :any
  allow tcp: [80, 443], from: :any
  allow tcp: 9100, from: {10, 44, 0, 0, 24}
  deny :all
end

Host-scoped policy lives inside host:

host :prod, at: "elixir.toys" do
  firewall do
    allow tcp: 22, from: :any
    deny :all
  end
end

Extract, render, plan, and apply policies with:

HostKit.Firewall.policies(project)
HostKit.Firewall.Nftables.render(policy)
HostKit.plan(project, reader: HostKit.Local)
HostKit.apply(plan, confirm: true, nft_reload: true)

Firewall policy is written to /etc/nftables.d/hostkit.nft by default and validated with nft -c -f before optional reload.

Workspace inside monitoring

Workspace services can declare checks that are intended to run inside the sandbox later via a workspace agent:

workspace :blog, owner: :alice do
  service :preview do
    inside do
      monitor :mix, task: "test", every: "5m"
      monitor :port, port: 4000
      monitor :git, clean: true
    end
  end
end

Extract them with:

HostKit.Workspace.inside_monitors(project)

Workspace execution and tenants

Tenants can own workspaces:

tenant :alice, quota: [memory: "4G"] do
  agent port: 4173
end

Workspace command specs can be built for transient execution:

HostKit.Workspace.exec_spec(project, :alice, :blog, ["mix", "test"])
HostKit.Workspace.exec(project, :alice, :blog, ["mix", "test"])

Inside monitors currently return :pending_workspace_agent, reserving execution for the sandbox agent boundary.

OpenTelemetry Collector config

Telemetry declarations can be converted to an OpenTelemetry Collector config map:

HostKit.OtelCollector.config(project, endpoint: "otel.example:4317")

Workspace sandbox profiles

Systemd-backed isolation profiles can be applied inside daemons:

workspace :blog, owner: :alice do
  service :preview do
    daemon do
      exec ["mix", "phx.server"]

      isolate :vibe_dev do
        writable path(:data)
        network :loopback
      end
    end
  end
end

Profiles include :vibe_dev, :strict_app, and :untrusted, and can be overridden inside isolate:

isolate :untrusted do
  memory_max "256M"
  private_network false
end

Workspace preview helper

Workspace services can expose a preview route with a named listener and Caddy site:

workspace :blog, owner: :alice do
  service :preview do
    daemon unit_name() do
      run exec_start: ["mix", "phx.server"]
    end

    preview :http, port: 4000, domain: "alice-blog.dev.example.com"
  end
end

This expands to listen :http, a Caddy reverse proxy to that listener, an HTTP monitor, telemetry metadata, and Caddy access-log metadata.

Workspace agent helper

Workspaces can declare the default sandbox agent service as ordinary HostKit resources:

workspace :blog, owner: :alice do
  agent port: 4173
end

This expands to a service with an account, workspace directory, systemd daemon, loopback listener, logs, telemetry, systemd monitor, and loopback-only network policy.

Workspace scope

workspace scopes ordinary HostKit DSL for user sandboxes while keeping resources inspectable:

workspace :blog, owner: :alice do
  service :preview do
    directory path(:data), mode: :private_dir

    daemon unit_name() do
      run exec_start: ["mix", "phx.server"]
      listen :http, port: 4000, on: :loopback
    end
  end
end

Inside a workspace, services get workspace metadata plus separate path and identity names:

path(:data)  # .../alice/blog/preview
unit_name()  # prefix-alice-blog-preview.service

Named listeners

Services can declare named listeners and reuse them from provider declarations:

daemon unit_name() do
  run exec_start: ["/usr/bin/env", "true"]
  listen :http, port: 3000, on: :loopback
end

caddy_site "web.example.com" do
  reverse_proxy :http
end

Named listeners are stored as service metadata and render Caddy upstreams like 127.0.0.1:3000 at the provider boundary.

Network addresses and policy

Network addresses can use Elixir tuple forms and semantic aliases:

listen 3000, on: :loopback
listen 4000, on: {127, 0, 0, 1}
network_policy deny: :all, allow: [:loopback, {10, 44, 0, 0, 24}]

Systemd services compile network policy to:

IPAddressDeny=any
IPAddressAllow=localhost 10.44.0.0/24
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
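
The mapping from tuple forms and aliases to the first two directives can be sketched as follows. Only the forms that appear in this section are handled, `NetAddrSketch` is a hypothetical module, and the `RestrictAddressFamilies` line is left out because its derivation is not described here.

```elixir
defmodule NetAddrSketch do
  # Semantic aliases seen in this section.
  def to_systemd(:all), do: "any"
  def to_systemd(:loopback), do: "localhost"

  # Five-element tuple: four IPv4 octets plus a prefix length.
  def to_systemd({a, b, c, d, prefix}), do: "#{a}.#{b}.#{c}.#{d}/#{prefix}"

  # Plain IPv4 tuple, formatted by the Erlang standard library.
  def to_systemd({_, _, _, _} = ip), do: ip |> :inet.ntoa() |> to_string()

  # Renders the deny/allow pair of a network policy as unit directives.
  def directives(policy) do
    deny = policy |> Keyword.get(:deny, []) |> List.wrap()
    allow = policy |> Keyword.get(:allow, []) |> List.wrap()

    [
      "IPAddressDeny=" <> Enum.map_join(deny, " ", &to_systemd/1),
      "IPAddressAllow=" <> Enum.map_join(allow, " ", &to_systemd/1)
    ]
  end
end

policy = [deny: :all, allow: [:loopback, {10, 44, 0, 0, 24}]]
Enum.each(NetAddrSketch.directives(policy), &IO.puts/1)
```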

Log management intent

Log management can be declared globally, per service, or on individual resources:

observability do
  logs driver: :journald,
       retention: "14d",
       ship: true,
       attributes: [deployment_environment: :prod]
end

Systemd service log declarations also add unit directives:

daemon unit_name() do
  run exec_start: ["/usr/bin/env", "true"]
  logs identifier: service_name(), stdout: :journal, stderr: :journal
end

Extract log intent with:

HostKit.Logs.configs(project)

Read recent journald logs through local or remote targets:

HostKit.Logs.read("toys-forgejo.service", target: prod, since: "1h")
HostKit.Logs.tail("toys-forgejo.service", target: prod, lines: 100)

OpenTelemetry collection intent

Observability defaults can be enabled once at project or service scope and inherited by resources:

observability do
  telemetry logs: true,
            metrics: true,
            traces: false,
            attributes: [deployment_environment: :prod]
end

Resource-level overrides are still available:

daemon unit_name() do
  run exec_start: ["/usr/bin/env", "true"]
  telemetry logs: :journald, metrics: false, service_name: service_name()
end

Extract collection intent with:

HostKit.Telemetry.signals(project)

Systemd services and Caddy sites get default collection intent even without global defaults:

# systemd: logs: :journald, metrics: :systemd
# caddy: logs: :access, metrics: :http

Monitoring metadata

Declarations can carry monitoring intent for a later monitoring service or config generator:

daemon do
  exec ["/usr/bin/env", "true"]
  listen :http, port: 4000
  monitor :systemd, expect: [state: :active], severity: :critical
end

caddy_site "web.example.com" do
  reverse_proxy :http
  monitor :http, url: "https://web.example.com", expect: [status: 200]
end

Extract, project, or run checks with:

HostKit.Monitor.checks(project)
HostKit.Monitor.endpoint_checks(project, group: "prod", interval: "1m")
HostKit.Providers.Gatus.endpoints_from_monitors(project)
HostKit.Monitor.run(project, target: prod)

Initial execution supports systemd state, HTTP status, filesystem existence, and command exit checks. Command monitors use the same exec: command shapes as command resources:

monitor :command,
  name: :dr_validate,
  exec: argv("/usr/local/sbin/dr-validate"),
  expect: [exit: 0],
  severity: :critical
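
A command check of this shape comes down to running the argv and comparing the exit status. The following is a minimal sketch that uses only `System.cmd/3` from the standard library; `CommandCheckSketch` and `run/2` are hypothetical names, not HostKit's implementation, and the sketch assumes `/usr/bin/env` exists on the machine running it.

```elixir
defmodule CommandCheckSketch do
  # Hypothetical helper for illustration only; not part of HostKit.
  # Runs the argv and compares the exit status with expect[:exit].
  def run([cmd | args], expect) do
    {_output, status} = System.cmd(cmd, args, stderr_to_stdout: true)

    if status == Keyword.fetch!(expect, :exit) do
      :ok
    else
      {:error, {:exit, status}}
    end
  end
end

IO.inspect(CommandCheckSketch.run(["/usr/bin/env", "true"], exit: 0))
IO.inspect(CommandCheckSketch.run(["/usr/bin/env", "false"], exit: 0))
```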

Endpoint projection currently turns HTTP monitors into provider-neutral external endpoint specs; providers such as Gatus can render those specs into concrete monitoring config.

Binary release layouts

Use release/2 when a service follows the common unpacked-binary pattern of a versions directory and a current/<name> symlink. It is only a helper: it emits ordinary directory/2 and symlink/2 resources that remain visible in plans.

service :gatus do
  release :gatus, version: "5.36.0", owner: "deploy", group: "deploy"

  daemon do
    exec [path(:opt, "current/gatus/gatus")]
  end
end

The default layout is:

  • versions directory: path(:opt, "releases/<name>")
  • current symlink: path(:opt, "current/<name>")
  • symlink target: <versions_dir>/<version>

Use current_dir: [owner: ..., group: ..., mode: ...] when HostKit should also manage the parent current directory. See Release design notes for the intended boundary before adding artifact download, activation, retention, or rollback behavior.
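
The default layout can be worked through with plain path joins. This is only a sketch: `ReleaseLayoutSketch`, `layout/3`, and the `/opt/toys` root are hypothetical stand-ins for whatever path(:opt, ...) resolves to in a real project.

```elixir
defmodule ReleaseLayoutSketch do
  # Hypothetical helper; it mirrors the three default paths listed above.
  def layout(opt_root, name, version) do
    versions_dir = Path.join([opt_root, "releases", to_string(name)])

    %{
      versions_dir: versions_dir,
      current_symlink: Path.join([opt_root, "current", to_string(name)]),
      symlink_target: Path.join(versions_dir, version)
    }
  end
end

# "/opt/toys" is an assumed root used only for this example.
IO.inspect(ReleaseLayoutSketch.layout("/opt/toys", :gatus, "5.36.0"))
```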

File modes

Mode values can be raw octal, semantic aliases, tuples, keywords, or capability lists:

mode: :secret_group_file
mode: {:rw, :r, nil}
mode: [owner: :rw, group: :r]
mode: [:setgid, :owner_rwx, :group_rwx, :other_rx]

Resources store normalized integer modes, so plan/apply remains simple.
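
To make the normalization step concrete, here is a rough sketch of how tuple, keyword, and capability-list forms could collapse to one integer. `ModeSketch` is a hypothetical module, the bit mapping is the usual POSIX one, and semantic aliases such as :secret_group_file are left out because their definitions are not shown here.

```elixir
defmodule ModeSketch do
  import Bitwise

  # Hypothetical normalizer for illustration; HostKit's own rules may differ.
  def normalize(mode) when is_integer(mode), do: mode

  # {owner, group, other} tuple form, e.g. {:rw, :r, nil}.
  def normalize({owner, group, other}) do
    (bits(owner) <<< 6) ||| (bits(group) <<< 3) ||| bits(other)
  end

  def normalize(list) when is_list(list) do
    if Keyword.keyword?(list) do
      # Keyword form; missing classes are treated as no permissions.
      normalize({list[:owner], list[:group], list[:other]})
    else
      # Capability list; OR every capability into the result.
      Enum.reduce(list, 0, fn cap, acc -> acc ||| capability(cap) end)
    end
  end

  defp capability(:setuid), do: 0o4000
  defp capability(:setgid), do: 0o2000
  defp capability(:sticky), do: 0o1000

  defp capability(cap) do
    [who, perm] = cap |> Atom.to_string() |> String.split("_", parts: 2)
    shift = Map.fetch!(%{"owner" => 6, "group" => 3, "other" => 0}, who)
    bits(perm) <<< shift
  end

  defp bits(nil), do: 0
  defp bits(perm) when is_atom(perm), do: perm |> Atom.to_string() |> bits()

  defp bits(perm) when is_binary(perm) do
    perm
    |> String.graphemes()
    |> Enum.map(&Map.fetch!(%{"r" => 4, "w" => 2, "x" => 1}, &1))
    |> Enum.sum()
  end
end

IO.inspect(ModeSketch.normalize({:rw, :r, nil}), base: :octal)
IO.inspect(ModeSketch.normalize([:setgid, :owner_rwx, :group_rwx, :other_rx]), base: :octal)
```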

Env files and secrets

HostKit has a Dotenvy-validated dotenv resource for explicit env files. Secret values are resolved at apply time. Drift detection compares metadata and non-secret set entries with structured key-level diffs; secret entry values are not read into plan artifacts for comparison. Use secret KEY, env: :redacted for existing/generated env-file secrets that should be modeled but never rendered by HostKit. Secret sources support env: "NAME", file: "/run/secrets/name", and command: ["pass", "show", "name"].

service :web do
  env :runtime do
    set :MIX_ENV, :prod
    set :PORT, 4000
    secret :SECRET_KEY_BASE, env: "SECRET_KEY_BASE"
    secret :API_TOKEN, file: "/run/secrets/api-token"
    secret :GENERATED_TOKEN, env: :redacted
  end

  daemon do
    env :runtime
    exec ["/opt/web/bin/server"]
  end
end

For explicit paths, use dotenv alongside ini and yaml:

dotenv path(:config, "env"), owner: "root", group: service_user(), mode: 0o640 do
  set "MIX_ENV", "prod"
  set "PORT", 4000
  secret "GENERATED_TOKEN", env: :redacted
end
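
The three secret source shapes, plus the non-renderable :redacted marker, could be resolved at apply time roughly as follows. This is a sketch under assumptions: `SecretSourceSketch`, `resolve/1`, the error tuples, and the trailing-newline trimming are all hypothetical, not HostKit's behavior.

```elixir
defmodule SecretSourceSketch do
  # Hypothetical resolver; names and return shapes are illustrative only.

  # :redacted models an existing secret and can never be rendered.
  def resolve(env: :redacted), do: {:error, :not_renderable}

  def resolve(env: name) when is_binary(name) do
    case System.get_env(name) do
      nil -> {:error, {:missing_env, name}}
      value -> {:ok, value}
    end
  end

  def resolve(file: path) do
    with {:ok, data} <- File.read(path) do
      {:ok, String.trim_trailing(data, "\n")}
    end
  end

  def resolve(command: [cmd | args]) do
    case System.cmd(cmd, args) do
      {output, 0} -> {:ok, String.trim_trailing(output, "\n")}
      {_output, status} -> {:error, {:exit, status}}
    end
  end
end

# HK_SKETCH_TOKEN is a made-up variable name for this example.
System.put_env("HK_SKETCH_TOKEN", "abc123")
IO.inspect(SecretSourceSketch.resolve(env: "HK_SKETCH_TOKEN"))
IO.inspect(SecretSourceSketch.resolve(env: :redacted))
```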

Structured config files

Use ini/2 and yaml/2 when a managed file is naturally data. Structured config resources are first-class resources in plans and render to ordinary managed files during read/apply.

service :forgejo, path: "forgejo" do
  ini path(:config, "app.ini"), owner: "root", group: service_user(), mode: 0o640 do
    set "APP_NAME", "elixir.toys git"

    section "server" do
      set "DOMAIN", "git.elixir.toys"
      set "ROOT_URL", "https://git.elixir.toys/"
      set "HTTP_PORT", 3000
      secret "LFS_JWT_SECRET", env: :redacted
    end

    section "database" do
      set "DB_TYPE", "sqlite3"
      set "PATH", path(:data, "forgejo.db")
    end
  end
end

Secret or redacted INI/YAML values are omitted from public drift comparison. For public values, HostKit produces structured plan diffs with operations, paths, before/after values, and human-readable output such as ~ server.HTTP_PORT: 3000 -> 4000; redacted values are reported as redacted paths without reading or storing their actual values.

HostKit decodes YAML with yaml_elixir for public-path comparison, renders YAML scalars with ymlr, and uses JSON Patch-style operations internally for structured diffs; it does not hand-roll YAML quoting or parsing.

env: :redacted is useful for modeling existing generated secrets without storing or rendering them, and it is intentionally not renderable during apply. Use a resolvable secret source (env, file, or command) when HostKit should render the file during apply:

secret "TOKEN", env: "APP_TOKEN"
secret "TOKEN", file: "/run/secrets/app-token"
secret "TOKEN", command: ["pass", "show", "app/token"]
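
The key-level diff behavior for public and redacted paths can be approximated with a small sketch over flat "section.key" maps. `ConfigDiffSketch`, its operation tuples, and the flat-map input are hypothetical simplifications; only the `~ path: old -> new` line format comes from the description above.

```elixir
defmodule ConfigDiffSketch do
  # Hypothetical diff for illustration. Redacted paths are reported by name
  # only; their values are never looked up in either map.
  def diff(before, desired, redacted \\ []) do
    public_ops =
      (Map.keys(before) ++ Map.keys(desired))
      |> Enum.uniq()
      |> Enum.sort()
      |> Enum.reject(&(&1 in redacted))
      |> Enum.flat_map(fn path ->
        case {Map.fetch(before, path), Map.fetch(desired, path)} do
          {{:ok, same}, {:ok, same}} -> []
          {{:ok, old}, {:ok, new}} -> [{:replace, path, old, new}]
          {:error, {:ok, new}} -> [{:add, path, new}]
          {{:ok, old}, :error} -> [{:remove, path, old}]
        end
      end)

    public_ops ++ Enum.map(redacted, &{:redacted, &1})
  end

  # Human-readable form for a changed public value.
  def format({:replace, path, old, new}) do
    "~ #{path}: #{inspect(old)} -> #{inspect(new)}"
  end
end

current = %{"server.HTTP_PORT" => 3000, "server.DOMAIN" => "git.elixir.toys"}
desired = %{"server.HTTP_PORT" => 4000, "server.DOMAIN" => "git.elixir.toys"}

ops = ConfigDiffSketch.diff(current, desired, ["server.LFS_JWT_SECRET"])
IO.inspect(ops)
IO.puts(ConfigDiffSketch.format(hd(ops)))
```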

YAML configs use Elixir keyword data for stable order and may contain redacted secret leaves:

yaml path(:config, "gatus.yaml"),
  content: [
    storage: [type: "sqlite", path: path(:state, "gatus.db")],
    alerting: [telegram: [token: :redacted, id: "chat-id"]],
    endpoints: [
      [
        name: "Forgejo",
        url: "https://git.elixir.toys",
        conditions: ["[STATUS] == 200"]
      ]
    ]
  ],
  owner: "root",
  group: service_user(),
  mode: 0o640

Elixir .exs files

Use exs/2 when the desired file is Elixir configuration code and should be represented as quoted AST rather than an EEx string template.

exs path(:config, "runtime.exs"), owner: "root", group: service_user(), mode: 0o640 do
  import Config

  config :my_app,
    url: unquote(value("https://example.com")),
    secret_key_base: unquote(secret("SECRET_KEY_BASE", env: "SECRET_KEY_BASE"))
end

The block is captured and rendered; it is not evaluated. HostKit currently interprets only strict placeholder forms inside unquote(...): value(literal) and secret(literal, literal_opts). Use templates for free-form text generation.
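
The capture-then-render idea can be reproduced with the standard library alone. The sketch below is an assumption about the general technique, not HostKit's code: it quotes a block with `unquote: false` so the placeholders stay in the AST, rewrites the two placeholder forms with `Macro.prewalk/2`, and prints the result with `Macro.to_string/1`. The `secrets` map is a hypothetical stand-in for resolved secret values.

```elixir
# `unquote: false` keeps unquote(...) calls in the AST instead of evaluating them.
quoted =
  quote unquote: false do
    import Config

    config :my_app,
      url: unquote(value("https://example.com")),
      secret_key_base: unquote(secret("SECRET_KEY_BASE", env: "SECRET_KEY_BASE"))
  end

# Hypothetical resolved secrets, supplied by hand for this sketch only.
secrets = %{"SECRET_KEY_BASE" => "s3cr3t"}

rendered =
  quoted
  |> Macro.prewalk(fn
    # Replace each strict placeholder form with a literal value.
    {:unquote, _, [{:value, _, [literal]}]} -> literal
    {:unquote, _, [{:secret, _, [name, _opts]}]} -> Map.fetch!(secrets, name)
    node -> node
  end)
  |> Macro.to_string()

# Nothing in the block was executed; it was only transformed and printed.
IO.puts(rendered)
```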

Templates

Use template/2 for deterministic EEx-rendered text resources. Templates are first-class resources in plans and render to ordinary managed files during read/apply.

service :forgejo, path: "forgejo" do
  template path(:config, "app.ini"),
    from: "templates/forgejo/app.ini.eex",
    assigns: %{
      domain: "git.elixir.toys",
      data_dir: path(:data),
      repositories_dir: path(:data, "repositories")
    },
    owner: "root",
    group: service_user(),
    mode: 0o640
end

In DSL configs, relative from: paths are resolved relative to the declaring config file. Runtime code may use absolute from: paths or an inline source: string:

HostKit.Resources.Template.new("/etc/app.conf",
  source: "port=<%= @port %>\n",
  assigns: %{port: 4000}
)

Templates support regular EEx bindings (<%= port %>) and assigns syntax (<%= @port %>). Assign keys must be atoms because they become EEx bindings. Keep templates inspectable and deterministic; do not hide runtime behavior in templates. Template assigns may contain %HostKit.Secret{} references; plans show public assign diffs and redacted assign names without resolving secret values. :redacted assign values are useful for modeling existing generated values but cannot be rendered or applied by HostKit.
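
The two binding styles can be tried directly with EEx from the Elixir standard library. This sketch calls `EEx.eval_string/2` itself rather than any HostKit function, so it shows only how the template syntax behaves, not how HostKit renders resources.

```elixir
# Assigns syntax: @port is looked up in the :assigns binding.
with_assigns = EEx.eval_string("port=<%= @port %>\n", assigns: %{port: 4000})

# Regular binding syntax: port is an ordinary variable binding.
with_binding = EEx.eval_string("port=<%= port %>\n", port: 4000)

IO.write(with_assigns)
IO.write(with_binding)
```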

Read, audit, and facts APIs

Runtime APIs are primary; Mix tasks wrap them. Besides HostKit.plan/2, projects expose focused read/audit helpers:

{:ok, current_resources} = HostKit.Project.read(project, target: HostKit.Target.local(:prod))
{:ok, audit_plan} = HostKit.Project.audit(project, target: HostKit.Target.local(:prod))
{:ok, facts} = HostKit.Facts.collect(HostKit.Target.local(:prod), only: [:os, :users, :systemd, :ports])

read/2 returns the current snapshots captured for each desired resource. audit/2 returns the same plan shape as HostKit.plan/2, so callers can inspect creates, updates, deletes, read errors, and no-ops without going through Mix tasks. CLI wrappers are available as mix host_kit.read, mix host_kit.audit, and mix host_kit.facts.

Command argv builder

Use argv/2 when a service command has many CLI options. It keeps argv inspectable without hand-writing long flag lists.

daemon :search do
  exec argv(path(:bin, "mix"),
    args: ["exograph.web"],
    opts: [
      backend: "duckdb",
      manifest_path: path(:data, "hex-manifest.json"),
      duckdb_memory_limit: "2GB",
      port: 4200
    ]
  )
end

Option styles are configurable:

argv("cmd", opts: [foo_bar: "baz"], style: :gnu)         # --foo-bar baz
argv("cmd", opts: [foo_bar: "baz"], style: :equals)      # --foo-bar=baz
argv("cmd", opts: [foo_bar: "baz"], style: :single_dash) # -foo-bar baz
argv("cmd", opts: [f: "baz", v: true], style: :short)    # -f baz -v
argv("cmd", opts: [foo_bar: "baz"], style: :underscore)  # --foo_bar baz

Options set to true emit a bare flag, options set to false or nil are omitted, and list values repeat the option once per element.
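
The style and value rules above fit in a few function clauses. The following is a hypothetical re-implementation for illustration, not HostKit's argv/2: `ArgvSketch.build/2` returns a plain list of strings, and placing args before opts is an assumption.

```elixir
defmodule ArgvSketch do
  # Hypothetical builder; returns a flat list of strings.
  def build(cmd, params \\ []) do
    args = Keyword.get(params, :args, [])
    opts = Keyword.get(params, :opts, [])
    style = Keyword.get(params, :style, :gnu)

    # Assumption: positional args come before rendered options.
    [cmd | args] ++ Enum.flat_map(opts, &option(&1, style))
  end

  # false and nil are omitted, true emits a bare flag, lists repeat the option.
  defp option({_key, value}, _style) when value in [false, nil], do: []
  defp option({key, true}, style), do: [flag(key, style)]

  defp option({key, values}, style) when is_list(values),
    do: Enum.flat_map(values, &option({key, &1}, style))

  defp option({key, value}, :equals), do: [flag(key, :equals) <> "=" <> to_string(value)]
  defp option({key, value}, style), do: [flag(key, style), to_string(value)]

  defp flag(key, :underscore), do: "--" <> Atom.to_string(key)
  defp flag(key, :single_dash), do: "-" <> dashed(key)
  defp flag(key, :short), do: "-" <> Atom.to_string(key)
  defp flag(key, _gnu_or_equals), do: "--" <> dashed(key)

  defp dashed(key), do: key |> Atom.to_string() |> String.replace("_", "-")
end

IO.inspect(ArgvSketch.build("cmd", opts: [foo_bar: "baz"], style: :equals))
IO.inspect(ArgvSketch.build("cmd", opts: [tag: ["a", "b"], quiet: false, verbose: true]))
```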

BEAM command builders wrap the same argv structure:

mix("ecto.migrate", opts: [quiet: true])
mix("exograph.web", opts: [port: 4200])
elixir("script.exs", opts: [name: "demo"])
elixir(args: ["--version"])
eval("IO.puts(:ok)")

These return %HostKit.CommandLine{} and can be used anywhere exec: or exec_start accepts command lines. In DSL context, mix, elixir, and eval default to path(:bin, "mix") / path(:bin, "elixir"), so projects can override the executable root with roots bin: ....

Systemd unit names

daemon, job, and schedule normalize systemd suffixes. Strings without a suffix get the right suffix; strings with .service/.timer are preserved. Atom names use the configured :unit prefix.

daemon "custom" do ... end      # custom.service
schedule "custom" do ... end    # custom.timer

daemon :health_alert do ... end  # e.g. toys-health-alert.service
schedule :health_alert do ... end

Use raw systemd_service/systemd_timer only when you intentionally want the low-level resource constructor.
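
The suffix and prefix rules can be summarized in a short sketch. `UnitNameSketch.normalize/3` is a hypothetical function, and converting underscores in atom names to dashes is inferred from the toys-health-alert.service example rather than from a documented rule.

```elixir
defmodule UnitNameSketch do
  # Hypothetical normalizer mirroring the naming rules described above.
  def normalize(name, suffix, prefix \\ "")

  # Atom names get the configured unit prefix and the resource suffix.
  def normalize(name, suffix, prefix) when is_atom(name) do
    prefix <> (name |> Atom.to_string() |> String.replace("_", "-")) <> suffix
  end

  # Strings keep an existing .service/.timer suffix, otherwise gain one.
  def normalize(name, suffix, _prefix) when is_binary(name) do
    if String.ends_with?(name, [".service", ".timer"]) do
      name
    else
      name <> suffix
    end
  end
end

IO.puts(UnitNameSketch.normalize("custom", ".service"))
IO.puts(UnitNameSketch.normalize(:health_alert, ".timer", "toys-"))
```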

Timer schedule helpers

schedule supports typed helpers for common systemd timer shapes while keeping raw systemd calendar syntax available through timer on_calendar: ....

schedule :backup do
  daily at: ~T[02:30:00]
  jitter "15m"
  persistent true
  wanted_by :timers
end

schedule :weekly_maintenance do
  weekly :monday, at: "03:00"
end

schedule :monthly_report do
  monthly day: 1, at: "04:00"
end

  • daily at: time renders *-*-* HH:MM:SS.
  • weekly day, at: time renders Day *-*-* HH:MM:SS.
  • monthly day: n, at: time renders *-*-NN HH:MM:SS.
  • Times may be Time structs or strict "HH:MM" / "HH:MM:SS" strings.
  • jitter value sets RandomizedDelaySec.
  • repeat_after value sets OnUnitActiveSec.
  • after_boot value and on_boot value set OnBootSec.
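
The calendar renderings listed above can be sketched as plain string building. `CalendarSketch` is a hypothetical module, and the three-letter weekday abbreviations are an assumption; systemd accepts both abbreviated and full English weekday names.

```elixir
defmodule CalendarSketch do
  # Hypothetical renderer for the daily/weekly/monthly shapes described above.
  @days %{
    monday: "Mon",
    tuesday: "Tue",
    wednesday: "Wed",
    thursday: "Thu",
    friday: "Fri",
    saturday: "Sat",
    sunday: "Sun"
  }

  def daily(opts), do: "*-*-* " <> clock(Keyword.fetch!(opts, :at))

  def weekly(day, opts) do
    Map.fetch!(@days, day) <> " *-*-* " <> clock(Keyword.fetch!(opts, :at))
  end

  def monthly(opts) do
    day = opts |> Keyword.fetch!(:day) |> Integer.to_string() |> String.pad_leading(2, "0")
    "*-*-" <> day <> " " <> clock(Keyword.fetch!(opts, :at))
  end

  # Times may be Time structs or strict "HH:MM" / "HH:MM:SS" strings.
  defp clock(%Time{} = time), do: time |> Time.truncate(:second) |> Time.to_iso8601()
  defp clock(<<_::binary-size(5)>> = hhmm), do: clock(hhmm <> ":00")
  defp clock(hhmmss) when is_binary(hhmmss), do: hhmmss |> Time.from_iso8601!() |> clock()
end

IO.puts(CalendarSketch.daily(at: ~T[02:30:00]))
IO.puts(CalendarSketch.weekly(:monday, at: "03:00"))
IO.puts(CalendarSketch.monthly(day: 1, at: "04:00"))
```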

Runtime isolation

HostKit uses shared runtime isolation structs for persistent systemd units and future transient Unitctl workloads:

sandbox = HostKit.Runtime.Sandbox.new(:strict_web)
resources = HostKit.Runtime.Resources.new(memory_max: "512M", cpu_quota: "50%")

service sandbox |> HostKit.Runtime.Sandbox.to_systemd_service_options()
service resources |> HostKit.Runtime.Resources.to_systemd_service_options()

Built-in profiles include :web_service, :strict_web, :strict_app, :small, :medium, and :large.

The daemon DSL exposes a human-oriented isolation block for common service hardening:

service :api do
  storage :data, mode: 0o750

  daemon do
    exec ["/opt/api/bin/server"]

    isolate do
      memory_max "512M"
      writable :data
      network :loopback
    end
  end
end

daemon do ... end derives the unit name from the enclosing service and enables it for multi-user.target by default. Use explicit systemd directives only when you need non-default unit behavior.

Runtime controls

HostKit exposes Unitctl as its core transient runtime layer:

{:ok, spec} =
  HostKit.Runtime.Spec.new(
    name: "demo-check",
    command: ["/usr/bin/env", "true"],
    sandbox: %{no_new_privileges: true, private_tmp: true}
  )

{:ok, instance} = HostKit.Runtime.start(spec)
{:ok, state} = HostKit.Runtime.status(instance)
:ok = HostKit.Runtime.stop(instance)

Mix tasks

mix host_kit.dump --require toys_infra.exs infra/config.exs
mix host_kit.plan --require toys_infra.exs infra/config.exs
mix host_kit.plan --require toys_infra.exs infra/config.exs --local
mix host_kit.plan --require toys_infra.exs infra/config.exs --local --ignore systemd_service:toys-exograph.service
mix host_kit.plan --require toys_infra.exs infra/config.exs --remote elixir.toys --user dannote --sudo
mix host_kit.apply --require toys_infra.exs infra/config.exs --local --dry-run
mix host_kit.render --require toys_infra.exs infra/config.exs systemd_service toys-exograph.service