Improve setup UI flow

SecAI-Hub · SecAI-Hub · commit d299c33a23e5 · 2026-04-28T19:06:58.000-07:00
diff --git a/README.md b/README.md
@@ -316,7 +316,7 @@ All CI jobs are defined in [`.github/workflows/ci.yml`](.github/workflows/ci.yml
 | Job | Workflow Link | What It Proves |
 |-----|--------------|---------------|
 | `go-build-and-test` | [View job](https://github.com/SecAI-Hub/SecAI_OS/actions/workflows/ci.yml) | 428 Go tests across 9 services with `-race` (build, test, vet) |
-| `python-test` | [View job](https://github.com/SecAI-Hub/SecAI_OS/actions/workflows/ci.yml) | 1,133 Python tests (unit/integration + adversarial/acceptance), ruff lint, bandit security scan (enforced on HIGH/HIGH), mypy type checking |
+| `python-test` | [View job](https://github.com/SecAI-Hub/SecAI_OS/actions/workflows/ci.yml) | 1,136 Python tests (unit/integration + adversarial/acceptance), ruff lint, bandit security scan (enforced on HIGH/HIGH), mypy type checking |
 | `appsec-lint` | [View job](https://github.com/SecAI-Hub/SecAI_OS/actions/workflows/ci.yml) | Hadolint for container build files and Semgrep project security rules |
 | `security-regression` | [View job](https://github.com/SecAI-Hub/SecAI_OS/actions/workflows/ci.yml) | Adversarial test suite: prompt injection, policy bypass, containment, recovery |
 | `supply-chain-verify` | [View job](https://github.com/SecAI-Hub/SecAI_OS/actions/workflows/ci.yml) | SBOM generation via Syft, cosign availability, provenance keywords in release/build workflows |
@@ -338,7 +338,7 @@ All CI jobs are defined in [`.github/workflows/ci.yml`](.github/workflows/ci.yml
 | [API Reference](docs/api.md) | HTTP API for all services |
 | [Policy Schema](docs/policy-schema.md) | Full policy.yaml schema reference |
 | [Security Status](docs/security-status.md) | Implementation status of all 54 milestones |
-| [Test Matrix](docs/test-matrix.md) | Test coverage: 1,561 tests across Go and Python (see [test-counts.json](docs/test-counts.json)) |
+| [Test Matrix](docs/test-matrix.md) | Test coverage: 1,564 tests across Go and Python (see [test-counts.json](docs/test-counts.json)) |
 | [Compatibility Matrix](docs/compatibility-matrix.md) | GPU, VM, and hardware support |
 | [Security Test Matrix](docs/security-test-matrix.md) | Security feature test coverage |
 | [FAQ](docs/faq.md) | Common questions |
@@ -462,7 +462,7 @@ for svc in airlock registry tool-firewall gpu-integrity-watch mcp-firewall \
   (cd services/$svc && go test -v -race ./...)
 done
 
-# Python tests (1,133 total)
+# Python tests (1,136 total)
 python -m pip install -r requirements-ci.txt
 PYTHONPATH=services python -m pytest tests/ -v
 
@@ -564,7 +564,7 @@ services/
   search-mediator/          Python -- Tor-routed web search (:8485)
   ui/                       Python/Flask -- Web UI (:8480)
   common/                   Python -- Shared utilities (audit, auth, mlock)
-tests/                      1,133 Python tests, 428 Go tests (1,561 total)
+tests/                      1,136 Python tests, 428 Go tests (1,564 total)
 docs/                       Architecture, API, threat model, install guides
 schemas/                    OpenAPI spec, JSON Schema for config files
 examples/                   Task-oriented walkthroughs
diff --git a/docs/security-status.md b/docs/security-status.md
@@ -19,7 +19,7 @@ All M5 security assurance criteria are met. The controls below have been impleme
 | Tool Firewall, default-deny policy | Implemented | M4 | Go tool-firewall service on :8475, default-deny egress |
 | Online Airlock, sanitization | Implemented | M5 | Go airlock service on :8490, disabled by default (privacy risk) |
 | Systemd sandboxing, kernel hardening, nftables | Implemented | M6 | Systemd unit hardening, sysctl tuning, nftables rules |
-| CI/CD, Go/Python tests, shellcheck | Implemented | M7 | GitHub Actions ci.yml. See docs/test-counts.json for current counts (428 Go, 1133 Python as of 2026-04-29) |
+| CI/CD, Go/Python tests, shellcheck | Implemented | M7 | GitHub Actions ci.yml. See docs/test-counts.json for current counts (428 Go, 1136 Python as of 2026-04-29) |
 | Image/video generation, diffusion worker | Implemented | M8 | Diffusion worker for image generation workloads |
 | Multi-GPU support (NVIDIA/AMD/Intel/Apple) | Implemented | M9 | CUDA, ROCm/HIP, XPU/Vulkan, Metal/MPS backends |
 | Tor-routed search, SearXNG, PII stripping | Implemented | M10 | Search mediator with Tor routing and PII redaction |
@@ -60,7 +60,7 @@ All M5 security assurance criteria are met. The controls below have been impleme
 | Production readiness hardening | Implemented | M45 | Incident recorder file-backed persistence (survives restarts), graceful shutdown (SIGTERM/SIGINT with connection draining) for all 9 Go services, HTTP server timeouts for mcp-firewall and gpu-integrity-watch, systemd production hardening (TimeoutStartSec, TimeoutStopSec, StartLimitInterval, StartLimitBurst) for all 12 daemon units, first-boot health validation script, audit log rotation via logrotate, CI dependency vulnerability scanning (govulncheck + pip-audit), production operations guide (upgrade, key rotation, capacity limits, monitoring) |
 | Operational maturity | Implemented | M46 | Bootstrap trust gap fix (cosign verify before unverified rebase, documented trust gap rationale), CI runs on all changes (removed blanket paths-ignore for .md files), Python quality gates (ruff lint + bandit security scan + split test suites into unit/integration and adversarial/acceptance), docs-validation CI job (broken link detection, required docs check, test-counts.json validation), production-readiness checklist (formal release gate), SLOs (availability/latency/correctness targets + alerting thresholds), release channel policy (stable/candidate/dev + versioning + upgrade paths + security patch SLA), support lifecycle (hardware matrix, driver versions, support windows, deprecation policy, scope boundaries), CI evidence table with current job descriptions and workflow links, sample verification output for verify-release.sh |
 | CI enforcement hardening | Implemented | M47 | Enforced vulnerability scanning: bandit fails CI on HIGH-severity/HIGH-confidence findings, govulncheck fails on unwaived Go vulns, pip-audit fails on unwaived Python vulns. Waiver mechanism (`.github/vuln-waivers.json`) with mandatory expiry dates for reviewed/accepted findings. mypy type checking gate for security-sensitive services (common, agent, quarantine, ui). Pinned reproducible Python CI dependencies (`requirements-ci.txt`). Go 1.23->1.25 upgrade fixing 12 stdlib CVEs (crypto/tls, crypto/x509, encoding/asn1, net/url, os). Flask 3.1.1->3.1.3 (GHSA-68rp-wp8r-4726). Verification-first bootstrap documentation (signed rebase as default quickstart, unverified bootstrap moved to labeled recovery section). |
-| Production hardening | Implemented | M48 | Build script fail-closed for required services, quarantine scanners, search mediator, and signing policy material; final binary verification gate; incident store fsync (f.Sync() before close on both incident persistence and audit log writes); GPU backend metadata recording (`/etc/secure-ai/gpu-backend.json` written at build time with backend/version/timestamp); llama-server watchdog (Type=notify wrapper with startup health gate + WatchdogSec=30 continuous monitoring); model catalog externalization (`/etc/secure-ai/model-catalog.yaml` with YAML loading + hardcoded fallback); circuit breaker for Python services; post-upgrade model verification in Greenboot; cosign key rotation documentation. Current automated suite: 428 Go + 1133 Python tests (1,561 total). |
+| Production hardening | Implemented | M48 | Build script fail-closed for required services, quarantine scanners, search mediator, and signing policy material; final binary verification gate; incident store fsync (f.Sync() before close on both incident persistence and audit log writes); GPU backend metadata recording (`/etc/secure-ai/gpu-backend.json` written at build time with backend/version/timestamp); llama-server watchdog (Type=notify wrapper with startup health gate + WatchdogSec=30 continuous monitoring); model catalog externalization (`/etc/secure-ai/model-catalog.yaml` with YAML loading + hardcoded fallback); circuit breaker for Python services; post-upgrade model verification in Greenboot; cosign key rotation documentation. Current automated suite: 428 Go + 1136 Python tests (1,564 total). |
 | Signed-first install path | Implemented | M49 | Signed bootstrap script (`secai-bootstrap.sh`) configures container signing policy (policy.json + registries.d + cosign public key) before first rebase -- eliminates unverified transport from production install path. Digest-pinned install flow (CI publishes image digest in build summary and release assets). First-boot setup wizard (interactive verification of image integrity, transport, vault setup, TPM2 sealing, health check). Signing policy files baked into OS image (`/etc/pki/containers/secai-cosign.pub`, `/etc/containers/registries.d/secai-os.yaml`, policy.json merge in build script). Recovery/dev bootstrap path separated into dedicated doc with clear warnings. |
 | Production operations package | Implemented | M50 | Backup script (`secai-backup.sh`) with full/config/logs/keys categories, age/gpg encryption, internal SHA256 manifest, LUKS header backup. Restore script (`secai-restore.sh`) with integrity verification, staging extraction, double-confirmation LUKS header restore, post-restore health check. Production operations doc extended with rollback decision matrix (Greenboot auto-rollback triggers + manual criteria), 5 break-glass recovery procedures (token loss, attestation failure, Level 1 panic lockout, signing policy break, Greenboot exhaustion), formal data retention policy (7 data classes with retention periods, disk capacity thresholds at 70/80/90/95%). |
 | Stronger observability | Implemented | M51 | Unified appliance health dashboard (trusted/degraded/recovery_required state derived from runtime attestor + integrity monitor + incident recorder). Live SLO compliance monitoring (in-process tracker measuring uptime % and P95 latency against docs/slos.md targets, 7-day rolling window). Webhook alerting hooks for containment events (fire-and-forget POST with retry, configurable per-event-type filtering in incident-containment.yaml). Forensic bundle export wired to HTTP mux (was implemented but unregistered), enriched with real audit log entries and policy digest, accessible via UI download button, Flask proxy, and CLI script (`secai-forensic.sh`). Recovery ceremony endpoints also wired (ack, reattest, status). |
diff --git a/docs/security-test-matrix.md b/docs/security-test-matrix.md
@@ -19,7 +19,7 @@ Last updated: 2026-04-29
 | Emergency wipe | tests/test_emergency_wipe.py | Python | 65 | 3-level panic escalation, secure deletion, vault destruction, recovery prevention |
 | Update verification | tests/test_update_rollback.py | Python | 74 | Signature verification, rollback triggers, version pinning, recovery |
 | Vault auto-lock | tests/test_vault_watchdog.py | Python | 21 | Idle detection, lock timer, UI lock/unlock controls |
-| Web UI security | tests/test_ui.py, tests/test_ui_cookies.py, tests/test_ui_file_handling.py | Python | 79 total | Route protection, input validation, CSP/cookie headers, upload/path handling |
+| Web UI security | tests/test_ui.py, tests/test_ui_cookies.py, tests/test_ui_file_handling.py | Python | 82 total | Route protection, input validation, CSP/cookie headers, setup completion, upload/path handling |
 | Tool firewall | services/tool-firewall/*_test.go | Go | 15 | Default-deny policy, rule evaluation, egress filtering |
 | Airlock | services/airlock/*_test.go | Go | 11 | Request sanitization, policy enforcement, disabled-by-default |
 | Trusted registry | services/registry/*_test.go | Go | 22 | Hash pinning, cosign verification, model fetch authorization |
@@ -72,15 +72,15 @@ Last updated: 2026-04-29
 |------|-------|-------|
 | Memory protection | 37 | Prevents secrets from leaking to disk |
 | Vault auto-lock | 21 | Automatic vault lock on idle |
-| Web UI security | 79 total | CSRF, CSP, cookie flags, input validation, upload/path handling |
+| Web UI security | 82 total | CSRF, CSP, cookie flags, setup completion, input validation, upload/path handling |
 
 ## Total Test Counts
 
 | Language | Current Automated Tests | Source of Truth |
 |----------|--------------------------|-----------------|
-| Python | 1133 | `docs/test-counts.json` and `pytest --collect-only` |
+| Python | 1136 | `docs/test-counts.json` and `pytest --collect-only` |
 | Go | 428 | `docs/test-counts.json` and `go test -v -count=1 ./...` |
-| **Total** | **1561** | Enforced by `.github/scripts/check-test-counts.sh` |
+| **Total** | **1564** | Enforced by `.github/scripts/check-test-counts.sh` |
 
 Security coverage overlaps heavily with functional coverage, so the feature tables above use exact file or service totals rather than attempting to split each test into exclusive "security" and "non-security" buckets.
 
diff --git a/docs/test-counts.json b/docs/test-counts.json
@@ -12,6 +12,6 @@
     "incident-recorder": 97
   },
   "go_total": 428,
-  "python_total": 1133,
-  "grand_total": 1561
+  "python_total": 1136,
+  "grand_total": 1564
 }
diff --git a/docs/test-matrix.md b/docs/test-matrix.md
@@ -12,7 +12,7 @@ Last updated: 2026-04-29
 | Language | Test Count | Runner |
 |----------|-----------|--------|
 | Go | 428 | `go test -race ./...` |
-| Python | 1133 | `pytest` |
+| Python | 1136 | `pytest` |
 | Shell | CI-scoped scripts plus Makefile target for all repo shell scripts | `shellcheck` |
 
 ## Go Tests (428 total)
@@ -29,7 +29,7 @@ Last updated: 2026-04-29
 | Integrity Monitor | services/integrity-monitor/ | 50 | Baseline computation, continuous scanning, violation detection, state machine, HMAC baselines, incident-recorder integration |
 | Incident Recorder | services/incident-recorder/ | 97 | Incident creation, auto-containment, lifecycle management, severity ranking, policy loading, containment execution, enforcement chain integration, recovery ceremony, severity escalation, forensic bundle export (M43), persistence durability (fsync) |
 
-## Python Tests (1133 total)
+## Python Tests (1136 total)
 
 | Test File | Location | Tests | Description |
 |-----------|----------|-------|-------------|
@@ -63,7 +63,7 @@ Last updated: 2026-04-29
 | test_search.py | tests/ | 36 | Search mediator, PII stripping, injection detection |
 | test_secure_boot.py | tests/ | 38 | Secure boot and measured boot behavior |
 | test_traffic_analysis.py | tests/ | 41 | Padding, timing jitter, dummy traffic generation |
-| test_ui.py | tests/ | 56 | Flask web UI routes, rendering, input handling, model catalog loading |
+| test_ui.py | tests/ | 59 | Flask web UI routes, rendering, setup completion, input handling, model catalog loading |
 | test_ui_cookies.py | tests/ | 11 | UI cookie security attributes |
 | test_ui_file_handling.py | tests/ | 12 | UI file upload and path handling |
 | test_update_rollback.py | tests/ | 74 | Signed update verification, rollback triggers, recovery |
diff --git a/services/ui/ui/app.py b/services/ui/ui/app.py
@@ -733,6 +733,70 @@ def has_models() -> bool:
         return False
 
 
+def _is_gguf_model_record(model: object) -> bool:
+    if not isinstance(model, dict):
+        return False
+    model_format = str(model.get("format") or "").lower()
+    filename = str(model.get("filename") or model.get("name") or "").lower()
+    return model_format == "gguf" or filename.endswith(".gguf")
+
+
+def has_chat_model() -> bool:
+    try:
+        resp = requests.get(f"{REGISTRY_URL}/v1/models", timeout=2)
+        models = resp.json()
+        return isinstance(models, list) and any(
+            _is_gguf_model_record(model) for model in models
+        )
+    except Exception:
+        return False
+
+
+def _write_setup_marker(profile: str) -> None:
+    """Mark the first-run setup flow as complete."""
+    SECURE_AI_ROOT.mkdir(parents=True, exist_ok=True)
+    marker = SECURE_AI_ROOT / ".initialized"
+    tmp_marker = SECURE_AI_ROOT / f".initialized.{os.getpid()}.tmp"
+    payload = {
+        "completed_at": time.time(),
+        "deployment_mode": _deployment_mode(),
+        "profile": profile,
+    }
+    with open(tmp_marker, "w", encoding="utf-8") as f:
+        json.dump(payload, f, sort_keys=True)
+        f.write("\n")
+        f.flush()
+        os.fsync(f.fileno())
+    os.chmod(tmp_marker, 0o600)
+    os.replace(tmp_marker, marker)
+
+
+@app.route("/api/setup/complete", methods=["POST"])
+def setup_complete():
+    """Complete the first-run setup flow and route the user to chat."""
+    data = request.get_json(silent=True) or {}
+    active, locked = _read_active_profile()
+    profile = data.get("profile") or active
+    if profile not in VALID_PROFILES:
+        return jsonify({"error": f"invalid profile: {profile}"}), 400
+    if (locked or _is_sandbox_deployment()) and profile != active:
+        return jsonify({"error": "profile does not match active runtime"}), 409
+    if not has_chat_model():
+        return jsonify({"error": "GGUF chat model required"}), 409
+
+    try:
+        _write_setup_marker(profile)
+    except OSError:
+        log.exception("failed to write setup marker")
+        return jsonify({"error": "failed to complete setup"}), 500
+
+    _ui_audit.append("setup_complete", {
+        "deployment_mode": _deployment_mode(),
+        "profile": profile,
+    })
+    return jsonify({"success": True, "redirect": "/chat", "profile": profile})
+
+
 def load_appliance_config() -> dict:
     try:
         with open(APPLIANCE_CONFIG) as f:
@@ -745,7 +809,7 @@ def load_appliance_config() -> dict:
 
 @app.route("/")
 def index():
-    if is_first_boot() or not has_models():
+    if is_first_boot() or not has_chat_model():
         return render_template("setup.html")
     return render_template("index.html", active_page="chat")
 
diff --git a/services/ui/ui/templates/setup.html b/services/ui/ui/templates/setup.html
diff --git a/tests/test_ui.py b/tests/test_ui.py
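
Reviewer note: the `services/ui/ui/app.py` hunk above gates setup completion on a GGUF chat model being present and records completion with a write-to-temp, fsync, rename sequence. The standalone sketch below mirrors those two helpers so they can be run outside Flask. It is an illustration under stated assumptions, not the repository's code or its tests: the `root` and `deployment_mode` parameters stand in for the module-level `SECURE_AI_ROOT` and `_deployment_mode()`, and the profile value `offline_private` and mode `appliance` are hypothetical.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def is_gguf_model_record(model: object) -> bool:
    """Return True when a registry record looks like a GGUF model."""
    if not isinstance(model, dict):
        return False
    model_format = str(model.get("format") or "").lower()
    filename = str(model.get("filename") or model.get("name") or "").lower()
    return model_format == "gguf" or filename.endswith(".gguf")


def write_setup_marker(root: Path, profile: str, deployment_mode: str) -> Path:
    """Write the marker to a temp file, fsync it, then rename it into place.

    `root` and `deployment_mode` are passed in here (an assumption of this
    sketch) instead of being read from module globals as in the diff.
    """
    root.mkdir(parents=True, exist_ok=True)
    marker = root / ".initialized"
    tmp_marker = root / f".initialized.{os.getpid()}.tmp"
    payload = {
        "completed_at": time.time(),
        "deployment_mode": deployment_mode,
        "profile": profile,
    }
    with open(tmp_marker, "w", encoding="utf-8") as f:
        json.dump(payload, f, sort_keys=True)
        f.write("\n")
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    os.chmod(tmp_marker, 0o600)
    # os.replace swaps the file in a single step, so a reader sees either
    # the old state or the complete new marker, never a partial write.
    os.replace(tmp_marker, marker)
    return marker


if __name__ == "__main__":
    # Hypothetical registry records, for illustration only.
    records = [
        {"name": "example-diffusion", "format": "safetensors"},
        {"filename": "example-chat.Q4_K_M.gguf"},
    ]
    if any(is_gguf_model_record(r) for r in records):
        with tempfile.TemporaryDirectory() as tmp:
            path = write_setup_marker(Path(tmp), "offline_private", "appliance")
            print(path.read_text(encoding="utf-8"))
```

Renaming within the same directory keeps the temp file and the marker on one filesystem, which is what allows `os.replace` to behave atomically.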
