@@ -40,6 +40,9 @@ Built on [uBlue](https://universal-blue.org/) (Fedora Atomic / Silverblue) with
4040| Inference Worker | 8465 | llama.cpp | LLM inference (CUDA / ROCm / Vulkan / Metal / CPU) |
4141| Diffusion Worker | 8455 | Python | Image and video generation (CUDA / ROCm / XPU / MPS / CPU) |
4242| Quarantine | -- | Python | 7-stage verify, scan, and promote pipeline |
43+ | Search Mediator | 8485 | Python | Sanitized web search (query PII stripping + result cleaning) |
44+ | SearXNG | 8888 | Python | Self-hosted metasearch engine (privacy-respecting engines only) |
45+ | Tor | 9050 | C | Anonymous SOCKS5 proxy (all searches routed through Tor) |
4346
4447## Hardware Support
4548
@@ -389,6 +392,46 @@ To disable the airlock again:
389392sudo systemctl stop secure-ai-airlock
390393```
391394
395+ ### Web Search (Tor-Routed, Optional)
396+
397+ Web search is **disabled by default**. When enabled, the LLM can augment its answers with web search results, and every search is routed through Tor for anonymity.
398+
399+ **How it works:**
400+ 1. The LLM generates a search query (your raw prompt never leaves the device)
401+ 2. The search mediator strips PII (emails, phone numbers, SSNs, API keys, IPs) from the query
402+ 3. The sanitized query goes to a local SearXNG instance
403+ 4. SearXNG routes the search through Tor (your IP is hidden from search engines)
404+ 5. Results come back through Tor, are stripped of HTML/scripts, and checked for prompt injection
405+ 6. Clean results are injected as context for the LLM to formulate a better answer
406+ 7. The UI shows a "web sources used" indicator with citations
407+
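Step 2 above can be sketched roughly as follows. This is a minimal illustration, not the search mediator's actual code: the `PII_PATTERNS` table, the regular expressions, and the `strip_pii` function are all hypothetical, and a real implementation would likely need broader patterns (international phone formats, IPv6, vendor-specific key formats).

```python
import re

# Hypothetical patterns for illustration only; the mediator's real rules
# are not shown in this README.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "API_KEY": re.compile(r"\b(?:sk|pk|ghp)_[A-Za-z0-9]{16,}\b"),
}


def strip_pii(query: str) -> tuple[str, int]:
    """Remove PII matches from a query.

    Returns the sanitized query and the number of characters removed,
    which a caller could compare against the query length.
    """
    removed = 0
    for pattern in PII_PATTERNS.values():
        # Count matched characters before deleting them.
        removed += sum(len(m.group()) for m in pattern.finditer(query))
        query = pattern.sub(" ", query)
    # Collapse the whitespace left behind by removed matches.
    return " ".join(query.split()), removed


if __name__ == "__main__":
    print(strip_pii("email alice@example.com about rust lifetimes"))
```

Returning the removed-character count alongside the sanitized text keeps the stripping step separate from any policy decision about whether the query should still be sent.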
408+ **To enable:**
409+
410+ ```bash
411+ # Enable in policy first
412+ # Edit /etc/secure-ai/policy/policy.yaml and set search.enabled: true
413+
414+ # Start the search stack (Tor -> SearXNG -> Search Mediator)
415+ sudo systemctl start secure-ai-tor
416+ sudo systemctl start secure-ai-searxng
417+ sudo systemctl start secure-ai-search-mediator
418+ ```
419+
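After starting the three services, a quick way to confirm the stack is up is to probe the ports listed in the services table (Tor 9050, SearXNG 8888, Search Mediator 8485). The sketch below assumes the services listen on `127.0.0.1`, which this README does not state; `port_open` and `stack_status` are hypothetical helper names.

```python
import socket

# Ports taken from the services table; the host is an assumption.
SEARCH_STACK = [("tor", 9050), ("searxng", 8888), ("search-mediator", 8485)]


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def stack_status(host: str = "127.0.0.1") -> dict[str, bool]:
    """Map each search-stack service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in SEARCH_STACK}


if __name__ == "__main__":
    for name, up in stack_status().items():
        print(f"{name}: {'listening' if up else 'not reachable'}")
```

An open port only shows that something is listening; it does not prove that searches are actually leaving through Tor.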
420+ **Privacy protections:**
421+ - All traffic routed through Tor (IP hidden from search engines)
422+ - Only privacy-respecting engines enabled (DuckDuckGo, Wikipedia, StackOverflow, GitHub)
423+ - PII automatically stripped from outbound queries
424+ - Queries with >50% PII content are blocked entirely
425+ - Inbound results scanned for prompt injection attacks
426+ - Every search is audit-logged (query hash only, not raw content)
427+ - `offline-only` session mode hard-blocks all search even if enabled
428+
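The gating rules in the list above (the `offline-only` hard block, the 50% PII threshold, and hash-only audit logging) could be combined along these lines. This is a sketch under assumptions: the function names, the decision strings, the character-based definition of "PII content", and the choice of SHA-256 are illustrative and not taken from the actual mediator.

```python
import hashlib

PII_BLOCK_THRESHOLD = 0.5  # "queries with >50% PII content are blocked"


def decide(query: str, pii_chars: int, search_enabled: bool, session_mode: str) -> str:
    """Return 'allowed' or a 'blocked:<reason>' string for a search request.

    pii_chars is the number of characters in the query identified as PII
    (assumed here to be measured per character).
    """
    if session_mode == "offline-only":
        return "blocked:offline-only"  # overrides search.enabled
    if not search_enabled:
        return "blocked:disabled"
    if query and pii_chars / len(query) > PII_BLOCK_THRESHOLD:
        return "blocked:pii"
    return "allowed"


def audit_entry(query: str, decision: str) -> dict:
    """Build an audit record that stores only a hash of the query."""
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return {"query_sha256": digest, "decision": decision}


if __name__ == "__main__":
    q = "rust borrow checker lifetimes"
    print(audit_entry(q, decide(q, 0, True, "normal")))
```

Checking the session mode first means no later configuration change can re-enable search for an `offline-only` session.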
429+ **To disable:**
430+
431+ ```bash
432+ sudo systemctl stop secure-ai-search-mediator secure-ai-searxng secure-ai-tor
433+ ```
434+
392435---
393436
394437## Security Overview
@@ -421,6 +464,7 @@ Every model — whether downloaded from the catalog or imported by the user —
421464| **Models** | 7-stage quarantine: source, format, integrity, provenance, static scan, behavioral test, diffusion scan |
422465| **Tools** | Default-deny policy, path allowlisting, traversal protection, rate limiting |
423466| **Egress** | Airlock disabled by default, PII/credential scanning, destination allowlist |
467+ | **Search** | Tor-routed, PII stripped from queries, injection detection on results, audit logged |
424468| **Services** | Systemd sandboxing: ProtectSystem=strict, PrivateNetwork, syscall filters |
425469| **GPU Isolation** | Vendor-specific DeviceAllow (NVIDIA `/dev/nvidia*`, AMD `/dev/kfd`, Intel `/dev/dri/*`), PrivateNetwork on all |
426470| **Emergency** | Panic switch: instant network kill + route flush + service stop |
@@ -514,6 +558,19 @@ quarantine:
514558 smoke_test_max_critical: 1 # fail if >1 critical flag
515559```
516560
561+ **Web search** (`policy/policy.yaml`):
562+ ```yaml
563+ search:
564+   enabled: false            # disabled by default
565+   strip_pii: true           # always strip PII from queries
566+   detect_injection: true    # scan results for prompt injection
567+   audit: true               # log every search (hash only)
568+   allowed_engines:          # privacy-respecting engines only
569+     - duckduckgo
570+     - wikipedia
571+     - stackoverflow
572+ ```
573+
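To illustrate how the `search` block above might be consumed, here is a minimal default-deny sketch. The `policy` dict mirrors the YAML as it would look after parsing; `select_engines` is a hypothetical helper, not a function from the search mediator.

```python
# The YAML above, as a parsed dict (for example, the result of yaml.safe_load).
policy = {
    "search": {
        "enabled": False,
        "strip_pii": True,
        "detect_injection": True,
        "audit": True,
        "allowed_engines": ["duckduckgo", "wikipedia", "stackoverflow"],
    }
}


def select_engines(policy: dict, requested: list[str]) -> list[str]:
    """Return the requested engines the policy permits (default deny)."""
    search = policy.get("search", {})
    if not search.get("enabled", False):
        return []  # search disabled: no engine may be used
    allowed = set(search.get("allowed_engines", []))
    return [engine for engine in requested if engine in allowed]


if __name__ == "__main__":
    print(select_engines(policy, ["duckduckgo", "google"]))
```

A missing `search` section or a missing `enabled` key is treated as disabled, which matches the disabled-by-default stance described in this README.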
517574**Tool firewall policy** (`policy/policy.yaml`):
518575```yaml
519576tools:
@@ -550,9 +607,11 @@ services/
550607 quarantine/ Python -- 7-stage verification + scanning pipeline
551608 inference-worker/ llama.cpp wrapper
552609 diffusion-worker/ Python -- Stable Diffusion image/video generation
610+ search-mediator/ Python -- Tor-routed web search with PII stripping
553611 ui/ Python/Flask -- Web UI (chat, generate, model management)
554612tests/
555613 test_pipeline.py Quarantine pipeline tests (48 tests)
614+ test_search.py Search mediator tests (27 tests)
556615 test_ui.py Web UI tests (7 tests)
557616docs/
558617 threat-model.md Formal threat model and security invariants
@@ -566,7 +625,7 @@ cd services/registry && go test -v -race ./...
566625cd services/tool-firewall && go test -v -race ./...
567626cd services/airlock && go test -v -race ./...
568627
569- # Python tests (55 total)
628+ # Python tests (82 total)
570629pip install pytest flask requests pyyaml
571630python -m pytest tests/ -v
572631
@@ -585,7 +644,9 @@ shellcheck files/system/usr/libexec/secure-ai/*.sh files/scripts/*.sh
585644- [x] **M6 Hardening** -- Systemd sandboxing, kernel params, nftables, panic switch
586645- [x] **M7 CI/CD** -- GitHub Actions, Go/Python tests, shellcheck, YAML validation
587646- [x] **M8 Image/Video Generation** -- Diffusion worker, one-click downloads, generate UI
588- - [ ] **M9 Polish** -- OPA/Rego policy engine, appliance setup wizard, documentation site
647+ - [x] **M9 Multi-GPU Support** -- NVIDIA/AMD/Intel/Apple auto-detection, Vulkan fallback
648+ - [x] **M10 Tor-Routed Search** -- SearXNG + Tor, PII stripping, injection detection, audit
649+ - [ ] **M11 Polish** -- OPA/Rego policy engine, appliance setup wizard, documentation site
589650
590651## Troubleshooting
591652