SecAI-Hub
diff --git a/‎README.md‎
Lines changed: 63 additions & 2 deletions b/‎README.md‎
Lines changed: 63 additions & 2 deletions
diff --git a/‎files/scripts/build-services.sh‎
Lines changed: 21 additions & 0 deletions b/‎files/scripts/build-services.sh‎
Lines changed: 21 additions & 0 deletions
diff --git a/‎files/system/etc/secure-ai/config/appliance.yaml‎
Lines changed: 6 additions & 0 deletions b/‎files/system/etc/secure-ai/config/appliance.yaml‎
Lines changed: 6 additions & 0 deletions
diff --git a/‎files/system/etc/secure-ai/policy/policy.yaml‎
Lines changed: 25 additions & 0 deletions b/‎files/system/etc/secure-ai/policy/policy.yaml‎
Lines changed: 25 additions & 0 deletions
diff --git a/‎files/system/etc/secure-ai/searxng/settings.yml‎
Lines changed: 99 additions & 0 deletions b/‎files/system/etc/secure-ai/searxng/settings.yml‎
Lines changed: 99 additions & 0 deletions
diff --git a/‎files/system/etc/secure-ai/tor/torrc‎
Lines changed: 33 additions & 0 deletions b/‎files/system/etc/secure-ai/tor/torrc‎
Lines changed: 33 additions & 0 deletions
diff --git a/‎files/system/usr/lib/systemd/system/secure-ai-search-mediator.service‎
Lines changed: 41 additions & 0 deletions b/‎files/system/usr/lib/systemd/system/secure-ai-search-mediator.service‎
Lines changed: 41 additions & 0 deletions
@@ -40,6 +40,9 @@ Built on [uBlue](https://universal-blue.org/) (Fedora Atomic / Silverblue) with
 | Inference Worker | 8465 | llama.cpp | LLM inference (CUDA / ROCm / Vulkan / Metal / CPU) |
 | Diffusion Worker | 8455 | Python | Image and video generation (CUDA / ROCm / XPU / MPS / CPU) |
 | Quarantine | -- | Python | 7-stage verify, scan, and promote pipeline |
+| Search Mediator | 8485 | Python | Sanitized web search (query PII stripping + result cleaning) |
+| SearXNG | 8888 | Python | Self-hosted metasearch engine (privacy-respecting engines only) |
+| Tor | 9050 | C | Anonymous SOCKS5 proxy (all searches routed through Tor) |
 
 ## Hardware Support
 
@@ -389,6 +392,46 @@ To disable the airlock again:
 sudo systemctl stop secure-ai-airlock
 ```
 
+### Web Search (Tor-Routed, Optional)
+
+Web search is **disabled by default**. When enabled, the LLM can augment its answers with web search results — all routed through Tor for anonymity.
+
+**How it works:**
+1. The LLM generates a search query (your raw prompt never leaves the device)
+2. The search mediator strips PII (emails, phone numbers, SSNs, API keys, IPs) from the query
+3. The sanitized query goes to a local SearXNG instance
+4. SearXNG routes the search through Tor (your IP is hidden from search engines)
+5. Results come back through Tor, are stripped of HTML/scripts, and checked for prompt injection
+6. Clean results are injected as context for the LLM to formulate a better answer
+7. The UI shows a "web sources used" indicator with citations
+
+**To enable:**
+
+```bash
+# Enable in policy first
+# Edit /etc/secure-ai/policy/policy.yaml and set search.enabled: true
+
+# Start the search stack (Tor -> SearXNG -> Search Mediator)
+sudo systemctl start secure-ai-tor
+sudo systemctl start secure-ai-searxng
+sudo systemctl start secure-ai-search-mediator
+```
+
+**Privacy protections:**
+- All traffic routed through Tor (IP hidden from search engines)
+- Only privacy-respecting engines enabled (DuckDuckGo, Wikipedia, StackOverflow, GitHub)
+- PII automatically stripped from outbound queries
+- Queries with >50% PII content are blocked entirely
+- Inbound results scanned for prompt injection attacks
+- Every search is audit-logged (query hash only, not raw content)
+- `offline-only` session mode hard-blocks all search even if enabled
+
+**To disable:**
+
+```bash
+sudo systemctl stop secure-ai-search-mediator secure-ai-searxng secure-ai-tor
+```
+
 ---
 
 ## Security Overview
@@ -421,6 +464,7 @@ Every model — whether downloaded from the catalog or imported by the user —
 | **Models** | 7-stage quarantine: source, format, integrity, provenance, static scan, behavioral test, diffusion scan |
 | **Tools** | Default-deny policy, path allowlisting, traversal protection, rate limiting |
 | **Egress** | Airlock disabled by default, PII/credential scanning, destination allowlist |
+| **Search** | Tor-routed, PII stripped from queries, injection detection on results, audit logged |
 | **Services** | Systemd sandboxing: ProtectSystem=strict, PrivateNetwork, syscall filters |
 | **GPU Isolation** | Vendor-specific DeviceAllow (NVIDIA `/dev/nvidia*`, AMD `/dev/kfd`, Intel `/dev/dri/*`), PrivateNetwork on all |
 | **Emergency** | Panic switch: instant network kill + route flush + service stop |
@@ -514,6 +558,19 @@ quarantine:
   smoke_test_max_critical: 1  # fail if >1 critical flag
 ```
 
+**Web search** (`policy/policy.yaml`):
+```yaml
+search:
+  enabled: false          # disabled by default
+  strip_pii: true         # always strip PII from queries
+  detect_injection: true  # scan results for prompt injection
+  audit: true             # log every search (hash only)
+  allowed_engines:        # privacy-respecting engines only
+    - duckduckgo
+    - wikipedia
+    - stackoverflow
+```
+
 **Tool firewall policy** (`policy/policy.yaml`):
 ```yaml
 tools:
@@ -550,9 +607,11 @@ services/
   quarantine/           Python -- 7-stage verification + scanning pipeline
   inference-worker/     llama.cpp wrapper
   diffusion-worker/     Python -- Stable Diffusion image/video generation
+  search-mediator/      Python -- Tor-routed web search with PII stripping
   ui/                   Python/Flask -- Web UI (chat, generate, model management)
 tests/
   test_pipeline.py      Quarantine pipeline tests (48 tests)
+  test_search.py        Search mediator tests (27 tests)
   test_ui.py            Web UI tests (7 tests)
 docs/
   threat-model.md       Formal threat model and security invariants
@@ -566,7 +625,7 @@ cd services/registry && go test -v -race ./...
 cd services/tool-firewall && go test -v -race ./...
 cd services/airlock && go test -v -race ./...
 
-# Python tests (55 total)
+# Python tests (82 total)
 pip install pytest flask requests pyyaml
 python -m pytest tests/ -v
 
@@ -585,7 +644,9 @@ shellcheck files/system/usr/libexec/secure-ai/*.sh files/scripts/*.sh
 - [x] **M6 Hardening** -- Systemd sandboxing, kernel params, nftables, panic switch
 - [x] **M7 CI/CD** -- GitHub Actions, Go/Python tests, shellcheck, YAML validation
 - [x] **M8 Image/Video Generation** -- Diffusion worker, one-click downloads, generate UI
-- [ ] **M9 Polish** -- OPA/Rego policy engine, appliance setup wizard, documentation site
+- [x] **M9 Multi-GPU Support** -- NVIDIA/AMD/Intel/Apple auto-detection, Vulkan fallback
+- [x] **M10 Tor-Routed Search** -- SearXNG + Tor, PII stripping, injection detection, audit
+- [ ] **M11 Polish** -- OPA/Rego policy engine, appliance setup wizard, documentation site
 
 ## Troubleshooting
 
 
@@ -63,6 +63,27 @@ mkdir -p "$DIFFUSION_DIR"
 cp /tmp/services/diffusion-worker/app.py "$DIFFUSION_DIR/app.py"
 echo "  -> ${DIFFUSION_DIR}/app.py"
 
+# Search mediator
+echo "Installing: search-mediator"
+SEARCH_DIR="/opt/secure-ai/services/search-mediator"
+mkdir -p "$SEARCH_DIR"
+cp /tmp/services/search-mediator/app.py "$SEARCH_DIR/app.py"
+cat > "${INSTALL_DIR}/search-mediator" <<'WRAPPER'
+#!/usr/bin/env python3
+import sys
+sys.path.insert(0, "/opt/secure-ai/services/search-mediator")
+from app import main
+main()
+WRAPPER
+chmod +x "${INSTALL_DIR}/search-mediator"
+echo "  -> ${INSTALL_DIR}/search-mediator"
+
+# Install SearXNG via pip if not available as RPM
+echo "Installing: searxng"
+pip3 install --prefix=/usr --no-cache-dir searxng 2>/dev/null || \
+    pip3 install --prefix=/usr --break-system-packages --no-cache-dir searxng 2>/dev/null || \
+    echo "WARNING: searxng pip install failed, relying on RPM package"
+
 # Cleanup build artifacts
 rm -rf "$SRC_DIR"
 dnf remove -y golang 2>/dev/null || true
 
@@ -41,6 +41,12 @@ services:
     bind: "127.0.0.1:8490"
   diffusion:
     bind: "127.0.0.1:8455"
+  search_mediator:
+    bind: "127.0.0.1:8485"
+  searxng:
+    bind: "127.0.0.1:8888"
+  tor:
+    socks: "127.0.0.1:9050"
 
 session:
   mode: "normal"  # normal | sensitive | offline-only
 
@@ -65,6 +65,31 @@ tools:
     - name: "process.spawn"
     - name: "filesystem.delete"
 
+search:
+  # Tor-routed web search via self-hosted SearXNG
+  # Disabled by default — user must explicitly enable
+  enabled: false
+  # Maximum query length sent to SearXNG (after PII stripping)
+  max_query_length: 200
+  # Maximum results returned per search
+  max_results: 5
+  # Maximum context size injected into LLM (characters)
+  max_context_length: 4000
+  # PII scanning on outbound queries (always on)
+  strip_pii: true
+  # Block queries that are >50% redacted PII
+  block_high_pii_queries: true
+  # Injection detection on inbound results
+  detect_injection: true
+  # Audit every search (query hash + sanitized query + result count)
+  audit: true
+  # Search engines enabled in SearXNG (privacy-respecting only)
+  allowed_engines:
+    - duckduckgo
+    - wikipedia
+    - stackoverflow
+    - github
+
 airlock:
   enabled: false
   allowed_destinations:
 
@@ -0,0 +1,99 @@
+# Secure AI Appliance - SearXNG Configuration
+# Self-hosted metasearch engine. All outbound requests route through Tor.
+# Listens on localhost only — accessed by the search mediator service.
+
+general:
+  instance_name: "SecAI Search"
+  debug: false
+  enable_metrics: false
+
+server:
+  bind_address: "127.0.0.1"
+  port: 8888
+  secret_key: "secai-local-only-key"
+  # No public access — localhost only
+  limiter: false
+  # Disable public image proxy (no need, we strip images)
+  image_proxy: false
+
+search:
+  safe_search: 1
+  default_lang: "en"
+  autocomplete: false
+  # Max results per engine
+  max_results: 10
+
+# Route ALL outbound requests through Tor SOCKS5 proxy
+outgoing:
+  proxies:
+    all://:
+      - socks5h://127.0.0.1:9050
+  # Timeout per engine request (Tor adds latency)
+  request_timeout: 20
+  useragent_suffix: ""
+
+# Only enable privacy-respecting search engines
+engines:
+  # Web search
+  - name: duckduckgo
+    engine: duckduckgo
+    shortcut: ddg
+    disabled: false
+
+  - name: wikipedia
+    engine: wikipedia
+    shortcut: wp
+    disabled: false
+
+  - name: wikidata
+    engine: wikidata
+    shortcut: wd
+    disabled: false
+
+  # Programming / technical
+  - name: stackoverflow
+    engine: stackoverflow
+    shortcut: so
+    disabled: false
+
+  - name: github
+    engine: github
+    shortcut: gh
+    disabled: false
+
+  - name: arch wiki
+    engine: archlinux
+    shortcut: aw
+    disabled: false
+
+  # Disable all engines that require API keys or track users
+  - name: google
+    engine: google
+    disabled: true
+
+  - name: bing
+    engine: bing
+    disabled: true
+
+  - name: yahoo
+    engine: yahoo
+    disabled: true
+
+  - name: brave
+    engine: brave
+    disabled: true
+
+# Disable analytics, tracking, result collection
+ui:
+  static_use_hash: true
+  default_theme: simple
+  results_on_new_tab: false
+
+# No external plugins
+plugins: []
+
+# Privacy-focused defaults
+enabled_plugins:
+  - "Hash plugin"
+  - "Hostname replace"
+  - "Tracker URL remover"
@@ -0,0 +1,33 @@
+# Secure AI Appliance - Tor configuration
+# Provides anonymous SOCKS5 proxy for SearXNG search routing.
+# Only SearXNG connects to Tor — no other services have access.
+
+# SOCKS5 proxy for SearXNG
+SocksPort 127.0.0.1:9050
+
+# Disable direct connections (SOCKS only)
+SocksPolicy accept 127.0.0.1
+SocksPolicy reject *
+
+# Circuit isolation: each destination gets its own circuit
+IsolateDestAddr 1
+IsolateDestPort 1
+
+# New circuit for each SearXNG query batch (rotate every 60s)
+NewCircuitPeriod 60
+MaxCircuitDirtiness 600
+
+# Disable unused features
+ControlPort 0
+DNSPort 0
+
+# Logging (minimal — don't log query content)
+Log notice file /var/lib/secure-ai/logs/tor.log
+SafeLogging 1
+
+# Data directory
+DataDirectory /var/lib/secure-ai/tor
+
+# Hardened security settings
+Sandbox 1
+NoExec 1
@@ -0,0 +1,41 @@
+[Unit]
+Description=Secure AI Appliance - Search Mediator (Sanitized Web Search)
+After=secure-ai-searxng.service
+Wants=secure-ai-searxng.service
+
+[Service]
+Type=simple
+ExecStart=/usr/libexec/secure-ai/search-mediator
+Restart=on-failure
+RestartSec=5
+
+Environment=BIND_ADDR=127.0.0.1:8485
+Environment=SEARXNG_URL=http://127.0.0.1:8888
+Environment=APPLIANCE_CONFIG=/etc/secure-ai/config/appliance.yaml
+Environment=POLICY_PATH=/etc/secure-ai/policy/policy.yaml
+Environment=AUDIT_DIR=/var/lib/secure-ai/logs
+
+# Sandboxing — mediator only talks to SearXNG on localhost
+ProtectSystem=strict
+ReadWritePaths=/var/lib/secure-ai/logs
+ReadOnlyPaths=/etc/secure-ai
+PrivateTmp=yes
+ProtectHome=yes
+ProtectKernelTunables=yes
+ProtectKernelModules=yes
+ProtectControlGroups=yes
+NoNewPrivileges=yes
+RestrictSUIDSGID=yes
+
+# Only needs localhost network (to reach SearXNG)
+RestrictAddressFamilies=AF_INET AF_UNIX
+
+# No GPU, no devices
+PrivateDevices=yes
+
+# Resource limits
+MemoryMax=256M
+TasksMax=64
+
+[Install]
+WantedBy=multi-user.target