
Commit b06e673

DOC-3498: Address content gaps identified in source audit
Add missing customer-facing content identified by comparing the original internal documentation against the current on-premises AsciiDoc pages: capabilities matrix on the overview page, Podman production runbook, performance characteristics table, expanded known limits reference, MySQL 8.4 caveat, Ollama systemd and Modelfile examples, and getting-started teardown and config update guidance.
1 parent 7fdc6e5 commit b06e673

6 files changed

Lines changed: 190 additions & 3 deletions

modules/ROOT/pages/tinymceai-on-premises-database.adoc

Lines changed: 1 addition & 1 deletion
@@ -95,7 +95,7 @@ Do *not* use `mysql:8`. That tag now floats to MySQL 8.4, which removes the `def
[ERROR] [MY-010119] [Server] Aborting

....

-Pin to `mysql:8.0` in every manifest: `docker run`, Docker Compose, Kubernetes, Helm, ECS.
+Pin to `mysql:8.0` in every manifest: `docker run`, Docker Compose, Kubernetes, Helm, ECS. Running MySQL 8.4 with workarounds (removing the flag and switching to `caching_sha2_password`) is not a supported configuration.

TIP: The same principle applies to PostgreSQL. Pin `postgres:16` rather than `postgres:latest`.
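In Docker Compose form, the pin is a single line. A minimal illustrative fragment (the service name and any omitted settings are placeholders, not the documented stack):

```yaml
services:
  mysql:
    # Pinned minor tag. mysql:8 and mysql:latest now float to MySQL 8.4.
    image: mysql:8.0
```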

modules/ROOT/pages/tinymceai-on-premises-getting-started.adoc

Lines changed: 39 additions & 0 deletions
@@ -426,3 +426,42 @@ data: {}
If the stream emits `event: error`, inspect the `data` payload. Provider errors (invalid API key, IAM denial, model unavailable) ride inside the SSE response. The HTTP status stays 200. See the xref:tinymceai-on-premises-troubleshooting.adoc[LLM provider errors] section in the Troubleshooting guide for details.
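Because the status stays 200, a smoke test has to parse the stream body rather than the status code. A minimal sketch of extracting the error payload — the captured stream below is a made-up sample, not real service output:

```shell
# Pull out the data payload that follows an "event: error" line in an SSE stream.
# The stream variable is an illustrative sample for testing the filter.
stream='event: message
data: {"delta":"Hello"}

event: error
data: {"provider":"openai","message":"Incorrect API key provided"}'

printf '%s\n' "$stream" |
  awk '/^event: error$/ {hit=1; next} hit && /^data: / {sub(/^data: /, ""); print; hit=0}'
```

In practice the same `awk` filter can sit at the end of a `curl -N` pipeline when verifying a deployment.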
A successful round-trip confirms: container health, database connectivity, Redis connectivity, JWT signing, JWT verification, permissions checking, environment registration, LLM provider authentication, and SSE streaming. If problems persist after these checks, focus on the editor configuration next.

== Updating configuration

IMPORTANT: `docker compose restart` after `.env` changes silently keeps the old environment values. The restart preserves the container and does not re-read `.env`. Always use `docker compose up -d --force-recreate` instead.

[source,bash]
----
docker compose up -d --force-recreate
# Or recreate only the AI service:
docker compose up -d --force-recreate ai-service
----

For Kubernetes, update the Secret and trigger a rollout restart:

[source,bash]
----
kubectl rollout restart deployment/ai-service -n tinymce-ai
----
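The Secret update step itself can be done declaratively with a client-side dry run piped into `kubectl apply`. The Secret name `ai-service-env` and the `.env` source file are assumptions for this sketch — use whatever your Deployment actually references:

```shell
# Rebuild the Secret from the updated .env file (assumed name: ai-service-env),
# then restart so the pods pick up the new values.
kubectl create secret generic ai-service-env \
  --from-env-file=.env \
  --namespace tinymce-ai \
  --dry-run=client -o yaml | kubectl apply -f -

kubectl rollout restart deployment/ai-service -n tinymce-ai
kubectl rollout status deployment/ai-service -n tinymce-ai
```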

== Stopping and cleaning up

[source,bash]
----
# Stop the AI service (standalone Docker)
docker stop ai-service && docker rm ai-service

# Stop the Docker Compose stack
docker compose down

# Remove all data including volumes (destructive)
docker compose down -v
----

For Kubernetes, scale the deployment to zero or delete it. Persistent volumes for the database are retained unless explicitly deleted.

[source,bash]
----
kubectl delete deployment ai-service -n tinymce-ai
----

modules/ROOT/pages/tinymceai-on-premises-production.adoc

Lines changed: 62 additions & 0 deletions
@@ -83,6 +83,41 @@ When deploying for the first time or upgrading to a new version, start a single

== Podman deployment

The AI service works with Podman as an alternative to Docker. In Podman, containers within a pod share a network namespace, so use `127.0.0.1` instead of container names for hostnames.

[source,bash]
----
podman login -u 'TINY_REGISTRY_USERNAME' registry.containers.tiny.cloud

podman pull registry.containers.tiny.cloud/ai-service:latest

podman pod create --name ai-pod -p 8000:8000 -p 3306:3306 -p 6379:6379

podman run -d --pod ai-pod --name mysql \
  -e MYSQL_ROOT_PASSWORD=ROOT_PASSWORD \
  -e MYSQL_DATABASE=ai_service \
  mysql:8.0

podman run -d --pod ai-pod --name redis redis:7

podman run --init -d --pod ai-pod --name ai-service \
  -e LICENSE_KEY='T8LK:...' \
  -e ENVIRONMENTS_MANAGEMENT_SECRET_KEY='MANAGEMENT_SECRET' \
  -e DATABASE_DRIVER='mysql' \
  -e DATABASE_HOST='127.0.0.1' \
  -e DATABASE_USER='root' \
  -e DATABASE_PASSWORD='ROOT_PASSWORD' \
  -e DATABASE_DATABASE='ai_service' \
  -e REDIS_HOST='127.0.0.1' \
  -e PROVIDERS='{"openai":{"type":"openai","apiKeys":["sk-proj-..."]}}' \
  -e STORAGE_DRIVER='database' \
  registry.containers.tiny.cloud/ai-service:latest
----

IMPORTANT: Pin to `mysql:8.0`. The `mysql:8` tag floats to MySQL 8.4, which removes the `default-authentication-plugin` flag and causes a crash loop. See xref:tinymceai-on-premises-database.adoc[Database, Redis, and storage] for details.
== Kubernetes deployment
=== Namespace and image pull secret
@@ -562,6 +597,33 @@ License keys are per-deployment, not per-replica. One key covers any number of r

== Performance characteristics

[cols="1,1",options="header"]
|===
|Metric |Typical value

|Cold start
|Approximately 3 seconds

|Health check response
|Less than 10 ms

|Token validation
|Less than 5 ms

|Time to first token (LLM)
|200 ms to 2 s (depends on provider and model)

|Memory per instance
|256 to 512 MB

|Concurrent connections
|1,000{plus} per instance
|===

These values are approximate and vary with hardware, provider latency, and prompt complexity. The LLM provider's rate limits are typically the binding constraint before the AI service becomes one.

== Sizing guide
[cols=",,,,",options="header",]

modules/ROOT/pages/tinymceai-on-premises-providers.adoc

Lines changed: 33 additions & 1 deletion
@@ -319,7 +319,7 @@ Azure-hosted OpenAI models. Requires an Azure subscription, an Azure OpenAI reso
|`type` |Yes |Literal `"azure"`
|`resourceName` |Yes |The `*.openai.azure.com` prefix only, not the full URL.
|`apiKeys` |Yes |Array. Azure issues two keys per resource for zero-downtime key rotation.
-|`apiVersion` |Yes |Always set explicitly. Refer to https://learn.microsoft.com/azure/ai-services/openai/reference[Microsoft's API version matrix] for current values.
+|`apiVersion` |Yes |Always set explicitly. Omitting it produces a confusing SDK error about a missing query string parameter. Refer to https://learn.microsoft.com/azure/ai-services/openai/reference[Microsoft's API version matrix] for current stable values.
|===

IMPORTANT: The ``MODELS[].id`` value must match the Azure *deployment name* exactly. A mismatch produces a `DeploymentNotFound` error. Use human-readable deployment names because the ID also appears in JWT permission strings and the editor model picker.
@@ -682,6 +682,17 @@ Ollama listens on `127.0.0.1:11434` by default, which is unreachable from inside
OLLAMA_HOST=0.0.0.0:11434 ollama serve
----

On Linux with systemd, create an override file instead:

[source,ini]
----
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
----

Then reload and restart: `sudo systemctl daemon-reload && sudo systemctl restart ollama`.

On Linux, add the host gateway so `host.docker.internal` resolves:
687698
[source,yaml]
@@ -695,6 +706,27 @@ services:
If Ollama returns "does not support tools", the model was built from a raw GGUF without a chat template. Use `ollama pull` for a Library model that includes a proper Modelfile, or author a custom one.

.Custom Modelfile example
[%collapsible]
====
[source]
----
FROM /path/to/your-model.gguf

TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
"""

PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
----

The exact template depends on the base model. Check the model card for the recommended chat template. Verify tool support with `ollama show <model>` before connecting to the AI service.
====

The reasoning toggle (`capabilities.reasoning: true`) is cosmetic for Ollama-backed models; the openai-compatible adapter does not translate it to the native Ollama API.
700732
*Timeout:*

modules/ROOT/pages/tinymceai-on-premises-reference.adoc

Lines changed: 12 additions & 1 deletion
@@ -170,8 +170,19 @@ Error codes returned in HTTP 4xx responses and inside SSE `event: error` payload
[cols="1,1,3",options="header"]
|===
|Limit |Value |Notes
-|Maximum prompt length |100,000 characters |Hard limit enforced by the service. Summarize or shorten source content before it exceeds this threshold.
+|Maximum prompt length |100,000 characters |Hard limit enforced by the service. Requests exceeding this return `invalid-request-data`. Summarize or shorten source content before it exceeds this threshold.
|Conversation create |Client-supplied `id` required |The plugin auto-generates `tiny-<uuid>`. Raw API callers must supply a unique `id` in the create body.
|Stream-abort recovery |Stop button leaves stale state |The next message returns `409 conversation in use` then `404 conversation does not exist`. Recovery: start a new conversation or reload.
|Built-in rate limiting |None |Front the service with nginx `limit_req` or ALB rate-limit rules. See xref:tinymceai-on-premises-production.adoc#rate-limiting[Rate limiting].
|File support (OpenAI-compatible providers) |Images only (`image/*`) |PDFs, text, and Office files are not forwarded to OpenAI-compatible providers. Use a non-OpenAI-compatible provider for non-image file attachments.
|MCP tool availability |Conversations only |MCP tools are not available in reviews or quick actions.
|MCP authentication |Single shared token per server |The `headers` field in `MCP_SERVERS` is fixed at deploy time. Per-user authentication is not supported.
|PostgreSQL default schema |`cs-on-premises` (with hyphen) |Pre-create with `CREATE SCHEMA "cs-on-premises";` or set `DATABASE_SCHEMA=public`.
|`/v1/models/\{compatibilityVersion}` |Only accepts `1` |Values such as `v1`, `v2`, or `latest` return 500.
|Environment creation through raw API |Not supported |Always create environments through the Management Panel UI.
|Bedrock credentials |Inline only |The SDK default credential chain (IRSA, instance roles, `AWS_PROFILE`) is not used.
|Vertex credentials |Inline only |Application Default Credentials, `GOOGLE_APPLICATION_CREDENTIALS`, and the metadata server are not used.
|Azure `MODELS[].id` |Must equal deployment name |There is no separate `deploymentName` field. The ID is the deployment name.
|OpenAI-compatible `baseUrl` |Must include `/v1` suffix |Omitting it produces a "Not Found" SSE error.
|===
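Raw API callers can mirror the plugin's `tiny-<uuid>` convention when generating the required conversation `id`. A small sketch (Linux-specific UUID source; `uuidgen` is an alternative; the create-request body shape is not shown here):

```shell
# Generate a plugin-style conversation id: "tiny-" plus a random UUID.
conv_id="tiny-$(cat /proc/sys/kernel/random/uuid)"
echo "$conv_id"
```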

modules/ROOT/pages/tinymceai-on-premises.adoc

Lines changed: 43 additions & 0 deletions
@@ -24,6 +24,49 @@ Data flow for a single AI request:
The shared secret (API Secret) never leaves the back end; the editor and the AI service only ever see signed tokens.
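The flow can be illustrated with a minimal token-minting sketch. This assumes an HMAC-style shared secret and a bare `sub` claim purely for illustration — the actual signing algorithm and required claim set are defined by the service's authentication documentation; the point is only that signing happens server-side:

```shell
# Mint a minimal HS256 JWT on the back end. The API Secret stays server-side;
# only the resulting signed token travels to the editor and the AI service.
# Claim set and algorithm here are illustrative assumptions.
b64url() { openssl base64 -A | tr '+/' '-_' | tr -d '='; }

secret='API_SECRET'
header=$(printf '%s' '{"alg":"HS256","typ":"JWT"}' | b64url)
payload=$(printf '%s' '{"sub":"user-123"}' | b64url)
sig=$(printf '%s.%s' "$header" "$payload" \
  | openssl dgst -sha256 -hmac "$secret" -binary | b64url)

echo "${header}.${payload}.${sig}"
```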

== Capabilities

[cols="1,2",options="header"]
|===
|Capability |Details

|Conversational AI assistant
|Multi-turn chat sidebar. Conversation history is isolated per user through the JWT `sub` claim.

|Document review
|Correctness, clarity, readability, tone, and translation.

|Quick actions
|Rewrite, summarize, expand, change tone, fix grammar, translate, continue, and improve writing.

|LLM provider flexibility
|OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Google Vertex AI, or any self-hosted OpenAI-compatible endpoint. Multiple providers can coexist.

|MCP integration
|Connect internal tools, databases, and knowledge bases through Model Context Protocol over Streamable HTTP transport.

|Web scraping and web search
|Pluggable endpoints for fetching web pages and running searches.

|Multi-tenant environments
|Isolated conversation history and per-tenant access keys through Environments.

|Per-user, per-feature permissions
|Fine-grained control through the `auth.ai.permissions` JWT claim.

|Streaming responses
|Server-Sent Events from the LLM back to the browser.

|File attachments
|Database, filesystem, Amazon S3, or Azure Blob Storage.

|Observability
|Structured request logs, OpenTelemetry, and Langfuse. All three run as independent simultaneous pipelines.

|Horizontal scaling
|The service is stateless. Share identical environment configuration across replicas.
|===

== Prerequisites
[cols="1,3",options="header"]
