
Commit b06e673

DOC-3498: Address content gaps identified in source audit
Add missing customer-facing content identified by comparing the original internal documentation against the current on-premises AsciiDoc pages: capabilities matrix on the overview page, Podman production runbook, performance characteristics table, expanded known limits reference, MySQL 8.4 caveat, Ollama systemd and Modelfile examples, and getting-started teardown and config update guidance.
1 parent 7fdc6e5 commit b06e673

6 files changed

Lines changed: 190 additions & 3 deletions

modules/ROOT/pages/tinymceai-on-premises-database.adoc

Lines changed: 1 addition & 1 deletion
@@ -95,7 +95,7 @@ Do *not* use `mysql:8`. That tag now floats to MySQL 8.4, which removes the `def
[ERROR] [MY-010119] [Server] Aborting

....

-Pin to `mysql:8.0` in every manifest: `docker run`, Docker Compose, Kubernetes, Helm, ECS.
+Pin to `mysql:8.0` in every manifest: `docker run`, Docker Compose, Kubernetes, Helm, ECS. Running MySQL 8.4 with workarounds (removing the flag and switching to `caching_sha2_password`) is not a supported configuration.

TIP: The same principle applies to PostgreSQL. Pin `postgres:16` rather than `postgres:latest`.
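In Docker Compose form, the pin is a single line. A minimal illustrative fragment (the service name and any omitted settings are placeholders, not the documented stack):

```yaml
services:
  mysql:
    # Pinned minor tag. mysql:8 and mysql:latest now float to MySQL 8.4.
    image: mysql:8.0
```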

modules/ROOT/pages/tinymceai-on-premises-getting-started.adoc

Lines changed: 39 additions & 0 deletions
@@ -426,3 +426,42 @@ data: {}
If the stream emits `event: error`, inspect the `data` payload. Provider errors (invalid API key, IAM denial, model unavailable) ride inside the SSE response. The HTTP status stays 200. See the xref:tinymceai-on-premises-troubleshooting.adoc[LLM provider errors] section in the Troubleshooting guide for details.
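Because the status stays 200, a smoke test has to parse the stream body rather than the status code. A minimal sketch of extracting the error payload — the captured stream below is a made-up sample, not real service output:

```shell
# Pull out the data payload that follows an "event: error" line in an SSE stream.
# The stream variable is an illustrative sample for testing the filter.
stream='event: message
data: {"delta":"Hello"}

event: error
data: {"provider":"openai","message":"Incorrect API key provided"}'

printf '%s\n' "$stream" |
  awk '/^event: error$/ {hit=1; next} hit && /^data: / {sub(/^data: /, ""); print; hit=0}'
```

In practice the same `awk` filter can sit at the end of a `curl -N` pipeline when verifying a deployment.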
A successful round-trip confirms: container health, database connectivity, Redis connectivity, JWT signing, JWT verification, permissions checking, environment registration, LLM provider authentication, and SSE streaming. If problems persist after these checks, focus on the editor configuration next.

== Updating configuration

IMPORTANT: `docker compose restart` after `.env` changes silently keeps the old environment values. The restart preserves the container and does not re-read `.env`. Always use `docker compose up -d --force-recreate` instead.

[source,bash]
----
docker compose up -d --force-recreate
# Or recreate only the AI service:
docker compose up -d --force-recreate ai-service
----

For Kubernetes, update the Secret and trigger a rollout restart:

[source,bash]
----
kubectl rollout restart deployment/ai-service -n tinymce-ai
----
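The Secret update step itself can be done declaratively with a client-side dry run piped into `kubectl apply`. The Secret name `ai-service-env` and the `.env` source file are assumptions for this sketch — use whatever your Deployment actually references:

```shell
# Rebuild the Secret from the updated .env file (assumed name: ai-service-env),
# then restart so the pods pick up the new values.
kubectl create secret generic ai-service-env \
  --from-env-file=.env \
  --namespace tinymce-ai \
  --dry-run=client -o yaml | kubectl apply -f -

kubectl rollout restart deployment/ai-service -n tinymce-ai
kubectl rollout status deployment/ai-service -n tinymce-ai
```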

== Stopping and cleaning up

[source,bash]
----
# Stop the AI service (standalone Docker)
docker stop ai-service && docker rm ai-service

# Stop the Docker Compose stack
docker compose down

# Remove all data including volumes (destructive)
docker compose down -v
----

For Kubernetes, scale the deployment to zero or delete it. Persistent volumes for the database are retained unless explicitly deleted.

[source,bash]
----
kubectl delete deployment ai-service -n tinymce-ai
----

modules/ROOT/pages/tinymceai-on-premises-production.adoc

Lines changed: 62 additions & 0 deletions
@@ -83,6 +83,41 @@ When deploying for the first time or upgrading to a new version, start a single

== Podman deployment

The AI service works with Podman as an alternative to Docker. In Podman, containers within a pod share a network namespace, so use `127.0.0.1` instead of container names for hostnames.

[source,bash]
----
podman login -u 'TINY_REGISTRY_USERNAME' registry.containers.tiny.cloud

podman pull registry.containers.tiny.cloud/ai-service:latest

podman pod create --name ai-pod -p 8000:8000 -p 3306:3306 -p 6379:6379

podman run -d --pod ai-pod --name mysql \
  -e MYSQL_ROOT_PASSWORD=ROOT_PASSWORD \
  -e MYSQL_DATABASE=ai_service \
  mysql:8.0

podman run -d --pod ai-pod --name redis redis:7

podman run --init -d --pod ai-pod --name ai-service \
  -e LICENSE_KEY='T8LK:...' \
  -e ENVIRONMENTS_MANAGEMENT_SECRET_KEY='MANAGEMENT_SECRET' \
  -e DATABASE_DRIVER='mysql' \
  -e DATABASE_HOST='127.0.0.1' \
  -e DATABASE_USER='root' \
  -e DATABASE_PASSWORD='ROOT_PASSWORD' \
  -e DATABASE_DATABASE='ai_service' \
  -e REDIS_HOST='127.0.0.1' \
  -e PROVIDERS='{"openai":{"type":"openai","apiKeys":["sk-proj-..."]}}' \
  -e STORAGE_DRIVER='database' \
  registry.containers.tiny.cloud/ai-service:latest
----

IMPORTANT: Pin to `mysql:8.0`. The `mysql:8` tag floats to MySQL 8.4, which removes the `default-authentication-plugin` flag and causes a crash loop. See xref:tinymceai-on-premises-database.adoc[Database, Redis, and storage] for details.
== Kubernetes deployment
=== Namespace and image pull secret
@@ -562,6 +597,33 @@ License keys are per-deployment, not per-replica. One key covers any number of r

== Performance characteristics

[cols="1,1",options="header"]
|===
|Metric |Typical value

|Cold start
|Approximately 3 seconds

|Health check response
|Less than 10 ms

|Token validation
|Less than 5 ms

|Time to first token (LLM)
|200 ms to 2 s (depends on provider and model)

|Memory per instance
|256 to 512 MB

|Concurrent connections
|1,000{plus} per instance
|===

These values are approximate and vary with hardware, provider latency, and prompt complexity. The LLM provider's rate limits are typically the binding constraint before the AI service becomes one.

== Sizing guide
[cols=",,,,",options="header",]

modules/ROOT/pages/tinymceai-on-premises-providers.adoc

Lines changed: 33 additions & 1 deletion
@@ -319,7 +319,7 @@ Azure-hosted OpenAI models. Requires an Azure subscription, an Azure OpenAI reso
|`type` |Yes |Literal `"azure"`
|`resourceName` |Yes |The `*.openai.azure.com` prefix only, not the full URL.
|`apiKeys` |Yes |Array. Azure issues two keys per resource for zero-downtime key rotation.
-|`apiVersion` |Yes |Always set explicitly. Refer to https://learn.microsoft.com/azure/ai-services/openai/reference[Microsoft's API version matrix] for current values.
+|`apiVersion` |Yes |Always set explicitly. Omitting it produces a confusing SDK error about a missing query string parameter. Refer to https://learn.microsoft.com/azure/ai-services/openai/reference[Microsoft's API version matrix] for current stable values.
|===

IMPORTANT: The ``MODELS[].id`` value must match the Azure *deployment name* exactly. A mismatch produces a `DeploymentNotFound` error. Use human-readable deployment names because the ID also appears in JWT permission strings and the editor model picker.
@@ -682,6 +682,17 @@ Ollama listens on `127.0.0.1:11434` by default, which is unreachable from inside
OLLAMA_HOST=0.0.0.0:11434 ollama serve
----

On Linux with systemd, create an override file instead:

[source,ini]
----
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
----

Then reload and restart: `sudo systemctl daemon-reload && sudo systemctl restart ollama`.

On Linux, add the host gateway so `host.docker.internal` resolves:
687698
[source,yaml]
@@ -695,6 +706,27 @@ services:
If Ollama returns "does not support tools", the model was built from a raw GGUF without a chat template. Use `ollama pull` for a Library model that includes a proper Modelfile, or author a custom one.

.Custom Modelfile example
[%collapsible]
====
[source]
----
FROM /path/to/your-model.gguf

TEMPLATE """{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
"""

PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
----

The exact template depends on the base model. Check the model card for the recommended chat template. Verify tool support with `ollama show <model>` before connecting to the AI service.
====

The reasoning toggle (`capabilities.reasoning: true`) is cosmetic for Ollama-backed models; the openai-compatible adapter does not translate it to the native Ollama API.
700732
*Timeout:*

modules/ROOT/pages/tinymceai-on-premises-reference.adoc

Lines changed: 12 additions & 1 deletion
@@ -170,8 +170,19 @@ Error codes returned in HTTP 4xx responses and inside SSE `event: error` payload
[cols="1,1,3",options="header"]
|===
|Limit |Value |Notes
-|Maximum prompt length |100,000 characters |Hard limit enforced by the service. Summarize or shorten source content before it exceeds this threshold.
+|Maximum prompt length |100,000 characters |Hard limit enforced by the service. Requests exceeding this return `invalid-request-data`. Summarize or shorten source content before it exceeds this threshold.
|Conversation create |Client-supplied `id` required |The plugin auto-generates `tiny-<uuid>`. Raw API callers must supply a unique `id` in the create body.
|Stream-abort recovery |Stop button leaves stale state |The next message returns `409 conversation in use` then `404 conversation does not exist`. Recovery: start a new conversation or reload.
|Built-in rate limiting |None |Front the service with nginx `limit_req` or ALB rate-limit rules. See xref:tinymceai-on-premises-production.adoc#rate-limiting[Rate limiting].
|File support (OpenAI-compatible providers) |Images only (`image/*`) |PDFs, text, and Office files are not forwarded to OpenAI-compatible providers. Use a non-OpenAI-compatible provider for non-image file attachments.
|MCP tool availability |Conversations only |MCP tools are not available in reviews or quick actions.
|MCP authentication |Single shared token per server |The `headers` field in `MCP_SERVERS` is fixed at deploy time. Per-user authentication is not supported.
|PostgreSQL default schema |`cs-on-premises` (with hyphen) |Pre-create with `CREATE SCHEMA "cs-on-premises";` or set `DATABASE_SCHEMA=public`.
|`/v1/models/\{compatibilityVersion}` |Only accepts `1` |Values such as `v1`, `v2`, or `latest` return 500.
|Environment creation through raw API |Not supported |Always create environments through the Management Panel UI.
|Bedrock credentials |Inline only |The SDK default credential chain (IRSA, instance roles, `AWS_PROFILE`) is not used.
|Vertex credentials |Inline only |Application Default Credentials, `GOOGLE_APPLICATION_CREDENTIALS`, and the metadata server are not used.
|Azure `MODELS[].id` |Must equal deployment name |There is no separate `deploymentName` field. The ID is the deployment name.
|OpenAI-compatible `baseUrl` |Must include `/v1` suffix |Omitting it produces a "Not Found" SSE error.
|===
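Raw API callers can mirror the plugin's `tiny-<uuid>` convention when generating the required conversation `id`. A small sketch (Linux-specific UUID source; `uuidgen` is an alternative; the create-request body shape is not shown here):

```shell
# Generate a plugin-style conversation id: "tiny-" plus a random UUID.
conv_id="tiny-$(cat /proc/sys/kernel/random/uuid)"
echo "$conv_id"
```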

modules/ROOT/pages/tinymceai-on-premises.adoc

Lines changed: 43 additions & 0 deletions
@@ -24,6 +24,49 @@ Data flow for a single AI request:
The shared secret (API Secret) never leaves the back end; the editor and the AI service only ever see signed tokens.
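The flow can be illustrated with a minimal token-minting sketch. This assumes an HMAC-style shared secret and a bare `sub` claim purely for illustration — the actual signing algorithm and required claim set are defined by the service's authentication documentation; the point is only that signing happens server-side:

```shell
# Mint a minimal HS256 JWT on the back end. The API Secret stays server-side;
# only the resulting signed token travels to the editor and the AI service.
# Claim set and algorithm here are illustrative assumptions.
b64url() { openssl base64 -A | tr '+/' '-_' | tr -d '='; }

secret='API_SECRET'
header=$(printf '%s' '{"alg":"HS256","typ":"JWT"}' | b64url)
payload=$(printf '%s' '{"sub":"user-123"}' | b64url)
sig=$(printf '%s.%s' "$header" "$payload" \
  | openssl dgst -sha256 -hmac "$secret" -binary | b64url)

echo "${header}.${payload}.${sig}"
```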

== Capabilities

[cols="1,2",options="header"]
|===
|Capability |Details

|Conversational AI assistant
|Multi-turn chat sidebar. Conversation history is isolated per user through the JWT `sub` claim.

|Document review
|Correctness, clarity, readability, tone, and translation.

|Quick actions
|Rewrite, summarize, expand, change tone, fix grammar, translate, continue, and improve writing.

|LLM provider flexibility
|OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Google Vertex AI, or any self-hosted OpenAI-compatible endpoint. Multiple providers can coexist.

|MCP integration
|Connect internal tools, databases, and knowledge bases through Model Context Protocol over Streamable HTTP transport.

|Web scraping and web search
|Pluggable endpoints for fetching web pages and running searches.

|Multi-tenant environments
|Isolated conversation history and per-tenant access keys through Environments.

|Per-user, per-feature permissions
|Fine-grained control through the `auth.ai.permissions` JWT claim.

|Streaming responses
|Server-Sent Events from the LLM back to the browser.

|File attachments
|Database, filesystem, Amazon S3, or Azure Blob Storage.

|Observability
|Structured request logs, OpenTelemetry, and Langfuse. All three run as independent simultaneous pipelines.

|Horizontal scaling
|The service is stateless. Share identical environment configuration across replicas.
|===

== Prerequisites
[cols="1,3",options="header"]
