DOC-3498: Address content gaps identified in source audit
Add missing customer-facing content identified by comparing the
original internal documentation against the current on-premises
AsciiDoc pages: capabilities matrix on the overview page, Podman
production runbook, performance characteristics table, expanded
known limits reference, MySQL 8.4 caveat, Ollama systemd and
Modelfile examples, and getting-started teardown and config update
guidance.
modules/ROOT/pages/tinymceai-on-premises-database.adoc (+1 -1)
@@ -95,7 +95,7 @@ Do *not* use `mysql:8`. That tag now floats to MySQL 8.4, which removes the `def
[ERROR] [MY-010119] [Server] Aborting
....

-Pin to `mysql:8.0` in every manifest: `docker run`, Docker Compose, Kubernetes, Helm, ECS.
+Pin to `mysql:8.0` in every manifest: `docker run`, Docker Compose, Kubernetes, Helm, ECS. Running MySQL 8.4 with workarounds (removing the flag and switching to `caching_sha2_password`) is not a supported configuration.

TIP: The same principle applies to PostgreSQL. Pin `postgres:16` rather than `postgres:latest`.
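
The pin is a one-line change per manifest. As an illustration, a minimal Docker Compose service pinned to `mysql:8.0` might look like the following (the service name, credentials, and volume are placeholders, not taken from this PR):

[source,yaml]
----
services:
  db:
    image: mysql:8.0  # pinned; `mysql:8` now floats to 8.4 and crash-loops
    command: --default-authentication-plugin=mysql_native_password
    environment:
      MYSQL_DATABASE: ai_service
      MYSQL_ROOT_PASSWORD: change-me
    volumes:
      - db-data:/var/lib/mysql
volumes:
  db-data:
----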
modules/ROOT/pages/tinymceai-on-premises-getting-started.adoc (+39 -0)
@@ -426,3 +426,42 @@ data: {}
If the stream emits `event: error`, inspect the `data` payload. Provider errors (invalid API key, IAM denial, model unavailable) ride inside the SSE response. The HTTP status stays 200. See the xref:tinymceai-on-premises-troubleshooting.adoc[LLM provider errors] section in the Troubleshooting guide for details.

A successful round-trip confirms: container health, database connectivity, Redis connectivity, JWT signing, JWT verification, permissions checking, environment registration, LLM provider authentication, and SSE streaming. If problems persist after these checks, focus on the editor configuration next.
+
+== Updating configuration
+
+IMPORTANT: `docker compose restart` after `.env` changes silently keeps the old environment values. The restart preserves the container and does not re-read `.env`. Always use `docker compose up -d --force-recreate` instead.
+
+[source,bash]
+----
+docker compose up -d --force-recreate
+# Or recreate only the AI service:
+docker compose up -d --force-recreate ai-service
+----
+
+For Kubernetes, update the Secret and trigger a rollout restart:
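A typical sequence, assuming a hypothetical `ai-service` Deployment and an `ai-service-env` Secret (both names are placeholders, not from this page):

[source,bash]
----
# Re-create the Secret from the updated env file
kubectl create secret generic ai-service-env \
  --from-env-file=.env --dry-run=client -o yaml | kubectl apply -f -

# Roll the pods so they pick up the new values
kubectl rollout restart deployment/ai-service
kubectl rollout status deployment/ai-service
----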
modules/ROOT/pages/tinymceai-on-premises-production.adoc (+62 -0)
@@ -83,6 +83,41 @@ When deploying for the first time or upgrading to a new version, start a single
+
+== Podman deployment
+
+The AI service works with Podman as an alternative to Docker. In Podman, containers within a pod share a network namespace, so use `127.0.0.1` instead of container names for hostnames.
+
+IMPORTANT: Pin to `mysql:8.0`. The `mysql:8` tag floats to MySQL 8.4, which removes the `default-authentication-plugin` flag and causes a crash loop. See xref:tinymceai-on-premises-database.adoc[Database, Redis, and storage] for details.
+
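The shared-network-namespace model can be sketched as follows. The pod name, images, ports, and environment variable names here are illustrative assumptions, not taken from this page:

[source,bash]
----
# One pod; members share a network namespace, so they reach
# each other on 127.0.0.1 rather than by container name.
podman pod create --name ai-stack -p 8080:8080

podman run -d --pod ai-stack --name db mysql:8.0
podman run -d --pod ai-stack --name redis redis:7
podman run -d --pod ai-stack --name ai-service \
  -e DATABASE_HOST=127.0.0.1 \
  -e REDIS_HOST=127.0.0.1 \
  your-registry/ai-service:latest
----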
== Kubernetes deployment

=== Namespace and image pull secret
@@ -562,6 +597,33 @@ License keys are per-deployment, not per-replica. One key covers any number of r
+
+== Performance characteristics
+
+[cols="1,1",options="header"]
+|===
+|Metric |Typical value
+
+|Cold start
+|Approximately 3 seconds
+
+|Health check response
+|Less than 10 ms
+
+|Token validation
+|Less than 5 ms
+
+|Time to first token (LLM)
+|200 ms to 2 s (depends on provider and model)
+
+|Memory per instance
+|256 to 512 MB
+
+|Concurrent connections
+|1,000{plus} per instance
+|===
+
+These values are approximate and vary with hardware, provider latency, and prompt complexity. The LLM provider's rate limits are typically the binding constraint before the AI service becomes one.
modules/ROOT/pages/tinymceai-on-premises-providers.adoc (+33 -1)
@@ -319,7 +319,7 @@ Azure-hosted OpenAI models. Requires an Azure subscription, an Azure OpenAI reso
|`type` |Yes |Literal `"azure"`
|`resourceName` |Yes |The `*.openai.azure.com` prefix only, not the full URL.
|`apiKeys` |Yes |Array. Azure issues two keys per resource for zero-downtime key rotation.
-|`apiVersion` |Yes |Always set explicitly. Refer to https://learn.microsoft.com/azure/ai-services/openai/reference[Microsoft's API version matrix] for current values.
+|`apiVersion` |Yes |Always set explicitly. Omitting it produces a confusing SDK error about a missing query string parameter. Refer to https://learn.microsoft.com/azure/ai-services/openai/reference[Microsoft's API version matrix] for current stable values.
|===

IMPORTANT: The ``MODELS[].id`` value must match the Azure *deployment name* exactly. A mismatch produces a `DeploymentNotFound` error. Use human-readable deployment names because the ID also appears in JWT permission strings and the editor model picker.
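
Put together, the fields in the table map to a provider entry shaped roughly like this. This is a sketch assembled from the table above; the resource name, keys, and version value are placeholders:

[source,json]
----
{
  "type": "azure",
  "resourceName": "my-company-openai",
  "apiKeys": ["<key-1>", "<key-2>"],
  "apiVersion": "2024-06-01"
}
----

Remember that each entry in `MODELS` must then use the Azure deployment name as its `id`.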
@@ -682,6 +682,17 @@ Ollama listens on `127.0.0.1:11434` by default, which is unreachable from inside
OLLAMA_HOST=0.0.0.0:11434 ollama serve
----

+
+On Linux with systemd, create an override file instead:
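A minimal override, assuming the standard `ollama.service` unit and the usual systemd drop-in path (the override body itself is not shown in this excerpt):

[source,ini]
----
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
----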
+Then reload and restart: `sudo systemctl daemon-reload && sudo systemctl restart ollama`.
+
On Linux, add the host gateway so `host.docker.internal` resolves:

[source,yaml]
@@ -695,6 +706,27 @@ services:

If Ollama returns "does not support tools", the model was built from a raw GGUF without a chat template. Use `ollama pull` for a Library model that includes a proper Modelfile, or author a custom one.
+
+.Custom Modelfile example
+[%collapsible]
+====
+[source]
+----
+FROM /path/to/your-model.gguf
+
+TEMPLATE """{{ if .System }}<|im_start|>system
+{{ .System }}<|im_end|>
+{{ end }}{{ range .Messages }}<|im_start|>{{ .Role }}
+{{ .Content }}<|im_end|>
+{{ end }}<|im_start|>assistant
+"""
+
+PARAMETER stop "<|im_end|>"
+PARAMETER stop "<|im_start|>"
+----
+
+The exact template depends on the base model. Check the model card for the recommended chat template. Verify tool support with `ollama show <model>` before connecting to the AI service.
+====
+
The reasoning toggle (`capabilities.reasoning: true`) is cosmetic for Ollama-backed models; the openai-compatible adapter does not translate it to the native Ollama API.
modules/ROOT/pages/tinymceai-on-premises-reference.adoc (+12 -1)
@@ -170,8 +170,19 @@ Error codes returned in HTTP 4xx responses and inside SSE `event: error` payload
[cols="1,1,3",options="header"]
|===
|Limit |Value |Notes
-|Maximum prompt length |100,000 characters |Hard limit enforced by the service. Summarize or shorten source content before it exceeds this threshold.
+
+|Maximum prompt length |100,000 characters |Hard limit enforced by the service. Requests exceeding this return `invalid-request-data`. Summarize or shorten source content before it exceeds this threshold.
+|Conversation create |Client-supplied `id` required |The plugin auto-generates `tiny-<uuid>`. Raw API callers must supply a unique `id` in the create body.
+|Stream-abort recovery |Stop button leaves stale state |The next message returns `409 conversation in use` then `404 conversation does not exist`. Recovery: start a new conversation or reload.
+|Built-in rate limiting |None |Front the service with nginx `limit_req` or ALB rate-limit rules. See xref:tinymceai-on-premises-production.adoc#rate-limiting[Rate limiting].
|File support (OpenAI-compatible providers) |Images only (`image/*`) |PDFs, text, and Office files are not forwarded to OpenAI-compatible providers. Use a non-OpenAI-compatible provider for non-image file attachments.
|MCP tool availability |Conversations only |MCP tools are not available in reviews or quick actions.
|MCP authentication |Single shared token per server |The `headers` field in `MCP_SERVERS` is fixed at deploy time. Per-user authentication is not supported.
+|PostgreSQL default schema |`cs-on-premises` (with hyphen) |Pre-create with `CREATE SCHEMA "cs-on-premises";` or set `DATABASE_SCHEMA=public`.
+|`/v1/models/\{compatibilityVersion}` |Only accepts `1` |Values such as `v1`, `v2`, or `latest` return 500.
+|Environment creation through raw API |Not supported |Always create environments through the Management Panel UI.
+|Bedrock credentials |Inline only |The SDK default credential chain (IRSA, instance roles, `AWS_PROFILE`) is not used.
+|Vertex credentials |Inline only |Application Default Credentials, `GOOGLE_APPLICATION_CREDENTIALS`, and the metadata server are not used.
+|Azure `MODELS[].id` |Must equal deployment name |There is no separate `deploymentName` field. The ID is the deployment name.
+|OpenAI-compatible `baseUrl` |Must include `/v1` suffix |Omitting it produces a "Not Found" SSE error.
Capabilities matrix on the overview page:

|OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Google Vertex AI, or any self-hosted OpenAI-compatible endpoint. Multiple providers can coexist.
+
+|MCP integration
+|Connect internal tools, databases, and knowledge bases through Model Context Protocol over Streamable HTTP transport.
+
+|Web scraping and web search
+|Pluggable endpoints for fetching web pages and running searches.
+
+|Multi-tenant environments
+|Isolated conversation history and per-tenant access keys through Environments.
+
+|Per-user, per-feature permissions
+|Fine-grained control through the `auth.ai.permissions` JWT claim.
+
+|Streaming responses
+|Server-Sent Events from the LLM back to the browser.
+
+|File attachments
+|Database, filesystem, Amazon S3, or Azure Blob Storage.
+
+|Observability
+|Structured request logs, OpenTelemetry, and Langfuse. All three run as independent simultaneous pipelines.
+
+|Horizontal scaling
+|The service is stateless. Share identical environment configuration across replicas.