170170| ` voice_clone_service.py ` | XTTS v2 with custom presets + streaming synthesis |
171171| ` piper_tts_service.py ` | Piper TTS (CPU) with Dmitri/Irina voices, auto-discovers models dir |
172172| ` stt_service.py ` | Vosk (realtime) + Whisper (batch) STT |
173- | ` multi_bot_manager.py ` | Subprocess manager for multiple Telegram bots |
173+ | ` multi_bot_manager.py ` | Subprocess manager for multiple Telegram bots (auto-start on app launch) |
174174| ` app/services/audio_pipeline.py ` | GSM telephony audio processing (8kHz, PCM16, G.711) |
175175
176176### Admin Panel (Vue 3)
@@ -188,7 +188,7 @@ admin/src/
188188
189189** Location:** ` data/secretary.db `
190190
191- ** Key tables:** ` chat_sessions ` (with ` source ` , ` source_id ` for tracking origin), ` chat_messages ` , ` faq_entries ` , ` tts_presets ` , ` system_config ` , ` telegram_sessions ` , ` audit_log ` , ` cloud_llm_providers ` , ` bot_instances ` , ` widget_instances `
191+ ** Key tables:** ` chat_sessions ` (with ` source ` , ` source_id ` for tracking origin), ` chat_messages ` , ` faq_entries ` , ` tts_presets ` , ` llm_presets ` , ` system_config ` , ` telegram_sessions ` , ` audit_log ` , ` cloud_llm_providers ` , ` bot_instances ` (with ` auto_start ` ) , ` widget_instances `
192192
193193** Redis (optional):** Used for caching with graceful fallback if unavailable.
194194
@@ -211,15 +211,19 @@ db/
211211 ├── preset.py # PresetRepository
212212 ├── config.py # ConfigRepository
213213 ├── telegram.py # TelegramRepository
214+ ├── bot_instance.py # BotInstanceRepository (Telegram bots)
215+ ├── widget_instance.py # WidgetInstanceRepository
216+ ├── cloud_provider.py # CloudProviderRepository
214217 └── audit.py # AuditRepository
215218```
216219
217220## Environment Variables
218221
219222``` bash
220223LLM_BACKEND=vllm # "vllm", "gemini", or "cloud:{provider_id}"
221- VLLM_API_URL=http://localhost:11434
224+ VLLM_API_URL=http://localhost:11434 # Base URL without /v1 suffix (auto-normalized)
222225VLLM_MODEL_NAME=lydia # LoRA adapter name
226+ VLLM_GPU_ID=1 # GPU ID for vLLM Docker container (default: 1)
223227SECRETARY_PERSONA=gulya # "gulya" or "lidia"
224228GEMINI_API_KEY=... # Only for gemini backend
225229ORCHESTRATOR_PORT=8002
@@ -300,7 +304,7 @@ Key patterns:
3003043 . ** GPU memory sharing** — vLLM 50% (~ 6GB) + XTTS ~ 5GB on 12GB GPU
3013054 . ** OpenWebUI Docker** — Use ` 172.17.0.1 ` not ` localhost ` for API URL
3023065 . ** Ruff ignores Cyrillic** — RUF001/002/003 disabled to allow Russian strings in code
303- 6. **Docker + vLLM** — vLLM starts automatically as a container when you switch to it in the admin panel. On first use, pull the image: `docker pull vllm/vllm-openai:latest` (~9GB)
307+ 6. **Docker + vLLM** — vLLM starts automatically as a container when you switch to it in the admin panel. On first use, pull the image: `docker pull vllm/vllm-openai:latest` (~9GB). **Note:** `VLLM_API_URL` is auto-normalized: a trailing `/v1` is stripped (the code adds it internally)
3043087 . ** xray-core for VLESS** — Included in Docker image. For local dev, download to ` ./bin/xray ` :
305309 ``` bash
306310 mkdir -p bin && cd bin
@@ -339,8 +343,9 @@ Supported providers (configured via Admin Panel → LLM → Cloud Providers):
339343| ** Kimi** | — | ` kimi-k2 ` , ` moonshot-v1-128k ` |
340344
341345** Usage in Telegram bots:**
342- - Set ` llm_backend ` in bot config: ` "vllm" ` , ` "gemini" ` , or ` "cloud:{provider_id}" `
346+ - Set ` llm_backend ` in bot config: ` "vllm" ` or ` "cloud:{provider_id}" ` (dynamic dropdown in UI)
343347- Action buttons can override LLM per-mode (e.g., creative mode uses different model)
348+ - LLM dropdown dynamically loads all enabled cloud providers from database
344349
345350** Per-session LLM override in Chat:**
346351- Chat view has an LLM selector dropdown in the header
@@ -407,3 +412,49 @@ GeminiProvider → XrayProxyManagerWithFallback → xray-core (SOCKS5/HTTP) →
407412- Proxy fails → Auto-switch to next proxy (if multiple configured)
408413- All proxies fail → Fallback to direct connection
409414- VLESS server unreachable → SDK timeout, error returned to user
415+
416+ ## Telegram Bot Auto-Start
417+
418+ Telegram bots persist their running state and automatically restart after app/container restart.
419+
420+ ** How it works:**
421+ 1. When a bot is started via the UI → `auto_start=true` is saved in the DB
422+ 2. When a bot is stopped via the UI → `auto_start=false` is saved in the DB
423+ 3. On app startup → all bots with `auto_start=true` are started automatically
424+
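The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the database is SQLite, and the helper names `set_auto_start` and `bots_to_auto_start`, the `id` column, and the demo schema are all hypothetical. Only the `bot_instances` table and its `auto_start` column come from this document.

```python
import sqlite3


def set_auto_start(conn: sqlite3.Connection, bot_id: str, enabled: bool) -> None:
    """Persist the running state: True on UI start, False on UI stop."""
    conn.execute(
        "UPDATE bot_instances SET auto_start = ? WHERE id = ?",
        (int(enabled), bot_id),
    )
    conn.commit()


def bots_to_auto_start(conn: sqlite3.Connection) -> list:
    """Return the ids of bots that should be started on app launch."""
    rows = conn.execute(
        "SELECT id FROM bot_instances WHERE auto_start = 1 ORDER BY id"
    ).fetchall()
    return [row[0] for row in rows]


# Demo against an in-memory database with a hypothetical minimal schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bot_instances (id TEXT PRIMARY KEY, auto_start BOOLEAN DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO bot_instances (id) VALUES (?)", [("bot_a",), ("bot_b",)]
)
set_auto_start(conn, "bot_a", True)  # simulates "bot started via UI"
print(bots_to_auto_start(conn))
```

Storing the flag at start/stop time, rather than inspecting processes at shutdown, means the state survives an unclean exit such as a killed container.
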
425+ ** Startup logs:**
426+ ```
427+ 📱 Auto-started Telegram bot: MyBot
428+ 📱 Auto-started 2/2 Telegram bots
429+ ```
430+
431+ ** Migration for existing databases:**
432+ ``` sql
433+ ALTER TABLE bot_instances ADD COLUMN auto_start BOOLEAN DEFAULT 0;
434+ ```
435+
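If the database is SQLite (an assumption based on the `data/secretary.db` path), running the `ALTER TABLE` statement a second time fails with a duplicate-column error. A guarded version might look like the sketch below; the function name `ensure_auto_start_column` is hypothetical and not part of the project.

```python
import sqlite3


def ensure_auto_start_column(conn: sqlite3.Connection) -> bool:
    """Add bot_instances.auto_start if it is missing.

    Returns True when the column was added, False when it already existed.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(bot_instances)")]
    if "auto_start" in columns:
        return False
    conn.execute(
        "ALTER TABLE bot_instances ADD COLUMN auto_start BOOLEAN DEFAULT 0"
    )
    conn.commit()
    return True
```

Existing rows receive the default value `0`, so previously created bots stay stopped until they are started from the UI again.
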
436+ ## Local Model Discovery
437+
438+ The system automatically discovers downloaded HuggingFace models in ` ~/.cache/huggingface/hub/ ` .
439+
440+ ** Supported model types:**
441+ - Qwen, Llama, DeepSeek, Mistral, Phi, Gemma, Yi
442+
443+ ** Detected quantization formats:**
444+ - AWQ, GPTQ, GGUF, BNB-4bit, EXL2, FP16
445+
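As a rough illustration of how such discovery could work, the sketch below walks the hub cache, whose directories are named `models--{org}--{name}`, and guesses the quantization format from the repository name. Everything here is an assumption rather than the project's implementation: the function names, the name-based detection, the `FP16` fallback, and the key format (inferred from the sample response in this document). The `lora_support` field is omitted because the document does not say how it is determined.

```python
import re
from pathlib import Path

# Markers checked in order against the lowercased repository name.
QUANT_MARKERS = ["AWQ", "GPTQ", "GGUF", "BNB-4bit", "EXL2"]


def repo_id_from_cache_dir(dirname):
    """Map 'models--Org--Name' to 'Org/Name'; return None for other entries."""
    parts = dirname.split("--", 2)
    if parts[0] != "models" or len(parts) < 2:
        return None
    return "/".join(parts[1:])


def guess_quant_type(repo_id):
    """Guess the quantization format from the name (FP16 fallback is assumed)."""
    lowered = repo_id.lower()
    for marker in QUANT_MARKERS:
        if marker.lower() in lowered:
            return marker
    return "FP16"


def model_key(repo_id):
    """Build a key like 'qwen2_5_7b_instruct_awq' from the repository name."""
    name = repo_id.split("/")[-1].lower()
    return re.sub(r"[^a-z0-9]+", "_", name).strip("_")


def discover_models(hub_dir):
    """Return {key: info} for every model directory found under hub_dir."""
    found = {}
    hub_dir = Path(hub_dir)
    if not hub_dir.is_dir():
        return found
    for entry in sorted(hub_dir.iterdir()):
        if not entry.is_dir():
            continue
        repo_id = repo_id_from_cache_dir(entry.name)
        if repo_id is None:
            continue  # skip datasets, lock directories, and similar entries
        found[model_key(repo_id)] = {
            "full_name": repo_id,
            "downloaded": True,
            "quant_type": guess_quant_type(repo_id),
        }
    return found
```

A call such as `discover_models(Path.home() / ".cache" / "huggingface" / "hub")` would scan the default cache location mentioned above.
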
446+ ** API response:**
447+ ``` json
448+ {
449+   "available_models": {
450+     "qwen2_5_7b_instruct_awq": {
451+       "full_name": "Qwen/Qwen2.5-7B-Instruct-AWQ",
452+       "downloaded": true,
453+       "quant_type": "AWQ",
454+       "lora_support": true
455+     }
456+   }
457+ }
458+ ```
459+
460+ The **Models tab** in the admin panel shows all local models with their download status and quantization type.