Commit 803e877

docs: clean avatar and tts documentation
1 parent dd1f82d commit 803e877

2 files changed

Lines changed: 53 additions & 54 deletions

File tree

AVATAR.md

Lines changed: 44 additions & 43 deletions
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,6 @@
1-
# Dynamic Agent Avatar System
1+
# Avatar And Character Expression System
22

3-
Internal developer documentation for the OpenRoom / VibeApps avatar and character expression system.
3+
Maintained developer documentation for OpenRoom / VibeApps avatar rendering, character expression media, and related app avatar usage.
44

55
> **Note:** There is no single file named `Avatar.tsx` or `avatar.ts`. Avatar rendering is distributed across multiple components and modules. This document maps the complete landscape.
66
@@ -44,7 +44,7 @@ The avatar system has two independent tracks:
4444
│ │ (CharacterMetaInfo) │ │ or vibeContainerMock.ts │ │
4545
│ └─────────────────────────────┘ └─────────────────────────────────┘ │
4646
│ │
47-
│ CharacterPanel.tsx ──→ Edit character meta_info (images/videos/URLs)
47+
│ CharacterPanel.tsx ──→ Edit character meta_info (URLs or local uploads)
4848
│ │
4949
└─────────────────────────────────────────────────────────────────────────┘
5050
```
@@ -56,7 +56,7 @@ The avatar system has two independent tracks:
5656
### Type Definitions
5757

5858
```typescript
59-
// /home/niya/github/OpenRoom/apps/webuiapps/src/lib/characterManager.ts
59+
// apps/webuiapps/src/lib/characterManager.ts
6060

6161
export const CHARACTER_EMOTION_LIST = [
6262
'default',
@@ -115,7 +115,7 @@ All file operations go through the `@/lib` unified file API per project conventi
115115

116116
## Main Avatar Component: CharacterAvatar
117117

118-
**File:** `/home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/ChatSubComponents.tsx` (lines 92-198)
118+
**File:** `apps/webuiapps/src/components/ChatPanel/ChatSubComponents.tsx` (lines 92-198)
119119

120120
The `CharacterAvatar` is a `memo()`-wrapped React component rendered inside the ChatPanel's left 280px column (`.avatarSide`).
121121

@@ -179,7 +179,7 @@ export const CharacterAvatar: React.FC<{
179179
### CSS Layout
180180

181181
```scss
182-
// /home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/index.module.scss
182+
// apps/webuiapps/src/components/ChatPanel/index.module.scss
183183

184184
.avatarSide {
185185
width: 280px;
@@ -206,7 +206,7 @@ export const CharacterAvatar: React.FC<{
206206

207207
## Emotion Resolution Algorithm
208208

209-
**File:** `/home/niya/github/OpenRoom/apps/webuiapps/src/lib/characterManager.ts` (lines 316-368)
209+
**File:** `apps/webuiapps/src/lib/characterManager.ts` (lines 316-368)
210210

211211
### `resolveEmotionMedia(config, emotion?)`
212212

@@ -249,9 +249,9 @@ export function clearEmotionVideoCache(characterId?: string): void {
249249
## Emotion Triggering Flow
250250

251251
**Files:**
252-
- `/home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/useConversationEngine.ts`
253-
- `/home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/toolDefinitions.ts`
254-
- `/home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/index.tsx`
252+
- `apps/webuiapps/src/components/ChatPanel/useConversationEngine.ts`
253+
- `apps/webuiapps/src/components/ChatPanel/toolDefinitions.ts`
254+
- `apps/webuiapps/src/components/ChatPanel/index.tsx`
255255

256256
### Sequence Diagram
257257

@@ -322,7 +322,7 @@ LLM ──→ respond_to_user tool call
322322

323323
## User Avatars (vibeInfo)
324324

325-
**File:** `/home/niya/github/OpenRoom/apps/webuiapps/src/lib/vibeInfo.ts`
325+
**File:** `apps/webuiapps/src/lib/vibeInfo.ts`
326326

327327
The `vibeInfo` module wraps three info queries from `@gui/vibe-container`:
328328
- `getUserInfo()` → `UserInfoResponse`
@@ -332,7 +332,7 @@ The `vibeInfo` module wraps three info queries from `@gui/vibe-container`:
332332
### vibe-container Types
333333

334334
```typescript
335-
// /home/niya/github/OpenRoom/packages/vibe-container/src/types/index.ts
335+
// packages/vibe-container/src/types/index.ts
336336

337337
export interface UserInfoResponse {
338338
userId: number;
@@ -352,7 +352,7 @@ export interface CharacterInfoResponse {
352352
### Mock Types (Standalone Mode)
353353

354354
```typescript
355-
// /home/niya/github/OpenRoom/apps/webuiapps/src/lib/vibeContainerMock.ts
355+
// apps/webuiapps/src/lib/vibeContainerMock.ts
356356

357357
export interface UserInfoResponse {
358358
user_id?: string;
@@ -385,7 +385,7 @@ getCharacterInfo: () => Promise.resolve({ character_id: 'assistant', name: 'Assi
385385

386386
## CharacterPanel (Settings UI)
387387

388-
**File:** `/home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/CharacterPanel.tsx`
388+
**File:** `apps/webuiapps/src/components/ChatPanel/CharacterPanel.tsx`
389389

390390
The character settings modal has two views: **List** and **Editor**.
391391

@@ -401,17 +401,18 @@ The character settings modal has two views: **List** and **Editor**.
401401

402402
- 160px preview of `base_image_url`
403403
- Name, Gender, Persona Description fields
404-
- **Default Avatar** field — paste a single image URL
405-
- **Emotions & Expressions** — per-emotion image/video URL fields
406-
- Users manually paste URLs; no image generation
407-
- Video detection by extension regex: `/\.(mp4|webm|mov|ogg)(\?|$)/i`
408-
- Thumbnail preview loops muted video or shows static image
404+
- **Default Avatar** field — paste an image URL or upload a local image
405+
- **Emotions & Expressions** — per-emotion image/video URL fields plus local upload slots
406+
- Users can paste external URLs or upload local assets through `ImageUploader`
407+
- Local uploads are stored under `/characters/{characterId}/emotions/` via `characterAssetUpload.ts`
408+
- Video detection is centralized in `isVideoAssetUrl()` and supports local uploaded paths as well as external URLs
409+
- Thumbnail previews loop muted videos or show static images
409410
- Add/remove custom emotions
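
As a rough illustration of the extension-based video detection described in the list above, the following sketch reuses the regex quoted in the removed line. The name `looksLikeVideoAsset` is hypothetical; the actual `isVideoAssetUrl()` implementation is not shown in this diff and may differ.

```typescript
// Hypothetical helper for illustration only; the real isVideoAssetUrl()
// is not shown in this diff and may behave differently.
// The pattern is the one quoted in the previous revision of this document.
const VIDEO_EXTENSION_PATTERN = /\.(mp4|webm|mov|ogg)(\?|$)/i;

// Returns true when the path or URL ends in a known video extension,
// optionally followed by a query string. This works for both external URLs
// and local paths such as /characters/{characterId}/emotions/<file>.
function looksLikeVideoAsset(pathOrUrl: string): boolean {
  return VIDEO_EXTENSION_PATTERN.test(pathOrUrl.trim());
}
```

Because the check only inspects the string, the same function can serve pasted URLs and uploaded asset paths; it does not verify that the file actually contains video data.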
410411

411412
### Editor Data Flow
412413

413414
```
414-
User pastes URL → emotionImages[emotion] = url
415+
User pastes URL or uploads asset → emotionImages[emotion] or emotionVideos[emotion] = path/url
415416
416417
417418
On Save:
@@ -421,15 +422,15 @@ User pastes URL → emotionImages[emotion] = url
421422
- Persist via saveCharacterCollection() → /api/characters + localStorage
422423
```
423424

424-
**Note:** The editor's single input field per emotion populates `emotionImages`. There is no separate UI for `emotion_videos` arrays; the editor treats the input as either an image or a single video URL. To configure multiple videos per emotion, external tooling or manual JSON editing is required.
425+
**Note:** The editor supports one configured asset per emotion in the UI. Image assets are saved in `emotion_images`; video assets are saved as single-entry arrays in `emotion_videos`. The underlying `emotion_videos: Record<string, string[]>` model still supports multiple videos per emotion, but the editor does not expose multi-video configuration.
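
The save behavior in the note above could look roughly like the sketch below. The `EmotionMedia` shape, the `assignEmotionAsset` name, and the choice to clear the other field for the same emotion are assumptions made for illustration; only the `emotion_videos: Record<string, string[]>` type comes from this document.

```typescript
// Assumed shape: emotion_videos matches the type named in this document;
// the emotion_images value type is an assumption.
interface EmotionMedia {
  emotion_images: Record<string, string>;
  emotion_videos: Record<string, string[]>;
}

// Hypothetical helper: stores one asset per emotion, as an image string or
// as a single-entry video array. Returns a new object; the input is not mutated.
function assignEmotionAsset(
  media: EmotionMedia,
  emotion: string,
  pathOrUrl: string,
): EmotionMedia {
  const isVideo = /\.(mp4|webm|mov|ogg)(\?|$)/i.test(pathOrUrl);
  const images = { ...media.emotion_images };
  const videos = { ...media.emotion_videos };
  if (isVideo) {
    videos[emotion] = [pathOrUrl]; // single-entry array
    delete images[emotion]; // assumption: only one asset kind per emotion
  } else {
    images[emotion] = pathOrUrl;
    delete videos[emotion]; // assumption: only one asset kind per emotion
  }
  return { emotion_images: images, emotion_videos: videos };
}
```

Keeping videos as an array even for a single entry leaves the multi-video data model intact for JSON edited outside the UI.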
425426

426427
---
427428

428429
## Image Generation (Separate System)
429430

430431
**Files:**
431-
- `/home/niya/github/OpenRoom/apps/webuiapps/src/lib/imageGenTools.ts`
432-
- `/home/niya/github/OpenRoom/apps/webuiapps/src/lib/imageGenClient.ts`
432+
- `apps/webuiapps/src/lib/imageGenTools.ts`
433+
- `apps/webuiapps/src/lib/imageGenClient.ts`
433434

434435
The `generate_image` LLM tool is **completely separate** from the avatar system.
435436

@@ -457,8 +458,8 @@ LLM calls generate_image(prompt)
457458

458459
| Aspect | Avatar System | Image Generation |
459460
|--------|--------------|------------------|
460-
| Source | User-pasted URLs in CharacterPanel | LLM-triggered text-to-image API |
461-
| Storage | Character JSON (URLs only) | Binary files on disk (`generated-images/`) |
461+
| Source | User-pasted URLs or local uploads in CharacterPanel | LLM-triggered text-to-image API |
462+
| Storage | Character JSON with external URLs or `/characters/...` asset paths | Binary files on disk (`generated-images/`) |
462463
| Display | `CharacterAvatar` in ChatPanel left panel | Inline in chat messages |
463464
| URLs | External CDN URLs | `/api/session-data?path=...` or `data:` URLs |
464465
| Blob/ObjectURL usage | None | None |
@@ -469,7 +470,7 @@ LLM calls generate_image(prompt)
469470

470471
### Twitter App
471472

472-
**File:** `/home/niya/github/OpenRoom/apps/webuiapps/src/pages/Twitter/index.tsx`
473+
**File:** `apps/webuiapps/src/pages/Twitter/index.tsx`
473474

474475
```typescript
475476
interface AvatarProps {
@@ -492,7 +493,7 @@ const Avatar: React.FC<AvatarProps> = ({ name, avatarUrl, className }) => {
492493

493494
### Chess App
494495

495-
**File:** `/home/niya/github/OpenRoom/apps/webuiapps/src/pages/Chess/index.tsx`
496+
**File:** `apps/webuiapps/src/pages/Chess/index.tsx`
496497

497498
```tsx
498499
<div className={styles.avatarWrap}>
@@ -513,20 +514,20 @@ const Avatar: React.FC<AvatarProps> = ({ name, avatarUrl, className }) => {
513514

514515
| File | Purpose |
515516
|------|---------|
516-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/ChatSubComponents.tsx` | `CharacterAvatar` component with crossfade logic |
517-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/index.tsx` | ChatPanel parent; holds `currentEmotion` state, renders `CharacterAvatar` |
518-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/index.module.scss` | `.avatarSide`, `.avatarImage`, `.avatarPlaceholder` styles |
519-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/CharacterPanel.tsx` | Character list & editor UI |
520-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/useConversationEngine.ts` | Emotion triggering via `respond_to_user` tool |
521-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/components/ChatPanel/toolDefinitions.ts` | LLM tool schema for `respond_to_user` with emotion param |
522-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/lib/characterManager.ts` | `CharacterConfig`, `CharacterMetaInfo`, `resolveEmotionMedia()`, persistence |
523-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/lib/vibeInfo.ts` | `useVibeInfo()` hook wrapping container info queries |
524-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/lib/vibeContainerMock.ts` | Mock `getUserInfo()` / `getCharacterInfo()` returning no avatars |
525-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/lib/imageGenTools.ts` | `generate_image` tool — separate from avatars |
526-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/lib/imageGenClient.ts` | Generic text-to-image API client |
527-
| `/home/niya/github/OpenRoom/packages/vibe-container/src/types/index.ts` | Real SDK types: `UserInfoResponse.avatarUrl`, `CharacterInfoResponse.avatarUrl` |
528-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/pages/Twitter/index.tsx` | Twitter `Avatar` component using `avatarUrl` |
529-
| `/home/niya/github/OpenRoom/apps/webuiapps/src/pages/Chess/index.tsx` | Chess avatar using `characterInfo.avatarUrl` |
517+
| `apps/webuiapps/src/components/ChatPanel/ChatSubComponents.tsx` | `CharacterAvatar` component with crossfade logic |
518+
| `apps/webuiapps/src/components/ChatPanel/index.tsx` | ChatPanel parent; holds `currentEmotion` state, renders `CharacterAvatar` |
519+
| `apps/webuiapps/src/components/ChatPanel/index.module.scss` | `.avatarSide`, `.avatarImage`, `.avatarPlaceholder` styles |
520+
| `apps/webuiapps/src/components/ChatPanel/CharacterPanel.tsx` | Character list & editor UI |
521+
| `apps/webuiapps/src/components/ChatPanel/useConversationEngine.ts` | Emotion triggering via `respond_to_user` tool |
522+
| `apps/webuiapps/src/components/ChatPanel/toolDefinitions.ts` | LLM tool schema for `respond_to_user` with emotion param |
523+
| `apps/webuiapps/src/lib/characterManager.ts` | `CharacterConfig`, `CharacterMetaInfo`, `resolveEmotionMedia()`, persistence |
524+
| `apps/webuiapps/src/lib/vibeInfo.ts` | `useVibeInfo()` hook wrapping container info queries |
525+
| `apps/webuiapps/src/lib/vibeContainerMock.ts` | Mock `getUserInfo()` / `getCharacterInfo()` returning no avatars |
526+
| `apps/webuiapps/src/lib/imageGenTools.ts` | `generate_image` tool — separate from avatars |
527+
| `apps/webuiapps/src/lib/imageGenClient.ts` | Generic text-to-image API client |
528+
| `packages/vibe-container/src/types/index.ts` | Real SDK types: `UserInfoResponse.avatarUrl`, `CharacterInfoResponse.avatarUrl` |
529+
| `apps/webuiapps/src/pages/Twitter/index.tsx` | Twitter `Avatar` component using `avatarUrl` |
530+
| `apps/webuiapps/src/pages/Chess/index.tsx` | Chess avatar using `characterInfo.avatarUrl` |
530531

531532
---
532533

@@ -549,7 +550,7 @@ These were likely carried over from a different character system (Talkie export
549550

550551
## Important Implementation Notes
551552

552-
1. **No blob URLs or `URL.createObjectURL`** anywhere in the avatar or image generation codebase. All media is loaded via direct URL strings.
553+
1. **No blob URLs or `URL.createObjectURL`** anywhere in the avatar or image generation codebase. External media is loaded via direct URL strings; local character uploads are loaded through `/api/session-data?path=...` URLs built by `diskStorage.buildFileUrl()`.
553554

554555
2. **Video loop behavior:** Idle/default videos loop (`loop={true}`); emotion videos play once and trigger `onEmotionEnd` when finished.
555556

@@ -559,6 +560,6 @@ These were likely carried over from a different character system (Talkie export
559560

560561
5. **Mock vs Real SDK type mismatch:** Standalone mode mock uses `avatar?: string`; production SDK uses `avatarUrl: string`. Apps should be defensive when reading avatar fields from `useVibeInfo()`.
561562

562-
6. **CharacterPanel editor limitation:** The UI only allows one URL per emotion. The underlying data model (`emotion_videos: Record<string, string[]>`) supports arrays, but the editor does not expose multi-video configuration.
563+
6. **CharacterPanel editor limitation:** The UI only allows one configured asset per emotion. The underlying data model (`emotion_videos: Record<string, string[]>`) supports arrays, but the editor does not expose multi-video configuration.
563564

564565
7. **All file operations** for character persistence use the `@/lib` unified file API (`/api/characters`), not direct IndexedDB access.
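
For note 5 above, one defensive way to read an avatar from either response shape is sketched below. The `readAvatarUrl` helper and the `AvatarSource` type are hypothetical and not part of the codebase; only the field names `avatarUrl` (production SDK) and `avatar` (standalone mock) come from this document.

```typescript
// Hypothetical input type covering both shapes mentioned in note 5:
// the production SDK's `avatarUrl` and the standalone mock's `avatar`.
type AvatarSource = { avatarUrl?: unknown; avatar?: unknown } | null | undefined;

// Returns the first non-empty string found, preferring the production field.
// Returns undefined when no usable avatar is present, so callers can fall
// back to a placeholder.
function readAvatarUrl(info: AvatarSource): string | undefined {
  if (!info) return undefined;
  for (const candidate of [info.avatarUrl, info.avatar]) {
    if (typeof candidate === "string" && candidate.trim() !== "") {
      return candidate;
    }
  }
  return undefined;
}
```

A caller could pass whatever `useVibeInfo()` yields for the user or character info and render a placeholder when the result is `undefined`.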

TTS.md

Lines changed: 9 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,23 +1,21 @@
1-
# Text-to-Speech (TTS) Capability Analysis
1+
# Text-to-Speech (TTS) Integration Notes
22

33
**Project:** OpenRoom / VibeApps
4-
**Analysis Date:** 2026-05-12
5-
**Scope:** Exhaustive codebase audit for TTS, speech synthesis, voice, and audio playback systems
6-
**Analyst:** Automated codebase scan
4+
**Scope:** Current TTS, speech synthesis, voice, and audio playback integration points
75

86
---
97

10-
## Executive Summary
8+
## Summary
119

12-
**This codebase has ZERO TTS integration.**
10+
OpenRoom / VibeApps does not currently include a TTS integration.
1311

14-
After reading every core file, there is absolutely no TTS, speech synthesis, audio output, or voice integration. The codebase is entirely text-and-image oriented for AI character interaction. All audio-related code is limited to:
12+
The current AI character interaction flow is text-and-image oriented. Audio-related code is limited to:
1513

1614
1. A music player app that streams MP3s via `HTMLAudioElement`
1715
2. Muted video elements for character avatars and live wallpaper
18-
3. Generic binary file storage capable of holding arbitrary bytes (currently used only for images)
16+
3. Generic binary file storage capable of holding arbitrary bytes, used today for generated images and uploaded character assets
1917

20-
The codebase is from MiniMax-AI (GitHub org, license, author fields). MiniMax as a company likely offers TTS APIs, but this open-source codebase does NOT integrate them. The MiniMax provider is configured only for text chat via the Anthropic-compatible endpoint.
18+
MiniMax may offer TTS APIs separately, but this codebase does not integrate them. The MiniMax provider is configured only for text chat via the Anthropic-compatible endpoint.
2119

2220
---
2321

@@ -217,7 +215,7 @@ export async function putBinaryFile(
217215
}
218216
```
219217

220-
**Current Usage:** Only called from `imageGenTools.ts` to save generated images.
218+
**Current Usage:** Called from `imageGenTools.ts` to save generated images and from `characterAssetUpload.ts` to save uploaded character images/videos.
221219

222220
**Relevance to TTS:** This is the **exact storage mechanism** a TTS system would use to save generated audio files (e.g., `audio/mp3`, `audio/wav`). No code changes needed to the storage layer — it is already capable of persisting arbitrary binary data.
223221

@@ -549,4 +547,4 @@ The absence of TTS is a deliberate product gap, not a technical limitation. All
549547

550548
---
551549

552-
*Document generated via exhaustive automated codebase analysis. All negative findings verified by direct file inspection and grep search.*
550+
These notes should be updated when a TTS provider, character voice metadata, or chat audio playback UI is added.
