Give AI a real-time interactive digital body
Browser-native 3D digital human engine with voice, vision, and dialogue capabilities.
Zero-config · Offline-ready · Production-grade
Quick Start · Features · Performance · Architecture · Documentation · Changelog · 中文
Experience a fully interactive 3D digital human right in your browser. No installation or API keys required!
3D Avatar with emotion-driven expressions and real-time dialogue
- Node.js ≥ 18
- npm ≥ 9
```bash
# Clone and install
git clone https://github.com/LessUp/meta-human.git
cd meta-human
npm install

# Start development server
npm run dev
```

Open http://localhost:5173 — your 3D avatar is ready!
💡 No API key required. The engine automatically falls back to local mock mode for out-of-the-box demos.
```ts
import { digitalHumanEngine } from './core/avatar';

digitalHumanEngine.perform({
  emotion: 'happy',
  expression: 'smile',
  animation: 'wave',
});
```

Note: The project uses Vite path aliases. See Path Aliases for configuration.
| Feature | Description |
|---|---|
| TTS | Edge TTS for natural voice synthesis |
| ASR | Browser-native speech recognition |
| Smart Muting | Auto-pause TTS when user speaks |
| Voice Detection | Visual feedback during recording |
```ts
import { ttsService, asrService } from './core/audio';

await ttsService.speak('Hello! How can I help?');
asrService.start({
  onResult: (text) => dialogueService.send(text),
});
```

| Feature | Description |
|---|---|
| Multi-Modal Response | Returns { replyText, emotion, action } |
| Streaming | Real-time token-by-token via SSE |
| Graceful Degradation | Falls back to local mock when API unavailable |
| Session Management | Persistent conversation context |
```ts
import { dialogueService } from './core/dialogue';

const response = await dialogueService.send({
  text: 'Tell me a joke',
  sessionId: 'user-123',
});
// → { replyText: '...', emotion: 'happy', action: 'laugh' }
```

| Feature | Description |
|---|---|
| Face Mesh | 468 landmarks for micro-expression detection |
| Pose Estimation | Upper body gesture recognition |
| Emotion Mapping | Real-time emotion inference |
| Privacy First | All processing in browser, no data leaves client |
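The emotion-mapping step can be illustrated with a small pure function. This is a hedged sketch, not the project's actual `visionMapper` implementation: the landmark names, thresholds, and heuristic are assumptions, chosen only to show how face-mesh geometry can be reduced to a coarse emotion label entirely in the browser.

```typescript
// Hypothetical sketch of landmark → emotion mapping (not the real visionMapper).
// Assumes landmarks are normalized to [0, 1] image coordinates, y growing downward.
interface Landmark { x: number; y: number; }

interface MouthLandmarks {
  leftCorner: Landmark;
  rightCorner: Landmark;
  upperLip: Landmark;
  lowerLip: Landmark;
}

type Emotion = 'happy' | 'surprised' | 'neutral';

function mapMouthToEmotion(m: MouthLandmarks): Emotion {
  // Mouth openness: vertical gap between upper and lower lip.
  const openness = m.lowerLip.y - m.upperLip.y;
  // Smile: mouth corners lifted above the lip midline.
  const midY = (m.upperLip.y + m.lowerLip.y) / 2;
  const cornerLift = midY - (m.leftCorner.y + m.rightCorner.y) / 2;

  if (openness > 0.05) return 'surprised';
  if (cornerLift > 0.01) return 'happy';
  return 'neutral';
}
```

The resulting label could then be fed straight into `digitalHumanEngine.perform` to mirror the user's expression.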
Benchmarks measured on typical devices:
| Metric | Desktop | Mobile (Mid-range) | Mobile (Low-end) |
|---|---|---|---|
| Rendering | 60 FPS | 60 FPS | 30 FPS |
| TTS Latency | < 200ms | < 300ms | < 500ms |
| Bundle Size | 180 KB (gzipped) | 180 KB | 180 KB |
| Memory Usage | ~120 MB | ~80 MB | ~60 MB |
Performance automatically scales based on device capabilities. See Performance Module for details.
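The kind of device-based scaling described above can be sketched as a tiered capability check. The tier names and thresholds here are illustrative assumptions, not the project's actual performance module:

```typescript
// Illustrative quality-tier selection (thresholds are assumptions, not the
// project's real heuristics). Inputs mirror navigator.deviceMemory and
// navigator.hardwareConcurrency.
type QualityTier = 'high' | 'medium' | 'low';

interface DeviceCaps {
  memoryGB: number;
  cores: number;
}

function pickQualityTier(caps: DeviceCaps): QualityTier {
  if (caps.memoryGB >= 8 && caps.cores >= 8) return 'high';   // full effects, 60 FPS target
  if (caps.memoryGB >= 4 && caps.cores >= 4) return 'medium'; // reduced effects, 60 FPS target
  return 'low';                                               // 30 FPS cap, as in the table above
}
```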
```
┌─────────────────────────────────────────────────────────────────┐
│                            UI Layer                             │
│       ChatDock · TopHUD · ControlPanel · SettingsDrawer         │
└─────────────────────────────────────────────────────────────────┘
                                │
┌─────────────────────────────────────────────────────────────────┐
│                       Core Engine Layer                         │
│        Avatar · Dialogue · Vision · Audio · Performance         │
└─────────────────────────────────────────────────────────────────┘
                                │
┌─────────────────────────────────────────────────────────────────┐
│                          State Layer                            │
│       chatSessionStore · systemStore · digitalHumanStore        │
└─────────────────────────────────────────────────────────────────┘
                                │
┌─────────────────────────────────────────────────────────────────┐
│                       External Services                         │
│      Three.js · Web Speech API · MediaPipe · OpenAI API         │
└─────────────────────────────────────────────────────────────────┘
```
Three focused domains minimize re-renders:
| Store | Responsibility |
|---|---|
| `chatSessionStore` | Message history, session lifecycle |
| `systemStore` | Connection status, errors, performance metrics |
| `digitalHumanStore` | Avatar runtime state (expression, animation, audio) |
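The split keeps updates in one domain from re-rendering components subscribed to another. As a dependency-free sketch of the pattern (the real stores use Zustand; the state shape shown for `digitalHumanStore` is illustrative):

```typescript
// Minimal subscribe/setState store in the spirit of Zustand (illustrative only).
type Listener<S> = (state: S) => void;

function createStore<S extends object>(initial: S) {
  let state = initial;
  const listeners = new Set<Listener<S>>();
  return {
    getState: () => state,
    setState: (partial: Partial<S>) => {
      state = { ...state, ...partial };
      listeners.forEach((l) => l(state)); // only this store's subscribers fire
    },
    subscribe: (l: Listener<S>) => {
      listeners.add(l);
      return () => listeners.delete(l);
    },
  };
}

// Avatar runtime state, mirroring what digitalHumanStore might hold.
const digitalHumanStore = createStore({
  expression: 'neutral',
  animation: 'idle',
  speaking: false,
});

digitalHumanStore.setState({ expression: 'smile', animation: 'wave' });
```

Because each domain has its own store, a chat message append touches only `chatSessionStore` subscribers, never the 3D viewport.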
```
src/
├── core/                        # Engine modules
│   ├── avatar/                  # 3D rendering & animation
│   │   ├── DigitalHumanEngine.ts  # Unified driver
│   │   └── constants.ts           # Expressions, animations
│   ├── audio/                   # TTS & ASR services
│   ├── dialogue/                # Chat transport & orchestration
│   │   ├── dialogueService.ts     # API client
│   │   ├── dialogueOrchestrator.ts
│   │   └── chatTransport.ts       # HTTP/SSE/WebSocket
│   ├── vision/                  # MediaPipe pipeline
│   │   ├── visionService.ts
│   │   └── visionMapper.ts
│   └── performance/             # Device capability detection
├── components/                  # React components
│   ├── DigitalHumanViewer.tsx   # 3D viewport
│   ├── ChatDock.tsx             # Chat interface
│   ├── TopHUD.tsx               # Status bar
│   ├── ControlPanel.tsx         # Quick controls
│   ├── VoiceInteractionPanel.tsx
│   ├── VisionMirrorPanel.tsx
│   └── ui/                      # Shared primitives
├── store/                       # Zustand stores
│   ├── chatSessionStore.ts
│   ├── systemStore.ts
│   └── digitalHumanStore.ts
├── hooks/                       # Custom hooks
├── pages/                       # Route pages
└── lib/                         # Utilities
```
This project uses Vite path aliases configured in vite.config.ts:
| Alias | Maps to |
|---|---|
| `@core/*` | `src/core/*` |
| `@components/*` | `src/components/*` |
| `@store/*` | `src/store/*` |
| `@hooks/*` | `src/hooks/*` |
| `@lib/*` | `src/lib/*` |
| `@pages/*` | `src/pages/*` |
```bash
npm run build:pages
```

- Set `VITE_API_BASE_URL` in GitHub Repository Variables
- Push to `main` or run the "Deploy Pages" workflow
- Live at: https://lessup.github.io/meta-human/
Use render.yaml blueprint:
```
# Deploys FastAPI backend with:
POST      /v1/chat         # OpenAI-compatible chat
POST      /v1/chat/stream  # SSE streaming
POST      /v1/tts          # Text-to-speech
POST      /v1/asr          # Speech-to-text
WebSocket /ws              # Real-time streaming
```

Required Environment Variables:
| Variable | Description | Required |
|---|---|---|
| `OPENAI_API_KEY` | AI responses | Optional (falls back to mock) |
| `CORS_ALLOW_ORIGINS` | Frontend domain for CORS | Yes |
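A client request against the `/v1/chat` endpoint can be sketched as a small payload builder. The `text`/`sessionId` fields are taken from the dialogue example earlier; any further schema details are assumptions, not a confirmed API contract:

```typescript
// Builds a fetch-ready request for POST /v1/chat. Payload fields are inferred
// from the dialogueService example above, not a documented schema.
interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildChatRequest(baseUrl: string, text: string, sessionId: string): ChatRequest {
  return {
    url: `${baseUrl}/v1/chat`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text, sessionId }),
    },
  };
}

// Usage (in the frontend, with VITE_API_BASE_URL set):
//   const { url, init } = buildChatRequest(import.meta.env.VITE_API_BASE_URL, 'Hi', 'user-123');
//   const res = await fetch(url, init);
```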
```bash
npm run dev            # Start dev server (port 5173)
npm run preview        # Preview production build
npm run preview:https  # Preview with HTTPS
```

```bash
npm run build          # Production build
npm run build:pages    # GitHub Pages build
npm run build:mobile   # Mobile-optimized build
npm run build:desktop  # Desktop-optimized build
npm run build:ar       # AR-enabled build
npm run build:analyze  # Build with bundle analyzer
```

```bash
npm run lint           # ESLint check
npm run lint:fix       # Auto-fix ESLint issues
npm run format         # Prettier formatting
npm run format:check   # Check formatting without writing
npm run typecheck      # TypeScript check
```

```bash
npm run test           # Vitest watch mode
npm run test:run       # Run tests once
npm run test:coverage  # Coverage report
npm run test:ui        # Vitest UI mode
```

| Feature | Chrome | Edge | Firefox | Safari |
|---|---|---|---|---|
| Core Engine | 90+ ✅ | 90+ ✅ | 90+ ✅ | 15+ ✅ |
| TTS (Speech Synthesis) | 90+ ✅ | 90+ ✅ | 90+ ✅ | 15+ ✅ |
| ASR (Speech Recognition) | 90+ ✅ | 90+ ✅ | ❌ Not supported | ❌ Not supported |
| MediaPipe Vision | 90+ ✅ | 90+ ✅ | 90+ ✅ | 15+ ⚠️ |
ASR Limitations: Speech recognition requires Chrome or Edge due to Web Speech API limitations. Firefox and Safari users can use text input instead.
Safari Note: MediaPipe vision features may require enabling experimental features.
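The text-input fallback can hang off a simple feature check. This sketch takes the global object as a parameter (so it also runs outside a browser); Chrome and Edge expose recognition under the `webkit` prefix:

```typescript
// Detects Web Speech API recognition support. Pass `window` in the browser,
// or any object (e.g. a mock) elsewhere.
function supportsSpeechRecognition(win: Record<string, unknown>): boolean {
  return 'SpeechRecognition' in win || 'webkitSpeechRecognition' in win;
}

// Example: pick the input mode once at startup.
function pickInputMode(win: Record<string, unknown>): 'voice' | 'text' {
  return supportsSpeechRecognition(win) ? 'voice' : 'text';
}
```

In Firefox and Safari this yields `'text'`, matching the fallback described above.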
- Quick Start — Get running in 5 minutes
- API Reference — Backend API documentation
- Architecture — System design
- Configuration — Environment variables and settings
- Contributing — Contribution guidelines
- Changelog — Version history
See CHANGELOG.md for released features and GitHub Projects for upcoming work.
- Core 3D avatar rendering
- Voice interaction (TTS/ASR)
- Visual perception (MediaPipe)
- Streaming dialogue
- Mobile AR support
- Custom avatar upload
- Multi-language TTS
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create a feature branch: `git checkout -b feat/amazing-feature`
- Commit changes: `git commit -m 'feat: add amazing feature'`
- Push: `git push origin feat/amazing-feature`
- Open a Pull Request
Follow Conventional Commits.
MIT © LessUp
Built with ❤️ to make digital humans accessible to everyone