Status: Updated October 2025 | Current implementation
Last Validated: October 15, 2025 against actual codebase
Architecture Version: 5-Collection Multi-Embedding with Split Captions
This document describes LaunchAgencyBot's 5-collection multi-embedding vector database architecture, which is designed to improve RAG retrieval quality.
What This Document Covers:
- 5 separate ChromaDB collections (1 image + 4 caption types)
- Weighted multi-embedding search strategy with independent weights
- Split caption system for semantic separation and RAG quality
- 768-dimensional CLIP embeddings via Replicate API
- Minimal ChromaDB metadata schema (6 fields only)
- Token address as universal UUID across all systems
Sections in This Document:
- Current Architecture Overview - 5-collection system and 3-tier storage
- Collection Architecture Deep Dive - Collection purposes and weighted search
- Storage Architecture - ChromaDB + file-based hybrid storage
- Split Caption Architecture - 4-part captions for RAG quality
Related Documentation:
- → DATABASE_STORAGE.md - File storage, degradation system, data flow workflows
- → DATABASE_API.md - API integration patterns, best practices, troubleshooting
Context Tags: #database #vector-store #chromadb #rag #embeddings #clip #multi-embedding #captions
LaunchAgencyBot combines a 5-collection multi-embedding ChromaDB vector database with file-based data management to improve RAG retrieval quality.
The system maintains 5 separate ChromaDB collections to store different types of embeddings:
- meme_image_embeddings - CLIP image embeddings from Base64 images
- entity_caption_embeddings - Entity captions (characters, objects, actions) - Weight: 4
- context_caption_embeddings - Context captions (cultural/meme references) - Weight: 3
- visual_caption_embeddings - Visual style captions (aesthetic, composition) - Weight: 2 (optional)
- emotions_caption_embeddings - Emotional tone captions (mood, expressions) - Weight: 1 (optional)
Key Characteristics:
- Single unified vector database - All memecoins exist across all 5 collections (not dual pending/confirmed)
- Token address as universal UUID - Blockchain addresses used consistently across all systems
- Weighted multi-embedding search - Different embeddings contribute with different weights (4/3/2/1)
- Vector dimensions: 768 (OpenAI CLIP embeddings via Replicate API)
- Location: res/memecoins/rag_db/ (ChromaDB persistent storage)
The architecture implements a 3-tier workflow for quality control and processing:
| Tier | Location | Purpose | Storage Format | Status |
|---|---|---|---|---|
| Tier 1: Pending | res/memecoins/pending/ | Manual curation queue | JSON + JPG | Filesystem-only |
| Tier 2: Approved | res/memecoins/approved/ | Ready for AI processing | JSON + JPG | Filesystem-only |
| Tier 3: Vector DB | res/memecoins/rag_db/ | Fully processed with embeddings | ChromaDB + files | 5 collections |
Workflow Stages:
Pending → Approved → Vector DB
↓ ↓ ↓
Delete Delete Degrade (back to Approved)
See Also: DATABASE_STORAGE.md - Complete degradation workflow details
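The tier transitions in the diagram above can be sketched as a small state machine. This is only an illustration of the workflow as described here: the `Tier` enum and the `promote` and `remove` functions are hypothetical names, not identifiers from the codebase, and the actual file moves and ChromaDB updates are omitted.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    VECTOR_DB = "rag_db"


# Forward moves from the diagram: Pending -> Approved -> Vector DB.
PROMOTE = {Tier.PENDING: Tier.APPROVED, Tier.APPROVED: Tier.VECTOR_DB}


def promote(tier: Tier) -> Tier:
    """Move a memecoin one tier forward; the Vector DB is the last tier."""
    if tier not in PROMOTE:
        raise ValueError(f"{tier.name} cannot be promoted further")
    return PROMOTE[tier]


def remove(tier: Tier) -> Optional[Tier]:
    """Removal per the diagram: a Vector DB entry degrades to Approved,
    an entry in either other tier is deleted (represented here as None)."""
    return Tier.APPROVED if tier is Tier.VECTOR_DB else None
```

Modelling "degrade" as a return to `Tier.APPROVED` rather than a deletion mirrors the diagram: a fully processed memecoin keeps its curated JSON + JPG even when its embeddings are dropped.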
- ChromaDB Collections: Store 768-dimensional CLIP embeddings + minimal metadata
- Image Files: res/memecoins/rag_db/images/{token_address}.jpg (512x512px JPG)
- Metadata Files: res/memecoins/rag_db/metadata/{token_address}.json (complete metadata including captions)
See Also: DATABASE_STORAGE.md - File management implementation details
Implementation: src/vector_store/memecoin_store.py:42-47
Each collection serves a specific semantic purpose in the RAG system:
| Collection | Content | Purpose | Weight | Example | Use Case / Rationale |
|---|---|---|---|---|---|
| meme_image_embeddings | CLIP embeddings of Base64 images | Visual similarity | Independent | N/A | Find visually similar memecoins |
| entity_caption_embeddings | Characters, objects, actions | "What is in the image" | 4 (highest) | "Shiba Inu dog wearing space suit, holding rocket, on moon" | Most concrete aspect, critical for matching characters/objects |
| context_caption_embeddings | Cultural/meme references | "What does it mean" | 3 (high) | "Reference to Doge meme, 'to the moon' expression, satirical crypto hype" | Critical for understanding meme context |
| visual_caption_embeddings | Aesthetic style, composition | "How does it look" | 2 (medium) | "Pixel art, neon colors, centered composition, Comic Sans, retro 8-bit" | Style matching (optional if unremarkable) |
| emotions_caption_embeddings | Emotional tone, mood | "How does it feel" | 1 (low) | "Playful and humorous, optimistic, ironic, lighthearted absurdism" | Supplementary (optional if neutral) |
Implementation: src/vector_store/memecoin_store.py:410-494
The system performs weighted multi-embedding search to find the most relevant memecoins:
# Query all 4 caption collections with the same query embedding
entity_results = entity_collection.query(query_embeddings=[query_embedding], n_results=n_results)  # weight: 4
context_results = context_collection.query(query_embeddings=[query_embedding], n_results=n_results)  # weight: 3
visual_results = visual_collection.query(query_embeddings=[query_embedding], n_results=n_results)  # weight: 2
emotions_results = emotions_collection.query(query_embeddings=[query_embedding], n_results=n_results)  # weight: 1
# Combine the per-memecoin similarities with weighted scoring
final_score = (
    (entity_similarity * 4) + (context_similarity * 3)
    + (visual_similarity * 2) + (emotions_similarity * 1)
)
Benefits:
- Precision: Entity and context captions (weights 4+3=7) dominate the search
- Flexibility: Visual and emotion embeddings (weights 2+1=3) provide nuance
- Robustness: Optional embeddings don't penalize memecoins that lack them
- Semantic Separation: Each aspect contributes independently to the final match
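The weighted combination described above can be sketched in plain Python. This is a minimal illustration, not the code in memecoin_store.py: the function names and the input shape are hypothetical, and dividing by the weights of the captions that are actually present is an assumption, shown as one possible way to keep optional captions from penalizing memecoins that lack them.

```python
from typing import Dict, List, Tuple

# Weights taken from the collection table above.
CAPTION_WEIGHTS = {"entity": 4, "context": 3, "visual": 2, "emotions": 1}


def combine_weighted_scores(
    per_collection: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Fuse per-collection similarities into one score per token address.

    per_collection maps a caption type to {token_address: similarity}.
    A token missing from a collection is skipped there, and the weighted
    sum is divided by the weights actually used (assumed scheme).
    """
    weighted_sum: Dict[str, float] = {}
    weight_total: Dict[str, float] = {}
    for caption_type, weight in CAPTION_WEIGHTS.items():
        for address, similarity in per_collection.get(caption_type, {}).items():
            weighted_sum[address] = weighted_sum.get(address, 0.0) + weight * similarity
            weight_total[address] = weight_total.get(address, 0.0) + weight
    return {a: weighted_sum[a] / weight_total[a] for a in weighted_sum}


def top_matches(scores: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n best (address, score) pairs, highest score first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]
```

ChromaDB query results report distances rather than similarities, so a conversion step would be needed before calling a function like this one; which conversion is appropriate depends on the distance metric the collections are configured with.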
ChromaDB Metadata Fields:
{
  "name": "Token display name",
  "ticker": "Token symbol/ticker",
  "uuid": "Token address (Solana mint address - primary key)",
  "tags": "tag1,tag2,tag3 (comma-separated, NOT semicolon)",
  "tags_categories": "category1,category2 (comma-separated subcategories)",
  "created_at": 1234567890
}
Important Notes:
- ✅ created_at is a UNIX timestamp (integer)
- ✅ Tags are comma-separated (not semicolon as in legacy docs)
- ❌ No description field in ChromaDB metadata (stored only in metadata files)
- ❌ No confirmed field (all memecoins in vector DB are implicitly confirmed)
- ✅ Token address (uuid) is the universal identifier across all systems
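As an illustration of the 6-field schema, the sketch below builds the ChromaDB metadata dict from values that live in the full metadata file. The function name `build_chroma_metadata` and its parameters are hypothetical; only the six output keys and the comma-separated tag format come from this document.

```python
from typing import Any, Dict, Iterable


def build_chroma_metadata(
    token_name: str,
    ticker: str,
    token_address: str,
    tags: Iterable[str],
    tags_categories: Iterable[str],
    created_at: int,
) -> Dict[str, Any]:
    """Return the minimal 6-field metadata dict described above."""
    return {
        "name": token_name,
        "ticker": ticker,
        "uuid": token_address,  # token address doubles as the primary key
        "tags": ",".join(tags),  # comma-separated, not semicolon
        "tags_categories": ",".join(tags_categories),
        "created_at": int(created_at),  # UNIX timestamp
    }
```

The description and captions are deliberately absent: per the notes above they are stored only in the metadata JSON file.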
Location: res/memecoins/rag_db/ (defined in src/constants.py:29)
Collections:
- meme_image_embeddings - 768-dim image vectors
- entity_caption_embeddings - 768-dim text vectors (entity)
- context_caption_embeddings - 768-dim text vectors (context)
- visual_caption_embeddings - 768-dim text vectors (visual)
- emotions_caption_embeddings - 768-dim text vectors (emotions)
Characteristics:
- Persistent storage - Data survives application restarts
- Minimal metadata - Only 6 fields stored per memecoin
- Fast similarity search - Optimized for cosine similarity queries
- Atomic operations - All 5 collections updated together or rolled back
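The "updated together or rolled back" behavior can be sketched as follows. This is an assumption-laden illustration: `InMemoryCollection` is a stand-in written for this example and is not the ChromaDB API, and `add_to_all_or_rollback` is a hypothetical name. The document does not describe how the real store implements rollback.

```python
from typing import Dict, List


class InMemoryCollection:
    """Stand-in for a vector collection; NOT the ChromaDB API."""

    def __init__(self, name: str, fail_on_add: bool = False) -> None:
        self.name = name
        self.items: Dict[str, List[float]] = {}
        self.fail_on_add = fail_on_add

    def add(self, item_id: str, embedding: List[float]) -> None:
        if self.fail_on_add:
            raise RuntimeError(f"simulated failure in {self.name}")
        self.items[item_id] = embedding

    def delete(self, item_id: str) -> None:
        self.items.pop(item_id, None)


def add_to_all_or_rollback(
    collections: List[InMemoryCollection],
    token_address: str,
    embeddings: Dict[str, List[float]],
) -> None:
    """Insert into every collection; undo earlier inserts if one fails."""
    written: List[InMemoryCollection] = []
    try:
        for collection in collections:
            collection.add(token_address, embeddings[collection.name])
            written.append(collection)
    except Exception:
        # Remove the entry from every collection that already accepted it.
        for collection in reversed(written):
            collection.delete(token_address)
        raise
```

Because the token address is the ID in every collection, the compensating delete needs no extra bookkeeping beyond the list of collections already written.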
Image Files:
- Location: res/memecoins/rag_db/images/{token_address}.jpg
- Format: 512x512px RGB JPG files (quality: 90%)
- Manager: FileImageManager (src/vector_store/file_managers/image_manager.py)
- Operations: Atomic save/load/delete with temp file pattern
- Processing: Automatic resize, RGB conversion, RGBA transparency flattened onto a white background
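The temp file pattern mentioned above usually means writing to a temporary file in the target directory and then renaming it over the destination. The sketch below shows that general technique with the standard library only; `atomic_write_bytes` is a hypothetical helper, not the `FileImageManager` implementation, and it leaves out the image resizing and conversion steps.

```python
import os
import tempfile


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write data to path so readers never see a partially written file."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    # Create the temp file in the target directory so that os.replace
    # stays on one filesystem, where the rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if the write or rename failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process dies before `os.replace`, the previous image (if any) is still intact and only a stray `.tmp` file is left behind.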
See Also: DATABASE_STORAGE.md - Complete file manager implementation
Metadata Files:
- Location: res/memecoins/rag_db/metadata/{token_address}.json
- Manager: TokenMetadataFileManager (src/vector_store/file_managers/metadata_manager.py)
- Thread Safety: Async-safe with asyncio.Lock
- Validation: Enforces required fields and caption structure
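To make "async-safe with asyncio.Lock" concrete, here is a minimal sketch of serializing metadata reads and writes behind one lock. The class `LockedJsonStore` and its methods are hypothetical and are not the `TokenMetadataFileManager` API; the sketch also performs blocking file I/O inside the coroutine for brevity, which a production implementation might avoid.

```python
import asyncio
import json
import os
from typing import Any, Dict


class LockedJsonStore:
    """Hypothetical helper: guards JSON file access with one asyncio.Lock."""

    def __init__(self, root: str) -> None:
        self._root = root
        self._lock = asyncio.Lock()

    def _path(self, token_address: str) -> str:
        return os.path.join(self._root, f"{token_address}.json")

    async def save(self, token_address: str, metadata: Dict[str, Any]) -> None:
        # Only one coroutine at a time may touch the metadata directory.
        async with self._lock:
            os.makedirs(self._root, exist_ok=True)
            with open(self._path(token_address), "w", encoding="utf-8") as f:
                json.dump(metadata, f, ensure_ascii=False, indent=2)

    async def load(self, token_address: str) -> Dict[str, Any]:
        async with self._lock:
            with open(self._path(token_address), "r", encoding="utf-8") as f:
                return json.load(f)
```

Note that an `asyncio.Lock` coordinates coroutines on one event loop; it does not protect against concurrent access from other threads or processes.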
Metadata JSON Structure:
{
"token_name": "Doge Coin",
"ticker": "DOGE",
"description": "Much wow, very crypto, such memecoin",
"token_address": "9WzDXwBbmkg8ZTbNMqUxvQRAyrZzDsGYdLVL9zYtAWWM",
"created_at": 1693440000,
"caption": {
"entity": "Shiba Inu dog wearing space suit, holding rocket, standing on moon surface",
"context": "Reference to Doge meme and cryptocurrency culture, 'to the moon' expression, satirical take on crypto investment hype",
"visual": "Pixel art style with bright neon colors, centered composition, bold Comic Sans font overlay",
"emotions": "Playful and humorous tone, optimistic energy, ironic and satirical mood"
}
}
Required Fields: token_name, ticker, description, token_address, created_at, caption
Required Caption Fields: entity, context
Optional Caption Fields: visual, emotions
Pending (Tier 1):
- Location: res/memecoins/pending/
- Format: {token_address}.json + {token_address}.jpg
- Purpose: Manual curation queue - new memecoins awaiting review
- Characteristics: No ChromaDB storage, no captions yet, minimal metadata
Approved (Tier 2):
- Location: res/memecoins/approved/
- Format: {token_address}.json + {token_address}.jpg
- Purpose: Approved memecoins ready for AI processing into vector DB
- Characteristics: No ChromaDB storage, no captions yet, can be bulk-inserted
See Also: DATABASE_STORAGE.md - Complete 3-tier workflow details
Defined in: src/constants.py:27-31
MEMECOINS_DATA_ROOT = "res/memecoins/rag_db/" # ChromaDB + files
IMAGES_ROOT = "res/memecoins/rag_db/images/" # JPG image files
METADATA_ROOT = "res/memecoins/rag_db/metadata/" # JSON metadata files
TRENDS_IMAGES_ROOT = "res/memecoins/trends/" # Trend analysis images
Traditional RAG systems use a single monolithic caption per image, which creates several problems:
Problems with Single Captions:
- Semantic conflation - Content, context, style, and emotion mixed together
- Poor similarity matching - Can't weight different aspects independently
- Search inflexibility - Can't prioritize "what" over "how" or vice versa
- Information loss - Important details buried in long paragraphs
- One-size-fits-all - Same embedding must serve all query types
LaunchAgencyBot uses 4 separate caption types, each generating its own embedding:
{
"caption": {
"entity": "WHAT is in the image - concrete visual elements",
"context": "WHAT DOES IT MEAN - cultural/meme context",
"visual": "HOW does it look - aesthetic style (optional)",
"emotions": "HOW does it feel - emotional tone (optional)"
}
}
| Caption Type | Purpose | Weight | Status | Guidelines | Example | Rationale |
|---|---|---|---|---|---|---|
| Entity | WHAT is in image | 4 | Required | Focus on factual visual content, describe entities, spatial relationships, actions | "Shiba Inu dog wearing space suit with NASA logo, holding rocket with flames, on cratered moon, Earth in background, golden DOGE coin floating" | Most concrete/factual, critical for matching characters/objects, least ambiguous, foundation for other caption types |
| Context | WHAT it means | 3 | Required | Identify meme references/origins, explain crypto culture, note ironic elements, connect to trends | "Reference to Doge meme + 'to the moon' expression, satirical take on crypto hype/HODL mentality, space race parody for memecoin speculation" | Critical for understanding meme meaning, essential for thematic similarity, captures cultural context, helps match by concept not just visuals |
| Visual | HOW it looks | 2 | Optional | Art style (pixel/3D/photo), color palette/lighting, composition/framing, typography | "Pixel art 8-bit aesthetic, neon cyan/magenta, centered composition, Comic Sans overlay, retro vaporwave gradient" | Important for style matching, less critical than content/context, useful for visual similarity. Omit if: generic photos, unremarkable composition, standard templates |
| Emotions | HOW it feels | 1 | Optional | Emotional tone, humor type (ironic/absurdist/sarcastic), mood/energy, intended response | "Playful humorous tone, optimistic energy, ironic/satirical poking fun at crypto culture, lighthearted absurdism, evoke amusement + relatable frustration" | Most subjective, supplementary to content/context, useful for "vibe" matching. Omit if: emotionally neutral, purely informational, unclear intent |
| Improvement | Before | After | Impact |
|---|---|---|---|
| Semantic Separation | Mixed concepts in single caption | Dedicated embedding per concept | Queries match specific aspects without semantic bleed (e.g., "space-themed characters" matches entity, not confused by pixel art style) |
| Independent Weighting | Equal or manual tuning | Mathematical weights (4/3/2/1) | Content/context dominate (7), style/emotions add nuance (3) |
| Optional Embeddings | Forced filler text | Skip optional captions | No penalty, more precise matches, no "filler text" polluting embeddings |
| Query Flexibility | Single embedding serves all | Target specific collections | Search by entity, context, or weighted combination |
| Embedding Efficiency | One 768-dim vector (information loss) | 4-5 vectors (3072-3840 dims) | Richer representation, better discrimination |
Stage: src/domain/processor/stages/caption_generation_stage.py:37-92
Process:
- Receive MemecoinEntry with populated tags from previous stage
- AI generates captions using LLM (LiteLLM service)
- Parse AI response into 4-part caption dict
- Validate required fields (entity, context)
- Populate tags_categories using TagManager
- Store caption data in context for next stage (actual save happens atomically)
AI Prompt Structure:
Generate 4 caption types for this memecoin image:
1. ENTITY: Describe concrete visual elements...
2. CONTEXT: Explain cultural references and meme lore...
3. VISUAL (optional): Describe aesthetic style if notable...
4. EMOTIONS (optional): Describe emotional tone if present...
Tags to consider: {tag_list}
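The "parse AI response into 4-part caption dict" step could look like the sketch below. The reply format is an assumption: this document shows only the prompt, so the parser assumes the model answers with one labelled line per caption type (for example `1. ENTITY: ...`). The function name and regular expression are hypothetical and not taken from caption_generation_stage.py.

```python
import re
from typing import Dict

REQUIRED_CAPTION_KEYS = ("entity", "context")

# Matches lines such as "1. ENTITY: text" or "VISUAL (optional): text".
_LINE = re.compile(
    r"^\s*(?:\d+\.\s*)?(ENTITY|CONTEXT|VISUAL|EMOTIONS)\s*(?:\(optional\))?\s*:\s*(.*)$",
    re.IGNORECASE,
)


def parse_caption_response(text: str) -> Dict[str, str]:
    """Parse a labelled LLM reply into a caption dict (assumed reply format)."""
    caption: Dict[str, str] = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        # Labels with empty text are treated as omitted optional captions.
        if match and match.group(2).strip():
            caption[match.group(1).lower()] = match.group(2).strip()
    missing = [key for key in REQUIRED_CAPTION_KEYS if key not in caption]
    if missing:
        raise ValueError(f"missing required captions: {missing}")
    return caption
```

Dropping empty optional captions instead of storing empty strings matches the stated goal of avoiding filler text in the visual and emotions embeddings.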
Captions are stored in unified metadata JSON files (NOT separate caption files):
Location: res/memecoins/rag_db/metadata/{token_address}.json
Why Unified Storage:
- ✅ Atomic operations - Single file write prevents caption/metadata mismatch
- ✅ Data integrity - Caption always tied to correct metadata
- ✅ Efficient loading - Single JSON read gets everything
- ✅ Version control friendly - Complete state in one file
- ✅ No orphaned files - No sync issues between separate files
Alternative Considered (Rejected):
- ❌ Separate caption files ({token_address}_caption.txt)
- Issues: Sync problems, orphaned files, complex rollback, slower loads
See Also: DATABASE_STORAGE.md - Atomic file operation details
Implementation: src/vector_store/file_managers/metadata_manager.py:104-107
# Required keys
required_keys = ["token_name", "ticker", "description", "token_address", "created_at", "caption"]
# Required caption structure
required_caption_keys = ["entity", "context"] # visual and emotions are optional
Validation Rules:
- All 6 top-level keys must exist
- caption must be a dict
- caption.entity and caption.context must be non-empty strings
- caption.visual and caption.emotions are optional
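The validation rules above translate into a short check like the one below. It is a sketch rather than the code at metadata_manager.py:104-107: the function name `validate_metadata` is hypothetical, and the final check (optional captions must be strings when present) is an added assumption that the rules above do not state.

```python
from typing import Any, Dict

REQUIRED_KEYS = ["token_name", "ticker", "description", "token_address", "created_at", "caption"]
REQUIRED_CAPTION_KEYS = ["entity", "context"]
OPTIONAL_CAPTION_KEYS = ["visual", "emotions"]


def validate_metadata(metadata: Dict[str, Any]) -> None:
    """Raise ValueError if metadata breaks the validation rules above."""
    missing = [key for key in REQUIRED_KEYS if key not in metadata]
    if missing:
        raise ValueError(f"missing top-level keys: {missing}")
    caption = metadata["caption"]
    if not isinstance(caption, dict):
        raise ValueError("caption must be a dict")
    for key in REQUIRED_CAPTION_KEYS:
        value = caption.get(key)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"caption.{key} must be a non-empty string")
    for key in OPTIONAL_CAPTION_KEYS:
        # Assumption: an optional caption, if given, must still be a string.
        if key in caption and not isinstance(caption[key], str):
            raise ValueError(f"caption.{key} must be a string when present")
```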
Workflow: src/workflows/rag_memecoin_generation_workflow.py
Process:
- User provides generation prompt (e.g., "dog with sunglasses in cyberpunk style")
- Generate CLIP embedding from prompt text
- Query all 4 caption collections with weighted search
- Retrieve top N matching memecoins (e.g., N=5)
- Load full metadata + images for matched memecoins
- Build RAG context from retrieved examples:
  Example 1:
  Entity: {entity_caption}
  Context: {context_caption}
  Visual: {visual_caption}
  Emotions: {emotions_caption}
  Image: [Base64 image data]
  Example 2: ...
- Generate new memecoin metadata using RAG context
- Create new memecoin with similar style/theme to successful examples
Benefits:
- Retrieved examples are semantically similar across multiple dimensions
- Weighted search prioritizes content match over style match
- AI generator learns from contextually relevant examples
- Higher quality outputs that match successful patterns
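The context-building step of the process above can be sketched as a small formatting function. `build_rag_context` is a hypothetical name, the input is assumed to be a list of caption dicts loaded from the metadata files, and the Base64 image line is left out to keep the example short.

```python
from typing import Dict, List


def build_rag_context(examples: List[Dict[str, str]]) -> str:
    """Format retrieved captions as numbered example blocks."""
    labels = (
        ("entity", "Entity"),
        ("context", "Context"),
        ("visual", "Visual"),
        ("emotions", "Emotions"),
    )
    blocks = []
    for index, caption in enumerate(examples, start=1):
        lines = [f"Example {index}:"]
        for key, label in labels:
            # Optional captions (visual, emotions) may be absent; skip them.
            if caption.get(key):
                lines.append(f"{label}: {caption[key]}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)
```

Skipping absent optional captions keeps the prompt free of empty `Visual:` or `Emotions:` lines for memecoins that were stored without them.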
- DATABASE_STORAGE.md - Hybrid storage strategy, degradation system, data flow workflows, file managers
- DATABASE_API.md - API integration patterns, performance optimization, best practices, troubleshooting
Last Updated: 2025-10-15
Implementation Files: src/vector_store/memecoin_store.py, src/domain/processor/stages/caption_generation_stage.py