What This Document Covers:
- Simplified real-time token launch monitoring from pump.fun
- 3-stage pipeline (metadata → image → file storage)
- WebSocket-based event stream with automatic reconnection
- File-only storage (no AI processing, no database)
- High-throughput data capture for later processing
Sections in This Document:
- Core Concept
- Architecture Philosophy
- Component Architecture
- Processing Pipeline - Stage Classes
- Usage
- Dependencies & Configuration
Related Documentation:
- → ../WORKFLOWS.md - All workflows overview
- → ../../docs/architecture/DOMAIN_ARCHITECTURE.md - Domain architecture
- → ./live_launch_detection_workflow.py - Implementation
Context Tags: #workflow #real-time #websocket #launch-detection #file-storage
Real-time token monitoring with file-only storage and no AI processing
The Launch Detection Workflow provides a simplified, lightweight solution for monitoring new token launches from pump.fun. Unlike the AI-enhanced immediate processing variant, this workflow focuses on fast data capture with minimal processing overhead.
This creates a high-throughput foundation for collecting raw token data that can be processed later by other workflows when AI enhancement is needed.
WebSocket Stream → 3-Stage Pipeline → File Storage Only
Core Assumption: Fast capture beats perfect processing - collect everything first, enhance later.
Key Innovation: Zero AI dependencies enable reliable 24/7 monitoring without AI API quota concerns or AI service outages.
┌─────────────────────┐    ┌──────────────────────┐    ┌─────────────────────┐
│  TokenLaunchSource  │───▶│ SimpleLaunchDetection│───▶│     SimpleFile      │
│  (WebSocket Client) │    │      Processor       │    │      Executor       │
│                     │    │                      │    │                     │
│ • PumpPortal API    │    │ 3-Stage Pipeline:    │    │ • File Storage      │
│ • Event Generation  │    │ 1. TokenMetadata     │    │ • Pending Directory │
│ • Reconnection      │    │ 2. ImagePreparation  │    │ • JSON + JPG Pairs  │
└─────────────────────┘    │ 3. SimpleFileStorage │    └─────────────────────┘
                           └──────────────────────┘
PumpPortal WebSocket Message
↓
TokenLaunchEvent {
  raw_data: { mint, name, symbol, uri, ... }
  timestamp: "2024-01-04T15:30:42Z"
  platform: "pump.fun"
}
↓
Processing Context {
  input_data: TokenLaunchEvent
  results: {}
  should_continue_processing: true
  termination_reason: null
}
↓
[3 Sequential Stages - No AI Processing]
↓
File Output: JSON + JPG Pairs
↓
Pending Directory: res/memecoins/pending/
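The data flow above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual classes: the field names follow the diagram, while `terminate`, `run_stages`, and `check_required_fields` are hypothetical helpers introduced here to show how a context flag can stop later stages from running.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class TokenLaunchEvent:
    raw_data: dict[str, Any]
    timestamp: str
    platform: str = "pump.fun"


@dataclass
class ProcessingContext:
    input_data: TokenLaunchEvent
    results: dict[str, Any] = field(default_factory=dict)
    should_continue_processing: bool = True
    termination_reason: Optional[str] = None

    def terminate(self, reason: str) -> None:
        # Hypothetical helper: record why processing stopped.
        self.should_continue_processing = False
        self.termination_reason = reason


# A stage is modelled here as a plain callable that mutates the context.
Stage = Callable[[ProcessingContext], None]


def run_stages(context: ProcessingContext, stages: list[Stage]) -> ProcessingContext:
    """Run stages in order, skipping the rest once one of them terminates."""
    for stage in stages:
        if not context.should_continue_processing:
            break
        stage(context)
    return context


def check_required_fields(context: ProcessingContext) -> None:
    # Hypothetical stand-in for the metadata stage's early-termination rule.
    raw = context.input_data.raw_data
    missing = [key for key in ("mint", "name", "symbol") if not raw.get(key)]
    if missing:
        context.terminate(f"missing fields: {', '.join(missing)}")
    else:
        context.results["token_address"] = raw["mint"]
```

An event without a symbol stops after the first stage, and `termination_reason` records which fields were missing.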
class TokenMetadataStage(BaseStage[TokenLaunchEvent, TokenLaunchEvent])
- Input: Raw WebSocket data with token address and IPFS URI
- Processing:
- Parses token creation data
- Fetches IPFS metadata with retry logic (2 attempts, 1s/2s delays)
- Extracts social links (website, twitter, telegram)
- Creates basic MemecoinEntry object
- Output: Enriched token data with complete metadata
- Early termination: Missing name, symbol, token_address, or IPFS fetch failure
- Context updates: memecoin_entry, ipfs_data
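The retry behaviour described for the IPFS fetch could look roughly like the sketch below. It reads "2 attempts, 1s/2s delays" as one initial try followed by up to two retries, waiting 1s and then 2s; that reading is an assumption. `fetch_with_retry` and its parameters are hypothetical names, and the `sleep` argument is injectable only so the helper can be exercised without real waiting.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    delays: Sequence[float] = (1.0, 2.0),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(); on failure wait delays[i] and try again.

    Assumption: one initial try plus one retry per entry in `delays`.
    The last error is re-raised once all tries are used up.
    """
    last_error: Optional[Exception] = None
    for attempt in range(len(delays) + 1):
        try:
            return fetch()
        except Exception as exc:  # a real stage would catch narrower errors
            last_error = exc
            if attempt < len(delays):
                sleep(delays[attempt])
    assert last_error is not None
    raise last_error
```

When every try fails, the caller receives the final exception and can turn it into an early termination for that token.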
class ImagePreparationStage(BaseStage[TokenLaunchEvent, TokenLaunchEvent])
- Input: MemecoinEntry with image URL from metadata
- Processing:
- Downloads image from IPFS with retry logic
- Resizes to 512x512px for consistency
- Converts to JPEG format with 90% quality
- Base64 encoding for storage
- MIME type detection and validation
- Output: Processed image data in MemecoinEntry
- Early termination: Image download failure, invalid format, or processing error
- Context updates: image_base64 (updated in memecoin_entry)
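Two of the steps listed for this stage, MIME type detection and Base64 encoding, can be illustrated with the standard library alone. The sketch below checks well-known file signatures (magic bytes) and is not taken from the workflow's code; `sniff_image_mime` and `encode_image` are hypothetical names. The resize to 512x512px and the JPEG conversion need an imaging library and are left out here.

```python
import base64
from typing import Optional

# Leading byte signatures of common image formats.
_SIGNATURES = (
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
)


def sniff_image_mime(data: bytes) -> Optional[str]:
    """Return a MIME type based on the leading bytes, or None if unknown."""
    for signature, mime in _SIGNATURES:
        if data.startswith(signature):
            return mime
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


def encode_image(data: bytes) -> tuple[str, str]:
    """Validate the payload and return (mime_type, base64_text)."""
    mime = sniff_image_mime(data)
    if mime is None:
        # In the pipeline this would map to an early termination.
        raise ValueError("unsupported or invalid image format")
    return mime, base64.b64encode(data).decode("ascii")
```

Checking the bytes rather than the URL's file extension avoids trusting metadata that the token creator controls.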
class SimpleFileStorageStage(BaseStage[TokenLaunchEvent, TokenLaunchEvent])
- Input: Complete MemecoinEntry with processed image
- Processing:
- Saves JSON metadata to {token_address}.json
- Saves processed JPEG to {token_address}.jpg
- Uses atomic file operations to prevent corruption
- Stores in res/memecoins/pending/ directory
- Output: File paths for saved data
- Early termination: File system errors or write failures
- Context updates: saved_json_path, saved_image_path
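One common way to get the atomic writes described above is to write to a temporary file in the same directory and then rename it over the target. The sketch below follows that pattern; it is an assumption about the technique, not the stage's actual code. The function names and the `.tmp` infix (which yields names like `{token_address}.tmp.json`) are illustrative.

```python
import json
import os
from pathlib import Path


def atomic_write_bytes(target: Path, payload: bytes) -> Path:
    """Write payload to a sibling temp file, then rename it into place."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # e.g. TOKEN.json -> TOKEN.tmp.json (same directory, same filesystem)
    tmp = target.with_name(f"{target.stem}.tmp{target.suffix}")
    try:
        with open(tmp, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before renaming
        os.replace(tmp, target)  # atomic when both paths share a filesystem
    except OSError:
        tmp.unlink(missing_ok=True)  # do not leave a partial temp file behind
        raise
    return target


def save_pair(directory: Path, token_address: str, metadata: dict, jpeg: bytes) -> tuple[Path, Path]:
    """Save the JPG first and the JSON last, returning (json_path, image_path)."""
    image_path = atomic_write_bytes(directory / f"{token_address}.jpg", jpeg)
    json_path = atomic_write_bytes(
        directory / f"{token_address}.json",
        json.dumps(metadata, indent=2).encode("utf-8"),
    )
    return json_path, image_path
```

Writing the JSON last is a design choice in this sketch: a consumer that treats the presence of the `.json` file as the signal for a complete pair never sees metadata without its image.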
StageGraph(
    nodes=[
        StageNode(TokenMetadataStage(), next_stages=["ImagePreparationStage"]),
        StageNode(ImagePreparationStage(), next_stages=["SimpleFileStorageStage"]),
        StageNode(SimpleFileStorageStage(), next_stages=[])
    ],
    entry_point="TokenMetadataStage"
)
- No tag classification or LLM analysis
- No caption generation
- No CLIP embeddings
- No vector database operations
- No multimodal AI analysis
- No ChromaDB insertion
- No pending/confirmed collection workflow
- No vector similarity operations
- No database dependencies
- No AI-powered quality gates
- No semantic tag requirements
- No embedding validation
- Minimal data completeness checks
- Connects to PumpPortal API for pump.fun launches
- Automatic reconnection with exponential backoff
- Platform-agnostic design for future expansion (Raydium, Jupiter)
- JSON Files: Complete token metadata in {token_address}.json format
- JPG Files: Processed 512x512px images as {token_address}.jpg
- Pending Directory: All files saved to res/memecoins/pending/
- Atomic Operations: Prevents data corruption during writes
- Zero API dependencies for LLM or embedding services
- No AI quota limits or AI service outages to work around
- No authentication tokens needed
- Reliable 24/7 operation capability
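The automatic reconnection with exponential backoff mentioned in the list above can be sketched as follows. The base delay, growth factor, cap, and attempt limit are illustrative values, not the workflow's settings, and `connect` stands in for whatever callable opens the WebSocket; all names here are hypothetical.

```python
import random
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def backoff_delays(
    base: float = 1.0, factor: float = 2.0, cap: float = 60.0, jitter: float = 0.0
) -> Iterator[float]:
    """Yield base, base*factor, ... capped at `cap`, plus optional random jitter."""
    delay = base
    while True:
        yield min(cap, delay) + random.uniform(0.0, jitter)
        delay = min(cap, delay * factor)


def connect_with_backoff(
    connect: Callable[[], T],
    sleep: Callable[[float], None] = time.sleep,
    max_attempts: int = 5,
) -> T:
    """Retry connect() on OSError, sleeping for growing delays between tries."""
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except OSError:  # ConnectionError and its subclasses are OSErrors
            if attempt == max_attempts:
                raise
            sleep(next(delays))
    raise RuntimeError("max_attempts must be at least 1")
```

A long-running monitor would typically create a fresh delay sequence after each successful connection, so a later drop starts again from the base delay.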
# No environment variables required - zero AI dependencies
# Start simplified monitoring
PYTHONPATH=. python src/workflows/live_launch_detection_workflow.py
🚀 Simplified Token Launch Detection Workflow Starting...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📡 Connecting to token launch monitoring APIs...
🎯 Monitoring for new token launches on pump.fun...
📁 File storage enabled - memecoins will be saved as JSON + JPG pairs
🖼️ Images processed to 512x512px JPEG with 90% quality
💾 Target directory: res/memecoins/pending/
❌ AI processing disabled - no tags, captions, or embeddings
❌ Database storage disabled - only file operations
📊 Complete token data will be stored as files for later processing
🛑 Press Ctrl+C to stop
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
res/memecoins/pending/
├── 6eW4Cc3rFFGe7RXVDWA6cKPJzJxPwL9bDCFxbKcZpump.json
├── 6eW4Cc3rFFGe7RXVDWA6cKPJzJxPwL9bDCFxbKcZpump.jpg
├── 8mWqZxPCFuFGF7RXVDWA6cKPJzJxPwL9bDCFxbKcZpump.json
├── 8mWqZxPCFuFGF7RXVDWA6cKPJzJxPwL9bDCFxbKcZpump.jpg
└── ...
{
"token_name": "Trump Save America",
"ticker": "TRUMPSAVE",
"description": "Make America Great Again memecoin for patriots",
"token_address": "6eW4Cc3rFFGe7RXVDWA6cKPJzJxPwL9bDCFxbKcZpump",
"created_at": "2025-01-15T10:30:42Z",
"": "pump.fun"
}
- Format: JPEG (.jpg)
- Resolution: 512x512px (resized and cropped)
- Quality: 90% JPEG compression
- Size: Typically 50-150KB per image
- Zero AI API costs enable continuous operation
- High success rate ensures minimal data loss
- Fast processing keeps up with busy launch periods
- No service dependencies reduce failure points
- Complete file pairs ready for AI enhancement workflows
- Standardized format compatible with meme processing workflow
- Atomic file operations ensure data integrity
- Pending directory serves as processing queue
- No API keys or external services required
- Fast iteration and testing cycles
- Simple debugging with file-based output
- Easy integration testing without AI complexity
Files created by this workflow can be processed by:
# Process files captured by launch detection
PYTHONPATH=. python src/workflows/rag_memecoin_insertion_workflow.py res/memecoins/pending/
The immediate processing workflow can handle the same events with full AI:
# For AI-enhanced processing (requires API keys)
PYTHONPATH=. python src/workflows/launch_detection_immediate_processing_workflow.py
Files can be reviewed via Web UI:
# Web interface for file review
PYTHONPATH=. uvicorn src.web_ui.main:app --port 8000 --host 127.0.0.1 --reload
- Decision: Remove all AI processing for reliability
- Tradeoff: No semantic tags or embeddings, but 95% success rate
- Reasoning: Capture everything first, enhance later when needed
- Decision: Use file-based storage only
- Tradeoff: No immediate querying, but zero database dependencies
- Reasoning: Files are portable and can be processed by multiple workflows
- Decision: Process images to standard format during capture
- Tradeoff: Slight processing overhead, but consistent downstream compatibility
- Reasoning: 512x512px JPEG is optimal for web UI and AI processing
- Decision: Use atomic file operations even for simple storage
- Tradeoff: Slightly slower writes, but prevents data corruption
- Reasoning: Data integrity is critical for downstream processing
- Python 3.11+ with standard library
- WebSocket connection capability
- File system write access to res/memecoins/pending/
- No REPLICATE_API_KEY needed
- No OPENAI_API_KEY needed
- No LANGFUSE_SECRET_KEY needed
- No database connections required
- Disk space: ~200KB per token (JSON + processed image)
- Memory: Minimal - processes one token at a time
- Network: Stable connection to the PumpPortal WebSocket API (source of pump.fun launch events)
💾 Saving files for token: 6eW4Cc3rFFGe7RXVDWA6cKPJzJxPwL9bDCFxbKcZpump
✅ Successfully saved files for token 6eW4Cc3rFFGe7RXVDWA6cKPJzJxPwL9bDCFxbKcZpump
📄 JSON: 6eW4Cc3rFFGe7RXVDWA6cKPJzJxPwL9bDCFxbKcZpump.json
🖼️ JPG: 6eW4Cc3rFFGe7RXVDWA6cKPJzJxPwL9bDCFxbKcZpump.jpg (512x512px, processed)
- Early termination on critical failures (missing metadata)
- Graceful degradation for image processing failures
- Atomic file operations prevent partial writes
- Comprehensive error logging with failure reasons
- Monitor res/memecoins/pending/ directory growth
- Track file pair completeness (JSON + JPG)
- Watch for failed temp files (.tmp.json, .tmp.jpg)
- Monitor disk space usage for high-volume periods
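The pair-completeness and temp-file checks above lend themselves to a small audit script. The sketch below scans a directory and reports complete pairs, orphaned files, and leftover temp files, using the file naming described in this document; `audit_pending` and the keys of the returned report are hypothetical.

```python
from pathlib import Path


def audit_pending(directory: Path) -> dict[str, list[str]]:
    """Classify files in the pending directory by token address."""
    json_ids: set[str] = set()
    jpg_ids: set[str] = set()
    temp_files: list[str] = []
    for path in directory.iterdir():
        if not path.is_file():
            continue
        # Temp files must be checked first: they also end in .json / .jpg.
        if path.name.endswith((".tmp.json", ".tmp.jpg")):
            temp_files.append(path.name)
        elif path.suffix == ".json":
            json_ids.add(path.stem)
        elif path.suffix == ".jpg":
            jpg_ids.add(path.stem)
    return {
        "complete": sorted(json_ids & jpg_ids),
        "missing_jpg": sorted(json_ids - jpg_ids),
        "missing_json": sorted(jpg_ids - json_ids),
        "temp_files": sorted(temp_files),
    }


if __name__ == "__main__":
    pending = Path("res/memecoins/pending")
    if pending.is_dir():
        for key, names in audit_pending(pending).items():
            print(f"{key}: {len(names)}")
```

Run on a schedule, a growing `temp_files` count would point at interrupted writes, while a growing `complete` count with no downstream consumption would point at disk pressure.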
The Simplified Launch Detection Workflow provides a robust solution for high-volume token monitoring with zero AI or database dependencies. Its focus on reliable data capture makes it well suited to 24/7 operation, and it serves as the foundation for more sophisticated AI-enhanced processing workflows.