This guide covers all configuration options for the YouTube Value Extractor.
The tool uses environment variables for configuration. You can set these in a `.env` file or export them directly in your shell.
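The variables documented in the tables below can be read with Python's standard library. As a rough illustration only (the `Config` class and its field names are hypothetical, not the tool's actual internals):

```python
import os
from dataclasses import dataclass

@dataclass
class Config:
    # Hypothetical mapping of the extractor's settings onto the
    # environment variables documented below; defaults mirror the tables.
    llm_model: str
    output_dir: str
    enable_cache: bool
    max_concurrent: int

    @classmethod
    def from_env(cls) -> "Config":
        return cls(
            llm_model=os.getenv("LLM_MODEL", "gpt-4o-mini"),
            output_dir=os.getenv("DEFAULT_OUTPUT_DIR", "./notes"),
            # Booleans arrive as strings, so normalize before comparing.
            enable_cache=os.getenv("ENABLE_CACHE", "true").lower() == "true",
            max_concurrent=int(os.getenv("MAX_CONCURRENT_VIDEOS", "3")),
        )

cfg = Config.from_env()
```

A `.env` file is typically loaded into the process environment first (for example with the `python-dotenv` package) before a reader like this runs.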
| Variable | Description | Default | Required |
|---|---|---|---|
| `LLM_MODEL` | The LLM model to use for analysis | `gpt-4o-mini` | No |
| `OPENAI_API_KEY` | OpenAI API key (required for GPT models) | - | For OpenAI models |
| `ANTHROPIC_API_KEY` | Anthropic API key (required for Claude models) | - | For Anthropic models |
| Variable | Description | Default | Required |
|---|---|---|---|
| `DEFAULT_OUTPUT_DIR` | Default directory for output files | `./notes` | No |
| `REPORT_TZ` | Timezone for report timestamps | `America/Costa_Rica` | No |
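`REPORT_TZ` takes an IANA timezone name (e.g. `America/New_York`). A quick sketch of how a report timestamp might be localized using the standard library's `zoneinfo` (illustrative only, not the tool's exact code):

```python
import os
from datetime import datetime
from zoneinfo import ZoneInfo

# Fall back to the documented default when REPORT_TZ is unset.
tz = ZoneInfo(os.getenv("REPORT_TZ", "America/Costa_Rica"))
stamp = datetime.now(tz).strftime("%Y-%m-%d %H:%M %Z")
print(stamp)
```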
| Variable | Description | Default | Required |
|---|---|---|---|
| `ENABLE_CACHE` | Enable/disable caching | `true` | No |
| `CACHE_DIR` | Directory for cache files | `./.cache` | No |
| Variable | Description | Default | Required |
|---|---|---|---|
| `MAX_CONCURRENT_VIDEOS` | Maximum concurrent video processing | `3` | No |
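Conceptually, `MAX_CONCURRENT_VIDEOS` caps the amount of in-flight work, much like an `asyncio` semaphore. A hypothetical sketch of that behavior (not the tool's actual implementation):

```python
import asyncio
import os

MAX_CONCURRENT = int(os.getenv("MAX_CONCURRENT_VIDEOS", "3"))

async def process_all(urls):
    # The semaphore ensures at most MAX_CONCURRENT videos are
    # being processed at any one time.
    sem = asyncio.Semaphore(MAX_CONCURRENT)

    async def worker(url):
        async with sem:
            await asyncio.sleep(0)  # stand-in for real per-video work
            return url

    return await asyncio.gather(*(worker(u) for u in urls))

results = asyncio.run(process_all(["url1", "url2", "url3", "url4"]))
```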
These settings are only needed if you want to use Whisper for fallback transcription when YouTube transcripts aren't available.
| Variable | Description | Default | Required |
|---|---|---|---|
| `WHISPER_MODEL` | Whisper model size (`tiny`, `base`, `small`, `medium`, `large`) | `base` | No |
| `WHISPER_DEVICE` | Processing device (`auto`, `cuda`, `cpu`) | `auto` | No |
| `WHISPER_COMPUTE_TYPE` | Compute precision (`float16`, `float32`, `int8`) | `float16` | No |
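The `auto` value for `WHISPER_DEVICE` usually means the tool picks a GPU when one is visible and falls back to the CPU otherwise. A hypothetical sketch of that resolution, assuming `torch` is how GPU presence is detected (the actual check may differ):

```python
def resolve_device(requested: str = "auto") -> str:
    # Explicit settings ("cuda", "cpu") are honored as-is.
    if requested != "auto":
        return requested
    try:
        import torch  # used only to probe for a CUDA-capable GPU
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        # No torch available: the safe default is CPU.
        return "cpu"

device = resolve_device("auto")
```

Note that with faster-whisper-style backends, `float16` typically requires a GPU, while `int8` is the usual choice for CPU-only machines.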
- `gpt-4o-mini` - Fast and cost-effective (default)
- `gpt-4o` - Higher quality, more expensive
- `gpt-4-turbo` - Good balance of speed and quality
- `gpt-3.5-turbo` - Fastest, least expensive

- `claude-3-5-sonnet-latest` - Latest Sonnet model
- `claude-3-5-haiku-latest` - Fastest Claude model
- `claude-3-opus-latest` - Highest quality Claude model

- `ollama/llama3.1:8b` - Meta's Llama 3.1 8B
- `ollama/llama3.1:70b` - Meta's Llama 3.1 70B (requires significant RAM)
- `ollama/qwen2.5:7b` - Alibaba's Qwen 2.5 7B
```bash
# .env file
LLM_MODEL=gpt-4o-mini
OPENAI_API_KEY=sk-your-openai-key-here
DEFAULT_OUTPUT_DIR=./video-insights
REPORT_TZ=America/New_York
ENABLE_CACHE=true
```

```bash
# .env file
LLM_MODEL=claude-3-5-sonnet-latest
ANTHROPIC_API_KEY=your-anthropic-key-here
DEFAULT_OUTPUT_DIR=./reports
WHISPER_MODEL=small
WHISPER_DEVICE=cuda
```

```bash
# .env file
LLM_MODEL=ollama/llama3.1:8b
DEFAULT_OUTPUT_DIR=./local-analysis
ENABLE_CACHE=false
MAX_CONCURRENT_VIDEOS=1
```

You can override environment variables using CLI flags:
```bash
# Override output directory
python -m yt_extractor.cli process VIDEO_URL --output-dir ./custom-output

# Use verbose mode
python -m yt_extractor.cli process VIDEO_URL --verbose

# Override concurrent processing
python -m yt_extractor.cli batch videos.txt --concurrent 5
```

To validate your configuration, run:

```bash
python -m yt_extractor.cli config check
```

This command will:
- Show all current configuration values
- Validate API keys
- Check model accessibility
- Verify directory permissions
```bash
python -m yt_extractor.cli config init
```

This creates a new `.env` file with default values and prompts for required settings.
- **Model Selection:**
  - Use `gpt-4o-mini` for most use cases (fast, cost-effective)
  - Upgrade to `gpt-4o` for complex technical content
  - Consider `claude-3-5-sonnet-latest` for detailed analysis
- **Caching:**
  - Keep caching enabled (`ENABLE_CACHE=true`)
  - Transcripts are cached for 7 days
  - LLM responses are cached for 30 days
- **Concurrent Processing:**
  - The default `MAX_CONCURRENT_VIDEOS=3` works well for most setups
  - Increase it for faster batch processing (if API rate limits allow)
  - Decrease it for rate-limited APIs or limited resources
- **API Keys:**
  - Never commit `.env` files to version control
  - Use environment variables in production
  - Rotate keys regularly
- **File Permissions:**
  - Ensure output directories have appropriate permissions
  - Cache directory should be writable
- **Missing API Key:**

  ```
  ConfigurationError: OPENAI_API_KEY required for OpenAI models
  ```

  Solution: Set the appropriate API key for your chosen model.

- **Model Not Available:**

  ```
  LLMProcessingError: Model gpt-5 not found
  ```

  Solution: Check the model name's spelling and availability.

- **Cache Permission Error:**

  ```
  CacheError: Failed to initialize cache
  ```

  Solution: Ensure the cache directory is writable or change `CACHE_DIR`.
Run the config check command to validate your setup:
```bash
python -m yt_extractor.cli config check
```

This will identify configuration issues and suggest fixes.