ViPERSQL can be configured in three ways, listed from highest to lowest priority:
- Command-line arguments (highest priority)
- Environment variables (.env file)
- Default values (in code)
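The precedence above can be sketched as a layered lookup. This is a minimal illustration, not ViPERSQL's actual loader: the `resolve` helper and the sample dictionaries are hypothetical, and only `collections.ChainMap` comes from the standard library.

```python
from collections import ChainMap

def resolve(cli_args, env_vars, defaults):
    """Merge three layers so that earlier layers win: CLI > env > defaults."""
    # Drop CLI options the user did not pass (None) so they cannot mask lower layers.
    passed = {key: value for key, value in cli_args.items() if value is not None}
    return ChainMap(passed, env_vars, defaults)

# Hypothetical values for illustration only.
defaults = {"model": "gpt-4o", "temperature": 0.3, "strategy": "zero-shot"}
env_vars = {"strategy": "few-shot"}             # e.g. DEFAULT_STRATEGY in .env
cli_args = {"model": None, "temperature": 0.5}  # e.g. --temperature 0.5

config = resolve(cli_args, env_vars, defaults)
for key in ("model", "strategy", "temperature"):
    print(key, "=", config[key])
```

Here `temperature` comes from the CLI layer, `strategy` from the environment layer, and `model` falls through to the default.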
Command-line arguments override all other settings:
python vipersql.py \
--model gpt-4o \
--strategy few-shot \
--example-selection-strategy vir2 \
--samples 100 \
--temperature 0.5

# .env file
DEFAULT_MODEL=gpt-4o
DEFAULT_STRATEGY=few-shot
EXAMPLE_SELECTION_STRATEGY=vir2
DEFAULT_TEMPERATURE=0.3

Default values are defined in mint/constants.py and are used when no override is present.

API keys:

| Parameter | Environment Variable | Required | Description |
|---|---|---|---|
| N/A | OPENAI_API_KEY | ✅ (if using OpenAI) | OpenAI API key |
| N/A | ANTHROPIC_API_KEY | ✅ (if using Anthropic) | Anthropic API key |
| N/A | LANGCHAIN_API_KEY | ❌ | LangChain tracing (optional) |
| N/A | LANGCHAIN_TRACING_V2 | ❌ | Enable tracing (true/false) |
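Because each key is only required for its own provider, a start-up check can be limited to the key the selected model needs. The sketch below assumes the provider can be inferred from the model-name prefix; `PROVIDER_KEYS` and `missing_api_key` are hypothetical names, not part of ViPERSQL.

```python
import os

# Assumed mapping from model-name prefix to the key it needs (illustrative only).
PROVIDER_KEYS = {"gpt-": "OPENAI_API_KEY", "claude-": "ANTHROPIC_API_KEY"}

def missing_api_key(model, env=None):
    """Return the name of the required key if it is unset or empty, else None."""
    env = os.environ if env is None else env
    for prefix, key_name in PROVIDER_KEYS.items():
        if model.startswith(prefix):
            return None if env.get(key_name) else key_name
    return None  # unknown provider: nothing to check in this sketch

print(missing_api_key("gpt-4o", env={}))  # → OPENAI_API_KEY
```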

Model settings:

| Parameter | CLI | Environment | Default | Description |
|---|---|---|---|---|
| Model | --model | DEFAULT_MODEL | gpt-4o | LLM model name |
| Temperature | --temperature | DEFAULT_TEMPERATURE | 0.3 | Sampling temperature (0-1) |
| Max tokens | --max-tokens | DEFAULT_MAX_TOKENS | 1000 | Maximum response tokens |
| Timeout | --timeout | DEFAULT_TIMEOUT | 60 | API timeout (seconds) |

Available Models:
- gpt-4o - OpenAI GPT-4o ("omni")
- gpt-4o-mini - OpenAI GPT-4o mini (cheaper)
- claude-3-5-sonnet-20241022 - Anthropic Claude 3.5 Sonnet

Strategy settings:

| Parameter | CLI | Environment | Default | Values |
|---|---|---|---|---|
| Strategy | --strategy | DEFAULT_STRATEGY | zero-shot | zero-shot, few-shot, cot |
| Template dir | --template-dir | DEFAULT_TEMPLATE_DIR | templates | Path to templates |

Few-shot settings:

| Parameter | CLI | Environment | Default | Description |
|---|---|---|---|---|
| Selector | --example-selection-strategy | EXAMPLE_SELECTION_STRATEGY | random | Selection method |
| Examples (k) | --few-shot-examples | FEW_SHOT_EXAMPLES | 3 | Number of examples |

Available Selectors:
- random - Random selection
- dicl - Semantic similarity (DICL)
- astres - AST-based matching
- skill_knn - Skill-based KNN
- vir2 - Two-stage ViR2 (recommended)
- vir2-no-pos - ViR2 without POS matching
- vir2-no-diversity - ViR2 without diversity
- vir2-no-beam-search - ViR2 without beam search
- multilang-vir2 - Multi-language ViR2
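Selector names are easy to mistype (for example `vir2_no_pos` instead of `vir2-no-pos`). The sketch below checks a name against the list above and suggests the closest match with the standard-library `difflib`; the `check_selector` helper is hypothetical and only illustrates the idea.

```python
import difflib

# Selector names copied from the list above.
SELECTORS = [
    "random", "dicl", "astres", "skill_knn", "vir2",
    "vir2-no-pos", "vir2-no-diversity", "vir2-no-beam-search", "multilang-vir2",
]

def check_selector(name):
    """Return name if it is a documented selector, else raise with a suggestion."""
    if name in SELECTORS:
        return name
    close = difflib.get_close_matches(name, SELECTORS, n=1)
    hint = f" Did you mean '{close[0]}'?" if close else ""
    raise ValueError(f"Unknown selector '{name}'.{hint}")

print(check_selector("vir2-no-pos"))
```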

ViR2 settings:

| Parameter | CLI | Environment | Default | Range | Description |
|---|---|---|---|---|---|
| Pool size (M) | --vir2-candidate-pool-size | VIR2_CANDIDATE_POOL_SIZE | 50 | [10, 200] | Stage 1 candidates |
| Beam size (B) | --vir2-beam-size | VIR2_BEAM_SIZE | 5 | [1, 20] | Beam search width |
| Diversity (λ) | --vir2-diversity-weight | VIR2_DIVERSITY_WEIGHT | 0.3 | [0, 1] | Diversity weight |
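The ranges in the table can be enforced when the parameters are loaded. The sketch below uses a frozen dataclass; the class and field names are hypothetical, and only the defaults and ranges are taken from the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ViR2Params:
    """Hypothetical container; defaults and ranges come from the table above."""
    candidate_pool_size: int = 50   # M, allowed range [10, 200]
    beam_size: int = 5              # B, allowed range [1, 20]
    diversity_weight: float = 0.3   # λ, allowed range [0, 1]

    def __post_init__(self):
        checks = [
            ("candidate_pool_size", self.candidate_pool_size, 10, 200),
            ("beam_size", self.beam_size, 1, 20),
            ("diversity_weight", self.diversity_weight, 0.0, 1.0),
        ]
        for name, value, low, high in checks:
            if not low <= value <= high:
                raise ValueError(f"{name}={value} must be in [{low}, {high}]")

params = ViR2Params(candidate_pool_size=100, beam_size=10, diversity_weight=0.5)
print(params)
```

Constructing `ViR2Params(beam_size=0)` raises `ValueError` because 0 is below the documented minimum of 1.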

CoT settings:

| Parameter | CLI | Environment | Default | Description |
|---|---|---|---|---|
| Reasoning steps | --cot-reasoning-steps | COT_REASONING_STEPS | true | Enable step-by-step |

Dataset settings:

| Parameter | CLI | Environment | Default | Values |
|---|---|---|---|---|
| Path | --dataset-path | DATASET_PATH | dataset/ViText2SQL | Dataset directory |
| Split | --split | DEFAULT_SPLIT | dev | dev, test |
| Level | --level | DEFAULT_LEVEL | std | std, syllable, word |
| Samples | --samples | DEFAULT_SAMPLES | None (all) | Number to process |
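One common way to wire such a table into a CLI is to use the environment value as the `argparse` default, so that an explicit flag still wins. The sketch below is an assumption about how this could be done, not ViPERSQL's actual parser; the flag names, variable names, and allowed values are taken from the table, and `build_parser` is hypothetical.

```python
import argparse

def build_parser(env):
    """Hypothetical parser: environment values act as defaults for the CLI flags."""
    parser = argparse.ArgumentParser(prog="vipersql.py")
    parser.add_argument("--dataset-path",
                        default=env.get("DATASET_PATH", "dataset/ViText2SQL"))
    parser.add_argument("--split", choices=["dev", "test"],
                        default=env.get("DEFAULT_SPLIT", "dev"))
    parser.add_argument("--level", choices=["std", "syllable", "word"],
                        default=env.get("DEFAULT_LEVEL", "std"))
    # An empty or missing DEFAULT_SAMPLES means "process everything" (None).
    samples = env.get("DEFAULT_SAMPLES") or None
    parser.add_argument("--samples", type=int,
                        default=int(samples) if samples else None)
    return parser

parser = build_parser({"DEFAULT_SPLIT": "test"})
args = parser.parse_args(["--level", "syllable", "--samples", "200"])
print(args.split, args.level, args.samples)  # → test syllable 200
```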

Multi-language settings:

| Parameter | CLI | Environment | Default | Values |
|---|---|---|---|---|
| Language | --language | LANGUAGE | auto | auto, vi, en |
| Cross-lingual | --cross-lingual | CROSS_LINGUAL | false | Enable cross-lingual |

Output settings:

| Parameter | CLI | Environment | Default | Description |
|---|---|---|---|---|
| Results dir | --results-dir | RESULTS_DIR | results | Output directory |

Evaluation settings:

| Parameter | CLI | Environment | Default | Description |
|---|---|---|---|---|
| Execution accuracy | N/A | ENABLE_EXECUTION_ACCURACY | true | Enable EX metric |
| Component analysis | N/A | ENABLE_COMPONENT_ANALYSIS | true | Enable F1 metrics |
| Error analysis | N/A | ENABLE_ERROR_ANALYSIS | true | Enable error analysis |
| Eval timeout | N/A | EVALUATION_TIMEOUT | 60 | Timeout (seconds) |
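The boolean flags above arrive as strings when read from the environment, and in Python `bool("false")` is `True`, so they need explicit parsing. The sketch below shows one way to do that; the accepted spellings and the `env_flag` helper are assumptions, not ViPERSQL's documented behavior.

```python
TRUE_VALUES = {"1", "true", "yes", "on"}
FALSE_VALUES = {"0", "false", "no", "off"}

def env_flag(env, name, default):
    """Parse a boolean setting from a string; fall back to default when unset."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"{name} must be true or false, got {raw!r}")

env = {"ENABLE_ERROR_ANALYSIS": "false"}
print(env_flag(env, "ENABLE_ERROR_ANALYSIS", True))      # → False
print(env_flag(env, "ENABLE_EXECUTION_ACCURACY", True))  # → True
```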

Logging settings:

| Parameter | CLI | Environment | Default | Values |
|---|---|---|---|---|
| Log level | N/A | LOG_LEVEL | INFO | DEBUG, INFO, WARNING, ERROR |
| Log format | N/A | LOG_FORMAT | Standard | Log format string |
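These two variables map naturally onto Python's standard `logging.basicConfig`. The sketch below assumes that the "Standard" format is the string used in the `.env` template of this document; the `configure_logging` helper is hypothetical.

```python
import logging

# Assumed to be the "Standard" format; taken from the .env template in this document.
DEFAULT_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
ALLOWED_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}

def configure_logging(env):
    """Configure the root logger from LOG_LEVEL and LOG_FORMAT; return the level."""
    level_name = env.get("LOG_LEVEL", "INFO").upper()
    if level_name not in ALLOWED_LEVELS:
        raise ValueError(f"Unsupported LOG_LEVEL: {level_name}")
    level = getattr(logging, level_name)
    # force=True replaces handlers installed by an earlier basicConfig call.
    logging.basicConfig(level=level,
                        format=env.get("LOG_FORMAT", DEFAULT_FORMAT),
                        force=True)
    return level

configure_logging({"LOG_LEVEL": "DEBUG"})
logging.getLogger("vipersql").debug("configuration loaded")
```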

Performance settings:

| Parameter | CLI | Environment | Default | Description |
|---|---|---|---|---|
| Batch size | N/A | BATCH_SIZE | 10 | Processing batch size |
| Max concurrent | N/A | MAX_CONCURRENT_REQUESTS | 5 | Concurrent API calls |
| Retry attempts | N/A | RETRY_ATTEMPTS | 3 | API retry count |
| Retry delay | N/A | RETRY_DELAY | 5 | Retry delay (seconds) |
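To show how the four performance settings could interact, the sketch below batches the work, caps concurrency with a thread pool, and retries failed calls after a fixed delay. It is an illustration under assumptions: the helper names are hypothetical, and it treats RETRY_ATTEMPTS as the total number of calls per item, which the table does not specify.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def batches(items, batch_size=10):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def call_with_retry(func, attempts=3, delay=5.0):
    """Call func(); on failure wait `delay` seconds and retry, up to `attempts` calls."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay)

def process_all(items, worker, batch_size=10, max_concurrent=5,
                attempts=3, delay=5.0):
    """Run worker(item) for every item, batch by batch, preserving input order."""
    def guarded(item):
        return call_with_retry(lambda: worker(item), attempts, delay)

    results = []
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        for batch in batches(items, batch_size):
            results.extend(pool.map(guarded, batch))
    return results

print(process_all(["q1", "q2", "q3"], str.upper, delay=0))  # → ['Q1', 'Q2', 'Q3']
```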
# API Keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Model Settings
DEFAULT_MODEL=gpt-4o
DEFAULT_TEMPERATURE=0.3
DEFAULT_MAX_TOKENS=1000
# Strategy
DEFAULT_STRATEGY=few-shot
EXAMPLE_SELECTION_STRATEGY=vir2
FEW_SHOT_EXAMPLES=3
# ViR2 Parameters
VIR2_CANDIDATE_POOL_SIZE=50
VIR2_BEAM_SIZE=5
VIR2_DIVERSITY_WEIGHT=0.3
# Dataset
DATASET_PATH=dataset/ViText2SQL
DEFAULT_SPLIT=dev
DEFAULT_LEVEL=std
# Output
RESULTS_DIR=results
# Logging
LOG_LEVEL=INFO

# Override .env settings via CLI
python vipersql.py \
--model claude-3-5-sonnet-20241022 \
--strategy few-shot \
--example-selection-strategy vir2 \
--few-shot-examples 5 \
--vir2-candidate-pool-size 100 \
--vir2-beam-size 10 \
--vir2-diversity-weight 0.5 \
--level syllable \
--split test \
--samples 200 \
--temperature 0.5

Production (.env.prod):
DEFAULT_MODEL=gpt-4o
DEFAULT_STRATEGY=few-shot
EXAMPLE_SELECTION_STRATEGY=vir2
FEW_SHOT_EXAMPLES=5
LOG_LEVEL=WARNING

Development (.env.dev):
DEFAULT_MODEL=gpt-4o-mini
DEFAULT_STRATEGY=zero-shot
LOG_LEVEL=DEBUG

Usage:
# Production
cp .env.prod .env
python vipersql.py --samples 1000
# Development
cp .env.dev .env
python vipersql.py --samples 10

Goal: Fast feedback, low cost
python vipersql.py \
--model gpt-4o-mini \
--strategy zero-shot \
--samples 10

Goal: Highest quality predictions
python vipersql.py \
--model gpt-4o \
--strategy few-shot \
--example-selection-strategy vir2 \
--few-shot-examples 5 \
--vir2-candidate-pool-size 100 \
--vir2-beam-size 10 \
--samples 1000

Goal: Balance cost and accuracy
python vipersql.py \
--model gpt-4o-mini \
--strategy few-shot \
--example-selection-strategy dicl \
--few-shot-examples 3 \
--samples 1000

Goal: Understand model reasoning
python vipersql.py \
--model claude-3-5-sonnet-20241022 \
--strategy cot \
--temperature 0.5 \
--max-tokens 2000 \
--samples 100

Goal: Process Vietnamese and English
python vipersql.py \
--strategy few-shot \
--example-selection-strategy multilang-vir2 \
--language auto \
--samples 500

Goal: Test ViR2 components
# Full ViR2
python vipersql.py --example-selection-strategy vir2 --samples 100
# Without POS
python vipersql.py --example-selection-strategy vir2-no-pos --samples 100
# Without diversity
python vipersql.py --example-selection-strategy vir2-no-diversity --samples 100
# Without beam search
python vipersql.py --example-selection-strategy vir2-no-beam-search --samples 100

Low temperature (0.1-0.3):
- ✅ Near-deterministic outputs
- ✅ SQL generation (structured task)
- ❌ Creative tasks
Medium temperature (0.4-0.6):
- ✅ Chain-of-thought reasoning
- ✅ Balanced exploration
- ⚠️ Outputs may vary slightly between runs
High temperature (0.7-1.0):
- ✅ Creative text generation
- ❌ SQL generation (too random)
Recommendation: temperature 0.3 for SQL generation, 0.5 for CoT
Low max tokens (500-1000):
- ✅ Zero-shot, Few-shot
- ✅ Simple queries
- ❌ Complex queries, CoT
Medium max tokens (1000-2000):
- ✅ Most use cases
- ✅ Chain-of-thought
- ⚠️ May truncate long reasoning
High max tokens (2000+):
- ✅ Complex CoT
- ✅ Multiple nested queries
- ⚠️ Higher cost
Recommendation: 1000 max tokens for normal runs, 2000 for CoT
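The two recommendations above (temperature and token budget) can be kept together in one lookup keyed by strategy. The sketch below only builds the matching CLI arguments; the `RECOMMENDED` table and `sampling_args` helper are hypothetical, and the values are the ones recommended in this guide.

```python
# Recommended values from this guide: 0.3 / 1000 for SQL generation, 0.5 / 2000 for CoT.
RECOMMENDED = {
    "zero-shot": {"temperature": 0.3, "max_tokens": 1000},
    "few-shot": {"temperature": 0.3, "max_tokens": 1000},
    "cot": {"temperature": 0.5, "max_tokens": 2000},
}

def sampling_args(strategy):
    """Return CLI arguments for the recommended temperature and token budget."""
    rec = RECOMMENDED[strategy]
    return [
        "--strategy", strategy,
        "--temperature", str(rec["temperature"]),
        "--max-tokens", str(rec["max_tokens"]),
    ]

print(" ".join(sampling_args("cot")))
# → --strategy cot --temperature 0.5 --max-tokens 2000
```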
Candidate Pool Size (M):
- Small (10-30): Fast, may miss good examples
- Medium (50-100): Balanced (recommended)
- Large (100-200): Better coverage, slower
Beam Size (B):
- Small (1-3): Fast, greedy selection
- Medium (5-10): Good optimization (recommended)
- Large (10-20): Better optimization, diminishing returns
Diversity Weight (λ):
- Low (0-0.2): Prefer similar structure
- Medium (0.3-0.5): Balanced (recommended)
- High (0.6-1.0): Prefer diverse examples
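To build intuition for the diversity weight, the sketch below uses a greedy selection in the style of maximal marginal relevance (MMR). This is a stand-in chosen for illustration: ViR2's actual scoring and beam search are not described in this guide, and the function name, scores, and similarity matrix are all hypothetical.

```python
def select_examples(relevance, similarity, k, diversity_weight):
    """Greedy MMR-style selection (a stand-in, not ViR2 itself).

    relevance[i]     : similarity of candidate i to the query
    similarity[i][j] : similarity between candidates i and j
    score(i) = (1 - λ) * relevance[i] - λ * max similarity to already chosen items
    """
    chosen = []
    remaining = list(range(len(relevance)))
    while remaining and len(chosen) < k:
        def score(i):
            redundancy = max((similarity[i][j] for j in chosen), default=0.0)
            return (1 - diversity_weight) * relevance[i] - diversity_weight * redundancy
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Candidates 0 and 1 are both relevant but nearly identical; candidate 2 is different.
relevance = [0.9, 0.85, 0.5]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
print(select_examples(relevance, similarity, k=2, diversity_weight=0.0))  # → [0, 1]
print(select_examples(relevance, similarity, k=2, diversity_weight=0.5))  # → [0, 2]
```

With λ = 0 the two most relevant (and near-duplicate) candidates are picked; with λ = 0.5 the second pick switches to the less similar candidate, which matches the low/high behavior described above.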
Copy this to .env:
# ============================================================
# ViPERSQL Configuration
# ============================================================
# ------------------------------------------------------------
# API Keys (required for the provider you use)
# ------------------------------------------------------------
OPENAI_API_KEY=
ANTHROPIC_API_KEY=
# Optional: LangChain Tracing
LANGCHAIN_API_KEY=
LANGCHAIN_TRACING_V2=false
# ------------------------------------------------------------
# Model Settings
# ------------------------------------------------------------
DEFAULT_MODEL=gpt-4o
DEFAULT_TEMPERATURE=0.3
DEFAULT_MAX_TOKENS=1000
DEFAULT_TIMEOUT=60
# ------------------------------------------------------------
# Strategy Settings
# ------------------------------------------------------------
DEFAULT_STRATEGY=few-shot
DEFAULT_TEMPLATE_DIR=templates
# ------------------------------------------------------------
# Few-shot Settings
# ------------------------------------------------------------
EXAMPLE_SELECTION_STRATEGY=vir2
FEW_SHOT_EXAMPLES=3
# ------------------------------------------------------------
# ViR2 Settings
# ------------------------------------------------------------
VIR2_CANDIDATE_POOL_SIZE=50
VIR2_BEAM_SIZE=5
VIR2_DIVERSITY_WEIGHT=0.3
# ------------------------------------------------------------
# CoT Settings
# ------------------------------------------------------------
COT_REASONING_STEPS=true
# ------------------------------------------------------------
# Dataset Settings
# ------------------------------------------------------------
DATASET_PATH=dataset/ViText2SQL
DEFAULT_SPLIT=dev
DEFAULT_LEVEL=std
DEFAULT_SAMPLES=
# ------------------------------------------------------------
# Multi-Language Settings
# ------------------------------------------------------------
LANGUAGE=auto
CROSS_LINGUAL=false
# ------------------------------------------------------------
# Output Settings
# ------------------------------------------------------------
RESULTS_DIR=results
# ------------------------------------------------------------
# Evaluation Settings
# ------------------------------------------------------------
ENABLE_EXECUTION_ACCURACY=true
ENABLE_COMPONENT_ANALYSIS=true
ENABLE_ERROR_ANALYSIS=true
EVALUATION_TIMEOUT=60
# ------------------------------------------------------------
# Logging Settings
# ------------------------------------------------------------
LOG_LEVEL=INFO
LOG_FORMAT=%(asctime)s - %(name)s - %(levelname)s - %(message)s
# ------------------------------------------------------------
# Performance Settings
# ------------------------------------------------------------
BATCH_SIZE=10
MAX_CONCURRENT_REQUESTS=5
RETRY_ATTEMPTS=3
RETRY_DELAY=5

# Dry run to see loaded config (if supported)
python vipersql.py --samples 0 --dry-run
# Or check programmatically
python -c "from mint.config import ViPERConfig; print(ViPERConfig())"

Issue: API key not loaded
# Check .env exists
ls -la .env
# Verify syntax
grep API_KEY .env

Issue: Parameter not taking effect
- Check priority: CLI > ENV > Default
- Verify that the environment variable name matches the name in the tables exactly
- Restart the process after changing .env so the new values are loaded
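The checks for both issues can also be scripted so that key values are never printed. The sketch below parses a simplified `KEY=VALUE` syntax (no multi-line values or variable expansion) and reports only whether each key is set; `parse_env_text` and `report_keys` are hypothetical helpers, not part of ViPERSQL.

```python
from pathlib import Path

def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments (simplified .env syntax)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

def report_keys(values, names=("OPENAI_API_KEY", "ANTHROPIC_API_KEY")):
    """Map each key name to 'set' or 'missing' without revealing the secret."""
    return {name: "set" if values.get(name) else "missing" for name in names}

env_path = Path(".env")
if env_path.exists():
    print(report_keys(parse_env_text(env_path.read_text(encoding="utf-8"))))
else:
    print(".env not found in", Path.cwd())
```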

Related documentation:
- Quick Start - Get started quickly
- Usage Examples - Real-world scenarios
- ViR2 Method - ViR2 hyperparameters explained
- Strategies - Strategy-specific settings