This plan describes a Rust-based MCP server for web search, built on a hexagonal architecture with multiple provider support, semantic caching with RAG integration, and AI-first observability. The server lets AI assistants (Claude, Cursor, etc.) run web searches with automatic provider fallback, persistent semantic caching, and self-diagnosing health monitoring.
| Decision | Choice | Rationale |
|---|---|---|
| Use Case | AI Assistant Integration | MCP-native, shapes API for AI consumption |
| Providers | DuckDuckGo, Brave, Synthetic Search | Free tiers + premium fallback path |
| Distribution | Weighted + Fallback | Smart load distribution with resilience |
| Auth | MCP Protocol Config | Native MCP integration, simple |
| Response Format | Normalized Schema | Consistent for AI clients |
| Caching | Persistent + RAG + Migrations | SQLite default, external DB optional |
| Embeddings | Ollama default, configurable providers | Local-first with cloud fallback |
| Error Handling | Circuit breaker + retry + cache fallback | Enterprise resilience |
| Observability | MCP tools (self-diagnosing) | AI clients can query health |
| Runtime | Tokio | Industry standard, rich ecosystem |
| Architecture | Hexagonal | Clean separation, testable |
| Rust Version | MSRV 1.80 | LazyLock, latest stable |
| Testing | Integration-focused with fixtures | Tests against real behavior |
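The "Weighted + Fallback" row can be illustrated with a small selection routine. This is a minimal sketch under stated assumptions, not the planned `provider_pool.rs`: `ProviderSlot`, `pick_weighted`, and `attempt_order` are hypothetical names, `available` stands in for "circuit breaker is not open", and the caller supplies `roll` from any random source.

```rust
/// Hypothetical pool entry: a name, a routing weight, and whether the
/// provider may currently be used (for example, its circuit is not open).
#[derive(Debug, Clone, PartialEq)]
pub struct ProviderSlot {
    pub name: &'static str,
    pub weight: u32,
    pub available: bool,
}

/// Picks one available provider with probability proportional to its weight.
/// `roll` is reduced modulo the total weight, so any u64 is acceptable.
pub fn pick_weighted(slots: &[ProviderSlot], roll: u64) -> Option<&ProviderSlot> {
    let total: u64 = slots
        .iter()
        .filter(|s| s.available)
        .map(|s| u64::from(s.weight))
        .sum();
    if total == 0 {
        return None;
    }
    let mut remaining = roll % total;
    for slot in slots.iter().filter(|s| s.available) {
        let weight = u64::from(slot.weight);
        if remaining < weight {
            return Some(slot);
        }
        remaining -= weight;
    }
    None
}

/// Builds the attempt order: the weighted pick first, then the remaining
/// available providers in configured order as fallbacks.
pub fn attempt_order(slots: &[ProviderSlot], roll: u64) -> Vec<&'static str> {
    let mut order = Vec::new();
    if let Some(first) = pick_weighted(slots, roll) {
        order.push(first.name);
    }
    for slot in slots.iter().filter(|s| s.available) {
        if !order.contains(&slot.name) {
            order.push(slot.name);
        }
    }
    order
}
```

With weights 5, 3, and 2, rolls 0 to 4 select the first provider, 5 to 7 the second, and 8 to 9 the third. Unavailable providers are skipped both for the weighted pick and for the fallback chain.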
mcp-websearch/
├── Cargo.toml
├── Cargo.lock
├── .cargo/
│ └── config.toml # Build configurations
├── src/
│ ├── main.rs # Composition root (~100 lines)
│ ├── lib.rs # Public API exports (~50 lines)
│ │
│ ├── domain/ # Core business logic - NO external deps
│ │ ├── mod.rs
│ │ ├── models.rs # SearchQuery, SearchResult, CacheEntry (~300 lines)
│ │ ├── search_service.rs # Core search orchestration (~400 lines)
│ │ ├── provider.rs # SearchProvider trait (~150 lines)
│ │ ├── cache.rs # Cache trait definitions (~200 lines)
│ │ ├── embeddings.rs # EmbeddingProvider trait (~150 lines)
│ │ └── circuit_breaker.rs # CircuitBreaker trait + state machine (~250 lines)
│ │
│ ├── application/ # Use case orchestration
│ │ ├── mod.rs
│ │ ├── use_cases/
│ │ │ ├── mod.rs
│ │ │ ├── search.rs # Search use case (~350 lines)
│ │ │ ├── health.rs # Health/observability use case (~200 lines)
│ │ │ ├── cache.rs # Cache management use case (~250 lines)
│ │ │ └── config.rs # Configuration use case (~200 lines)
│ │ └── dto.rs # Data transfer objects (~150 lines)
│ │
│ ├── ports/ # Port definitions (interfaces)
│ │ ├── mod.rs
│ │ ├── search_provider.rs # SearchProvider port (~100 lines)
│ │ ├── cache_backend.rs # CacheBackend port (~120 lines)
│ │ ├── embedding.rs # EmbeddingProvider port (~100 lines)
│ │ └── mcp_transport.rs # MCP transport abstraction (~150 lines)
│ │
│ ├── adapters/ # Outer hexagon - external integrations
│ │ ├── mod.rs
│ │ │
│ │ ├── mcp/ # MCP protocol adapter
│ │ │ ├── mod.rs
│ │ │ ├── stdio.rs # Stdio transport (~200 lines)
│ │ │ └── tools/
│ │ │ ├── mod.rs
│ │ │ ├── search.rs # web_search tool (~300 lines)
│ │ │ ├── search_similar.rs # search_similar tool (RAG) (~250 lines)
│ │ │ ├── search_news.rs # search_news tool (~250 lines)
│ │ │ ├── search_cached.rs # search_cached tool (~200 lines)
│ │ │ ├── cache_stats.rs # get_cache_stats tool (~150 lines)
│ │ │ ├── cache_clear.rs # cache_clear tool (~150 lines)
│ │ │ ├── cache_migrate.rs # cache_migrate tool (~200 lines)
│ │ │ ├── provider_status.rs # get_provider_status tool (~150 lines)
│ │ │ └── server_stats.rs # get_server_stats tool (~200 lines)
│ │ │
│ │ ├── providers/ # Search provider adapters
│ │ │ ├── mod.rs
│ │ │ ├── duckduckgo.rs # DuckDuckGo adapter (~350 lines)
│ │ │ ├── brave.rs # Brave Search adapter (~350 lines)
│ │ │ ├── synthetic.rs # Synthetic Search adapter (~400 lines)
│ │ │ ├── provider_pool.rs # Weighted provider pool (~300 lines)
│ │ │ └── normalized_result.rs # Result normalization (~250 lines)
│ │ │
│ │ ├── cache/ # Cache backend adapters
│ │ │ ├── mod.rs
│ │ │ ├── sqlite_backend.rs # SQLite implementation (~450 lines)
│ │ │ ├── postgres_backend.rs # PostgreSQL implementation (~450 lines)
│ │ │ ├── semantic_index.rs # HNSW semantic index (~400 lines)
│ │ │ ├── cache_entry.rs # Cache entry model (~150 lines)
│ │ │ └── migrations.rs # Schema migrations (~300 lines)
│ │ │
│ │ ├── embeddings/ # Embedding provider adapters
│ │ │ ├── mod.rs
│ │ │ ├── ollama.rs # Ollama embeddings (~300 lines)
│ │ │ ├── openai.rs # OpenAI embeddings (~250 lines)
│ │ │ └── model_registry.rs # Available models (~150 lines)
│ │ │
│ │ └── circuit_breaker/ # Circuit breaker implementations
│ │ ├── mod.rs
│ │ └── in_memory.rs # Lock-free circuit breaker (~300 lines)
│ │
│ └── infrastructure/ # Cross-cutting concerns
│ ├── mod.rs
│ ├── config.rs # Configuration loading (~350 lines)
│ ├── error.rs # Error types hierarchy (~300 lines)
│ ├── telemetry.rs # Logging/tracing setup (~200 lines)
│ └── http_client.rs # Shared HTTP client factory (~150 lines)
│
├── migrations/ # Database migrations
│ ├── V1__initial_schema.sql
│ ├── V2__add_semantic_index.sql
│ └── V3__add_embedding_metadata.sql
│
├── tests/ # Integration tests
│ ├── fixtures/
│ │ ├── duckduckgo/
│ │ ├── brave/
│ │ └── synthetic/
│ ├── integration/
│ │ ├── providers_test.rs
│ │ ├── cache_test.rs
│ │ └── embeddings_test.rs
│ └── e2e/
│ └── mcp_protocol_test.rs
│
├── config/
│ └── default.toml # Default configuration
│
└── README.md
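The layering in the tree above (the domain depends only on ports, adapters implement them) can be sketched in a few lines. This is a simplified, synchronous illustration rather than the planned async API: `SearchPort`, `FixtureProvider`, and `SearchService` are hypothetical names, and results are plain strings.

```rust
/// Port: the only view of a search backend that the domain sees.
/// (Hypothetical, synchronous stand-in for the async `SearchProvider` trait.)
pub trait SearchPort {
    fn name(&self) -> &'static str;
    fn search(&self, query: &str) -> Result<Vec<String>, String>;
}

/// Adapter: returns a canned response, standing in for an HTTP-backed provider.
pub struct FixtureProvider {
    pub name: &'static str,
    pub response: Result<Vec<String>, String>,
}

impl SearchPort for FixtureProvider {
    fn name(&self) -> &'static str {
        self.name
    }

    fn search(&self, _query: &str) -> Result<Vec<String>, String> {
        self.response.clone()
    }
}

/// Domain service: depends on the port only, so adapters can be swapped in tests.
pub struct SearchService {
    providers: Vec<Box<dyn SearchPort>>,
}

impl SearchService {
    pub fn new(providers: Vec<Box<dyn SearchPort>>) -> Self {
        Self { providers }
    }

    /// Tries providers in order and returns the first success together with
    /// the provider's name, or every error message if all of them fail.
    pub fn search(&self, query: &str) -> Result<(&'static str, Vec<String>), Vec<String>> {
        let mut errors = Vec::new();
        for provider in &self.providers {
            match provider.search(query) {
                Ok(results) => return Ok((provider.name(), results)),
                Err(e) => errors.push(format!("{}: {}", provider.name(), e)),
            }
        }
        Err(errors)
    }
}
```

Because `SearchService` only holds `Box<dyn SearchPort>`, the same service code runs against fixture adapters in tests and network adapters in production.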
/// Normalized search result - consistent across all providers
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct SearchResult {
pub title: String,
pub url: String,
pub snippet: String,
pub source_provider: ProviderName,
pub published_date: Option<DateTime<Utc>>,
pub relevance_score: Option<f32>,
}
/// Provider identification
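#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")] // Matches "duckduckgo", "brave", "synthetic" in the tool schema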
pub enum ProviderName {
DuckDuckGo,
Brave,
Synthetic,
Cache, // For cached results
}
/// Search query with options
#[derive(Debug, Clone, Deserialize)]
pub struct SearchQuery {
pub query: String,
pub max_results: Option<u8>,
pub providers: Option<Vec<ProviderName>>,
pub use_cache: Option<bool>,
pub freshness: Option<Freshness>,
}
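/// Result freshness filter ("day", "week", "month", "year" on the wire)
#[derive(Debug, Clone, Copy, PartialEq, Eq, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum Freshness {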
Day,
Week,
Month,
Year,
}
/// Cache entry with semantic indexing
#[derive(Debug, Clone)]
pub struct CacheEntry {
pub id: i64,
pub query_hash: [u8; 32], // Blake3 hash
pub query_text: String,
pub embedding: Vec<f32>, // Quantized to f16 for storage
pub results: Vec<SearchResult>,
pub provider_used: ProviderName,
pub created_at: DateTime<Utc>,
pub last_accessed: DateTime<Utc>,
pub access_count: u32,
pub embedding_model: String, // Track for dimension changes
pub embedding_dimension: u16,
}

#[async_trait]
pub trait SearchProvider: Send + Sync {
/// Provider name for logging/routing
fn name(&self) -> ProviderName;
/// Weight for weighted selection (higher = more traffic)
fn weight(&self) -> u32;
/// Execute search
async fn search(&self, query: &SearchQuery) -> Result<Vec<SearchResult>, ProviderError>;
/// Health check for circuit breaker
async fn health_check(&self) -> ProviderHealth;
/// Provider capabilities for routing decisions
fn capabilities(&self) -> ProviderCapabilities;
}
pub struct ProviderCapabilities {
pub max_concurrent_requests: u32,
pub typical_latency_ms: u32,
pub supports_pagination: bool,
pub supports_freshness: bool,
pub rate_limit_per_minute: u32,
}
pub struct ProviderHealth {
pub is_healthy: bool,
pub latency_ms: Option<u32>,
pub error_rate: f32,
pub last_success: Option<Instant>,
}

#[async_trait]
pub trait CacheBackend: Send + Sync {
/// Exact key lookup
async fn get_exact(&self, query_hash: &[u8; 32]) -> Result<Option<CacheEntry>, CacheError>;
/// Semantic similarity search
async fn get_semantic(
&self,
embedding: &[f32],
threshold: f32,
limit: usize,
) -> Result<Vec<SemanticMatch>, CacheError>;
/// Store entry
async fn store(&self, entry: CacheEntry) -> Result<(), CacheError>;
/// Invalidate by query or age
async fn invalidate(&self, query_hash: Option<&[u8; 32]>, older_than: Option<Duration>) -> Result<u64, CacheError>;
/// Get statistics
async fn stats(&self) -> Result<CacheStats, CacheError>;
/// Run migrations
async fn migrate(&self, target_version: u32) -> Result<(), CacheError>;
}
pub struct SemanticMatch {
pub entry: CacheEntry,
pub similarity: f32,
}

#[async_trait]
pub trait EmbeddingProvider: Send + Sync {
/// Generate embedding for text
async fn embed(&self, text: &str) -> Result<Vec<f32>, EmbeddingError>;
/// Model identifier
fn model_name(&self) -> &str;
/// Embedding dimension
fn dimension(&self) -> u16;
/// Health check
async fn health_check(&self) -> bool;
}

use std::future::Future;
use std::sync::atomic::{AtomicU8, AtomicU32, AtomicU64, Ordering};
use std::time::Duration;

/// Stored in `CircuitBreaker::state` as its `u8` discriminant.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CircuitState {
    Closed = 0,   // Normal operation
    Open = 1,     // Failing fast
    HalfOpen = 2, // Testing recovery
}

pub struct CircuitBreaker {
    state: AtomicU8,
    failure_count: AtomicU32,
    success_count: AtomicU32,
    last_failure: AtomicU64,
    config: CircuitConfig,
}

pub struct CircuitConfig {
    pub failure_threshold: u32,  // Open after N failures
    pub success_threshold: u32,  // Close after N successes in half-open
    pub timeout_ms: u64,         // Time before half-open attempt
    pub max_retries: u32,        // Attempts per call before the failure is recorded
    pub base_delay_ms: u64,      // Base retry delay
    pub max_delay_ms: u64,       // Max retry delay
    pub backoff_multiplier: f32, // Exponential backoff factor
}

impl CircuitBreaker {
    /// Two-tier: returns the primary result, or runs the fallback when the circuit is open.
    ///
    /// `f` is `FnMut` because it is invoked once per retry attempt. The fallback is a
    /// closure (for example `|| next_provider.search(&query)`) so that the query does
    /// not have to be threaded through the breaker. `allow_request`, `record_success`,
    /// and `record_failure` are assumed to be defined elsewhere in this impl.
    pub async fn call_with_fallback<F, Fut, G, FbFut, T>(
        &self,
        mut f: F,
        fallback: Option<G>,
    ) -> Result<T, CircuitError>
    where
        F: FnMut() -> Fut,
        Fut: Future<Output = Result<T, ProviderError>>,
        G: FnOnce() -> FbFut,
        FbFut: Future<Output = Result<T, ProviderError>>,
    {
        if !self.allow_request() {
            return match fallback {
                Some(fb) => fb().await.map_err(Into::into),
                None => Err(CircuitError::Open),
            };
        }
        // Execute with retry, exponential backoff, and up to 30% jitter
        let attempts = self.config.max_retries.max(1);
        let mut delay = self.config.base_delay_ms;
        let mut attempt = 1;
        loop {
            match f().await {
                Ok(v) => {
                    self.record_success();
                    return Ok(v);
                }
                Err(_) if attempt < attempts => {
                    let jitter = rand::random::<f32>() * 0.3;
                    tokio::time::sleep(Duration::from_millis(
                        (delay as f32 * (1.0 + jitter)) as u64,
                    ))
                    .await;
                    delay = (delay as f32 * self.config.backoff_multiplier) as u64;
                    delay = delay.min(self.config.max_delay_ms);
                    attempt += 1;
                }
                Err(e) => {
                    // Retries exhausted: count one failure against the circuit
                    self.record_failure();
                    return Err(e.into());
                }
            }
        }
    }
}

| Tool | Purpose | File |
|---|---|---|
| `web_search` | Primary search with provider fallback | `adapters/mcp/tools/search.rs` |
| `search_similar` | RAG-style semantic cache lookup | `adapters/mcp/tools/search_similar.rs` |
| `search_news` | News-focused search with freshness | `adapters/mcp/tools/search_news.rs` |
| `search_cached` | Cache-only search (no external calls) | `adapters/mcp/tools/search_cached.rs` |
| `get_cache_stats` | Cache hit rate, size, health | `adapters/mcp/tools/cache_stats.rs` |
| `cache_clear` | Clear cache entries | `adapters/mcp/tools/cache_clear.rs` |
| `cache_migrate` | Run schema migrations | `adapters/mcp/tools/cache_migrate.rs` |
| `get_provider_status` | Provider health + circuit breaker state | `adapters/mcp/tools/provider_status.rs` |
| `get_server_stats` | Server metrics (latency, throughput) | `adapters/mcp/tools/server_stats.rs` |
{
"name": "web_search",
"description": "Search the web using multiple providers with automatic fallback and semantic caching",
"inputSchema": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Search query"
},
"max_results": {
"type": "integer",
"default": 10,
"minimum": 1,
"maximum": 50
},
"providers": {
"type": "array",
"items": { "enum": ["duckduckgo", "brave", "synthetic"] },
"description": "Specific providers to use (default: all available)"
},
"use_cache": {
"type": "boolean",
"default": true,
"description": "Use semantic cache for faster results"
},
"freshness": {
"type": "string",
"enum": ["day", "week", "month", "year"],
"description": "Result freshness filter"
}
},
"required": ["query"]
}
}

┌─────────────────────────────────────────┐
│ L1: Hot (in-memory HNSW, full precision)│ 1000 entries × 6KB = 6MB
│ TTL: 5 minutes                          │
├─────────────────────────────────────────┤
│ L2: Warm (in-memory HNSW, quantized)    │ 10000 entries × 192B = 2MB
│ TTL: 30 minutes                         │
├─────────────────────────────────────────┤
│ L3: Cold (SQLite, compressed)           │ Unlimited
│ TTL: 7 days                             │
└─────────────────────────────────────────┘
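The per-entry sizes in the diagram are consistent with 1536-dimensional embeddings, which is an assumption here (the schema comments list 768 and 1536 as example dimensions). The sketch below works through that arithmetic; the function names are illustrative, and the totals cover embedding payloads only, not HNSW graph overhead or result bodies.

```rust
/// Bytes needed for a full-precision (f32) embedding.
pub const fn f32_bytes(dimension: usize) -> usize {
    dimension * std::mem::size_of::<f32>()
}

/// Bytes needed for a 1-bit-per-dimension embedding, rounded up to whole bytes.
pub const fn binary_bytes(dimension: usize) -> usize {
    (dimension + 7) / 8
}

/// Memory saved by binary quantization relative to f32, as a percentage.
pub fn reduction_percent(dimension: usize) -> f64 {
    100.0 * (1.0 - binary_bytes(dimension) as f64 / f32_bytes(dimension) as f64)
}

/// Embedding payload of a tier in bytes.
pub const fn tier_bytes(entries: usize, bytes_per_entry: usize) -> usize {
    entries * bytes_per_entry
}
```

For 1536 dimensions this gives 6144 bytes (6KB) per full-precision entry and 192 bytes per binary entry, a 96.875% reduction. The hot tier then holds about 6.1MB of embeddings and the warm tier about 1.9MB, which the diagram rounds to 2MB.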
/// Quantize f32 embedding to binary (96% size reduction)
pub fn quantize_to_binary(embedding: &[f32]) -> Vec<u8> {
embedding.chunks(8)
.map(|chunk| {
chunk.iter().enumerate()
.filter(|(_, v)| **v > 0.0)
.map(|(i, _)| 1 << i)
.fold(0u8, |acc, bit| acc | bit)
})
.collect()
}
/// Hamming distance for binary embeddings (fast XOR + popcount)
pub fn hamming_distance(a: &[u8], b: &[u8]) -> u32 {
a.iter().zip(b.iter())
.map(|(x, y)| (x ^ y).count_ones())
.sum()
}

PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
PRAGMA cache_size = -64000; -- 64MB cache
PRAGMA mmap_size = 268435456; -- 256MB mmap
CREATE TABLE IF NOT EXISTS cache_entries (
id INTEGER PRIMARY KEY AUTOINCREMENT,
query_hash BLOB NOT NULL UNIQUE, -- 32-byte Blake3 hash
query_text TEXT NOT NULL, -- For debugging/FTS
embedding BLOB, -- Quantized f16/u8
results_json BLOB NOT NULL, -- zstd compressed
provider_used TEXT NOT NULL, -- "duckduckgo", "brave", "synthetic"
created_at INTEGER NOT NULL, -- Unix timestamp ms
last_accessed INTEGER NOT NULL,
access_count INTEGER DEFAULT 0,
embedding_model TEXT NOT NULL, -- "nomic-embed-text"
embedding_dimension INTEGER NOT NULL -- 768, 1536, etc.
);
-- Fast hash lookup
CREATE INDEX idx_query_hash ON cache_entries(query_hash);
-- Time-based cleanup
CREATE INDEX idx_created_at ON cache_entries(created_at);
-- Full-text search for pre-filtering
CREATE VIRTUAL TABLE cache_fts USING fts5(query_text, content='cache_entries');

-- HNSW index metadata for tracking loaded entries
CREATE TABLE semantic_index_state (
id INTEGER PRIMARY KEY,
entry_id INTEGER NOT NULL,
hnsw_node_id INTEGER NOT NULL,
tier TEXT NOT NULL, -- 'hot' or 'warm'
FOREIGN KEY (entry_id) REFERENCES cache_entries(id) ON DELETE CASCADE
);
CREATE INDEX idx_semantic_tier ON semantic_index_state(tier);

{
"mcpServers": {
"websearch": {
"command": "/path/to/mcp-websearch",
"args": [],
"env": {
"RUST_LOG": "info"
}
}
}
}

#[derive(Debug, Deserialize)]
pub struct Config {
pub providers: ProvidersConfig,
pub cache: CacheConfig,
pub embeddings: EmbeddingsConfig,
pub circuit_breaker: CircuitBreakerConfig,
}
#[derive(Debug, Deserialize)]
pub struct ProvidersConfig {
pub duckduckgo: Option<DuckDuckGoConfig>,
pub brave: Option<BraveConfig>,
pub synthetic: Option<SyntheticConfig>,
pub weights: HashMap<String, u32>,
pub fallback_order: Vec<String>,
}
#[derive(Debug, Deserialize)]
pub struct BraveConfig {
pub api_key: SecretString,
pub rate_limit_per_minute: u32,
}
#[derive(Debug, Deserialize)]
pub struct SyntheticConfig {
pub api_key: SecretString,
pub endpoint: String,
}
#[derive(Debug, Deserialize)]
pub struct CacheConfig {
pub backend: CacheBackend,
pub path: PathBuf,
pub max_size_mb: u64,
pub default_ttl_hours: u64,
pub semantic_threshold: f32,
}
#[derive(Debug, Deserialize)]
pub enum CacheBackend {
Sqlite,
Postgres { connection_string: SecretString },
}
#[derive(Debug, Deserialize)]
pub struct EmbeddingsConfig {
pub provider: EmbeddingProvider,
pub model: String,
pub dimension: u16,
pub budget_limit_usd: Option<f32>,
}
#[derive(Debug, Deserialize)]
pub enum EmbeddingProvider {
Ollama { endpoint: String },
OpenAI { api_key: SecretString },
}

#[derive(Error, Debug)]
pub enum SearchError {
#[error("Provider error: {provider} - {message}")]
Provider {
provider: ProviderName,
message: String,
#[source]
source: ProviderError,
},
#[error("All providers failed")]
AllProvidersFailed {
errors: Vec<(ProviderName, ProviderError)>,
},
#[error("Circuit breaker open for: {0}")]
CircuitBreakerOpen(ProviderName),
#[error("Cache error: {0}")]
Cache(#[from] CacheError),
#[error("Embedding generation failed: {0}")]
Embedding(#[from] EmbeddingError),
#[error("Query validation failed: {reasons:?}")]
Validation { reasons: Vec<String> },
}
// Sanitized for MCP responses
impl From<SearchError> for McpError {
fn from(err: SearchError) -> Self {
match err {
SearchError::AllProvidersFailed { .. } => {
McpError::internal("All search providers are currently unavailable. Try again later.")
}
SearchError::CircuitBreakerOpen(provider) => {
McpError::unavailable(&format!("Provider {} is temporarily unavailable", provider))
}
_ => McpError::internal("Search failed. Please try again."),
}
}
}

| Area | Control | Implementation |
|---|---|---|
| API Keys | Memory protection | secrecy crate, zero-on-drop |
| API Keys | File permissions | Enforce 0o600 on config files |
| SSRF | URL allowlisting | Block private IP ranges, disable redirects |
| Prompt Injection | Input validation | Unicode NFC normalization, pattern detection |
| Denial-of-Wallet | Budget controls | Circuit breakers, mandatory caps for paid APIs |
| Cache Poisoning | Integrity | Blocklist validation, content hashing |
| Supply Chain | Audit | cargo audit in CI, pin critical versions |
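The SSRF row ("block private IP ranges") can be sketched with the standard library's address classification. This is a minimal illustration under stated assumptions: `is_blocked_ip` and its helpers are hypothetical names, the check is meant to run on the address a hostname actually resolves to, and redirects are assumed to be disabled as the table states so that the checked address is the one contacted.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Returns true when an outbound request to `ip` should be refused.
pub fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => is_blocked_v6(v6),
    }
}

fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    ip.is_private()            // 10/8, 172.16/12, 192.168/16
        || ip.is_loopback()    // 127/8
        || ip.is_link_local()  // 169.254/16, which includes cloud metadata endpoints
        || ip.is_unspecified() // 0.0.0.0
        || ip.is_broadcast()   // 255.255.255.255
}

fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    // IPv4-mapped addresses (::ffff:a.b.c.d) are judged by their IPv4 part.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_blocked_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()
        || ip.is_unspecified()
        || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
        || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
}
```

Checking the IPv4-mapped form matters because `::ffff:10.0.0.1` would otherwise pass as an ordinary IPv6 address while still reaching a private host.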
use unicode_normalization::UnicodeNormalization; // Provides `.nfc()`

pub fn validate_query(query: &str) -> Result<(), ValidationError> {
    // 1. Length check (in bytes)
    if query.len() > 1000 {
        return Err(ValidationError::TooLong);
    }
    // 2. Unicode normalization: canonically equivalent sequences compare equal.
    //    NFC alone does not catch cross-script homographs.
    let normalized = query.nfc().collect::<String>();
    // 3. Control character filtering
    if normalized.chars().any(|c| c.is_control()) {
        return Err(ValidationError::InvalidCharacters);
    }
    // 4. Prompt injection patterns (basic).
    //    Patterns must be lowercase because they are matched against `lower`.
    let injection_patterns = [
        "ignore previous",
        "ignore all previous",
        "disregard",
        "system:",
        "[inst]",
    ];
    let lower = normalized.to_lowercase();
    for pattern in injection_patterns {
        if lower.contains(pattern) {
            return Err(ValidationError::SuspiciousPattern);
        }
    }
    Ok(())
}

| Operation | Budget | Implementation Target |
|---|---|---|
| Cache lookup (HNSW hit) | 1ms | 0.5ms |
| Cache lookup (DB miss) | 50ms | 10-30ms |
| Embedding generation | 100ms | 50-200ms |
| Provider race timeout | 250ms | 250ms |
| Result normalization | 5ms | 1ms |
| Total | 406ms | 311-481ms |
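The totals in the table can be checked mechanically. The sketch below encodes each row (the `Stage` struct and function names are illustrative) and sums the columns; single-valued targets are entered as a range with equal ends.

```rust
/// One row of the latency table, in milliseconds.
pub struct Stage {
    pub name: &'static str,
    pub budget_ms: f64,
    pub target_min_ms: f64,
    pub target_max_ms: f64,
}

pub const STAGES: [Stage; 5] = [
    Stage { name: "Cache lookup (HNSW hit)", budget_ms: 1.0, target_min_ms: 0.5, target_max_ms: 0.5 },
    Stage { name: "Cache lookup (DB miss)", budget_ms: 50.0, target_min_ms: 10.0, target_max_ms: 30.0 },
    Stage { name: "Embedding generation", budget_ms: 100.0, target_min_ms: 50.0, target_max_ms: 200.0 },
    Stage { name: "Provider race timeout", budget_ms: 250.0, target_min_ms: 250.0, target_max_ms: 250.0 },
    Stage { name: "Result normalization", budget_ms: 5.0, target_min_ms: 1.0, target_max_ms: 1.0 },
];

/// Sums the budget, target minimum, and target maximum columns.
pub fn totals(stages: &[Stage]) -> (f64, f64, f64) {
    stages.iter().fold((0.0, 0.0, 0.0), |(budget, lo, hi), s| {
        (budget + s.budget_ms, lo + s.target_min_ms, hi + s.target_max_ms)
    })
}

/// Names of stages whose worst-case target exceeds their budget.
pub fn over_budget(stages: &[Stage]) -> Vec<&'static str> {
    stages
        .iter()
        .filter(|s| s.target_max_ms > s.budget_ms)
        .map(|s| s.name)
        .collect()
}
```

The sums come to 406ms for the budget and 311.5 to 481.5ms for the targets, matching the Total row after truncation. The only stage whose upper target exceeds its own budget is embedding generation (200ms against 100ms), which is also why the upper total exceeds 406ms.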
| Resource | Limit |
|---|---|
| RAM (resident) | 256MB |
| RAM (with SQLite mmap) | 512MB |
| Disk I/O | <10MB/s sustained |
| File descriptors | <200 |
| CPU cores | 2+ |
tests/
├── fixtures/
│ ├── duckduckgo/
│ │ ├── rust_programming.json
│ │ └── error_rate_limit.json
│ ├── brave/
│ │ └── rust_programming.json
│ └── synthetic/
│ └── rust_programming.json
├── integration/
│ ├── providers_test.rs # Real API calls with recorded responses
│ ├── cache_test.rs # SQLite with testcontainers
│ └── embeddings_test.rs # Ollama integration
└── e2e/
└── mcp_protocol_test.rs # Full MCP tool flow
# Run all tests
cargo test
# Run with recording (saves fixtures)
RECORD_FIXTURES=1 cargo test --features recording
# Run integration tests only
cargo test --test integration

[dependencies]
# Async runtime
tokio = { version = "1", features = ["full"] }
# MCP protocol
rmcp = "0.1"
# HTTP client (with rustls, NOT OpenSSL)
reqwest = { version = "0.12", features = ["rustls-tls", "json"], default-features = false }
# Serialization
serde = { version = "1", features = ["derive"] }
serde_json = "1"
# Database
sqlx = { version = "0.7", features = ["runtime-tokio", "sqlite", "postgres"] }
# Embeddings
async-openai = "0.20" # Ollama-compatible
# Cryptography
blake3 = "1"
# Circuit breaker
tokio-circuit-breaker = "0.1"
# Secrets handling
secrecy = { version = "0.8", features = ["serde"] } # "serde" lets SecretString fields derive Deserialize
# Error handling
thiserror = "1"
anyhow = "1"
# Logging
tracing = "0.1"
tracing-subscriber = "0.3"
# Date/time
chrono = { version = "0.4", features = ["serde"] }
# HNSW for semantic search
instant-distance = "0.6"
[dev-dependencies]
tokio-test = "0.4"
testcontainers = "0.15"
wiremock = "0.5"
[build-dependencies]
# Minimal build.rs, prefer const evaluation

- Project setup (Cargo.toml, structure)
- Domain models (`domain/models.rs`)
- Port traits (`ports/`)
- Error types (`infrastructure/error.rs`)
- Configuration infrastructure
- DuckDuckGo provider (`adapters/providers/duckduckgo.rs`)
- Brave provider (`adapters/providers/brave.rs`)
- Synthetic provider (`adapters/providers/synthetic.rs`)
- Result normalization (`adapters/providers/normalized_result.rs`)
- Provider pool with weighting (`adapters/providers/provider_pool.rs`)
- Lock-free circuit breaker (`adapters/circuit_breaker/in_memory.rs`)
- SQLite cache backend (`adapters/cache/sqlite_backend.rs`)
- HNSW semantic index (`adapters/cache/semantic_index.rs`)
- Cache migrations (`adapters/cache/migrations.rs`)
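The state transitions behind the lock-free circuit breaker in the list above can be sketched with standard atomics. This is an illustration under stated assumptions, not the planned `in_memory.rs`: `Breaker` is a hypothetical type, failures are counted consecutively, and the current time is passed in as `now_ms` so that the logic is deterministic.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, AtomicU8, Ordering};

pub const CLOSED: u8 = 0;
pub const OPEN: u8 = 1;
pub const HALF_OPEN: u8 = 2;

pub struct Breaker {
    state: AtomicU8,
    failures: AtomicU32,
    successes: AtomicU32,
    opened_at_ms: AtomicU64,
    failure_threshold: u32,
    success_threshold: u32,
    timeout_ms: u64,
}

impl Breaker {
    pub fn new(failure_threshold: u32, success_threshold: u32, timeout_ms: u64) -> Self {
        Self {
            state: AtomicU8::new(CLOSED),
            failures: AtomicU32::new(0),
            successes: AtomicU32::new(0),
            opened_at_ms: AtomicU64::new(0),
            failure_threshold,
            success_threshold,
            timeout_ms,
        }
    }

    pub fn state(&self) -> u8 {
        self.state.load(Ordering::Acquire)
    }

    /// Closed and half-open circuits admit requests. An open circuit admits
    /// one again only after `timeout_ms` has elapsed, moving to half-open.
    pub fn allow_request(&self, now_ms: u64) -> bool {
        if self.state.load(Ordering::Acquire) != OPEN {
            return true;
        }
        let opened = self.opened_at_ms.load(Ordering::Acquire);
        if now_ms.saturating_sub(opened) < self.timeout_ms {
            return false;
        }
        // Exactly one caller wins this transition and resets the probe counter.
        if self
            .state
            .compare_exchange(OPEN, HALF_OPEN, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
        {
            self.successes.store(0, Ordering::Release);
        }
        true
    }

    pub fn record_success(&self) {
        if self.state.load(Ordering::Acquire) == HALF_OPEN {
            let seen = self.successes.fetch_add(1, Ordering::AcqRel) + 1;
            if seen >= self.success_threshold {
                self.failures.store(0, Ordering::Release);
                self.state.store(CLOSED, Ordering::Release);
            }
        } else {
            // A success while closed ends the current failure streak.
            self.failures.store(0, Ordering::Release);
        }
    }

    pub fn record_failure(&self, now_ms: u64) {
        let state = self.state.load(Ordering::Acquire);
        let seen = self.failures.fetch_add(1, Ordering::AcqRel) + 1;
        // Any failure while half-open reopens the circuit immediately.
        if state == HALF_OPEN || seen >= self.failure_threshold {
            self.opened_at_ms.store(now_ms, Ordering::Release);
            self.state.store(OPEN, Ordering::Release);
        }
    }
}
```

No mutex is taken on any path. The trade-off in this sketch is that several callers may be admitted as probes while the circuit is half-open, since only the transition itself is guarded by the compare-and-swap.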
- Ollama embedding provider (`adapters/embeddings/ollama.rs`)
- OpenAI embedding provider (`adapters/embeddings/openai.rs`)
- Embedding quantization utilities
- Stdio transport (`adapters/mcp/stdio.rs`)
- web_search tool
- search_similar tool (RAG)
- search_news tool
- search_cached tool
- Admin tools (cache_clear, cache_migrate, etc.)
- Observability tools (get_provider_status, get_server_stats)
- Integration test fixtures
- E2E MCP protocol tests
- Documentation
- README.md with usage examples
cargo build --release
cargo test
cargo clippy -- -D warnings

RUST_LOG=debug ./target/release/mcp-websearch

Add to the Claude Desktop config (`claude_desktop_config.json`):
{
"mcpServers": {
"websearch": {
"command": "/path/to/mcp-websearch"
}
}
}

In Claude:
- "Search for Rust async programming" → Should use web_search tool
- "Find similar results to previous search" → Should use search_similar tool
- "What is the cache status?" → Should use get_cache_stats tool
- "Check provider health" → Should use get_provider_status tool
# Cache hit should be <10ms
# Cache miss should be <500ms
# Provider fallback should work (disable one provider, search should succeed)

- New files (all) - Greenfield project
- Key implementation files:
  - `src/domain/models.rs` - Core types
  - `src/ports/search_provider.rs` - Provider abstraction
  - `src/adapters/providers/*.rs` - Provider implementations
  - `src/adapters/cache/sqlite_backend.rs` - Cache implementation
  - `src/adapters/mcp/tools/*.rs` - MCP tool handlers
  - `migrations/V1__initial_schema.sql` - Database schema
- ✅ Provider capabilities exposed via trait for routing decisions
- ✅ Two-tier circuit breaker with fallback chain
- ✅ CacheBackend trait supports exact and semantic lookup
- ✅ MCP adapter is thin, delegates to application layer
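The semantic half of the `CacheBackend` lookup comes down to a similarity threshold and a result limit. The sketch below shows that contract with a brute-force cosine scan; it is a stand-in for the planned HNSW index, and `cosine_similarity` and `semantic_matches` are hypothetical helper names.

```rust
/// Cosine similarity of two vectors, or `None` when the dimensions differ,
/// the input is empty, or either vector has zero length.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Returns up to `limit` `(entry index, similarity)` pairs whose similarity
/// is at least `threshold`, best match first. Entries with a mismatched
/// dimension (for example, from an older embedding model) are skipped.
pub fn semantic_matches(
    query: &[f32],
    entries: &[Vec<f32>],
    threshold: f32,
    limit: usize,
) -> Vec<(usize, f32)> {
    let mut hits: Vec<(usize, f32)> = entries
        .iter()
        .enumerate()
        .filter_map(|(i, e)| cosine_similarity(query, e).map(|s| (i, s)))
        .filter(|(_, s)| *s >= threshold)
        .collect();
    hits.sort_by(|a, b| b.1.total_cmp(&a.1));
    hits.truncate(limit);
    hits
}
```

Skipping entries whose dimension differs from the query is one way to use the `embedding_dimension` field on `CacheEntry`: vectors produced by different models are never compared.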
- ✅ SSRF protection via URL allowlisting
- ✅ Prompt injection detection in input validation
- ✅ Budget controls for paid embedding APIs
- ✅ `secrecy` crate for API key memory protection
- ✅ File permission enforcement (0o600)
- ✅ `rustls` instead of OpenSSL
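The file permission item can be sketched as a startup check on Unix. This is a minimal illustration with hypothetical function names; it only reads the mode bits and refuses files that group or others can access, and it assumes a Unix target because it relies on `std::os::unix`.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// True when no group or other permission bits are set (0o600, 0o400, ...).
pub fn is_owner_only(mode: u32) -> bool {
    (mode & 0o077) == 0
}

/// Reads the file's mode and rejects anything accessible by group or others.
pub fn check_config_permissions(path: &Path) -> io::Result<()> {
    let mode = fs::metadata(path)?.permissions().mode();
    if is_owner_only(mode) {
        Ok(())
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!(
                "{} has mode {:o}, expected 600",
                path.display(),
                mode & 0o777
            ),
        ))
    }
}
```

Failing closed at startup, rather than silently fixing the mode, keeps the operator aware that a config file holding API keys was readable by other users.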
- ✅ Multi-tier cache (L1 HNSW, L2 quantized, L3 SQLite)
- ✅ Embedding quantization (96% memory reduction)
- ✅ Lock-free circuit breaker
- ✅ Shared HTTP client with connection pooling
- ✅ Provider racing with timeout (not waiting for all)
- ✅ WAL mode for SQLite concurrency
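Provider racing with a timeout means taking the first successful answer and not waiting for the rest. The sketch below shows the idea with standard threads and a channel; it is a thread-based stand-in, since the server itself is planned on Tokio, and `Task` and `race` are hypothetical names with errors reduced to strings.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// A boxed provider call. Errors are plain strings in this sketch.
pub type Task<T> = Box<dyn FnOnce() -> Result<T, String> + Send>;

/// Runs every task on its own thread and returns the first `Ok` value that
/// arrives before `timeout`. Returns `None` if all tasks fail or time runs out.
pub fn race<T: Send + 'static>(tasks: Vec<Task<T>>, timeout: Duration) -> Option<T> {
    let (tx, rx) = mpsc::channel();
    let count = tasks.len();
    for task in tasks {
        let tx = tx.clone();
        thread::spawn(move || {
            // The receiver may already be gone if another task won; ignore that.
            let _ = tx.send(task());
        });
    }
    drop(tx);
    let deadline = Instant::now() + timeout;
    for _ in 0..count {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(Ok(value)) => return Some(value),
            Ok(Err(_)) => continue, // This provider failed; wait for the others
            Err(_) => return None,  // Deadline passed or every sender is gone
        }
    }
    None
}
```

Slower tasks keep running in the background after a winner is returned; their results are dropped when the send fails. An async version would cancel the losers instead, which is one reason to prefer the Tokio equivalent in the real server.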
Cloned from agent-guardrails-template
- Read Before Editing - Never modify code without reading first
- Stay in Scope - Only touch authorized files
- Verify Before Committing - Test all changes
- Halt When Uncertain - Ask instead of guessing
Before ANY file modification, verify:
| # | Check | Requirement |
|---|---|---|
| 1 | READ FIRST | NEVER edit a file without reading it first |
| 2 | SCOPE LOCK | Only modify files explicitly in scope |
| 3 | NO FEATURE CREEP | Do NOT add features, refactor, or "improve" unrelated code |
| 4 | SUB-500 LINES | No file should exceed 500 lines |
| 5 | TEST BEFORE COMMIT | All tests must pass before committing |
| 6 | CHECK FAILURE REGISTRY | Review known bugs for affected files |
| 7 | VERIFY FIXES INTACT | Confirm previous fixes not being undone |
| Rule | Description |
|---|---|
| NO FORCE PUSH | Never use git push --force |
| NO AMEND | Do not amend commits you didn't create this session |
| NO CONFIG CHANGES | Do not modify git config |
| NO PUSH WITHOUT PERMISSION | Only push if user explicitly requests |
| NO SKIP HOOKS | Never use --no-verify |
| NO REBASE | Never rebase shared branches |
| Rule | Rationale |
|---|---|
| EXACT REPLACEMENT | Use provided code exactly - no "improvements" |
| NO NEW IMPORTS | Unless explicitly required by the task |
| PRESERVE FORMATTING | Match existing indentation and style |
| NO SECRETS | Never commit credentials, keys, tokens |
Stop immediately and report to user if ANY of these occur:
- Target file does not exist
- Line numbers don't match expected
- File has unexpected modifications
- Syntax check fails after edit
- Any test fails after edit
- Merge conflicts encountered
- Uncertain about ANY step
- Edit tool reports "string not found"
- Permission denied errors
mcp-websearch/
├── CLAUDE.md # Agent guidelines (see below)
├── .guardrails/
│ ├── pre-work-check.md # Pre-work checklist
│ ├── failure-registry.jsonl # Bug database
│ └── prevention-rules/
│ ├── pattern-rules.json # Regex-based rules
│ ├── semantic-rules.json # AST-based rules
│ └── extracted-rules.json # Rules from AGENT_GUARDRAILS.md
└── docs/
└── AGENT_GUARDRAILS.md # Full guardrails documentation
Create /mnt/ollama/git/mcp-websearch/CLAUDE.md:
# mcp-websearch - Rust MCP Web Search Server
## Project Navigation
- **INDEX_MAP.md**: Find documents by keyword/category (TODO: create)
- **HEADER_MAP.md**: Find specific sections with file:line references (TODO: create)
- **Flow**: INDEX_MAP → identify doc → HEADER_MAP → read specific section
## Context
Rust-based MCP server for AI assistants to perform intelligent web searches with:
- Multiple providers (DuckDuckGo, Brave, Synthetic Search)
- Weighted round-robin with automatic fallback
- Semantic caching with RAG integration
- Enterprise-grade observability via MCP tools
## Stack
- **Language**: Rust 1.80+
- **Runtime**: Tokio async
- **Architecture**: Hexagonal (ports/adapters)
- **Database**: SQLite (default) / PostgreSQL (optional)
- **Embeddings**: Ollama (default) / OpenAI (optional)
- **Protocol**: MCP via stdio
## Quick Commands
```bash
# Build
cargo build --release
# Test
cargo test
# Run
RUST_LOG=info ./target/release/mcp-websearch
# Lint
cargo clippy -- -D warnings
```

src/
├── domain/ # Core business logic (NO external deps)
├── application/ # Use case orchestration
├── ports/ # Trait definitions
├── adapters/ # External integrations (MCP, providers, cache, embeddings)
└── infrastructure/ # Cross-cutting concerns
All source files MUST be under 500 lines. Split files that exceed this limit.
MANDATORY: Read .guardrails/pre-work-check.md before any modifications.
- Read Before Editing - Never modify without reading
- Stay in Scope - Only touch authorized files
- Verify Before Committing - Test all changes
- Halt When Uncertain - Ask instead of guessing
- NO force push
- NO skip hooks (`--no-verify`)
- NO amend commits you didn't create
- NO push without permission
- NO secrets in code
- NO feature creep
- NO unnecessary imports
- Production code BEFORE test code
| File | Purpose |
|---|---|
| `src/domain/models.rs` | Core types (SearchResult, SearchQuery, CacheEntry) |
| `src/ports/search_provider.rs` | SearchProvider trait |
| `src/adapters/providers/*.rs` | DuckDuckGo, Brave, Synthetic implementations |
| `src/adapters/cache/sqlite_backend.rs` | SQLite cache with semantic search |
| `src/adapters/mcp/tools/*.rs` | MCP tool handlers |
| `migrations/*.sql` | Database schema |
- Integration tests with recorded fixtures in `tests/fixtures/`
- Run with `RECORD_FIXTURES=1` to capture real API responses
- All external provider calls should use fixtures in CI