AGENTS.md

Project Overview
Project Principles
Project Conventions & Patterns
High-Level Architecture
Subsystems Reference
Code Organization
Testing Approach
Development Commands

Project Overview

Agentic Memorizer is an automated knowledge graph builder that monitors user-configured filesystem paths, applies a set of filters, analyzes content using AI providers, and maintains a searchable graph. The daemon watches and walks registered directories for changes and automatically processes files through format-specific chunkers, semantic analysis, and embeddings generation. Results are exposed to AI assistants via the Model Context Protocol (MCP), Hooks, and Plugins.

Key capabilities:

- Filesystem Monitoring: Watches registered directories for file changes with event coalescing
- Intelligent Chunking: 22 format-specific chunkers for code (Tree-sitter AST with 8 languages), documents (PDF, DOCX, ODT), markup (Markdown, LaTeX, HTML), configuration (TOML, HCL, Dockerfile), data formats (JSON, YAML, SQL), and notebooks (Jupyter)
- Semantic Analysis: Pluggable AI providers (Anthropic, OpenAI, Google) extract topics, entities, and summaries
- Vector Embeddings: OpenAI, Voyage AI, and Google providers for semantic similarity search
- Knowledge Graph: FalkorDB (Redis Graph) backend with typed metadata relationships
- AI Tool Integration: MCP server and hooks for Claude Code, Gemini CLI, Codex, and OpenCode

Project Principles

Unix Philosophy: Each component does one thing well. Data flows through text-based formats (JSON, YAML). Components are composable and can be scripted. Output is silent by default; verbosity is opt-in. All state is inspectable and human-readable.
Graceful Component Degradation: The system continues operating with reduced functionality when external services fail. If the graph connection fails, the daemon enters degraded mode but continues processing. If a provider is unavailable, analysis proceeds without that capability. Failures are logged and surfaced via health endpoints, never silently ignored.
Loose Coupling: Components communicate via an event bus rather than direct method calls. The watcher publishes events; the queue subscribes. The cleaner subscribes to deletion events. Components can be replaced or extended without modifying their consumers.
Eventual Consistency: The filesystem is the source of truth. Changes produce events that propagate asynchronously through the system. The knowledge graph reflects filesystem state only after processing completes. Queries may return stale data during processing; this is acceptable.
Observability: Comprehensive logging, health checks, and status commands provide visibility into system state. Each component logs key events and errors with context. Health endpoints report component status and degradation.
Extensibility: Interface-first design, registry patterns, and an event-driven architecture make it easy to add or replace the concrete implementation of any individual component.

Adhering to Project Principles

Unix Philosophy:
- New components should have a single, clear responsibility
- Prefer text-based serialization (JSON, YAML) over binary formats
- Support --quiet and --verbose flags where applicable
- Expose internal state via health endpoints or status commands
- Design for scriptability: predictable exit codes, machine-parseable output options
Graceful Component Degradation:
- Initialize optional components (graph, providers, MCP) with error handling that logs warnings and continues
- Track degraded state via boolean flags (e.g., graphDegraded, mcpDegraded)
- Surface degradation in health endpoints and status commands
- Never crash the daemon due to external service failures
- Use retry logic with configurable backoff for transient failures
Loose Coupling:
- Components should depend on interfaces, not concrete implementations
- Use the event bus for cross-component communication instead of direct calls
- New functionality should subscribe to existing events rather than modify publishers
- Registries (chunkers, handlers, providers) enable runtime component selection
- Avoid circular dependencies between packages
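
As an illustration of the event-bus guidance above, here is a small in-memory bus. The `Event`, `Handler`, and `Bus` types, the `Subscribe`/`Publish` signatures, and the event type strings are assumptions made for this sketch; the project's real `Bus` interface may differ. Delivery here is synchronous to keep the example short, while the daemon propagates events asynchronously.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical event carrying a type and a file path.
type Event struct {
	Type string
	Path string
}

// Handler reacts to a single event.
type Handler func(Event)

// Bus routes events to the handlers subscribed to their type.
type Bus struct {
	mu       sync.RWMutex
	handlers map[string][]Handler
}

func NewBus() *Bus {
	return &Bus{handlers: make(map[string][]Handler)}
}

// Subscribe registers h for every event of the given type.
func (b *Bus) Subscribe(eventType string, h Handler) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.handlers[eventType] = append(b.handlers[eventType], h)
}

// Publish delivers e to each subscriber of e.Type. The handler list is
// copied first so handlers run without holding the lock.
func (b *Bus) Publish(e Event) {
	b.mu.RLock()
	hs := append([]Handler(nil), b.handlers[e.Type]...)
	b.mu.RUnlock()
	for _, h := range hs {
		h(e)
	}
}

func main() {
	bus := NewBus()
	// The publisher knows nothing about these two consumers.
	bus.Subscribe("file.changed", func(e Event) { fmt.Println("queue: analyze", e.Path) })
	bus.Subscribe("file.deleted", func(e Event) { fmt.Println("cleaner: remove", e.Path) })

	bus.Publish(Event{Type: "file.changed", Path: "notes.md"})
	bus.Publish(Event{Type: "file.deleted", Path: "old.md"})
}
```

Adding a third consumer means one more `Subscribe` call; the code that publishes is left untouched.
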
Eventual Consistency:
- Accept that queries may return stale data during processing
- Design reconciliation logic to handle filesystem changes that occurred during walks
- Use the cleaner to remove stale graph entries after walks complete
- Emit events when state changes so dependent components can react
- Avoid synchronous dependencies between the filesystem state and graph state
Observability:
- Implement health checks via ComponentHealth struct with status (running/degraded/failed), error message, and timestamps
- Register components with ComponentHealthCollector to participate in aggregate health reporting
- Expose metrics via the MetricsProvider interface (CollectMetrics(ctx) error)
- Use structured logging with slog and component context: slog.Default().With("component", "name")
- Populate health Details map with actionable diagnostics (queue depths, drop rates, counts)
- Support /healthz (liveness), /readyz (readiness with component breakdown), and /metrics endpoints
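
A health record along the lines described above might look like the following sketch. The field names, JSON tags, and the `Aggregate` helper are assumptions; only the `ComponentHealth` name, the three status values, and the Details map come from the text.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Status is one of the three states named in the guidelines.
type Status string

const (
	StatusRunning  Status = "running"
	StatusDegraded Status = "degraded"
	StatusFailed   Status = "failed"
)

// ComponentHealth is an assumed layout for a per-component health record.
type ComponentHealth struct {
	Status    Status         `json:"status"`
	Error     string         `json:"error,omitempty"`
	Details   map[string]any `json:"details,omitempty"`
	CheckedAt time.Time      `json:"checked_at"`
}

// severity ranks statuses so the worst one can be selected.
func severity(s Status) int {
	switch s {
	case StatusFailed:
		return 2
	case StatusDegraded:
		return 1
	default:
		return 0
	}
}

// Aggregate returns the worst status among all components.
func Aggregate(components map[string]ComponentHealth) Status {
	worst := StatusRunning
	for _, h := range components {
		if severity(h.Status) > severity(worst) {
			worst = h.Status
		}
	}
	return worst
}

func main() {
	now := time.Now()
	components := map[string]ComponentHealth{
		"graph": {Status: StatusDegraded, Error: "connection refused", CheckedAt: now},
		"queue": {Status: StatusRunning, Details: map[string]any{"depth": 12}, CheckedAt: now},
	}
	report := map[string]any{"status": Aggregate(components), "components": components}
	out, err := json.MarshalIndent(report, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```
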
Extensibility:
- Define component behavior as interfaces before implementation (see Chunker, Provider, Graph, Bus)
- Use the registry pattern with priority ordering for pluggable components (chunkers, providers)
- Implement CanHandle() for capability-based selection and Priority() for ordering
- Use the functional options pattern (WithXXX functions) for configurable constructors
- Subscribe to events rather than modifying publishers when adding new functionality
- New chunkers: implement interface, set priority, register in DefaultRegistry()
- New providers: implement interface, call RegisterSemantic() or RegisterEmbeddings()
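
To make the registry guidance concrete, the sketch below selects a chunker by capability and priority. The `Chunker` method signatures, the `Registry` type, and the rule that a larger number means higher priority are assumptions for illustration; the project's actual interfaces and `DefaultRegistry()` are not shown in this document.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// Chunker is an assumed minimal interface with the two selection methods.
type Chunker interface {
	Name() string
	CanHandle(path string) bool
	Priority() int
}

// extChunker matches files by extension; an empty ext matches everything.
type extChunker struct {
	name     string
	ext      string
	priority int
}

func (c extChunker) Name() string  { return c.name }
func (c extChunker) Priority() int { return c.priority }
func (c extChunker) CanHandle(path string) bool {
	return c.ext == "" || filepath.Ext(path) == c.ext
}

// Registry keeps chunkers sorted from highest to lowest priority.
type Registry struct {
	chunkers []Chunker
}

// Register adds c and re-sorts so that Select can scan in priority order.
func (r *Registry) Register(c Chunker) {
	r.chunkers = append(r.chunkers, c)
	sort.SliceStable(r.chunkers, func(i, j int) bool {
		return r.chunkers[i].Priority() > r.chunkers[j].Priority()
	})
}

// Select returns the highest-priority chunker that can handle path.
func (r *Registry) Select(path string) (Chunker, bool) {
	for _, c := range r.chunkers {
		if c.CanHandle(path) {
			return c, true
		}
	}
	return nil, false
}

func main() {
	r := &Registry{}
	r.Register(extChunker{name: "fallback", ext: "", priority: 0})
	r.Register(extChunker{name: "markdown", ext: ".md", priority: 50})

	for _, p := range []string{"README.md", "data.bin"} {
		if c, ok := r.Select(p); ok {
			fmt.Printf("%s -> %s\n", p, c.Name())
		}
	}
	// Prints:
	// README.md -> markdown
	// data.bin -> fallback
}
```
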

Project Conventions & Patterns

Typed Configuration & User Input: Access configuration via typed structs, never string keys. CLI flags use variable-based storage with {commandName}{FlagName} naming. All user input is validated in PreRunE hooks before business logic executes.

// Typed config access
cfg := config.Get()
port := cfg.Daemon.HTTPPort
pidFile := config.ExpandPath(cfg.Daemon.PIDFile)

// Variable-based flag storage
var rebuildForce bool
func init() {
    RebuildCmd.Flags().BoolVar(&rebuildForce, "force", false, "Force rebuild")
}

// PreRunE validation pattern
var MyCmd = &cobra.Command{
    PreRunE: validateMy,
    RunE:    runMy,
}
func validateMy(cmd *cobra.Command, args []string) error {
    // Validate input...
    cmd.SilenceUsage = true  // Set AFTER validation passes
    return nil
}

Interface-First Design: Major subsystems define behavior through interfaces before implementation. Graph, Walker, Watcher, Registry, Bus, Chunker, and Provider are all interfaces with concrete implementations. This enables testing via mocks, component substitution, and clear contracts between packages.

Functional Options Pattern: Constructors use WithXXX option functions for configuration:

q := analysis.NewQueue(bus,
    analysis.WithWorkerCount(4),
    analysis.WithLogger(slog.Default()),
)
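
For readers new to the pattern, the option functions used above could be defined along these lines. This is a sketch under assumptions: the `Queue` fields, the defaults, and the validation are invented, and the `bus` argument is left out so the example stays self-contained.

```go
package main

import (
	"fmt"
	"log/slog"
)

// Queue holds the settings that options may override (assumed fields).
type Queue struct {
	workerCount int
	logger      *slog.Logger
}

// Option mutates a Queue during construction.
type Option func(*Queue)

// WithWorkerCount sets the number of workers; non-positive values are ignored.
func WithWorkerCount(n int) Option {
	return func(q *Queue) {
		if n > 0 {
			q.workerCount = n
		}
	}
}

// WithLogger sets the logger; nil is ignored so the default stays in place.
func WithLogger(l *slog.Logger) Option {
	return func(q *Queue) {
		if l != nil {
			q.logger = l
		}
	}
}

// NewQueue applies defaults first, then each option in order.
func NewQueue(opts ...Option) *Queue {
	q := &Queue{workerCount: 1, logger: slog.Default()}
	for _, opt := range opts {
		opt(q)
	}
	return q
}

func main() {
	q := NewQueue(WithWorkerCount(4))
	fmt.Println("workers:", q.workerCount) // prints "workers: 4"
}
```

Because defaults are applied before the options, callers only name the settings they want to change, and new options can be added later without breaking existing call sites.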

Registry Pattern with Priority Selection: Pluggable components use centralized registries with priority ordering. When a component fails, the system falls through to lower-priority alternatives. Chunkers, handlers, and providers all use this pattern.
Ordered Component Lifecycle: Components follow an Initialize→Start→Stop lifecycle. The Orchestrator initializes components in dependency order and shuts them down in reverse order, so pending work drains before the dependencies it relies on are closed.
Panic Recovery in Concurrent Code: Event handlers and worker goroutines recover from panics to prevent cascading failures. Panics are logged with context but don't crash the daemon.

Error Handling: Use semicolons (not colons) when wrapping errors for cleaner CLI output:

return fmt.Errorf("failed to initialize config; %w", err)  // Correct
return fmt.Errorf("failed to initialize config: %w", err)  // Incorrect

Structured Logging: Use log/slog with key-value pairs. Add component context via .With():

slog.Info("starting daemon", "http_port", cfg.Daemon.HTTPPort)
logger := slog.Default().With("component", "graph")

CLI Command Organization: Commands are organized by domain, with subcommands in nested directories:

cmd/{parent}/
├── {parent}.go              # Parent command definition
└── subcommands/
    ├── {subcommand}.go      # One file per subcommand
    └── helpers.go           # Shared utilities

High-Level Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                         CLI Layer (Cobra)                           │
│  [version] [initialize] [daemon] [remember] [forget] [list] [read]  │
│  [integrations] [providers] [config]                                │
└──────────────────────────────┬──────────────────────────────────────┘
                               │
┌──────────────────────────────▼──────────────────────────────────────┐
│                         Daemon Core                                 │
│  Component Lifecycle │ Health Manager │ HTTP Server (7600)          │
└──────────────────────────────┬──────────────────────────────────────┘
                               │
      ┌────────────────────────┼────────────────────────┐
      │                        │                        │
┌─────▼─────┐  ┌───────────────▼───────────────┐  ┌─────▼───────┐
│  Watcher  │  │     Analysis Pipeline         │  │   Graph     │
│ (fsnotify)│  │  Queue → Workers → Handlers   │  │ (FalkorDB)  │
└─────┬─────┘  └───────────────┬───────────────┘  └─────────────┘
      │                        │
      │         ┌──────────────┼──────────────┐
      │         │              │              │
      │    ┌────▼────┐   ┌─────▼─────┐  ┌─────▼──────┐
      │    │Chunkers │   │ Semantic  │  │ Embeddings │
      │    │  (22)   │   │ Providers │  │ Providers  │
      │    └─────────┘   └───────────┘  └────────────┘
      │
      └──→ Event Bus ──→ Analysis Queue

Data Flow:

1. Watcher detects filesystem changes in remembered directories
2. Events are coalesced and published to the event bus
3. Analysis workers process files through chunkers and AI providers
4. Results are persisted to the FalkorDB knowledge graph
5. CLI, MCP server, and integrations query the graph
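
Event coalescing, mentioned in the data flow above, usually means collapsing a burst of events for one file into a single event before publishing. The sketch below shows one possible approach and is not the watcher's actual algorithm: `Event`, `Coalescer`, and the operation strings are hypothetical, and the timer that would trigger `Flush` is left out so the example stays deterministic.

```go
package main

import "fmt"

// Event is a hypothetical filesystem event; Op is "write", "remove", etc.
type Event struct {
	Path string
	Op   string
}

// Coalescer buffers events and keeps only the latest one per path.
type Coalescer struct {
	latest map[string]Event
	order  []string // first-seen order of paths, for stable output
}

func NewCoalescer() *Coalescer {
	return &Coalescer{latest: make(map[string]Event)}
}

// Add records an event, replacing any pending event for the same path.
func (c *Coalescer) Add(e Event) {
	if _, seen := c.latest[e.Path]; !seen {
		c.order = append(c.order, e.Path)
	}
	c.latest[e.Path] = e
}

// Flush returns one event per path and resets the buffer. A real watcher
// would presumably call this from a debounce timer before publishing.
func (c *Coalescer) Flush() []Event {
	out := make([]Event, 0, len(c.order))
	for _, p := range c.order {
		out = append(out, c.latest[p])
	}
	c.latest = make(map[string]Event)
	c.order = nil
	return out
}

func main() {
	c := NewCoalescer()
	c.Add(Event{Path: "notes.md", Op: "write"})
	c.Add(Event{Path: "notes.md", Op: "write"}) // duplicate burst from an editor save
	c.Add(Event{Path: "old.md", Op: "remove"})

	for _, e := range c.Flush() {
		fmt.Println(e.Op, e.Path)
	}
	// Prints:
	// write notes.md
	// remove old.md
}
```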

Key External Dependencies:

- FalkorDB: Redis Graph for knowledge storage
- SQLite: Registry database for remembered paths
- AI Providers: Anthropic, OpenAI, Google, Voyage AI

Subsystems Reference

No subsystem documentation exists yet at docs/subsystems/. Key internal packages:

Core Subsystems:

Package	Purpose
`internal/daemon`	Daemon lifecycle, health monitoring, component orchestration
`internal/events`	Event bus, event type definitions, critical queue for dropped events
`internal/watcher`	Real-time filesystem monitoring with fsnotify and event coalescing
`internal/walker`	Directory traversal and file discovery for initial/incremental walks
`internal/analysis`	Work queue, workers, pipeline stages, and analysis coordination
`internal/chunkers`	22 format-specific file chunkers with Tree-sitter code support
`internal/providers`	Semantic analysis and embeddings provider implementations
`internal/graph`	FalkorDB client, schema definitions, and graph operations
`internal/cleaner`	Removes stale cache and graph entries after walks complete
`internal/cache`	Caching layer for semantic and embeddings results

Infrastructure:

Package	Purpose
`internal/config`	Typed configuration with Viper, validation, and defaults
`internal/registry`	SQLite registry for remembered paths and file states
`internal/logging`	Structured logging setup and configuration
`internal/metrics`	Prometheus metrics collection and recording
`internal/container`	Dependency injection container for component bootstrapping

Integration:

Package	Purpose
`internal/mcp`	Model Context Protocol server implementation
`internal/integrations`	Hook, MCP, and plugin integrations for AI tools
`internal/export`	Data export functionality for CLI
`internal/daemonclient`	HTTP client for CLI-to-daemon communication

Utilities:

Package	Purpose
`internal/fsutil`	Common filesystem operations and helpers
`internal/cmdutil`	CLI command utilities and helpers
`internal/testutil`	Testing utilities with isolated environments
`internal/tui`	Terminal UI components for interactive commands
`internal/version`	Build version and metadata

Code Organization

agentic-memorizer/
├── main.go                 # Entry point
├── cmd/                    # CLI commands (Cobra)
│   ├── root.go             # Root command with PersistentPreRunE
│   ├── daemon/             # Daemon subcommands
│   ├── remember/           # Path registration
│   └── ...                 # Other command groups
├── internal/               # Internal packages
│   ├── config/             # Configuration (types, defaults, validate, load)
│   ├── daemon/             # Daemon lifecycle management
│   ├── chunkers/           # File chunkers by format
│   │   └── code/           # Tree-sitter code chunkers
│   ├── providers/          # AI provider implementations
│   │   ├── semantic/       # Anthropic, OpenAI, Google
│   │   └── embeddings/     # OpenAI, Voyage, Google
│   ├── graph/              # FalkorDB client and models
│   └── ...                 # Other subsystems
├── testdata/               # Test fixtures organized by type
├── Makefile                # Build automation
└── config.yaml.example     # Example configuration

Testing Approach

Framework: Go stdlib testing package only (no testify, ginkgo, etc.)

Table-Driven Tests: Standard pattern throughout:

tests := []struct {
    name string
    // fields...
}{
    {"case 1", ...},
    {"case 2", ...},
}
for _, tt := range tests {
    t.Run(tt.name, func(t *testing.T) {
        // test logic
    })
}

Test Utilities (internal/testutil/):

- TestEnv provides isolated test environments with temp directories
- Automatic cleanup via t.Cleanup()
- Environment variable isolation via t.Setenv()

Coverage: 90+ test files across all major subsystems including config, chunkers, graph, commands, and providers.

Development Commands

Building & Testing

# Build the binary
make build

# Build and install to ~/.local/bin
make install

# Run all tests
make test

# Run tests with race detector
make test-race

# Run linter
make lint

# Run linter with auto-fix
make lint-fix

# Clean build artifacts
make clean

Running the Application

# Run interactive setup wizard
memorizer initialize

# Start the daemon (foreground)
memorizer daemon start

# Stop the daemon
memorizer daemon stop

# Check daemon status
memorizer daemon status

# Remember a directory
memorizer remember ~/projects/myapp

# List remembered directories
memorizer list

# Export knowledge graph
memorizer read --format json

# Setup an integration
memorizer integrations setup claude-code-mcp
