raglogs is an incident analysis CLI tool. Contributions are welcome — bug fixes, new log adapters, normalization improvements, and documentation are all useful.
This document covers how to get set up, what the codebase expects, and how to submit changes.
```bash
git clone https://github.com/leo-aa88/raglogs
cd raglogs
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt && pip install -e .
cp .env.example .env
# Edit .env — set RAGLOGS_DB_URL at minimum
docker compose up postgres -d
raglogs init
```

Run the demo to verify everything works:

```bash
raglogs demo
raglogs timeline --since 2h
raglogs compare --since 2h --baseline 24h
```

The code is organized as follows:

```
src/core/
  clustering/       Fingerprint grouping, importance scoring, baseline comparison
  compare/          Window diffing — new, disappeared, increased, decreased
  explain/          Evidence assembly, confidence, templates, summarizer
  ingestion/        Ingestion orchestration and batch persistence
  llm/              Provider abstraction (OpenAI, Ollama, noop)
  normalization/    Message normalization, fingerprinting, trigger patterns
  parsing/          JSON and text parsers, field alias resolution
  retrieval/        Keyword-based question answering
  timeline/         Causal timeline reconstruction
src/cli/commands/   One file per CLI command
src/api/routes/     FastAPI route handlers
src/db/             SQLAlchemy models and session management
src/utils/          Time window parsing, hashing helpers
tests/unit/         Pure unit tests — no database required
tests/integration/  Full pipeline tests — require running Postgres
```
The pipeline flows: ingest → normalize → fingerprint → cluster → baseline compare → evidence assembly → explain / timeline / compare.
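As a mental model only (the function names, regexes, and fields below are illustrative assumptions, not the project's actual APIs), the front half of the pipeline is a chain of narrow transformations:

```python
# Sketch of normalize -> fingerprint -> cluster. Hypothetical code;
# the real implementations live under src/core/.
import hashlib
import re
from collections import defaultdict

def normalize(message: str) -> str:
    """Replace variable tokens so lines differing only in values share a template."""
    message = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", message)
    return re.sub(r"\d+", "<NUM>", message)

def fingerprint(template: str) -> str:
    """Stable short hash of a normalized template; the cluster key."""
    return hashlib.sha256(template.encode()).hexdigest()[:16]

messages = [
    "connection to 10.0.0.5 timed out after 3000ms",
    "connection to 10.0.0.9 timed out after 5000ms",
    "disk /dev/sda1 is 97% full",
]

clusters: dict[str, list[str]] = defaultdict(list)
for msg in messages:
    clusters[fingerprint(normalize(msg))].append(msg)

# Two clusters: both timeout lines collapse into one template, the disk line
# stands alone. Baseline comparison then diffs cluster counts across windows.
assert len(clusters) == 2
```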
Unit tests require no database:

```bash
make test-unit
# or
pytest tests/unit/
```

Integration tests require a running Postgres instance:

```bash
docker compose up postgres -d
make test-int
# or
pytest tests/integration/
```

All tests must pass before a pull request will be merged. If you are adding new functionality, add tests for it.
Code style:

- Python 3.11+
- Type annotations on all function signatures
- `ruff` for linting, `black` for formatting

```bash
make lint
make format
```

No hard rules on line length beyond what `black` enforces. Prefer explicit over clever. Avoid abstractions that exist only to save lines.
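As a concrete illustration of that style, here is a small, fully annotated helper in the spirit of the time window parsing in `src/utils/`. This is not the project's actual parser; the name and behavior are hypothetical:

```python
from datetime import datetime, timedelta

_UNITS = {"m": "minutes", "h": "hours", "d": "days"}

def window_start(now: datetime, since: str) -> datetime:
    """Return the start of a relative window such as '2h' or '30m'."""
    value, unit = since[:-1], since[-1]
    if unit not in _UNITS or not value.isdigit():
        raise ValueError(f"unsupported window: {since!r}")
    return now - timedelta(**{_UNITS[unit]: int(value)})
```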
### Normalization improvements

The normalization step (`src/core/normalization/patterns.py`) determines clustering quality. If you have real-world log formats that normalize poorly, producing too many clusters for the same error or collapsing unrelated errors, a fix there has high leverage. Include before/after examples and a test in `tests/unit/test_normalization.py`.
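A before/after test for such a fix might look like this. The import path and `normalize` signature are assumptions; match them to the real module:

```python
# tests/unit/test_normalization.py (sketch; adjust the import to the real API)
from core.normalization.patterns import normalize

def test_timeout_variants_share_one_template() -> None:
    before_a = "connection to 10.0.0.5 timed out after 3000ms"
    before_b = "connection to 10.0.0.9 timed out after 5000ms"
    # Before the fix these produced two clusters for the same error;
    # after it, both variants should normalize to one shared template.
    assert normalize(before_a) == normalize(before_b)
```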
### Log source adapters

New adapters go in `src/adapters/`. Each adapter yields `ParsedLogLine` objects; the rest of the pipeline is fully source-agnostic. Useful adapters: Datadog, Loki, Kubernetes pod logs, CloudWatch.
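For orientation, an adapter is essentially a generator over a source. The `ParsedLogLine` fields below are assumptions, and the stand-in dataclass exists only so the sketch runs on its own; check the real type under `src/core/parsing/` before writing one:

```python
# src/adapters/file_tail.py (illustrative; use the project's real ParsedLogLine)
from collections.abc import Iterator
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ParsedLogLine:  # stand-in with assumed fields
    timestamp: datetime
    message: str
    source: str

def read_plain_file(path: str) -> Iterator[ParsedLogLine]:
    """Yield one ParsedLogLine per non-empty line of a plain-text file."""
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if line:
                yield ParsedLogLine(
                    timestamp=datetime.now(timezone.utc),
                    message=line,
                    source=path,
                )
```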
### Trigger patterns

New trigger event patterns go in `TRIGGER_PATTERNS` in `src/core/normalization/patterns.py`. A good trigger pattern is specific enough to avoid false positives and general enough to match common variants across log formats.
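Illustratively, and assuming `TRIGGER_PATTERNS` maps a trigger name to a compiled regex (verify the actual structure in `patterns.py`), an entry that balances the two:

```python
import re

# Hypothetical entry: matches "OOMKilled", "Out of memory: Killed process 1234",
# and "oom-kill"/"oom-killer" variants, but not ordinary mentions of "memory".
TRIGGER_PATTERNS = {
    "oom_kill": re.compile(
        r"(OOMKilled|Out of memory:? Killed process \d+|oom-kill(er)?)",
        re.IGNORECASE,
    ),
}
```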
### Bug fixes

Check the issue tracker. Bugs with a reproducing log sample are easiest to fix.
To submit a change:

- Fork the repository and create a branch from `main`
- Make your changes with tests
- Run `make lint` and `make test-unit`; both must pass
- Open a pull request with a clear description of what changed and why
- Reference any related issue
Pull requests that are purely cosmetic (reformatting with no functional change) will not be merged.
Keep pull requests focused. A PR that fixes a bug and adds an unrelated feature is harder to review and slower to merge than two separate PRs.
Commit messages need no strict format. Be clear about what changed and why. One-line messages are fine for small changes. For anything non-trivial:
```
Short summary (under 72 chars)

Longer explanation of what changed and why, if not obvious from the
diff. Reference the issue number if applicable.
```
For questions, open a GitHub issue with the `question` label. Asking whether a contribution would be accepted before you invest time in it is a good use of an issue.