|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +Qualifire Python SDK - A library for evaluating LLM outputs for quality, safety, and reliability. Detects hallucinations, ensures grounding, identifies PII, blocks harmful content, and validates instruction-following through a single API. |
| 8 | + |
| 9 | +## Development Commands |
| 10 | + |
| 11 | +```bash |
| 12 | +# Install dependencies and set up environment |
| 13 | +make install |
| 14 | +make pre-commit-install |
| 15 | + |
| 16 | +# Run tests with coverage |
| 17 | +make test |
| 18 | + |
| 19 | +# Run a single test file |
| 20 | +poetry run pytest tests/test_types.py |
| 21 | + |
| 22 | +# Run a specific test |
| 23 | +poetry run pytest tests/test_types.py::TestEvaluationRequest::test_validate_messages_input_output -v |
| 24 | + |
| 25 | +# Format code (isort, black, pyupgrade) |
| 26 | +make codestyle |
| 27 | + |
| 28 | +# Check code style without modifying |
| 29 | +make check-codestyle |
| 30 | + |
| 31 | +# Run type checking |
| 32 | +make mypy |
| 33 | + |
| 34 | +# Run all checks (tests, codestyle, mypy, safety) |
| 35 | +make lint |
| 36 | + |
| 37 | +# Check for security issues |
| 38 | +make check-safety |
| 39 | +``` |
| 40 | + |
| 41 | +## Architecture |
| 42 | + |
| 43 | +The SDK has a minimal structure centered on the `Client` class: |
| 44 | + |
| 45 | +- **`qualifire/client.py`**: Main `Client` class with two public methods: |
| 46 | + - `evaluate()` - Run ad-hoc evaluations with various checks (hallucinations, grounding, PII, content moderation, etc.) |
| 47 | + - `invoke_evaluation()` - Run pre-configured evaluations from the Qualifire dashboard |
| 48 | + |
| 49 | +- **`qualifire/types.py`**: All data classes and enums: |
| 50 | + - `EvaluationRequest`/`EvaluationResponse` - API request/response structures |
| 51 | + - `LLMMessage`, `LLMToolCall`, `LLMToolDefinition` - Message and tool types for conversation-based evaluation |
| 52 | + - `ModelMode` (SPEED/BALANCED/QUALITY) - Quality vs speed trade-off for checks |
| 53 | + - `SyntaxCheckArgs` - Configuration for syntax validation |
| 54 | + |
| 55 | +- **`qualifire/utils.py`**: Helper functions for API key and base URL resolution from environment variables |
| 56 | + |
| 57 | +## Key Patterns |
| 58 | + |
| 59 | +- The API key can be passed directly or via the `QUALIFIRE_API_KEY` environment variable |
| 60 | +- The base URL defaults to `https://proxy.qualifire.ai/` but can be overridden via the `QUALIFIRE_BASE_URL` environment variable |
| 61 | +- The `EvaluationRequest` dataclass validates inputs in `__post_init__` (e.g., `tool_selection_quality_check` requires both messages and `available_tools`) |
| 62 | +- Legacy content moderation checks (`dangerous_content_check`, `harassment_check`, etc.) are deprecated in favor of the unified `content_moderation_check` |
| 63 | + |
| 64 | +## Testing |
| 65 | + |
| 66 | +Tests are in `tests/` and use pytest with parametrized test cases. Run `make test` for the full suite with coverage report. |
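
The "Key Patterns" described in the file can be illustrated with a minimal, self-contained sketch. It is not the SDK's actual code: the function names `resolve_api_key` and `resolve_base_url`, the class `SketchEvaluationRequest`, the explicit `base_url` argument, and the error raised when no key is found are all assumptions made for illustration. Only the environment variable names, the default URL, and the `tool_selection_quality_check` rule come from the file.

```python
import os
from dataclasses import dataclass
from typing import Optional

# Default taken from the "Key Patterns" section of CLAUDE.md.
DEFAULT_BASE_URL = "https://proxy.qualifire.ai/"


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Prefer an explicit key, else fall back to QUALIFIRE_API_KEY.

    Hypothetical helper: raising ValueError on a missing key is an assumption.
    """
    key = api_key or os.environ.get("QUALIFIRE_API_KEY")
    if not key:
        raise ValueError("No API key: pass one or set QUALIFIRE_API_KEY")
    return key


def resolve_base_url(base_url: Optional[str] = None) -> str:
    """Prefer an explicit URL, then QUALIFIRE_BASE_URL, then the default.

    Hypothetical helper: the explicit argument is an assumption.
    """
    return base_url or os.environ.get("QUALIFIRE_BASE_URL") or DEFAULT_BASE_URL


@dataclass
class SketchEvaluationRequest:
    """Hypothetical stand-in for EvaluationRequest with a single rule."""

    messages: Optional[list] = None
    available_tools: Optional[list] = None
    tool_selection_quality_check: bool = False

    def __post_init__(self) -> None:
        # Mirrors the documented rule: the tool selection check needs
        # both messages and available_tools to be provided.
        if self.tool_selection_quality_check and not (
            self.messages and self.available_tools
        ):
            raise ValueError(
                "tool_selection_quality_check requires both messages "
                "and available_tools"
            )
```

Validating in `__post_init__` means an invalid combination fails at construction time, before any request is sent. Constructing `SketchEvaluationRequest(tool_selection_quality_check=True)` without messages and tools raises `ValueError` in this sketch.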