Commit 109081a

refactor(evaluators)!: reorganize into builtin + extra tiers

Split evaluators into two packages:
- builtin (`agent-control-evaluators`): Core infrastructure + regex, list, json, sql
- extra/galileo (`agent-control-evaluator-galileo`): Luna2 evaluator (calls external API)

BREAKING CHANGES:
- Luna2 import path changed from `agent_control_evaluators.galileo_luna2` to `agent_control_evaluator_galileo.luna2`
- External evaluator names use dot notation instead of slash (e.g., `galileo.luna2` instead of `galileo/luna2`)
- SDK and server now depend on `agent-control-evaluators` as a runtime dependency (not vendored) to avoid duplicate module conflicts

Key changes:
- Move builtin evaluators to `evaluators/builtin/`
- Create `evaluators/extra/galileo/` as separate package
- Add entry points for plugin discovery (`agent_control.evaluators`)
- Update workspace to include only builtin (extras excluded for perf)
- Add CI workflow for testing extra packages
- Add template scaffold for creating new evaluator packages
- Server build script no longer vendors evaluators

1 parent e176ad8 commit 109081a
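
For callers affected by the breaking changes listed in the commit message, the rename looks mechanical. The following is a minimal sketch and not part of this commit: `migrate_evaluator_name` is a hypothetical helper, and the import shown in the comment assumes `Luna2Evaluator` is exposed from the new module path named in the commit message.

```python
# Old import (before this commit):
#   from agent_control_evaluators.galileo_luna2 import Luna2Evaluator
# New import (assumed, based on the path given in the commit message):
#   from agent_control_evaluator_galileo.luna2 import Luna2Evaluator


def migrate_evaluator_name(name: str) -> str:
    """Convert a legacy slash-style external evaluator name to dot notation.

    Hypothetical helper for illustration; it is not part of agent-control.
    """
    # Built-in names ("regex") and agent-scoped names ("agent:name")
    # contain no slash and are returned unchanged.
    if "/" not in name:
        return name
    provider, _, evaluator = name.partition("/")
    return f"{provider}.{evaluator}"


if __name__ == "__main__":
    for old in ("galileo/luna2", "regex", "agent:my_check"):
        print(old, "->", migrate_evaluator_name(old))
```

A helper like this could be run once over stored control definitions that still reference `galileo/luna2`.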


52 files changed, +627 -184 lines changed

.github/workflows/test-extras.yml

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+name: Test Extras
+
+on:
+  push:
+    paths:
+      # Trigger on extra changes
+      - 'evaluators/extra/**'
+      # Also trigger on core changes that could break extras
+      - 'evaluators/builtin/**'
+      - 'models/**'
+      - 'engine/**'
+      - 'server/**'
+      - 'sdks/python/**'
+  pull_request:
+    paths:
+      - 'evaluators/extra/**'
+      - 'evaluators/builtin/**'
+      - 'models/**'
+      - 'engine/**'
+      - 'server/**'
+      - 'sdks/python/**'
+
+jobs:
+  test-galileo:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup uv and Python
+        uses: astral-sh/setup-uv@v3
+        with:
+          python-version: "3.12"
+
+      - name: Sync workspace
+        run: make sync
+
+      - name: Install galileo extra
+        run: cd evaluators/extra/galileo && uv pip install -e .
+
+      - name: Lint galileo
+        run: cd evaluators/extra/galileo && uv run ruff check --config ../../../pyproject.toml src/
+
+      - name: Typecheck galileo
+        run: cd evaluators/extra/galileo && uv run mypy --config-file ../../../pyproject.toml src/
+
+      - name: Test galileo
+        run: cd evaluators/extra/galileo && uv run pytest
+
+      - name: Verify SDK integration
+        run: |
+          cd sdks/python
+          uv run pytest tests/test_luna2_smoke.py
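
The `paths` filters in this workflow mean the extras job runs only when a change touches the extras themselves or one of the core packages they depend on. The sketch below approximates that decision locally, assuming each `dir/**` pattern behaves like a simple directory-prefix match; `triggers_extras_workflow` is a hypothetical helper, not something GitHub Actions or this repository provides.

```python
# Directory prefixes taken from the `paths` lists in test-extras.yml.
TRIGGER_PREFIXES = (
    "evaluators/extra/",
    "evaluators/builtin/",
    "models/",
    "engine/",
    "server/",
    "sdks/python/",
)


def triggers_extras_workflow(changed_files: list[str]) -> bool:
    """Return True if any changed file falls under a watched directory.

    Approximation only: treats each 'dir/**' pattern as a prefix match.
    """
    return any(path.startswith(TRIGGER_PREFIXES) for path in changed_files)


if __name__ == "__main__":
    print(triggers_extras_workflow(["engine/src/agent_control_engine/core.py"]))
    print(triggers_extras_workflow(["ui/package.json", "README.md"]))
```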

AGENTS.md

Lines changed: 12 additions & 5 deletions
@@ -26,7 +26,8 @@ Forwarded targets:
 - `engine/`: **control evaluation engine and evaluator system** — all evaluation logic, evaluator discovery, and evaluator orchestration lives here (`engine/src/agent_control_engine/`)
 - `server/`: FastAPI server (`server/src/agent_control_server/`)
 - `sdks/python/`: Python SDK — uses engine for evaluation (`sdks/python/src/agent_control/`)
-- `evaluators/`: evaluator implementations (`evaluators/src/agent_control_evaluators/`)
+- `evaluators/builtin/`: builtin evaluator implementations (`evaluators/builtin/src/agent_control_evaluators/`)
+- `evaluators/extra/`: optional evaluator packages (e.g., `evaluators/extra/galileo/`)
 - `ui/`: Nextjs based web app to manage agent controls
 - `examples/`: runnable examples (ruff has relaxed import rules here)
 
@@ -66,13 +67,19 @@ All testing guidance (including “behavior changes require tests”) lives in `
   4) add SDK wrapper in `sdks/python/src/agent_control/`
   5) add tests (server + SDK) and update docs/examples if user-facing
 
-- Add a new evaluator:
-  1) implement evaluator class extending `Evaluator` in `evaluators/src/agent_control_evaluators/`
+- Add a new builtin evaluator:
+  1) implement evaluator class extending `Evaluator` in `evaluators/builtin/src/agent_control_evaluators/`
   2) use `@register_evaluator` decorator (from `agent_control_evaluators`)
-  3) add entry point in `evaluators/pyproject.toml` for auto-discovery
-  4) add tests in the evaluators package
+  3) add entry point in `evaluators/builtin/pyproject.toml` for auto-discovery
+  4) add tests in the evaluators/builtin package
   5) evaluator is automatically available to server and SDK via `discover_evaluators()`
 
+- Add an external evaluator package:
+  1) copy `evaluators/extra/template/` as a starting point
+  2) implement evaluator class extending `Evaluator` from `agent_control_evaluators`
+  3) add entry point using `org.name` format (e.g., `galileo.luna2`)
+  4) package is discovered automatically when installed alongside agent-control
+
 ## Git/PR workflow
 
 - Branch naming: `feature/...`, `fix/...`, `refactor/...`

Makefile

Lines changed: 23 additions & 6 deletions
@@ -1,16 +1,18 @@
-.PHONY: help sync test test-models test-sdk lint lint-fix typecheck check build build-models build-server build-sdk publish publish-models publish-server publish-sdk hooks-install hooks-uninstall prepush
+.PHONY: help sync test test-models test-sdk lint lint-fix typecheck check build build-models build-server build-sdk publish publish-models publish-server publish-sdk hooks-install hooks-uninstall prepush evaluators-test evaluators-lint evaluators-lint-fix evaluators-typecheck evaluators-build
 
 # Workspace package names
 PACK_MODELS := agent-control-models
 PACK_SERVER := agent-control-server
 PACK_SDK := agent-control
 PACK_ENGINE := agent-control-engine
+PACK_EVALUATORS := agent-control-evaluators
 
 # Directories
 MODELS_DIR := models
 SERVER_DIR := server
 SDK_DIR := sdks/python
 ENGINE_DIR := engine
+EVALUATORS_DIR := evaluators/builtin
 
 help:
 	@echo "Agent Control - Makefile commands"
@@ -56,7 +58,7 @@ sync:
 # Test
 # ---------------------------
 
-test: server-test engine-test sdk-test
+test: server-test engine-test sdk-test evaluators-test
 
 # Run tests, lint, and typecheck
 check: test lint typecheck
@@ -65,17 +67,17 @@ check: test lint typecheck
 # Quality
 # ---------------------------
 
-lint: engine-lint
+lint: engine-lint evaluators-lint
 	uv run --package $(PACK_MODELS) ruff check --config pyproject.toml models/src
 	uv run --package $(PACK_SERVER) ruff check --config pyproject.toml server/src
 	uv run --package $(PACK_SDK) ruff check --config pyproject.toml sdks/python/src
 
-lint-fix: engine-lint-fix
+lint-fix: engine-lint-fix evaluators-lint-fix
 	uv run --package $(PACK_MODELS) ruff check --config pyproject.toml --fix models/src
 	uv run --package $(PACK_SERVER) ruff check --config pyproject.toml --fix server/src
 	uv run --package $(PACK_SDK) ruff check --config pyproject.toml --fix sdks/python/src
 
-typecheck: engine-typecheck
+typecheck: engine-typecheck evaluators-typecheck
 	uv run --package $(PACK_MODELS) mypy --config-file pyproject.toml models/src
 	uv run --package $(PACK_SERVER) mypy --config-file pyproject.toml server/src
 	uv run --package $(PACK_SDK) mypy --config-file pyproject.toml sdks/python/src
@@ -84,7 +86,7 @@ typecheck: engine-typecheck
 # Build / Publish
 # ---------------------------
 
-build: build-models build-server build-sdk engine-build
+build: build-models build-server build-sdk engine-build evaluators-build
 
 build-models:
 	cd $(MODELS_DIR) && uv build
@@ -130,6 +132,21 @@ engine-%:
 sdk-%:
 	$(MAKE) -C $(SDK_DIR) $(patsubst sdk-%,%,$@)
 
+evaluators-test:
+	$(MAKE) -C $(EVALUATORS_DIR) test
+
+evaluators-lint:
+	$(MAKE) -C $(EVALUATORS_DIR) lint
+
+evaluators-lint-fix:
+	$(MAKE) -C $(EVALUATORS_DIR) lint-fix
+
+evaluators-typecheck:
+	$(MAKE) -C $(EVALUATORS_DIR) typecheck
+
+evaluators-build:
+	$(MAKE) -C $(EVALUATORS_DIR) build
+
 .PHONY: server-%
 server-%:
 	$(MAKE) -C $(SERVER_DIR) $(patsubst server-%,%,$@)

evaluators/README.md

Lines changed: 0 additions & 23 deletions
This file was deleted.

evaluators/builtin/Makefile

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+.PHONY: help sync test lint lint-fix typecheck build publish
+
+PACKAGE := agent-control-evaluators
+
+help:
+	@echo "Agent Control Evaluators - Makefile commands"
+	@echo ""
+	@echo " make test - run pytest"
+	@echo " make lint - run ruff check"
+	@echo " make lint-fix - run ruff check --fix"
+	@echo " make typecheck - run mypy"
+	@echo " make build - build package"
+
+sync:
+	uv sync
+
+test:
+	uv run pytest --cov=src --cov-report=xml:../../coverage-evaluators.xml -q
+
+lint:
+	uv run ruff check --config ../../pyproject.toml src/
+
+lint-fix:
+	uv run ruff check --config ../../pyproject.toml --fix src/
+
+typecheck:
+	uv run mypy --config-file ../../pyproject.toml src/
+
+build:
+	uv build
+
+publish:
+	uv publish

evaluators/builtin/README.md

Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+# Agent Control Evaluators
+
+Built-in evaluators for agent-control.
+
+## Installation
+
+```bash
+pip install agent-control-evaluators
+```
+
+## Available Evaluators
+
+| Name | Description |
+|------|-------------|
+| `regex` | Regular expression pattern matching |
+| `list` | List-based value matching (allow/deny) |
+| `json` | JSON validation (schema, required fields, types) |
+| `sql` | SQL query validation |
+
+## Usage
+
+Evaluators are automatically discovered via Python entry points:
+
+```python
+from agent_control_evaluators import discover_evaluators, list_evaluators
+
+# Load all available evaluators
+discover_evaluators()
+
+# See what's available
+print(list_evaluators())
+# {'regex': <class 'RegexEvaluator'>, 'list': ..., 'json': ..., 'sql': ...}
+```
+
+## External Evaluators
+
+Additional evaluators are available via separate packages:
+
+- `agent-control-evaluator-galileo` - Galileo Luna2 evaluator
+
+Install convenience extras:
+```bash
+pip install agent-control-evaluators[galileo]
+```
+
+## Creating Custom Evaluators
+
+See [AGENTS.md](../../AGENTS.md) for guidance on creating new evaluators.
Lines changed: 4 additions & 5 deletions
@@ -1,7 +1,7 @@
 [project]
 name = "agent-control-evaluators"
-version = "2.1.0"
-description = "Evaluator implementations for agent-control"
+version = "3.0.0"
+description = "Builtin evaluators for agent-control"
 readme = "README.md"
 requires-python = ">=3.12"
 license = { text = "Apache-2.0" }
@@ -15,16 +15,15 @@ dependencies = [
 ]
 
 [project.optional-dependencies]
-luna2 = ["httpx>=0.24.0"]
-all = ["httpx>=0.24.0"]
+# NOTE: galileo extra commented out during local dev - package not yet on PyPI
+# galileo = ["agent-control-evaluator-galileo>=3.0.0"]
 dev = ["pytest>=8.0.0", "pytest-asyncio>=0.23.0"]
 
 [project.entry-points."agent_control.evaluators"]
 regex = "agent_control_evaluators.regex:RegexEvaluator"
 list = "agent_control_evaluators.list:ListEvaluator"
 json = "agent_control_evaluators.json:JSONEvaluator"
 sql = "agent_control_evaluators.sql:SQLEvaluator"
-"galileo/luna2" = "agent_control_evaluators.galileo_luna2:Luna2Evaluator"
 
 [build-system]
 requires = ["hatchling"]

evaluators/src/agent_control_evaluators/__init__.py renamed to evaluators/builtin/src/agent_control_evaluators/__init__.py

Lines changed: 4 additions & 7 deletions
@@ -1,6 +1,6 @@
 """Agent Control Evaluators.
 
-This package contains evaluator implementations for agent-control.
+This package contains builtin evaluator implementations for agent-control.
 Built-in evaluators (regex, list, json, sql) are registered automatically on import.
 
 Available evaluators:
@@ -10,15 +10,12 @@
 - json: JSON validation
 - sql: SQL query validation
 
-External (provider/name format):
-- galileo/luna2: Galileo Luna-2 runtime protection
-  (pip install agent-control-evaluators[luna2])
-
 Naming convention:
 - Built-in: "regex", "list", "json", "sql"
-- External: "provider/name" (e.g., "galileo/luna2")
+- External: "provider.name" (e.g., "galileo.luna2")
 - Agent-scoped: "agent:name" (custom code deployed with agent)
 
+External evaluators are installed via separate packages (e.g., agent-control-evaluator-galileo).
 Custom evaluators are Evaluator classes deployed with the engine.
 Their schemas are registered via initAgent for validation purposes.
 """
@@ -45,7 +42,7 @@
 from agent_control_evaluators.regex import RegexEvaluator, RegexEvaluatorConfig
 from agent_control_evaluators.sql import SQLEvaluator, SQLEvaluatorConfig
 
-__version__ = "0.1.0"
+__version__ = "3.0.0"
 
 __all__ = [
     # Core infrastructure
File renamed without changes.

evaluators/src/agent_control_evaluators/_discovery.py renamed to evaluators/builtin/src/agent_control_evaluators/_discovery.py

File renamed without changes.
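
The updated package docstring distinguishes three name shapes: plain names for built-ins, `provider.name` for external packages, and `agent:name` for agent-scoped evaluators. The sketch below classifies a name by those shapes; `classify_evaluator_name` is a hypothetical helper written for illustration, and the project may validate names differently.

```python
def classify_evaluator_name(name: str) -> str:
    """Classify an evaluator name by the convention in the package docstring.

    Hypothetical helper; not part of agent_control_evaluators.
    """
    # Agent-scoped names use a colon: "agent:name".
    if ":" in name:
        return "agent-scoped"
    # External names use dot notation after this commit: "galileo.luna2".
    if "." in name:
        return "external"
    # Everything else is treated as a built-in name: "regex", "list", ...
    return "builtin"


if __name__ == "__main__":
    for name in ("regex", "galileo.luna2", "agent:my_check"):
        print(name, "->", classify_evaluator_name(name))
```

The colon check comes first so that an agent-scoped name containing a dot is still treated as agent-scoped.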
