Skip to content

Commit 9ceca2e

Browse files
committed
ci: fix codex review findings - test paths, provider check, step gating
Fix issues found by Codex review: - Fix test paths: tests/ does not exist at repo root, use packages/*/tests/ and packages/data-designer/tests/test_import_perf.py - Remove DataDesigner(model_providers=[]) from smoke checks - raises NoModelProvidersError; keep config-layer checks only - Fix audit step gating: remove continue-on-error, use step outcome to gate runner memory update (|| true + continue-on-error made the step always "succeed", defeating the success() condition)
1 parent 30a6de2 commit 9ceca2e

3 files changed

Lines changed: 21 additions & 22 deletions

File tree

.agents/recipes/structure/recipe.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -93,7 +93,7 @@ Exclude:
9393
- Files that are themselves optional/heavy modules
9494

9595
This directly affects startup time, which has a 3-second budget tested by
96-
`tests/test_import_perf.py`.
96+
`packages/data-designer/tests/test_import_perf.py`.
9797

9898
### 3. Future annotations compliance
9999

.agents/recipes/test-health/recipe.md

Lines changed: 17 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -41,7 +41,7 @@ Map source files to their corresponding test files:
4141
find packages/*/src/ -name '*.py' -not -name '__init__.py' -not -path '*/test*' | sort
4242

4343
# Test files
44-
find tests/ packages/*/tests/ -name 'test_*.py' 2>/dev/null | sort
44+
find packages/*/tests/ -name 'test_*.py' 2>/dev/null | sort
4545
```
4646

4747
For each source module, check if a corresponding test file exists. Flag:
@@ -61,10 +61,10 @@ Scan test files for tests that assert nothing meaningful:
6161
```bash
6262
# Tests that only check "is not None" (often meaningless if the function
6363
# can't return None)
64-
grep -rn "assert .* is not None$" tests/ --include='*.py'
64+
grep -rn "assert .* is not None$" packages/*/tests/ --include='*.py'
6565

6666
# Test functions with no assert statements at all
67-
grep -l "def test_" tests/ --include='*.py' -r | while read f; do
67+
grep -l "def test_" packages/*/tests/ --include='*.py' -r | while read f; do
6868
# Count test functions vs assert statements
6969
TESTS=$(grep -c "def test_" "$f")
7070
ASSERTS=$(grep -c "assert " "$f")
@@ -92,12 +92,12 @@ Skip `tests_e2e/` - e2e tests have different assertion patterns.
9292

9393
### 3. Import performance
9494

95-
The repo has a 3-second import budget tested by `tests/test_import_perf.py`.
96-
Check the current state:
95+
The repo has a 3-second import budget tested by
96+
`packages/data-designer/tests/test_import_perf.py`. Check the current state:
9797

9898
```bash
9999
# Verify the test exists and read its thresholds
100-
cat tests/test_import_perf.py 2>/dev/null || echo "not found"
100+
cat packages/data-designer/tests/test_import_perf.py 2>/dev/null || echo "not found"
101101
```
102102

103103
Also check for heavy imports that bypass the lazy loading system:
@@ -200,15 +200,18 @@ same config every day.
200200

201201
**What to always check:**
202202
1. Config build round-trip: column count and names survive `.build()`
203-
2. Validation: `DataDesigner.validate(builder)` succeeds for valid configs
204-
3. Rejection: invalid inputs raise, not silently produce bad configs
203+
2. Rejection: invalid inputs raise, not silently produce bad configs
204+
205+
**Limitation**: `DataDesigner(model_providers=[])` raises
206+
`NoModelProvidersError`, so you cannot call `DataDesigner.validate()`
207+
without at least one provider configured. Stick to config-layer checks
208+
(`DataDesignerConfigBuilder.build()`, column type resolution) which do
209+
not require providers.
205210

206211
**API reference** for writing checks:
207212

208213
```python
209214
from data_designer.config.config_builder import DataDesignerConfigBuilder
210-
from data_designer.interface.data_designer import DataDesigner
211-
import tempfile
212215

213216
# Build a config - use keyword args: name, column_type, sampler_type, params
214217
builder = (
@@ -223,10 +226,6 @@ config = builder.build()
223226
assert len(config.columns) >= 2
224227
names = {c.name for c in config.columns}
225228
assert 'id' in names and 'cat' in names
226-
227-
# Validate through the full stack (no LLM needed for sampler-only)
228-
dd = DataDesigner(artifact_path=tempfile.mkdtemp(), model_providers=[])
229-
dd.validate(builder)
230229
```
231230

232231
Run at least 2 creative checks per audit. Document what you chose and why
@@ -290,15 +289,15 @@ Write the report to `/tmp/audit-{{suite}}.md`:
290289
| Check | Status | Detail |
291290
|-------|--------|--------|
292291
| Package imports | OK/FAIL | All three packages import cleanly |
293-
| Import timing | OK/FAIL | Xms (budget: 3s) |
292+
| Import timing | OK/FAIL | X.XXs (budget: 3s) |
294293
| Registry completeness | OK/WARN | Column types resolve to config classes |
295294

296295
**Creative checks** (describe what you tested and why):
297296

298-
| Check | Sampler types used | Status | Detail |
299-
|-------|--------------------|--------|--------|
297+
| Check | What was tested | Status | Detail |
298+
|-------|-----------------|--------|--------|
300299
| Config build #1 | e.g. uuid+poisson+datetime | OK/FAIL | ... |
301-
| Validate #1 | ... | OK/FAIL | ... |
300+
| Rejection #1 | e.g. gaussian with negative std | OK/FAIL | ... |
302301
| ... | ... | ... | ... |
303302

304303
### Test isolation

.github/workflows/agentic-ci-daily.yml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -153,6 +153,7 @@ jobs:
153153
fi
154154
155155
- name: Run audit recipe
156+
id: audit
156157
env:
157158
ANTHROPIC_BASE_URL: ${{ secrets.AGENTIC_CI_API_BASE_URL }}
158159
ANTHROPIC_API_KEY: ${{ secrets.AGENTIC_CI_API_KEY }}
@@ -185,11 +186,10 @@ jobs:
185186
--max-turns 30 \
186187
--output-format text \
187188
--verbose \
188-
2>&1 | tee /tmp/claude-audit-log.txt || true
189-
continue-on-error: true
189+
2>&1 | tee /tmp/claude-audit-log.txt
190190
191191
- name: Update runner memory
192-
if: success()
192+
if: steps.audit.outcome == 'success'
193193
env:
194194
SUITE: ${{ matrix.suite }}
195195
run: |

0 commit comments

Comments
 (0)