feat: add evals for fork-specific features (v0.1.2) by kanfil · Pull Request #79 · tikalk/agentic-sdlc-spec-kit

kanfil · 2026-03-14T20:58:30Z

Summary

Add test coverage for agentic-sdlc preset functionality that was missing from the upstream evals framework.

Changes

New Tests (7 total)

Mission Brief suite (4 tests): Tests the new Mission Brief enforcement in /adlc.spec.specify
- Completeness: Goal, Success Criteria, Constraints, Demo Sentence present
- Quality: Goal is concise, criteria are measurable, demo is observable
- Constraint extraction: technical, business, regulatory constraints captured
- Approval flow: "Proceed with this Mission Brief?" prompt present

Fork spec sections (3 tests): Tests fork-specific spec template sections
- Goal, Demo Sentence, Boundary Map present
- Boundary Map has Produces/Consumes structure
- Constraints are extracted and documented
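
As an illustration of the "Boundary Map has Produces/Consumes structure" check, here is a minimal sketch. It is not the code in evals/graders/custom_graders.py; it assumes the spec is markdown with a "Boundary Map" heading, and the helper names `boundary_map_labels` and `has_boundary_map_structure` are hypothetical.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def boundary_map_labels(spec_text: str) -> set:
    """Collect which of 'produces'/'consumes' appear inside the Boundary Map section.

    Assumption: the section starts at a markdown heading titled "Boundary Map"
    and ends at the next heading of the same or a higher level.
    """
    in_section = False
    section_level = 0
    found = set()
    for line in spec_text.splitlines():
        stripped = line.strip()
        heading = HEADING.match(stripped)
        if heading:
            level = len(heading.group(1))
            title = heading.group(2).strip().lower()
            if title == "boundary map":
                in_section, section_level = True, level
                continue
            if in_section and level <= section_level:
                break  # left the Boundary Map section
        if in_section:
            for label in ("produces", "consumes"):
                if re.search(rf"\b{label}\b", stripped, re.IGNORECASE):
                    found.add(label)
    return found


def has_boundary_map_structure(spec_text: str) -> bool:
    """True only when both Produces and Consumes appear in the section."""
    return boundary_map_labels(spec_text) == {"produces", "consumes"}
```

Subheadings such as "### Produces" stay inside the section because they are deeper than the "Boundary Map" heading, while a label that only appears in a later section is ignored.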

New Graders

check_mission_brief_completeness() - validates Mission Brief has all required elements
check_mission_brief_quality() - validates quality of each Mission Brief element
check_fork_spec_sections() - validates fork-specific spec sections
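
For readers unfamiliar with this style of grader, a completeness check along these lines might look like the sketch below. This is an assumption-laden illustration, not the body of check_mission_brief_completeness(): the function name `check_completeness_sketch`, the `(output, context)` signature, and the `pass`/`score`/`reason` result shape are assumed here, and the required elements are taken from the Completeness test listed above.

```python
import re
from typing import Any, Dict, Optional

# Elements named in the Completeness test description.
REQUIRED_ELEMENTS = ("Goal", "Success Criteria", "Constraints", "Demo Sentence")


def check_completeness_sketch(
    output: str, context: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Hypothetical grader: report which Mission Brief elements are missing.

    Presence is tested by a case-insensitive whole-phrase match, which is a
    simplification; a real grader may inspect headings or structure instead.
    """
    missing = [
        name
        for name in REQUIRED_ELEMENTS
        if not re.search(rf"\b{re.escape(name)}\b", output, flags=re.IGNORECASE)
    ]
    # Score is the fraction of required elements that were found.
    score = 1 - len(missing) / len(REQUIRED_ELEMENTS)
    reason = "all elements present" if not missing else "missing: " + ", ".join(missing)
    return {"pass": not missing, "score": score, "reason": reason}
```

Returning a partial score alongside the pass flag makes it easier to see how far an output is from complete, rather than only that it failed.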

Fixes

check_extension_manifest() now accepts both speckit.* and adlc.* command patterns
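
The PR does not show the updated regex, so the following is only a sketch of how a pattern could accept both prefixes. The exact character set for command segments and the names `COMMAND_PATTERN` and `is_valid_command_name` are assumptions made for illustration.

```python
import re

# Assumed shape: a "speckit" or "adlc" prefix followed by one or more
# dot-separated lowercase segments, e.g. "adlc.spec.specify".
COMMAND_PATTERN = re.compile(r"(?:speckit|adlc)(?:\.[a-z0-9-]+)+")


def is_valid_command_name(name: str) -> bool:
    """Return True when the whole name matches the assumed command pattern."""
    return COMMAND_PATTERN.fullmatch(name) is not None
```

Using `fullmatch` rather than `search` keeps names with an unexpected prefix or a trailing dot from passing by accident.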

Files Changed

evals/prompts/spec-prompt.txt - added fork sections (Goal, Demo Sentence, Boundary Map)
evals/prompts/mission-brief-prompt.txt - new prompt for Mission Brief tests
evals/configs/promptfooconfig-mission-brief.js - new config for Mission Brief suite
evals/configs/promptfooconfig.js - added 4 Mission Brief tests
evals/configs/promptfooconfig-spec.js - added 3 fork section tests
evals/graders/custom_graders.py - added 3 new graders, fixed command pattern regex
evals/README.md - updated test counts and documentation
pyproject.toml - version bump to 0.1.2
CHANGELOG.md - added v0.1.2 entry

Test Results

All 356 pytest tests pass
Test counts: 29 LLM tests + 39 unit tests across 7 suites (previously 22 LLM tests + 39 unit tests across 6 suites)

Add test coverage for agentic-sdlc preset functionality:
- Mission Brief test suite: 4 tests for completeness, quality, constraint extraction, approval flow
- Fork spec section tests: 3 tests for Goal, Demo Sentence, Boundary Map, Constraints
- New graders: check_mission_brief_completeness, check_mission_brief_quality, check_fork_spec_sections
- spec-prompt.txt updated with fork-specific sections
- Command pattern grader now accepts both speckit.* and adlc.*

Evals: 29 LLM tests + 39 unit tests across 7 suites (was 22+39 / 6)

kfinkels · 2026-03-16T07:32:44Z

LGTM, but you have conflicts that need to be addressed

kanfil · 2026-03-16T09:30:03Z

@kfinkels merge it, but we do need to do src evals as well

Merge Strategy:
- Reset to pre-merge commit 2f0852a (last clean tikalk state)
- Re-applied PR #79 evals-refactor changes
- Merged upstream/main with careful conflict resolution

Tikalk-specific code preserved:
- Config management (get_global_config_path, load_config, save_config, etc.)
- Architecture config (get_architecture_diagram_format, get_adr_heuristic, etc.)
- Skills config (get_skills_config, set_skills_config)
- Skill subcommand app with all skill commands (search, install, update, etc.)
- show_skills_banner function
- Orange theme colors (ACCENT_COLOR, BANNER_COLORS)
- Agentic SDLC branding (TAGLINE, show_banner extensions display)
- _run_git_command and sync_team_ai_directives functions
- install_bundled_extensions and install_bundled_presets
- _ensure_commands_for_agent function
- _validate_ai_assistant and _validate_ai_commands_dir callbacks

Upstream features merged:
- New agents: kimi, trae, pi, bob, vibe, tabnine, etc.
- Updated preset system with new PresetCatalog features
- Test improvements and new test files
- Extension system enhancements
- Various bug fixes and improvements

Conflict Resolution:
- Kept tikalk versions: README, pyproject.toml, docs, bash scripts
- Kept upstream + tikalk additions: src/specify_cli/__init__.py
- Kept upstream versions: agents.py, presets.py, tests

All 444 tests pass.

kanfil assigned kfinkels Mar 14, 2026

Merge main to resolve conflicts in PR #79

19e6a2a

kanfil merged commit 3959c83 into main Mar 17, 2026
8 checks passed

kanfil merged 2 commits into main from evals-refactor-v0.1.2
