issue-to-eval: import benchmark evals for issues #147, #148, #149 (#150)

Open: elong0527 wants to merge 1 commit into main from claude/funny-planck-oVtNE

@elong0527 (Collaborator)

Summary

Ran the issue-to-eval skill against the live GitHub issues labeled benchmark (35 total). Three new evaluation files were produced; every other issue parses identically to what is already on disk.

Issues imported (new)

| Issue | Skill | Title |
| --- | --- | --- |
| #147 | admiral-bds | ADVS derivation from pharmaversesdtm — BDS Findings (vital signs) |
| #148 | admiral-adae | ADAE derivation from pharmaversesdtm — safety analysis |
| #149 | admiral-bds | ADLB derivation from pharmaversesdtm — BDS Findings (laboratory values) |

Skipped / no change

Notes

  • The gh CLI is unavailable in this environment, so the sync was run by feeding GitHub MCP issue bodies through _automation/issue-to-eval/scripts/import_issue_eval.parse_issue_markdown + save_to_evals directly. The MCP returns HTML-encoded text (e.g. entity-encoded ', ", and <), so bodies are normalized via html.unescape before parsing to keep the on-disk evals byte-identical with prior gh-based runs.
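
The normalization step in the note above can be sketched as follows. This is a minimal illustration, assuming each issue body arrives as a single string; `normalize_issue_body` is a hypothetical helper name, not a function from the repository, and the sample string is invented. Only `html.unescape` (Python standard library) comes from the note itself.

```python
import html


def normalize_issue_body(body: str) -> str:
    """Decode HTML entities so the text matches the raw issue Markdown.

    Hypothetical helper: the actual import script may structure this differently.
    """
    return html.unescape(body)


# Invented sample containing entity-encoded <, ", and ' characters.
encoded = "x &lt;- c(&quot;SYSBP&quot;, &#39;DIABP&#39;)"
decoded = normalize_issue_body(encoded)
print(decoded)  # x <- c("SYSBP", 'DIABP')
```

`html.unescape` decodes one level of encoding only, so this sketch assumes the MCP encodes each body exactly once.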

Test plan

  • Re-run python3 _automation/issue-to-eval/scripts/sync_benchmarks.py after this PR merges and confirm every parseable issue reports Skipped (up to date).
  • _automation/evals/github-issue-147.json carries target_skills: ["admiral-bds"], language: "R", and the SYSBP/DIABP/PULSE/WEIGHT/HEIGHT/BMI parameter mapping prompt.
  • _automation/evals/github-issue-148.json carries target_skills: ["admiral-adae"], language: "R", and the 30-day TRTEMFL window prompt.
  • _automation/evals/github-issue-149.json carries target_skills: ["admiral-bds"], language: "R", and the LB-domain ADLB derivation prompt with spec-driven PARAMCD/PARAM lookup.
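
The per-file checks above could be scripted roughly as below. This is a sketch under stated assumptions: it assumes `target_skills` and `language` are top-level keys in each JSON file, and the helper names `check_eval` and `check_evals_dir` are hypothetical. It does not verify the prompt text, which still needs a manual read.

```python
import json
from pathlib import Path

# Expected target_skills per file, copied from the test plan.
EXPECTED = {
    "github-issue-147.json": ["admiral-bds"],
    "github-issue-148.json": ["admiral-adae"],
    "github-issue-149.json": ["admiral-bds"],
}


def check_eval(data: dict, skills: list, language: str = "R") -> list:
    """Return a list of mismatch messages; an empty list means the eval passes."""
    problems = []
    # Assumption: both keys sit at the top level of the eval JSON.
    if data.get("target_skills") != skills:
        problems.append(
            f"target_skills is {data.get('target_skills')!r}, expected {skills!r}"
        )
    if data.get("language") != language:
        problems.append(
            f"language is {data.get('language')!r}, expected {language!r}"
        )
    return problems


def check_evals_dir(evals_dir: str = "_automation/evals") -> dict:
    """Run check_eval on each expected file; missing files are reported."""
    results = {}
    for name, skills in EXPECTED.items():
        path = Path(evals_dir) / name
        if not path.exists():
            results[name] = ["file not found"]
            continue
        data = json.loads(path.read_text(encoding="utf-8"))
        results[name] = check_eval(data, skills)
    return results


if __name__ == "__main__":
    for name, problems in check_evals_dir().items():
        print(name, "OK" if not problems else problems)
```

Run from the repository root, the script prints one line per file: `OK`, or the list of mismatches.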

Generated by Claude Code

Synced 3 new benchmark issues into _automation/evals/:
- #147: admiral-bds ADVS derivation from pharmaversesdtm (R)
- #148: admiral-adae ADAE derivation from pharmaversesdtm (R)
- #149: admiral-bds ADLB derivation from pharmaversesdtm (R)

All 32 other benchmark issues already up to date; #108 skipped
(unfilled template).