feat: Support inline eval definitions #43

Draft
Copilot wants to merge 2 commits into main from copilot/define-evals-inline-in-experiment

Copilot AI commented Jun 11, 2026

Experiments can now reference evals defined outside the repository’s generated eval registry. Each inline eval specifies a name, a project-local directory, and optional config and test path overrides; these paths resolve from the CLI working directory.
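
The entry shape described above could be modeled roughly as follows. This is a minimal sketch, not the PR's actual types: the field names `name`, `path`, `config`, `configPath`, and `testPath` come from this description, while the type names `EvalConfig`, `InlineEval`, `EvalEntry` and the helper `isInlineEval` are hypothetical.

```typescript
// Hypothetical type names; the real definitions in the PR may differ.
interface EvalConfig {
  // Prompt metadata, as used in the example experiment.
  prompt?: string
}

interface InlineEval {
  name: string
  // Project-local directory containing the eval.
  path: string
  // Optional overrides.
  config?: EvalConfig
  configPath?: string
  testPath?: string
}

// An entry is either a built-in eval ID or an inline eval object.
type EvalEntry = string | InlineEval

// Type guard: anything that is not a string ID is treated as inline.
function isInlineEval(entry: EvalEntry): entry is InlineEval {
  return typeof entry !== 'string'
}
```

A union with a type guard keeps existing experiments that list plain eval IDs valid while letting new entries carry their own directory and overrides.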

  • Experiment config

    • Allows each entry in the evals array to be either a built-in eval ID or an inline eval object.
    • Adds shared eval config types for prompt metadata.
  • Eval resolution

    • Resolves built-in eval IDs through the existing registry.
    • Resolves inline eval paths relative to process.cwd().
    • Supports inline config, configPath, and testPath.
    • Preserves sandbox spoofing behavior by normalizing inline evals into the same runtime shape as generated evals.
  • Validation + docs

    • Adds focused coverage for built-in lookup, cwd-relative inline paths, and custom config/test paths.
    • Documents inline eval usage.

Example experiment that defines an inline eval:
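export const experiment: ExperimentConfig = {
  name: 'Local project experiment',
  description: 'Run an eval from the current project',
  models: ['gpt-5.5'],
  evals: [
    // Inline eval: defined in the project rather than in the generated registry.
    {
      name: 'local-button-eval',
      // Eval directory, resolved relative to process.cwd().
      path: './evals/button',
      // Inline config (prompt metadata).
      config: {
        prompt: 'Update the local project to use a Primer button',
      },
      // Optional override for the test file path.
      testPath: 'button.eval.test.ts',
    },
  ],
  treatments: [],
}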
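
The resolution rules listed under "Eval resolution" can be sketched as below. This is an illustration under stated assumptions, not the PR's implementation: `resolveEval`, `ResolvedEval`, and the `Map`-based registry are hypothetical names, and the sketch resolves `configPath` and `testPath` against the working directory because the description says the overrides resolve from the CLI working directory. The actual change may resolve them differently, for example against the eval directory.

```typescript
import * as path from 'node:path'

interface EvalConfig {
  prompt?: string
}

interface InlineEval {
  name: string
  path: string
  config?: EvalConfig
  configPath?: string
  testPath?: string
}

type EvalEntry = string | InlineEval

// Hypothetical normalized shape shared by built-in and inline evals.
interface ResolvedEval {
  name: string
  dir: string
  config?: EvalConfig
  configPath?: string
  testPath?: string
}

function resolveEval(
  entry: EvalEntry,
  registry: Map<string, ResolvedEval>,
  cwd: string = process.cwd(),
): ResolvedEval {
  // Built-in eval IDs go through the existing registry.
  if (typeof entry === 'string') {
    const found = registry.get(entry)
    if (!found) throw new Error(`Unknown eval ID: ${entry}`)
    return found
  }
  // Inline evals: resolve every path against the working directory
  // (assumption for the overrides, see the note above).
  const abs = (p?: string) => (p ? path.resolve(cwd, p) : undefined)
  return {
    name: entry.name,
    dir: path.resolve(cwd, entry.path),
    config: entry.config,
    configPath: abs(entry.configPath),
    testPath: abs(entry.testPath),
  }
}
```

Returning one normalized shape for both kinds of entry means later stages do not need to know whether an eval came from the registry or from an inline definition.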
