DEMO: intentional skill regression for eval pipeline validation (DO NOT MERGE) by saurabhrb · Pull Request #61 · microsoft/Dataverse-skills

saurabhrb · 2026-05-16T01:24:59Z

Purpose

Demo branch for validating the eval pipeline catches skill regressions. DO NOT MERGE.

Recreated from a fresh branch off main (replaces closed PR #58) now that the pipeline default branch is main.

What's regressed

dv-data/SKILL.md replaces CreateMultiple bulk-create guidance with a per-record loop antipattern:

Bulk create via list-arg replaced with per-record for loop
CreateMultiple references removed
Chunking/adaptive helpers removed
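
To make the difference concrete, here is a minimal sketch of the two patterns. `FakeClient` and its `create` method are hypothetical stand-ins, not the real Dataverse SDK. They only count round trips, on the assumption that a list argument maps to a single bulk request while a loop issues one request per record.

```python
from __future__ import annotations

from typing import Any


class FakeClient:
    """Hypothetical stand-in for a Dataverse client; it only counts round trips."""

    def __init__(self) -> None:
        self.requests = 0

    def create(self, table: str, data: dict[str, Any] | list[dict[str, Any]]) -> list[str]:
        # Assumption: one call is one round trip, whether it carries one record or many.
        self.requests += 1
        records = data if isinstance(data, list) else [data]
        return [f"{table}-{i}" for i, _ in enumerate(records)]


records = [{"name": f"Contoso {i}"} for i in range(100)]

# Regressed guidance: per-record loop, one request per record.
loop_client = FakeClient()
for record in records:
    loop_client.create("account", record)

# Original guidance: pass the whole list so the client can issue one bulk request.
bulk_client = FakeClient()
ids = bulk_client.create("account", records)

print(loop_client.requests, bulk_client.requests)  # prints "100 1"
```

The sketch leaves out chunking and adaptive batch sizing, which the original skill also covered.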

How it's used

The ADO pipeline DVSkillsPlugin-Evals-PR (32010) runs against this branch. The data_003_skill_contract test asks the agent to report what the skill teaches, and NOT_CONTAINS: assertions catch the regressed content.

Expected result: 2/3 FAIL (data_003 catches the regression; data_001 and data_002 may still pass due to model prior knowledge).
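
As a rough illustration of how a `NOT_CONTAINS:` assertion could flag regressed content, the sketch below scans an agent response for forbidden phrases. The function name, the assertion strings, the sample responses, and the case-insensitive matching are all assumptions made for illustration. The real eval harness may work differently.

```python
def failed_not_contains(response: str, assertions: list[str]) -> list[str]:
    """Return the forbidden phrases that appear in the response."""
    prefix = "NOT_CONTAINS:"
    haystack = response.lower()
    failures = []
    for assertion in assertions:
        if not assertion.startswith(prefix):
            continue  # other assertion kinds are out of scope for this sketch
        needle = assertion[len(prefix):].strip()
        # Assumption: matching is a case-insensitive substring check.
        if needle and needle.lower() in haystack:
            failures.append(needle)
    return failures


# Hypothetical assertion and agent responses, for illustration only.
assertions = ["NOT_CONTAINS: one record at a time"]
regressed = "The skill teaches creating one record at a time inside a for loop."
healthy = "The skill teaches passing a list of records to a single bulk create call."

print(failed_not_contains(regressed, assertions))  # ['one record at a time']
print(failed_not_contains(healthy, assertions))    # []
```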

DEMO: break bulk-create guidance (regression test target)

b01b346

saurabhrb wants to merge 1 commit into main from users/saurabhrb/evals-bad-skill-demo-v2
