Skip to content

Add agent discovery symlinks for fine-tuning skill#28

Merged
AliciaFrame merged 4 commits into
mainfrom
skills/add-agent-discovery
Apr 21, 2026
Merged

Add agent discovery symlinks for fine-tuning skill#28
AliciaFrame merged 4 commits into
mainfrom
skills/add-agent-discovery

Conversation

@AliciaFrame

Copy link
Copy Markdown
Collaborator

Adds symlinks so coding agents auto-discover the existing fine-tuning skill from their conventional paths:

  • .github/skills/azure-ai-fine-tuning\ → ../../Skills\ (GitHub Copilot)
  • .claude/skills/azure-ai-fine-tuning\ → ../../Skills\ (Claude Code)
  • .agents/skills/azure-ai-fine-tuning\ → ../../Skills\ (Codex/other agents)

This reuses the comprehensive skill from PR #26 (SFT + DPO + RFT with graders/tools, 8 scripts, 11 reference docs, 5 workflows) rather than creating a separate thinner skill.

Related: PR #27 proposes a new SFT-only skill at the same paths. This PR is an alternative that reuses the existing content via symlinks instead of duplicating. See review comments on #27 for details.

AliciaFrame and others added 4 commits April 21, 2026 10:01
Add symlinks so coding agents (Copilot, Claude, Codex) can auto-discover
the fine-tuning skill from their conventional paths:

- .github/skills/azure-ai-fine-tuning -> ../../Skills
- .claude/skills/azure-ai-fine-tuning -> ../../Skills
- .agents/skills/azure-ai-fine-tuning -> ../../Skills

This reuses the existing comprehensive skill (SFT, DPO, RFT with graders
and tools, 8 scripts, 11 reference docs, 5 workflows) rather than creating
a separate thinner skill.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…gent Skills docs

Adopts three patterns from PR #27:

1. HelpOnErrorParser in common.py — prints full --help on invalid args
2. PEP 723 inline script metadata (# /// script) on all scripts —
   enables 'uv run scripts/submit_training.py' with auto-installed deps
3. DefaultAzureCredential fallback in get_clients() — works without
   API key when az CLI or Managed Identity is available

Also adds AI Agent Skills section to README with usage instructions
for Copilot (VS Code + CLI), Claude Code, and Codex agents.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- All 7 scripts now import and use HelpOnErrorParser from common.py
  (previously added the class but scripts still used argparse.ArgumentParser)
- Fix bug in DefaultAzureCredential fallback: when base_url is set but
  api_key is None and credential fails, no longer creates client with
  api_key=None (which would fail silently on first API call)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- convert_dataset.py: Fix DPO generation to include system messages when
  generating non-preferred responses from base model (was generating without
  system prompt but recording it in the DPO input — distributional mismatch)
- All scripts: Add sys.path.insert for common.py import to work when
  scripts are run from outside the scripts/ directory
- Clean up duplicate os/sys imports

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@AliciaFrame AliciaFrame merged commit a5052df into main Apr 21, 2026
2 checks passed
AliciaFrame added a commit that referenced this pull request Apr 21, 2026
README.md:
- Add quickstart + agent-assisted paths to Quick Start section
- Update AI Agent Skills counts (12 scripts, 14 references, 6 workflows)

CHANGELOG.md:
- Add entries for PRs #24, #26, #28, #29, #30
- Restructure into versioned releases (1.0.0, 1.1.0, 2.0.0, Unreleased)

GETTING_STARTED.md:
- Add Option A (AI-Assisted) path before the manual demo path
- Fix repo URL (Azure/fine-tuning -> microsoft-foundry/fine-tuning)
- Renumber steps to accommodate new section

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
AliciaFrame added a commit that referenced this pull request Apr 21, 2026
* Add monitor + calibrate scripts, fix 4 bugs from code review

New scripts:
- monitor_training.py: Poll a running job until completion, stream events
  in real time with status icons for reward steps, errors, and milestones
- calibrate_grader.py: Run base model on training data, score with your
  Python grader, output pass rates at every threshold, recommend optimal
  pass_threshold. Automates the most critical pre-RFT step.

Bug fixes (from code review, missed in PR #29 squash):
- submit_training.py, deploy_model.py: Add missing 'requests' to PEP 723 deps
- submit_training.py: Fix file handle leak on grader file read
- evaluate_model.py: Fix StopIteration crash on malformed test data
- validate/validate_rft.py: Fix broken newline escape detection logic

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add quickstart guide: 5-step path to first fine-tuned model

New workflow: quickstart.md — goes from zero to a deployed fine-tuned
model in 5 steps (credentials, data prep, submit, monitor, deploy+test).
Targets novice users who just want to get started without reading the
full pipeline guide.

SKILL.md updated to list quickstart as the first workflow option.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revise quickstart: synthetic data from prompt, baseline step, comparison

- Step 2: Generate training data from a prompt (no pre-existing data needed)
- Step 3: Baseline the base model before training (see what you're improving)
- Step 6: Side-by-side comparison of base vs fine-tuned on same questions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add cleanup.py: list and delete old files, deployments, pending jobs

Helps users avoid quota exhaustion (max 100 files per resource). Supports:
- --list (all/jobs/deployments/files) for resource inventory
- --delete-files with optional --older-than N days filter
- --cancel-pending to cancel queued jobs
- --dry-run to preview before executing

Includes quota warning when approaching 80/100 file limit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix: JSON parsing for LLM output, align failure rate docs to 25-50%

1. quickstart.md: Use regex to extract JSON from code fences — the
   simple strip() approach fails when LLM adds preamble text before
   the JSON block
2. grader-design.md: Change prose from '30-50%' to '25-50%' to match
   the table and calibrate_grader.py code

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update README, CHANGELOG, and GETTING_STARTED for agent skills

README.md:
- Add quickstart + agent-assisted paths to Quick Start section
- Update AI Agent Skills counts (12 scripts, 14 references, 6 workflows)

CHANGELOG.md:
- Add entries for PRs #24, #26, #28, #29, #30
- Restructure into versioned releases (1.0.0, 1.1.0, 2.0.0, Unreleased)

GETTING_STARTED.md:
- Add Option A (AI-Assisted) path before the manual demo path
- Fix repo URL (Azure/fine-tuning -> microsoft-foundry/fine-tuning)
- Renumber steps to accommodate new section

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update Skills/README.md: auto-discovery, new scripts, uv support, grader guidance

- Quick Start: add auto-discovery path (symlinks), uv as recommended install
- What This Skill Covers: add grader calibration, monitoring, cleanup, token cost
- Directory Structure: add monitor_training, calibrate_grader, cleanup, quickstart,
  grader-design; update descriptions for accuracy
- Guidance Highlights: add grader calibration, Python grader default, token cost

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* CHANGELOG: set Unreleased to v2.1.0 (2026-04-21)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix: SKILL.md missing DPO in submit description, remove unused requests dep from cleanup.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
AliciaFrame added a commit that referenced this pull request Jun 1, 2026
* Add monitor + calibrate scripts, fix 4 bugs from code review

New scripts:
- monitor_training.py: Poll a running job until completion, stream events
  in real time with status icons for reward steps, errors, and milestones
- calibrate_grader.py: Run base model on training data, score with your
  Python grader, output pass rates at every threshold, recommend optimal
  pass_threshold. Automates the most critical pre-RFT step.

Bug fixes (from code review, missed in PR #29 squash):
- submit_training.py, deploy_model.py: Add missing 'requests' to PEP 723 deps
- submit_training.py: Fix file handle leak on grader file read
- evaluate_model.py: Fix StopIteration crash on malformed test data
- validate/validate_rft.py: Fix broken newline escape detection logic

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add quickstart guide: 5-step path to first fine-tuned model

New workflow: quickstart.md — goes from zero to a deployed fine-tuned
model in 5 steps (credentials, data prep, submit, monitor, deploy+test).
Targets novice users who just want to get started without reading the
full pipeline guide.

SKILL.md updated to list quickstart as the first workflow option.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revise quickstart: synthetic data from prompt, baseline step, comparison

- Step 2: Generate training data from a prompt (no pre-existing data needed)
- Step 3: Baseline the base model before training (see what you're improving)
- Step 6: Side-by-side comparison of base vs fine-tuned on same questions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add cleanup.py: list and delete old files, deployments, pending jobs

Helps users avoid quota exhaustion (max 100 files per resource). Supports:
- --list (all/jobs/deployments/files) for resource inventory
- --delete-files with optional --older-than N days filter
- --cancel-pending to cancel queued jobs
- --dry-run to preview before executing

Includes quota warning when approaching 80/100 file limit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix: JSON parsing for LLM output, align failure rate docs to 25-50%

1. quickstart.md: Use regex to extract JSON from code fences — the
   simple strip() approach fails when LLM adds preamble text before
   the JSON block
2. grader-design.md: Change prose from '30-50%' to '25-50%' to match
   the table and calibrate_grader.py code

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update README, CHANGELOG, and GETTING_STARTED for agent skills

README.md:
- Add quickstart + agent-assisted paths to Quick Start section
- Update AI Agent Skills counts (12 scripts, 14 references, 6 workflows)

CHANGELOG.md:
- Add entries for PRs #24, #26, #28, #29, #30
- Restructure into versioned releases (1.0.0, 1.1.0, 2.0.0, Unreleased)

GETTING_STARTED.md:
- Add Option A (AI-Assisted) path before the manual demo path
- Fix repo URL (Azure/fine-tuning -> microsoft-foundry/fine-tuning)
- Renumber steps to accommodate new section

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update Skills/README.md: auto-discovery, new scripts, uv support, grader guidance

- Quick Start: add auto-discovery path (symlinks), uv as recommended install
- What This Skill Covers: add grader calibration, monitoring, cleanup, token cost
- Directory Structure: add monitor_training, calibrate_grader, cleanup, quickstart,
  grader-design; update descriptions for accuracy
- Guidance Highlights: add grader calibration, Python grader default, token cost

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* CHANGELOG: set Unreleased to v2.1.0 (2026-04-21)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix: SKILL.md missing DPO in submit description, remove unused requests dep from cleanup.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant