Adaptive Evals: add EvaluatorGenerationJob LRO, JobSource (Prompt/Dataset/Traces), and RubricsEvaluatorDefinition (#42264) by glecaros · Pull Request #42764 · Azure/azure-rest-api-specs

glecaros · 2026-04-29T21:05:17Z

Re-applying #42264

adapt /feature/foundry with v2 folder structure
move generated openapi3 folder into Foundry data plane folder
git add tspconfig.yaml
introduce src folder
update additionalDirectories per 39560
Update specification/ai-foundry/data-plane/Foundry/client.csharp.tsp

Co-authored-by: Jose Alvarez jpalvarezl@users.noreply.github.com

updates
fixes
more updates
Extracting Namespace definition to importable file under common (#39644)
Trying out namespace extraction to its own file
File rename
update generated specs
Removed api version pinning for Java code gen
[Foundry branch-to-branch] client.tsp updates for new ingestion .NET support (#39696)
.net client.tsp support for new spec structure
tsp format
(minor) comment cleanup
recompile to reflect small non-client.tsp changes
client.tsp workaround for itemresource
port cspell updates with path snap
comprehensive cspell update
removal of legacy 'readonly' folder
introduce interim suppression guidance
Revert "Removed api version pinning for Java code gen"

This reverts commit 7d98e6e.

Make fixes for evaluators and generate code (Evaluator catalog fixes. #39720)
[Foundry] Moving updates from old branch. (#39792)
[Foundry] Moving updates from old branch.
cleanup
cspell
cspell
cspell
cspell
Refreshing spec + cleanup
Minor corrections to tsp-config and forcing emission of IncludeEnum (#39802)
Use "Tool" or "PreviewTool" as a suffix for Azure Tools (updated PR to target latest TypeSpec branch) (#39815)
Update OpenAPI3.0 files
Rename MemorySearchTool to MemorySearchPreviewTool
Updating item type for the memory store search parameters. (#39878)
Fix input item type for Update Memory operation (#39888)
Update allowed types for TextResponseFormatJsonSchema (#39884)
[Agents-V2] Add tool_choice_mode for prompt agents and Resolve created_by field conflicts in ItemResources (#39882)
[Agents-V2] Add tool_choice_mode for prompt agents and Resolve created_by field conflicts in ItemResources
Add versioning to deprecate old fields
PR feedback
review comments
Fix comments
Fix the evaluator creation time type. (#39916)
Fix the evaluator creation time type.
Fix
Making CodeInterpreterTool's container property optional. (#39927)
update (Missing comments, docs. #39932)
Suppressing emission for duplicate types exposed by Stainless SDK (#39906)
Suppressing unnecessary types
Added more suppressions
Removing unnecessary renames
Revert "Removing unnecessary renames"

This reverts commit fab93d6.

Rename OpenAI namespace to fix xref in the generated code. (#39938)
[Foundry branch-to-branch] 'v1' versioning w/preview designations (#39854)
initial and incomplete: draft 'v1' versioning updates
adjust openai-based operation pattern
minor: adjust suppressions and superficial folder structure for Agents+OpenAI ops
refresh with most (not yet all) removed(v1) excised
include finetuning, insights in v1
remove in-situ _preview qualifiers from agents and introduce OAS extension-based tracking
supply deprecated pre-GA memorysearchtool copy
revert overapplied blanket preview treatment of openai-evals
restore /evaluations to PuPr pseudo-version; make openai-eval traces source preview
Adjusting broken import from responses and conversation folder rename
use Record for OpenAPI tool 'spec'
Update OpenAPI files
Replace 'unknown' with 'Record<unknown>' (#39992)
Rename folder /src/finetuning to /src/openai-finetuning (#40041)

And update its route to /openai/v1/fine_tuning (added /v1 and fixed to use underscore instead of dash).

Now all OpenAI routes are defined by TypeSpec files in folders that start with "openai-": openai-conversations openai-evaluations openai-finetuning openai-responses

Add Synthetic data generation eval models for public preview (#40003)
Add Synthetic data generation eval models for public preview
comment
fix
fix: lingering and mismatched memory_search_preview_call
remove conditional memory stores header use from agents operations
remove agent preview header from delete ops
remove duplicate discriminator kind on parent AgentDefinition (#40063)
remove duplicate discriminator
update openapi
Added missing @list for paged op (#40057)
Added missing @list for paged op
Removing OpenAI.ToolChoiceParam from code gen tree
Forcing emission of missed enum
Removed duplicate line
Removing rename for inexistent class
Forcing emission of FoundryPreviewOptInKeys for Java
Cleanup client.tsp used by Python and JS emitters (#40090)

There is no functional change associated with this PR, just cleanup.

Remove anything related to C# or Java, as those emitters do not use this client.tsp file.

Also move some lines around for better grouping under "sub-client" section.

Tested by re-emitting the Python SDK and seeing that there are no changes.

Schedule parameter rename (#40113)
Restore input variable name "schedule" (instead of "resource") in emitted Python/JS Schedules operations (#40094)
add operation for compact
post-merge spec compile
[Foundry branch-to-branch] Add a service contract library target to facilitate ingestion (#40065)
service contract target for lib generation
refactor imports to facilitate compatible view ingestion
revert branch-local package.json changes (development only)
.net openai: use client emitter view
.net client sdk only: adjust naming for types
sequence_number in parent ResponseStreamEvent
trial: required sequence_number on parent for easier emission
access/usage for '1' suffixes
updates for customtoolcalloutput to appear
post-merge spec compile
Updates to opt-in headers. Extend OpenAI.OutputItem for Azure defined Responses models. (#40114)
Make sure opt-in flag values can be any string, for extensibility purposes (while still documenting the current supported value) (#40175)
Do not emit streamAgentContainerLogs operation for Python and JS
Cleanup in client.tsp
Build fix + namespace + object type enum. (#40184)
Adding object type union + fixing compilation error.
updating target namespace.
rest of the unions.
More cleanup in Python/JS client.tsp. Do not emit AgentReference
Restoring original way of defining opt-in flags (#40214)
[Agents-V2] Minor updates for service contracts
Fix typo
Update client.tsp to not include main.tsp
Remove evaluations from client.tsp
Minor update to client.tsp
Create Beta sub-clients for emitted Python and JS SDKs (#40254)
service contract: make conversation message id, status optional for reuse
Fix typo and minor cleanup in client.tsp
[Agents-V2] Minor updates to make models more extensible (#40269)
Make Id in OutputItem mandatory
Make it optional
Make ID mandatory again
Make Id optional again
Move Container Agent Operations to a separate routes.tsp file, excluded from client*.tsp used by SDK emitters. (#40259)
sdk-projects-openai-only: client.tsp refresh (+format folder)
Save copy of Foundry Features flag summary on preview operations
[Agents-V2] Update OutputItemRemoteToolCall model to include the full payload (#40281)
Fix OutputItemRemoteToolCall schema
Add OutputItemRemoteToolCallOutput
Fix Remote tool output item type
Fix typo
Remove args from remote tool call output
Remove language specification from client names (#40264)
Remove Python client name mappings for containers (#40298)

Removed client name mapping for AgentContainerObject and AgentContainerOperationObject in Python.

Fix opt-in preview flags (Part 1) (#40308)
no functional change: tsp format, re-emit .json/.yaml
Fix RemoteToolArgument schema
update OpenAPI3 files
Fix opt-in preview flags (Part 2) (#40315)
[Agents-V2] Fix RemoteToolArgument schema
Allow renaming an SDK property on the Schedules createOrUpdate operations
Use alias instead of Model for Schedules createOrUpdate parameters, so it does not get emitted as a class
Fix the schedules name (#40097)
Fix the schedules name
Fix
Fix renamings
Rollback

Co-authored-by: Nikolay Rovinskiy nirovins@microsoft.com

Use "word1-word2" format everywhere in folder names (and one file name) (#40379)
Remove the experimental header from evaluation rules. (#40383)

Co-authored-by: Nikolay Rovinskiy nirovins@microsoft.com

Make HumanEvaluationRuleAction -> HumanEvaluationPreviewRuleAction (#40413)
Make HumanEvaluationRuleAction -> HumanEvaluationPreviewRuleAction
Add optional headers
Add csv file source. (#40460)
Update OpenAPI3 files, following npx tsp compile .
Remove "humanEvaluation" as type discriminator from v1. Python/JS emitters use more unique name for OpenAI "Error" class. (#40495)
Update "Foundry-Features" HTTP request header report (#40502)
Additional updates to foundry-features-flag-summary.md
Removed OpenAI prefix for Evals (#40484)
Opt-in flag only required for createOrUpdate in Evaluation Rules (#40513)
error rename (#40531)
Do not emit Agent Create/Update operations for JS and Python (#40539)
Fix Python emitted enum values (#40576)
Temporary local fix for "type" discriminator in OpenAI's WebSearchApproximateLocation (#40594)
Add tool call item models for grounding tools (Foundry v2) (#40491)
Add tool call item models for grounding tools

Adds the following tool call item models (adapted to v2 structure):

GroundingToolCallDocument - shared document model
BingGroundingToolCallItemParam/Resource - Bing grounding
SharepointGroundingToolCallItemParam/Resource - SharePoint grounding
AzureAISearchToolCallItemParam/Resource - Azure AI Search
BingCustomSearchToolCallItemParam/Resource - Bing custom search
OpenApiToolCallItemParam/Resource - OpenAPI
BrowserAutomationToolCallItemParam/Resource - browser automation
FabricDataAgentToolCallItemParam/Resource - Fabric data agent
AzureFunctionToolCallItemParam/Resource - Azure Function

Generate OpenAPI specs with new tool call item models
Address PR feedback: Add types to _AgentItemType enum and create ToolCallStatus union

Added new tool call type discriminators to _AgentItemType union in openai-responses/models.tsp
Created ToolCallStatus named union to avoid duplicate inline union definitions
Regenerated OpenAPI specs

Update ToolCallStatus doc string to be generic

Updated doc to 'The status of a tool call.' since it's used by multiple tools.

Note: Type discriminator properties must use string literals per TypeSpec requirements for discriminated model inheritance. The new types are registered in _AgentItemType enum (openai-responses/models.tsp) which extends OpenAI.ItemType/OutputItemType via @@copyVariants.

Use enum values for tool call type discriminators
Revert to OpenAI pattern: extend OpenAI.Item/OutputItem with string literals
Inline OutputBase aliases that were used only once
Rework tool call models: split into _call and _call_output types with call_id, arguments, and generic output
Add A2A tool call models and name property to OpenAPI/AzureFunction/A2A models
remove agent containers, reformat and recompile ([Foundry branch-to-branch] Remove 'Agent Containers' from v1 specification #40617)
Java arch board review feedback (#40481)
Renaming OpenAI.Error to avoid conflict in codegen
Projects renames
Field renames
Trying alternate types
Hiding custom day of week
Hiding custom day of week
Alternate type not working, need to investigate further
utcDateTime type overrides
Added comment with suppressions we might need for agents sdk
Made singular nouns out of some names
Renamed enum variant
Updated function param name
Rename agent request models to avoid Java codegen '1' suffix collision
More feedback
Fixed bad tsp
Update OpenAPI3 files after running npx tsp compile .
Run npx tsp format **/*tsp
tactical patch for MCPToolCall.error ([Foundry branch-to-branch] patch MCPToolCall.error (observed spec inconsistency) #40655)
More renames + Foundry Feature keys (#40641)
More renames
Using area values for foundry feature opt-in keys
Deduping header values
after merge compile

Co-authored-by: Gerardo Lecaros 10088504+glecaros@users.noreply.github.com

Remove Python & JS SDK dependency on OpenAI.InputItem (#40648)
Update openapi3 files by running npx tsp compile .
Remove additional Container Agent assets from v1 (#40685)
Update OpenAPI3 files after running 'npx tsp compile .' with latest packages ('npm install')
Make Memory Stores "search_memories" method internal for Python (#40741)
Update to latest OpenAI TypeSpec package (1.11.0) (#40745)
Separating AgentDefinitionFeatureKeys from FoundryFeaturesOptInKeys. (#40765)
Foundry Eval Benchmark (#40012)
Add Azure AI benchmark models and data source config
Rename AzureAIBenchmarkEvalRunDataSource to AzureAIBenchmarkPreviewEvalRunDataSource
Change scenario from 'benchmark' to 'benchmark_preview'
Add input messages configuration to AzureAIBenchmarkPreviewEvalRunDataSource

Added input messages configuration to AzureAIBenchmarkPreviewEvalRunDataSource.

Refactor AzureAIBenchmarkDataSourceConfig model
Remove AzureAIBenchmarkDataSourceConfig from models
Add 'benchmark_preview' to evaluation scenarios
Add Azure AI Benchmark Data Source Config

Added AzureAIBenchmarkDataSourceConfig to the evaluation models.

Refactor BenchmarkMetadata into AzureAIBenchmarkDataSourceConfig
Add grader_model field to benchmark specification

Added optional grader model field for benchmarks using model graders.

[Draft] Agent Invocations API Specification (#40709)
Initial commit that imports Lakshmi's invoke api spec
Updates
Fine tune the spec
remove open ai spec changes
Adding RAPI<>Invoke mapping examples
flush pending updates
agent-invocations
fix
Add get/cancel apis:
Finetune
fixes post merge
Fix compile error (npx tsp compile .). Format files (npx tsp format **/*tsp)
Exclude Invocations API from GA version (#40857)
Address arch review board comments (humans and Azure SDK bot) (#40844)
Fix typo in name of newly added union AgentDefintionOptInKeys (#40870)
Add "allow_preview" to Python client initialization list (#40925)
feat: Add Hosted Agents ADC integration API spec changes (#40739)
feat: Add Hosted Agents ADC integration API spec changes

Add status (AgentVersionStatus) and error (AgentVersionError) fields to AgentVersionObject for ADC snapshot provisioning visibility
Add force query parameter to deleteAgentVersion for safe version deletion when active sessions exist
Add foundry_session_id to CreateResponse request and Response model for ADC sandbox affinity and session-scoped operations

All new fields are gated behind hosted_agents_v1_preview feature key.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Added deleted state
Renamed to agent_session_id
Addressed review comments
Nit doc updates
Hide preview-only fields from GA SDK with @removed(Versions.v1)

Add @removed(Versions.v1) to hosted-agent-specific additions:

status and error on AgentVersionObject
AgentVersionStatus union
agent_session_id on CreateResponse and Response

These fields are only relevant for hosted agents (preview) and should not appear in the GA SDK. Following the pattern from PR #40857.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Co-authored-by: Ankit Sultania asultania@microsoft.com Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

regen openapi3
MemoryStores operation rename/visibility updates and arch board feedback for java (#40910)
MemoryStores operation rename/visibility updates for java
Renamed anonymous field
Using JDK type for DayOfWeek
OpenAI.Error rename, Index rename
More renames
rename
More renames
Moved renames
Fixed renames
Modified right file
feat: Add TelemetryConfig and TelemetryEndpoint models to HostedAgentDefinition (#40784)
Add TelemetryConfig and TelemetryEndpoint models to HostedAgentDefinition

TelemetryConfig: wraps a list of TelemetryEndpoint instances
TelemetryEndpoint: defines kind (required), data, endpoint (required), protocol, and auth
Add telemetry_config optional property to HostedAgentDefinition
Both models gated behind hosted_agents_v1_preview feature key
kind remains a plain string to allow extensibility for future telemetry endpoint types (e.g. AzureMonitor, AppInsights) without an API version change

Add minimal README for Foundry data-plane API src directory
Move README.md to agents folder
feat(agents): add extensible telemetry config for hosted agents

Add TelemetryConfig with discriminated endpoint and auth hierarchies to HostedAgentDefinition for customer-supplied telemetry export.

Design:

TelemetryEndpoint: discriminated by 'kind' (OTLP today, extensible)
TelemetryEndpointAuth: discriminated by 'type' (header today, extensible)
Typed unions for endpoint kind, data kinds, transport protocol, auth type
All unions include string fallback for forward-compatibility

Models added:

TelemetryConfig (endpoints: 1-3 required)
TelemetryEndpoint (base, discriminator: kind)
OtlpTelemetryEndpoint (kind: OTLP, endpoint + protocol required)
TelemetryEndpointAuth (base, discriminator: type)
HeaderTelemetryEndpointAuth (type: header, headerName/secretId/secretKey)
TelemetryEndpointKind, TelemetryDataKind, TelemetryTransportProtocol, TelemetryEndpointAuthType unions

Wire format preserved via @encodedName for auth fields (camelCase). Feature-gated behind hosted_agents_v1_preview.

Breaking change: auth object now requires 'type' discriminator field.
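As a rough client-side illustration of the discriminated hierarchies described above, the following Python sketch parses a telemetry config by its `kind` and `type` discriminators. It is not generated SDK code: the class layout, the `parse_*` helper names, and the snake_case keys (`header_name`, `secret_id`, `secret_key`, which a later commit in this log adopts) are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Union


@dataclass
class HeaderTelemetryEndpointAuth:
    # Field names mirror headerName/secretId/secretKey from the commit text,
    # written here in snake_case (an assumption about the wire format).
    header_name: str
    secret_id: str
    secret_key: str
    type: str = "header"


@dataclass
class OtlpTelemetryEndpoint:
    endpoint: str   # required for OTLP
    protocol: str   # required for OTLP
    auth: Optional[Union[HeaderTelemetryEndpointAuth, dict]] = None
    kind: str = "OTLP"


def parse_auth(raw: dict) -> Union[HeaderTelemetryEndpointAuth, dict]:
    """Dispatch on the 'type' discriminator, which the breaking change makes mandatory."""
    if "type" not in raw:
        raise ValueError("auth requires a 'type' discriminator")
    if raw["type"] == "header":
        return HeaderTelemetryEndpointAuth(
            raw["header_name"], raw["secret_id"], raw["secret_key"]
        )
    # Unknown auth types are kept untouched, mirroring the string fallback on the unions.
    return raw


def parse_telemetry_config(raw: dict) -> List[Any]:
    """Dispatch each endpoint on 'kind' and enforce the 1-3 endpoint bound."""
    endpoints = raw.get("endpoints", [])
    if not 1 <= len(endpoints) <= 3:
        raise ValueError("TelemetryConfig requires between 1 and 3 endpoints")
    parsed: List[Any] = []
    for item in endpoints:
        if item.get("kind") == "OTLP":
            auth = parse_auth(item["auth"]) if "auth" in item else None
            parsed.append(OtlpTelemetryEndpoint(item["endpoint"], item["protocol"], auth))
        else:
            # Future endpoint kinds pass through unparsed instead of failing.
            parsed.append(item)
    return parsed
```

Keeping unknown kinds as raw dictionaries is one way to honor the forward-compatibility goal; a stricter client could reject them instead.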

fix: format tsp, fix identifier, regenerate openapi3 (json+yaml)

tsp format applied
Fixed AgentDefinitionFeatureKeys -> AgentDefinitionOptInKeys
Regenerated openapi3 JSON and YAML for v1 and virtual-public-preview

Remove @encodedName decorators from HeaderTelemetryEndpointAuth to use snake_case consistently
Regenerate OpenAPI3 JSON and YAML after snake_case fix for HeaderTelemetryEndpointAuth

Co-authored-by: Vipin Koottayi vkoottayi@microsoft.com

Rename ImageGenActionEnum to ImageGenAction for Python and JS
Apply suggestions from code review

Feedback

Co-authored-by: Johan Stenberg (MSFT) johan.stenberg@microsoft.com

regen openapi3
closed opt in enums. (closed opt in enums. #41017)
Add evaluator upload operations and entry_point for custom code evaluators (#40955)
Add evaluator upload operations and entry_point to CodeBasedEvaluatorDefinition

Add startPendingUpload operation to Evaluators interface for initiating code upload to blob storage
Add getCredentials operation to Evaluators interface for fetching SAS tokens to access evaluator storage
Add entry_point property to CodeBasedEvaluatorDefinition for specifying the main Python file of uploaded evaluator code
Make code_text optional since uploaded evaluators use entry_point instead
Add SDK client name for startPendingUpload in client.tsp

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Add request body with blobUri to evaluator getCredentials operation

Add EvaluatorCredentialRequest model with required blobUri property
Update getCredentials operation to accept the request body

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Add optional image_tag to CodeBasedEvaluatorDefinition

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Use Python native date-time objects (following the example from Java) (#41044)
Split packages (#40662)
Split packages
Split packages
Adopt openai spec 1.11.0
Fix generation
Fix generation
Remove Agent Items
Fix Azure.AI.Extensions.OpenAI
Fix AgentResponseItem
Remove Agents from Azure.AI.Projects
Expose APIError
cleanup
Run tsp compile
Fix after merge

Co-authored-by: Nikolay Rovinskiy nirovins@microsoft.com

removing telemetry_config from v1.
Marking operations as preview-only.
TimeZone JDK type overrides for Foundry libraries (#41103)
Added type overrides for timezone + rearrangements
WebSearchAppLoc same override
Revert addition of telemetry config.
feedback
Revert "Add evaluator upload operations and entry_point for custom code evaluators (#40955)"

This reverts commit 08e55ee.

Suppressing the emission of classes duplicated by openai-java package (#41194)
Suppressing the emission of classes duplicated by openai-java package
removed bad stuff
More external duped types
More external types
Implement suggested renamings. (#41135)
Implement suggested renamings.
Move tools to the Tools namespace
Revert
More cleanup (#41211)
Renamings in the Azure.AI.Projects (#41259)
Add listconversation ops (#41323)
Apply suggestions from code review

Co-authored-by: Johan Stenberg (MSFT) johan.stenberg@microsoft.com

PR feedback for Microsoft Foundry: contemporary .tsp migration with v2 folder structure #39565 (PR feedback for #39565 #41345)
Apply suggestions from code review

Co-authored-by: Johan Stenberg (MSFT) johan.stenberg@microsoft.com

making invocations body unknown.
fix spec for tcgc 0.66.2 (Fix Spec for TCGC Version Bump #41316)
Update deleteAgent to internal type (#41348)
Updated insights to use the same pattern as the rest of the API.
regen openapi3
Implement other renamings. (#41390)
Adding correct decorator for paged op Insights (#41420)
Using @list and library type
Restoring Azure.Core.Page usage
Change the way of subclient initialization (#41439)
Change the way of subclient initialization
Generate standalone client
renaming and relocating Azure specific listConversations op ([Java] renaming and relocating Azure specific listConversations op #41419)
Disabled convenientAPI generation for Agent delete Operations (#41522)
Disabled convenientAPI generation for Agent delete Operations
Added clarifying comment
Azure specific for ResponseCreate ops clustered for SDK convenience (#41471)
Trying something out
Adjusting option bag visibility
The azure option bag contains only current fields
Same model for response
Renamed model
regen openapi
Renamings inside agents package. (#41508)
Rename intermediate model and added docs (only visible at client library level) (#41585)
Class Renames (#41614)
suppress object type
fix naming
approach as rename
mcp rename
rename params
fix typo
Customizations to update delete op return types (#41677)
deleteAgent internal
customization for memory store delete ops
set deleteAgentVersion internal
remove make private
More renaming (#41719)
Hide DetailEnum everywhere except for Azure.AI.Extensions.OpenAI (#41771)
Yet another round of renaming. (#41912)
Remove EvaluationScheduleTaskEvalRun (#41979)
Unhide the evaluation targets. (#41984)
Add ScenarioBasedEvaluatorDefinition for Adaptive Evals

Add 'scenario' to EvaluatorDefinitionType union and supporting models:

ContextInput: typed context (agent_prompt, policy_document, trace_file, supplementary_text)
RubricCriterion: weighted scoring criterion with 1-5 rubric and applicability guidance
ScenarioBasedEvaluatorDefinition: generates rubric catalog (quality) or taxonomy (safety) from context inputs

EvaluatorCategory determines which generation pipeline runs:

QUALITY -> rubric catalog with scored criteria, applicability gates, sink criterion
SAFETY -> taxonomy with risk categories and sub-behaviors

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Redesign: separate generate endpoint from evaluator persistence

Remove context_inputs from ScenarioBasedEvaluatorDefinition (generation inputs don't belong on persisted definition)
Add ContextInputType union with typed context categories
Add GenerateEvaluatorRequest with context_inputs, category, existing_criteria (for iterative refinement)
Add GenerateEvaluatorResponse with RubricCatalog for HITL review before saving
Add RubricCatalog model (spec + criteria array + source_model)
Add generate action on Evaluators interface: POST /evaluators/{name}/versions:generate
ScenarioBasedEvaluatorDefinition now stores only generated outputs (spec, rubric_catalog, taxonomy ref)

User flow: generate -> review/edit criteria -> save via createVersion

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Rename: scenario→generated, ContextInput→Source, simplify terminology

ScenarioBasedEvaluatorDefinition → GeneratedEvaluatorDefinition (type 'generated')
ContextInput → Source, ContextInputType → SourceType
Source type values: description, policy, traces, supplementary
context_inputs → sources on GenerateEvaluatorRequest
category defaults to 'quality' (optional, not required)
Reduces core terminology from ~9 terms to 6

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Address council review: simplify RubricCriterion, rename rubric_catalog→rubric_criteria, evaluation_summary

Critical fixes from council of models review (Opus, GPT-5.2, Codex):

RubricCriterion: remove name, scoring, applicability_guidance, always_on (pipeline doesn't produce these)
RubricCriterion: add rubric_id (read-only, service-generated), fixed_applicability (for sink)
RubricCriterion: weight changed from float32 to int32 (matching evalfactory range 1-10)
Remove RubricCatalog model; use rubric_criteria: RubricCriterion[] directly
Rename rubric_catalog → rubric_criteria on definition and response
Rename spec → evaluation_summary; move to response-only (remove from definition)
GeneratedEvaluatorDefinition now has only: rubric_criteria, taxonomy_id, taxonomy_version
existing_criteria doc updated: seed-only for //build, pipeline may keep/modify/drop

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Remove existing_criteria — service auto-retrieves prior version by evaluator name

When regenerating, the service looks up the latest version's criteria by evaluator name and uses them as context. No need for the user to explicitly pass them.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Generate returns EvaluatorVersion, align sources to data gen pattern

Generate action now returns EvaluatorVersion (not custom response model)
Remove GenerateEvaluatorResponse — evaluation_summary/source_model go in metadata
Replace Source/SourceType with data-gen-aligned discriminated union:
- PromptEvaluatorGenerationSource (prompt text + agent_name)
- TracesEvaluatorGenerationSource (agent_name + time window)
- FileEvaluatorGenerationSource (file id)
Source type strings match DataGenerationJobSourceType: Prompt, Traces, File

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Rename sources to EvaluatorGenerationJobSource, simplify RubricCriterion, rename sink to general quality

Rename EvaluatorGenerationSource → EvaluatorGenerationJobSource (all subtypes)
Rename EvaluatorGenerationSourceType → EvaluatorGenerationJobSourceType
Replace fixed_applicability (int32) with always_applicable (boolean)
Update doc strings: sink criterion → general quality criterion

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Add @minValue/@maxValue to weight, clarify rubric_id semantics

RubricCriterion.weight: Add @minValue(1) @maxValue(10) constraint
RubricCriterion.weight doc: Clarify weight discipline is a generation heuristic
RubricCriterion.rubric_id doc: Stable human-readable slug, not raw hash
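To make the constraint concrete, here is a minimal Python sketch of a client-side mirror of RubricCriterion at this point in the history. Only `rubric_id`, `weight`, and `always_applicable` come from the commits above; the `description` field and the validation placement are assumptions, and the service remains the authority on what it accepts.

```python
from dataclasses import dataclass


@dataclass
class RubricCriterion:
    rubric_id: str            # service-generated, stable human-readable slug
    description: str          # hypothetical field, not named in the commit text
    weight: int               # int32 in the spec, constrained to 1..10
    always_applicable: bool = False

    def __post_init__(self) -> None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.weight, bool) or not isinstance(self.weight, int):
            raise TypeError("weight must be an integer")
        # Mirrors @minValue(1) @maxValue(10) on RubricCriterion.weight.
        if not 1 <= self.weight <= 10:
            raise ValueError("weight must be between 1 and 10")
```

Validating locally only gives earlier feedback; it does not replace the range check the service applies.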

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

PromptEvaluatorGenerationJobSource: allow both prompt + agent_name

At least one of prompt or agent_name must be specified. When both provided, agent instructions are merged with supplementary prompt text.
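The "at least one of" rule can be sketched as a small request builder. This is an illustrative helper, not part of any emitted SDK: the function name is hypothetical, and the `"Prompt"` type string is taken from the earlier commit that aligns source types with DataGenerationJobSourceType.

```python
from typing import Dict, Optional


def build_prompt_source(
    prompt: Optional[str] = None, agent_name: Optional[str] = None
) -> Dict[str, str]:
    """Build a Prompt source payload, enforcing that prompt or agent_name is set."""
    if not prompt and not agent_name:
        raise ValueError("at least one of prompt or agent_name must be specified")
    source = {"type": "Prompt"}
    if prompt:
        source["prompt"] = prompt
    if agent_name:
        source["agent_name"] = agent_name
    # When both are present, merging agent instructions with the prompt text
    # happens on the service side; the client only forwards both fields.
    return source
```

For example, `build_prompt_source(agent_name="support-bot")` yields a payload with only `type` and `agent_name`, where `support-bot` is a placeholder name.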

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Meeting decisions: rename to rubrics, persist-by-default, remove taxonomy, add model+persist params

Rename GeneratedEvaluatorDefinition → RubricBasedEvaluatorDefinition
Rename EvaluatorDefinitionType.generated → .rubrics
Remove taxonomy_id/taxonomy_version (safety uses rubrics too)
Add persist boolean (default true) to GenerateEvaluatorRequest
Rename model_deployment_name → model
Update route doc for persist-by-default semantics

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Align review fixes: rubric_id lifecycle, general quality non-editable docs

rubric_id: clients must echo existing ID on edit (not just 'preserved')
always_applicable: clarify general criterion is non-editable, users can set on own criteria
rubric_criteria: note general criterion is non-editable

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Clarify residual criterion IDs: general_quality vs general_policy_compliance

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Rename File source → Dataset source (DatasetEvaluatorGenerationJobSource)

FileEvaluatorGenerationJobSource → DatasetEvaluatorGenerationJobSource
Source type 'File' → 'Dataset' with name+version fields
Aligns with DatasetDataGenerationJobSource pattern

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Make model required, update sources doc

model: optional → required (user must provide own LLM)
sources doc: 'uploaded files' → 'datasets'

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Source roles, LRO-first (JobLike), EvaluatorGenerationJob model

Add EvaluatorGenerationJobSourceRole union (agent_description, evaluator_description)
Add role field to EvaluatorGenerationJobSource base model
Add EvaluatorGenerationResult and EvaluatorGenerationJob (extends JobLike)
Generate route returns JobCreatedResponse (201 + Operation-Location)
Import servicepatterns.tsp for JobLike/FoundryTimestamp

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Revert to synchronous generate API

Remove EvaluatorGenerationJob and EvaluatorGenerationResult models
Generate route returns ResourceOkResponse (not JobCreatedResponse)
Remove servicepatterns.tsp import (no longer needed)

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Fix rubric_id visibility: Read+Create (clients must echo on edit)

rubric_id was Read-only, preventing clients from sending it back when editing criteria and saving as a new version. Now Read+Create so service generates it on first creation and clients echo it on saves.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Switch to LRO pattern: EvaluatorGenerationJob + shared GenerationJobSource

Replace sync generate route with EvaluatorGenerationJobs interface (postJobPreview, queryJobStatusPreview, listJobsPreview, cancelJobPreview, deleteJobPreview)
Rename EvaluatorGenerationJobSource* to GenerationJobSource* (shared with datagen)
Replace role enum with purpose?: string on base source model
Add EvaluatorGenerationInputs, EvaluatorGenerationResult, EvaluatorGenerationJob
Add TokenUsage model (will consolidate when datagen merges)
Follow DataGenerationJob pattern exactly: evaluator_generation/jobs/{jobId}
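To illustrate the create-then-poll flow this commit introduces, here is a rough sketch against an in-memory stand-in. `FakeJobService`, `wait_for_job`, and the status strings are assumptions made for the example; they are not the service contract or an emitted client.

```python
import itertools

# Assumed terminal states; the real job model may name them differently.
TERMINAL_STATES = {"succeeded", "failed", "canceled"}


class FakeJobService:
    """In-memory stand-in for the job routes; not a real client."""

    def __init__(self, polls_until_done: int = 2) -> None:
        self._jobs = {}
        self._ids = itertools.count(1)
        self._polls_until_done = polls_until_done

    def post_job(self, inputs: dict) -> str:
        # Creating a job returns an id the caller polls on.
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = {"status": "running", "polls": 0, "inputs": inputs}
        return job_id

    def query_job_status(self, job_id: str) -> dict:
        job = self._jobs[job_id]
        job["polls"] += 1
        if job["polls"] >= self._polls_until_done:
            job["status"] = "succeeded"
        return {"id": job_id, "status": job["status"]}


def wait_for_job(service: FakeJobService, job_id: str, max_polls: int = 10) -> dict:
    """Poll until the job reaches a terminal state or the poll budget runs out."""
    for _ in range(max_polls):
        job = service.query_job_status(job_id)
        if job["status"] in TERMINAL_STATES:
            return job
    raise TimeoutError(f"{job_id} did not finish within {max_polls} polls")
```

A real poller would sleep between status queries; the delay is omitted here so the sketch stays deterministic.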

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Flat route structure: evaluator_generation_jobs (not evaluator_generation/jobs)

Architect decision: REST API uses flat routes. SDK represents as nested (project.evaluators.generation_jobs).

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Address April's review: JobSource in common, Agent source, simplify LRO

Per April's 8 review comments on PR #42264:

Rename GenerationJobSource* to JobSource* (generic for future job types)
Move JobSource types to common/models.tsp (shared location)
Add new Agent source type (AgentJobSource) - agent_name under Prompt was unintuitive per April + Dan feedback
PromptJobSource now has prompt (required), no agent_name
Remove persist field (LRO always persists)
Remove EvaluatorGenerationResult wrapper (result is EvaluatorVersion)
Remove TokenUsage (not needed without result wrapper)
Remove cancel route (not required now)
Keep category singular (quality/safety mutually exclusive)

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Nest generation jobs under evaluators sub-client via @@clientLocation

Move EvaluatorGenerationJobs operations into Beta.Evaluators using @@clientLocation in relocate-beta-operations.tsp. Add @@clientName entries in client.tsp for Python-friendly names:
create → generate
get → get_generation_job
list → list_generation_jobs
delete → delete_generation_job

SDK surface: client.beta.evaluators.generate(...)

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Address Sakoll's review: purpose→description, AgentJobSource→PromptAgentJobSource

Per Sakoll's 3 comments:

Rename purpose to description on JobSource (avoids OAI Files purpose collision)
Rename AgentJobSource to PromptAgentJobSource (hosted agents can't fetch instructions)
Update TracesJobSource.agent_version doc: all versions included when not specified

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Address April's round 2: Agent (not PromptAgent), rubrics doc, required criteria, cancel back

Per April's 5 new comments:

Revert PromptAgentJobSource back to AgentJobSource (Agent type) per April: hosted agents have description/metadata, useful in future
Fix rubrics doc: can be created via generate API or manually
Make rubric_criteria required (not optional)
Add cancel route back per Sashank + April (similar to Eval Run)
Delete doc: wipes job record only, keeps generated evaluator

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Fix rubric_id visibility, model keyword escape, regenerate swagger

rubric_id: remove @visibility restriction, make user-editable short name
model field: escape with backticks (reserved keyword in TypeSpec)
Regenerate openapi3 JSON for v1 and virtual-public-preview

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Add typed EvaluatorGenerationArtifacts on EvaluatorVersion

Replace the implicit metadata-bag approach with a typed, read-only generation_artifacts field on EvaluatorVersion that holds DatasetReference pointers to the spec, optional tools, and optional context produced during generation. Excludes actor (deferred to multi-turn dataset epic).

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Convert JobSourceType wire values to snake_case

Per Foundry data-plane convention, JobSourceType in common/models.tsp now uses snake_case wire values:
Prompt -> prompt
Agent -> agent
Traces -> traces
Dataset -> dataset

JobSource subtype discriminator literals (PromptJobSource, AgentJobSource, TracesJobSource, DatasetJobSource) updated to match. SDK class names are unchanged (emitter normalizes regardless of TypeSpec key casing); only the JSON wire format changes.
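As a rough illustration of the wire-format change, the enum below carries the new snake_case values, and a small helper accepts the old PascalCase strings during a transition. `LEGACY_TO_SNAKE` and `normalize_source_type` are hypothetical helpers for the example, not something the spec or backend is known to provide.

```python
from enum import Enum


class JobSourceType(str, Enum):
    """Wire values after this commit (snake_case)."""

    PROMPT = "prompt"
    AGENT = "agent"
    TRACES = "traces"
    DATASET = "dataset"


# Old PascalCase wire values mapped to their replacements.
LEGACY_TO_SNAKE = {
    "Prompt": "prompt",
    "Agent": "agent",
    "Traces": "traces",
    "Dataset": "dataset",
}


def normalize_source_type(value: str) -> JobSourceType:
    """Accept either casing and return the enum member; unknown values raise ValueError."""
    return JobSourceType(LEGACY_TO_SNAKE.get(value, value))
```

Whether the backend tolerates both casings during rollout is not stated in the commit; the helper only shows one way a caller could bridge the two.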

Backend coordination required during transition.

Out of scope (deferred to separate cleanup):

red-teams/models.tsp RiskCategory (10 PascalCase members)
connections/models.tsp ConnectionType, CredentialType
foundry-data-generation-jobs.tsp DataGenerationJobSourceType (PR #56)

Regenerated OpenAPI also picks up prior TSP/JSON drift from earlier commits (DatasetReference, EvaluatorGenerationArtifacts schemas now materialized in OpenAPI).

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Update specification/ai-foundry/data-plane/Foundry/src/common/models.tsp

Co-authored-by: Sashank Kolli 89619248+sakoll@users.noreply.github.com

Make rubric_id required; drop service-side slugifier

rubric_id is now produced directly by the generation model (snake_case identifier matching ^[a-z][a-z0-9_]*$) and required in every RubricCriterion. The service no longer derives or post-processes rubric_ids — the slugifier in ACA generate.py will be removed in a follow-up commit on the rubric gen branch.
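The identifier rule quoted above can be checked with a short helper. The function name `is_valid_rubric_id` is illustrative; only the regular expression comes from the commit message.

```python
import re

# Pattern taken verbatim from the commit message.
RUBRIC_ID_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")


def is_valid_rubric_id(value: str) -> bool:
    """Return True when value is a snake_case identifier starting with a lowercase letter."""
    # fullmatch also rejects a trailing newline, which a bare "$" would tolerate.
    return RUBRIC_ID_PATTERN.fullmatch(value) is not None
```

For example, `general_quality` passes, while `General-Quality` and `1st_rule` do not.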

Resolves spec.md Open Question Q3 follow-up; aligns with the decision to keep the evaluation spec markdown only in generation_artifacts.spec.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Clarify rubric_id authorship in @doc

rubric_id is user-provided, either during manual evaluator creation or during human-in-the-loop review of a generated rubric catalog. The generation pipeline emits an initial value the user can edit before saving the EvaluatorVersion.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Restructure JobSource per sakoll review

Drop discriminator base in common/models.tsp; replace with JobSourceDescription mixin to align with Shivam's PR #41606 pattern. Add named EvaluatorJobSource union in evaluators/models.tsp for the polymorphic sources field.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Use JobSourceType enum refs instead of string literals per dargilco

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Address dargilco r2: rename Python clientName generate->generate_job; add service-side default for always_applicable

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Address sakoll: drop AgentJobSource.prompt — supplementary text covered by PromptJobSource entry

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Address glecaros: convert JobSource mixin -> discriminated base model

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Address glecaros: per-context discriminated EvaluatorGenerationJobSource via spread

Drop shared @discriminator("type") JobSource base; restore shared XxxJobSource shapes with string-literal type so they can be spread. Add per-top-level discriminated EvaluatorGenerationJobSource base + 4 ...JobSource-spread subtypes (matches DataGenerationJobSource pattern from #42722). EvaluatorGenerationInputs.sources is now EvaluatorGenerationJobSource[].
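A sketch of how a consumer might dispatch on the `type` discriminator for the four source shapes. The class names and the `parse_source` helper are hypothetical. Only fields named elsewhere in this PR are modeled, and the traces time-window fields are left out because their names are not given here.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class PromptSource:
    prompt: str
    type: str = "prompt"


@dataclass
class AgentSource:
    agent_name: str
    type: str = "agent"


@dataclass
class TracesSource:
    agent_name: str
    agent_version: Optional[str] = None  # all versions included when omitted
    type: str = "traces"


@dataclass
class DatasetSource:
    name: str
    version: str
    type: str = "dataset"


Source = Union[PromptSource, AgentSource, TracesSource, DatasetSource]

_BY_TYPE = {
    "prompt": PromptSource,
    "agent": AgentSource,
    "traces": TracesSource,
    "dataset": DatasetSource,
}


def parse_source(payload: dict) -> Source:
    """Pick the concrete class from the discriminator, then build it from the rest."""
    cls = _BY_TYPE.get(payload.get("type"))
    if cls is None:
        raise ValueError(f"unknown source type: {payload.get('type')!r}")
    fields = {key: value for key, value in payload.items() if key != "type"}
    return cls(**fields)
```

Unknown keys raise `TypeError` from the dataclass constructor; a real client would likely be more lenient for forward compatibility.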

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

feat(adaptive-evals): relocate evaluator generation jobs under evaluators sub-client (Python) + add TokenUsage

Add 5 @@clientLocation directives so Python SDK exposes: project_client.evaluators.{generate_job, get_generation_job, list_generation_jobs, cancel_generation_job, delete_generation_job}. REST surface (route + tag EvaluatorGenerationJobs) is unchanged.
Add reusable TokenUsage model in common/models.tsp (input_tokens, output_tokens, total_tokens) for cross-LRO reuse.
Add usage?: TokenUsage (read-only) to EvaluatorGenerationJob so callers can see token consumption when the job reaches a terminal state.
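A small sketch of reading the optional usage block from a job payload. `TokenUsageSketch` and `read_usage` are illustrative names; the three field names come from the commit, and the dict-shaped job payload is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TokenUsageSketch:
    """Hypothetical local mirror of the TokenUsage model."""

    input_tokens: int
    output_tokens: int
    total_tokens: int

    @classmethod
    def from_wire(cls, payload: dict) -> "TokenUsageSketch":
        return cls(
            input_tokens=int(payload["input_tokens"]),
            output_tokens=int(payload["output_tokens"]),
            total_tokens=int(payload["total_tokens"]),
        )


def read_usage(job: dict) -> Optional[TokenUsageSketch]:
    """usage is optional on the job, so return None when it is absent."""
    usage = job.get("usage")
    return TokenUsageSketch.from_wire(usage) if usage is not None else None
```

Because usage is read-only and optional, callers should expect `None` until the job reaches a terminal state.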

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

fix(adaptive-evals): remove duplicate @@clientLocation from client.tsp

Per @dargilco review: the EvaluatorGenerationJobs @@clientLocation directives were already present in relocate-beta-operations.tsp (without the unnecessary 'python' scope arg). Remove the duplicates I added in client.tsp to keep the relocation in one place.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

fix(adaptive-evals): rename TokenUsage to EvaluatorGenerationTokenUsage per glecaros review

The shared TokenUsage model in common/models.tsp collided conceptually with PR #41606's data-generation TokenUsage (renamed there to DataGenerationTokenUsage). The two shapes differ enough that sharing doesn't pay off; rename ours to be specific to the evaluator generation LRO and move it next to EvaluatorGenerationJob.

Cross-LRO convergence on a shared TokenUsage can be revisited later.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

@list

…aset/Traces), and RubricsEvaluatorDefinition (#42264) * adapt /feature/foundry with v2 folder structure * move generated openapi3 folder into Foundry data plane folder * git add tspconfig.yaml * introduce src folder * update additionalDirectories per 39560 * Update specification/ai-foundry/data-plane/Foundry/client.csharp.tsp Co-authored-by: Jose Alvarez <jpalvarezl@users.noreply.github.com> * updates * fixes * more updates * Extracting Namespace definition to importable file under `common` (#39644) * Trying out namespace extraction to its own file * File rename * update generated specs * Removed api version pinning for Java code gen * [Foundry branch-to-branch] client.tsp updates for new ingestion .NET support (#39696) * .net client.tsp support for new spec structure * tsp format * (minor) comment cleanup * recompile to reflect small non-client.tsp changes * client.tsp workaround for itemresource * port cspell updates with path snap * comprehensive cspell update * removal of legacy 'readonly' folder * introduce interim suppression guidance * Revert "Removed api version pinning for Java code gen" This reverts commit 7d98e6e. * Make fixes for evaluators and generate code (#39720) * [Foundry] Moving updates from old branch. (#39792) * [Foundry] Moving updates from old branch. * cleanup * cspell * cspell * cspell * cspell * Refreshing spec + cleanup, * Minor corrections to tsp-config and forcing emission of IncludeEnum (#39802) * Use "Tool" or "PreviewTool" as a suffix for Azure Tools (#39815) * Update OpenAPI3.0 files * Rename MemorySearchTool to MemorySearchPreviewTool * Updating item type for the memory store search parameters. 
(#39878) * Fix input item type for Update Memory operation (#39888) * Update allowed types for TextResponseFormatJsonSchema (#39884) * [Agents-V2] Add tool_choice_mode for prompt agents and Resolve created_by field conflicts in ItemResources (#39882) * [Agents-V2] Add tool_choice_mode for prompt agnts and Resolve created_by field conflicts in ItemResources * Add versing to deprecate old fields * PR feedack * review comments * Fix comments * Fix the evaluator creation time type. (#39916) * Fix the evaluator creation time type. * Fix * Making `CodeInterpreterTool`'s `container` property optional. (#39927) * update (#39932) * Suppressing emission for duplicate types exposed by Stainless SDK (#39906) * Supressing unnecessary types * Added more supressions * Removing unnecessary renames * Revert "Removing unnecessary renames" This reverts commit fab93d6. * Rename OpenAI namespace fo fix xref in the generated code. (#39938) * [Foundry branch-to-branch] 'v1' versioning w/preview designations (#39854) * initial and incomplete: draft 'v1' versioning updates * adjust openai-based operation pattern * minor: adjust suppressions and superficial folder structure for Agents+OpenAI ops * refresh with most (not yet all) removed(v1) excised * include finetuning, insights in v1 * remove in-situ _preview qualifiers from agents and introduce OAS extension-based tracking * supply deprecated pre-GA memorysearchtool copy * revert overapplied blanket preview treatment of openai-evals * restore /evaluations to PuPr pseudo-version; make openai-eval traces source preview * Adjusting broken import from responses and converstaion folder rename * use Record<unknown> for OpenAPI tool 'spec' * Update OpenAPI files * Replace 'unknown' with 'Record<unknown>' (#39992) * Rename folder /src/finetuning to /src/openai-finetuning (#40041) And update its route to /openai/v1/fine_tuning (added /v1 and fixed to use underscore instead of dash). 
Now all OpenAI routes are defined by TypeSpec files in folders that start with "openai-": openai-conversations openai-evaluations openai-finetuning openai-responses * Add Synthetic data generation eval models for public preview (#40003) * Add Synthetic data generation eval models for public preview * comment * fix * fix: lingering and mismatched memory_search_preview_call * remove conditional memory stores header use from agents operations * remove agent preview header from delete ops * remove duplicate discriminator `kind` on parent `AgentDefinition` (#40063) * remove duplicate discriminator * update openapi * Added missing @list for paged op (#40057) * Added missing @list for paged op * Removing OpenAI.ToolChoiceParam from code gen tree * Forcing emission of missed enum * Removed duplicate line * Removing rename for inexistent class * Forcing emission of FoundryPreviewOptInKeys for Java * Cleanup client.tsp used by Python and JS emitters (#40090) There is no functional change associated with this PR, just cleanup. Remove anything related to C# or Java, as those emitters do not use this client.tsp file. Also move some lines around for better grouping under "sub-client" section. Tested by re-emitting the Python SDK and seeing that there are no changes. 
* Schedule parameter rename (#40113) * Restore input variable name "schedule" (instead of "resource") in emitted Python/JS Schedules operations (#40094) * add operation for compact * post-merge spec compile * [Foundry branch-to-branch] Add a service contract library target to facilitate ingestion (#40065) * service contract target for lib generation * refactor imports to facilitate compatible view ingestion * revert branch-local package.json changes (development only) * .net openai: use client emitter view * .net client sdk only: adjust naming for types * sequence_number in parent ResponseStreamEvent * trial: required sequence_number on parent for easier emission * access/usage for '1' suffixes * updates for customtoolcalloutput to appear * post-merge spec compile * Updates to opt-in headers. Extend OpenAI.OutputItem for Azure defined Responses models. (#40114) * Make sure opt-in flag values can be any string, for extensibility purposes (while still documenting the current supported value) (#40175) * Do no emit streamAgentContainerLogs operation for Python and JS * Cleanup in client.tsp * Build fix + namespace + object type enum. (#40184) * Adding object type union + fixing compilation error. * updating target namespace. * rest of the unions. * More cleanup in Python/JS client.tsp. 
Do not emit AgentReference * Restoring original way of defining opt-in flags (#40214) * [Agents-V2] Minor updates for service contracts * Fix typo * Update client.tsp to not include main.tsp * Remove evaluations from client.tsp * Minor update to client.tsp * Create Beta sub-clients for emitted Python and JS SDKs (#40254) * service contract: make conversation message id, status optional for reuse * Fix typo and minor cleanup in client.tsp * [Agents-V2] Minor updates to make models more extensible (#40269) * Make Id in OutputItem mandatory * Make it optional * Make ID mandatory again * Make Id optional again * Move Container Agent Operations to a separate routes.tsp file, excluded from client*.tsp used by SDK emitters. (#40259) * sdk-projects-openai-only: client.tsp refresh (+format folder) * Save copy of Foundry Features flag summary on preview operations * [Agents-V2] Update OutputItemRemoteToolCall model to include the full payload (#40281) * Fix OutputItemRemoteToolCall schema * Add OutputItemRemoteToolCallOutput * Fix Remote tool output item type * Fix typo * Remove args from remote tool call output * Remove language specification from client names (#40264) * Remove Python client name mappings for containers (#40298) Removed client name mapping for AgentContainerObject and AgentContainerOperationObject in Python. 
* Fix opt-in preview flags (Part 1) (#40308) * no function change: tsp format, re-emit .json/.yaml * Fix RemoteToolArgument scheme * update OpenAPI3 files * Fix opt-in preview flags (Part 2) (#40315) * [Agents-V2 Fix RemoteToolArgument schema * Allow renaming an SDK property on the Schedules createOrUpdate operations * Use alias isntead of Model for Schedules createOrUpdate parameters, so it does not get emitted as a class * Fix the schedules name (#40097) * Fix the schedules name * Fix * Fix renamings * Rollback --------- Co-authored-by: Nikolay Rovinskiy <nirovins@microsoft.com> * Use "word1-word2" format everywhere in folder names (and one file name) (#40379) * Remove the experimental header from evaluation rules. (#40383) Co-authored-by: Nikolay Rovinskiy <nirovins@microsoft.com> * Make HumanEvaluationRuleAction -> HumanEvaluationPreviewRuleAction (#40413) * Make HumanEvaluationRuleAction -> HumanEvaluationPreviewRuleAction * Add optional headers * Add csv file source. (#40460) * Update OpenAPI3 files, following `npx tsp compile .` * Remove "humanEvaluation" as type discriminator from v1. Python/JS emitters use more unique name for OpenAI "Error" class. 
(#40495) * Update "Foundry-Features" HTTP request header report (#40502) * Additional updates to foundry-features-flag-summary.md * Removed OpenAI prefix for Evals (#40484) * Opt-in flag only required for createOrUpdate in Evaluation Rules (#40513) * error rename (#40531) * Do not emit Agent Create/Update operations for JS and Python (#40539) * Fix Python emitted enum values (#40576) * Temporary local fix for "type" discriminator in OpenAI's WebSearchApproximateLocation (#40594) * Add tool call item models for grounding tools (Foundry v2) (#40491) * Add tool call item models for grounding tools Adds the following tool call item models (adapted to v2 structure): - GroundingToolCallDocument - shared document model - BingGroundingToolCallItemParam/Resource - Bing grounding - SharepointGroundingToolCallItemParam/Resource - SharePoint grounding - AzureAISearchToolCallItemParam/Resource - Azure AI Search - BingCustomSearchToolCallItemParam/Resource - Bing custom search - OpenApiToolCallItemParam/Resource - OpenAPI - BrowserAutomationToolCallItemParam/Resource - browser automation - FabricDataAgentToolCallItemParam/Resource - Fabric data agent - AzureFunctionToolCallItemParam/Resource - Azure Function * Generate OpenAPI specs with new tool call item models * Address PR feedback: Add types to _AgentItemType enum and create ToolCallStatus union - Added new tool call type discriminators to _AgentItemType union in openai-responses/models.tsp - Created ToolCallStatus named union to avoid duplicate inline union definitions - Regenerated OpenAPI specs * Update ToolCallStatus doc string to be generic Updated doc to 'The status of a tool call.' since it's used by multiple tools. Note: Type discriminator properties must use string literals per TypeSpec requirements for discriminated model inheritance. The new types are registered in _AgentItemType enum (openai-responses/models.tsp) which extends OpenAI.ItemType/OutputItemType via @@copyVariants. 
* Use enum values for tool call type discriminators * Revert to OpenAI pattern: extend OpenAI.Item/OutputItem with string literals * Inline OutputBase aliases that were used only once * Rework tool call models: split into _call and _call_output types with call_id, arguments, and generic output * Add A2A tool call models and name property to OpenAPI/AzureFunction/A2A models * remove agent containers, reformat and recompile (#40617) * Java arch board review feedback (#40481) * Renaming OpenAI.Error to avoid conflict in codegen * Projects renames * Field renames * Trying alternate types * Hidding custom day of week * Hidding custom day of week * Alternate type not working, need to investigate further * utcDateTime type overrides * Added comment with suppressions we might need for agents sdk * Made singular nounds out of some names * Renamed enum variant * Updated function param name * Rename agent request models to avoid Java codegen '1' suffix collision * More feedback * Fixed bad tsp * Update OpenAPI3 files after running `npx tsp compile .` * Run `npx tsp format **/*tsp` * tactical patch for MCPToolCall.error (#40655) * More renames + Foundry Feature keys (#40641) * More renames * Using area values for foundry feature opt-in keys * Deduping header values * after merge compile --------- Co-authored-by: Gerardo Lecaros <10088504+glecaros@users.noreply.github.com> * Remove Python & JS SDK dependency on OpenAI.InputItem (#40648) * Update openapi3 files by running `npx tsp compile .` * Remove additional Container Agent assets from v1 (#40685) * Update OpenAPI3 files after running 'npx tsp compile .' with latest packages ('npm install') * Make Memory Stores "search_memories" method internal for Python (#40741) * Update to latest OpenAI TypeSpec package (1.11.0) (#40745) * Separating AgentDefinitionFeatureKeys from FoundryFeaturesOptInKeys. 
(#40765) * Foundry Eval Benchmark (#40012) * Add Azure AI benchmark models and data source config * Rename AzureAIBenchmarkEvalRunDataSource to AzureAIBenchmarkPreviewEvalRunDataSource * Change scenario from 'benchmark' to 'benchmark_preview' * Add input messages configuration to AzureAIBenchmarkPreviewEvalRunDataSource Added input messages configuration to AzureAIBenchmarkPreviewEvalRunDataSource. * Refactor AzureAIBenchmarkDataSourceConfig model * Remove AzureAIBenchmarkDataSourceConfig from models * Add 'benchmark_preview' to evaluation scenarios * Add Azure AI Benchmark Data Source Config Added AzureAIBenchmarkDataSourceConfig to the evaluation models. * Refactor BenchmarkMetadata into AzureAIBenchmarkDataSourceConfig * Add grader_model field to benchmark specification Added optional grader model field for benchmarks using model graders. * [Draft] Agent Invocations API Specification (#40709) * Initial commit that imports Lakshmi's invoke api spec * Updates * Fine tune the spec * remove open ai spec changes * Adding RAPI<>Invoke mapping examples * flush pending updates * agent-invocations * fix * Add get/cancel apis: * Finetune * fixes post merge * Fix compile error (`npx tsp compile .`). 
Format files (`npx tsp format **/*tsp`) * Explucde Invoactions from GA version (#40857) * Address arch review board comments (humans and Azure SDK bot) (#40844) * Fix typo in name of newly added union `AgentDefintionOptInKeys` (#40870) * Add "allow_preview" to Python client initialization list (#40925) * feat: Add Hosted Agents ADC integration API spec changes (#40739) * feat: Add Hosted Agents ADC integration API spec changes - Add status (AgentVersionStatus) and error (AgentVersionError) fields to AgentVersionObject for ADC snapshot provisioning visibility - Add force query parameter to deleteAgentVersion for safe version deletion when active sessions exist - Add foundry_session_id to CreateResponse request and Response model for ADC sandbox affinity and session-scoped operations All new fields are gated behind hosted_agents_v1_preview feature key. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Added deleted state * Renamed to agent_session_id * Addressed review comments * Nit doc updates * Hide preview-only fields from GA SDK with @removed(Versions.v1) Add @removed(Versions.v1) to hosted-agent-specific additions: - status and error on AgentVersionObject - AgentVersionStatus union - agent_session_id on CreateResponse and Response These fields are only relevant for hosted agents (preview) and should not appear in the GA SDK. Following the pattern from PR #40857. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Ankit Sultania <asultania@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * regen openapi3 * MemoryStores operation rename/visibility updates and arch board feedback for java (#40910) * MemoryStores operation rename/visibility updates for java * Renamed anonymous field * Using JDK type for DayOfWeek * OpenAI.Error rename, Index rename * More renames * rename * More renames * Moved renames * Fixed renames * Modified right file * feat: Add TelemetryConfig and TelemetryEndpoint models to HostedAgentDefinition (#40784) * Add TelemetryConfig and TelemetryEndpoint models to HostedAgentDefinition - TelemetryConfig: wraps a list of TelemetryEndpoint instances - TelemetryEndpoint: defines kind (required), data, endpoint (required), protocol, and auth - Add telemetry_config optional property to HostedAgentDefinition - Both models gated behind hosted_agents_v1_preview feature key - kind remains a plain string to allow extensibility for future telemetry endpoint types (e.g. AzureMonitor, AppInsights) without an API version change * Add minimal README for Foundry data-plane API src directory * Move README.md to agents folder * feat(agents): add extensible telemetry config for hosted agents Add TelemetryConfig with discriminated endpoint and auth hierarchies to HostedAgentDefinition for customer-supplied telemetry export. 
Design: - TelemetryEndpoint: discriminated by 'kind' (OTLP today, extensible) - TelemetryEndpointAuth: discriminated by 'type' (header today, extensible) - Typed unions for endpoint kind, data kinds, transport protocol, auth type - All unions include string fallback for forward-compatibility Models added: - TelemetryConfig (endpoints: 1-3 required) - TelemetryEndpoint (base, discriminator: kind) - OtlpTelemetryEndpoint (kind: OTLP, endpoint + protocol required) - TelemetryEndpointAuth (base, discriminator: type) - HeaderTelemetryEndpointAuth (type: header, headerName/secretId/secretKey) - TelemetryEndpointKind, TelemetryDataKind, TelemetryTransportProtocol, TelemetryEndpointAuthType unions Wire format preserved via @Encodedname for auth fields (camelCase). Feature-gated behind hosted_agents_v1_preview. Breaking change: auth object now requires 'type' discriminator field. * fix: format tsp, fix identifier, regenerate openapi3 (json+yaml) - tsp format applied - Fixed AgentDefinitionFeatureKeys -> AgentDefinitionOptInKeys - Regenerated openapi3 JSON and YAML for v1 and virtual-public-preview * Remove @Encodedname decorators from HeaderTelemetryEndpointAuth to use snake_case consistently * Regenerate OpenAPI3 JSON and YAML after snake_case fix for HeaderTelemetryEndpointAuth --------- Co-authored-by: Vipin Koottayi <vkoottayi@microsoft.com> * Rename ImageGenActionEnum to ImageGenAction for Python and JS * Apply suggestions from code review Feedback Co-authored-by: Johan Stenberg (MSFT) <johan.stenberg@microsoft.com> * regen openapi3 * closed opt in enums. 
(#41017) * Add evaluator upload operations and entry_point for custom code evaluators (#40955) * Add evaluator upload operations and entry_point to CodeBasedEvaluatorDefinition - Add startPendingUpload operation to Evaluators interface for initiating code upload to blob storage - Add getCredentials operation to Evaluators interface for fetching SAS tokens to access evaluator storage - Add entry_point property to CodeBasedEvaluatorDefinition for specifying the main Python file of uploaded evaluator code - Make code_text optional since uploaded evaluators use entry_point instead - Add SDK client name for startPendingUpload in client.tsp Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add request body with blobUri to evaluator getCredentials operation - Add EvaluatorCredentialRequest model with required blobUri property - Update getCredentials operation to accept the request body Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add optional image_tag to CodeBasedEvaluatorDefinition Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Use Python native date-time objects (#41044) * Split packages (#40662) * Split packages * Split packages * Adopt openai spec 1.11.0 * Fix generation * Fix generation * Remove Agent Items * Fix Azure.AI.Extensions.OpenAI * Fix AgentResponseItem * Remove Agents from Azure.AI.Projects * Expose APIError * cleanup * Run tsp compile * Fix after merge --------- Co-authored-by: Nikolay Rovinskiy <nirovins@microsoft.com> * removing telemetry_config from v1. * Marking operations as preview-only. * TimeZone JDK type overrides for Foundry libraries (#41103) * Added type overrides for timezone + rearrangements * WebSearchAppLoc same override * Rever addition of telemetry config. * feedback * Revert "Add evaluator upload operations and entry_point for custom code evaluators (#40955)" This reverts commit 08e55ee. 
* Suppressing the emission of classes duplicated by `openai-java` package (#41194) * Suppressing the emission of classes duplicated by openai-java package * removed bad stuff * More external duped types * More external types * Implement suggested renamings. (#41135) * Implement suggested renamings. * Move tools to the Tools namespace * Revert * More cleanup (#41211) * Renamings in the Azure.AI.Projects (#41259) * Add listconversation ops (#41323) * Apply suggestions from code review Co-authored-by: Johan Stenberg (MSFT) <johan.stenberg@microsoft.com> * PR feedback for #39565 (#41345) * Apply suggestions from code review Co-authored-by: Johan Stenberg (MSFT) <johan.stenberg@microsoft.com> * making invocations body unknown. * fix spec for tcgc 0.66.2 (#41316) * Update deleteAgent to internal type (#41348) * Updated insights to use the same pattern as the rest of the API. * regen openapi3 * Implement other renamings. (#41390) * Adding correct decorator for paged op Insights (#41420) * Using @list and library type * Restoring Azure.Core.Page usage * Change the way of subcliuent initialization (#41439) * Change the way of subcliuent initialization * Generate standalone client * renaming and relocating Azure specific listConversations op (#41419) * Disabled convenientAPI generation for Agent delete Operations (#41522) * Disabled convenientAPI generation for Agent delete Operations * Added clarifying comment * Azure specific for `ResponseCreate` ops clustered for SDK convenience (#41471) * Trying something out * Adjusting option bag visibility * The azure option bag contains only current fields * Same model for response * Renamed model * regen openapi * Renamings inside agents package. 
(#41508) * Rename intermediate model and added docs (only visible at client library level) (#41585) * Class Renames (#41614) * suppress object type * fix naming * approach as rename * mcp rename * rename params * fix typo * Customizations to update delete op return types (#41677) * deleteAgent internal * customization for memory store delete ops * set deleteAgentVersion internal * remove make private * More renaming (#41719) * Hide DetailEnum everywhere except for Azure.AI.Extensions.OpenAI (#41771) * Yet another round of renaming. (#41912) * Remove EvaluationScheduleTaskEvalRun (#41979) * Unhide the evaluation targets. (#41984) * Add ScenarioBasedEvaluatorDefinition for Adaptive Evals Add 'scenario' to EvaluatorDefinitionType union and supporting models: - ContextInput: typed context (agent_prompt, policy_document, trace_file, supplementary_text) - RubricCriterion: weighted scoring criterion with 1-5 rubric and applicability guidance - ScenarioBasedEvaluatorDefinition: generates rubric catalog (quality) or taxonomy (safety) from context inputs EvaluatorCategory determines which generation pipeline runs: - QUALITY -> rubric catalog with scored criteria, applicability gates, sink criterion - SAFETY -> taxonomy with risk categories and sub-behaviors Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Redesign: separate generate endpoint from evaluator persistence - Remove context_inputs from ScenarioBasedEvaluatorDefinition (generation inputs don't belong on persisted definition) - Add ContextInputType union with typed context categories - Add GenerateEvaluatorRequest with context_inputs, category, existing_criteria (for iterative refinement) - Add GenerateEvaluatorResponse with RubricCatalog for HITL review before saving - Add RubricCatalog model (spec + criteria array + source_model) - Add generate action on Evaluators interface: POST /evaluators/{name}/versions:generate - ScenarioBasedEvaluatorDefinition now stores only generated outputs (spec, 
rubric_catalog, taxonomy ref) User flow: generate -> review/edit criteria -> save via createVersion Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Rename: scenario→generated, ContextInput→Source, simplify terminology - ScenarioBasedEvaluatorDefinition → GeneratedEvaluatorDefinition (type 'generated') - ContextInput → Source, ContextInputType → SourceType - Source type values: description, policy, traces, supplementary - context_inputs → sources on GenerateEvaluatorRequest - category defaults to 'quality' (optional, not required) - Reduces core terminology from ~9 terms to 6 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address council review: simplify RubricCriterion, rename rubric_catalog→rubric_criteria, evaluation_summary Critical fixes from council of models review (Opus, GPT-5.2, Codex): - RubricCriterion: remove name, scoring, applicability_guidance, always_on (pipeline doesn't produce these) - RubricCriterion: add rubric_id (read-only, service-generated), fixed_applicability (for sink) - RubricCriterion: weight changed from float32 to int32 (matching evalfactory range 1-10) - Remove RubricCatalog model; use rubric_criteria: RubricCriterion[] directly - Rename rubric_catalog → rubric_criteria on definition and response - Rename spec → evaluation_summary; move to response-only (remove from definition) - GeneratedEvaluatorDefinition now has only: rubric_criteria, taxonomy_id, taxonomy_version - existing_criteria doc updated: seed-only for //build, pipeline may keep/modify/drop Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Remove existing_criteria — service auto-retrieves prior version by evaluator name When regenerating, the service looks up the latest version's criteria by evaluator name and uses them as context. No need for the user to explicitly pass them. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Generate returns EvaluatorVersion, align sources to data gen pattern - Generate action now returns EvaluatorVersion (not custom response model) - Remove GenerateEvaluatorResponse; evaluation_summary/source_model go in metadata - Replace Source/SourceType with data-gen-aligned discriminated union: - PromptEvaluatorGenerationSource (prompt text + agent_name) - TracesEvaluatorGenerationSource (agent_name + time window) - FileEvaluatorGenerationSource (file id) - Source type strings match DataGenerationJobSourceType: Prompt, Traces, File Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Rename sources to EvaluatorGenerationJobSource, simplify RubricCriterion, rename sink to general quality - Rename EvaluatorGenerationSource → EvaluatorGenerationJobSource (all subtypes) - Rename EvaluatorGenerationSourceType → EvaluatorGenerationJobSourceType - Replace fixed_applicability (int32) with always_applicable (boolean) - Update doc strings: sink criterion → general quality criterion Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add @minValue/@maxValue to weight, clarify rubric_id semantics - RubricCriterion.weight: Add @minValue(1) @maxValue(10) constraint - RubricCriterion.weight doc: Clarify weight discipline is a generation heuristic - RubricCriterion.rubric_id doc: Stable human-readable slug, not raw hash Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * PromptEvaluatorGenerationJobSource: allow both prompt + agent_name At least one of prompt or agent_name must be specified. When both provided, agent instructions are merged with supplementary prompt text. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Meeting decisions: rename to rubrics, persist-by-default, remove taxonomy, add model+persist params - Rename GeneratedEvaluatorDefinition → RubricBasedEvaluatorDefinition - Rename EvaluatorDefinitionType.generated → .rubrics - Remove taxonomy_id/taxonomy_version (safety uses rubrics too) - Add persist boolean (default true) to GenerateEvaluatorRequest - Rename model_deployment_name → model - Update route doc for persist-by-default semantics Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Align review fixes: rubric_id lifecycle, general quality non-editable docs - rubric_id: clients must echo existing ID on edit (not just 'preserved') - always_applicable: clarify general criterion is non-editable, users can set on own criteria - rubric_criteria: note general criterion is non-editable Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Clarify residual criterion IDs: general_quality vs general_policy_compliance Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Rename File source → Dataset source (DatasetEvaluatorGenerationJobSource) - FileEvaluatorGenerationJobSource → DatasetEvaluatorGenerationJobSource - Source type 'File' → 'Dataset' with name+version fields - Aligns with DatasetDataGenerationJobSource pattern Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Make model required, update sources doc - model: optional → required (user must provide own LLM) - sources doc: 'uploaded files' → 'datasets' Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Source roles, LRO-first (JobLike), EvaluatorGenerationJob model - Add EvaluatorGenerationJobSourceRole union (agent_description, evaluator_description) - Add role field to EvaluatorGenerationJobSource base model - Add EvaluatorGenerationResult and EvaluatorGenerationJob (extends JobLike) - Generate route returns JobCreatedResponse (201 + 
Operation-Location) - Import servicepatterns.tsp for JobLike/FoundryTimestamp Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Revert to synchronous generate API - Remove EvaluatorGenerationJob and EvaluatorGenerationResult models - Generate route returns ResourceOkResponse<EvaluatorVersion> (not JobCreatedResponse) - Remove servicepatterns.tsp import (no longer needed) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix rubric_id visibility: Read+Create (clients must echo on edit) rubric_id was Read-only, preventing clients from sending it back when editing criteria and saving as a new version. Now Read+Create so service generates it on first creation and clients echo it on saves. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Switch to LRO pattern: EvaluatorGenerationJob + shared GenerationJobSource - Replace sync generate route with EvaluatorGenerationJobs interface (postJobPreview, queryJobStatusPreview, listJobsPreview, cancelJobPreview, deleteJobPreview) - Rename EvaluatorGenerationJobSource* to GenerationJobSource* (shared with datagen) - Replace role enum with purpose?: string on base source model - Add EvaluatorGenerationInputs, EvaluatorGenerationResult, EvaluatorGenerationJob - Add TokenUsage model (will consolidate when datagen merges) - Follow DataGenerationJob pattern exactly: evaluator_generation/jobs/{jobId} Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Flat route structure: evaluator_generation_jobs (not evaluator_generation/jobs) Architect decision: REST API uses flat routes. SDK represents as nested (project.evaluators.generation_jobs). 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address April's review: JobSource in common, Agent source, simplify LRO Per April's 8 review comments on PR #42264: - Rename GenerationJobSource* to JobSource* (generic for future job types) - Move JobSource types to common/models.tsp (shared location) - Add new Agent source type (AgentJobSource) - agent_name under Prompt was unintuitive per April + Dan feedback - PromptJobSource now has prompt (required), no agent_name - Remove persist field (LRO always persists) - Remove EvaluatorGenerationResult wrapper (result is EvaluatorVersion) - Remove TokenUsage (not needed without result wrapper) - Remove cancel route (not required now) - Keep category singular (quality/safety mutually exclusive) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Nest generation jobs under evaluators sub-client via @@clientLocation Move EvaluatorGenerationJobs operations into Beta.Evaluators using @@clientLocation in relocate-beta-operations.tsp. Add @@clientName entries in client.tsp for Python-friendly names: create → generate get → get_generation_job list → list_generation_jobs delete → delete_generation_job SDK surface: client.beta.evaluators.generate(...) 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address Sakoll's review: purpose→description, AgentJobSource→PromptAgentJobSource Per Sakoll's 3 comments: - Rename purpose to description on JobSource (avoids OAI Files purpose collision) - Rename AgentJobSource to PromptAgentJobSource (hosted agents can't fetch instructions) - Update TracesJobSource.agent_version doc: all versions included when not specified Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address April's round 2: Agent (not PromptAgent), rubrics doc, required criteria, cancel back Per April's 5 new comments: - Revert PromptAgentJobSource back to AgentJobSource (Agent type) per April: hosted agents have description/metadata, useful in future - Fix rubrics doc: can be created via generate API or manually - Make rubric_criteria required (not optional) - Add cancel route back per Sashank + April (similar to Eval Run) - Delete doc: wipes job record only, keeps generated evaluator Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix rubric_id visibility, model keyword escape, regenerate swagger - rubric_id: remove @visibility restriction, make user-editable short name - model field: escape with backticks (reserved keyword in TypeSpec) - Regenerate openapi3 JSON for v1 and virtual-public-preview Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add typed EvaluatorGenerationArtifacts on EvaluatorVersion Replace the implicit metadata-bag approach with a typed, read-only generation_artifacts field on EvaluatorVersion that holds DatasetReference pointers to the spec, optional tools, and optional context produced during generation. Excludes actor (deferred to multi-turn dataset epic). 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Convert JobSourceType wire values to snake_case Per Foundry data-plane convention, JobSourceType in common/models.tsp now uses snake_case wire values: Prompt -> prompt, Agent -> agent, Traces -> traces, Dataset -> dataset. JobSource subtype discriminator literals (PromptJobSource, AgentJobSource, TracesJobSource, DatasetJobSource) updated to match. SDK class names are unchanged (emitter normalizes regardless of TypeSpec key casing); only the JSON wire format changes. Backend coordination required during transition. Out of scope (deferred to separate cleanup): - red-teams/models.tsp RiskCategory (10 PascalCase members) - connections/models.tsp ConnectionType, CredentialType - foundry-data-generation-jobs.tsp DataGenerationJobSourceType (PR #56) Regenerated OpenAPI also picks up prior TSP/JSON drift from earlier commits (DatasetReference, EvaluatorGenerationArtifacts schemas now materialized in OpenAPI). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Update specification/ai-foundry/data-plane/Foundry/src/common/models.tsp Co-authored-by: Sashank Kolli <89619248+sakoll@users.noreply.github.com> * Make rubric_id required; drop service-side slugifier rubric_id is now produced directly by the generation model (snake_case identifier matching ^[a-z][a-z0-9_]*$) and required in every RubricCriterion. The service no longer derives or post-processes rubric_ids; the slugifier in ACA generate.py will be removed in a follow-up commit on the rubric gen branch. Resolves spec.md Open Question Q3 follow-up; aligns with the decision to keep the evaluation spec markdown only in generation_artifacts.spec. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Clarify rubric_id authorship in @doc rubric_id is user-provided, either during manual evaluator creation or during human-in-the-loop review of a generated rubric catalog. 
The generation pipeline emits an initial value the user can edit before saving the EvaluatorVersion. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Restructure JobSource per sakoll review Drop discriminator base in common/models.tsp; replace with JobSourceDescription mixin to align with Shivam's PR #41606 pattern. Add named EvaluatorJobSource union in evaluators/models.tsp for the polymorphic sources field. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Use JobSourceType enum refs instead of string literals per dargilco Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address dargilco r2: rename Python clientName generate->generate_job; add service-side default for always_applicable Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address sakoll: drop AgentJobSource.prompt; supplementary text covered by PromptJobSource entry Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address glecaros: convert JobSource mixin -> discriminated base model Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address glecaros: per-context discriminated EvaluatorGenerationJobSource via spread Drop shared @discriminator("type") JobSource base; restore shared XxxJobSource shapes with string-literal type so they can be spread. Add per-top-level discriminated EvaluatorGenerationJobSource base + 4 ...JobSource-spread subtypes (matches DataGenerationJobSource pattern from #42722). EvaluatorGenerationInputs.sources is now EvaluatorGenerationJobSource[]. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(adaptive-evals): relocate evaluator generation jobs under evaluators sub-client (Python) + add TokenUsage - Add 5 @@clientLocation directives so Python SDK exposes: project_client.evaluators.{generate_job, get_generation_job, list_generation_jobs, cancel_generation_job, delete_generation_job} REST surface (route + tag EvaluatorGenerationJobs) is unchanged. - Add reusable TokenUsage model in common/models.tsp (input_tokens, output_tokens, total_tokens) for cross-LRO reuse. - Add usage?: TokenUsage (read-only) to EvaluatorGenerationJob so callers can see token consumption when the job reaches a terminal state. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(adaptive-evals): remove duplicate @@clientLocation from client.tsp Per @dargilco review: the EvaluatorGenerationJobs @@clientLocation directives were already present in relocate-beta-operations.tsp (without the unnecessary 'python' scope arg). Remove the duplicates I added in client.tsp to keep the relocation in one place. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(adaptive-evals): rename TokenUsage to EvaluatorGenerationTokenUsage per glecaros review The shared TokenUsage model in common/models.tsp collided conceptually with PR #41606's data-generation TokenUsage (renamed there to DataGenerationTokenUsage). The two shapes differ enough that sharing doesn't pay off; rename ours to be specific to the evaluator generation LRO and move it next to EvaluatorGenerationJob. Cross-LRO convergence on a shared TokenUsage can be revisited later. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Travis Wilson <travisw@microsoft.com> Co-authored-by: Gerardo Lecaros <10088504+glecaros@users.noreply.github.com> Co-authored-by: Jose Alvarez <jpalvarezl@users.noreply.github.com> Co-authored-by: Jose Alvarez <jp.alvarezl@gmail.com> Co-authored-by: Travis Wilson <35748617+trrwilson@users.noreply.github.com> Co-authored-by: Nikolay Rovinskiy <30440255+nick863@users.noreply.github.com> Co-authored-by: Darren Cohen <39422044+dargilco@users.noreply.github.com> Co-authored-by: Glenn Harper <64209257+glharper@users.noreply.github.com> Co-authored-by: Ravi Pidaparthi <rapida@microsoft.com> Co-authored-by: Sashank Kolli <89619248+sakoll@users.noreply.github.com> Co-authored-by: retry-recv <goubo2012@gmail.com> Co-authored-by: Nikolay Rovinskiy <nirovins@microsoft.com> Co-authored-by: Nagendra Posani <naposani@microsoft.com> Co-authored-by: JoshLove-msft <54595583+JoshLove-msft@users.noreply.github.com> Co-authored-by: Linda Li <139801625+lindazqli@users.noreply.github.com> Co-authored-by: Abdelmohsen Quritum <127798197+AbdelmohsenMS@users.noreply.github.com> Co-authored-by: Ankit Sultania <ankitsultania2007@gmail.com> Co-authored-by: Ankit Sultania <asultania@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: vtrika <vipincec@gmail.com> Co-authored-by: Vipin Koottayi <vkoottayi@microsoft.com> Co-authored-by: Johan Stenberg (MSFT) <johan.stenberg@microsoft.com> Co-authored-by: Waqas Javed <7674577+w-javed@users.noreply.github.com> Co-authored-by: Kaylie <50653231+kaylieee@users.noreply.github.com> Co-authored-by: Mike Harder <mharder@microsoft.com> Co-authored-by: Jorge Rangel <102122018+jorgerangel-msft@users.noreply.github.com>
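
For illustration, the RubricCriterion constraints described in the commit message above (a required snake_case rubric_id matching ^[a-z][a-z0-9_]*$, and a weight bounded by @minValue(1) and @maxValue(10)) could be checked client-side roughly as follows. This is a sketch under assumptions, not the service's validation: `validate_criterion` and the plain-dict shape are hypothetical, and because the thread does not say whether weight is required, the range check runs only when weight is present.

```python
import re

# Pattern quoted in the commit message for rubric_id.
RUBRIC_ID_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")


def validate_criterion(criterion: dict) -> list:
    """Return a list of problems in a RubricCriterion-like dict (hypothetical helper)."""
    problems = []

    # rubric_id is described as required in every RubricCriterion.
    rubric_id = criterion.get("rubric_id")
    if not isinstance(rubric_id, str) or not RUBRIC_ID_PATTERN.fullmatch(rubric_id):
        problems.append("rubric_id must match ^[a-z][a-z0-9_]*$")

    # Assumption: weight is only range-checked when it is supplied.
    weight = criterion.get("weight")
    if weight is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(weight, int) and not isinstance(weight, bool)
        if not is_int or not 1 <= weight <= 10:
            problems.append("weight must be an integer from 1 to 10")

    always_applicable = criterion.get("always_applicable")
    if always_applicable is not None and not isinstance(always_applicable, bool):
        problems.append("always_applicable must be a boolean when present")

    return problems
```

A criterion such as `{"rubric_id": "general_quality", "weight": 5}` yields an empty list, while an uppercase identifier or a weight of 11 each adds one entry.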

github-actions · 2026-04-29T21:06:14Z

Next Steps to Merge

Important checks have failed. As of today they are not blocking this PR, but in the near future they may.
Addressing the following failures is highly recommended:

⚠️ The check named Swagger BreakingChange has failed. To unblock this PR, follow the process at aka.ms/brch.

If you still want to proceed merging this PR without addressing the above failures, refer to step 4 in the PR workflow diagram.

Comment generated by summarize-checks workflow run.

github-actions · 2026-04-29T21:13:15Z

API Change Check

APIView identified API level changes in this PR and created the following API reviews

Language	API Review for Package
TypeSpec	Azure.AI.Projects

Comment generated by After APIView workflow run.

slister1001 · 2026-04-29T22:00:14Z

@johanste / @glecaros — opened a child PR into this branch applying the feedback patterns from #41606:

#42768 — fix(adaptive-evals): align with Johan's feedback on data-gen PR #41606

Summary of changes:

JobSourceDescription: model → alias (single-property fragment used purely for spreading)
Removed redundant ...JobSourceDescription spread from the EvaluatorGenerationJobSource discriminated base — each leaf already spreads it via its XxxJobSource chain
Regenerated openapi3 (4 lines removed per file — only the redundant description on the base)

Things kept as-is (with rationale in #42768):

XxxJobSource shapes stay as model (multiple per-context consumers: this PR and #41606)
common/ placement of source shapes is now genuinely cross-cutting (PR 42764 + 41606 both consume)
agent_name/agent_version not folded into a shared alias because the version docstrings differ semantically between AgentJobSource and TracesJobSource

@johanste — would appreciate your signoff on #42768 so we can unblock #42764.

glecaros · 2026-04-30T18:56:05Z

+// ============================================================================
+
+@doc("Common properties shared across all job source types. Spread into per-context source variants alongside a context-specific discriminated base (e.g., `EvaluatorGenerationJobSource`).")
+model JobSourceDescription {


when we merge #41606, we'll need to resolve the conflict by taking their version of this (which is an alias)

#42768) - Convert JobSourceDescription from model to alias (single-property wrapper used purely for spreading - declaring a named model adds no value over inlining the property). - Remove redundant ...JobSourceDescription spread from the EvaluatorGenerationJobSource discriminated base. Each concrete subtype already spreads its XxxJobSource shape from common/, and each of those already spreads JobSourceDescription. The redundant spread emitted `description` twice in the base schema. - Regenerate openapi3 (v1 + virtual-public-preview).

…ure-rest-api-specs into glecaros/adaptative-evals # Conflicts: # specification/ai-foundry/data-plane/Foundry/openapi3/v1/microsoft-foundry-openapi3.json # specification/ai-foundry/data-plane/Foundry/openapi3/virtual-public-preview/microsoft-foundry-openapi3.json # specification/ai-foundry/data-plane/Foundry/src/common/models.tsp

aprilk-ms · 2026-05-01T02:40:47Z


-@doc("Traces source — conversation traces from Application Insights.")
+@doc("Traces source — conversation traces from Application Insights. Reusable shape spread into per-context discriminated subtypes.")
 model TracesJobSource {


add agent_id?

Done in #42826 (commit cc789bf5) — added agent_id?: string between type and agent_name. Pattern matches the existing agent_id? in TracesPreviewEvalRunDataSource (openai-evaluations/models.tsp:350). When omitted, traces fall back to filtering by agent_name (and agent_version if specified).
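
As a rough sketch of the fallback described in the reply above, the selection of trace filter fields might look like this. The `build_trace_filter` helper is hypothetical and not part of any SDK; treating agent_id as taking precedence over agent_name, and raising when neither is given, are assumptions rather than documented service behaviour.

```python
from typing import Optional


def build_trace_filter(
    agent_id: Optional[str] = None,
    agent_name: Optional[str] = None,
    agent_version: Optional[str] = None,
) -> dict:
    """Choose which fields to filter traces on (hypothetical helper)."""
    # Assumption: when agent_id is supplied it is used on its own.
    if agent_id:
        return {"agent_id": agent_id}

    # Assumption: some agent identifier is needed to select traces.
    if not agent_name:
        raise ValueError("either agent_id or agent_name is required")

    # Fallback from the thread: filter by agent_name, plus agent_version if specified.
    selected = {"agent_name": agent_name}
    if agent_version:
        selected["agent_version"] = agent_version
    return selected
```

Leaving agent_version out of the filter is consistent with the earlier TracesJobSource doc update, which states that all versions are included when no version is specified.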

aprilk-ms · 2026-05-01T02:42:41Z

+}
+
+@doc("Service-managed provenance artifacts produced by an evaluator generation job. Present only on EvaluatorVersion resources created via the generation pipeline. All references are read-only and resolve to versioned Foundry Datasets in a service-reserved namespace.")
+model EvaluatorGenerationArtifacts {


I thought we updated to 1 dataset JSONL file now?

Caught — TypeSpec was lagging the Apr 27 C1 reshape (vienna PR #2056752 / thread 32921814). Fixed in #42826 (commit cc789bf5):

model EvaluatorGenerationArtifacts {
  dataset: DatasetReference; // single combined-JSONL Foundry Dataset, version-aligned to EvaluatorVersion.version
  kinds: string[]; // which kinds appear as rows (e.g. ["spec"], ["spec", "tools"])
}

ACA already writes this shape; the spec just hadn't caught up.
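
To make the combined-JSONL shape concrete, here is a small sketch of how a consumer might split such a dataset by row kind. Everything about the row layout is an assumption: the thread says only that each row carries a kind discriminator, so the `kind` field name, the `content` field, and the `group_rows_by_kind` helper are hypothetical.

```python
import json
from collections import defaultdict


def group_rows_by_kind(jsonl_text: str) -> dict:
    """Group JSONL rows by their (assumed) "kind" field."""
    grouped = defaultdict(list)
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between rows
        row = json.loads(line)
        grouped[row["kind"]].append(row)
    return dict(grouped)


# Hypothetical sample rows; real artifact rows may carry different fields.
sample = (
    '{"kind": "spec", "content": "evaluation spec markdown"}\n'
    '{"kind": "tools", "content": "tool definitions"}\n'
)
rows = group_rows_by_kind(sample)
kinds = sorted(rows)  # comparable to the `kinds` array on the artifacts model
```

Under these assumptions, the sorted keys of the grouped result correspond to the `kinds` array, which lets a client see what is present without reading every row.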

aprilk-ms · 2026-05-01T02:45:11Z

 @@clientName(Evaluators.listLatestVersions, "list");

+// Evaluator generation jobs — renamed for Python SDK discoverability
+@@clientName(EvaluatorGenerationJobs.create, "generate_job", "python");


are we sure it is generate_job and not create_generation_job (consistent with others)?

Switched to create_generation_job for consistency in #42826 (commit cc789bf5). All five now align: create_generation_job / get_generation_job / list_generation_jobs / cancel_generation_job / delete_generation_job.

This updates dargilco's Apr 28 resolution (where I'd explicitly noted leaving siblings unchanged) — your consistency point wins.

Three changes:
1. TracesJobSource: add optional agent_id field (common/models.tsp:152); matches the existing pattern in TracesPreviewEvalRunDataSource (openai-evaluations/models.tsp:350). When omitted, traces fall back to filtering by agent_name (and agent_version if specified).
2. EvaluatorGenerationArtifacts: reshape from 3-field {spec, tools, context} to combined-JSONL {dataset, kinds[]} (evaluators/models.tsp:155). The TypeSpec was lagging the Apr 27 C1 reshape that already shipped in vienna PR #2056752: single Foundry Dataset per EvaluatorVersion with each row carrying a kind discriminator. ACA writes this shape today; spec just hadn't caught up.
3. Python clientName for EvaluatorGenerationJobs.create: generate_job -> create_generation_job (client.tsp:169) for consistency with the four siblings (get_generation_job, list_generation_jobs, cancel_generation_job, delete_generation_job). Updates dargilco's Apr 28 resolution per April's May 1 follow-up.
Regenerated openapi3/virtual-public-preview and openapi3/v1.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

glecaros requested review from balapv, dargilco, johanste and trrwilson as code owners April 29, 2026 21:05

github-actions Bot added data-plane TypeSpec Authored with TypeSpec labels Apr 29, 2026

slister1001 mentioned this pull request Apr 29, 2026

fix(adaptive-evals): align with Johan's feedback on data-gen PR #41606 #42768

Merged

slister1001 mentioned this pull request Apr 29, 2026

Add data gen jobs Apis and contracts #41606

Merged

glecaros commented Apr 30, 2026

slister1001 and others added 2 commits April 30, 2026 15:41

aprilk-ms reviewed May 1, 2026

slister1001 mentioned this pull request May 1, 2026

Address PR #42764 review feedback (April Kwong May 1) #42826

Merged

slister1001 and others added 2 commits May 1, 2026 14:55

regenerating oai3

6cd9d68

johanste approved these changes May 1, 2026

Merge branch 'feature/foundry-release' into glecaros/adaptative-evals

7384903

glecaros merged commit 63dc02b into feature/foundry-release May 2, 2026
17 of 36 checks passed

glecaros deleted the glecaros/adaptative-evals branch May 2, 2026 00:55

glecaros merged 6 commits into feature/foundry-release from glecaros/adaptative-evals

4 participants
