Commit 2e10e26

feat(sdk): improve AI tool definitions for LLM accuracy (25% → 95% pass rate) (#2446)
* refactor(tests): enhance execution and tool quality tests with detailed assertions
  - Updated execution tests to validate tool execution mechanics, including trace and content assertions.
  - Improved tool quality tests to assess LLM tool selection accuracy and argument structure.
  - Added comprehensive checks for tool call sequences and success rates in execution traces.
  - Refined assertions to ensure correct tool usage and argument validation across various document operations.
* feat(cli): add descriptions to CLI operation parameters
  - Enhanced CLI operation parameter specifications by adding human-readable descriptions for better usability and documentation.
  - Updated existing parameters to include descriptions, improving clarity for users interacting with the CLI.
  - Modified the `CliOperationParamSpec` type to include an optional `description` field for enhanced schema documentation.
* feat(docs): enhance list creation and insertion documentation
  - Updated the documentation for list creation and insertion operations to include detailed descriptions for required parameters, improving clarity for users.
  - Added specific formatting instructions for the `at` and `target` parameters in the `create` and `insert` operations, respectively.
  - Regenerated the manifest file to reflect the updated source hash.
* fix(docs): update descriptions for selection target and reference handle in format documentation
  - Enhanced the descriptions for the `target` and `ref` properties across multiple format-related documentation files to clarify usage.
  - Updated the `target` description to recommend using 'ref' for search result handles.
  - Improved the `ref` description to specify passing the handle.ref value directly for inline formatting.
  - Regenerated the manifest file to reflect the updated source hash.
* feat(sdk): improve tool definitions for better LLM accuracy
  - Add descriptions to SelectionPoint, nestingPolicy, and inline formatting
  - Fix codegen to check contract-level required arrays (not just CLI params)
  - Remove empty {} oneOf branches from inline properties (42 simplified)
  - Deduplicate same-type oneOf branches (e.g. duplicate string refs)
  - Collapse single-branch oneOf to plain type
  - Add fallback descriptions for target, ref, content, inline params
  - Add "placing content near text" workflow to system prompt
  - Clarify search select.type must be "text" or "node"
* docs: regenerate reference docs after main merge
* fix(sdk): update agentVisible test for expectedRevision param
* chore: update sourceHash in generated manifest for document API
* chore: enable Claude Haiku 4.5 and Gemini 2.5 Pro providers in promptfooconfig
* fix(sdk): address PR review comments
  - Fix heading level description "1-9" → "1-6" to match schema max: 6
  - Add comment explaining commented-out providers are templates
  - Add zero tool calls guard to traceAllOk assertion
* chore: update sourceHash in generated manifest and enable additional providers in promptfooconfig
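
The three `oneOf` cleanups listed under feat(sdk) (drop empty `{}` branches, deduplicate branches, collapse a single remaining branch) can be illustrated with a rough sketch. This is not the codegen from this commit: the `Schema` type and the `simplifyOneOf` helper are hypothetical names, and deduplication here is assumed to mean "structurally identical after `JSON.stringify`", which may be stricter than what the actual codegen does.

```typescript
// Hypothetical minimal JSON-Schema shape; not the SDK's actual types.
type Schema = { [key: string]: unknown; oneOf?: Schema[] };

// Illustrative helper (assumed name): simplify one level of `oneOf`.
function simplifyOneOf(schema: Schema): Schema {
  const original = schema.oneOf;
  if (!Array.isArray(original)) return schema;

  // Copy everything except `oneOf` so sibling keys (e.g. description) survive.
  const rest: Schema = { ...schema };
  delete rest.oneOf;

  const seen = new Set<string>();
  const branches: Schema[] = [];
  for (const branch of original) {
    // 1. Drop empty {} branches, which constrain nothing.
    if (Object.keys(branch).length === 0) continue;
    // 2. Drop exact duplicates (assumption: key order is consistent).
    const key = JSON.stringify(branch);
    if (seen.has(key)) continue;
    seen.add(key);
    branches.push(branch);
  }

  // 3. Collapse: no branches left, or a single branch merged into the parent.
  if (branches.length === 0) return rest;
  if (branches.length === 1) return { ...rest, ...branches[0] };
  return { ...rest, oneOf: branches };
}

// Usage: an empty branch plus two identical string branches collapse to a plain type.
const simplified = simplifyOneOf({
  description: "Reference handle",
  oneOf: [{}, { type: "string" }, { type: "string" }],
});
console.log(JSON.stringify(simplified));
```

Merging the surviving branch into the parent keeps sibling keys such as `description` intact. Where a key appears in both, the branch value wins in this sketch, which is a design choice rather than something the commit message specifies.
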
1 parent 86600ac commit 2e10e26

74 files changed

Lines changed: 1614 additions & 686 deletions

apps/cli/scripts/export-sdk-contract.ts

Lines changed: 1 addition & 0 deletions
@@ -98,6 +98,7 @@ function buildSdkContract() {
 if (p.flag && p.flag !== p.name) spec.flag = p.flag;
 if (p.required) spec.required = true;
 if (p.schema) spec.schema = p.schema;
+if (p.description) spec.description = p.description;
 if (p.agentVisible === false) spec.agentVisible = false;
 return spec;
 }),
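
For context, the hunk above follows a "copy only when set" pattern, so optional fields never appear as empty keys in the exported contract. Below is a minimal standalone sketch of that pattern. The `CliParam` interface, the `toParamSpec` function, and the initial `{ name: p.name }` object are assumptions for illustration; only `CliOperationParamSpec` and its optional `description` field are named in the commit message, and the real type may carry other fields.

```typescript
// Hypothetical input shape, inferred from the fields the diff reads.
interface CliParam {
  name: string;
  flag?: string;
  required?: boolean;
  schema?: unknown;
  description?: string;
  agentVisible?: boolean;
}

// Assumed shape of the exported spec; `description` is the optional field
// this commit adds according to the commit message.
interface CliOperationParamSpec {
  name: string;
  flag?: string;
  required?: boolean;
  schema?: unknown;
  description?: string;
  agentVisible?: boolean;
}

// Illustrative stand-in for the mapping callback inside buildSdkContract().
function toParamSpec(p: CliParam): CliOperationParamSpec {
  const spec: CliOperationParamSpec = { name: p.name }; // assumed initializer
  if (p.flag && p.flag !== p.name) spec.flag = p.flag;
  if (p.required) spec.required = true;
  if (p.schema) spec.schema = p.schema;
  // New in this commit: forward the human-readable description when present.
  if (p.description) spec.description = p.description;
  if (p.agentVisible === false) spec.agentVisible = false;
  return spec;
}

// Usage: the description is carried through; unset fields are omitted.
console.log(toParamSpec({ name: "at", description: "Insertion point for the new list" }));
```

Because the check is a truthiness test, an empty-string description is dropped rather than exported, the same way the neighbouring `flag` and `schema` checks behave.
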
