Commit 2e10e26
feat(sdk): improve AI tool definitions for LLM accuracy (25% → 95% pass rate) (#2446)
* refactor(tests): enhance execution and tool quality tests with detailed assertions
- Updated execution tests to validate tool execution mechanics, including trace and content assertions.
- Improved tool quality tests to assess LLM tool selection accuracy and argument structure.
- Added comprehensive checks for tool call sequences and success rates in execution traces.
- Refined assertions to ensure correct tool usage and argument validation across various document operations.
* feat(cli): add descriptions to CLI operation parameters
- Added human-readable descriptions to the CLI operation parameter specifications, including existing parameters, to improve usability and documentation.
- Modified the `CliOperationParamSpec` type to include an optional `description` field for enhanced schema documentation.
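A minimal sketch of how an optional `description` on a parameter spec might flow into a generated schema. Only the type name `CliOperationParamSpec` and the optional `description` field come from this commit message; the other fields, the `toSchemaProperty` helper, and the sample parameter are hypothetical.

```typescript
// Hypothetical shape: only `CliOperationParamSpec` and its optional
// `description` field are named in the commit; the rest is assumed.
interface CliOperationParamSpec {
  name: string;
  type: "string" | "number" | "boolean" | "json";
  required?: boolean;
  description?: string;
}

// Hypothetical helper: build a JSON-Schema-like property and carry the
// description through only when the spec provides one.
function toSchemaProperty(spec: CliOperationParamSpec): Record<string, unknown> {
  const property: Record<string, unknown> = {
    type: spec.type === "json" ? "object" : spec.type,
  };
  if (spec.description !== undefined) {
    property["description"] = spec.description;
  }
  return property;
}

// Illustrative parameter, not taken from the repository.
const headingLevel: CliOperationParamSpec = {
  name: "level",
  type: "number",
  required: true,
  description: "Heading level, 1-6.",
};

const headingLevelSchema = toSchemaProperty(headingLevel);
```

Keeping the field optional means existing specs without a description still type-check and simply produce a property with no `description` key.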
* feat(docs): enhance list creation and insertion documentation
- Updated the documentation for list creation and insertion operations to include detailed descriptions for required parameters, improving clarity for users.
- Added specific formatting instructions for the `at` and `target` parameters in the `create` and `insert` operations, respectively.
- Regenerated the manifest file to reflect the updated source hash.
* fix(docs): update descriptions for selection target and reference handle in format documentation
- Enhanced the descriptions for the `target` and `ref` properties across multiple format-related documentation files to clarify usage.
- Updated the `target` description to recommend using `ref` for search result handles.
- Improved the `ref` description to specify passing the `handle.ref` value directly for inline formatting.
- Regenerated the manifest file to reflect the updated source hash.
* feat(sdk): improve tool definitions for better LLM accuracy
- Add descriptions to SelectionPoint, nestingPolicy, and inline formatting
- Fix codegen to check contract-level required arrays (not just CLI params)
- Remove empty {} oneOf branches from inline properties (42 simplified)
- Deduplicate same-type oneOf branches (e.g. duplicate string refs)
- Collapse single-branch oneOf to plain type
- Add fallback descriptions for target, ref, content, inline params
- Add "placing content near text" workflow to system prompt
- Clarify that search `select.type` must be "text" or "node"
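The three `oneOf` clean-ups listed above (drop empty `{}` branches, deduplicate identical branches, collapse a single remaining branch) can be sketched roughly as follows. This is an illustration under assumptions, not the codegen's actual implementation: the `Schema` type, the `simplifyOneOf` name, and the use of `JSON.stringify` as the equality check are all hypothetical.

```typescript
// Hypothetical, deliberately small schema shape for illustration.
interface Schema {
  type?: string;
  description?: string;
  oneOf?: Schema[];
}

// Hypothetical helper applying the three simplifications in order.
function simplifyOneOf(schema: Schema): Schema {
  if (!Array.isArray(schema.oneOf)) return schema;

  const seen = new Set<string>();
  const branches: Schema[] = [];
  for (const branch of schema.oneOf) {
    // 1. Remove empty {} branches, which match anything.
    if (Object.keys(branch).length === 0) continue;
    // 2. Deduplicate identical branches. Assumption: JSON.stringify is
    //    used as a simple, key-order-sensitive equality check.
    const key = JSON.stringify(branch);
    if (seen.has(key)) continue;
    seen.add(key);
    branches.push(branch);
  }

  // Copy the wrapper's own fields (e.g. description) without `oneOf`.
  const rest: Schema = { ...schema };
  delete rest.oneOf;

  if (branches.length === 0) return rest;
  // 3. Collapse a single remaining branch into a plain type.
  const [only] = branches;
  if (branches.length === 1) return { ...only, ...rest };
  return { ...rest, oneOf: branches };
}

// Illustrative input: a duplicated string branch plus an empty branch.
const refProperty: Schema = {
  description: "Reference handle.",
  oneOf: [{ type: "string" }, {}, { type: "string" }],
};
const simplified = simplifyOneOf(refProperty);
// simplified is { type: "string", description: "Reference handle." }
```

In this sketch the wrapper's own fields win over the collapsed branch's fields, so a property-level description survives the collapse.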
* docs: regenerate reference docs after main merge
* fix(sdk): update agentVisible test for expectedRevision param
* chore: update sourceHash in generated manifest for document API
* chore: enable Claude Haiku 4.5 and Gemini 2.5 Pro providers in promptfooconfig
* fix(sdk): address PR review comments
- Fix heading level description "1-9" → "1-6" to match schema max: 6
- Add a comment explaining that the commented-out providers are templates
- Add a zero-tool-calls guard to the traceAllOk assertion
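Why a zero-tool-calls guard matters: `Array.prototype.every` returns `true` for an empty array, so an "all calls succeeded" check passes vacuously when the model made no tool calls at all. The sketch below illustrates the idea; only the name `traceAllOk` comes from this commit, while the trace shape and return type are assumed.

```typescript
// Hypothetical trace entry; the real trace format is not shown in the commit.
interface ToolCallTrace {
  tool: string;
  ok: boolean;
}

interface AssertionResult {
  pass: boolean;
  reason: string;
}

// Sketch of an assertion that every traced tool call succeeded.
function traceAllOk(trace: ToolCallTrace[]): AssertionResult {
  // Guard: without this, an empty trace would pass, because
  // [].every(...) is vacuously true.
  if (trace.length === 0) {
    return { pass: false, reason: "No tool calls were recorded" };
  }
  const failed = trace.filter((call) => !call.ok).map((call) => call.tool);
  if (failed.length > 0) {
    return { pass: false, reason: `Failed tool calls: ${failed.join(", ")}` };
  }
  return { pass: true, reason: `All ${trace.length} tool calls succeeded` };
}

const emptyResult = traceAllOk([]);
const mixedResult = traceAllOk([
  { tool: "search", ok: true },
  { tool: "insert", ok: false },
]);
```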
* chore: update sourceHash in generated manifest and enable additional providers in promptfooconfig
1 parent 86600ac commit 2e10e26
74 files changed
Lines changed: 1614 additions & 686 deletions
File tree
- apps
- cli
- scripts
- src/cli
- docs/document-api/reference
- comments
- create
- format
- paragraph
- lists
- mutations
- query
- track-changes
- evals
- lib
- providers
- tests
- packages
- document-api/src/contract
- sdk
- codegen/src
- __tests__
- tools