Problem
Tools are loaded into context at LanguageModel.create() time. On-device models have tight context windows (2K–8K tokens), and registering many tools, each with its full schema, eats into that budget fast.
Proposal
A toolSearch callback for on-demand tool discovery:
const session = await LanguageModel.create({
  tools: [/* always-on tools */],
  toolSearch: async (query) => {
    return registry.filter(t => matches(t, query));
  }
});
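The registry and matches used in the callback above are app-defined, not part of the proposal. A minimal sketch, assuming each tool object carries at least a name and a description (the catalog entries and the keyword matching here are illustrative; a real implementation might use an embedding index or a proper search library):

// Hypothetical app-side catalog: the full tool set lives here,
// outside the model's context, until something is actually needed.
const registry = [
  {
    name: "searchFlights",
    description: "Find flights between two airports on a given date",
    inputSchema: { /* JSON schema */ },
    execute: async (args) => { /* ... */ }
  },
  // ...dozens more
];

// Naive keyword match over name + description.
const matches = (tool, query) =>
  query.toLowerCase().split(/\s+/)
    .some(term => `${tool.name} ${tool.description}`.toLowerCase().includes(term));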
The UA exposes toolSearch as a built-in capability: the model calls it when it needs something outside its current tool set, and the matched tools are added to the session dynamically.
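A sketch of the intended round trip from the app's point of view; the prompt text and the example query are illustrative, and only session.prompt() is existing API:

// The session starts with only the always-on tools in its context.
const answer = await session.prompt("Book me a window seat to Tokyo next Friday");

// Under this proposal, when the model decides it needs a capability it
// doesn't currently have, the UA calls toolSearch (e.g. with a query
// like "flight booking"). Whatever the callback returns is appended to
// the session's tool set, and the model can invoke those tools on the
// same turn.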
Why
- 50 tools ≈ 5K+ tokens in schemas alone (rough estimate sketched after this list)
- On-device models cannot afford that
- Only 2–3 tools are relevant per turn
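A back-of-envelope version of the numbers above, assuming the common ~4 characters-per-token heuristic; estimateSchemaTokens is a hypothetical helper, not part of the API:

// Rough token cost of serializing every tool schema into the prompt.
const estimateSchemaTokens = (tools) =>
  tools.reduce((sum, t) =>
    sum + Math.ceil(JSON.stringify({
      name: t.name,
      description: t.description,
      inputSchema: t.inputSchema
    }).length / 4), 0);

// 50 tools at ~100 tokens of schema each ≈ 5,000 tokens, which overflows
// a 4K window outright and consumes most of an 8K one before the
// conversation even starts.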
Prior Art