feat: function calling across models #589
Open
Labels
new feature, requires triage
Description
Feature Description
Improve function calling support across models. Right now, three of the four models in this test return no function calls:
import { getLlama, LlamaChat, resolveModelFile } from "node-llama-cpp";
import { afterAll, describe, expect, it } from "vitest";

const models = [
  { label: "FunctionGemma 270M", url: "hf:unsloth/functiongemma-270m-it-GGUF:Q8_0" },
  { label: "LFM2 1.2B Tool", url: "hf:LiquidAI/LFM2-1.2B-Tool-GGUF:Q8_0" },
  { label: "Qwen2.5 Coder 1.5B", url: "hf:bartowski/Qwen2.5-Coder-1.5B-Instruct-GGUF:Q4_K_M" },
  { label: "Llama 3.2 1B", url: "hf:unsloth/Llama-3.2-1B-Instruct-GGUF:Q4_K_M" },
];

const functions = {
  get_weather: {
    description: "Get the current weather for a city.",
    params: {
      type: "object" as const,
      properties: {
        location: { type: "string" as const },
      },
      required: ["location"],
    },
  },
};

describe("node-llama-cpp native function calling", () => {
  const timeout = 10 * 60 * 1000;
  let llama: Awaited<ReturnType<typeof getLlama>> | undefined;

  afterAll(async () => {
    await llama?.dispose();
  });

  for (const { label, url } of models) {
    it(
      label,
      async () => {
        llama ??= await getLlama();
        const modelPath = await resolveModelFile(url, "./models");
        const model = await llama.loadModel({ modelPath });
        const context = await model.createContext({ flashAttention: true });
        const sequence = context.getSequence();
        const chat = new LlamaChat({ contextSequence: sequence });

        const res = await chat.generateResponse(
          [{ type: "user", text: "What is the weather in San Francisco?" }],
          { functions, maxTokens: 200, seed: 42 }
        );

        const hasFnCalls = (res.functionCalls?.length ?? 0) > 0;
        console.log(`\n--- ${label} (wrapper: ${chat.chatWrapper.wrapperName}) ---`);
        console.log(`functionCalls: ${hasFnCalls ? JSON.stringify(res.functionCalls) : "NONE"}`);
        console.log(`response text: ${JSON.stringify(res.response.slice(0, 200))}`);
        if (!hasFnCalls && res.response) {
          console.log(`⚠ tool call embedded in text, not returned via functionCalls`);
        }

        chat.dispose({ disposeSequence: false });
        sequence.dispose();
        await context.dispose();
        await model.dispose();

        // Assertion left disabled, so every test is reported as passed below.
        // expect(hasFnCalls).toBe(true);
      },
      timeout
    );
  }
});

[node-llama-cpp] A prebuilt binary was not found, falling back to using no GPU
stderr | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > FunctionGemma 270M
[node-llama-cpp] load: control-looking token: 212 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > FunctionGemma 270M
--- FunctionGemma 270M (wrapper: Gemma) ---
functionCalls: NONE
response text: "I cannot assist with retrieving weather information for San Francisco. My current capabilities are limited to calling specific functions as needed."
⚠ tool call embedded in text, not returned via functionCalls
stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > LFM2 1.2B Tool
--- LFM2 1.2B Tool (wrapper: ChatML) ---
functionCalls: NONE
response text: "I can help you with that, but I need the function to retrieve the weather information. Could you please provide the function call?\n\n```typescript\nfunction get_weather(params: {location: string});\n```\n"
⚠ tool call embedded in text, not returned via functionCalls
stderr | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > Qwen2.5 Coder 1.5B
[node-llama-cpp] load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > Qwen2.5 Coder 1.5B
--- Qwen2.5 Coder 1.5B (wrapper: Qwen) ---
functionCalls: NONE
response text: "{\"name\": \"get_weather\", \"arguments\": {\"location\": \"San Francisco\"}}"
⚠ tool call embedded in text, not returned via functionCalls
stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > Llama 3.2 1B
--- Llama 3.2 1B (wrapper: Llama 3.2 lightweight) ---
functionCalls: [{"functionName":"get_weather","params":{"location":"San Francisco"},"raw":["{\"name\": \"get_weather\", \"parameters\": {\"location\": \"San Francisco\"}}",{"type":"specialTokensText","value":"<|eot_id|>"}]}]
response text: ""
✓ packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts (4 tests) 14068ms
✓ node-llama-cpp native function calling (4)
✓ FunctionGemma 270M 1644ms
✓ LFM2 1.2B Tool 5359ms
✓ Qwen2.5 Coder 1.5B 3320ms
✓ Llama 3.2 1B 3742ms
Test Files 1 passed (1)
Tests 4 passed (4)
Start at 16:04:15
Duration 15.59s (transform 910ms, setup 1.07s, import 352ms, tests 14.07s, environment 0ms)
The Solution
I expected res.functionCalls to be filled. Is there a model I should test that is known to work?
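Until res.functionCalls is populated for these models, one possible stopgap is to recover the call from the response text. The sketch below is only an illustration based on the Qwen2.5 Coder output logged above; `parseJsonToolCall` and `ParsedCall` are hypothetical names, not part of node-llama-cpp, and it assumes the whole response is a single JSON object.

```typescript
// Hypothetical helper (not part of node-llama-cpp): recover a tool call that
// the model wrote as plain JSON text instead of returning it via functionCalls.
type ParsedCall = { functionName: string; params: Record<string, unknown> };

function parseJsonToolCall(text: string, knownFunctions: string[]): ParsedCall | undefined {
  let value: unknown;
  try {
    value = JSON.parse(text.trim());
  } catch {
    return undefined; // not JSON, e.g. a plain-language refusal
  }
  if (typeof value !== "object" || value === null) return undefined;

  const obj = value as Record<string, unknown>;
  const name = obj.name;
  // In the logs above, Qwen2.5 used "arguments" and Llama 3.2 used "parameters".
  const args = obj.arguments ?? obj.parameters;

  // Only accept calls to functions that were actually offered to the model.
  if (typeof name !== "string" || !knownFunctions.includes(name)) return undefined;
  if (typeof args !== "object" || args === null || Array.isArray(args)) return undefined;

  return { functionName: name, params: args as Record<string, unknown> };
}

// Response text copied from the Qwen2.5 Coder 1.5B run above.
const qwenText = '{"name": "get_weather", "arguments": {"location": "San Francisco"}}';
const recovered = parseJsonToolCall(qwenText, ["get_weather"]);
```

This would not help with the FunctionGemma or LFM2 runs above, where the response text contains no call at all.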
Considered Alternatives
I tried scanning the text stream for <|tool_call_start|>[get_weather(location="San Francisco")]<|tool_call_end|>. The ONNX version emits these markers, but this version strips the <|tool_call_start|> and <|tool_call_end|> tokens.
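For reference, the text scan described above could look roughly like the following. This is a minimal sketch that assumes the markers survive in the response text (which, as noted, is not the case here) and that each block holds one call with double-quoted string keyword arguments; `parsePythonicToolCalls` and `PythonicCall` are hypothetical names, not part of node-llama-cpp.

```typescript
// Hypothetical helper: extract calls of the form
// <|tool_call_start|>[name(key="value")]<|tool_call_end|> from raw text.
type PythonicCall = { functionName: string; params: Record<string, string> };

function parsePythonicToolCalls(text: string): PythonicCall[] {
  const calls: PythonicCall[] = [];
  const blockPattern = /<\|tool_call_start\|>\[([\s\S]*?)\]<\|tool_call_end\|>/g;

  let block: RegExpExecArray | null;
  while ((block = blockPattern.exec(text)) !== null) {
    // Split "name(args)" into the function name and the raw argument list.
    const call = /^\s*([A-Za-z_]\w*)\(([\s\S]*)\)\s*$/.exec(block[1]);
    if (call === null) continue;

    // Collect key="value" pairs; other argument types are ignored in this sketch.
    const params: Record<string, string> = {};
    const argPattern = /([A-Za-z_]\w*)\s*=\s*"([^"]*)"/g;
    let arg: RegExpExecArray | null;
    while ((arg = argPattern.exec(call[2])) !== null) {
      params[arg[1]] = arg[2];
    }
    calls.push({ functionName: call[1], params });
  }
  return calls;
}

// Marker format copied from the example in this issue.
const markedText = '<|tool_call_start|>[get_weather(location="San Francisco")]<|tool_call_end|>';
const markedCalls = parsePythonicToolCalls(markedText);
```

Because the scan depends entirely on the marker tokens, it returns an empty array whenever they are stripped before the text reaches the caller.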
Additional Context
No response
Related Features to This Feature Request
- Metal support
- CUDA support
- Vulkan support
- Grammar
- Function calling
Are you willing to resolve this issue by submitting a Pull Request?
Yes, I have the time, but I don't know how to start. I would need guidance.