feat: function calling across models #589

### Feature Description

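Improve function calling support across models. Right now, 3 of the 4 models in this test return no `functionCalls` (the final `expect` is commented out, so Vitest still reports all four tests as passing):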

```ts
import { getLlama, LlamaChat, resolveModelFile } from "node-llama-cpp";
import { afterAll, describe, expect, it } from "vitest";

const models = [
  { label: "FunctionGemma 270M", url: "hf:unsloth/functiongemma-270m-it-GGUF:Q8_0" },
  { label: "LFM2 1.2B Tool", url: "hf:LiquidAI/LFM2-1.2B-Tool-GGUF:Q8_0" },
  { label: "Qwen2.5 Coder 1.5B", url: "hf:bartowski/Qwen2.5-Coder-1.5B-Instruct-GGUF:Q4_K_M" },
  { label: "Llama 3.2 1B", url: "hf:unsloth/Llama-3.2-1B-Instruct-GGUF:Q4_K_M" },
];

const functions = {
  get_weather: {
    description: "Get the current weather for a city.",
    params: {
      type: "object" as const,
      properties: {
        location: { type: "string" as const },
      },
      required: ["location"],
    },
  },
};

describe("node-llama-cpp native function calling", () => {
  const timeout = 10 * 60 * 1000;
  let llama: Awaited<ReturnType<typeof getLlama>> | undefined;

  afterAll(async () => {
    await llama?.dispose();
  });

  for (const { label, url } of models) {
    it(
      label,
      async () => {
        llama ??= await getLlama();
        const modelPath = await resolveModelFile(url, "./models");
        const model = await llama.loadModel({ modelPath });
        const context = await model.createContext({ flashAttention: true });
        const sequence = context.getSequence();
        const chat = new LlamaChat({ contextSequence: sequence });

        const res = await chat.generateResponse(
          [{ type: "user", text: "What is the weather in San Francisco?" }],
          { functions, maxTokens: 200, seed: 42 }
        );

        const hasFnCalls = (res.functionCalls?.length ?? 0) > 0;

        console.log(`\n--- ${label} (wrapper: ${chat.chatWrapper.wrapperName}) ---`);
        console.log(`functionCalls: ${hasFnCalls ? JSON.stringify(res.functionCalls) : "NONE"}`);
        console.log(`response text: ${JSON.stringify(res.response.slice(0, 200))}`);
        if (!hasFnCalls && res.response) {
          console.log(`⚠ tool call embedded in text, not returned via functionCalls`);
        }

        chat.dispose({ disposeSequence: false });
        sequence.dispose();
        await context.dispose();
        await model.dispose();
        // expect(hasFnCalls).toBe(true);
      },
      timeout
    );
  }
});
```


```bash
[node-llama-cpp] A prebuilt binary was not found, falling back to using no GPU
stderr | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > FunctionGemma 270M
[node-llama-cpp] load: control-looking token:    212 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden

stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > FunctionGemma 270M

--- FunctionGemma 270M (wrapper: Gemma) ---
functionCalls: NONE
response text: "I cannot assist with retrieving weather information for San Francisco. My current capabilities are limited to calling specific functions as needed."
⚠ tool call embedded in text, not returned via functionCalls

stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > LFM2 1.2B Tool

--- LFM2 1.2B Tool (wrapper: ChatML) ---
functionCalls: NONE
response text: "I can help you with that, but I need the function to retrieve the weather information. Could you please provide the function call?\n\n```typescript\nfunction get_weather(params: {location: string});\n```\n"
⚠ tool call embedded in text, not returned via functionCalls

stderr | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > Qwen2.5 Coder 1.5B
[node-llama-cpp] load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden

stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > Qwen2.5 Coder 1.5B

--- Qwen2.5 Coder 1.5B (wrapper: Qwen) ---
functionCalls: NONE
response text: "{\"name\": \"get_weather\", \"arguments\": {\"location\": \"San Francisco\"}}"
⚠ tool call embedded in text, not returned via functionCalls

stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > Llama 3.2 1B

--- Llama 3.2 1B (wrapper: Llama 3.2 lightweight) ---
functionCalls: [{"functionName":"get_weather","params":{"location":"San Francisco"},"raw":["{\"name\": \"get_weather\", \"parameters\": {\"location\": \"San Francisco\"}}",{"type":"specialTokensText","value":"<|eot_id|>"}]}]
response text: ""

 ✓ packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts (4 tests) 14068ms
   ✓ node-llama-cpp native function calling (4)
     ✓ FunctionGemma 270M  1644ms
     ✓ LFM2 1.2B Tool  5359ms
     ✓ Qwen2.5 Coder 1.5B  3320ms
     ✓ Llama 3.2 1B  3742ms

 Test Files  1 passed (1)
      Tests  4 passed (4)
   Start at  16:04:15
   Duration  15.59s (transform 910ms, setup 1.07s, import 352ms, tests 14.07s, environment 0ms)
```
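The Qwen2.5 Coder output above is a well-formed JSON tool call that simply ended up in `res.response`. As a stopgap, something like the following could recover it from the text. This is only a sketch: `extractJsonToolCall` and `ParsedToolCall` are hypothetical names, not part of node-llama-cpp, and it assumes the model emits a single JSON object with a `name` plus either `arguments` (as Qwen did) or `parameters` (as Llama 3.2 did).

```ts
// Hypothetical fallback helper; not part of node-llama-cpp.
interface ParsedToolCall {
  functionName: string;
  params: Record<string, unknown>;
}

function extractJsonToolCall(
  text: string,
  knownFunctions: readonly string[]
): ParsedToolCall | undefined {
  // Take the outermost {...} span so surrounding prose is ignored.
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) return undefined;

  let parsed: unknown;
  try {
    parsed = JSON.parse(text.slice(start, end + 1));
  } catch {
    return undefined; // not valid JSON, so treat the response as plain text
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return undefined;
  }

  const obj = parsed as Record<string, unknown>;
  const name = obj.name;
  // Only accept calls to functions that were actually offered to the model.
  if (typeof name !== "string" || knownFunctions.indexOf(name) === -1) {
    return undefined;
  }

  // "arguments" and "parameters" both appear in the logs above.
  const args = obj.arguments ?? obj.parameters ?? {};
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    return undefined;
  }

  return { functionName: name, params: args as Record<string, unknown> };
}
```

In the test this could be called as `extractJsonToolCall(res.response, Object.keys(functions))` when `res.functionCalls` is empty. It would not help with the FunctionGemma or LFM2 responses above, since neither contains a JSON call.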

### The Solution

I expected `res.functionCalls` to be populated for every model. Is there a model known to work that I should test against?

### Considered Alternatives

I tried scanning the text stream for `<|tool_call_start|>[get_weather(location="San Francisco")]<|tool_call_end|>`. That works with the ONNX version, but this version strips the `<|tool_call_start|>` and `<|tool_call_end|>` tokens.
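For reference, the scan I have in mind looks roughly like this. It is a minimal sketch: `parsePythonicToolCalls` and `PythonicToolCall` are hypothetical names, not part of node-llama-cpp, and it only handles double-quoted string arguments with no nested parentheses. It tolerates the markers being absent, but it still needs the bracketed call itself to reach the text.

```ts
// Hypothetical helper; not part of node-llama-cpp.
interface PythonicToolCall {
  functionName: string;
  params: Record<string, string>;
}

function parsePythonicToolCalls(text: string): PythonicToolCall[] {
  // Drop the markers when present; they may already have been stripped upstream.
  const body = text
    .replace("<|tool_call_start|>", "")
    .replace("<|tool_call_end|>", "")
    .trim();

  // Expect a bracketed list such as [get_weather(location="San Francisco")].
  const list = /^\[([\s\S]*)\]$/.exec(body);
  if (list === null) return [];

  const calls: PythonicToolCall[] = [];
  const callPattern = /([A-Za-z_]\w*)\(([^()]*)\)/g;
  let call: RegExpExecArray | null;
  while ((call = callPattern.exec(list[1])) !== null) {
    const params: Record<string, string> = {};
    // Assumption: every argument is written as name="value".
    const argPattern = /([A-Za-z_]\w*)\s*=\s*"([^"]*)"/g;
    let arg: RegExpExecArray | null;
    while ((arg = argPattern.exec(call[2])) !== null) {
      params[arg[1]] = arg[2];
    }
    calls.push({ functionName: call[1], params });
  }
  return calls;
}
```

Plain-text replies such as the FunctionGemma refusal above yield an empty array, so the caller can fall back to treating the response as ordinary text.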

### Additional Context

_No response_

### Related Features to This Feature Request

- [ ] Metal support
- [ ] CUDA support
- [ ] Vulkan support
- [ ] Grammar
- [x] Function calling

### Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, but I don't know how to start. I would need guidance.
