Skip to content

feat: add support for multimodal projector in GGUF model building#826

Merged
ericcurtin merged 1 commit into
mainfrom
fix-mmproj-for-hf
Apr 3, 2026
Merged

feat: add support for multimodal projector in GGUF model building#826
ericcurtin merged 1 commit into
mainfrom
fix-mmproj-for-hf

Conversation

@ilopezluna
Copy link
Copy Markdown
Contributor

@ilopezluna ilopezluna commented Apr 2, 2026

Include mmproj file when building a model from Hugging Face.

fixes #822


curl --request POST \
  --url http://localhost:13434/engines/v1/chat/completions \
  --header 'content-type: application/json' \
  --data '{
  "model": "huggingface.co/ggml-org/ultravox-v0_5-llama-3_1-8b-gguf:Q4_K_M",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "transcribe this audio"
        },
        {
          "type": "input_audio",
          "input_audio": {
            "data": "//PoxAB8RA5OX53xAkRAKBwMBQLRYMhmLDEXQQI0CT8QFQNMawxMYiQtFQClEDCgjDHhCzjtkQuCpb4tQY+IgZ3bGZtttcm+GnGYNIBBgRAAGB+AuYK4N4wICLCPmW0GqZsapZtCnzmq4V4AABgQA2YGoAqQ5gXgWDoBRgVBQCQWxgXAXgYEswTQHR4At2IpfL935ePAKpgGACAAvlkDP2ZrfMBkAdpq4kYTARAHCoABgWAROIBgDPhUGuotB/GkF1EII2i6BgGwwAEVAAMAoARpqC8TRUMFcD2Ly6AIPTnuLEMAkBgwVALjBsBeMEABxWAwUgSDAIAPMBEAMwLAPy65gHgDmBgBALAOPIYDYBYVARMB0AdoKYYYAwYAIAYNANTcMBQAoEAEmAcApBRg+g5mCmBGIgATAPBFMEsBUwTwMzAXAHfMwNQKTAPAPDAHwcCoCAGkHAwBNYRHhYBwWhhwBEPyQuuHAJwuSmAOAeEAKLBSmQR2zbhC81/ORKWnsDhhrjlrxWcBgI2+hCBiAOXzMGLoTHb681deaxoLMAUAFY5gHgCxaTQuIZUsmnpTXVsglpKonHlejAXAHXHOJ0QxnHJyafakpJ+CJAziA/izoImFwFIO/E37iEYij/0+8s8c/9YJAiAgADAHADa28k4sSA3vhE9GrcCw/lPpTEFNRQMACQgABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAB6jBZABd3gAADj6KbWk8H+MCSbKw3Jgvlxg+JpjWF5uKl4QJgiEcw5EIyCSY3E4IyvS0wFHwz3E8wrG0yzIU8E7Q5zK8xTFwwbE0HDmYaXhvZCqGoKGEgIFRyZzQxmZ2mXBOY1Aw4LDDyIN/F47SVzdzIMqAowELgCszjgFMvmMxiHzE4hMLicyaGQUaTCoDfuaSaIhgLAsuAFQSYRC4sMxISiQtMGi0ymWTKYvCoHMyjUwAJDIBIMbAhPIKgsdACDpgkFoVGLB8Fg+YTHpkoEGFxCFh0DBeYeECPyEBgQEGDxSCRijDSLJYGwBE4wOBjDYABwWLAMC4fXCFiYHEsuGCQcY2BIcQBIqGAhGYjD5iAKGNwOYgLplAAv2OgJEsCBUwsHDBILBQuEAiMnBcDHIw4AgsACgIlAJDkGY6OICSsgEh2FBOYfCwMBLcJuHE/0elvMKaw1UHBNFB9IdQDxxxH2V/AvvK9cPSJonarWZcyeYd2XQ3BLhUD0yvrpQK0hscP0UesPM0FgDjoAEb1VntaO5MPzDYnJpn4fd9EnS5isDTQSGQoAAEFzAwhBQLTQAQIdi1Arwvo4z6t9FoCcdw2/biq9fDTQ4NrsKBCFwGRDYIAiB7PFPPczALAJS4UAK3G7Sle95iVl+qKL00NXaWsmIKaigYAEhAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxABxxBZIBua3UAy9KUzYdYeFxCZJEQOExhYDGHg4bB7J5W0GV0QYdAhig3G9IQbDT53SunFlCZmoh0DsGmVOc6bZmZqnK0Ga7ABrYqmWUsZSNZeMwgDhYJlULjAfFQGOlAwYfTGYBMMDUmZgYazW4aNBM8w0XD5JMPDxo1KQjLilMbBA24QLviy5lAxhwzWwaFaIk+YIKg5cIAQKKgw4bI6DjJsEqERhHZtCBCdNInOgRVnMAEWNARfMuyIAxnwAJGGlBA0YFiQYSFggKBmHDlAcxIUmQsEX9HF/R1YUeDNzJiKZKgMLBwsAhE5pSCQiDiK6bJfUOCCxswBgmKo4IjrwAoWCQ8wgtMpUjOYEZE/DAYEgwNGxIIMMAzBAAdAbK/qVDxv2wWN3WXNJX0opEXta2XUQBMrAACNAhh4IECTV4CRXaQzqUsScKOypSqiemQTMxelkY6/ucCu1QwxfuSajv1pSzmXrJRxZK4Hxb2Fr7dJR+H2mlYcXmFQEmCEzR6BFCAxxTDjIRDANCVLW3LR0MKaE2N41VmiIpO+UB4sFpfoK1TFB0HCiwKBgkqhx0YKCDQQjWXlXmBgQLg6mSLCbSv2Gs8i0OL4h56926SbxTEFNRQMACQgABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAB19BY4BOd00gCrS9rgyDTAxEMTgsQDA0HNzlqJNvgM31dzOAbHRAYyXZqNHG8TwaPCBhMHmn1oaiThh8gmQXQatWOcLz8dTEgaBoaYkmwYvDcaECZBZg8FJiQQhekwbCcwtE8tUYSCeYRAkJECY9BoZQjoYgmEY3r8Y2hEYsnaZuryaPT4ba1aZqr8bjGkZCmSAQqMSALCAeBoFmCoFllgUEQdJB0cCuhaSYJcYowRIjkmjWizNGTDLTOjzQRUigKFb1TktU4iqIGCF6QI1CAIWDgEAgUZUYTJwoZDwhqCpsTpCFEA8s+utVJYcQNwaPMzTDI4hRmVAmICGXOm5FmDEIBCak2hg3Znx50Z5k4o07SAAAMFHBATAWIR8gpFNonBX8xH0zxcAhw40a5aaAqYQ+Y1CYdWHIk2n/SkVUWRLJAomXnZu8CKb+iwxE+Wui1JZZgRTvzzPonOOxYoYGgNmyuGTKnfSRxaTu3duS57aaNvtMSt4qaVxqYdWKwcytpaiDNbb4Sq1UoGwOU5bKJYoGmUwNEx3VzCMIoHSMMTnmHaL/Splj9MZZs3MOBgwWSDKhMYS0WFLvGADiEimQXbFCLuIcVGgsOgd9AcUTCfyOKFLEWLAsafeOQmnpbMWpUxBTUUDAAkIAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAB0zAI0AO7fcaG1AIUAUDAI3ph2SpjEhxkiG4yVAGjcysGkaCkwaG4xgFcLA8BhFMSxnMAyVAoRiMkzGlnDKqtT66qTfpcjBtgTE2UjUxyzaRVmsmXA1GWASGAIbmA4cgkDDAMjjEkFDDMCAgdjFANjDEUjCcRzJUlRIrTW8/TIsDjwdgDKlbjetyTIw4TdoxRpojFwBDFwAwgIQqCQCBMwxEIxcvSzGFI1MuCzUUBpoQOI3QIQTAEEVOjZQUysGMYDDBgsoC2ENGGAFMsEAAJJQIEIC37MBHBwKCDcxJCNTOTBCF5DTg0i3zKQwRiJh4NfJAIwV1OTKjThszB+N4vwgCNfbDSyMxs+NFLjJV438TN2OwcsklwZovmLFRkqsQioiIwhTHZ8wQ0MihzXkU2iKFGF5gQaAwaMJAxIqMqSTGBYwZrNEMTHyBREaAwACLpkCXgGD08Q+gJoJEbxNlwyk9To0S2hXloiTaSYuS92ZGqdQKITZRsrgm0XROrCGZdztDUiI7o7Hpf08ex8P0LFiTByoa+P1SF0sgHBCFceqNS0N0bpYFcN8K8XNclxOsfQpSeNwviFGgk1KRAtiKJsW00VWnamIPEzblsJwfCsQs60gjPPi8sOcBrEpiCmooGABIQAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAByRBYwBt800Am7elzDkQl9GYCJm2CfcMHAmBjsIcZfGLogAETDRMwkkAJURaYURjMjkBX5lwqBl85jBzsAHNIK0w/RjyJSN8L804rDXpaMXBQx6fAERzN4XM8FkwiDDDYqQDGBgKBSsYPD4YOjArTN7Gkxq7TcJKNpKw+2GzsFXNEYU5cCzIoDMcDsDAZOYwaDEjgcAgSTM8LABMyAw1qZIY0wdBgmZJ7AkOEDSAaELUlF1sajKOyX6mCH4ECjoZLN+RCEMaBMCQBVMWWtbNjKUGDAQUBqNjgJfxsyB8AYOQgS8bW4RezPPwURRWRILoFgOUcTGljLtwFjAxoza446IEAwsLMYHFqJn1xwGJsD5j3CKaNAkBbQHElGF4CFAEGggs6TIH+f5yGXvCsKncX7JAqcjSFlszX9QqprCR6/Ik9oCUtjc1yJ138kr+P/QMfdymbpDLPJxrVPYhDouhDbPU7lAmpP68cWcVsqqqLC+s5Q5DWJtwlBl9LSUqLAJg6TrGVyEQtK8wgAqizFDEo1xLQW8vd2WMLte5xkqBoEwVZRVDqKIghCllhfZwyYIhCniDB4QRayY9IY4i5S1k5FIq3InM6aVLZbGWuwK2SVUl9MQU1FAwAJCAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
            "format": "mp3"
          }
        }
      ]
    }
  ]
}'


{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "It sounds like the audio is a mix of Spanish and English words. Here's my attempt to transcribe it:\n\n\"Hola, cómo va? Vamos a...\"\n\nHowever, I'm not confident in my transcription, as the audio seems to be incomplete or distorted. If you could provide more context or information about the audio, I may be able to help you better."
      }
    }
  ],
  "created": 1775134234,
  "model": "model.gguf",
  "system_fingerprint": "b1-ec2b787",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 75,
    "prompt_tokens": 230,
    "total_tokens": 305,
    "prompt_tokens_details": {
      "cached_tokens": 229
    }
  },
  "id": "chatcmpl-QokLIDhJOqn3k7BRFBz44JVP5l2Cm7Xl",
  "timings": {
    "cache_n": 229,
    "prompt_n": 1,
    "prompt_ms": 25.463,
    "prompt_per_token_ms": 25.463,
    "prompt_per_second": 39.27267014884342,
    "predicted_n": 75,
    "predicted_ms": 876.973,
    "predicted_per_token_ms": 11.692973333333333,
    "predicted_per_second": 85.52144706849583
  }
}

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for multimodal projectors in GGUF models by updating the build pipeline to handle optional mmproj files and include them as specific layers in the model artifact. Comprehensive tests have been added to verify the manifest generation with and without these projectors. I have no feedback to provide.

@ilopezluna ilopezluna marked this pull request as ready for review April 2, 2026 12:52
@ilopezluna ilopezluna requested a review from a team April 2, 2026 12:52
@ilopezluna ilopezluna changed the title [WIP] feat: add support for multimodal projector in GGUF model building feat: add support for multimodal projector in GGUF model building Apr 2, 2026
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • In buildGGUFModelV01, failing hard when mmprojFile is non-nil but missing from localPaths may be brittle for partially downloaded repos; consider either validating earlier (where mmprojFile is discovered) or gracefully skipping the projector with a logged warning instead of returning an error.
  • The tests for buildGGUFModelV01 only assert that a MediaTypeMultimodalProjector layer exists or not; consider also asserting that the projector layer corresponds to the expected file (e.g., by checking annotations or layer count/ordering) to catch wiring mistakes between mmprojFile and the builder.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `buildGGUFModelV01`, failing hard when `mmprojFile` is non-nil but missing from `localPaths` may be brittle for partially downloaded repos; consider either validating earlier (where `mmprojFile` is discovered) or gracefully skipping the projector with a logged warning instead of returning an error.
- The tests for `buildGGUFModelV01` only assert that a `MediaTypeMultimodalProjector` layer exists or not; consider also asserting that the projector layer corresponds to the expected file (e.g., by checking annotations or layer count/ordering) to catch wiring mistakes between `mmprojFile` and the builder.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@ericcurtin ericcurtin merged commit 1d91357 into main Apr 3, 2026
14 checks passed
@ericcurtin ericcurtin deleted the fix-mmproj-for-hf branch April 3, 2026 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Mutimodal HF model for audio return input audio is not supported

2 participants