Add ERNIE Image architecture support #2459

Merged: Acly merged 1 commit into Acly:main from verigen:feature/ernie_image_support on May 2, 2026

@eruanno123 eruanno123 commented Apr 19, 2026

Adds support for ERNIE Image by Baidu - a diffusion model using a Ministral-3B text encoder and the Flux 2 VAE.

@Acly, I am sharing this PR as is, based on what I quickly developed for my needs (as I was a bit impatient to wait :D). If it passes initial review, I can spend some time polishing it (items mentioned in Todos).

Model files:

  • ernie-image.safetensors - base model (CFG-guided)
  • ernie-image-turbo.safetensors - distilled variant (CFG=1.0)

Architecture highlights:

  • Text encoder: Ministral-3-3B (CLIPLoader, type flux2)
  • VAE: shared with Flux 2 Klein (flux2-vae.safetensors)
  • Latent space: EmptyFlux2LatentImage

Requires a patch to comfyui-tooling-nodes to report base_model: "ernie-image" for ERNIE diffusion models (currently returns "unknown"): Acly/comfyui-tooling-nodes#63
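
The kind of lookup this patch implies can be sketched as follows. This is a minimal illustration, not the actual comfyui-tooling-nodes code: the dictionary name `BASE_MODEL_BY_CLASS` and the helper `base_model_for` are hypothetical. Only the `"ErnieImage": "ernie-image"` entry (quoted later in this conversation) and the `"unknown"` fallback come from this thread.

```python
# Hypothetical sketch: the names here are illustrative and are NOT taken
# from comfyui-tooling-nodes. Only the "ErnieImage" -> "ernie-image" pair
# and the "unknown" fallback are mentioned in this conversation.
BASE_MODEL_BY_CLASS = {
    "ErnieImage": "ernie-image",
}


def base_model_for(class_name: str) -> str:
    """Return the base_model string for a model class, or "unknown"."""
    return BASE_MODEL_BY_CLASS.get(class_name, "unknown")


print(base_model_for("ErnieImage"))      # ernie-image
print(base_model_for("SomeOtherModel"))  # unknown
```

Without the extra entry, the lookup falls through to `"unknown"`, which matches the behaviour described above.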

Preview:

[image: ernie image]

@hansnolte

Hi eruanno123,

I am also interested in ERNIE.

I updated krita-ai-diffusion from GitHub.
I added "ErnieImage": "ernie-image", to ... ComfyUI\custom_nodes\comfyui-tooling-nodes\api.py
I copied ernie-image-turbo-nvfp4.safetensors to diffusion_models and ministral-3-3b.safetensors (is there also a GGUF version of this?) to text_encoders.
I have also updated ComfyUI and all nodes.
But the model is not listed in the Model Checkpoint drop-down.
Have I forgotten something?

@eruanno123
Author

@hansnolte I tried the NVFP4 variant and it seems to work well on my side.

In your setup:

  1. Does the official ComfyUI workflow work with NVFP4?
  2. Did you update api.py first and then ComfyUI/Nodes? Did you check whether the changes to api.py are still there? I had the impression that ComfyUI might overwrite or clean up untracked changes in custom nodes.

@hansnolte

Hi eruanno123,
many thanks for your answer.

  1. Yes
  2. The changes in the api.py are still there.

I have also copied ernie-image-turbo-Q6_K.gguf to the diffusion_models folder.
The GGUF is listed; the NVFP4 is not.

The GGUF gives me this error:

Server error: mat1 and mat2 shapes cannot be multiplied (512x4096 and 3072x4096)
TIPS: If you have any "Load CLIP" or "*CLIP Loader" nodes in your workflow connected to this sampler node make sure the correct file(s) and type is selected.
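
For readers unfamiliar with this message: a 2-D matrix product needs the column count of the first matrix to equal the row count of the second, and here they are 4096 and 3072. The sketch below only reproduces that shape check in plain Python. The helper name `can_matmul` is made up for illustration, and the sketch does not identify which file or node in the workflow produces the mismatched tensor.

```python
def can_matmul(mat1_shape: tuple[int, int], mat2_shape: tuple[int, int]) -> bool:
    """A 2-D matrix product requires mat1 columns == mat2 rows."""
    return mat1_shape[1] == mat2_shape[0]


# Shapes copied from the error message above.
mat1 = (512, 4096)
mat2 = (3072, 4096)

print(can_matmul(mat1, mat2))  # False, because 4096 != 3072
```

One possible reading, in line with the TIPS text, is that the conditioning tensor has a different feature width than the layer consuming it expects, so the text encoder file and loader type are worth double-checking.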

@eruanno123
Author

eruanno123 commented Apr 21, 2026

Another thing to check (this is how I verified the tooling nodes work):

curl http://localhost:8188/api/etn/model_info/diffusion_models | python3 -m json.tool | grep -A5 ernie

It should return something like this:

    "External/ERNIE/ernie-image-turbo-fp8.safetensors": {
        "base_model": "ernie-image",
        "is_inpaint": false
    },
    "External/ERNIE/ernie-image-turbo-nvfp4.safetensors": {
        "base_model": "ernie-image",
        "is_inpaint": false
    },
    "External/ERNIE/ernie-image-turbo.safetensors": {
        "base_model": "ernie-image",
        "is_inpaint": false
    },
    "External/Nunchaku/svdq-fp4_r128-z-image-turbo.safetensors": {
        "base_model": "z-image",
        "is_inpaint": false,
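
If grep is not available (for example on Windows), the same check can be sketched in Python. This assumes the endpoint returns a JSON object mapping file names to entries with a `base_model` key, as in the sample output above. The helper names `filter_by_base_model` and `fetch_model_info` are hypothetical, and the URL is the one from the curl command.

```python
import json
from urllib.request import urlopen

# Endpoint from the curl command above; adjust host and port to your setup.
URL = "http://localhost:8188/api/etn/model_info/diffusion_models"


def filter_by_base_model(model_info: dict, base_model: str) -> dict:
    """Keep only the entries whose "base_model" equals the given value."""
    return {
        name: info
        for name, info in model_info.items()
        if info.get("base_model") == base_model
    }


def fetch_model_info(url: str = URL) -> dict:
    """Download and parse the model info (requires a running ComfyUI)."""
    with urlopen(url) as response:
        return json.load(response)


# Offline demonstration using two entries from the sample output above.
sample = {
    "External/ERNIE/ernie-image-turbo.safetensors": {
        "base_model": "ernie-image",
        "is_inpaint": False,
    },
    "External/Nunchaku/svdq-fp4_r128-z-image-turbo.safetensors": {
        "base_model": "z-image",
        "is_inpaint": False,
    },
}
ernie_only = filter_by_base_model(sample, "ernie-image")
print(json.dumps(ernie_only, indent=4))

# Against a live server:
# print(json.dumps(filter_by_base_model(fetch_model_info(), "ernie-image"), indent=4))
```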

I quickly checked the GGUF variants, and they are not detected on my side. After taking a closer look, I think the tooling nodes need further updates - there is an inspect_gguf procedure that is supposed to detect specific metadata for Ernie models.

EDIT: it is actually detected; I was checking the wrong endpoint (it should be unet_gguf). However, it is identified as a "WAN" model, which confirms my conclusion that inspect_gguf needs to be updated too.

@eruanno123
Author

The problem is that GGUF models (at least those from Unsloth) explicitly report the architecture type as "wan":

[image]

This means the detection logic needs to hook into some distinguishable layer or feature specific to ERNIE. At the moment, I have no knowledge of how ERNIE works, so I’m afraid I have to leave this PR as “no GGUF support yet” :(
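
The idea of hooking into a distinguishing layer could look roughly like the sketch below. It is purely illustrative: the function `detect_base_model` is hypothetical, it is not the real inspect_gguf, and the marker prefix `hypothetical_ernie_layer.` is a placeholder, since no actual ERNIE-specific tensor name has been identified in this thread.

```python
def detect_base_model(
    reported_arch: str,
    tensor_names: list[str],
    ernie_markers: tuple[str, ...],
) -> str:
    """Prefer tensor-name evidence over the architecture string in the file.

    Falls back to the reported architecture when no marker matches.
    """
    for name in tensor_names:
        # str.startswith accepts a tuple of prefixes.
        if name.startswith(ernie_markers):
            return "ernie-image"
    return reported_arch


# HYPOTHETICAL marker: a real one would have to be found by inspecting
# the tensor names of an actual ERNIE GGUF file.
markers = ("hypothetical_ernie_layer.",)

print(detect_base_model("wan", ["hypothetical_ernie_layer.0.weight"], markers))  # ernie-image
print(detect_base_model("wan", ["blocks.0.attn.weight"], markers))               # wan
```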

@Acly
Owner

Acly commented Apr 22, 2026

I don't have time to try the model in depth, and from what I can see it doesn't immediately seem like a (significant) improvement on Flux2/ZIT. But I think we can add basic support as sketched here.

For now, please remove the changes to models.json, server.py and download_models.py - these things are difficult to change later, and even if Ernie becomes more deeply integrated we might want to go with different model files (eg. fp8).

@hansnolte

Hi eruanno123,
many thanks for your deep investigation.

I have just managed to get both variants to run in ComfyUI.
Doesn't look bad.
But Acly is right: apart from the text rendering, I don't see any major improvement over Flux2 and ZIT.

Again, thank you for your help
Hans

@eruanno123
Author

@hansnolte You’re welcome. I’m getting the impression that the base model is way easier to train, which is why I integrated it into my local workflow. It’s still not very clear to me what the proper way to train LoRAs for the ZIB model is.

@Acly, thanks for taking the time. I've reverted the changes you mentioned, aligned the “E” more precisely, and fixed the failing unit test.

@eruanno123 eruanno123 marked this pull request as ready for review April 23, 2026 16:51
@Acly
Owner

Acly commented May 1, 2026

Could you do a rebase please?

@eruanno123 eruanno123 force-pushed the feature/ernie_image_support branch from 7fae2c9 to b831db2 Compare May 2, 2026 15:52
@eruanno123 eruanno123 force-pushed the feature/ernie_image_support branch from b831db2 to 2b45074 Compare May 2, 2026 15:56
@Acly Acly merged commit 4f9928c into Acly:main May 2, 2026
1 of 2 checks passed
@eruanno123 eruanno123 deleted the feature/ernie_image_support branch May 2, 2026 23:26