PP-DocLayoutV3 checkpoint/load-path mismatch in GLM-OCR
Summary
GLM-OCR loads PP-DocLayoutV3 through PPDocLayoutV3ForObjectDetection.from_pretrained(...).
With transformers==5.4.0, that load path reports the decoder detection heads as missing,
even though the published PaddlePaddle/PP-DocLayoutV3_safetensors checkpoint contains the
corresponding trained head weights under enc_* names.
As a result, the decoder heads are newly initialized instead of receiving the tied trained weights.
Evidence
Runtime startup reported missing decoder head keys such as:
model.decoder.class_embed.weight
model.decoder.class_embed.bias
model.decoder.bbox_embed.layers.0.weight
model.decoder.bbox_embed.layers.0.bias
...
Checkpoint inspection showed these relevant keys:
model.denoising_class_embed.weight
model.enc_score_head.weight
model.enc_score_head.bias
model.enc_bbox_head.layers.0.weight
model.enc_bbox_head.layers.0.bias
model.enc_bbox_head.layers.1.weight
model.enc_bbox_head.layers.1.bias
model.enc_bbox_head.layers.2.weight
model.enc_bbox_head.layers.2.bias
But the model class expects:
model.denoising_class_embed.weight
model.decoder.class_embed.weight
model.decoder.class_embed.bias
model.decoder.bbox_embed.layers.0.weight
model.decoder.bbox_embed.layers.0.bias
model.decoder.bbox_embed.layers.1.weight
model.decoder.bbox_embed.layers.1.bias
model.decoder.bbox_embed.layers.2.weight
model.decoder.bbox_embed.layers.2.bias
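The comparison between the two key lists above can be reproduced with a plain set difference. The following is a minimal sketch, not part of GLM-OCR or transformers: the helper name `missing_keys` is hypothetical, and the two lists are copied by hand from the listings above rather than read from the checkpoint file.

```python
def missing_keys(expected, checkpoint_keys):
    """Return the expected parameter names that the checkpoint does not contain."""
    present = set(checkpoint_keys)
    return [key for key in expected if key not in present]


# Keys observed in the checkpoint (copied from the inspection listing above).
checkpoint_keys = [
    "model.denoising_class_embed.weight",
    "model.enc_score_head.weight",
    "model.enc_score_head.bias",
    "model.enc_bbox_head.layers.0.weight",
    "model.enc_bbox_head.layers.0.bias",
    "model.enc_bbox_head.layers.1.weight",
    "model.enc_bbox_head.layers.1.bias",
    "model.enc_bbox_head.layers.2.weight",
    "model.enc_bbox_head.layers.2.bias",
]

# Keys the model class expects (copied from the listing above).
expected_keys = [
    "model.denoising_class_embed.weight",
    "model.decoder.class_embed.weight",
    "model.decoder.class_embed.bias",
    "model.decoder.bbox_embed.layers.0.weight",
    "model.decoder.bbox_embed.layers.0.bias",
    "model.decoder.bbox_embed.layers.1.weight",
    "model.decoder.bbox_embed.layers.1.bias",
    "model.decoder.bbox_embed.layers.2.weight",
    "model.decoder.bbox_embed.layers.2.bias",
]

missing = missing_keys(expected_keys, checkpoint_keys)
for key in missing:
    print(key)
```

With these inputs, every `model.decoder.*` head key is reported as missing, while `model.denoising_class_embed.weight` is present in both lists.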
The installed transformers implementation also declares these tied mappings:
_tied_weights_keys = {
"decoder.class_embed": "enc_score_head",
"decoder.bbox_embed": "enc_bbox_head",
}
Root cause
The checkpoint stores the trained detection heads under encoder-head names:
model.enc_score_head.*
model.enc_bbox_head.layers.*
But the object-detection wrapper expects decoder-head names:
model.decoder.class_embed.*
model.decoder.bbox_embed.layers.*
GLM-OCR's original load path did not alias the checkpoint keys before constructing
PPDocLayoutV3ForObjectDetection, so the decoder heads were treated as missing and
initialized from scratch.
Symptoms
- startup warnings about missing decoder head weights
- degraded or unstable layout detection in self-hosted OCR runs
- deprecation warning for PPDocLayoutV3ImageProcessorFast
Minimal fix direction
- load the PP-DocLayoutV3 config separately
- load model.safetensors directly
- alias:
model.enc_score_head.* -> model.decoder.class_embed.*
model.enc_bbox_head.layers.* -> model.decoder.bbox_embed.layers.*
- construct the model with the prepared state dict
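The aliasing step in the list above could look roughly like the sketch below. It is an illustration under assumptions, not the actual GLM-OCR patch: the helper name `alias_head_keys` and the `HEAD_ALIASES` table are hypothetical, the state dict is treated as a plain mapping from key to tensor, and the original `enc_*` entries are kept alongside the aliases on the assumption that the model still expects both because the heads are tied.

```python
# Prefix mapping taken from the aliasing rules listed above.
HEAD_ALIASES = {
    "model.enc_score_head.": "model.decoder.class_embed.",
    "model.enc_bbox_head.layers.": "model.decoder.bbox_embed.layers.",
}


def alias_head_keys(state_dict, aliases=HEAD_ALIASES):
    """Return a copy of state_dict with decoder-head aliases added.

    The original enc_* entries are kept (assumption: the heads are tied,
    so both names are expected). Existing decoder keys are not overwritten.
    """
    prepared = dict(state_dict)
    for key, value in state_dict.items():
        for src_prefix, dst_prefix in aliases.items():
            if key.startswith(src_prefix):
                alias = dst_prefix + key[len(src_prefix):]
                prepared.setdefault(alias, value)
    return prepared


# Stand-in values; a real state dict would hold tensors instead of strings.
example = {
    "model.denoising_class_embed.weight": "w_denoise",
    "model.enc_score_head.weight": "w_score",
    "model.enc_score_head.bias": "b_score",
    "model.enc_bbox_head.layers.0.weight": "w_bbox0",
}

prepared = alias_head_keys(example)
for key in sorted(prepared):
    print(key)
```

In a real run the input mapping would come from reading model.safetensors (for example with `safetensors.torch.load_file`), and the prepared mapping would then be supplied when constructing the model. How the prepared state dict is passed to PPDocLayoutV3ForObjectDetection depends on the installed transformers version and is left open here.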
Separately, switch from PPDocLayoutV3ImageProcessorFast to PPDocLayoutV3ImageProcessor
to remove the transformers 5.4.0 deprecation warning.