ONNX Unsupported export of operator adaptive_avg_pool2d #1266

@statiqueplasma

Description

Context

After training my model with a LightningModule, I tried exporting it to PyTorch and then to ONNX. I followed the training structure used in the git repo, saved the model's state dict with PyTorch, then loaded it back and tried to export it to ONNX (the code is below).
My LightningModule and Dataset classes are both identical to the ones described in the repo notebook. My library versions are as follows:

  • PyTorch: 2.4.0a0+f70bd71a48.nv24.06
  • Torch CUDA available: True
  • PyTorch Lightning: 2.6.1
  • Segmentation Models PyTorch: 0.5.0
  • NumPy: 1.26.4
  • OpenCV: 4.9.0

Error

Training, saving to PyTorch, and loading all work fine; I can even run inference with the loaded smp.PSPNet model. But when I try to export the PyTorch smp.PSPNet model to ONNX, I get a SymbolicValueError saying:

Unsupported: ONNX export of operator adaptive_avg_pool2d, output size that are not factor of input size. Please feel free to request support or submit a pull request on PyTorch GitHub: https://github.com/pytorch/pytorch/issues  [Caused by the value '469 defined in (%469 : Long(2, strides=[1], device=cpu) = onnx::Constant[value= 3  3 [ CPULongType{2} ]]()
)' (type 'Tensor') in the TorchScript graph. The containing node has kind 'onnx::Constant'.] 

    Inputs:
        Empty
    Outputs:
        #0: 469 defined in (%469 : Long(2, strides=[1], device=cpu) = onnx::Constant[value= 3  3 [ CPULongType{2} ]]()
    )  (type 'Tensor')
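
The `3  3` in the error is the requested pooling output size, not the image size. A minimal sketch of why that can trip the exporter, assuming (this is not stated in the issue) that the encoder hands the PSP module a 16x16 feature map for a 128x128 input (output stride 8) and that the pyramid pooling bins are (1, 2, 3, 6). The helper `adaptive_pool_windows` is hypothetical and only mirrors the usual floor/ceil window rule of adaptive average pooling along one axis:

```python
def adaptive_pool_windows(in_size, out_size):
    """Return the [start, end) window each output cell averages over (one axis)."""
    windows = []
    for i in range(out_size):
        start = (i * in_size) // out_size            # floor(i * in / out)
        end = -(-((i + 1) * in_size) // out_size)    # ceil((i + 1) * in / out)
        windows.append((start, end))
    return windows


# Assumed values, not taken from the issue: stride-8 features and PSP bins.
FEATURE_SIZE = 128 // 8
POOL_BINS = (1, 2, 3, 6)

for bins in POOL_BINS:
    divisible = FEATURE_SIZE % bins == 0
    print(f"bins={bins}: divisible={divisible}, "
          f"windows={adaptive_pool_windows(FEATURE_SIZE, bins)}")
```

Under these assumptions, bins 3 and 6 do not divide 16, so their windows overlap or vary in width and cannot be written as a plain average pool with kernel = stride = input / output, which appears to be the case the exporter rejects.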

Code used

The code below shows how I trained, saved, loaded, and exported the model.

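import torch
import segmentation_models_pytorch as smp

# trainer, model, train_loader, valid_loader and device are defined as in the repo notebook
trainer.fit(
    model,
    train_dataloaders=train_loader,
    val_dataloaders=valid_loader,
)
# Save only the weights of the wrapped smp model, not the whole LightningModule
torch.save(model.model.state_dict(), "model.pth")

# Rebuild a plain smp.PSPNet and load the saved weights into it
model = smp.PSPNet(
    encoder_name="mobilenet_v2",
    encoder_weights=None,
    classes=1,
    activation=None,
)
model.load_state_dict(torch.load("model.pth", map_location="cpu"))
model.to(device)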

images, masks = next(iter(train_loader))

dummy_input = images[:1]
dummy_input = dummy_input.float() 
dummy_input = dummy_input / 255.0 
dummy_input = dummy_input.to(device)
with torch.inference_mode():
    model.eval()
    output = model(dummy_input)

torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    export_params=True,
    opset_version=17,  # ONNX opset version to target
    do_constant_folding=True,  # whether to execute constant folding for optimization
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)

Discussion

The input used for the ONNX export is exactly the input used in training (128x128), since I used the same DataLoader, so I find it strange that the error shows a completely different number and that the export fails at all.
I assume I should use avg_pool2d instead of adaptive_avg_pool2d, because I read that ONNX does not know how to handle adaptive_avg_pool2d, but I cannot find how to change the pooling function in smp or in the LightningModule.
Does anyone have an idea how I can solve this issue?
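
One possible workaround that does not require changing the pooling function is to export with an input size whose feature map is divisible by every pooling output size. A rough sketch, assuming an encoder output stride of 8 and pooling bins (1, 2, 3, 6); both values are assumptions, not confirmed in this issue, and `compatible_input_sizes` is a hypothetical helper:

```python
import math


def compatible_input_sizes(min_size, max_size, output_stride=8, pool_bins=(1, 2, 3, 6)):
    """List input sizes whose feature map (size / output_stride) divides evenly by every bin."""
    # The feature map must be a multiple of lcm(bins), so the input must be
    # a multiple of output_stride * lcm(bins).
    step = output_stride * math.lcm(*pool_bins)
    first = -(-min_size // step) * step  # round min_size up to the next multiple
    return list(range(first, max_size + 1, step))


print(compatible_input_sizes(128, 256))  # [144, 192, 240]
```

With these assumed values, 128x128 is not in the list, while 144x144 or 192x192 would be. Whether a model trained at 128x128 keeps its accuracy at another resolution is a separate question that would need to be checked.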
