Commit 3d98616
Add FLOPs analysis tool for groundingdino
## Motivation
The existing `tools/analysis_tools/get_flops.py` does not support grounding / vision-language detection models (e.g., GroundingDINO, GroundingCLIP). These models require text inputs and have multi-modal architectures that cannot be traced end-to-end with `mmengine.analysis.get_model_complexity_info`.
This PR adds a dedicated FLOPs analysis tool that handles the unique architecture of grounding detection models, providing per-component FLOPs and parameter breakdowns.
## Modification
**New file: `tools/analysis_tools/get_flops_grounding.py`**
A script that computes per-component FLOPs and parameter counts for grounding detection models:
- **Vision Backbone**: Accurate FLOPs via `fvcore.nn.FlopCountAnalysis`
- **Text Encoder**: Estimated FLOPs based on model type (CLIP, BERT, etc.)
- **Neck (ChannelMapper)**: Estimated from config-driven channel/stride info
- **Transformer Encoder/Decoder**: Estimated from config-driven architecture params
- **Detection Head**: Parameter count
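As a rough illustration of the config-driven estimates listed above, the sketch below counts multiply-accumulates (MACs) for a ChannelMapper-style neck made of one convolution per feature level. The helper name `estimate_neck_macs`, its arguments, and the example channel/stride values are hypothetical and are not taken from `get_flops_grounding.py`; the script's actual formula may differ, for example in how it treats normalization layers or extra output levels.

```python
import math


def estimate_neck_macs(in_channels, strides, out_channels, input_hw, kernel_size=1):
    """Estimate MACs for one conv per feature level (hypothetical helper)."""
    height, width = input_hw
    total = 0
    for c_in, stride in zip(in_channels, strides):
        # Feature map size at this level, rounding up as padded downsampling would.
        h = math.ceil(height / stride)
        w = math.ceil(width / stride)
        # One MAC per (output pixel, output channel, input channel, kernel tap).
        total += h * w * out_channels * c_in * kernel_size * kernel_size
    return total


# Illustrative values only (Swin-T-like channels), not read from a real config.
neck_macs = estimate_neck_macs(
    in_channels=[192, 384, 768],
    strides=[8, 16, 32],
    out_channels=256,
    input_hw=(640, 640),
)
print(f"Neck: {neck_macs / 1e9:.3f} GMACs")
```

Whether a tool reports MACs or FLOPs (often taken as 2 x MACs) is a convention choice, so numbers from an estimate like this are only comparable to tools that use the same convention.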
Key design choices:
- Automatically disables `with_cp` (gradient checkpointing), which is incompatible with JIT tracing, while leaving the original config unmodified
- Reads architecture parameters (channels, layers, embed_dim, etc.) dynamically from the model config instead of hardcoding
- Uses `MMLogger`, consistent with existing mmdet tools
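One way to turn off `with_cp` without touching the original config is to work on a deep copy and walk the nested structure. The sketch below assumes the config behaves like nested dicts and lists; the helper name `disable_with_cp` and the example config are hypothetical, and the script itself may implement this differently.

```python
import copy


def disable_with_cp(cfg):
    """Return a deep copy of a nested config with every `with_cp` set to False."""
    cfg = copy.deepcopy(cfg)

    def _walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "with_cp":
                    # Overwriting a value for an existing key is safe during iteration.
                    node[key] = False
                else:
                    _walk(value)
        elif isinstance(node, (list, tuple)):
            for item in node:
                _walk(item)

    _walk(cfg)
    return cfg


# Hypothetical config fragment, for illustration only.
original = {
    "backbone": {"type": "SwinTransformer", "with_cp": True},
    "encoder": {"layers": [{"with_cp": True}, {"with_cp": True}]},
}
patched = disable_with_cp(original)
print(patched["backbone"]["with_cp"], original["backbone"]["with_cp"])
```

Copying first keeps the caller's config usable for training afterwards, at the cost of duplicating the config in memory, which is negligible for typical config sizes.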
**New file: `tests/test_tools/test_get_flops_grounding.py`**
41 unit tests covering all helper functions and config readers.
## BC-breaking
No. This PR only adds new files and does not modify any existing code.
## Use cases
```bash
# Basic usage
python tools/analysis_tools/get_flops_grounding.py \
configs/mm_grounding_dino/grounding_dino_swin-t_finetune_8xb4_20e_cat.py
# Custom input shape
python tools/analysis_tools/get_flops_grounding.py <config> --shape 640 640
# Specify number of classes for text encoder FLOPs estimation
python tools/analysis_tools/get_flops_grounding.py <config> --num-classes 801
```
Parent cfd5d3a, commit 3d98616
2 files changed
Lines changed: 851 additions & 0 deletions