| Method | Venue | Modality | Scheme | Data Fusion | Comm Mecha | Feat Fusion | Loss Func | Code |
|---|---|---|---|---|---|---|---|---|
| Cooper [1] | ICDCS'19 | LiDAR | E | Raw | - | - | - | - |
| F-Cooper [2] | SEC'19 | LiDAR | I | - | - | Trad | - | Linkn |
| Who2com [3] | ICRA'20 | Camera | I | - | Agent | Trad | - | - |
| When2com [4] | CVPR'20 | Camera | I | - | Agent | Trad | - | Linkn |
| V2VNet [5] | ECCV'20 | LiDAR | I | - | - | Graph | - | - |
| Coop3D [6] | TITS'20 | LiDAR | E, L | Raw, Out | - | - | - | Linkn |
| CoFF [7] | IoT'21 | LiDAR | I | - | - | Trad | - | - |
| DiscoNet [8] | NeurIPS'21 | LiDAR | I | Raw | - | Graph | - | Linkc |
| MP-Pose [9] | RAL'22 | Camera | I | - | - | Graph | - | - |
| FPV-RCNN [10] | RAL'22 | LiDAR | I | Out | Feat | Trad | - | Linkn |
| AttFusion [11] | ICRA'22 | LiDAR | I | - | - | Atten | - | Linko |
| TCLF [12] | CVPR'22 | LiDAR | L | Out | - | - | - | Linkv |
| COOPERNAUT [13] | CVPR'22 | LiDAR | I | - | - | Atten | - | Linkn |
| V2X-ViT [14] | ECCV'22 | LiDAR | I | - | - | Atten | - | Linko |
| CRCNet [15] | MM'22 | LiDAR | I | - | - | Atten | Redund | - |
| CoBEVT [16] | CoRL'22 | Camera | I | - | - | Atten | - | Linko |
| Where2comm [17] | NeurIPS'22 | LiDAR | I | - | Agent, Feat | Atten | - | Linko |
| Double-M [18] | ICRA'23 | LiDAR | E, I, L | - | - | - | Uncert | Linkc |
| CoCa3D [19] | CVPR'23 | Camera | I | - | Feat | Trad | - | Linko |
| HM-ViT [20] | ICCV'23 | LiDAR, Camera | I | - | - | Atten | - | Linko |
| CORE [21] | ICCV'23 | LiDAR | I | Raw | Feat | Atten | Recon | Linko |
| SCOPE [22] | ICCV'23 | LiDAR | I | - | - | Atten (ST) | - | Linko |
| TransIFF [23] | ICCV'23 | LiDAR | I | - | Feat | Atten | - | - |
| UMC [24] | ICCV'23 | LiDAR | I | - | Feat | Graph | - | Linkc |
| HYDRO-3D [25] | TIV'23 | LiDAR | I | - | - | Atten (ST) | - | - |
| MKD-Cooper [26] | TIV'23 | LiDAR | I | Raw | - | Atten | - | Linko |
| V2VFormer++ [27] | TITS'23 | LiDAR, Camera | I | - | - | Atten | - | - |
| How2comm [28] | NeurIPS'23 | LiDAR | I | - | Feat | Atten (ST) | - | Linko |
| What2comm [29] | MM'23 | LiDAR | I | - | Feat | Atten (ST) | - | - |
| BM2CP [30] | CoRL'23 | LiDAR, Camera | I | - | Feat | Atten | - | Linko |
| DI-V2X [31] | AAAI'24 | LiDAR | I | Raw | Feat | Atten | - | Linko |
| QUEST [32] | ICRA'24 | Camera | I, L | - | Feat | Atten | - | - |
| CMiMC [33] | AAAI'24 | LiDAR | I | - | Feat | - | ✔️ | Linkc |
| Select2Col [34] | TVT'24 | LiDAR | I | - | Agent | Atten (ST) | - | Linko |
| MOT-CUP [35] | RAL'24 | LiDAR | E, I, L | - | - | - | Uncert | Linkc |
| CodeFilling [36] | CVPR'24 | LiDAR, Camera | I | - | Feat | Trad | - | Linko |
| IFTR [37] | ECCV'24 | Camera | I | - | Feat | Atten | - | Linko |
| VIMI [38] | ICRA'24 | Camera | I | - | - | Atten | - | Linkv |
| CPPC [39] | ICLR'25 | LiDAR | I | - | Feat | Trad | - | - |
| CoSDH [40] | CVPR'25 | LiDAR | I, L | - | Feat | Trad | - | Linko |
| CoGMP [41] | CVPR'25 | Camera | I | - | Feat | Trad | - | - |
| CoST [42] | ICCV'25 | LiDAR | I | - | Feat | Atten (ST) | - | - |
| V2XPnP [43] | ICCV'25 | LiDAR | I | - | Feat | Atten (ST) | - | Linko |
| mmCooper [44] | ICCV'25 | LiDAR | I, L | - | Feat | Atten | - | Linko |
| GSCOOP [45] | ICCV'25 | Camera | I | - | Gaussian | Trad | - | - |
| RayFusion [46] | NeurIPS'25 | Camera | I | - | Feat+Ray | Trad | - | - |
| InfoCom [47] | AAAI'26 | LiDAR | I | - | Feat | Trad | - | - |
| SparseCoop [48] | AAAI'26 | Camera | I | - | Feat | Atten (ST) | - | - |
| ZeRCP [49] | AAAI'26 | Camera | I | - | Feat | Atten (ST) | - | - |
| DiGS-CP [50] | AAAI'26 | LiDAR | I | - | Feat | Atten (ST) | - | Linko |
| CooperTrim [51] | ICLR'26 | LiDAR, Camera | I | - | Feat | Atten | - | - |
| RDcomm [52] | ICLR'26 | LiDAR, Camera | I | - | Feat | Trad | - | - |
| SiMO [53] | ICLR'26 | LiDAR, Camera | I | - | - | Atten | - | Linko |

Note:
- Schemes include early (E), intermediate (I) and late (L) collaboration.
- Data Fusion: data fusion includes raw data fusion (Raw) and output fusion (Out).
- Comm Mecha: communication mechanism includes agent selection (Agent) and feature selection (Feat).
- Feat Fusion: feature fusion can be divided into traditional (Trad), graph-based (Graph) and attention-based (Atten) feature fusion. (ST: spatio-temporal)
- Loss Func: the loss function can be used for uncertainty estimation (Uncert), redundancy minimization (Redund), reconstruction (Recon), etc.
- Code: the letter after "Link" marks the code framework: o (OpenCOOD), v (VIC3D), c (CoPerception), n (non-mainstream framework).

- Cooper: Cooperative perception for connected autonomous vehicles based on 3D point clouds (ICDCS'19) [pdf]
- F-Cooper: Feature based cooperative perception for autonomous vehicle edge computing system using 3D point clouds (SEC'19) [pdf]
- Who2com: Collaborative perception via learnable handshake communication (ICRA'20) [pdf]
- When2com: Multi-agent perception via communication graph grouping (CVPR'20) [pdf]
- V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction (ECCV'20) [pdf]
- Cooperative perception for 3D object detection in driving scenarios using infrastructure sensors (TITS'20) [pdf]
- CoFF: Cooperative spatial feature fusion for 3-D object detection on autonomous vehicles (IoT'21) [pdf]
- Learning distilled collaboration graph for multi-agent perception (NeurIPS'21) [pdf] [code]
- Multi-Robot Collaborative Perception with Graph Neural Networks (RAL'22) [pdf]
- Keypoints-Based Deep Feature Fusion for Cooperative Vehicle Detection of Autonomous Driving (RAL'22) [pdf] [code]
- OPV2V: An open benchmark dataset and fusion pipeline for perception with vehicle-to-vehicle communication (ICRA'22) [pdf] [code]
- DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection (CVPR'22) [pdf] [code]
- COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles (CVPR'22) [pdf] [code]
- V2X-ViT: Vehicle-to-everything cooperative perception with vision transformer (ECCV'22) [pdf] [code]
- Complementarity-Enhanced and Redundancy-Minimized Collaboration Network for Multi-agent Perception (MM'22) [pdf]
- CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers (CoRL'22) [pdf] [code]
- Where2comm: Communication-Efficient Collaborative Perception via Spatial Confidence Maps (NeurIPS'22) [pdf] [code]
- Uncertainty Quantification of Collaborative Detection for Self-Driving (ICRA'23) [pdf] [code]
- Collaboration Helps Camera Overtake LiDAR in 3D Detection (CVPR'23) [pdf] [code]
- HM-ViT: Hetero-modal Vehicle-to-Vehicle Cooperative perception with vision transformer (ICCV'23) [pdf]
- CORE: Cooperative Reconstruction for Multi-Agent Perception (ICCV'23) [pdf] [code]
- Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception (ICCV'23) [pdf] [code]
- TransIFF: An Instance-Level Feature Fusion Framework for Vehicle-Infrastructure Cooperative 3D Detection with Transformers (ICCV'23) [pdf]
- UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework (ICCV'23) [pdf] [code]
- HYDRO-3D: Hybrid Object Detection and Tracking for Cooperative Perception Using 3D LiDAR (TIV'23) [pdf]
- MKD-Cooper: Cooperative 3D Object Detection for Autonomous Driving via Multi-teacher Knowledge Distillation (TIV'23) [pdf] [code]
- V2VFormer++: Multi-Modal Vehicle-to-Vehicle Cooperative Perception via Global-Local Transformer (TITS'23) [pdf]
- How2comm: Communication-Efficient and Collaboration-Pragmatic Multi-Agent Perception (NeurIPS'23) [pdf] [code]
- What2comm: Towards Communication-efficient Collaborative Perception via Feature Decoupling (MM'23) [pdf]
- BM2CP: Efficient Collaborative Perception with LiDAR-Camera Modalities (CoRL'23) [pdf] [code]
- DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection (AAAI'24) [pdf] [code]
- QUEST: Query Stream for Practical Cooperative Perception (ICRA'24) [pdf]
- What Makes Good Collaborative Views? Contrastive Mutual Information Maximization for Multi-Agent Perception (AAAI'24) [pdf] [code]
- Select2Col: Leveraging Spatial-Temporal Importance of Semantic Information for Efficient Collaborative Perception (TVT'24) [pdf] [code]
- Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation (RAL'24) [pdf] [code]
- Communication-Efficient Collaborative Perception via Information Filling with Codebook (CVPR'24) [pdf] [code]
- IFTR: An Instance-Level Fusion Transformer for Visual Collaborative Perception (ECCV'24) [pdf] [code]
- EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection (ICRA'24) [pdf] [code]
- Point Cluster: A Compact Message Unit for Communication-Efficient Collaborative Perception (ICLR'25) [pdf]
- CoSDH: Communication-Efficient Collaborative Perception via Supply-Demand Awareness and Intermediate-Late Hybridization (CVPR'25) [pdf] [code]
- Generative Map Priors for Collaborative BEV Semantic Segmentation (CVPR'25) [pdf]
- CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective (ICCV'25) [pdf]
- V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction (ICCV'25) [pdf] [code]
- mmCooper: A Multi-agent Multi-stage Communication-efficient and Collaboration-robust Cooperative Perception Framework (ICCV'25) [pdf] [code]
- Communication-Efficient Multi-Vehicle Collaborative Semantic Segmentation via Sparse 3D Gaussian Sharing (ICCV'25) [pdf]
- RayFusion: Ray Fusion Enhanced Collaborative Visual Perception (NeurIPS'25) [pdf]
- InfoCom: Kilobyte-Scale Communication-Efficient Collaborative Perception with Information Bottleneck (AAAI'26) [pdf]
- SparseCoop: Cooperative Perception with Kinematic-Grounded Queries (AAAI'26) [pdf]
- ZeRCP: Towards Communication-Efficient Collaborative Perception and Future Scene Prediction via Request-Free Spatial Filtering (AAAI'26) [pdf]
- From Discriminative to Generative: A Diffusion-Based Paradigm for Multi-Agent Collaborative Perception (AAAI'26) [pdf] [code]
- CooperTrim: Adaptive Data Selection for Uncertainty-Aware Cooperative Perception (ICLR'26) [pdf]
- Rate-Distortion Optimized Communication for Collaborative Perception (ICLR'26) [pdf]
- SiMO: Single-Modality-Operable Multimodal Collaborative Perception (ICLR'26) [pdf] [code]