
Evaluation

Open-loop Evaluation

Open-loop evaluation validates the model's predictive capabilities under fixed historical conditions.

Configuration

  1. Update the drivepi config file and drivemoe config file (set your dataset and checkpoint paths).
  2. **Important:** the model must be trained with `horizon=20`, i.e., predicting the next 20 trajectory points.
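As a sanity check on the `horizon=20` requirement, open-loop evaluation compares 20 predicted future waypoints against ground truth. The sketch below illustrates the idea with an average displacement error (ADE) helper; the function name, the toy trajectories, and the (x, y) waypoint format are illustrative assumptions, not the repository's actual API:

```python
import math

HORIZON = 20  # the model must predict the next 20 trajectory points

def average_displacement_error(pred, gt):
    """Mean L2 distance between predicted and ground-truth waypoints."""
    assert len(pred) == len(gt) == HORIZON, "open-loop eval expects horizon=20"
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists)

# Toy example: a straight ground-truth path, prediction offset by 0.1 m laterally.
gt = [(0.5 * t, 0.0) for t in range(HORIZON)]
pred = [(x, y + 0.1) for x, y in gt]
print(average_displacement_error(pred, gt))
```

A model trained with a different horizon would fail the length assertion here, which is the kind of mismatch the note above warns about.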

Run Evaluation

```bash
bash script/evaluation/open_loop_drivepi0.sh
bash script/evaluation/open_loop_drivemoe.sh
```
**Table 1: Statistics of Dataset Distribution and Camera Router Performance.** Samples and Ratio describe the training set; Precision, Recall, F1-score, and Support are evaluation results on the test set.

| Camera Position | Samples | Ratio (%) | Precision | Recall | F1-score | Support |
| --- | ---: | ---: | ---: | ---: | ---: | ---: |
| Front Left | 35,105 | 15.58 | 0.83 | 0.89 | 0.86 | 1,300 |
| Front Right | 8,865 | 3.93 | 0.79 | 0.81 | 0.80 | 666 |
| Back | 161,717 | 71.76 | 0.95 | 0.92 | 0.93 | 8,996 |
| Back Left | 13,671 | 6.07 | 0.77 | 0.72 | 0.75 | 1,120 |
| Back Right | 5,990 | 2.66 | 0.39 | 0.90 | 0.54 | 226 |
| Total / Macro Avg | 225,348 | 100.00 | 0.75 | 0.85 | 0.78 | 12,308 |

Overall accuracy: **0.89**
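The Ratio column in Table 1 is simply each camera position's share of the 225,348 training samples. A quick check, with the sample counts copied from the table:

```python
# Per-class training sample counts from Table 1 (camera router).
counts = {
    "Front Left": 35_105,
    "Front Right": 8_865,
    "Back": 161_717,
    "Back Left": 13_671,
    "Back Right": 5_990,
}
total = sum(counts.values())
ratios = {k: round(100 * v / total, 2) for k, v in counts.items()}
print(total)            # 225348
print(ratios["Back"])   # 71.76
```

The strong class imbalance visible here (Back alone is ~72% of the data) helps explain the low precision on the rare Back Right class.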
**Table 2: Statistics of Dataset Distribution and Action Router Performance.** Samples and Ratio describe the training set; Precision, Recall, F1-score, and Support are evaluation results on the test set.

| Scenario | Samples | Ratio (%) | Precision | Recall | F1-score | Support |
| --- | ---: | ---: | ---: | ---: | ---: | ---: |
| Merging | 13,304 | 5.02 | 0.80 | 0.76 | 0.78 | 670 |
| Parking Exit | 1,036 | 0.46 | 1.00 | 1.00 | 1.00 | 40 |
| Overtaking | 29,921 | 13.28 | 0.87 | 0.91 | 0.89 | 1,377 |
| Emergency Brake | 12,064 | 5.35 | 0.76 | 0.67 | 0.71 | 690 |
| Giveway | 5,200 | 2.31 | 0.96 | 0.38 | 0.54 | 482 |
| Traffic Sign | 45,332 | 20.12 | 0.75 | 0.96 | 0.84 | 2,045 |
| Normal | 118,491 | 52.58 | 0.88 | 0.85 | 0.86 | 7,004 |
| Total / Macro Avg | 225,348 | 100.00 | 0.86 | 0.79 | 0.80 | 12,308 |

Overall accuracy: **0.84**
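The Total / Macro Avg row in Table 2 is the unweighted mean of the per-scenario metrics, which is why it differs from the support-weighted overall accuracy. Reproducing it from the per-scenario values in the table:

```python
# (precision, recall, f1) per scenario, copied from Table 2 (action router).
scenarios = {
    "Merging":         (0.80, 0.76, 0.78),
    "Parking Exit":    (1.00, 1.00, 1.00),
    "Overtaking":      (0.87, 0.91, 0.89),
    "Emergency Brake": (0.76, 0.67, 0.71),
    "Giveway":         (0.96, 0.38, 0.54),
    "Traffic Sign":    (0.75, 0.96, 0.84),
    "Normal":          (0.88, 0.85, 0.86),
}
n = len(scenarios)
# Macro average: mean over classes, each scenario weighted equally.
macro = [sum(m[i] for m in scenarios.values()) / n for i in range(3)]
print(" ".join(f"{x:.2f}" for x in macro))  # 0.86 0.79 0.80
```

Because every scenario counts equally, the rare Giveway class (recall 0.38) pulls the macro recall well below the overall accuracy of 0.84.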

Closed-loop Evaluation

For closed-loop testing, configure the drivepi config file and drivemoe config file (update paths, etc.). You can download our DrivePi0 and DriveMoE checkpoints here.

The following are our evaluation results on 8×H200 GPUs (DS: Driving Score; SR: Success Rate).

| Model | DS | SR | Json |
| --- | ---: | ---: | --- |
| DrivePi0-Base-bf16 | 55.85 | 30.00 | DrivePi0-Base-bf16 |
| DrivePi0-Base-fp32 | 65.85 | 42.27 | DrivePi0-Base-fp32 |
| DrivePi0-Full-fp32 | 67.41 | 44.09 | DrivePi0-Full-fp32 |
| DriveMoE-Base-bf16 | 74.22 | 48.64 | DriveMoE-Base-bf16 |
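DS and SR are aggregates over the evaluated routes. As a rough illustration of how per-route results might be reduced to these two numbers, here is a sketch over a made-up JSON payload; the record schema (`route`, `driving_score`, `success`) is a hypothetical illustration and does not match Bench2Drive's actual result format:

```python
import json

# Hypothetical per-route records; Bench2Drive's real JSON differs.
records = json.loads("""
[
  {"route": "r1", "driving_score": 80.0, "success": true},
  {"route": "r2", "driving_score": 55.0, "success": false},
  {"route": "r3", "driving_score": 90.0, "success": true},
  {"route": "r4", "driving_score": 72.0, "success": true}
]
""")

# DS: mean driving score; SR: percentage of routes completed successfully.
mean_ds = sum(r["driving_score"] for r in records) / len(records)
success_rate = 100 * sum(r["success"] for r in records) / len(records)
print(f"DS={mean_ds:.2f} SR={success_rate:.2f}")  # DS=74.25 SR=75.00
```

For the authoritative aggregation, use the Bench2Drive evaluation tools referenced below rather than a hand-rolled script like this.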

To set up the evaluation environment, clone the Bench2Drive repository at the same directory level as your DriveMoE repository:

```bash
cd ..
git clone https://github.com/Thinklab-SJTU/Bench2Drive.git
```

This will result in the following directory structure:

```
/parent_directory/
├── DriveMoE/
│   ├── ... (DriveMoE contents)
└── Bench2Drive/
    ├── ... (Bench2Drive contents)
```

Download and set up CARLA 0.9.15 (skip this step if CARLA is already installed on your device):

```bash
mkdir carla
cd carla
wget https://carla-releases.s3.us-east-005.backblazeb2.com/Linux/CARLA_0.9.15.tar.gz
tar -xvf CARLA_0.9.15.tar.gz
cd Import && wget https://carla-releases.s3.us-east-005.backblazeb2.com/Linux/AdditionalMaps_0.9.15.tar.gz
cd .. && bash ImportAssets.sh
```

Then link `team_code` as follows:

```bash
cp DriveMoE/script/evaluation/closed_loop_drivepi0.sh Bench2Drive/leaderboard/scripts
cp DriveMoE/script/evaluation/closed_loop_drivemoe.sh Bench2Drive/leaderboard/scripts
cp DriveMoE/script/evaluation/requirements.txt Bench2Drive
mkdir Bench2Drive/leaderboard/team_code
cd Bench2Drive/leaderboard/team_code
ln -s ../../../DriveMoE/src/agent/team_code/* ./
cd ../../   # Now you are in the Bench2Drive directory
```

Then set up the evaluation environment:

```bash
conda create -n DriveMoE_eval python=3.8
conda activate DriveMoE_eval
export CARLA_ROOT=YOUR_CARLA_PATH
echo "$CARLA_ROOT/PythonAPI/carla/dist/carla-0.9.15-py3.7-linux-x86_64.egg" >> YOUR_CONDA_PATH/envs/DriveMoE_eval/lib/python3.8/site-packages/carla.pth
pip install -r requirements.txt
```
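The `echo ... >> carla.pth` line works because Python's `site` module reads `.pth` files found in `site-packages` and appends each listed, existing path to `sys.path`, which is how the CARLA egg becomes importable. A self-contained demonstration of that mechanism (the temporary directories below are throwaway stand-ins for your real conda environment and CARLA install):

```python
import site
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the CARLA egg on disk.
    fake_egg = Path(tmp) / "carla-0.9.15-py3.7-linux-x86_64.egg"
    fake_egg.mkdir()

    # Stand-in for the conda env's site-packages directory.
    site_dir = Path(tmp) / "site-packages"
    site_dir.mkdir()
    # This mirrors what the `echo ... >> carla.pth` line does.
    (site_dir / "carla.pth").write_text(f"{fake_egg}\n")

    # The interpreter runs this automatically for real site-packages dirs.
    site.addsitedir(str(site_dir))
    print(str(fake_egg) in sys.path)  # True
```

Note the path inside the `.pth` file must exist when the interpreter starts, so install CARLA before activating the environment for the first time.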

Adjust `closed_loop_drivepi0.sh` and `closed_loop_drivemoe.sh` based on your device (GPU IDs, ports, paths), then run:

```bash
bash leaderboard/scripts/closed_loop_drivepi0.sh
bash leaderboard/scripts/closed_loop_drivemoe.sh
```

Follow the Bench2Drive documentation to use its evaluation tools.

Further Assistance

For questions regarding:

  • Benchmark implementation details
  • Dataset specifications
  • Evaluation metrics

Please refer to the Bench2Drive documentation or open an issue in the Bench2Drive issue tracker.