Hello!
I am currently training the BBox Predictor and Box2Video models and working on reproducing your results. I have a few questions regarding the training configurations:
- Could you provide details on the learning rate, learning rate scheduler, warm-up steps, and optimizer parameters (adam_beta1, adam_beta2, adam_weight_decay, adam_epsilon) used for training the BBox Predictor/Box2Video on the KITTI, vKITTI, and BDD100K datasets? I have been using the parameters recommended in the demo, but I get a very spiky learning curve.
- Based on the paper, the number of training epochs should be 5 for KITTI and vKITTI and 1 for BDD100K, with a train_batch_size of 1 for all datasets. Could you confirm whether this is correct, and which train_batch_size you actually used for each dataset?
- Could you share details about the custom split of the KITTI dataset that you used?
- Regarding the BDD100K dataset, I noticed that semantic maps can be used instead of bounding box annotations. Did you train your model using only bounding boxes, only semantic maps, or both? Which setup do the results in the paper correspond to?
Thank you for your help and guidance in reproducing your results!
Also, if you have pretrained weights available for any of these datasets, please let me know.
I understand that you're currently transitioning offices, but I look forward to hearing from you when possible. :)
Cheers