Dear authors,
Thanks for this great work!
I couldn't obtain good performance when running DepthAnything3 on images from ScanNet++.
For example, I uploaded the processed images (322x504) of this four-view sequence to the online demo, and the predictions are not consistent across views; see the misaligned walls.
When I compared the predicted cameras with the GT, I found that the predicted intrinsics have a large error: the GT fovX is 124 degrees, while the predicted fovX is 99 degrees.
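To put the gap in terms of focal length, here is a minimal sketch of the pinhole relation between fovX and fx. It assumes 504 is the image width in pixels and that both cameras are treated as ideal pinhole cameras. The helper names `fov_x_deg` and `fx_from_fov` are hypothetical and not part of the DA3 code.

```python
import math

def fov_x_deg(fx: float, width: float) -> float:
    """Horizontal field of view in degrees for a pinhole camera."""
    return math.degrees(2.0 * math.atan(width / (2.0 * fx)))

def fx_from_fov(fov_deg: float, width: float) -> float:
    """Focal length in pixels that yields the given horizontal FoV."""
    return width / (2.0 * math.tan(math.radians(fov_deg) / 2.0))

width = 504  # assumption: 504 is the width of the 322x504 images

fx_gt = fx_from_fov(124.0, width)   # focal length implied by the GT fovX
fx_pred = fx_from_fov(99.0, width)  # focal length implied by the predicted fovX

print(f"fx (GT): {fx_gt:.1f} px, fx (pred): {fx_pred:.1f} px")
print(f"ratio pred/GT: {fx_pred / fx_gt:.2f}")
```

Under these assumptions, a narrower predicted fovX corresponds to a longer predicted focal length than the GT one.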
This is surprising, as I assumed DA3 was trained on a large amount of ScanNet++ data or similar indoor scenes.
Thanks so much for your help.