(Im3D) xxx@viscam4:~/projects/ig_llm/rtx_3090/Implicit3DUnderstanding$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
(Im3D) xxx@viscam4:~/projects/ig_llm/rtx_3090/Implicit3DUnderstanding$ conda list torch
# packages in environment at /viscam/u/xxx/anaconda3/envs/Im3D:
#
# Name Version Build Channel
pytorch 1.1.0 cuda100py36he554f03_0
torchvision 0.3.0 cuda100py36h72fc40a_0
(Im3D) xxx@viscam4:~/projects/ig_llm/rtx_3090/Implicit3DUnderstanding$ nvidia-smi
Sun Jul 21 14:15:57 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.43.04 Driver Version: 515.43.04 CUDA Version: 11.7 |
(Im3D) xxx@viscam4:~/projects/ig_llm/rtx_3090/Implicit3DUnderstanding$ gpustat
viscam4.stanford.edu Sun Jul 21 14:16:15 2024 515.43.04
[0] NVIDIA GeForce RTX 3090 | 37'C, 0 % | 308 / 24576 MB |
Begin to resume from the last checkpoint.
Loading checkpoint from out/total3d/20110611514267/model_best.pth.
Warning: Could not find epoch in checkpoint!
Warning: Could not find min_loss in checkpoint!
Warning: Could not find step in checkpoint!
set() subnet missed.
Checkpoint out/total3d/20110611514267/model_best.pth resumed.
Loading data.
Traceback (most recent call last):
File "main.py", line 42, in <module>
demo.run(cfg)
File "/viscam/projects/inv_engine/xxx/ig_llm/rtx_3090/Implicit3DUnderstanding/demo.py", line 147, in run
est_data = net(data)
File "/viscam/u/xxx/anaconda3/envs/Im3D/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/viscam/projects/inv_engine/xxx/ig_llm/rtx_3090/Implicit3DUnderstanding/models/total3d/modules/network.py", line 112, in forward
data['split'], data['rel_pair_counts'])
File "/viscam/u/xxx/anaconda3/envs/Im3D/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/viscam/projects/inv_engine/xxx/ig_llm/rtx_3090/Implicit3DUnderstanding/models/total3d/modules/object_detection.py", line 103, in forward
r_features = self.relnet(a_features, g_features, split, rel_pair_counts)
File "/viscam/u/xxx/anaconda3/envs/Im3D/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/viscam/projects/inv_engine/xxx/ig_llm/rtx_3090/Implicit3DUnderstanding/models/total3d/modules/relation_net.py", line 54, in forward
g_weights = self.fc_g(g_features)
File "/viscam/u/xxx/anaconda3/envs/Im3D/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/viscam/u/xxx/anaconda3/envs/Im3D/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 92, in forward
return F.linear(input, self.weight, self.bias)
File "/viscam/u/xxx/anaconda3/envs/Im3D/lib/python3.6/site-packages/torch/nn/functional.py", line 1406, in linear
ret = torch.addmm(bias, input, weight.t())
RuntimeError: cublas runtime error : the GPU program failed to execute at /tmp/pip-req-build-jh50bw28/aten/src/THC/THCBlas.cu:259
When I run:
CUDA_VISIBLE_DEVICES=0 xvfb-run -a -s "-screen 0 800x600x24" python main.py out/total3d/20110611514267/out_config.yaml --mode demo --demo_path demo/inputs/1
I get the bug shown in the traceback above (it appears after a long time loading). My environment is the one listed at the top.
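One thing that may be worth checking: the environment listing pairs a PyTorch 1.1.0 build for CUDA 10.0 (`cuda100`) with an RTX 3090. As far as I know, the RTX 3090 has compute capability 8.6, which needs CUDA 11.1 or newer, so a cuBLAS failure on the first matrix multiply would be consistent with that mismatch. Below is a minimal sketch of that check in plain Python. The table and the function names (`MIN_CUDA_FOR_CAPABILITY`, `parse_version`, `build_supports_gpu`) are my own illustration, not part of PyTorch or of this repository, and the table only lists a few architectures.

```python
# Hypothetical lookup table: minimum CUDA toolkit version that can target
# each GPU compute capability. Only a few architectures are listed.
MIN_CUDA_FOR_CAPABILITY = {
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere (A100)
    (8, 6): (11, 1),  # Ampere (assumed to cover the RTX 3090)
}


def parse_version(text):
    """Turn a version string such as '10.0' into a tuple such as (10, 0)."""
    return tuple(int(part) for part in text.split(".")[:2])


def build_supports_gpu(build_cuda, capability):
    """Return True if a framework built for `build_cuda` can target `capability`."""
    required = MIN_CUDA_FOR_CAPABILITY[capability]
    return parse_version(build_cuda) >= required


# The build from `conda list torch` (CUDA 10.0) against compute capability 8.6.
print(build_supports_gpu("10.0", (8, 6)))
# A CUDA 11.1 build against the same GPU.
print(build_supports_gpu("11.1", (8, 6)))
```

If the first check comes out negative, the likely fix is a PyTorch build compiled against CUDA 11.1 or newer rather than a change to the demo code. Note that the `CUDA Version: 11.7` reported by `nvidia-smi` is the highest version the driver supports, not the version PyTorch was built with.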