Linux thinkstationpgx-30b2 6.17.0-1014-nvidia #14-Ubuntu SMP PREEMPT_DYNAMIC Tue Mar 17 19:01:40 UTC 2026 aarch64 aarch64 aarch64 GNU/Linux
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.142                Driver Version: 580.142        CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GB10                    On  |   0000000F:01:00.0 Off |                  N/A |
| N/A   38C    P8              4W /  N/A  | Not Supported          |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
xinference-local --host 0.0.0.0 --port 13002
/root/miniconda3/envs/xinference/lib/python3.10/site-packages/torch/cuda/__init__.py:61: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
import pynvml # type: ignore[import]
/root/miniconda3/envs/xinference/lib/python3.10/site-packages/torch/cuda/__init__.py:61: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
import pynvml # type: ignore[import]
2026-05-08 14:34:30,050 xinference.core.supervisor 13176 INFO Xinference supervisor 0.0.0.0:41641 started
2026-05-08 14:34:30,202 xinference.core.worker 13176 INFO Starting metrics export server at 0.0.0.0:None
2026-05-08 14:34:30,203 xinference.core.worker 13176 INFO Checking metrics export server...
2026-05-08 14:34:30,948 xinference.core.worker 13176 INFO Metrics server is started at: http://0.0.0.0:39467
2026-05-08 14:34:30,948 xinference.core.worker 13176 INFO Purge cache directory: /root/.xinference/cache
2026-05-08 14:34:30,948 xinference.core.utils 13176 INFO Remove empty directory: /root/.xinference/cache/v2
2026-05-08 14:34:30,951 xinference.core.worker 13176 INFO Connected to supervisor as a fresh worker
2026-05-08 14:34:30,964 xinference.core.worker 13176 INFO Xinference worker 0.0.0.0:41641 started
2026-05-08 14:34:30,966 xinference.device_utils 13176 ERROR Fail to get GPU info: Not Supported
2026-05-08 14:34:36,785 xinference.api.restful_api 13135 INFO Starting Xinference at endpoint: http://0.0.0.0:13002
2026-05-08 14:34:36,853 uvicorn.error 13135 INFO Uvicorn running on http://0.0.0.0:13002 (Press CTRL+C to quit)
2026-05-08 14:34:45,985 xinference.device_utils 13176 ERROR Fail to get GPU info: Not Supported
2026-05-08 14:35:01,002 xinference.device_utils 13176 ERROR Fail to get GPU info: Not Supported
2026-05-08 14:35:16,023 xinference.device_utils 13176 ERROR Fail to get GPU info: Not Supported
2026-05-08 14:35:31,042 xinference.device_utils 13176 ERROR Fail to get GPU info: Not Supported
2026-05-08 14:35:46,062 xinference.device_utils 13176 ERROR Fail to get GPU info: Not Supported
2026-05-08 14:36:01,081 xinference.device_utils 13176 ERROR Fail to get GPU info: Not Supported
2026-05-08 14:36:16,099 xinference.device_utils 13176 ERROR Fail to get GPU info: Not Supported
2026-05-08 14:36:31,119 xinference.device_utils 13176 ERROR Fail to get GPU info: Not Supported
2026-05-08 14:36:46,139 xinference.device_utils 13176 ERROR Fail to get GPU info: Not Supported
Name: xinference
Version: 2.7.0
Summary: Model Serving Made Easy
Home-page: https://github.com/xorbitsai/inference
Author: Qin Xuye
Author-email: qinxuye@xprobe.io
License: Apache License 2.0
xinference-local --host 0.0.0.0 --port 13002
System Info
uname -a
nvidia-smi:
Log:
Running Xinference with Docker?
Version info
The command used to start Xinference
xinference-local --host 0.0.0.0 --port 13002
Reproduction
Expected behavior
The GPU should be detected and usable as normal.
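
To help narrow this down, here is a minimal diagnostic sketch. It assumes the "Not Supported" error comes from an NVML query made through the pynvml bindings; I have not confirmed which call xinference.device_utils actually makes. The helper name probe_nvml and the choice of queries are my own and are not part of Xinference.

```python
import importlib


def probe_nvml(index=0):
    """Call a few NVML queries on one GPU and report which ones fail.

    Hypothetical helper for diagnosis only; not part of Xinference.
    """
    try:
        pynvml = importlib.import_module("pynvml")
    except ImportError:
        return {"import": "pynvml (nvidia-ml-py) is not installed"}

    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError as exc:
        return {"nvmlInit": f"failed: {exc}"}

    results = {}
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
    except pynvml.NVMLError as exc:
        results["nvmlDeviceGetHandleByIndex"] = f"failed: {exc}"
    else:
        # Queries a monitoring loop is likely to use (an assumption).
        for name in (
            "nvmlDeviceGetName",
            "nvmlDeviceGetMemoryInfo",
            "nvmlDeviceGetUtilizationRates",
        ):
            try:
                getattr(pynvml, name)(handle)
                results[name] = "ok"
            except pynvml.NVMLError as exc:
                results[name] = f"failed: {exc}"
    finally:
        pynvml.nvmlShutdown()
    return results


if __name__ == "__main__":
    for query, status in probe_nvml().items():
        print(f"{query}: {status}")
```

If only the memory query fails, that would be consistent with the "Not Supported" shown in the Memory-Usage column of nvidia-smi above, and would suggest the device itself is visible while one specific query is unavailable on this GPU.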