[Bug] CodeTrans UT test fail #1990

@ZePan110

Priority

P1-Stopper

OS type

Ubuntu

Hardware type

Xeon-GNR

Installation method

  • Pull docker images from hub.docker.com
  • Build docker images from source
  • Other
  • N/A

Deploy method

  • Docker
  • Docker Compose
  • Kubernetes Helm Charts
  • Kubernetes GMC
  • Other
  • N/A

Running nodes

Single Node

What's the version?

afb46bd

Description

https://github.com/opea-project/GenAIExamples/actions/runs/15189596329/job/42752387718

docker logs codetrans-xeon-vllm-service
INFO 05-23 02:56:34 [__init__.py:239] Automatically detected platform cpu.
WARNING 05-23 02:56:36 [_logger.py:72] Torch Profiler is enabled in the API server. This should ONLY be used for local development!
INFO 05-23 02:56:36 [api_server.py:1034] vLLM API server version 0.8.3
INFO 05-23 02:56:36 [api_server.py:1035] args: Namespace(host='0.0.0.0', port=80, uvicorn_log_level='info', disable_uvicorn_access_log=False, allow_credentials=False, allowed_origins=['*'], allowed_methods=['*'], allowed_headers=['*'], api_key=None, lora_modules=None, prompt_adapters=None, chat_template=None, chat_template_content_format='auto', response_role='assistant', ssl_keyfile=None, ssl_certfile=None, ssl_ca_certs=None, enable_ssl_refresh=False, ssl_cert_reqs=0, root_path=None, middleware=[], return_tokens_as_token_ids=False, disable_frontend_multiprocessing=False, enable_request_id_headers=False, enable_auto_tool_choice=False, tool_call_parser=None, tool_parser_plugin='', model='mistralai/Mistral-7B-Instruct-v0.3', task='auto', tokenizer=None, hf_config_path=None, skip_tokenizer_init=False, revision=None, code_revision=None, tokenizer_revision=None, tokenizer_mode='auto', trust_remote_code=False, allowed_local_media_path=None, download_dir=None, load_format='auto', config_format=<ConfigFormat.AUTO: 'auto'>, dtype='auto', kv_cache_dtype='auto', max_model_len=None, guided_decoding_backend='xgrammar', logits_processor_pattern=None, model_impl='auto', distributed_executor_backend=None, pipeline_parallel_size=1, tensor_parallel_size=1, data_parallel_size=1, enable_expert_parallel=False, max_parallel_loading_workers=None, ray_workers_use_nsight=False, block_size=None, enable_prefix_caching=None, prefix_caching_hash_algo='builtin', disable_sliding_window=False, use_v2_block_manager=True, num_lookahead_slots=0, seed=None, swap_space=4, cpu_offload_gb=0, gpu_memory_utilization=0.9, num_gpu_blocks_override=None, max_num_batched_tokens=None, max_num_partial_prefills=1, max_long_partial_prefills=1, long_prefill_token_threshold=0, max_num_seqs=None, max_logprobs=20, disable_log_stats=False, quantization=None, rope_scaling=None, rope_theta=None, hf_overrides=None, enforce_eager=False, max_seq_len_to_capture=8192, disable_custom_all_reduce=False, tokenizer_pool_size=0, tokenizer_pool_type='ray', tokenizer_pool_extra_config=None, limit_mm_per_prompt=None, mm_processor_kwargs=None, disable_mm_preprocessor_cache=False, enable_lora=False, enable_lora_bias=False, max_loras=1, max_lora_rank=16, lora_extra_vocab_size=256, lora_dtype='auto', long_lora_scaling_factors=None, max_cpu_loras=None, fully_sharded_loras=False, enable_prompt_adapter=False, max_prompt_adapters=1, max_prompt_adapter_token=0, device='auto', num_scheduler_steps=1, use_tqdm_on_load=True, multi_step_stream_outputs=True, scheduler_delay_factor=0.0, enable_chunked_prefill=None, speculative_config=None, model_loader_extra_config=None, ignore_patterns=[], preemption_mode=None, served_model_name=None, qlora_adapter_name_or_path=None, show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None, disable_async_output_proc=False, scheduling_policy='fcfs', scheduler_cls='vllm.core.scheduler.Scheduler', override_neuron_config=None, override_pooler_config=None, compilation_config=None, kv_transfer_config=None, worker_cls='auto', worker_extension_cls='', generation_config='auto', override_generation_config=None, enable_sleep_mode=False, calculate_kv_scales=False, additional_config=None, enable_reasoning=False, reasoning_parser=None, disable_cascade_attn=False, disable_log_requests=False, max_log_len=None, disable_fastapi_docs=False, enable_prompt_tokens_details=False, enable_server_load_tracking=False)
INFO 05-23 02:56:42 [config.py:600] This model supports multiple tasks: {'generate', 'reward', 'score', 'classify', 'embed'}. Defaulting to 'generate'.
WARNING 05-23 02:56:42 [_logger.py:72] device type=cpu is not supported by the V1 Engine. Falling back to V0.
INFO 05-23 02:56:42 [config.py:1634] Disabled the custom all-reduce kernel because it is not supported on current platform.
WARNING 05-23 02:56:42 [_logger.py:72] Environment variable VLLM_CPU_KVCACHE_SPACE (GiB) for CPU backend is not set, using 4 by default.
WARNING 05-23 02:56:42 [_logger.py:72] uni is not supported on CPU, fallback to mp distributed executor backend.
INFO 05-23 02:56:42 [api_server.py:246] Started engine process with PID 273
/opt/venv/lib/python3.12/site-packages/vllm/transformers_utils/tokenizer_group/tokenizer_group.py:25: FutureWarning: It is strongly recommended to run mistral models with --tokenizer-mode "mistral" to ensure correct encoding and decoding.
  self.tokenizer = get_tokenizer(self.tokenizer_id, **tokenizer_config)
INFO 05-23 02:56:46 [__init__.py:239] Automatically detected platform cpu.
WARNING 05-23 02:56:47 [_logger.py:72] Torch Profiler is enabled in the API server. This should ONLY be used for local development!
INFO 05-23 02:56:47 [llm_engine.py:242] Initializing a V0 LLM engine (v0.8.3) with config: model='mistralai/Mistral-7B-Instruct-v0.3', speculative_config=None, tokenizer='mistralai/Mistral-7B-Instruct-v0.3', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=32768, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=True, quantization=None, enforce_eager=True, kv_cache_dtype=auto, device_config=cpu, decoding_config=DecodingConfig(guided_decoding_backend='xgrammar', reasoning_backend=None), observability_config=ObservabilityConfig(show_hidden_metrics=False, otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=None, served_model_name=mistralai/Mistral-7B-Instruct-v0.3, num_scheduler_steps=1, multi_step_stream_outputs=True, enable_prefix_caching=None, chunked_prefill_enabled=False, use_async_output_proc=False, disable_mm_preprocessor_cache=False, mm_processor_kwargs=None, pooler_config=None, compilation_config={"splitting_ops":[],"compile_sizes":[],"cudagraph_capture_sizes":[256,248,240,232,224,216,208,200,192,184,176,168,160,152,144,136,128,120,112,104,96,88,80,72,64,56,48,40,32,24,16,8,4,2,1],"max_capture_size":256}, use_cached_outputs=True,
/opt/venv/lib/python3.12/site-packages/vllm/transformers_utils/tokenizer_group/tokenizer_group.py:25: FutureWarning: It is strongly recommended to run mistral models with --tokenizer-mode "mistral" to ensure correct encoding and decoding.
  self.tokenizer = get_tokenizer(self.tokenizer_id, **tokenizer_config)
INFO 05-23 02:56:48 [cpu.py:45] Using Torch SDPA backend.
INFO 05-23 02:56:48 [importing.py:16] Triton not installed or not compatible; certain GPU-related functions will not be available.
INFO 05-23 02:56:48 [cpu_worker.py:196] Profiling enabled. Traces will be saved to: /mnt
INFO 05-23 02:56:48 [parallel_state.py:957] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0
ERROR 05-23 02:56:48 [engine.py:448] unsupported operand type(s) for *: 'int' and 'NoneType'
ERROR 05-23 02:56:48 [engine.py:448] Traceback (most recent call last):
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 436, in run_mp_engine
ERROR 05-23 02:56:48 [engine.py:448]     engine = MQLLMEngine.from_vllm_config(
ERROR 05-23 02:56:48 [engine.py:448]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Process SpawnProcess-1:
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 128, in from_vllm_config
ERROR 05-23 02:56:48 [engine.py:448]     return cls(
ERROR 05-23 02:56:48 [engine.py:448]            ^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 82, in __init__
ERROR 05-23 02:56:48 [engine.py:448]     self.engine = LLMEngine(*args, **kwargs)
ERROR 05-23 02:56:48 [engine.py:448]                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 281, in __init__
ERROR 05-23 02:56:48 [engine.py:448]     self.model_executor = executor_class(vllm_config=vllm_config, )
ERROR 05-23 02:56:48 [engine.py:448]                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/executor/executor_base.py", line 286, in __init__
ERROR 05-23 02:56:48 [engine.py:448]     super().__init__(*args, **kwargs)
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/executor/executor_base.py", line 52, in __init__
ERROR 05-23 02:56:48 [engine.py:448]     self._init_executor()
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/executor/mp_distributed_executor.py", line 125, in _init_executor
ERROR 05-23 02:56:48 [engine.py:448]     self._run_workers("load_model",
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/executor/mp_distributed_executor.py", line 185, in _run_workers
ERROR 05-23 02:56:48 [engine.py:448]     driver_worker_output = run_method(self.driver_worker, sent_method,
ERROR 05-23 02:56:48 [engine.py:448]                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/utils.py", line 2347, in run_method
ERROR 05-23 02:56:48 [engine.py:448]     return func(*args, **kwargs)
ERROR 05-23 02:56:48 [engine.py:448]            ^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/worker/cpu_worker.py", line 233, in load_model
ERROR 05-23 02:56:48 [engine.py:448]     self.model_runner.load_model()
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/worker/cpu_model_runner.py", line 491, in load_model
ERROR 05-23 02:56:48 [engine.py:448]     self.model = get_model(vllm_config=self.vllm_config)
ERROR 05-23 02:56:48 [engine.py:448]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
ERROR 05-23 02:56:48 [engine.py:448]     return loader.load_model(vllm_config=vllm_config)
ERROR 05-23 02:56:48 [engine.py:448]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/loader.py", line 441, in load_model
ERROR 05-23 02:56:48 [engine.py:448]     model = _initialize_model(vllm_config=vllm_config)
ERROR 05-23 02:56:48 [engine.py:448]             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/loader.py", line 127, in _initialize_model
ERROR 05-23 02:56:48 [engine.py:448]     return model_class(vllm_config=vllm_config, prefix=prefix)
ERROR 05-23 02:56:48 [engine.py:448]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 486, in __init__
ERROR 05-23 02:56:48 [engine.py:448]     self.model = self._init_model(vllm_config=vllm_config,
ERROR 05-23 02:56:48 [engine.py:448]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 527, in _init_model
ERROR 05-23 02:56:48 [engine.py:448]     return LlamaModel(vllm_config=vllm_config,
ERROR 05-23 02:56:48 [engine.py:448]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/compilation/decorators.py", line 151, in __init__
ERROR 05-23 02:56:48 [engine.py:448]     old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 321, in __init__
ERROR 05-23 02:56:48 [engine.py:448]     self.start_layer, self.end_layer, self.layers = make_layers(
ERROR 05-23 02:56:48 [engine.py:448]                                                     ^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 610, in make_layers
ERROR 05-23 02:56:48 [engine.py:448]     maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
ERROR 05-23 02:56:48 [engine.py:448]                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 323, in <lambda>
ERROR 05-23 02:56:48 [engine.py:448]     lambda prefix: layer_type(config=config,
ERROR 05-23 02:56:48 [engine.py:448]                    ^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 239, in __init__
ERROR 05-23 02:56:48 [engine.py:448]     self.self_attn = LlamaAttention(
ERROR 05-23 02:56:48 [engine.py:448]                      ^^^^^^^^^^^^^^^
ERROR 05-23 02:56:48 [engine.py:448]   File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 135, in __init__
ERROR 05-23 02:56:48 [engine.py:448]     self.rotary_dim = int(partial_rotary_factor * self.head_dim)
ERROR 05-23 02:56:48 [engine.py:448]                           ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
ERROR 05-23 02:56:48 [engine.py:448] TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'
Traceback (most recent call last):
  File "/root/.local/share/uv/python/cpython-3.12.10-linux-x86_64-gnu/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/root/.local/share/uv/python/cpython-3.12.10-linux-x86_64-gnu/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/venv/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 450, in run_mp_engine
    raise e
  File "/opt/venv/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 436, in run_mp_engine
    engine = MQLLMEngine.from_vllm_config(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 128, in from_vllm_config
    return cls(
           ^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/engine/multiprocessing/engine.py", line 82, in __init__
    self.engine = LLMEngine(*args, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 281, in __init__
    self.model_executor = executor_class(vllm_config=vllm_config, )
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/executor/executor_base.py", line 286, in __init__
    super().__init__(*args, **kwargs)
  File "/opt/venv/lib/python3.12/site-packages/vllm/executor/executor_base.py", line 52, in __init__
    self._init_executor()
  File "/opt/venv/lib/python3.12/site-packages/vllm/executor/mp_distributed_executor.py", line 125, in _init_executor
    self._run_workers("load_model",
  File "/opt/venv/lib/python3.12/site-packages/vllm/executor/mp_distributed_executor.py", line 185, in _run_workers
    driver_worker_output = run_method(self.driver_worker, sent_method,
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/utils.py", line 2347, in run_method
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/worker/cpu_worker.py", line 233, in load_model
    self.model_runner.load_model()
  File "/opt/venv/lib/python3.12/site-packages/vllm/worker/cpu_model_runner.py", line 491, in load_model
    self.model = get_model(vllm_config=self.vllm_config)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
    return loader.load_model(vllm_config=vllm_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/loader.py", line 441, in load_model
    model = _initialize_model(vllm_config=vllm_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/model_loader/loader.py", line 127, in _initialize_model
    return model_class(vllm_config=vllm_config, prefix=prefix)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 486, in __init__
    self.model = self._init_model(vllm_config=vllm_config,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 527, in _init_model
    return LlamaModel(vllm_config=vllm_config,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/compilation/decorators.py", line 151, in __init__
    old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
  File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 321, in __init__
    self.start_layer, self.end_layer, self.layers = make_layers(
                                                    ^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/utils.py", line 610, in make_layers
    maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 323, in <lambda>
    lambda prefix: layer_type(config=config,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 239, in __init__
    self.self_attn = LlamaAttention(
                     ^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/model_executor/models/llama.py", line 135, in __init__
    self.rotary_dim = int(partial_rotary_factor * self.head_dim)
                          ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/opt/venv/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 1121, in <module>
    uvloop.run(run_server(args))
  File "/opt/venv/lib/python3.12/site-packages/uvloop/__init__.py", line 109, in run
    return __asyncio.run(
           ^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.12.10-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.12.10-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
  File "/opt/venv/lib/python3.12/site-packages/uvloop/__init__.py", line 61, in wrapper
    return await main
           ^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 1069, in run_server
    async with build_async_engine_client(args) as engine_client:
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.12.10-linux-x86_64-gnu/lib/python3.12/contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 146, in build_async_engine_client
    async with build_async_engine_client_from_engine_args(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.local/share/uv/python/cpython-3.12.10-linux-x86_64-gnu/lib/python3.12/contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/vllm/entrypoints/openai/api_server.py", line 269, in build_async_engine_client_from_engine_args
    raise RuntimeError(
RuntimeError: Engine process failed to start. See stack trace for the root cause.
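
The last frame in the trace multiplies partial_rotary_factor (an int) by self.head_dim (None). One plausible explanation, not confirmed anywhere in this log, is that the model config carries a head_dim attribute explicitly set to None, so a getattr() default is never applied. The sketch below reproduces that pattern outside vLLM. The config object, its field values, and resolve_head_dim are hypothetical stand-ins for illustration, not vLLM or transformers code.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a model config. The values are illustrative and
# were not read from the real mistralai/Mistral-7B-Instruct-v0.3 config.
config = SimpleNamespace(hidden_size=4096, num_attention_heads=32, head_dim=None)

partial_rotary_factor = 1  # an int, matching the 'int' operand in the error

# getattr() falls back to its default only when the attribute is missing.
# An attribute that exists but holds None is returned as None.
head_dim = getattr(config, "head_dim",
                   config.hidden_size // config.num_attention_heads)

error_message = None
try:
    rotary_dim = int(partial_rotary_factor * head_dim)
except TypeError as exc:
    error_message = str(exc)


def resolve_head_dim(cfg):
    """Hypothetical helper: treat a None head_dim the same as a missing one."""
    value = getattr(cfg, "head_dim", None)
    if value is None:
        value = cfg.hidden_size // cfg.num_attention_heads
    return value


rotary_dim = int(partial_rotary_factor * resolve_head_dim(config))
print(error_message)
print(rotary_dim)
```

If this is indeed the cause, checking the head_dim value in the config that the container actually loads would confirm it. The sketch only shows the failure pattern and one defensive way to handle it.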

Reproduce steps

cd CodeTrans/tests
bash test_compose_on_xeon.sh

Raw log

Attachments

No response

Metadata

Labels

A1, high prority, bug (Something isn't working)
