running eval with -S '{"enable_thinking": false}' gives errors #668

@A-Yucel

Description

Full command:

prime eval run primeintellect/gsm8k -m Qwen/Qwen3.5-122B-A10B -n 10 -r 1 -s -S '{"enable_thinking": false}'

Logs

Contents of env_worker_0.log:
2026-05-18 16:57:09 - verifiers.utils.env_utils - INFO - Loading environment: gsm8k
2026-05-18 16:57:09 - verifiers.utils.env_utils - INFO - Using default args: num_train_examples=-1, num_eval_examples=-1, system_prompt='Please reason step by step, and put your final answer within \boxed{}.'
2026-05-18 16:57:17 - verifiers.utils.env_utils - INFO - Successfully loaded environment 'gsm8k'
2026-05-18 16:57:17 - verifiers.utils.thread_utils - INFO - Scaled default executor and 1 registered executor(s) (math-verify=1)
2026-05-18 16:57:17 - verifiers.serve.server.env_worker.EnvWorker - INFO - Initialized worker gsm8k-0 on [redacted]
2026-05-18 16:57:17 - verifiers.serve.server.env_worker.EnvWorker - INFO - Starting worker gsm8k-0
2026-05-18 16:57:22 - verifiers.envs.environment.SingleTurnEnv - ERROR - Aborted rollout due to ModelError() -> TypeError("AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'")
2026-05-18 16:57:22 - verifiers.envs.environment.SingleTurnEnv - ERROR - Aborted rollout due to ModelError() -> TypeError("AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'")
2026-05-18 16:57:22 - verifiers.envs.environment.SingleTurnEnv - ERROR - Aborted rollout due to ModelError() -> TypeError("AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'")
2026-05-18 16:57:22 - verifiers.envs.environment.SingleTurnEnv - ERROR - Aborted rollout due to ModelError() -> TypeError("AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'")
2026-05-18 16:57:22 - verifiers.envs.environment.SingleTurnEnv - ERROR - Aborted rollout due to ModelError() -> TypeError("AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'")
2026-05-18 16:57:22 - verifiers.envs.environment.SingleTurnEnv - ERROR - Aborted rollout due to ModelError() -> TypeError("AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'")
2026-05-18 16:57:22 - verifiers.envs.environment.SingleTurnEnv - ERROR - Aborted rollout due to ModelError() -> TypeError("AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'")
2026-05-18 16:57:22 - verifiers.envs.environment.SingleTurnEnv - ERROR - Aborted rollout due to ModelError() -> TypeError("AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'")
2026-05-18 16:57:22 - verifiers.envs.environment.SingleTurnEnv - ERROR - Aborted rollout due to ModelError() -> TypeError("AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'")
2026-05-18 16:57:22 - verifiers.envs.environment.SingleTurnEnv - ERROR - Aborted rollout due to ModelError() -> TypeError("AsyncCompletions.create() got an unexpected keyword argument 'enable_thinking'")
2026-05-18 16:57:39 - verifiers.utils.process_utils - INFO - Death pipe closed — parent is gone, sending SIGTERM to self
2026-05-18 16:57:39 - verifiers.serve.server.env_worker.EnvWorker - INFO - Shut down worker gsm8k-0

Error description

When I run:
prime eval run primeintellect/gsm8k -m Qwen/Qwen3.5-122B-A10B -n 10 -r 1 -s -S '{"enable_thinking": false}'

as recommended in the Prime Intellect docs, every rollout is aborted with the error shown in the logs. The logs point to the -S '{"enable_thinking": false}' argument as the cause: AsyncCompletions.create() rejects enable_thinking as an unexpected keyword argument.
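
To illustrate the failure mode, here is a minimal sketch of how this kind of TypeError arises. It assumes the sampling args are unpacked directly into the client call, which I have not confirmed in the verifiers source. `fake_create` is a hypothetical stand-in with a fixed keyword signature, not the real `AsyncCompletions.create()`.

```python
import asyncio


async def fake_create(*, model, messages, temperature=None, extra_body=None):
    """Hypothetical stand-in for a client method with a fixed keyword signature."""
    return {"model": model, "extra_body": extra_body}


async def call(sampling_args):
    # Assumption: sampling args are forwarded as plain keyword arguments.
    return await fake_create(model="some-model", messages=[], **sampling_args)


def try_call(sampling_args):
    """Run the call and return either the result or the TypeError text."""
    try:
        return asyncio.run(call(sampling_args))
    except TypeError as exc:
        return f"TypeError: {exc}"


# A top-level key that is not in the signature is rejected by Python itself.
failed = try_call({"enable_thinking": False})

# The same setting nested under extra_body is a declared parameter, so it passes.
worked = try_call({"extra_body": {"chat_template_kwargs": {"enable_thinking": False}}})

print(failed)
print(worked)
```

In this sketch the first call fails before any request would be sent, which matches the logs: the rejection comes from the Python call signature, not from the model server.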

After seeing #558, I tried the older form instead:
prime eval run primeintellect/gsm8k -m Qwen/Qwen3.5-122B-A10B -n 10 -r 1 -s -S '{"extra_body": {"chat_template_kwargs": {"enable_thinking": false}}}'
and it works fine.

So this could be related to #558.
However, as I have only recently started experimenting with prime labs, I am not sure where the issue is.
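
In case it helps with triage, here is a rough sketch of the kind of normalization that would make the short form behave like the working one. This is only an assumption about a possible fix, not how verifiers or prime actually handle sampling args. `normalize_sampling_args` and `CHAT_TEMPLATE_KEYS` are hypothetical names.

```python
import copy

# Assumption: keys that should be treated as chat-template options.
CHAT_TEMPLATE_KEYS = {"enable_thinking"}


def normalize_sampling_args(sampling_args):
    """Move top-level chat-template keys under extra_body.chat_template_kwargs."""
    args = copy.deepcopy(sampling_args)  # do not mutate the caller's dict
    moved = {key: args.pop(key) for key in list(args) if key in CHAT_TEMPLATE_KEYS}
    if moved:
        extra_body = args.setdefault("extra_body", {})
        template_kwargs = extra_body.setdefault("chat_template_kwargs", {})
        for key, value in moved.items():
            # An explicitly nested value takes precedence over the shorthand.
            template_kwargs.setdefault(key, value)
    return args


short_form = {"enable_thinking": False, "temperature": 0.7}
long_form = {"extra_body": {"chat_template_kwargs": {"enable_thinking": False}}}

print(normalize_sampling_args(short_form))
print(normalize_sampling_args(long_form))
```

With this sketch, both forms of -S end up with enable_thinking nested under extra_body, and other keys such as temperature are left untouched.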
