[Bug]: AssertionError when stopping a recently started job #3950

@jvstme

Steps to reproduce

  1. Have no idle instances.
  2. Apply the configuration:
    type: dev-environment
    backends: [gcp]
  3. Wait until the job enters provisioning status.
  4. Stop the run immediately.

Actual behaviour

The run remains in terminating indefinitely.

Expected behaviour

The run stops.

dstack version

master

Server logs

DEBUG    dstack._internal.server.background.pipeline_tasks.jobs_terminating:664 job(73cde6)strange-squid-1-0-0: stopping container
ERROR    dstack._internal.server.background.pipeline_tasks.base:361 Unexpected exception when processing item
         Traceback (most recent call last):
           File "/dstack/src/dstack/_internal/server/background/pipeline_tasks/base.py", line 359, in start
             await self.process(item)
           File "/dstack/src/dstack/_internal/server/utils/sentry_utils.py", line 28, in wrapper
             return await f(*args, **kwargs)
           File "/dstack/src/dstack/_internal/server/background/pipeline_tasks/jobs_terminating.py", line 271, in process
             result = await _process_terminating_job(
           File "/dstack/src/dstack/_internal/server/background/pipeline_tasks/jobs_terminating.py", line 666, in _process_terminating_job
             if not await _stop_container(job_model, jpd, ssh_private_keys):
           File "/dstack/src/dstack/_internal/server/background/pipeline_tasks/jobs_terminating.py", line 846, in _stop_container
             return await common.run_async(
           File "/dstack/src/dstack/_internal/utils/common.py", line 51, in run_async
             return await asyncio.get_running_loop().run_in_executor(None, func_with_args)
           File "/usr/lib64/python3.10/concurrent/futures/thread.py", line 58, in run
             result = self.fn(*self.args, **self.kwargs)
           File "/dstack/src/dstack/_internal/server/services/runner/ssh.py", line 61, in wrapper
             conn = InstanceConnection(
           File "/dstack/src/dstack/_internal/server/services/runner/pool.py", line 197, in __init__
             self._key = InstanceConnectionKey.from_jpd(jpd, jrd)
           File "/dstack/src/dstack/_internal/server/services/runner/pool.py", line 50, in from_jpd
             assert jpd.hostname is not None and jpd.ssh_port is not None
         AssertionError

Additional information

Appears to have been introduced by #3936. Can be reproduced both with and without DSTACK_SERVER_SSH_POOL_ENABLED.
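
The traceback suggests that the job's provisioning data has no hostname or SSH port yet when the stop request arrives, so the assertion in from_jpd fails before any connection is attempted. Below is a minimal, self-contained sketch of that failure mode and of one possible guard. It is not the dstack code: ProvisioningData, connection_key, can_reach_instance, and stop_container are hypothetical stand-ins, and the return convention of stop_container is an assumption made only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ProvisioningData:
    """Hypothetical stand-in for the job provisioning data (jpd) in the traceback."""

    hostname: Optional[str] = None
    ssh_port: Optional[int] = None


def connection_key(jpd: ProvisioningData) -> Tuple[str, int]:
    # Same condition as the assertion shown at pool.py line 50 in the traceback.
    assert jpd.hostname is not None and jpd.ssh_port is not None
    return (jpd.hostname, jpd.ssh_port)


def can_reach_instance(jpd: ProvisioningData) -> bool:
    # Hypothetical helper: True only once both SSH coordinates are known.
    return jpd.hostname is not None and jpd.ssh_port is not None


def stop_container(jpd: ProvisioningData) -> bool:
    """Return True when termination may proceed (assumed convention)."""
    if not can_reach_instance(jpd):
        # The instance is still provisioning, so there is no container
        # to stop over SSH; skip the connection instead of asserting.
        return True
    connection_key(jpd)  # Safe here: both fields are set.
    return True
```

With this guard, stop_container(ProvisioningData()) returns instead of raising, while calling connection_key(ProvisioningData()) directly still reproduces the AssertionError from the logs. Whether skipping the container stop is the right behaviour for a job in provisioning status is a design decision for the maintainers.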

Labels

bug, major
