Bug Report
Description
I'm on Windows 11 using Python 3.14.4 and installed dvc using pipx, as recommended. After queuing experiments, starting workers with dvc queue start simply does nothing.
Reproduce
$ git init && dvc init
$ echo -e "1\n2\n3" > input.txt
$ echo "open('output.txt', 'w').write('\n'.join(str(int(line) + 1) for line in open('input.txt')))" > example.py
$ dvc stage add -n example -d example.py -d input.txt -o output.txt python example.py
$ git add -A && git commit -m "initial commit"
$ dvc exp run --queue
$ dvc queue start
Started '1' new experiments task queue worker.
$ dvc queue status
Task     Name        Created    Status
d899386  bardy-doll  07:51 PM   Queued
Worker status: 0 active, 0 idle
The worker was seemingly spawned okay, but just never picks up the queued job.
Expected
I expected the worker to spawn, pick up the job ("Running" -> "Success"), then clean up after itself.
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 3.67.1 (pip)
-------------------------
Platform: Python 3.14.4 on Windows-11-10.0.26200-SP0
Subprojects:
dvc_data = 3.18.3
dvc_objects = 5.2.0
dvc_render = 1.0.2
dvc_task = 0.40.2
scmrepo = 3.6.2
Supports:
http (aiohttp = 3.13.5, aiohttp-retry = 2.9.1),
https (aiohttp = 3.13.5, aiohttp-retry = 2.9.1)
Config:
Global: C:\Users\<USERNAME>\AppData\Local\iterative\dvc
System: C:\ProgramData\iterative\dvc
Cache types: hardlink
Cache directory: NTFS on D:\
Caches: local
Remotes: None
Workspace directory: NTFS on D:\
Repo: dvc, git
Repo.site_cache_dir: C:\ProgramData\iterative\dvc\Cache\repo\d482bc26ad57e57c35ce8a849aad758b
Additional Information (if any):
The problem is that the worker actually spawns, but crashes instantly. I set the environment variable DVC_DAEMON_LOGFILE and ran dvc queue start -v to get a better idea of what was going on.
This is the log output:
Traceback (most recent call last):
File "C:\Users\<USERNAME>\pipx\venvs\dvc\Lib\site-packages\dvc\__main__.py", line 5, in <module>
from dvc.cli import main
File "C:\Users\<USERNAME>\pipx\venvs\dvc\Lib\site-packages\dvc\__init__.py", line 7, in <module>
import dvc.logger
File "C:\Users\<USERNAME>\pipx\venvs\dvc\Lib\site-packages\dvc\logger.py", line 3, in <module>
import logging
File "C:\Users\<USERNAME>\AppData\Local\Python\pythoncore-3.14-64\Lib\logging\__init__.py", line 26, in <module>
import sys, os, time, io, re, traceback, warnings, weakref, collections.abc
File "C:\Users\<USERNAME>\AppData\Local\Python\pythoncore-3.14-64\Lib\re\__init__.py", line 125, in <module>
import enum
File "C:\Users\<USERNAME>\AppData\Local\Python\pythoncore-3.14-64\Lib\enum.py", line 3, in <module>
from types import MappingProxyType, DynamicClassAttribute
File "C:\Users\<USERNAME>\pipx\venvs\dvc\Lib\site-packages\dvc\types.py", line 1, in <module>
from typing import TYPE_CHECKING, Any, AnyStr, Union
File "C:\Users\<USERNAME>\AppData\Local\Python\pythoncore-3.14-64\Lib\typing.py", line 26, in <module>
import functools
File "C:\Users\<USERNAME>\AppData\Local\Python\pythoncore-3.14-64\Lib\functools.py", line 22, in <module>
from types import GenericAlias, MethodType, MappingProxyType, UnionType
ImportError: cannot import name 'GenericAlias' from 'types' (consider renaming 'C:\\Users\\<USERNAME>\\pipx\\venvs\\dvc\\Lib\\site-packages\\dvc\\types.py' since it has the same name as the standard library module named 'types' and prevents importing that standard library module)
The dvc.types module shadows the standard library types module, which causes the worker to crash with the traceback shown above. However, the code that spawns workers isn't sophisticated enough to notice the non-zero exit code.
The problem is in dvc/daemon.py:59 (lines 59 to 65 in commit 06ff81c):
def _get_dvc_args() -> list[str]:
    args = [sys.executable]
    if not is_binary():
        root_dir = os.path.abspath(os.path.dirname(__file__))
        main_entrypoint = os.path.join(root_dir, "__main__.py")
        args.append(main_entrypoint)
    return args
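Basically, when executing through Python, the workers are actually spawned by calling python <path-to-site-packages>/dvc/__main__.py directly, which means the dvc directory will be prepended to sys.path as the first entry. Any module inside that directory, such as types.py, then takes precedence over the standard library module of the same name.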
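The difference between the two invocation styles can be reproduced without dvc. The following sketch builds a throwaway package (demo_pkg is a made-up name) whose __main__.py prints sys.path[0], then runs it once by file path and once with -m. It assumes PYTHONSAFEPATH is not in effect, so the variable is removed from the child environment.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def first_sys_path_entries():
    """Run a demo package by path and with -m; return sys.path[0] for each."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "__main__.py").write_text("import sys\nprint(sys.path[0])\n")

        # PYTHONSAFEPATH would suppress the behavior being demonstrated.
        env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}

        def run(args):
            result = subprocess.run(
                [sys.executable, *args],
                cwd=tmp, env=env, capture_output=True, text=True, check=True,
            )
            return result.stdout.strip()

        by_path = run([str(pkg / "__main__.py")])  # like the current daemon code
        by_module = run(["-m", "demo_pkg"])        # like the proposed change
    return by_path, by_module


by_path, by_module = first_sys_path_entries()
print("script path:", os.path.basename(by_path))
print("-m module:  ", os.path.basename(by_module))
```

When run by file path, the first entry is the package directory itself, so its modules are importable as top-level names. With -m, the first entry is the working directory instead, and the package's modules are only reachable under the package name.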
Now, I don't know enough about dvc or the intention behind calling it that way (and the blame proved not too helpful, either), but I believe simply changing _get_dvc_args as follows would solve the issue. At least it did solve this issue for me.
def _get_dvc_args() -> list[str]:
    args = [sys.executable]
    if not is_binary():
        args.append('-m')
        args.append('dvc')
    return args
This should be safe, because
- the rest of the code assumes the dvc module is available anyway (import dvc.<xyz> etc.), and
- there is code at dvc/daemon.py:177 that puts the site-packages directory in PYTHONPATH, and
- the dvc.daemon.daemonize function is used exactly once, exclusively in the Windows-only part of dvc.repo.experiments.queue.celery
However, maybe there is currently a refactoring going on to switch to dvc_task.proc.process.ManagedProcess that I may not be aware of? Of course, this change doesn't fix the fact that _detached_subprocess does not notice when the workers it spawned die before doing any work. But fixing that would seem to be somewhat more involved.
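For what it's worth, one way to notice an early crash is to wait briefly after spawning and check the exit code. This is only a rough sketch using plain subprocess, not dvc's detached-process code; spawn_and_check and grace_seconds are names I made up, and a fixed grace period is a heuristic rather than a guarantee.

```python
import subprocess
import sys


def spawn_and_check(args, grace_seconds=1.0):
    """Start a child process and fail loudly if it dies right away.

    Hypothetical helper for illustration; dvc's real spawning code differs.
    """
    proc = subprocess.Popen(
        args,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        # Give the child a short window in which an import-time crash shows up.
        returncode = proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        return proc  # still running after the grace period
    if returncode != 0:
        raise RuntimeError(f"worker exited early with code {returncode}")
    return proc  # finished cleanly within the grace period


# A child that fails immediately, standing in for the crashing worker.
try:
    spawn_and_check([sys.executable, "-c", "raise SystemExit(3)"], grace_seconds=10)
    early_failure = None
except RuntimeError as exc:
    early_failure = str(exc)

print(early_failure)
```

A check like this would have turned the silent failure above into a visible error, although a truly detached worker probably needs a different mechanism, such as inspecting the daemon log file.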
A similar issue might have occurred in issue #10829, which seemingly used a snap-installed dvc on Windows Subsystem for Linux within _posix_detached_subprocess, although I didn't look at it more closely.