nxf_date fails on images using uutils coreutils (e.g. ubuntu:latest → Ubuntu 26.04): Unexpected: unbound variable at .command.run line 178 #7114

@robsyme

Bug report

This sits between a bug report and a feature request: the trigger is a change in Ubuntu (26.04 swaps GNU coreutils for uutils coreutils), not in Nextflow itself. But because Nextflow's wrapper assumes GNU date semantics, every pipeline that runs on ubuntu:latest (or any image based on Ubuntu 26.04) breaks the moment Docker Hub rolls the floating tag forward. Worth handling on the Nextflow side so user pipelines don't silently break on a base-image bump.

Expected behavior and actual behavior

Expected: nxf_date returns a 13-digit millisecond epoch on any modern Linux container, regardless of which coreutils implementation provides date. The trace block in .command.run then computes wall_time = end_millis - start_millis and the task completes normally.

Actual: On Ubuntu 26.04 (which ships uutils coreutils 0.8.0), date +%s%3N does not produce a 13-digit value. uutils ignores the field-width modifier on %N and additionally strips leading zeros, so the raw output is 17–19 digits and varies from sample to sample. None of nxf_date's length branches match, so it falls through to echo "Unexpected timestamp value: $ts"; exit 1. The exit 1 only terminates the $() subshell: because the caller writes local end_millis=$(nxf_date), the exit status of local (which is 0) replaces the subshell's status, and end_millis is assigned the literal string "Unexpected timestamp value: 1778…". The next line, local wall_time=$((end_millis - start_millis)), feeds that string into bash arithmetic. Arithmetic evaluation recursively expands the variable's value as an expression, reaches the word Unexpected, treats it as a variable name, and with set -u aborts with bash: line 178: Unexpected: unbound variable.
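The two mechanisms described above (local masking the exit status, and arithmetic expanding a string as a variable name) can be seen in isolation with a minimal sketch. failing_helper and demo are hypothetical names used only for illustration; they are not part of .command.run.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the error path of nxf_date.
failing_helper() {
    echo "Unexpected timestamp value: 123"
    exit 1
}

demo() {
    # `local` itself returns 0, so the non-zero status of the command
    # substitution is discarded and `set -e` does not stop the function.
    local value=$(failing_helper)
    echo "still running; value=[$value]"
    # Arithmetic evaluates the string as an expression; its first word is
    # looked up as a variable, which `set -u` rejects.
    echo $((value - 1))
}

# Run in a subshell and capture stderr so the expected failure
# does not abort this script.
demo_output=$( (demo) 2>&1 ) || true
echo "$demo_output"
```

The first captured line shows that execution continued past the failed helper; the second is bash's unbound-variable error for Unexpected, whose exact prefix (script name and line number) depends on how the script is invoked.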

Steps to reproduce the problem

Self-contained reproducer (copies the helper verbatim from command-run.txt and runs the same arithmetic that fails on line 178):

#!/usr/bin/env bash
set -euo pipefail

nxf_date() {
    local ts=$(date +%s%3N)
    if [[ ${#ts} == 10 ]]; then echo ${ts}000
    elif [[ ${#ts} == 13 ]]; then echo $ts
    elif [[ ${#ts} == 16 ]]; then echo ${ts:0:13}
    elif [[ ${#ts} == 19 ]]; then echo ${ts:0:13}
    else echo "Unexpected timestamp value: $ts"; exit 1
    fi
}

# Wrapped in a function so `local var=$(...)` masks the subshell's
# exit 1, exactly how Nextflow's .command.run is structured.
nxf_trace_excerpt() {
    local start_millis=$(nxf_date)
    sleep 0.1
    local end_millis=$(nxf_date)
    local wall_time=$((end_millis - start_millis))   # line 178 in real .command.run
    echo "wall_time=$wall_time ms"
}

nxf_trace_excerpt

Run it under both images:

docker run --rm -v "$PWD":/x -w /x ubuntu:latest bash repro.sh   # FAILS
docker run --rm -v "$PWD":/x -w /x ubuntu:24.04  bash repro.sh   # PASSES

Equivalently, a one-liner that shows the underlying mismatch:

$ docker run --rm ubuntu:latest bash -c 'date --version | head -1; date +%s%3N'
date (uutils coreutils) 0.8.0
1778154315811293            # 17–19 digits, varies

$ docker run --rm ubuntu:24.04 bash -c 'date +%s%3N'
1778154315562                # 13 digits

Program output

From .command.run on the failing run:

.command.run: line 178: Unexpected: unbound variable

From the reproducer above against ubuntu:latest:

start_millis=Unexpected timestamp value: 1778154315127540679
end_millis=Unexpected timestamp value: 1778154315306201263
bash: line 33: Unexpected: unbound variable

Environment

  • Nextflow version: 25.10.5 (also reproduces on 25.10.4 — the bug is in the wrapper template, not version-specific)
  • Java version: not relevant (failure is in the bash wrapper)
  • Operating system: container is ubuntu:latest, which Docker Hub now resolves to Ubuntu 26.04 LTS. Host OS irrelevant.
  • Bash version: GNU bash, version 5.2.x inside the container

Additional context

Why a wider length branch is not enough. A naive fix of "add a 19-digit branch and truncate to 13" looks tempting but is fragile. uutils strips leading zeros from %N, so the raw length varies: we measured 18 digits in the customer's run, and just after a second boundary, when the nanosecond count is small, the length can drop further. A length whitelist papers over the symptom rather than fixing the assumption.
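To illustrate the point, here is a small simulation of the behaviour described above. It does not call uutils; simulate_stripped is a hypothetical helper that concatenates epoch seconds with a nanosecond field whose leading zeros have been dropped, which is an assumption based on this report rather than on the uutils source.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: mimic the reported output by appending the
# nanosecond field, minus its leading zeros, to the epoch seconds.
simulate_stripped() {
    local seconds=$1 nanos=$2
    # 10# forces base 10 so that a leading 0 is not parsed as octal.
    echo "${seconds}$((10#$nanos))"
}

# Same second, four different nanosecond values.
for nanos in 811293004 081129300 000123456 000000042; do
    raw=$(simulate_stripped 1778154315 "$nanos")
    echo "nanos=$nanos raw=$raw length=${#raw} first13=${raw:0:13}"
done
```

Under this assumption the length depends on the nanosecond value, and truncating to 13 characters is only correct when no zeros were stripped. For nanos=081129300 the first 13 characters end in 811, although the true millisecond part is 081.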

Suggested fix. Stop relying on %3N (which uutils does not honour) and compute milliseconds explicitly from %s and a zero-padded %N:

nxf_date() {
    local s=$(date +%s)
    local n=$(date +%N)
    n=$(printf '%09d' "$((10#$n))")     # uutils strips leading zeros; re-pad to 9
    echo "$((s * 1000 + 10#${n:0:3}))"
}

Verified against both ubuntu:latest (uutils 0.8.0) and ubuntu:24.04 (GNU coreutils 9.4); both produce a plausible 13-digit value and a non-negative wall-time delta. The same change should be applied to the test fixtures (test-bash-wrapper.txt, test-bash-wrapper-with-trace.txt).
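One possible refinement of the suggested fix, offered as a sketch rather than a tested patch: the version above calls date twice, so the seconds and nanoseconds can come from either side of a second boundary. The variant below reads both fields from a single invocation and uses integer division instead of re-padding. The names nxf_millis_from_parts and nxf_date_single_call are hypothetical, and the sketch assumes date supports %N and prints it either zero-padded or with leading zeros stripped.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical pure helper: combine epoch seconds and a nanosecond field
# (with or without leading zeros) into epoch milliseconds.
nxf_millis_from_parts() {
    local seconds=$1 nanos=$2
    # 10# forces base 10 so 081129300 is not rejected as invalid octal;
    # dividing by 1000000 converts nanoseconds to whole milliseconds.
    echo "$(( seconds * 1000 + 10#$nanos / 1000000 ))"
}

# Hypothetical replacement for nxf_date: one `date` call, so both fields
# describe the same instant.
nxf_date_single_call() {
    local stamp
    stamp=$(date '+%s %N')          # e.g. "1778154315 081129300"
    nxf_millis_from_parts "${stamp% *}" "${stamp#* }"
}

nxf_date_single_call
```

Declaring stamp separately from its assignment also means a failing date call is no longer masked by local, so set -e can stop the script at the point of failure.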

Customer-side workaround. Pin the process container to ubuntu:24.04 (or any GNU-coreutils image — debian:stable, almalinux, etc.) instead of ubuntu:latest. Floating :latest tags are best avoided in production pipelines for exactly this reason.

Blast radius. Any pipeline using ubuntu:latest, or any user-built image based on FROM ubuntu:latest / FROM ubuntu:26.04. Will affect more users as Ubuntu 26.04 propagates through derivative images.
