fix(ls): force C locale so non-English month names don't break parsing #1390

Open
GabriWar wants to merge 1 commit into rtk-ai:master from GabriWar:fix/ls-non-english-locale

Conversation

@GabriWar

Summary

rtk ls returns (empty) for any non-empty directory when the user's locale prints non-English month abbreviations (e.g. LC_TIME=pt_BR.UTF-8 → abr, mar, out).

LS_DATE_RE in src/cmds/system/ls.rs is English-only and case-sensitive:

r"\s+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+\d{1,2}\s+(?:\d{4}|\d{2}:\d{2})\s+"

When ls -la prints a Portuguese/French/etc. month, the regex doesn't match, parse_ls_line returns None for every line, and compact_ls falls through to the (empty) sentinel at line 194 — silently hiding every file and directory.
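
To make the failure mode concrete, here is a minimal, std-only Rust sketch that approximates the date check using whitespace-separated fields instead of the real regex. The helper names (`has_english_ls_date`, `is_day`, `is_year_or_time`) are hypothetical and are not taken from ls.rs; the sketch only illustrates why a case-sensitive English month list rejects `abr` or `jan`.

```rust
/// English month abbreviations, mirroring the alternation in LS_DATE_RE.
const MONTHS: [&str; 12] = [
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
];

/// One or two ASCII digits, like `\d{1,2}`.
fn is_day(s: &str) -> bool {
    (1..=2).contains(&s.len()) && s.bytes().all(|c| c.is_ascii_digit())
}

/// Either a four-digit year or `HH:MM`, like `(?:\d{4}|\d{2}:\d{2})`.
fn is_year_or_time(s: &str) -> bool {
    let b = s.as_bytes();
    match b.len() {
        4 => b.iter().all(u8::is_ascii_digit),
        5 => b[2] == b':' && b.iter().enumerate().all(|(i, c)| i == 2 || c.is_ascii_digit()),
        _ => false,
    }
}

/// Hypothetical stand-in for the regex: true when three consecutive
/// fields look like `<Mon> <day> <year|HH:MM>`. The comparison against
/// MONTHS is exact, so it is case-sensitive and English-only.
fn has_english_ls_date(line: &str) -> bool {
    let fields: Vec<&str> = line.split_whitespace().collect();
    fields
        .windows(3)
        .any(|w| MONTHS.contains(&w[0]) && is_day(w[1]) && is_year_or_time(w[2]))
}
```

Feeding it the first line of the repro output (`... 2592 abr 18 18:59 .`) yields `false`, while the same line with `Apr` yields `true`. A parser that returns None on every `false` would therefore drop every entry under a pt_BR time locale.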

Repro

$ echo $LC_TIME
pt_BR.UTF-8

$ /usr/bin/ls -la ~/.local/bin/ | head -3
total 330912
drwxr-xr-x 1 gabriel gabriel      2592 abr 18 18:59 .
-rwxr-xr-x 1 gabriel gabriel        99 jan 13 16:25 agent

$ rtk -vvv ls ~/.local/bin/
Chars: 9773 → 8 (100% reduction)
(empty)

A 100% reduction on a directory with 140+ entries is a strong hint the filter is eating real data.

Fix

Set LC_ALL=C on the spawned ls process, forcing English month names regardless of the user's locale. Smallest possible change, localized to where the locale-dependent parsing happens.

cmd.env("LC_ALL", "C");

This keeps the existing regex/parser contract intact (it already assumes a stable English format) and doesn't touch any other command path.
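
For context, a minimal sketch of how the override attaches to a spawned process with `std::process::Command`. The functions `build_ls_command` and `locale_override` are hypothetical and exist only for illustration; the actual construction of `cmd` in ls.rs is not shown in this PR description, and the `-la` flag is assumed from the repro above.

```rust
use std::ffi::OsStr;
use std::path::Path;
use std::process::Command;

/// Hypothetical helper: build an `ls -la <dir>` command whose locale is
/// pinned to C, so month names come out as English abbreviations.
fn build_ls_command(dir: &Path) -> Command {
    let mut cmd = Command::new("ls");
    cmd.arg("-la").arg(dir);
    // Only the child process sees this; the parent environment is untouched.
    cmd.env("LC_ALL", "C");
    cmd
}

/// Hypothetical helper: read back an explicit environment override from
/// the command, which makes the behaviour checkable without spawning `ls`.
fn locale_override(cmd: &Command, key: &str) -> Option<String> {
    cmd.get_envs()
        .find(|(k, _)| *k == OsStr::new(key))
        .and_then(|(_, v)| v)
        .map(|v| v.to_string_lossy().into_owned())
}
```

Because `Command::env` only records an override for the child, the user's shell locale and any other command path stay as they were, which matches the "doesn't touch any other command path" claim above.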

After

$ rtk ls ~/.local/bin/ | head -3
agent  99B
aider -> /home/gabriel/.local/share/uv/tools/aider-chat/bin/aider  56B
alphashape  185B

Scope check: grep -rn "Jan|Feb|Mar" src/ confirms ls.rs is the only runtime path with locale-dependent date parsing. tree / find / other proxies don't hit this.

Test plan

  • Manual repro under LC_TIME=pt_BR.UTF-8 before fix — reproduces (empty)
  • Manual verification after fix — rtk ls lists the directory correctly
  • cargo build --release clean
  • Existing ls tests unaffected (they feed the parser directly and are locale-independent)

LS_DATE_RE is English-only and case-sensitive. When the user's locale
prints months in another language (e.g. LC_TIME=pt_BR.UTF-8 → "abr",
"mar", "out"), every `ls -la` line fails the regex, parse_ls_line
returns None, and compact_ls falls through to the "(empty)" sentinel —
even for directories that clearly are not empty.

Setting LC_ALL=C on the spawned `ls` process guarantees English month
names regardless of the user's locale, which is both the smallest fix
and consistent with the rest of the parser contract (it expects a
specific, stable ls format).

Repro before fix:
  LC_TIME=pt_BR.UTF-8 rtk -vvv ls /some/nonempty/dir
  # Chars: 9773 → 8 (100% reduction)
  # (empty)
Copilot AI review requested due to automatic review settings April 18, 2026 22:22
@CLAassistant

CLAassistant commented Apr 18, 2026

CLA assistant check
All committers have signed the CLA.

@pszymkowiak pszymkowiak added the labels bug (Something isn't working), effort-small (A few hours, 1 file), filter-quality (Filter produces incorrect/truncated signal), and good first issue (Good for newcomers) on Apr 18, 2026
@pszymkowiak
Collaborator

[w] wshm · Automated triage by AI

📊 Automated PR Analysis

🐛 Type bug-fix
🟢 Risk low

Summary

Fixes a bug where rtk ls returns (empty) for non-empty directories when the user's locale uses non-English month abbreviations. The fix sets LC_ALL=C on the spawned ls process so month names always match the English-only regex.

Review Checklist

  • Tests present
  • Breaking change
  • Docs updated

Analyzed automatically by wshm · This is an automated analysis, not a human review.

Copilot AI left a comment

Pull request overview

Fixes rtk ls incorrectly returning (empty) under non-English locales by forcing the spawned ls process into a locale that matches the English-only LS_DATE_RE parsing contract.

Changes:

  • Set a fixed locale on the ls subprocess to ensure month abbreviations match the parser’s regex.
  • Add inline documentation explaining the locale/parsing interaction and failure mode.

Comment thread src/cmds/system/ls.rs
Comment on lines +38 to +42
// Force C locale so month names match LS_DATE_RE, which is English-only.
// Without this, non-English locales (e.g. LC_TIME=pt_BR.UTF-8 → "abr"/"mar")
// cause every line to fail the regex, and compact_ls returns "(empty)"
// even for non-empty directories.
cmd.env("LC_ALL", "C");

Copilot AI Apr 18, 2026

Using LC_ALL=C forces all locale categories for the spawned ls process. That can change filename rendering/escaping and collation (and may cause non-ASCII filenames to be shown differently), even though this bug only depends on month names from LC_TIME. Prefer setting LC_TIME=C (or LC_TIME=POSIX) so date parsing is stabilized without affecting character encoding/display behavior.

Suggested change
- // Force C locale so month names match LS_DATE_RE, which is English-only.
- // Without this, non-English locales (e.g. LC_TIME=pt_BR.UTF-8 → "abr"/"mar")
- // cause every line to fail the regex, and compact_ls returns "(empty)"
- // even for non-empty directories.
- cmd.env("LC_ALL", "C");
+ // Force only the time locale to C so month names match LS_DATE_RE,
+ // which is English-only. Without this, non-English locales
+ // (e.g. LC_TIME=pt_BR.UTF-8 → "abr"/"mar") cause every line to fail
+ // the regex, and compact_ls returns "(empty)" even for non-empty
+ // directories, while preserving other locale-sensitive output behavior.
+ cmd.env("LC_TIME", "C");
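
One point worth weighing when choosing between the two variants: under the POSIX locale rules, a non-empty LC_ALL takes precedence over every LC_* category, so an LC_ALL inherited from the user's environment would still win over an LC_TIME=C set on the child. The sketch below models that lookup order as a pure function. `effective_time_locale` is a hypothetical helper, not part of rtk, and the final fallback to "C" is an assumption (POSIX leaves the default implementation-defined).

```rust
/// Treat an unset or empty variable as absent, as the POSIX lookup does.
fn non_empty(value: Option<&str>) -> Option<&str> {
    value.filter(|v| !v.is_empty())
}

/// Hypothetical model of which locale governs date formatting:
/// LC_ALL first, then LC_TIME, then LANG.
fn effective_time_locale<'a>(
    lc_all: Option<&'a str>,
    lc_time: Option<&'a str>,
    lang: Option<&'a str>,
) -> &'a str {
    non_empty(lc_all)
        .or(non_empty(lc_time))
        .or(non_empty(lang))
        // Assumed default when nothing is set.
        .unwrap_or("C")
}
```

With `LC_ALL=pt_BR.UTF-8` inherited and `LC_TIME=C` added, the function returns `pt_BR.UTF-8`, so month names would still be Portuguese. With only `LC_TIME=C` and `LANG=pt_BR.UTF-8`, it returns `C`. Whether that case matters in practice depends on how often users export LC_ALL, which this PR does not measure.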
