Skip to content

feat(df): add thousands separator support#9090

Open
naoNao89 wants to merge 4 commits intouutils:mainfrom
naoNao89:feature/thousands-separator-df
Open

feat(df): add thousands separator support#9090
naoNao89 wants to merge 4 commits intouutils:mainfrom
naoNao89:feature/thousands-separator-df

Conversation

@naoNao89
Copy link
Copy Markdown
Contributor

Adds GNU-compatible thousands separator formatting to ls, du, and df. Use a leading quote in --block-size to get readable output: --block-size="'1" shows 1,024,000 instead of 1024000.

Respects LC_NUMERIC locale (comma for en_US, period for European locales, none for C/POSIX). Works with environment variables too (LS_BLOCK_SIZE, DU_BLOCK_SIZE, DF_BLOCK_SIZE, etc.).

Core changes in uucore (parse_size.rs, human.rs) handle the parsing and formatting. Each utility integrates it with minimal changes. 12 new integration tests, all existing tests pass.

Closes #9084

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch 3 times, most recently from 3ce60cf to 48f4bc1 Compare October 30, 2025 12:40
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch 2 times, most recently from a8966af to ae09fec Compare October 30, 2025 14:15
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

@naoNao89 naoNao89 closed this by deleting the head repository Nov 6, 2025
@naoNao89 naoNao89 reopened this Nov 7, 2025
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Nov 7, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

@sylvestre
Copy link
Copy Markdown
Contributor

sylvestre commented Dec 26, 2025

could you please split this per program in different pr ? thanks
it is harder to review otherwise

/// - `','` for other locales (default, en_US style)
fn get_thousands_separator() -> char {
// Try to read LC_NUMERIC or LANG environment variable
if let Ok(locale) = std::env::var("LC_NUMERIC")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Environment variable reads on every call is inefficient - consider caching the locale information


// Simple heuristic: European locales use period, others use comma
// This covers common cases like de_DE, fr_FR, it_IT, es_ES, nl_NL, etc.
if locale.starts_with("de_")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it?
i don't like this hardcoded list. isn't something in icu to do that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oki, replace with ICU's FixedDecimalFormatter

@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Note: The gnu test tests/pr/pr-tests was skipped on 'main' but is now failing.

naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Dec 26, 2025
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Dec 26, 2025
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 8b92113 to fb15063 Compare December 26, 2025 02:35
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Dec 26, 2025
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from fb15063 to 13225ab Compare December 26, 2025 02:35
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Note: The gnu test tests/pr/pr-tests was skipped on 'main' but is now failing.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Dec 26, 2025

Merging this PR will not alter performance

✅ 287 untouched benchmarks
⏩ 38 skipped benchmarks1


Comparing naoNao89:feature/thousands-separator-df (7179294) with main (b439534)

Open in CodSpeed

Footnotes

  1. 38 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Note: The gnu test tests/pr/pr-tests was skipped on 'main' but is now failing.

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 8aa4f8c to c6cb13e Compare December 26, 2025 10:39
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Note: The gnu test tests/pr/pr-tests was skipped on 'main' but is now failing.

1 similar comment
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Note: The gnu test tests/pr/pr-tests was skipped on 'main' but is now failing.

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 29d396b to e85ae66 Compare December 26, 2025 11:40
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from e85ae66 to 714a5a2 Compare December 26, 2025 15:25
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)

@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 714a5a2 to 3be82c4 Compare December 26, 2025 18:36
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Feb 17, 2026
Implement thousands separator support for df using GNU coreutils-compatible
leading single quote syntax (--block-size='1K).

Changes:
- Add extract_thousands_separator_flag() to parse_size.rs
- Add format_with_thousands_separator() to human.rs with locale support
- Add BlockSizeConfig struct to df/blocks.rs to hold separator flag
- Update df to use new block size configuration
- Add 4 integration tests for thousands separator functionality
- Fix clippy warnings (inlined format args, unnecessary qualifications)

Fixes PR uutils#9090
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 97ff706 to f34a0cc Compare February 17, 2026 04:17
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Feb 17, 2026
Implement thousands separator support for df using GNU coreutils-compatible
leading single quote syntax (--block-size='1K).

Changes:
- Add extract_thousands_separator_flag() to parse_size.rs
- Add format_with_thousands_separator() to human.rs with locale support
- Add BlockSizeConfig struct to df/blocks.rs to hold separator flag
- Update df to use new block size configuration
- Add 4 integration tests for thousands separator functionality
- Fix clippy warnings (inlined format args, unnecessary qualifications)

Fixes PR uutils#9090
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from f34a0cc to ca4b994 Compare February 17, 2026 04:22
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Feb 17, 2026
Implement thousands separator support for df using GNU coreutils-compatible
leading single quote syntax (--block-size='1K).

Changes:
- Add extract_thousands_separator_flag() to parse_size.rs
- Add format_with_thousands_separator() to human.rs with locale support
- Add BlockSizeConfig struct to df/blocks.rs to hold separator flag
- Update df to use new block size configuration
- Add 4 integration tests for thousands separator functionality
- Fix clippy warnings (inlined format args, unnecessary qualifications)

Fixes PR uutils#9090
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from ca4b994 to 66195aa Compare February 17, 2026 04:24
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Feb 17, 2026
Implement thousands separator support for df using GNU coreutils-compatible
leading single quote syntax (--block-size='1K).

Changes:
- Add extract_thousands_separator_flag() to parse_size.rs
- Add format_with_thousands_separator() to human.rs with locale support
- Add BlockSizeConfig struct to df/blocks.rs to hold separator flag
- Update df to use new block size configuration
- Add 4 integration tests for thousands separator functionality
- Fix clippy warnings (inlined format args, unnecessary qualifications)

Fixes PR uutils#9090
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 66195aa to 7a559c7 Compare February 17, 2026 04:28
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Feb 17, 2026
Implement thousands separator support for df using GNU coreutils-compatible
leading single quote syntax (--block-size='1K).

Includes:
- extract_thousands_separator_flag() in parse_size.rs
- format_with_thousands_separator() in human.rs with locale support
- BlockSizeConfig struct in df/blocks.rs
- 4 integration tests

Also includes ParserBuilderError for better block size validation
(restored from PR uutils#9090).
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 7a559c7 to e99e193 Compare February 17, 2026 04:32
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Feb 17, 2026
Implement thousands separator support for df using GNU coreutils-compatible
leading single quote syntax (--block-size='1K).

Includes:
- extract_thousands_separator_flag() in parse_size.rs
- format_with_thousands_separator() in human.rs with locale support
- BlockSizeConfig struct in df/blocks.rs
- 4 integration tests

Also includes ParserBuilderError for better block size validation
(restored from PR uutils#9090).
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch 2 times, most recently from 4a4f79d to 7a559c7 Compare February 17, 2026 04:33
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Feb 17, 2026
Implement thousands separator support for df using GNU coreutils-compatible
leading single quote syntax (--block-size='1K).

Includes:
- extract_thousands_separator_flag() in parse_size.rs
- format_with_thousands_separator() in human.rs with locale support
- BlockSizeConfig struct in df/blocks.rs
- 4 integration tests

Also includes ParserBuilderError for better block size validation
(restored from PR uutils#9090).
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from 7a559c7 to f4aa95f Compare February 17, 2026 04:34
@naoNao89 naoNao89 marked this pull request as ready for review February 17, 2026 04:38
naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Feb 17, 2026
Implement thousands separator support for df using GNU coreutils-compatible
leading single quote syntax (--block-size='1K).

Includes:
- extract_thousands_separator_flag() in parse_size.rs
- format_with_thousands_separator() in human.rs with locale support
- BlockSizeConfig struct in df/blocks.rs
- 4 integration tests

Also includes ParserBuilderError for better block size validation
(restored from PR uutils#9090).
@naoNao89 naoNao89 force-pushed the feature/thousands-separator-df branch from f4aa95f to efd45e8 Compare February 17, 2026 04:38
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

GNU test failed: tests/cut/bounded-memory. tests/cut/bounded-memory is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/pr/bounded-memory (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tty/tty-eof (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/misc/io-errors is no longer failing!
Congrats! The gnu test tests/tail/retry is no longer failing!

naoNao89 added a commit to naoNao89/coreutils that referenced this pull request Feb 17, 2026
Implement thousands separator support for df using GNU coreutils-compatible
leading single quote syntax (--block-size='1K).

Includes:
- extract_thousands_separator_flag() in parse_size.rs
- format_with_thousands_separator() in human.rs with locale support
- BlockSizeConfig struct in df/blocks.rs
- 4 integration tests

Also includes ParserBuilderError for better block size validation
(restored from PR uutils#9090).
Comment thread src/uucore/src/lib/features/format/human.rs Outdated
Comment on lines +93 to +95
/// // assert_eq!(format_with_thousands_separator(1234567), "1.234.567");
/// ```
pub fn format_with_thousands_separator(number: u64) -> String {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please write unit test for this function

@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/date/date-locale-hour is no longer failing!

@naoNao89
Copy link
Copy Markdown
Contributor Author

hmm, was hardcoded to 3 now uses ICU locale data: (3,3) for westernor (3,2) for indian locales

@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

Skipping an intermittent issue tests/tty/tty-eof (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/dd/no-allocate is now passing!
Congrats! The gnu test tests/seq/seq-epipe is now passing!

Implement thousands separator support for df using GNU coreutils-compatible
leading single quote syntax (--block-size='1K).

Includes:
- extract_thousands_separator_flag() in parse_size.rs
- format_with_thousands_separator() in human.rs with ICU locale support
- BlockSizeConfig struct in df/blocks.rs
- 4 integration tests

Also includes ParserBuilderError for better block size validation
(restored from PR uutils#9090).
Fix build error when i18n-decimal feature is not enabled by:
- Wrapping ICU imports in #[cfg(feature = "i18n-decimal")]
- Providing fallback implementation that returns comma separator
Add support for Indian numbering system (hi_IN) and other locales
with non-standard grouping:

- Add locale_grouping_sizes() to decimal.rs returning (primary, secondary)
- Update format_with_thousands_separator() to use locale-aware groups
- Most locales: (3, 3) -> 1,234,567
- Indian locales: (3, 2) -> 12,34,567
- Add conditional compilation for i18n-decimal feature
Add tests for:
- test_format_with_thousands_separator: basic functionality
- test_format_with_grouping_sizes: tests (3,3) and (3,2) grouping

Tests verify both Western format (1,234,567) and Indian format (12,34,567)
@github-actions
Copy link
Copy Markdown

GNU testsuite comparison:

GNU test failed: tests/date/date-locale-hour. tests/date/date-locale-hour is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/misc/io-errors. tests/misc/io-errors is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/rm/isatty. tests/rm/isatty is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/dd/no-allocate is now passing!
Congrats! The gnu test tests/seq/seq-epipe is now passing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ls (and du) block size specification with thousands separator fails

3 participants