Feat: Support for colored log output#30

Merged
arekay-nv merged 6 commits into main from logging/color_log_levelname on Dec 2, 2025

Conversation

@anandhu-eng
Contributor

What does this PR do?

This PR adds colored log output to improve readability and UX for benchmarking tool users. Log levels are now color-coded (INFO=green, WARNING=yellow, ERROR/CRITICAL=red) using colorama, with support for the NO_COLOR standard.

Problem

Benchmarking logs are difficult to scan when all output is monochrome. Users cannot quickly distinguish between informational messages, warnings, and errors during long benchmark runs.

Solution

  • Implement a ColoredFormatter class that applies colors only to log level names (see the sketch after this list)
  • Colors are enabled by default for better UX
  • Respect the NO_COLOR=1 environment variable for users/systems that disable colors
  • Suppress verbose internal logging from asyncio and urllib3 (set to WARNING level) to keep benchmark output clean and focused
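
The diff itself isn't reproduced in this thread, so here is a minimal sketch of the approach described above. The class and function names follow the file list in the Changes section; the format string, color table, and overall structure are assumptions, not the exact implementation:

import logging
import os

import colorama
from colorama import Fore, Style

# Module-level init so ANSI codes are translated on Windows terminals.
colorama.init()


class ColoredFormatter(logging.Formatter):
    """Colors only the level name; the rest of the message is untouched."""

    LEVEL_COLORS = {
        "INFO": Fore.GREEN,
        "WARNING": Fore.YELLOW,
        "ERROR": Fore.RED,
        "CRITICAL": Fore.RED,
    }

    def format(self, record: logging.LogRecord) -> str:
        original_levelname = record.levelname
        color = self.LEVEL_COLORS.get(original_levelname)
        # Colors are on by default; the NO_COLOR standard disables them.
        # Unmapped levels pass through unchanged.
        if color and not os.environ.get("NO_COLOR"):
            record.levelname = f"{color}{original_levelname}{Style.RESET_ALL}"
        try:
            return super().format(record)
        finally:
            # Restore the levelname so shared records aren't polluted.
            record.levelname = original_levelname


def setup_logging(level: int = logging.INFO) -> None:
    handler = logging.StreamHandler()
    handler.setFormatter(ColoredFormatter("%(asctime)s %(levelname)s %(message)s"))
    logging.basicConfig(level=level, handlers=[handler], force=True)
    # Quiet noisy third-party loggers, as described above.
    for noisy in ("asyncio", "urllib3"):
        logging.getLogger(noisy).setLevel(logging.WARNING)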

Changes

  1. src/inference_endpoint/utils/logging.py

    • Extract ColoredFormatter class (previously nested in setup_logging())
    • Add colorama initialization at module level
    • Default behavior: colors enabled; only disabled when NO_COLOR env var is set
    • Formatter temporarily modifies record.levelname for output, restores it afterward to prevent state pollution
  2. requirements/base.txt

    • Add colorama==0.4.6 as a runtime dependency for Windows/cross-platform support
  3. tests/unit/test_logging.py (new file)

    • 8 tests for ColoredFormatter behavior
    • 5 tests for setup_logging() configuration
    • 13 total tests (all passing locally), 97% code coverage
    • Tests cover: color application, NO_COLOR respect, levelname restoration, unmapped levels, and asyncio/urllib3 suppression (two illustrative cases are sketched after this list)
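
The test file itself isn't shown here, but two of the cases named above could look roughly like the following (test names, the make_record helper, and the assertions are illustrative, not the actual tests):

import logging

from inference_endpoint.utils.logging import ColoredFormatter


def make_record(level: int) -> logging.LogRecord:
    # Helper to build a bare LogRecord for formatter tests.
    return logging.LogRecord(
        name="test", level=level, pathname=__file__,
        lineno=1, msg="hello", args=(), exc_info=None,
    )


def test_levelname_restored_after_format():
    formatter = ColoredFormatter("%(levelname)s: %(message)s")
    record = make_record(logging.WARNING)
    formatter.format(record)
    # The record must be restored so other handlers see plain "WARNING".
    assert record.levelname == "WARNING"


def test_no_color_env_disables_colors(monkeypatch):
    monkeypatch.setenv("NO_COLOR", "1")
    formatter = ColoredFormatter("%(levelname)s: %(message)s")
    # No ANSI escape sequences should appear when NO_COLOR is set.
    assert "\x1b[" not in formatter.format(make_record(logging.ERROR))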

Example output:
[screenshot: colored log output]

Type of change

  • Bug fix
  • New feature
  • Documentation update
  • Refactor/cleanup

Related issues

Testing

  • Tests added/updated
  • All tests pass locally
  • Manual testing completed

Five tests currently fail when I run pytest. I am not certain whether this is specific to my environment. The last portion of the log is included below:

=================================================================================================================== tests coverage ===================================================================================================================
__________________________________________________________________________________________________ coverage: platform linux, python 3.12.3-final-0 ___________________________________________________________________________________________________

Name                                                           Stmts   Miss  Cover   Missing
--------------------------------------------------------------------------------------------
src/inference_endpoint/__init__.py                                 5      0   100%
src/inference_endpoint/cli.py                                    125    125     0%   24-365
src/inference_endpoint/commands/__init__.py                        5      0   100%
src/inference_endpoint/commands/benchmark.py                     231     58    75%   130, 132, 190-204, 264, 347-352, 385, 445, 447, 453, 461, 480-481, 483-484, 494, 507-512, 535-537, 581-583, 605-607, 631-636, 657, 661, 666-678, 687-689
src/inference_endpoint/commands/eval.py                           28      0   100%
src/inference_endpoint/commands/probe.py                          92     13    86%   58-59, 111-112, 136, 141, 172-175, 189-191
src/inference_endpoint/commands/utils.py                         100     27    73%   100, 134-135, 194, 199-203, 220-225, 276-289, 295-303
src/inference_endpoint/config/__init__.py                          0      0   100%
src/inference_endpoint/config/ruleset_base.py                     14      1    93%   35
src/inference_endpoint/config/ruleset_registry.py                 22     22     0%   27-95
src/inference_endpoint/config/rulesets/__init__.py                 0      0   100%
src/inference_endpoint/config/rulesets/mlcommons/__init__.py       2      0   100%
src/inference_endpoint/config/rulesets/mlcommons/datasets.py      20      2    90%   34, 62
src/inference_endpoint/config/rulesets/mlcommons/models.py        23      1    96%   152
src/inference_endpoint/config/rulesets/mlcommons/rules.py         85     11    87%   139, 145, 163-173, 179, 190, 194
src/inference_endpoint/config/runtime_settings.py                 62      5    92%   39-40, 117, 203-204
src/inference_endpoint/config/schema.py                          192     26    86%   172-174, 276-291, 375-380, 400, 408, 417, 441, 469, 478, 504, 510, 528, 610-621
src/inference_endpoint/config/user_config.py                       9      0   100%
src/inference_endpoint/config/yaml_loader.py                      32      2    94%   102, 106
src/inference_endpoint/core/__init__.py                            0      0   100%
src/inference_endpoint/core/types.py                              34      0   100%
src/inference_endpoint/dataset_manager/__init__.py                 3      0   100%
src/inference_endpoint/dataset_manager/dataloader.py             102     33    68%   69, 104, 163-166, 182, 206-211, 217-225, 237, 267, 304-314, 317-319, 322, 325
src/inference_endpoint/dataset_manager/factory.py                 35     13    63%   78-97, 117, 123-130
src/inference_endpoint/endpoint_client/__init__.py                 3      0   100%
src/inference_endpoint/endpoint_client/configs.py                 94      0   100%
src/inference_endpoint/endpoint_client/futures_client.py         103     12    88%   100-102, 143-146, 149-152, 177, 184-185, 220, 234
src/inference_endpoint/endpoint_client/http_client.py             87      5    94%   107-109, 173-176
src/inference_endpoint/endpoint_client/http_sample_issuer.py      58      6    90%   82, 88-89, 93-95
src/inference_endpoint/endpoint_client/worker.py                 233     26    89%   60-62, 76-107, 226-231, 240, 269-270, 344-345, 356-359, 436, 509-510
src/inference_endpoint/endpoint_client/zmq_utils.py               56      3    95%   53-54, 67
src/inference_endpoint/exceptions.py                               8      0   100%
src/inference_endpoint/load_generator/__init__.py                  6      0   100%
src/inference_endpoint/load_generator/events.py                   14      0   100%
src/inference_endpoint/load_generator/load_generator.py           51      4    92%   66, 97, 163, 302
src/inference_endpoint/load_generator/sample.py                   79     10    87%   102, 119, 126-131, 181-183
src/inference_endpoint/load_generator/scheduler.py               102      6    94%   191, 220, 307, 328-329, 381
src/inference_endpoint/load_generator/session.py                  84     21    75%   59, 89, 96-98, 101, 113-123, 128-158
src/inference_endpoint/main.py                                    19     19     0%   24-54
src/inference_endpoint/metrics/__init__.py                         2      0   100%
src/inference_endpoint/metrics/metric.py                          30      8    73%   46, 58-64, 67, 75, 83
src/inference_endpoint/metrics/recorder.py                       207     20    90%   76, 85, 149, 256, 280, 289-290, 328-329, 347, 392, 403, 412, 428-429, 439-440, 474-476
src/inference_endpoint/metrics/reporter.py                       368     53    86%   38, 54, 92, 100, 134, 139-147, 153, 171, 186, 193, 213-214, 224-225, 280-283, 309, 315, 330-332, 372, 383, 412-413, 451, 490, 500, 544-545, 550, 579, 634, 636, 715, 757, 779, 787, 793, 800, 805, 817-822, 827-828
src/inference_endpoint/openai/openai_adapter.py                   40      5    88%   69, 91-93, 117
src/inference_endpoint/openai/openai_types_gen.py               4747      0   100%
src/inference_endpoint/plugins/__init__.py                         0      0   100%
src/inference_endpoint/profiling/__init__.py                       2      0   100%
src/inference_endpoint/profiling/line_profiler.py                114     35    69%   78-79, 86-92, 104-108, 112-120, 149-158, 165-167, 172, 179-180, 185
src/inference_endpoint/profiling/pytest_profiling_plugin.py       53     40    25%   43-53, 58-65, 70-99, 104-111, 116-122
src/inference_endpoint/testing/__init__.py                         2      0   100%
src/inference_endpoint/testing/docker_server.py                  123    104    15%   46-61, 65, 73-80, 88-151, 157-180, 183-184, 187, 195, 208-285
src/inference_endpoint/testing/echo_server.py                    175     44    75%   118-125, 224-226, 249-250, 258, 296-299, 335, 348-349, 368, 382-404, 427-450
src/inference_endpoint/utils/__init__.py                          27      6    78%   29, 34-43, 61
src/inference_endpoint/utils/logging.py                           37      1    97%   135
--------------------------------------------------------------------------------------------
TOTAL                                                           8145    767    91%
Coverage HTML written to dir htmlcov
Coverage XML written to file coverage.xml
============================================================================================================== short test summary info ===============================================================================================================
FAILED tests/performance/endpoint_client/test_http_client_performance_single.py::TestHTTPClientPerformanceSingleWorker::test_streaming_baseline_performance - AssertionError: Failed to achieve 90% of target 350 QPS (got 220.65, issued 199.34)
FAILED tests/performance/endpoint_client/test_http_client_performance_single.py::TestHTTPClientPerformanceSingleWorker::test_streaming_throughput_various_message_sizes[100] - AssertionError: Failed to achieve 90% of target 350 QPS at message size 100 characters (got 239.32, issued 236.65)
FAILED tests/performance/endpoint_client/test_http_client_performance_single.py::TestHTTPClientPerformanceSingleWorker::test_streaming_throughput_various_message_sizes[500] - AssertionError: Failed to achieve 90% of target 350 QPS at message size 500 characters (got 239.38, issued 262.33)
FAILED tests/performance/test_recorder.py::test_2_chunk_per_query_performance[duckdb-time_thresholds0] - assert 0.558441333 <= 0.315
FAILED tests/performance/test_recorder.py::test_2_chunk_per_query_performance[sqlite-time_thresholds1] - assert 0.598611292 <= 0.315
=================================================================================== 5 failed, 256 passed, 4 skipped, 5 deselected, 9 warnings in 260.68s (0:04:20) ===================================================================================

================================================================================
PID 9191 - LINE PROFILER RESULTS
================================================================================
Timer unit: 1e-06 s

Total time: 5.792e-06 s
File: /endpoints/tests/unit/test_profiler.py
Function: TestProfileDecorators.test_profile_decorator_async_when_enabled.<locals>.test_async_func at line 148

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   148                                                       @state.profile
   149                                                       async def test_async_func(x):
   150         1          5.0      5.0     87.1                  await asyncio.sleep(0)
   151         1          0.8      0.8     12.9                  return x * 2


================================================================================
PID 9191 - LINE PROFILER RESULTS
================================================================================
Timer unit: 1e-06 s

Total time: 1.916e-06 s
File: /endpoints/tests/unit/test_profiler.py
Function: TestProfileDecorators.test_profile_decorator_sync_when_enabled.<locals>.test_func at line 130

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   130                                                       @state.profile
   131                                                       def test_func(x):
   132         1          1.9      1.9    100.0                  return x * 2

Checklist

  • Code follows project style
  • Pre-commit hooks pass
  • Documentation updated (if needed)

@anandhu-eng anandhu-eng requested a review from a team as a code owner November 18, 2025 06:45
@github-actions

github-actions Bot commented Nov 18, 2025

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@gemini-code-assist

Summary of Changes

Hello @anandhu-eng, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the user experience of the benchmarking tool by introducing colored log output. By visually distinguishing log levels such as INFO, WARNING, and ERROR, users can more quickly identify critical information during long benchmark runs. The implementation is robust, offering default color enablement while also adhering to the NO_COLOR standard for user preference, and it streamlines output by filtering out excessive third-party library logs.

Highlights

  • Colored Log Output: Implemented color-coded log levels (INFO=green, WARNING=yellow, ERROR/CRITICAL=red) to significantly improve log readability and user experience during benchmark runs.
  • Custom Log Formatter: A new ColoredFormatter class was introduced to apply colors specifically to log level names, ensuring the rest of the log message remains untouched and the original levelname is restored after formatting.
  • NO_COLOR Standard Support: The logging system now respects the NO_COLOR environment variable, allowing users or systems to disable colored output if preferred, while colors are enabled by default for better UX.
  • Reduced Verbosity: Internal logging from asyncio and urllib3 libraries is now suppressed by setting their logging levels to WARNING, leading to cleaner and more focused benchmark output.
  • New Dependency and Tests: The colorama library (version 0.4.6) has been added as a dependency for cross-platform color support, and a new test file (tests/unit/test_logging.py) provides 13 comprehensive unit tests for the new logging functionality, achieving 97% code coverage for the logging.py module.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces colored log output, which is a great enhancement for user experience. The implementation is well-done, featuring a ColoredFormatter that correctly applies colors to log levels and respects the NO_COLOR standard. The inclusion of colorama ensures cross-platform support. I'm particularly impressed with the comprehensive unit tests that cover various cases, ensuring the new functionality is reliable. I have one suggestion to make the logging setup more robust, but overall, this is an excellent contribution.

anandhu-eng and others added 2 commits November 18, 2025 14:05
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Collaborator

@arekay-nv arekay-nv left a comment


Please make this opt-in, so disabled by default. Would be great to have colored logs - thanks!

@arekay-nv arekay-nv requested a review from Copilot November 24, 2025 15:36
Copilot AI left a comment

Pull request overview

This PR adds colored log output to the benchmarking tool to improve readability. Log levels are now color-coded (INFO=green, WARNING=yellow, ERROR/CRITICAL=red) using colorama, with colors disabled by default and enabled via the FORCE_COLOR_LOGGING environment variable (a sketch of that gate follows this paragraph).
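
The review doesn't show the guard itself; assuming the FORCE_COLOR_LOGGING and NO_COLOR variables mentioned here, the opt-in check presumably resembles this (the function name is illustrative):

import os


def colors_enabled() -> bool:
    # Opt-in behavior: NO_COLOR always wins; otherwise colors require
    # FORCE_COLOR_LOGGING to be set explicitly.
    if os.environ.get("NO_COLOR"):
        return False
    return bool(os.environ.get("FORCE_COLOR_LOGGING"))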

Key Changes:

  • Extracted ColoredFormatter class for applying colors to log level names only
  • Added colorama==0.4.6 dependency for cross-platform color support
  • Comprehensive test suite with 13 tests achieving 97% code coverage

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

  • src/inference_endpoint/utils/logging.py: Implements the ColoredFormatter class and updates setup_logging() to support optional colored output via environment variable
  • requirements/base.txt: Adds the colorama dependency for terminal color support
  • tests/unit/test_logging.py: New test file with 13 tests covering color formatting, environment variable handling, and logger configuration


@arekay-nv arekay-nv merged commit 86d62dc into main Dec 2, 2025
4 checks passed
@github-actions github-actions Bot locked and limited conversation to collaborators Dec 2, 2025
@arekay-nv arekay-nv deleted the logging/color_log_levelname branch April 2, 2026 03:06