📈 Add OTel configuration#10
Conversation
| @@ -1,4 +1,4 @@ | |||
| # This file is automatically @generated by Poetry 1.8.2 and should not be changed by hand. | |||
| # This file is automatically @generated by Poetry 1.8.4 and should not be changed by hand. | |||
There was a problem hiding this comment.
what is the benefit of moving from Poetry 1.8.2 to Poetry 1.8.4?
| api.app.disable_chat_ui = True | ||
| api.set_default_config_id('${CONFIG_ID}') | ||
|
|
||
| # Start the server using uvicorn |
There was a problem hiding this comment.
why replace the cli path with a direct uvicorn? Cli was used previously as composition over replacement was deemed preferable
There was a problem hiding this comment.
Direct uvicorn setups OpenTelemetry before starting the server, which is required, whereas the CLI starts the server immediately (line 155 in cli/init.py)
| service_version = os.getenv("OTEL_SERVICE_VERSION", "1.0.0") | ||
| environment = os.getenv("OTEL_ENVIRONMENT", "production") | ||
| otlp_endpoint = os.getenv("OTEL_EXPORTER_OTLP_ENDPOINT") | ||
| enable_console = os.getenv("OTEL_ENABLE_CONSOLE", "true").lower() == "true" |
There was a problem hiding this comment.
does the default value set on line 17 conflict with entrypoint.sh?
| export OTEL_ENVIRONMENT="${OTEL_ENVIRONMENT:-production}" | ||
| export OTEL_ENABLE_CONSOLE="${OTEL_ENABLE_CONSOLE:-false}" | ||
| export OTEL_EXPORTER_OTLP_INSECURE="${OTEL_EXPORTER_OTLP_INSECURE:-true}" | ||
| export OTEL_EXPORTER_OTLP_ENDPOINT="${OTEL_EXPORTER_OTLP_ENDPOINT:-http://jaeger:4317}" |
There was a problem hiding this comment.
this setup seems to be tightly coupled to http://jaeger:4317 and presumably assumed same-namespace deployment; is this always the case? if it isn't, should the user have some kind of control over this configuration?
There was a problem hiding this comment.
Yes, generally servers and tracing/metrics services are deployed in the same namespace
| print(f'❌ OpenTelemetry setup failed: {e}') | ||
| import traceback | ||
| traceback.print_exc() | ||
| sys.exit(1) |
There was a problem hiding this comment.
so if any error occurs during OTel setup, the server refuses to start? I'd say that observability should not break core functionality
There was a problem hiding this comment.
You're right - fixed this and added messages to verify setup
| from setup_otel import setup_opentelemetry | ||
| setup_opentelemetry() | ||
| print('✅ OpenTelemetry configured in server process') | ||
| except Exception as e: |
There was a problem hiding this comment.
perhaps this is a too broad exception handling?
There was a problem hiding this comment.
Added different exception handling according to the error
|
|
||
| RUN poetry install --no-ansi --extras="sdd jailbreak openai nvidia tracing" && \ | ||
| poetry run pip install "spacy>=3.4.4,<4.0.0" && \ | ||
| poetry run pip install \ |
There was a problem hiding this comment.
we should probably move all these deps inside pyproject.toml
I did not want to initially touch pyproject.toml, but if we are adding more or more deps, they should be managed via pyproject.toml
also I think some of the opentelemetry packages are already inside .toml
I think it might be best to place opentelemetry packages inside tracing
There was a problem hiding this comment.
I initially had them inside pyproject.toml but the NeMo team wants the server to be independent of opentelemetry API packages
| fi | ||
|
|
||
| # Setup OpenTelemetry | ||
| ENABLE_OTEL="${ENABLE_OTEL:-${AUTO_SETUP_OTEL:-false}}" |
There was a problem hiding this comment.
what is the advantage of having the dual variable fallback here?
There was a problem hiding this comment.
100% a typo - removed AUTO_SETUP_OTEL
| export OTEL_SERVICE_VERSION="${OTEL_SERVICE_VERSION:-1.0.0}" | ||
| export OTEL_ENVIRONMENT="${OTEL_ENVIRONMENT:-production}" | ||
| export OTEL_ENABLE_CONSOLE="${OTEL_ENABLE_CONSOLE:-false}" | ||
| export OTEL_EXPORTER_OTLP_INSECURE="${OTEL_EXPORTER_OTLP_INSECURE:-true}" |
There was a problem hiding this comment.
are there any scenarios when this can be secure?
There was a problem hiding this comment.
Yes, if we configure OpenTelemetry/Jaegar to use secure connections but that configuration is independent from the NeMo server (a user would have to specify it on their OTelCollector CR)
m-misiura
left a comment
There was a problem hiding this comment.
Thanks for the contribution! I think it could be beneficial to consider the following:
- tweak how the server starts up in case there are OTel errors
- add more informative logging so that debugging becomes easier
- adding some documentation / example deployment manifests to the PR
I left some additional comments, highlighting specific lines that could be worth revisiting :)
b7d6750 to
603bf72
Compare
0d1efdb to
3bd76e9
Compare
|
@m-misiura thanks for your review and comments ! i updated my PR with the following:
|
* start restructuring and add local build automation * refresh poetry lock * WIP new IA implementation * more restructure and script to auto-update cards * more work * improve * nit to stage * details * nit got to be staged * pull out use case page and add table * more improvements and extend update cards script * big improvements * polish intro a bit * small edit * incorporate devs feedback * remaining pages * more fixes * small edits * more updated and incorporate feedback * some edits * accidentally dropped file
…VIDIA-NeMo#1523) * chore(test): reduce default pytest log level from DEBUG to WARNING The DEBUG level was flooding test output with verbose nemoguardrails logs,making it difficult to read test results. Users can still enable verbose logging when needed via `--log-level=DEBUG`. * Updated test that relies on INFO-level log capture to explicitly set caplog level.
Switch base image from python:3.10 to python:3.12-slim for improved performance and smaller image size. Use --no-install-recommends for apt package installation and clean up apt cache to reduce final image size.
* Use temporary path as default value in AIperf tests * Revert type fix that's needed locally but breaks on the server
Add extensive test coverage for all model initialization paths and edge cases.Tests cover success scenarios, error handling, exception priority, mode filtering, and E2E integration through the full initialization chain.
… bot messages (NVIDIA-NeMo#1530) * feat(content_safety): add auto selected multilingual refusal bot message support Detect user input language and return refusal messages in the same language when content safety rails block unsafe content. Supports 9 languages: English, Spanish, Chinese, German, French, Hindi, Japanese, Arabic, and Thai.
…-NeMo#1542) * docs(streaming): update streaming configuration documentation feature: NVIDIA-NeMo#1538 Revise streaming docs to clarify usage of stream_async(), remove outdated global streaming config, and add CLI usage instructions. Explain output rails streaming requirements and deprecation of StreamingHandler. Improve examples and guidance for token usage tracking. * not sure about these 2
* docs: add multilingual refusal messages documentation Add documentation for the multilingual refusal messages feature in content safety rails, including: - Supported languages table with default messages - Basic and advanced configuration examples - Language detection behavior and accuracy benchmarks - Cold start behavior and caching details - Production considerations for container environments * docs: add installation guide for multilingual feature * remove configuration options
* docs: small update to test * fixing weird bug * updating overview page * updating use cases page * updating how it works section * updating supported llms page * updating getting started section * updating based on meeting discussion * adjusting rail types order * updating first configure rails docs * Update docs/about/rail-types.md Co-authored-by: Miyoung Choi <miyoungc@nvidia.com> Signed-off-by: alexahaushalter <alexahaushalter@hotmail.com> * updating first overall config files * updating guardrails config docs * finishing Core Configuration updates * updating actions section * updating Colang 2.0 getting started section * a few more updates * trying to get a Verified commit * trying to get a Verified commit attempt 2 --------- Signed-off-by: alexahaushalter <alexahaushalter@hotmail.com> Co-authored-by: Miyoung Choi <miyoungc@nvidia.com>
8ed88db to
c20ac11
Compare
* add frontmatter and card updates * more fixes and also the library mentions * fix capitalization * Update docs/run-rails/index.md Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * Update docs/run-rails/index.md Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * save --------- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
* more doc updates * rm duplicates * Update docs/about/overview.md Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> --------- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>
* continuing with colang - working with actions * updating defining flows page * working with variables page
* continuing with colang - working with actions * updating defining flows page * working with variables page * through CSL section * finished language reference section * finishing colang 2 section
…rivateAI) (NVIDIA-NeMo#1545) * Adding GLiNER Co-authored-by: Lipika Ramaswamy <lramaswamy@nvidia.com> * addressing comments pt1 * revamping GLiNER server * addressing comments pt2 * fixing path on integration test after restructure * addressing comments pt3 * rolling back changes to project, not needed * mocking request on test, rolling back vscode change committed by mistake * updating docs * addressing final comments * rollback vscode * updating licenses * making extract function non-blocking by removing async --------- Co-authored-by: Lipika Ramaswamy <lramaswamy@nvidia.com>
* Move mock LLMs into top-level benchmark dir, local content_safety under examples, and benchmark-specific tests * Initial checkin of validation script * Remove un-needed files under nemoguardrails/benchmark * Move unit-tests under benchamrk top-level dir * Update unit-tests with new code location * Add requirements to keep benchmark dependencies separate from Guardrails itself * Update server run script and Procfile with new file locations * Return np.array with size (1,) from mock function calls in tests * Remove langchain_nvidia_ai_endpoints from requirements, Guardrails already has this in the poetry env. It's not used in the mocks * Update README to match new file locations and include venv instructions * Cleanups to the README * README.md cleanup
…d align with cards and nav (NVIDIA-NeMo#1554) * another round of frontmatter and titles * some improvements for get started * update * polish content safety tutorial and add use case diagrams to resources * improve tutorials, add config reference based on the source code * incorporate feedback
…VIDIA-NeMo#1526) OpenAI reasoning models (o1, o3, gpt-5 series excluding gpt-5-chat) only support `temperature=1`. When NeMo Guardrails uses `.bind(temperature=0.001)` for deterministic tasks like self-check input/output, the API returns an error: ``` Unsupported value: 'temperature' does not support 0.001 with this model. Only the default (1) value is supported. ``` This happens because LangChain's `ChatOpenAI` handles temperature restrictions at initialization time (setting `temperature=None` for reasoning models), but `.bind()` bypasses this protection and passes the temperature directly to the API. - Added `_filter_params_for_openai_reasoning_models()` function that: - Detects reasoning models by name (o1*, o3*, gpt-5* excluding gpt-5-chat) - Removes the `temperature` parameter before binding for these models - Follows the same pattern as LangChain's `validate_temperature` validator
…ing guides (NVIDIA-NeMo#1563) * refactor(docs): move alignscore and safeguarding guides to advanced section * docs(langchain): update and reorganize integration docs * fix typo --------- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> Co-authored-by: Miyoung Choi <miyoungc@nvidia.com>
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> Co-authored-by: Miyoung Choi <miyoungc@nvidia.com>
* Add streaming to mock LLMs, TTFT and ITL to config files, rename LATENCY_* to E2E_LATENCY_* to distinguish between the two
* improve doc * nit * Update docs/configure-rails/overview.md Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * Update docs/about/overview.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * Update docs/configure-rails/actions/index.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * Update docs/configure-rails/overview.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * Update docs/configure-rails/guardrail-catalog.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * Update docs/user-guides/community/gliner.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * Update docs/user-guides/community/gliner.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * Update docs/user-guides/community/gliner.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * Update docs/configure-rails/guardrail-catalog.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * Update docs/configure-rails/guardrail-catalog.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * Update docs/user-guides/community/gliner.md Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> * resolve feedback * nit --------- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com>
* First pass over About NeMo Guardrails Library * Address feedback. Remove extra space, move diagrams into Resource section, fix README examples link * Fixed how-it-works top-evel card * Update docs/about/supported-llms.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Tim Gasser <200644301+tgasser-nv@users.noreply.github.com> * Update docs/about/supported-llms.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Tim Gasser <200644301+tgasser-nv@users.noreply.github.com> * Update docs/about/supported-llms.md Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Signed-off-by: Tim Gasser <200644301+tgasser-nv@users.noreply.github.com> * Add Integrate NeMo Guardrails Library into Your Application dropdown back to overview * Add accidentally deleted line back --------- Signed-off-by: Tim Gasser <200644301+tgasser-nv@users.noreply.github.com> Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Co-authored-by: Miyoung Choi <miyoungc@nvidia.com>
* docs(run-rails): improve documentation structure and clarity minor fix * docs(generation-options): reorder sections based on importance/usage * Apply suggestions from code review Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> --------- Signed-off-by: Miyoung Choi <miyoungc@nvidia.com> Co-authored-by: Miyoung Choi <miyoungc@nvidia.com>
) * refactor(streaming): simplify streaming support validation - Remove `streaming` and `streaming_supported` properties from RailsConfig - Add `StreamingNotSupportedError` exception for clearer error handling - Move streaming validation from config-time to runtime in LLMRails - Remove `main_llm_supports_streaming` flag and related logic - Update CLI to catch StreamingNotSupportedError instead of pre-checking - Simplify server API streaming logic to use stream_async directly - Configure LLM streaming only when stream_async is called - Remove redundant streaming warnings and fallback logic BREAKING CHANGE: `RailsConfig.streaming` and `RailsConfig.streaming_supported` properties have been removed. Streaming support is now validated at runtime when `stream_async()` is called. * test: update tests for streaming refactor - Remove RailsConfig.streaming references from all tests - Update streaming tests to use StreamingNotSupportedError - Remove tests for deprecated streaming_supported property - Remove tests for main_llm_supports_streaming flag - Update CLI tests to verify StreamingNotSupportedError handling - Simplify test fixtures to remove streaming config parameters - Update token usage tests to reflect stream_usage always enabled - Fix test mocks to align with new streaming validation approach
* add reasoning guardrail connector Signed-off-by: Makesh Sreedhar <makeshn@nvidia.com> * update reasoning guardrail prompt Signed-off-by: Makesh Sreedhar <makeshn@nvidia.com> * feat(content_safety): add reasoning_enabled to prompt ctx Include the `reasoning_enabled` flag from the content safety config in the context passed to both input and output content safety check prompts. This enables prompt templates to conditionally adjust behavior based on the reasoning setting. Removes redundant config exposure in LLMTaskManager's default context. * revert streamers.py change * test(content_safety): add Nemotron reasoning parser tests * Revert "revert streamers.py change" This reverts commit e916dd4. * fix: add type ignore for streamer import * minor updates to docs, example Signed-off-by: Makesh Sreedhar <makeshn@nvidia.com> --------- Signed-off-by: Makesh Sreedhar <makeshn@nvidia.com> Co-authored-by: Pouyanpi <13303554+Pouyanpi@users.noreply.github.com>
This PR configures NeMO to allow OpenTelemetry (OTel) to export traces to Jaegar. It adds
scripts/setup_otel.pyto configure OTel and updatesscripts/entrypoint.shto access this script and setup OTel before starting NeMo server.