feat(log_bridge): add /rosout to faults bridge #422
Merged
2 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
| Changelog for package ros2_medkit_log_bridge | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| Forthcoming | ||
| ----------- | ||
| * Initial release: promote ``/rosout`` log entries (WARN/ERROR/FATAL) to | ||
|   FaultManager faults, attributed to the originating node via a per-source | ||
|   FaultReporter, with auto-generated stable fault codes. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,125 @@ | ||
| # Copyright 2026 mfaferek93, bburda | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| cmake_minimum_required(VERSION 3.8) | ||
| project(ros2_medkit_log_bridge) | ||
|
|
||
| set(CMAKE_CXX_STANDARD 17) | ||
| set(CMAKE_CXX_STANDARD_REQUIRED ON) | ||
| set(CMAKE_EXPORT_COMPILE_COMMANDS ON) | ||
|
|
||
| find_package(ros2_medkit_cmake REQUIRED) | ||
| include(ROS2MedkitCcache) | ||
| include(ROS2MedkitSanitizers) | ||
| include(ROS2MedkitLinting) | ||
| include(ROS2MedkitWarnings) | ||
|
|
||
| option(ENABLE_COVERAGE "Enable code coverage reporting" OFF) | ||
| if(ENABLE_COVERAGE) | ||
| message(STATUS "Code coverage enabled") | ||
| add_compile_options(--coverage -O0 -g) | ||
| add_link_options(--coverage) | ||
| endif() | ||
|
|
||
| find_package(ament_cmake REQUIRED) | ||
|
|
||
| include(ROS2MedkitCompat) | ||
|
|
||
| find_package(rclcpp REQUIRED) | ||
| find_package(rcl_interfaces REQUIRED) | ||
| find_package(ros2_medkit_msgs REQUIRED) | ||
| find_package(ros2_medkit_fault_reporter REQUIRED) | ||
|
|
||
| # Library target (for testing) | ||
| add_library(log_bridge_lib SHARED | ||
| src/log_bridge_node.cpp | ||
| ) | ||
|
|
||
| target_include_directories(log_bridge_lib PUBLIC | ||
| $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include> | ||
| $<INSTALL_INTERFACE:include> | ||
| ) | ||
|
|
||
| medkit_target_dependencies(log_bridge_lib | ||
| rclcpp | ||
| rcl_interfaces | ||
| ros2_medkit_msgs | ||
| ros2_medkit_fault_reporter | ||
| ) | ||
|
|
||
| # Executable | ||
| add_executable(log_bridge_node src/main.cpp) | ||
| target_link_libraries(log_bridge_node log_bridge_lib) | ||
| medkit_target_dependencies(log_bridge_node rclcpp) | ||
|
|
||
| install(TARGETS log_bridge_node | ||
| DESTINATION lib/${PROJECT_NAME} | ||
| ) | ||
|
|
||
| install(TARGETS log_bridge_lib | ||
| EXPORT export_${PROJECT_NAME} | ||
| ARCHIVE DESTINATION lib | ||
| LIBRARY DESTINATION lib | ||
| RUNTIME DESTINATION bin | ||
| ) | ||
|
|
||
| install(DIRECTORY include/ | ||
| DESTINATION include | ||
| ) | ||
|
|
||
| install(DIRECTORY launch config | ||
| DESTINATION share/${PROJECT_NAME} | ||
| ) | ||
|
|
||
| ament_export_targets(export_${PROJECT_NAME} HAS_LIBRARY_TARGET) | ||
| ament_export_dependencies(rclcpp rcl_interfaces ros2_medkit_msgs ros2_medkit_fault_reporter) | ||
|
|
||
| if(BUILD_TESTING) | ||
| find_package(ament_lint_auto REQUIRED) | ||
| find_package(ament_cmake_gtest REQUIRED) | ||
| find_package(launch_testing_ament_cmake REQUIRED) | ||
|
|
||
| set(ament_cmake_clang_format_CONFIG_FILE "${CMAKE_CURRENT_SOURCE_DIR}/../../.clang-format") | ||
| list(APPEND AMENT_LINT_AUTO_EXCLUDE ament_cmake_uncrustify ament_cmake_cpplint ament_cmake_clang_tidy) | ||
| ament_lint_auto_find_test_dependencies() | ||
|
|
||
| ros2_medkit_clang_tidy() | ||
|
|
||
| include(ROS2MedkitTestDomain) | ||
| medkit_init_test_domains(START 210 END 214) | ||
|
|
||
| ament_add_gtest(test_log_bridge test/test_log_bridge.cpp) | ||
|
|
||
| target_link_libraries(test_log_bridge log_bridge_lib) | ||
| medkit_target_dependencies(test_log_bridge rclcpp rcl_interfaces ros2_medkit_msgs) | ||
| medkit_set_test_domain(test_log_bridge) | ||
|
|
||
| if(ENABLE_COVERAGE) | ||
| target_compile_options(test_log_bridge PRIVATE --coverage -O0 -g) | ||
| target_link_options(test_log_bridge PRIVATE --coverage) | ||
| endif() | ||
|
|
||
| # Integration test (launch_testing): fault_manager + bridge + synthetic /rosout | ||
| install(DIRECTORY test | ||
| DESTINATION share/${PROJECT_NAME} | ||
| ) | ||
| add_launch_test( | ||
| test/test_integration.test.py | ||
| TARGET test_integration | ||
| TIMEOUT 90 | ||
| ) | ||
|
|
||
| ros2_medkit_relax_vendor_warnings() | ||
| endif() | ||
|
|
||
| ament_package() | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,101 @@ | ||
| # ros2_medkit_log_bridge | ||
|
|
||
| Drop-in bridge that promotes ROS 2 `/rosout` log entries to structured medkit | ||
| faults, attributing each fault to the node that logged it. No changes to the | ||
| user's nodes are required. | ||
|
|
||
| It is a compatibility adapter, in the same category as | ||
| `ros2_medkit_diagnostic_bridge`. Native `ros2_medkit_fault_reporter` | ||
| instrumentation stays the canonical path for code you control; this bridge is | ||
| the fallback for nodes that only log. | ||
|
|
||
| ## What it does | ||
|
|
||
| Subscribes to `/rosout` (`rcl_interfaces/msg/Log`) and forwards entries at or | ||
| above a severity floor to the FaultManager: | ||
|
|
||
| | Log level | medkit severity | | ||
| |-----------|-----------------| | ||
| | DEBUG (10) / INFO (20) | dropped | | ||
| | WARN (30) | `SEVERITY_WARN` | | ||
| | ERROR (40) | `SEVERITY_ERROR` | | ||
| | FATAL (50) | `SEVERITY_CRITICAL` | | ||
|
|
||
| - `source_id` of each fault is the originating node's fully-qualified name. It | ||
| is derived from `Log.name` by taking the first dotted segment (a `Log.name` | ||
| may carry a sub-logger suffix, e.g. `controller_manager.resource_manager`, and | ||
| node names cannot contain `.`) and prefixing `/`, giving e.g. | ||
| `/controller_manager`. The gateway discovers entities by node FQN, so this is | ||
| the form that lets a fault (and its snapshots / rosbag) associate with the | ||
| entity in the SOVD tree. Each node gets its own per-node `FaultReporter` and | ||
| therefore its own client-side debounce. | ||
| - `fault_code` is auto-generated as `<PREFIX>_<NODE>_<HASH>`. `<HASH>` is a fixed | ||
| FNV-1a 32-bit digest (8 lowercase hex) of a normalized message template | ||
| (numbers / hex / paths stripped, isolated single-letter tokens dropped) so the | ||
| same logical message maps to the same code across occurrences. `<NODE>` is | ||
| `source_id` in upper snake case. The 8-hex hash is never truncated; if the code | ||
| would exceed the 64-character cap, the node part is trimmed instead. | ||
|
|
||
| > Namespaced-node limitation: `Log.name` encodes a node's namespace with the same | ||
| > `.` separator as a sub-logger suffix, so the two are indistinguishable from the | ||
| > string alone. `source_id` takes the first dotted segment, which is right for a | ||
| > non-namespaced node with a sub-logger but collapses a namespaced node | ||
| > (`robot1.planner_server` -> `/robot1`) to its namespace, so all nodes in one | ||
| > namespace share one `source_id`. Multi-robot fleets typically isolate robots | ||
| > by `ROS_DOMAIN_ID` (one gateway per robot, federated by peer aggregation), which | ||
| > sidesteps this. | ||
|
|
||
| ## Forwarding, the LocalFilter, and confirmation | ||
|
|
||
| Two independent debounces sit between a log line and a confirmed fault: | ||
|
|
||
| 1. Per-node `FaultReporter` `LocalFilter` (client-side). WARN is held until | ||
| `default_threshold` (3) occurrences within `default_window_sec` (10s). | ||
| ERROR/FATAL have severity `>= bypass_severity` (2) and bypass the filter, | ||
| forwarding immediately. | ||
| 2. Bridge `report_cooldown_sec` cooldown, applied only to `ERROR`/`FATAL` (the | ||
| levels that bypass the LocalFilter). It forwards the first occurrence of a | ||
| `(fault_code, severity)` immediately and suppresses that same pair for | ||
| `report_cooldown_sec` (default 5s, `0.0` disables), bounding a flood. `WARN` | ||
| is never cooled here (that would starve its LocalFilter threshold counting), | ||
| and keying on severity means a `WARN` never suppresses a same-message `ERROR` | ||
| escalation. | ||
|
|
||
| Whether a forwarded fault then shows as `PREFAILED` (suspected) or `CONFIRMED` | ||
| is a separate, gateway-side decision driven by the FaultManager's | ||
| `confirmation_threshold` - not by this bridge and not by the client-side | ||
| LocalFilter. For visible-but-quiet WARNs, launch the FaultManager with a low | ||
| `confirmation_threshold` (or an entity threshold for `LOG_*` codes). | ||
|
|
||
| ## Hard limitations (by construction) | ||
|
|
||
| - Only sees logs that reach `/rosout` via rclcpp from a still-alive node. | ||
| Console-only loggers (e.g. some Micro XRCE-DDS / non-rclcpp loggers) are | ||
| invisible. | ||
| - A node that crashes hard may not flush its final log to `/rosout`, so the | ||
| terminating ERROR can be missed. Process-death detection belongs to a | ||
| separate liveliness bridge, not here. | ||
|
|
||
| ## Run it | ||
|
|
||
| ```bash | ||
| # next to an existing stack + the medkit gateway/fault_manager | ||
| ros2 launch ros2_medkit_log_bridge log_bridge.launch.py | ||
| ``` | ||
|
|
||
| ## Configuration (`config/log_bridge.yaml`) | ||
|
|
||
| | Param | Default | Meaning | | ||
| |-------|---------|---------| | ||
| | `rosout_topic` | `/rosout` | log topic to subscribe to | | ||
| | `severity_floor` | `30` (WARN) | minimum level promoted; raise to `40` on chatty / constrained targets. Clamped to `[0, 50]` at load (an out-of-range value is corrected with a warning) | | ||
| | `code_prefix` | `LOG` | prefix for generated fault codes; normalized to `[A-Z0-9_]` at load | | ||
| | `exclude_nodes` | `[]` | node-FQN substrings; nodes matching any of them are skipped | | ||
| | `include_only_nodes` | `[]` | if non-empty, only promote nodes whose FQN contains one of these substrings | | ||
| | `max_tracked_nodes` | `512` | cap on per-node reporters; least-recently-used nodes evicted past this | | ||
| | `report_cooldown_sec` | `5.0` | forward cooldown in seconds, keyed on `(fault_code, severity)` and applied to ERROR/FATAL only; `0.0` disables | | ||
|
|
||
| `exclude_nodes` / `include_only_nodes` match as **unanchored substrings** | ||
| against the node FQN: `planner` matches `/planner_server` and | ||
| `/robot1/planner_server`. Use a longer, more specific substring (e.g. | ||
| `/planner_server`) to avoid accidental matches. |
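The `source_id` and `fault_code` rules described in the README above can be sketched in Python. This is an illustration only, not the bridge's C++ implementation: the FNV-1a 32-bit constants are the standard ones, but the normalization regexes, the lowercasing, the `upper_snake` helper, and every function name below are assumptions, so codes produced by this sketch need not match the ones the bridge emits.

```python
import re

FNV_OFFSET_BASIS = 0x811C9DC5  # standard FNV-1a 32-bit offset basis
FNV_PRIME = 0x01000193         # standard FNV-1a 32-bit prime
MAX_CODE_LEN = 64              # fault_code cap stated in the README


def fnv1a_32(data: bytes) -> int:
    """FNV-1a, 32-bit: XOR each byte in, then multiply by the prime."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def source_id_from_log_name(log_name: str) -> str:
    """First dotted segment of Log.name, prefixed with '/'."""
    return "/" + log_name.split(".", 1)[0]


def normalize_template(message: str) -> str:
    """Strip the variable parts of a message (assumed rules, see lead-in)."""
    text = re.sub(r"0x[0-9a-fA-F]+", " ", message)  # hex literals
    text = re.sub(r"(?:/[\w.\-]+)+", " ", text)      # filesystem-like paths
    text = re.sub(r"\d+(?:\.\d+)?", " ", text)       # decimal numbers
    tokens = [
        t for t in text.lower().split()
        if not (len(t) == 1 and t.isalpha())         # isolated single letters
    ]
    return " ".join(tokens)


def upper_snake(source_id: str) -> str:
    """'/controller_manager' -> 'CONTROLLER_MANAGER' (hypothetical helper)."""
    return re.sub(r"[^A-Z0-9]+", "_", source_id.upper()).strip("_")


def fault_code(prefix: str, source_id: str, message: str) -> str:
    """Build <PREFIX>_<NODE>_<HASH>, trimming NODE (never HASH) at the cap."""
    digest = f"{fnv1a_32(normalize_template(message).encode('utf-8')):08x}"
    budget = MAX_CODE_LEN - len(prefix) - len(digest) - 2  # two underscores
    node = upper_snake(source_id)[:max(budget, 0)]
    return f"{prefix}_{node}_{digest}"
```

Under these assumed rules, `source_id_from_log_name("controller_manager.resource_manager")` gives `/controller_manager`, and two messages that differ only in their numbers (for example a retry count) hash to the same code. The same function also shows the namespaced-node limitation: `robot1.planner_server` yields `/robot1`.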
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| log_bridge: | ||
|   ros__parameters: | ||
|     # Topic carrying aggregated node logs. | ||
|     rosout_topic: "/rosout" | ||
|     # Minimum rcl_interfaces/msg/Log level promoted to a fault. | ||
|     # 10=DEBUG 20=INFO 30=WARN 40=ERROR 50=FATAL. Default WARN. | ||
|     # Raise to 40 (ERROR) on chatty / resource-constrained targets. | ||
|     severity_floor: 30 | ||
|     # Prefix for auto-generated fault codes (<PREFIX>_<NODE>_<HASH>). | ||
|     code_prefix: "LOG" | ||
|     # Originating-node FQN substrings to skip (e.g. noisy debug nodes). | ||
|     exclude_nodes: [] | ||
|     # If non-empty, ONLY promote logs from nodes matching these substrings. | ||
|     include_only_nodes: [] | ||
|     # Cap on per-node FaultReporters; least-recently-used nodes are evicted | ||
|     # past this to bound memory under transient-node churn. | ||
|     max_tracked_nodes: 512 | ||
|     # Per-fault_code forward debounce (seconds). First occurrence of a code is | ||
|     # forwarded immediately; the same code within the window is suppressed. | ||
|     # 0.0 disables. Tames ERROR/FATAL floods, which bypass the per-node filter. | ||
|     report_cooldown_sec: 5.0 |
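To make the effect of these parameters concrete, here is a minimal Python sketch of the gating the README describes: severity mapping with a clamped floor, substring node filtering, and the ERROR/FATAL cooldown keyed on `(fault_code, severity)`. It is not the bridge's C++ code. All names are hypothetical, and three behaviours are assumptions the source does not spell out: that `exclude_nodes` takes precedence over `include_only_nodes`, that levels below WARN are always dropped even with a lower floor, and that the cooldown window is measured from the last forwarded report.

```python
import time
from typing import Callable, Dict, Optional, Sequence, Tuple

WARN, ERROR, FATAL = 30, 40, 50  # rcl_interfaces/msg/Log levels

SEVERITY_MAP = {
    WARN: "SEVERITY_WARN",
    ERROR: "SEVERITY_ERROR",
    FATAL: "SEVERITY_CRITICAL",
}


def clamp_floor(value: int) -> int:
    """severity_floor is clamped to [0, 50] at load."""
    return max(0, min(50, value))


def map_severity(level: int, floor: int = WARN) -> Optional[str]:
    """Return the medkit severity name, or None if the entry is dropped."""
    if level < clamp_floor(floor):
        return None
    return SEVERITY_MAP.get(level)  # DEBUG/INFO have no mapping: dropped


def node_allowed(fqn: str, exclude: Sequence[str],
                 include_only: Sequence[str]) -> bool:
    """Unanchored substring match against the node FQN."""
    if any(s in fqn for s in exclude):  # assumed: exclude wins
        return False
    return not include_only or any(s in fqn for s in include_only)


class ReportCooldown:
    """Suppress repeats of one (fault_code, level) pair for cooldown_sec."""

    def __init__(self, cooldown_sec: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._cooldown = cooldown_sec
        self._clock = clock
        self._last: Dict[Tuple[str, int], float] = {}

    def should_forward(self, fault_code: str, level: int) -> bool:
        # WARN is never cooled here (the LocalFilter needs every
        # occurrence), and a cooldown of 0.0 disables the check.
        if level < ERROR or self._cooldown <= 0.0:
            return True
        key = (fault_code, level)
        now = self._clock()
        last = self._last.get(key)
        if last is not None and now - last < self._cooldown:
            return False
        self._last[key] = now  # assumed: window starts at last forward
        return True
```

With the default 5 s cooldown, a second ERROR for the same code inside the window is suppressed, while a FATAL for that code still goes through because the key includes the severity. Injecting `clock` keeps the class testable without sleeping.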