feat: structured JSONL logging and observability pipeline #115
Add LogContext (contextvars-based) and StructuredJsonFormatter that replaces flat text file logs with structured JSONL. Every log record now carries experiment_id, test_id, service_id, phase context automatically via contextvars; zero changes to call sites needed.
- New: log_context.py (LogContext dataclass + context manager)
- New: structured_formatter.py (JSONL logging.Formatter)
- Modified: logger_factory.py (structured file handler replaces text)
- Modified: global_config.py (removed debug_file_logging, added structured_log_file)
- Removed: debug_file_logging references from configs and experiment_manager
Wire LogContext into experiment lifecycle boundaries so that every log record and event emitted during a phase/test/service carries the relevant identifiers (experiment_id, test_id, service_id, phase).
Context propagation locations:
- ExperimentManager.initialize_experiments: experiment_id + phase
- ExperimentManager.run_tests: experiment_id + test_execution phase
- ExperimentManager.run_tests per-test loop: test_id
- ExperimentManager.cleanup: experiment_id + cleanup phase
- TestCase.run: test_id (via manual __enter__/__exit__)
- ServiceManagementMixin: service_id + phase for setup_testers, setup_implementations, prepare_services, teardown_services
New EventStreamRecorder observer:
- Appends every event to the same structured.jsonl used by StructuredJsonFormatter, interleaving events and log records chronologically in a single timeline file
- Uses "source": "event" and "level": "EVENT" to distinguish from logging records
- Reads LogContext for experiment/test/service/phase fields
- Thread-safe file writes via threading.Lock
- Registered as a global observer in ExperimentObserverMixin with low priority (10) so business observers run first
Also:
- Register EventStreamRecorder in observer factory as "event_stream"
- Add deprecation note to StorageObserver about separate JSONL files now being redundant with structured.jsonl (kept for backward compat)
- 12 new unit tests covering schema, context propagation, deduplication, thread safety, error handling
Streaming JSONL log query engine (LogQueryEngine, LogFilter) that reads structured.jsonl files without loading them fully into memory. Supports filtering by level, service, test, phase, time range, correlation ID, message regex, and source.
CLI commands:
- panther logs query <dir> --level/--service/--test/--phase/...
- panther logs errors <dir> (shorthand for --level ERROR,CRITICAL)
Both commands support --json (default JSONL) and --human (table) output. 35 unit tests cover all filter combinations, file discovery, malformed input handling, and the count() aggregation.
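The streaming approach can be sketched as a generator that parses one line at a time and applies a composable filter. `SimpleFilter` and `iter_entries` are hypothetical stand-ins covering only a subset of the fields listed above; they are not the PR's LogFilter or LogQueryEngine API.

```python
import json
import re
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Set


@dataclass
class SimpleFilter:
    """Hypothetical filter; a field left as None is not applied."""

    levels: Optional[Set[str]] = None
    service: Optional[str] = None
    pattern: Optional[re.Pattern] = None

    def matches(self, entry: dict) -> bool:
        if self.levels and entry.get("level") not in self.levels:
            return False
        if self.service and entry.get("service_id") != self.service:
            return False
        if self.pattern and not self.pattern.search(str(entry.get("message", ""))):
            return False
        return True


def iter_entries(lines: Iterable[str], flt: SimpleFilter) -> Iterator[dict]:
    """Yield matching entries lazily; blank and malformed lines are skipped."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partial or corrupted lines
        if isinstance(entry, dict) and flt.matches(entry):
            yield entry
```

An open file object is itself an iterable of lines, so `iter_entries(open(path, encoding="utf-8"), flt)` holds only one line in memory at a time.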
Create an incremental, thread-safe output index that tracks every file produced during an experiment and serializes it as output_index.json.
- New: panther/core/outputs/output_index.py with OutputIndexBuilder, FileEntry, and detect_type_and_format helper
- Integrate into ExperimentManager: init at startup, register known files (config, reports, metrics, structured log), flush at cleanup
- Extend OutputAggregator with optional output_index_builder param to register collected environment outputs in the manifest
- 35 unit tests covering serialization, auto-detection, dedup, thread safety, and aggregator integration
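As a rough illustration of an incremental, thread-safe index with deduplication and extension-based type detection, consider the sketch below. The class names, the `_FORMATS` table, and the JSON layout are assumptions for illustration; the actual OutputIndexBuilder schema is not shown in this conversation.

```python
import json
import threading
from dataclasses import asdict, dataclass
from pathlib import PurePosixPath

# Hypothetical extension table (assumption): suffix -> (type, format).
_FORMATS = {
    ".jsonl": ("log", "jsonl"),
    ".pcap": ("artifact", "pcap"),
    ".json": ("report", "json"),
}


@dataclass(frozen=True)
class Entry:
    path: str
    type: str
    format: str


class IndexBuilder:
    """Collects file entries under a lock; repeated paths are recorded once."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries = {}

    def register(self, path: str) -> Entry:
        suffix = PurePosixPath(path).suffix.lower()
        type_, fmt = _FORMATS.get(suffix, ("artifact", "unknown"))
        entry = Entry(path, type_, fmt)
        with self._lock:
            # setdefault keeps the first registration, which deduplicates by path.
            return self._entries.setdefault(path, entry)

    def to_json(self) -> str:
        with self._lock:
            files = [asdict(e) for e in self._entries.values()]
        return json.dumps({"files": files}, indent=2)
```

Keeping both `type` and `format` on each entry matters for filtering later: a pcap file would have type `artifact` and format `pcap` in this sketch.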
…mands
Add TimelineRenderer for chronological swimlane display of structured log entries, and ArtifactBrowser for listing/filtering experiment output files from output_index.json (with directory walk fallback). Register both as `panther report timeline` and `panther report artifacts` CLI subcommands with filtering by test, service, time range, phase, and artifact type.
…matching
Add declarative FailurePattern library (8 built-in patterns derived from ErrorCategory) and RootCauseAnalyzer that reads structured.jsonl via LogQueryEngine to produce ranked root-cause findings with log excerpts and actionable suggestions. Extend ExperimentReporter with optional RCA and artifact inventory sections in both JSON and Markdown reports (gracefully skipped when structured.jsonl is absent). Add `panther report diagnose` CLI subcommand with --json/--human output.
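To make the idea of declarative patterns with ranked findings concrete, here is a small sketch. The two patterns, their weights, and the scoring rule (sum of weights over matching messages) are invented for illustration; the PR's 8 built-in patterns and its scoring logic are not shown here.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Pattern:
    """Hypothetical declarative pattern: a regex, a weight, and a suggestion."""

    name: str
    regex: str
    weight: float
    suggestion: str


# Illustrative catalog only (assumption).
PATTERNS = (
    Pattern("timeout", r"timed? ?out", 1.0, "Increase the timeout or check service health."),
    Pattern("connection_refused", r"connection refused", 3.0, "Verify the service is listening."),
)


def rank_findings(
    messages: Sequence[str], patterns: Sequence[Pattern] = PATTERNS
) -> List[Tuple[str, float]]:
    """Score each pattern by summed weight over matching messages, highest first."""
    scores: Dict[str, float] = {}
    for msg in messages:
        for p in patterns:
            if re.search(p.regex, msg, re.IGNORECASE):
                scores[p.name] = scores.get(p.name, 0.0) + p.weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Keeping patterns as data rather than code means a new failure signature can be added without touching the analyzer.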
Sorry @ElNiak, your pull request is larger than the review limit of 150000 diff characters
Pull request overview
This PR adds an end-to-end structured logging and observability pipeline around a unified structured.jsonl stream, enabling post-hoc log querying, timeline rendering, artifact indexing/browsing, and root-cause analysis (RCA) in both reports and CLI workflows.
Changes:
- Introduces structured JSONL logging (StructuredJsonFormatter) and context propagation (LogContext + log_context) across experiment/test/service phases.
- Adds an event recorder (EventStreamRecorder) that appends emitted events into the same JSONL stream for unified timelines and RCA.
- Adds reporting + CLI tooling: log query engine (panther logs), timeline renderer, artifact browser backed by output_index.json, and RCA integration into experiment reports.
Reviewed changes
Copilot reviewed 36 out of 39 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| tests/unit/test_core/test_utils/test_structured_formatter.py | Unit tests for JSONL log formatting and context/error fields. |
| tests/unit/test_core/test_utils/test_log_context.py | Unit tests for contextvar-based log context propagation and thread isolation. |
| tests/unit/test_core/test_utils/__init__.py | Test package init (context only). |
| tests/unit/test_core/test_reporting/test_timeline_renderer.py | Unit tests for timeline sorting/filtering and human rendering. |
| tests/unit/test_core/test_reporting/test_root_cause_analyzer.py | Unit tests for failure pattern matching, RCA ranking, excerpts, serialization. |
| tests/unit/test_core/test_reporting/test_log_query_engine.py | Unit tests for JSONL discovery, filtering, and aggregations. |
| tests/unit/test_core/test_reporting/test_artifact_browser.py | Unit tests for index-based and fallback artifact browsing/filtering. |
| tests/unit/test_core/test_reporting/__init__.py | Test package init (context only). |
| tests/unit/test_core/test_outputs/test_output_index.py | Unit tests for output index builder, file typing, flushing, concurrency. |
| tests/unit/test_core/test_outputs/__init__.py | Test package init (context only). |
| tests/unit/test_core/test_event_stream_recorder.py | Unit tests for event-to-JSONL recording, dedup, concurrency behavior. |
| panther/core/utils/structured_formatter.py | Adds StructuredJsonFormatter JSONL formatter with LogContext injection. |
| panther/core/utils/logger_factory.py | Switches file logging to JSONL formatter; injects per-record feature metadata. |
| panther/core/utils/log_context.py | Adds contextvars-backed LogContext + log_context() context manager. |
| panther/core/test_cases/test_case_impl.py | Pushes test-level log_context(test_id=...) during test execution. |
| panther/core/test_cases/mixins/service_management.py | Adds service/phase context around service setup/prepare/teardown logging. |
| panther/core/reporting/log_query_engine.py | Adds streaming structured log reader + composable filtering. |
| panther/core/reporting/timeline_renderer.py | Adds timeline rendering (JSON + human swimlane) on top of the query engine. |
| panther/core/reporting/failure_patterns.py | Adds declarative failure pattern catalog and scoring logic. |
| panther/core/reporting/root_cause_analyzer.py | Adds RCA engine to match errors/events to patterns with excerpts. |
| panther/core/reporting/artifact_browser.py | Adds artifact browser backed by output_index.json with fallback directory walk. |
| panther/core/reporting/experiment_reporter.py | Integrates RCA + artifact inventory into JSON/Markdown reports; enables Jinja autoescape. |
| panther/core/reporting/__init__.py | Exposes new reporting APIs (query engine, timeline, RCA, artifact browser). |
| panther/core/outputs/output_index.py | Adds thread-safe OutputIndexBuilder + schema for output_index.json. |
| panther/core/outputs/output_aggregator.py | Optionally registers collected outputs into OutputIndexBuilder. |
| panther/core/outputs/__init__.py | Exports OutputIndexBuilder from outputs package. |
| panther/core/observer/impl/event_stream_recorder.py | Adds event recorder that appends events into structured.jsonl. |
| panther/core/observer/impl/__init__.py | Exports EventStreamRecorder. |
| panther/core/observer/factory.py | Registers event_stream observer type. |
| panther/core/observer/impl/storage_observer.py | Notes coexistence of legacy per-category JSONL vs new unified structured.jsonl. |
| panther/core/experiment_observer.py | Registers EventStreamRecorder for experiments. |
| panther/core/experiment_manager.py | Switches experiment logging to structured.jsonl, adds log contexts, builds and flushes the output index. |
| panther/config/core/models/global_config.py | Replaces debug_file_logging with structured_log_file option. |
| panther/cli/core/main.py | Registers new panther logs CLI command. |
| panther/cli/commands/logs.py | Adds CLI for querying/filtering structured logs and printing JSONL/human output. |
| panther/cli/commands/report.py | Adds report timeline, report artifacts, report diagnose subcommands. |
| experiment-config/base/experiment_config_example_minimal_docker_no_buildx.yaml | Removes deprecated debug_file_logging from examples. |
| experiment-config/base/experiment_config_example_minimal_docker.yaml | Removes deprecated debug_file_logging from examples. |
| experiment-config/base/experiment_config_example_minimal.yaml | Removes deprecated debug_file_logging from examples. |
| Thread safety: writes are protected by a threading.Lock to prevent | ||
| interleaving with concurrent log handler writes. | ||
| """ |
The thread-safety guarantee in the module docstring is not correct: this lock only serializes writes within EventStreamRecorder, but it does not coordinate with the logging system’s FileHandler writing to the same structured.jsonl. Concurrent event + log writes can still interleave and corrupt JSONL lines. Consider routing events through a shared logging handler (or sharing a single file handle/lock between the recorder and the structured file handler).
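One way to picture the "single file handle and lock" option is a small writer object that both the logging handler and the event recorder go through. All class names below are hypothetical and the recorder is reduced to a single method; this is a sketch of the coordination idea, not the PR's EventStreamRecorder.

```python
import json
import logging
import threading
from typing import TextIO


class SharedJsonlWriter:
    """Owns one stream and one lock; every producer writes whole lines through it."""

    def __init__(self, stream: TextIO) -> None:
        self._stream = stream
        self._lock = threading.Lock()

    def write_line(self, line: str) -> None:
        with self._lock:
            self._stream.write(line + "\n")
            self._stream.flush()


class SharedWriterHandler(logging.Handler):
    """Logging handler that delegates the actual write to the shared writer."""

    def __init__(self, writer: SharedJsonlWriter) -> None:
        super().__init__()
        self._writer = writer

    def emit(self, record: logging.LogRecord) -> None:
        try:
            self._writer.write_line(self.format(record))
        except Exception:
            self.handleError(record)


class EventRecorder:
    """Hypothetical recorder that appends events via the same writer."""

    def __init__(self, writer: SharedJsonlWriter) -> None:
        self._writer = writer

    def record(self, name: str) -> None:
        self._writer.write_line(
            json.dumps({"source": "event", "level": "EVENT", "name": name})
        )
```

With this arrangement the lock covers both producers, so a log line and an event line cannot be interleaved within the same process. Writes from separate processes would still need a different mechanism.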
| # Add structured JSONL file handler if output_file is configured | ||
| structured_path = cls._structured_log_file or cls._config.get("output_file") | ||
| if structured_path: | ||
| structured_formatter = StructuredJsonFormatter() | ||
| file_handler = logging.FileHandler(structured_path, mode="a") | ||
| file_handler.setLevel(logging.DEBUG) | ||
| file_handler.setFormatter(structured_formatter) | ||
| logger.addHandler(file_handler) |
get_logger() creates a new logging.FileHandler for every logger instance (even when they all point at the same structured log path). This can open many file descriptors and increases the risk of write contention/interleaving across handlers. Prefer using a shared/cached file handler (similar to _get_or_create_handler) or rely on propagation to a single root file handler.
| try: | ||
| return datetime.fromisoformat(value) | ||
| except ValueError: | ||
| pass | ||
| try: | ||
| from datetime import timezone | ||
| | ||
| t = datetime.strptime(value, "%H:%M:%S").time() | ||
| return datetime.combine( | ||
| datetime.now(timezone.utc).date(), t, tzinfo=timezone.utc | ||
| ) | ||
| except ValueError: | ||
| raise click.BadParameter(f"Cannot parse time: {value!r}") |
_parse_time() can return a naive datetime for ISO strings without timezone info (via datetime.fromisoformat). Those values will later be compared against timezone-aware ts values in LogQueryEngine, which raises TypeError (naive vs aware). Normalize parsed datetimes to a timezone (e.g., default to UTC when tzinfo is missing).
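The suggested normalization might be sketched as below, using only the standard library. `parse_time_utc` is a hypothetical name, and this version lets `ValueError` propagate where the original raises `click.BadParameter`.

```python
from datetime import datetime, timezone


def parse_time_utc(value: str) -> datetime:
    """Parse an ISO datetime or HH:MM:SS string and always return an aware datetime.

    Naive inputs are assumed to be UTC (an assumption; local time is another option).
    """
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        # Fall back to a bare time of day, anchored to today's date in UTC.
        t = datetime.strptime(value, "%H:%M:%S").time()
        parsed = datetime.combine(datetime.now(timezone.utc).date(), t)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed
```

Inputs that already carry an offset keep it, so comparisons against aware log timestamps remain well defined in every branch.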
| "--type", | ||
| "artifact_type", | ||
| default=None, | ||
| help="Filter by artifact type (pcap, qlog, log, etc.)", |
--type is documented as filtering by file formats like pcap/qlog, but ArtifactBrowser.list_artifacts() filters artifact_type against the entry’s type field (values like artifact, log, report, etc.), not format. This mismatch will confuse users and make the help text/examples misleading. Either update the help text to match type, or add a separate --format filter.
| help="Filter by artifact type (pcap, qlog, log, etc.)", | |
| help="Filter by artifact entry type (artifact, log, report, etc.)", |
| from typing import Dict, List, Optional | ||
| | ||
| from panther.core.outputs.output_index import ( | ||
| _EXTENSION_MAP, |
_EXTENSION_MAP is imported but never used in this module. Removing the unused import will avoid lint failures and keeps the dependency surface minimal.
| _EXTENSION_MAP, |
| # Push cleanup phase context (no with-block to avoid re-indenting entire method) | ||
| _cleanup_ctx = log_context(experiment_id=self.experiment_name, phase="cleanup") | ||
| _cleanup_ctx.__enter__() | ||
| self.logger.info("Starting experiment cleanup") | ||
| | ||
cleanup() manually calls log_context(...).__enter__() / __exit__() without a surrounding try/finally. If an unexpected exception or interrupt occurs before the final __exit__ call, the contextvar will not be reset and subsequent logs may carry the wrong experiment/phase context. Wrap the body in a with log_context(...): block or ensure __exit__ runs in a finally clause.
| structured_log_file: Optional[str] = Field( | ||
| None, | ||
| description="Custom path for structured JSONL log file (auto-set by ExperimentManager if None)", | ||
| ) |
The new structured_log_file config field is not currently used when configuring logging (e.g., ExperimentManager._load_logging() always sets output_file to <logs_dir>/structured.jsonl). As-is, setting logging.structured_log_file in config will have no effect. Either plumb this value into LoggerFactory/logging_config or drop the field to avoid a misleading config option.
| structured_log_file: Optional[str] = Field( | |
| None, | |
| description="Custom path for structured JSONL log file (auto-set by ExperimentManager if None)", | |
| ) |
| logging_config = { | ||
| "level": level_str, | ||
| "format": self.log_format, | ||
| "enable_colors": getattr(self.global_config.logging, "enable_colors", True), | ||
| "debug_file_logging": getattr( | ||
| self.global_config.logging, "debug_file_logging", True | ||
| ), | ||
| "output_file": str(self.logs_dir / "experiment.log"), | ||
| "output_file": str(self.logs_dir / "structured.jsonl"), | ||
| } |
_load_logging() ignores global_config.logging.structured_log_file and hardcodes the structured log destination to <logs_dir>/structured.jsonl. If the config model exposes a custom path, it should be respected here (or the option removed).
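Respecting the option could be as small as a fallback helper like this one. The function name is hypothetical, and resolving relative paths against the logs directory is an assumption about the desired behavior, not something stated in the PR.

```python
from pathlib import Path
from typing import Optional


def resolve_structured_log_path(logs_dir: Path, configured: Optional[str]) -> Path:
    """Prefer the configured path; otherwise fall back to <logs_dir>/structured.jsonl."""
    if configured:
        path = Path(configured).expanduser()
        # Assumption: relative paths are interpreted relative to the logs directory.
        return path if path.is_absolute() else logs_dir / path
    return logs_dir / "structured.jsonl"
```

The result would then feed the `"output_file"` entry of `logging_config`, for example `str(resolve_structured_log_path(self.logs_dir, self.global_config.logging.structured_log_file))`.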
| after_dt = datetime.fromisoformat(after) if after else None | ||
| before_dt = datetime.fromisoformat(before) if before else None | ||
| msg_pattern = re.compile(pattern) if pattern else None |
In _build_filter(), datetime.fromisoformat() will produce naive datetimes when the CLI user omits timezone info. LogQueryEngine compares those values with parsed log timestamps and will raise TypeError when mixing naive/aware datetimes. Consider defaulting naive inputs to UTC (or rejecting them with a clear error).
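The stricter alternative, rejecting naive input with a clear message, might look like this sketch. `parse_aware_iso` is a hypothetical helper; a CLI would likely convert the `ValueError` into a usage error.

```python
from datetime import datetime


def parse_aware_iso(value: str) -> datetime:
    """Parse an ISO 8601 timestamp and require an explicit UTC offset."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None or parsed.utcoffset() is None:
        raise ValueError(
            f"Timestamp {value!r} has no UTC offset; "
            "use a form such as 2024-01-01T10:00:00+00:00"
        )
    return parsed
```

Failing early at argument parsing gives the user an actionable message instead of a `TypeError` raised deep inside the comparison.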
| @click.option( | ||
| "--json", | ||
| "output_json", | ||
| is_flag=True, | ||
| default=False, | ||
| help="Output as JSON (default)", | ||
| ) | ||
| @click.option( | ||
| "--human", | ||
| "output_human", | ||
| is_flag=True, | ||
| default=False, | ||
| help="Output as human-readable swimlane", | ||
| ) | ||
| @handle_errors | ||
| @pass_context_and_setup_logging | ||
| def timeline( | ||
| _ctx, | ||
| experiment_dir, | ||
| test_id, | ||
| service_id, | ||
| after_str, | ||
| before_str, | ||
| limit, | ||
| output_json, | ||
| output_human, | ||
| ): | ||
| r"""Display a chronological timeline of log entries. | ||
| | ||
| Reads structured.jsonl files from the experiment directory, sorts | ||
| entries by timestamp, and groups them by service for swimlane display. | ||
| | ||
| \b | ||
| Examples: | ||
| panther report timeline outputs/2024-01-01/exp1 | ||
| panther report timeline outputs/2024-01-01/exp1 --human --service picoquic | ||
| panther report timeline outputs/2024-01-01/exp1 --after 10:00:00 --limit 20 | ||
| """ | ||
| from panther.core.reporting.timeline_renderer import TimelineRenderer | ||
| | ||
| after = _parse_time(after_str) if after_str else None | ||
| before = _parse_time(before_str) if before_str else None | ||
| | ||
| renderer = TimelineRenderer(Path(experiment_dir)) | ||
| | ||
| if output_human: | ||
| text = renderer.render_human( | ||
| test=test_id, | ||
| service=service_id, | ||
| after=after, | ||
| before=before, | ||
| limit=limit, | ||
| ) | ||
| click.echo(text) | ||
| else: | ||
| entries = renderer.render_json( |
The --json option is unused in the command logic: output format is determined solely by --human (else JSON is emitted). This is confusing for users and makes --json a no-op. Consider switching to a single --json/--human boolean option (as in panther logs) or using output_json in the conditional, and reject --json+--human being set together.
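Independent of the CLI framework, the flag handling can be reduced to one small function that makes the default explicit and rejects the conflicting combination. `resolve_output_format` is a hypothetical helper; inside a click command the `ValueError` could be replaced by `click.UsageError`.

```python
def resolve_output_format(output_json: bool, output_human: bool) -> str:
    """Return "json" or "human"; JSON is the default when neither flag is set."""
    if output_json and output_human:
        raise ValueError("--json and --human are mutually exclusive")
    return "human" if output_human else "json"
```

The command body would then branch on the returned value, so `--json` is actually consulted rather than silently ignored.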
Summary
Changes
- structured.jsonl output
- EventStreamRecorder for real-time event capture
- panther logs CLI commands for querying structured logs
- OutputIndexBuilder for experiment artifact manifests
Test plan
- pytest tests/ -n auto -m unit (verify no regressions)
- structured.jsonl output is generated
- panther logs CLI commands