
feat: merge metrics into structured.jsonl, add panther logs metrics CLI #112

Open
ElNiak wants to merge 60 commits into production from feature/metrics-to-structured-jsonl

Conversation

Owner

@ElNiak ElNiak commented Mar 16, 2026

Summary

  • Metrics to structured.jsonl: MetricsCollector now writes each metric as a JSONL entry (source="metrics") directly to structured.jsonl instead of to a separate metrics/ directory with CSV/JSON files
  • panther logs metrics CLI: New subcommand with --name and --type filters for querying metrics
  • Cleanup: Removed metrics/ directory creation, CSV/JSON export from experiment cleanup, stale empty outputs/metrics/ directory
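
As a rough illustration of the first two bullets, the sketch below writes metrics as JSONL lines tagged `source="metrics"` and filters them by name and type. Only the `source` value comes from the description above; the other field names (`ts`, `name`, `type`, `value`) and both helper functions are hypothetical and are not the actual MetricsCollector or CLI code.

```python
import io
import json
from datetime import datetime, timezone


def write_metric(fh, name, metric_type, value):
    """Append one metric as a single JSON line (field names are assumptions)."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "source": "metrics",  # the only field value confirmed by the PR text
        "name": name,
        "type": metric_type,
        "value": value,
    }
    fh.write(json.dumps(record) + "\n")


def query_metrics(lines, name=None, metric_type=None):
    """Yield metric records, optionally filtered in the spirit of --name / --type."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("source") != "metrics":
            continue  # log and event records share the same file
        if name is not None and record.get("name") != name:
            continue
        if metric_type is not None and record.get("type") != metric_type:
            continue
        yield record


buf = io.StringIO()
write_metric(buf, "cpu_percent", "gauge", 12.5)
write_metric(buf, "packets_sent", "counter", 42)
buf.write(json.dumps({"source": "logging", "message": "hello"}) + "\n")

gauges = list(query_metrics(buf.getvalue().splitlines(), metric_type="gauge"))
all_metrics = list(query_metrics(buf.getvalue().splitlines()))
```

Because metrics share the file with log and event records, the filter on `source` is what separates them at query time.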

Builds on PR #110 (structured JSONL logging) and PR #111 (console UX).

Test plan

  • 8 new unit tests (metrics JSONL writing + query filters)
  • All pre-commit hooks pass
  • Run experiment and verify structured.jsonl contains metric entries
  • Verify metrics/ directory no longer created
  • Test panther logs metrics CLI on real output

Copilot AI review requested due to automatic review settings March 16, 2026 14:29
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Sorry, we are unable to review this pull request

The GitHub API does not allow us to fetch diffs exceeding 20000 lines

@github-actions

PR Validation Passed

Your changes look good! The quick validation checks have passed:

  • ✅ Code formatting (Black)
  • ✅ Import sorting (isort)
  • ✅ Linting (flake8)
  • ✅ Basic tests

The full CI pipeline will run when this PR is merged or when targeting the main branches.

Contributor

Copilot AI left a comment

Pull request overview

This PR consolidates metrics/event/log outputs into a single structured.jsonl stream and introduces new CLI and reporting utilities to query and summarize those structured records.

Changes:

  • Reworks logging to use a JSONL structured file handler and adds context propagation via LogContext.
  • Adds post-hoc querying/reporting tooling (panther logs, RCA, timeline rendering, artifact browsing, output index).
  • Refactors parts of the event/observer system to use factory methods and broader BaseEvent handler signatures.

Reviewed changes

Copilot reviewed 76 out of 129 changed files in this pull request and generated 12 comments.

Show a summary per file

| File | Description |
| --- | --- |
| panther/core/utils/logger_factory.py | Switches file logging to structured JSONL and adds feature/context injection + console-level setter. |
| panther/core/utils/log_context.py | Introduces contextvars-backed context propagation for structured logs/events. |
| panther/core/utils/feature_logger_mixin.py | Removes mixin and keeps a lightweight feature-logger accessor. |
| panther/core/utils/console_formatter.py | Adds a context-aware console formatter and banner helper. |
| panther/core/utils/__init__.py | Updates utils public API exports for new logging utilities. |
| panther/core/test_cases/test_case_impl.py | Pushes test_id log context during test execution. |
| panther/core/test_cases/mixins/service_management.py | Adds per-service/per-phase log context blocks during service lifecycle. |
| panther/core/reporting/timeline_renderer.py | Adds timeline rendering over structured logs. |
| panther/core/reporting/root_cause_analyzer.py | Adds root-cause analysis over structured logs using failure patterns. |
| panther/core/reporting/log_query_engine.py | Implements streaming JSONL querying with composable filters (incl. metrics). |
| panther/core/reporting/failure_patterns.py | Defines built-in failure pattern catalog for RCA. |
| panther/core/reporting/experiment_reporter.py | Integrates RCA + artifact inventory into generated reports. |
| panther/core/reporting/artifact_browser.py | Adds artifact discovery via output index or directory walk. |
| panther/core/reporting/__init__.py | Exposes new reporting utilities in the package API. |
| panther/core/outputs/output_index.py | Adds a thread-safe output manifest builder (output_index.json). |
| panther/core/outputs/output_aggregator.py | Registers collected outputs into the output index when provided. |
| panther/core/outputs/__init__.py | Exports OutputIndexBuilder from outputs package. |
| panther/core/observer/workflow/__init__.py | Docstring normalization only. |
| panther/core/observer/impl/storage_observer.py | Updates typed handlers to accept BaseEvent and adjusts imports/docs. |
| panther/core/observer/impl/state_observer.py | Simplifies imports and broadens handler signatures to BaseEvent. |
| panther/core/observer/impl/metrics_observer.py | Broadens some typed handlers to BaseEvent and simplifies imports. |
| panther/core/observer/impl/logger_observer.py | Broadens error handlers to BaseEvent and simplifies imports. |
| panther/core/observer/impl/experiment_observer.py | Replaces class-based dispatch with (entity_type, event_name) dispatch. |
| panther/core/observer/impl/event_stream_recorder.py | Adds observer that appends all events into structured.jsonl. |
| panther/core/observer/impl/command_audit_observer.py | Broadens handler signatures and changes supported event type registration. |
| panther/core/observer/impl/__init__.py | Exports EventStreamRecorder. |
| panther/core/observer/factory.py | Registers event_stream observer type. |
| panther/core/metrics/resource_monitor.py | Docstring normalization only. |
| panther/core/metrics/metric_types.py | Introduces new metric dataclasses. |
| panther/core/metrics/enums.py | Docstring normalization only. |
| panther/core/experiment_observer.py | Registers EventStreamRecorder for experiments. |
| panther/core/events/test/emitter.py | Refactors test emitter to use TestEvent factory methods instead of subclasses. |
| panther/core/events/test/__init__.py | Reduces exports to the new factory-based API. |
| panther/core/events/step/events.py | Refactors step events to factory methods and removes most subclasses. |
| panther/core/events/step/emitter.py | Updates step emitter to use StepEvent factories. |
| panther/core/events/step/__init__.py | Reduces exports to the new factory-based API. |
| panther/core/events/service/emitter.py | Refactors service emitter to use ServiceEvent factory methods. |
| panther/core/events/service/__init__.py | Reduces exports to the new factory-based API. |
| panther/core/events/plugin/emitter.py | Refactors plugin emitter to use PluginEvent factory methods. |
| panther/core/events/plugin/__init__.py | Reduces exports to the new factory-based API. |
| panther/core/events/experiment/events.py | Refactors experiment events to factory methods; retains a compatibility subclass. |
| panther/core/events/experiment/emitter.py | Updates experiment emitter to use ExperimentEvent factories. |
| panther/core/events/experiment/__init__.py | Reduces exports to the new factory-based API. |
| panther/core/events/environment/emitter.py | Refactors environment emitter to use EnvironmentEvent factory methods. |
| panther/core/events/environment/__init__.py | Reduces exports to the new factory-based API. |
| panther/core/events/assertion/events.py | Refactors assertion events to factory methods and removes subclasses. |
| panther/core/events/assertion/emitter.py | Updates assertion emitter to use AssertionEvent factories. |
| panther/core/events/assertion/__init__.py | Reduces exports to the new factory-based API. |
| panther/core/events/__init__.py | Shrinks public events API surface to factory-based event classes. |
| panther/core/docker_builder/plugin_mixin/service_manager_docker_mixin.py | Adds user-visible CLI output for cached images. |
| panther/core/docker_builder/platform_detection.py | Adds platform/buildx/build-mode detection and validation helpers. |
| panther/core/command_processor/mixins/modification_mixin.py | Removes command modification mixin. |
| panther/core/command_processor/mixins/__init__.py | Updates mixin exports after removing modification mixin. |
| panther/core/command_processor/__init__.py | Updates docs/exports after removing modification mixin. |
| panther/config/core/validators/__init__.py | Replaces lazy __getattr__ exports with eager imports. |
| panther/config/core/models/global_config.py | Replaces debug_file_logging with structured_log_file config field. |
| panther/cli/core/main.py | Registers new CLI command groups (logs/report/debug/build-metrics). |
| panther/cli/commands/run.py | Adds --debug and passes computed console level into experiment runner. |
| panther/cli/commands/logs.py | Adds panther logs query/errors/metrics subcommands over structured logs. |
| panther/cli/commands/debug.py | Adds developer introspection CLI for events/observers. |
| panther/cli/commands/build_metrics.py | Adds CLI to view/export build/test metrics stored in .panther-metrics. |
| experiment-config/base/experiment_config_example_minimal_docker_no_buildx.yaml | Removes debug_file_logging example field. |
| experiment-config/base/experiment_config_example_minimal_docker.yaml | Removes debug_file_logging example field. |
| experiment-config/base/experiment_config_example_minimal.yaml | Removes debug_file_logging example field. |


Comment on lines +196 to +198

    # Push log context for this test; will be popped in the finally block
    _ctx_token = log_context(test_id=self.test_name)
    _ctx_token.__enter__()

Comment on lines +443 to +444

    # Pop log context for this test
    _ctx_token.__exit__(None, None, None)

Comment on lines +132 to +133

    record.context_tag = _build_context_tag()  # type: ignore[attr-defined]
    record.short_module = _short_module(record.module)  # type: ignore[attr-defined]

Comment on lines +81 to +83

    with self._lock:
        with open(self.output_path, "a", encoding="utf-8") as fh:
            fh.write(line + "\n")

Comment on lines 1 to 5

    """Storage Observer Module."""

    from typing import Any, Dict, List, Optional

    """

Comment thread panther/core/utils/logger_factory.py Outdated

    file_level = logging.DEBUG if debug_file_logging else console_level

    # Structured JSONL file handler (replaces old text file handler)
    structured_path = cls._structured_log_file or cls._config.get("output_file")

    before: Include records with ``ts`` strictly before this datetime.
    correlation_id: Match records with this ``correlation_id``.
    message_pattern: Regex applied to the ``message`` field.
    sources: Filter by ``source`` (``"logging"`` or ``"event"``).

Comment on lines +170 to +184

    error_records = list(self._engine.query(error_filter))

    # Also collect event-source records with error-related types
    event_filter = LogFilter(sources={"event"})
    for record in self._engine.query(event_filter):
        message = record.get("message", "")
        event_type = record.get("event_type", "")
        # Include events that look error-related
        if any(
            kw in (message + " " + event_type).lower()
            for kw in ("error", "fail", "crash", "timeout", "killed")
        ):
            # Avoid duplicates
            if record not in error_records:
                error_records.append(record)

Comment on lines +544 to +547

    return (self.experiment_dir / "structured.jsonl").is_file() or any(
        self.experiment_dir.rglob("structured.jsonl")
    )

        TestTeardownStartedEvent,
    )

    __all__ = [

@github-actions

PR Validation Failed

Your changes have validation issues. Please check the following:

  1. Code Formatting: Run black tests/ panther/ to fix formatting
  2. Import Sorting: Run isort tests/ panther/ to fix imports
  3. Linting: Run flake8 tests/ panther/ --max-line-length=88 to check for issues
  4. Tests: Ensure all tests pass with pytest tests/unit/ -n auto -m "not slow" -p no:panther_metrics

You can run the quick validation locally with:

# Install dependencies
pip install -e .
pip install pytest black isort flake8

# Run validation
black --check tests/ panther/
isort --check-only tests/ panther/
flake8 tests/ panther/ --max-line-length=88
pytest tests/unit/ -n auto -m "not slow" -p no:panther_metrics

Please fix these issues and push again. The CI will automatically re-run.

@github-actions

PR Validation Passed

Your changes look good! The quick validation checks have passed:

  • ✅ Code formatting (Black)
  • ✅ Import sorting (isort)
  • ✅ Linting (flake8)
  • ✅ Basic tests

The full CI pipeline will run when this PR is merged or when targeting the main branches.

10 similar comments

ElNiak added 14 commits March 27, 2026 12:39
…d classes

Track 1 - God class splits:
- Split DockerBuilder into platform_detection, buildx_operations, image_operations
- Split MetricsCollector into metric_types, metric_analysis
- Split ExperimentManager into experiment_phases, experiment_cleanup

Track 2 - Event/observer simplification:
- Replace ~130 event subclasses with factory classmethods on 8 base classes
- Simplify ITypedObserver from 881 to ~80 lines with dynamic dispatch
- Remove all backward-compat alias functions from events
- Clean up events/__init__.py from 404 to ~100 lines
- Update all emitters to use factory methods directly
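
To make the Track 2 change concrete, here is a minimal sketch of the pattern it describes: one event class with factory classmethods in place of per-event subclasses, and an observer that dispatches on an (entity_type, event_name) pair. The class body, field names, and `RecordingObserver` are illustrative assumptions, not the code in this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class StepEvent:
    """Single event class; factories replace subclasses such as a 'StepStarted'."""

    entity_type: str
    name: str
    payload: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def started(cls, step_id: str) -> "StepEvent":
        return cls("step", "started", {"step_id": step_id})

    @classmethod
    def failed(cls, step_id: str, reason: str) -> "StepEvent":
        return cls("step", "failed", {"step_id": step_id, "reason": reason})


class RecordingObserver:
    """Looks handlers up by (entity_type, name) rather than by event class."""

    def __init__(self) -> None:
        self.seen: List[Tuple[str, str]] = []
        self._handlers = {
            ("step", "started"): self._record,
            ("step", "failed"): self._record,
        }

    def on_event(self, event: StepEvent) -> bool:
        handler = self._handlers.get((event.entity_type, event.name))
        if handler is None:
            return False  # no handler registered for this event
        handler(event)
        return True

    def _record(self, event: StepEvent) -> None:
        self.seen.append((event.name, event.payload["step_id"]))


observer = RecordingObserver()
handled = [
    observer.on_event(StepEvent.started("s1")),
    observer.on_event(StepEvent.failed("s1", "timeout")),
    observer.on_event(StepEvent("step", "skipped", {"step_id": "s2"})),
]
```

With this shape, adding an event means adding a classmethod and a dictionary entry instead of a new subclass plus a new typed handler.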

Track 3 - Validator cleanup:
- Remove dead validator exports from pydantic_factories.py
- Simplify validators/__init__.py (remove __getattr__ machinery)

Net reduction: ~5,600 lines across 40 files. 870 tests passing.
…, and mixin inlining

Wire hidden core modules to CLI (report, build-metrics, debug commands),
delete orphaned tools and unused mixins, inline single-use mixins into
their consumers, replace empty protocol placeholders with intermediate
base classes (ClientServerProtocolBase, PeerToPeerProtocolBase).

Net: -1,746 LOC, 13 files deleted, 5 created, 10 new CLI commands.
Add LogContext (contextvars-based) and StructuredJsonFormatter that
replaces flat text file logs with structured JSONL. Every log record
now carries experiment_id, test_id, service_id, phase context
automatically via contextvars — zero changes to call sites needed.

- New: log_context.py (LogContext dataclass + context manager)
- New: structured_formatter.py (JSONL logging.Formatter)
- Modified: logger_factory.py (structured file handler replaces text)
- Modified: global_config.py (removed debug_file_logging, added structured_log_file)
- Removed: debug_file_logging references from configs and experiment_manager
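
The idea of contextvars-based propagation can be sketched as follows. This is an assumption-laden illustration: `_log_ctx`, the `log_context` signature, and `ContextFilter` are hypothetical names chosen for the example, and the real log_context.py may be structured differently.

```python
import contextvars
import logging
from contextlib import contextmanager

# Holds the current context fields; the stored dict is replaced, never mutated.
_log_ctx = contextvars.ContextVar("log_ctx", default={})


@contextmanager
def log_context(**fields):
    """Layer extra fields over the current context and restore it on exit."""
    token = _log_ctx.set({**_log_ctx.get(), **fields})
    try:
        yield
    finally:
        _log_ctx.reset(token)


class ContextFilter(logging.Filter):
    """Copy the active context onto every record, so call sites stay unchanged."""

    def filter(self, record):
        for key, value in _log_ctx.get().items():
            setattr(record, key, value)
        return True


record = logging.makeLogRecord({"msg": "service started"})
with log_context(experiment_id="exp-1"):
    with log_context(test_id="t-42"):
        ContextFilter().filter(record)
        inner = dict(_log_ctx.get())
    outer = dict(_log_ctx.get())
after = dict(_log_ctx.get())
```

Using `with` (or `contextlib.ExitStack` when the scope spans a long function) keeps the reset paired with the set even if an exception is raised in between.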
Wire LogContext into experiment lifecycle boundaries so that every log
record and event emitted during a phase/test/service carries the
relevant identifiers (experiment_id, test_id, service_id, phase).

Context propagation locations:
- ExperimentManager.initialize_experiments: experiment_id + phase
- ExperimentManager.run_tests: experiment_id + test_execution phase
- ExperimentManager.run_tests per-test loop: test_id
- ExperimentManager.cleanup: experiment_id + cleanup phase
- TestCase.run: test_id (via manual __enter__/__exit__)
- ServiceManagementMixin: service_id + phase for setup_testers,
  setup_implementations, prepare_services, teardown_services

New EventStreamRecorder observer:
- Appends every event to the same structured.jsonl used by
  StructuredJsonFormatter, interleaving events and log records
  chronologically in a single timeline file
- Uses "source": "event" and "level": "EVENT" to distinguish
  from logging records
- Reads LogContext for experiment/test/service/phase fields
- Thread-safe file writes via threading.Lock
- Registered as a global observer in ExperimentObserverMixin
  with low priority (10) so business observers run first
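
A minimal sketch of such a recorder, assuming a plain callback interface: the `"source": "event"` and `"level": "EVENT"` values and the lock-guarded append come from the description above, while the `EventRecorder` class, its `on_event` signature, and the remaining field names are hypothetical.

```python
import json
import os
import tempfile
import threading


class EventRecorder:
    """Hypothetical stand-in for an observer that appends events as JSONL."""

    def __init__(self, path):
        self._path = path
        self._lock = threading.Lock()

    def on_event(self, event_type, message, **context):
        record = {
            "source": "event",  # distinguishes events from logging records
            "level": "EVENT",
            "event_type": event_type,
            "message": message,
            **context,
        }
        line = json.dumps(record)
        with self._lock:  # one writer at a time, so lines never interleave
            with open(self._path, "a", encoding="utf-8") as fh:
                fh.write(line + "\n")


path = os.path.join(tempfile.mkdtemp(), "structured.jsonl")
recorder = EventRecorder(path)
threads = [
    threading.Thread(
        target=recorder.on_event,
        args=("test.started", f"test {i} started"),
        kwargs={"test_id": f"t{i}"},
    )
    for i in range(20)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

with open(path, encoding="utf-8") as fh:
    records = [json.loads(line) for line in fh]
```

The lock only protects writers that share this one object; other components writing to the same file need to share the same lock or writer.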

Also:
- Register EventStreamRecorder in observer factory as "event_stream"
- Add deprecation note to StorageObserver about separate JSONL files
  now being redundant with structured.jsonl (kept for backward compat)
- 12 new unit tests covering schema, context propagation,
  deduplication, thread safety, error handling
Streaming JSONL log query engine (LogQueryEngine, LogFilter) that reads
structured.jsonl files without loading them fully into memory.  Supports
filtering by level, service, test, phase, time range, correlation ID,
message regex, and source.

CLI commands:
  panther logs query <dir> --level/--service/--test/--phase/...
  panther logs errors <dir>  (shorthand for --level ERROR,CRITICAL)

Both commands support --json (default JSONL) and --human (table) output.
35 unit tests cover all filter combinations, file discovery, malformed
input handling, and the count() aggregation.
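
The streaming approach can be approximated with a generator that parses one line at a time and applies a composable filter. `SimpleFilter` and `query` below are hypothetical simplifications, not the actual LogFilter or LogQueryEngine API; only the `level`, `source`, and `message` field names are taken from this PR.

```python
import json
import re
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Set


@dataclass
class SimpleFilter:
    """Each criterion is optional; a record must satisfy all that are set."""

    levels: Optional[Set[str]] = None
    sources: Optional[Set[str]] = None
    message_pattern: Optional[str] = None

    def matches(self, record: dict) -> bool:
        if self.levels and record.get("level") not in self.levels:
            return False
        if self.sources and record.get("source") not in self.sources:
            return False
        if self.message_pattern and not re.search(
            self.message_pattern, record.get("message", "")
        ):
            return False
        return True


def query(lines: Iterable[str], flt: SimpleFilter) -> Iterator[dict]:
    """Lazily yield matching records; malformed lines are skipped."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and flt.matches(record):
            yield record


lines = [
    '{"level": "ERROR", "source": "logging", "message": "connection refused"}',
    '{"level": "INFO", "source": "logging", "message": "service started"}',
    "not json at all",
    '{"level": "EVENT", "source": "event", "message": "test.failed"}',
]
errors = list(query(lines, SimpleFilter(levels={"ERROR", "CRITICAL"})))
events = list(query(lines, SimpleFilter(sources={"event"}, message_pattern=r"fail")))
```

Passing an open file object as `lines` keeps memory use proportional to a single line, since file iteration is itself lazy.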
Create an incremental, thread-safe output index that tracks every file
produced during an experiment and serializes it as output_index.json.

- New: panther/core/outputs/output_index.py with OutputIndexBuilder,
  FileEntry, and detect_type_and_format helper
- Integrate into ExperimentManager: init at startup, register known
  files (config, reports, metrics, structured log), flush at cleanup
- Extend OutputAggregator with optional output_index_builder param to
  register collected environment outputs in the manifest
- 35 unit tests covering serialization, auto-detection, dedup,
  thread safety, and aggregator integration
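
For orientation, a thread-safe, deduplicating index could look roughly like this. `IndexBuilder`, its methods, and the manifest layout are assumptions made for the example; the real OutputIndexBuilder and FileEntry may expose a different interface and schema.

```python
import json
import threading
from pathlib import Path


class IndexBuilder:
    """Hypothetical incremental manifest: register files, then serialize once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def register(self, path, file_type="unknown"):
        key = Path(path).as_posix()
        with self._lock:
            # setdefault keeps the first registration and ignores duplicates
            self._entries.setdefault(key, {"path": key, "type": file_type})

    def to_json(self):
        with self._lock:
            files = sorted(self._entries.values(), key=lambda e: e["path"])
        return json.dumps({"files": files}, indent=2)


index = IndexBuilder()
index.register("logs/structured.jsonl", "log")
index.register("logs/structured.jsonl", "log")  # duplicate, ignored
index.register("reports/summary.md", "report")
manifest = json.loads(index.to_json())
```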
…mands

Add TimelineRenderer for chronological swimlane display of structured
log entries, and ArtifactBrowser for listing/filtering experiment output
files from output_index.json (with directory walk fallback).

Register both as `panther report timeline` and `panther report artifacts`
CLI subcommands with filtering by test, service, time range, phase, and
artifact type.
…matching

Add declarative FailurePattern library (8 built-in patterns derived from
ErrorCategory) and RootCauseAnalyzer that reads structured.jsonl via
LogQueryEngine to produce ranked root-cause findings with log excerpts
and actionable suggestions.
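
A declarative pattern catalog with frequency-based ranking might be sketched like this. The two patterns, their regexes, and the ranking rule are invented for illustration; the 8 built-in patterns and the actual scoring in RootCauseAnalyzer are not shown in this conversation.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    regex: str
    suggestion: str


# Hypothetical catalog entries, not the built-in patterns from this PR.
PATTERNS = [
    Pattern("timeout", r"timed? ?out", "Increase the timeout or check readiness."),
    Pattern("connection_refused", r"connection refused", "Check the peer is listening."),
]


def diagnose(records):
    """Rank patterns by how many records they match, most frequent first."""
    hits = Counter()
    for record in records:
        message = record.get("message", "")
        for pattern in PATTERNS:
            if re.search(pattern.regex, message, re.IGNORECASE):
                hits[pattern.name] += 1
    return hits.most_common()


findings = diagnose(
    [
        {"message": "Connection refused by server"},
        {"message": "request timed out"},
        {"message": "Timeout waiting for handshake"},
    ]
)
```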

Extend ExperimentReporter with optional RCA and artifact inventory
sections in both JSON and Markdown reports (gracefully skipped when
structured.jsonl is absent).

Add `panther report diagnose` CLI subcommand with --json/--human output.
Replace the raw colorlog/logging.Formatter in LoggerFactory with a new
ConsoleFormatter that provides:
- Short HH:MM:SS timestamps (no date for live runs)
- Phase/service context from LogContext: (phase|service_id) prefix
- Abbreviated module names (strips panther.core. prefix)
- Static banner() method for phase separators via click.echo()

Wire phase banners into ExperimentManager at each transition:
initialization, plugin loading, test execution, and cleanup.
Add LoggerFactory.set_console_level() classmethod that updates all
StreamHandler instances (root + named loggers) to a given level while
keeping FileHandlers at DEBUG for structured JSONL output.

Wire --verbose (DEBUG) and --debug (TRACE) Click options through
the run command into ExperimentManager via a new console_level
parameter, giving CLI flags final precedence over YAML config.

Includes 8 unit tests covering level updates, FileHandler
preservation, TRACE support, and idempotency.
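
One subtlety in this kind of setter is that `logging.FileHandler` is a subclass of `logging.StreamHandler`, so a plain `isinstance` check would also change file handlers. The sketch below shows one way to handle that; the function signature (including the `loggers` parameter) is an assumption for the example, not the LoggerFactory method itself.

```python
import logging
import os
import tempfile


def set_console_level(level, loggers=None):
    """Set the level on console handlers only, leaving file handlers untouched."""
    if loggers is None:
        named = logging.Logger.manager.loggerDict.values()
        loggers = [logging.getLogger()] + [
            lg for lg in named if isinstance(lg, logging.Logger)
        ]
    for lg in loggers:
        for handler in lg.handlers:
            # FileHandler subclasses StreamHandler, so exclude it explicitly.
            is_file = isinstance(handler, logging.FileHandler)
            if isinstance(handler, logging.StreamHandler) and not is_file:
                handler.setLevel(level)


demo = logging.getLogger("console_level_demo")
console = logging.StreamHandler()
console.setLevel(logging.INFO)
log_path = os.path.join(tempfile.mkdtemp(), "structured.jsonl")
file_handler = logging.FileHandler(log_path)
file_handler.setLevel(logging.DEBUG)
demo.addHandler(console)
demo.addHandler(file_handler)

set_console_level(logging.WARNING, loggers=[demo])
file_handler.close()
```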
Verify format_test_header() and format_test_result() output
formatting, emoji control, timing precision, and padding behavior.
Add format_experiment_summary() that displays a structured banner after
all tests complete, showing pass/fail counts with percentage, duration,
output directory, failed test details, and next-step CLI commands.
Replaces the simple one-line completion message.
DockerOutputParser now adapts to console verbosity: concise one-liner
step labels by default, full instruction text when --verbose/--debug is
active. Download/extract progress is suppressed in concise mode. Build
start, completion (with duration), and cache-skip messages use
click.echo() for consistent structured output.
ElNiak added 12 commits March 27, 2026 12:40
Replace fragile conditional-then-remove command building with clean
append-based construction.
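
The commit does not say which command was affected, so the following is only a generic sketch of append-based construction, using a Docker build invocation as a stand-in. The `build_command` function and its parameters are hypothetical.

```python
def build_command(image, tag, no_cache=False, platform=None):
    """Start from the required arguments and append options only when needed."""
    cmd = ["docker", "build", "-t", f"{image}:{tag}"]
    if no_cache:
        cmd.append("--no-cache")
    if platform:
        cmd.extend(["--platform", platform])
    cmd.append(".")  # build context always comes last
    return cmd


plain = build_command("panther", "latest")
full = build_command("panther", "latest", no_cache=True, platform="linux/amd64")
```

Compared with building the full list and removing entries afterwards, each option appears in exactly one place and nothing depends on list positions.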
…plugin

create_plugin() and create_subplugin() do not accept force_overwrite.
discover_plugins() returns Dict[str, PluginMetadata], not
Dict[str, List[str]]. Fix params to use direct dict lookup and
scan to group by metadata.type for display.
click.Path() returns str but validate() calls .is_dir() and .is_file().
Add Path() conversion matching the existing pattern in check_deps().
… observer comment

- Add missing FastFailHandler import (2 tests crashed with NameError)
- Remove module-level DockerBuilder.reset_singleton() (unsafe for xdist)
- Add step.* prefix tests for MetricsObserver.is_interested()
- Fix stale comment about MetricsObserver event scope
…s, metrics filter)

1.1: Fix Jinja2 whitespace stripping that merged success summary lines
1.2: Silence ~38 unhandled event types with noop handlers + configurable logging
1.3: Broaden MetricsObserver.is_interested() to accept test.* and step.* events
1.4: Group service health tables by test_name in both Jinja2 and fallback reports
1.5: Remove duplicate TestCompleted/TestFailed emissions from ExperimentManager
1.6: Add required/optional pattern distinction for output completeness calculation
TestServiceHealthTestName and TestErrorCategoryMatching were defined
twice in test_status_collector.py due to the production merge.
…create commands

These options were accepted by `panther create plugin` and `panther create subplugin`
but silently ignored — never forwarded to create_plugin()/create_subplugin().
Removes them to eliminate user confusion and clean up the public API surface.
- Change [tool:pytest] to [pytest] in tests/cli/pytest.ini (fixes
  PytestUnknownMarkWarning for @pytest.mark.unit in CLI tests)
- Fix validate() and check_deps() type annotations: click.Path()
  returns str, not Path
- Add pytestmark = pytest.mark.integration at module level
- Fix Mock identity comparison (== -> is) for fast_fail_handler
- Replace DockerConnection test: builder enters cache-only mode (client=None), not exception
- Remove overly specific message/context assertions from docker build failure tests
- Add mock_docker_client fixture to test_docker_client_none_raises_exception (singleton isolation)
- Fix test_docker_client_none: catches PantherException (validate_build_prerequisites fires first)
- Fix test_docker_build_failure_stops_experiment: test handler directly instead of via build_image
- Fix test_configuration_based_fast_fail_behavior: remove critical_only (nonexistent), test real behavior
- Fix mock_event_manager: configure cleanup_none_observers() to return 0 (not MagicMock)
- Fix TestPluginManagerFastFail: use plugin_factory/create_service_manager (service_factory doesn't exist)
- Fix test_experiment_initialization_error: assign experiment_config; catch TestCaseInitializationError
…ed API, mock setup)

- Replace removed _deploy_services_non_blocking() / deploy_services() tests with
  test_deploy_services_monitoring() and test_monitoring_not_active_before_deployment_event()
- Fix thread-based tests to use polling with deadline instead of fixed sleeps
- Mock execute_docker_command (not _is_service_ready) to match _is_service_ready_with_timeout() internals
- Fix stop_monitoring() assertion: now sets monitor_thread=None internally
- Fix docker_compose_env fixture: env_type must be "network_environment" not "network"
- Fix test_test_case_checks_early_termination: use MagicMock for configs, steps.wait attribute
- Add pytestmark = pytest.mark.integration
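
Polling with a deadline, as mentioned in the list above, can be sketched with a small helper. `wait_until` is a hypothetical name; the tests in this PR may implement the loop inline or with different defaults.

```python
import threading
import time


def wait_until(predicate, timeout=2.0, interval=0.01):
    """Poll predicate until it is true or the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()  # one last check at the deadline


started = threading.Event()
worker = threading.Timer(0.05, started.set)  # simulates a background monitor
worker.start()

ready = wait_until(started.is_set)
worker.join()
timed_out = wait_until(lambda: False, timeout=0.05)
```

A fixed sleep must be long enough for the slowest machine, whereas the polling loop returns as soon as the condition holds and only pays the full timeout on failure.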
…ed API, mock setup)

- Add pytestmark = pytest.mark.integration
- Fix thread timing: use stop_monitoring() internal join instead of sleep
- Replace removed _deploy_services_non_blocking/deploy_services API calls
- Fix BackgroundServiceMonitor mock: mock execute_docker_command not _is_service_ready
- Fix TestCase mock setup with proper patches
- Clean up unused imports (threading, TestConfig)
- Fix BackgroundServiceMonitor import path (moved to own module)
@ElNiak ElNiak force-pushed the feature/metrics-to-structured-jsonl branch from fab3a5a to 31a44e7 Compare March 27, 2026 11:43
@github-actions

PR Validation Passed

Your changes look good! The quick validation checks have passed:

  • ✅ Code formatting (Black)
  • ✅ Import sorting (isort)
  • ✅ Linting (flake8)
  • ✅ Basic tests

The full CI pipeline will run when this PR is merged or when targeting the main branches.

… code)

- Add threading.Lock to FastFailHandler for thread-safe error tracking
- Add warning log + method guard to _requires_emitter and _emit_event
- Remove broken conftest fixtures referencing deleted modules
- Fix EventStreamRecorder file handle reset to be inside lock
- Return False from LoggerObserver on handler exceptions
- Pass global_config to metrics/storage observer factory builders
- Move logger assignments below imports (isort compliance)
@github-actions

PR Validation Passed

Your changes look good! The quick validation checks have passed:

  • ✅ Code formatting (Black)
  • ✅ Import sorting (isort)
  • ✅ Linting (flake8)
  • ✅ Basic tests

The full CI pipeline will run when this PR is merged or when targeting the main branches.

- Delegate EventStreamRecorder.__del__ to close() for thread safety
- Guard handle_event against non-service events via entity_type check
- Normalize datetimes to UTC before stripping tzinfo in log query engine
- Correct _build_context_tag docstring separator to match code
- Log ImportError in webapp components instead of silently swallowing
- Use lazy log formatting in subprocess_executor
- Skip panther_ivy tests when submodule not installed (fixes CI)
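
The datetime fix in the list above addresses a common pitfall: dropping `tzinfo` from an aware datetime keeps the local wall-clock time, which then compares incorrectly against UTC timestamps. A minimal sketch, with `to_naive_utc` as a hypothetical helper name:

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(dt):
    """Convert aware datetimes to UTC first, then drop tzinfo."""
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc)
    return dt.replace(tzinfo=None)


# 12:00 at UTC+02:00 is 10:00 UTC.
aware = datetime(2026, 3, 27, 12, 0, tzinfo=timezone(timedelta(hours=2)))
correct = to_naive_utc(aware)
stripped_only = aware.replace(tzinfo=None)  # the buggy shortcut keeps 12:00
```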
@github-actions

PR Validation Passed

Your changes look good! The quick validation checks have passed:

  • ✅ Code formatting (Black)
  • ✅ Import sorting (isort)
  • ✅ Linting (flake8)
  • ✅ Basic tests

The full CI pipeline will run when this PR is merged or when targeting the main branches.

Wrap webapp/components/forms/__init__.py imports in try/except
ImportError so test collection succeeds when nicegui is not installed.
@github-actions

PR Validation Passed

Your changes look good! The quick validation checks have passed:

  • ✅ Code formatting (Black)
  • ✅ Import sorting (isort)
  • ✅ Linting (flake8)
  • ✅ Basic tests

The full CI pipeline will run when this PR is merged or when targeting the main branches.

…B memory usage

Coverage was baked into pyproject.toml addopts, meaning every pytest
invocation instrumented 321K lines with branch tracing + generated 3
report formats. This caused ~50GB memory usage on casual test runs.

Changes:
- Remove --cov* from default addopts (coverage now opt-in)
- Set branch = false (halves instrumentation overhead)
- Enable parallel = true (distribute across SQLite files with xdist)
- Expand omit patterns (webapp, tools, banner, install_templates)
- Update CLAUDE.md with explicit coverage commands
@github-actions

PR Validation Passed

Your changes look good! The quick validation checks have passed:

  • ✅ Code formatting (Black)
  • ✅ Import sorting (isort)
  • ✅ Linting (flake8)
  • ✅ Basic tests

The full CI pipeline will run when this PR is merged or when targeting the main branches.

ElNiak added 5 commits March 30, 2026 11:49
Critical fixes:
- Fix UnboundLocalError in run_tests() by initializing counters before try
- Fix double LoggerFactory.initialize() so root logger gets JSONL handler
- Add explicit EventStreamRecorder.close() in cleanup to prevent data loss
- Upgrade record_error defensive catches from DEBUG to WARNING
- Add concurrent thread-safety tests for FastFailHandler

Important fixes:
- Replace bare except:pass with circuit-breaker logging in resource_sampler
- Stop returning fabricated zero metrics on resource collection error
- Return False from _set_state on exception instead of always True
- Fix event.event_type -> event.get_type() in docstring example
- Rename stale DebugObserver logger to LoggerObserver, add recursion warning
- Remove unused enable_colors/structured_output params from _setup_logging
- Add experiment_workflow field to FeatureLogLevelsConfig
- Count and warn about skipped malformed JSONL lines in StorageObserver
- Narrow subprocess retry exception types to prevent wasted retries

Tests:
- Add FastFailHandler concurrent thread-safety tests
- Add MetricsCollector JSONL write circuit-breaker test
- Add dot-separated event name dispatch test
- Add build-metrics CLI command tests
…tion

- Increase stdout/stderr truncation from 500 to 2000 chars in
  DockerComposeException for better diagnostics
- Wrap NiceGUI UI updates in try/except RuntimeError to handle
  client disconnection gracefully during experiment execution
Reformat multi-line docstrings to single-line where appropriate
(ruff D-rule compliance). Add missing module, class, and method
docstrings throughout panther/ and tests/.

Net reduction of ~370 lines from docstring compaction.
Replace three independent file-write paths (LoggerFactory FileHandler,
EventStreamRecorder lock+handle, MetricsCollector lock+open) with a
single JsonlWriter that serializes all structured.jsonl output through
one threading.Lock and one persistent file handle.

This eliminates the critical bug where interleaved writes from
concurrent producers could corrupt the JSONL file.

- Add panther/core/utils/jsonl_writer.py (JsonlWriter, JsonlLogHandler)
- LoggerFactory: use JsonlLogHandler instead of FileHandler, expose
  get_jsonl_writer() for other components
- EventStreamRecorder: remove own lock/handle/close, delegate to writer
- MetricsCollector: remove _jsonl_lock and per-write open(), use writer
- ExperimentManager: wire shared writer to metrics and event recorder
- Update all affected tests (8 new, 3 test files updated)
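
The single-writer design can be illustrated with a small sketch. The class names match the commit message, but the method names, constructor arguments, and record fields below are assumptions; the real jsonl_writer.py may differ.

```python
import json
import logging
import os
import tempfile
import threading


class JsonlWriter:
    """One lock and one persistent handle shared by every producer."""

    def __init__(self, path):
        self._lock = threading.Lock()
        self._fh = open(path, "a", encoding="utf-8")

    def write(self, record):
        line = json.dumps(record, default=str)
        with self._lock:
            self._fh.write(line + "\n")
            self._fh.flush()

    def close(self):
        with self._lock:
            if not self._fh.closed:
                self._fh.close()


class JsonlLogHandler(logging.Handler):
    """Logging handler that delegates to the shared writer."""

    def __init__(self, writer):
        super().__init__()
        self._writer = writer

    def emit(self, record):
        try:
            self._writer.write(
                {
                    "source": "logging",
                    "level": record.levelname,
                    "message": record.getMessage(),
                }
            )
        except Exception:
            self.handleError(record)


path = os.path.join(tempfile.mkdtemp(), "structured.jsonl")
writer = JsonlWriter(path)

log = logging.getLogger("jsonl_writer_demo")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(JsonlLogHandler(writer))

log.info("service %s started", "client")
writer.write({"source": "metrics", "name": "cpu_percent", "value": 12.5})
writer.write({"source": "event", "level": "EVENT", "message": "test.started"})
writer.close()

with open(path, encoding="utf-8") as fh:
    sources = [json.loads(line)["source"] for line in fh]
```

Routing logs, metrics, and events through the same object means the lock covers every write to the file, which separate per-component locks cannot guarantee.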
Fix ImportError handling (C2/C3), experiment_manager context manager (C4),
legacy code removal (4a-4f), error handling improvements (5a-5c), bounded
dedup set, consistent global_config injection, docstring accuracy fixes,
and strict event filtering.

Add test coverage for _requires_emitter decorator and
ExperimentObserverMixin. Add pytest.mark.unit markers to 3 test files.
Fix .add() → OrderedDict in typed_observer and update test_observer_dispatch
to remove _HANDLER_OVERRIDES references.
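
The commit message mentions a bounded dedup set and a move to OrderedDict without showing the code. One plausible reading is sketched here: an insertion-ordered dictionary used as a set that evicts its oldest key once a size limit is reached. `BoundedSeen` and its eviction policy are assumptions, not the typed_observer implementation.

```python
from collections import OrderedDict


class BoundedSeen:
    """Set-like dedup tracker that forgets the oldest key when full."""

    def __init__(self, max_size):
        self._max = max_size
        self._seen = OrderedDict()

    def add(self, key):
        """Return True if key is new, False if it was already tracked."""
        if key in self._seen:
            self._seen.move_to_end(key)  # refresh recency
            return False
        self._seen[key] = None
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)  # evict the oldest entry
        return True

    def __len__(self):
        return len(self._seen)


seen = BoundedSeen(max_size=2)
results = [seen.add("a"), seen.add("b"), seen.add("a"), seen.add("c"), seen.add("b")]
```

A plain `set` has no ordering, so it cannot evict the oldest entry; that is the property the ordered mapping adds.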
@github-actions

PR Validation Failed

Your changes have validation issues. Please check the following:

  1. Code Formatting: Run black tests/ panther/ to fix formatting
  2. Import Sorting: Run isort tests/ panther/ to fix imports
  3. Linting: Run flake8 tests/ panther/ --max-line-length=88 to check for issues
  4. Tests: Ensure all tests pass with pytest tests/unit/ -n auto -m "not slow" -p no:panther_metrics

You can run the quick validation locally with:

# Install dependencies
pip install -e .
pip install pytest black isort flake8

# Run validation
black --check tests/ panther/
isort --check-only tests/ panther/
flake8 tests/ panther/ --max-line-length=88
pytest tests/unit/ -n auto -m "not slow" -p no:panther_metrics

Please fix these issues and push again. The CI will automatically re-run.

Replace fragile __class__ mutation with direct mixin inheritance,
use call.kwargs for observer_id extraction instead of positional
indexing, and strengthen warning count assertion from >= 1 to >= 2.
@github-actions

PR Validation Failed

Your changes have validation issues. Please check the following:

  1. Code Formatting: Run black tests/ panther/ to fix formatting
  2. Import Sorting: Run isort tests/ panther/ to fix imports
  3. Linting: Run flake8 tests/ panther/ --max-line-length=88 to check for issues
  4. Tests: Ensure all tests pass with pytest tests/unit/ -n auto -m "not slow" -p no:panther_metrics

You can run the quick validation locally with:

# Install dependencies
pip install -e .
pip install pytest black isort flake8

# Run validation
black --check tests/ panther/
isort --check-only tests/ panther/
flake8 tests/ panther/ --max-line-length=88
pytest tests/unit/ -n auto -m "not slow" -p no:panther_metrics

Please fix these issues and push again. The CI will automatically re-run.

2 participants