[TMP] Feature Preview #510
Draft
nicklafleur wants to merge 11 commits into
This commit implements function-level hashing to skip re-testing unchanged mutants, along with fixes for mypy type errors and architectural improvements. A follow-up commit will implement transitive invalidation of mutants based on function call graphs and the new hashing mechanism.

INCREMENTAL MUTATION TESTING
- Add _compute_function_hashes() in file_mutation.py to generate SHA-256 hashes (truncated to 12 chars) for each mutated function's source code
- Store hash_by_function_name in SourceFileMutationData for persistence
- On subsequent runs, compare old vs new hashes to identify changed functions
- Reset mutant results to None (needs re-testing) when function hash changes
- Return changed_functions and current_hashes from create_mutants_for_file()

MUTATION METADATA TRACKING
- Add MutationMetadata dataclass with line_number, mutation_type, and description
- Each Mutation now carries metadata about what changed and where
- Add OPERATOR_TO_TYPE mapping to categorize mutations (number, string, boolean, etc.)
- Add _determine_mutation_type() to disambiguate operator categories
- Add _describe_mutation() for human-readable mutation descriptions
- Serialize/deserialize metadata to JSON via to_dict()/from_dict()

NAMING AND CONVENTIONS
- Rename public functions to private (_create_mutations, _combine_mutations_to_source, etc.)
- Rename mutation_operators to MUTATION_OPERATORS (constant naming convention)
- Add explicit type annotations throughout (dict[str, MutationMetadata], etc.)

NEW BENCHMARK PROJECT
- Add e2e_projects/benchmark_1k/ with ~1000 mutants for testing
- Includes modules: numbers, strings, booleans, operators, comparisons, arguments, returns, complex (recursion, higher-order functions)
- Configurable delays via BENCHMARK_IMPORT_DELAY, BENCHMARK_CONFTEST_DELAY, BENCHMARK_TEST_DELAY environment variables
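The hashing step described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the helper names (`compute_function_hashes`, `changed_functions`, `reset_stale_results`), the use of `ast` to locate function source, the mutant names, and the result codes are all assumptions made for the example. Only the SHA-256 digest truncated to 12 characters and the "reset to None on change" rule come from the commit message.

```python
from __future__ import annotations

import ast
import hashlib


def compute_function_hashes(source: str) -> dict[str, str]:
    """Map each top-level function name to a 12-char SHA-256 prefix of its source."""
    hashes: dict[str, str] = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            segment = ast.get_source_segment(source, node) or ""
            digest = hashlib.sha256(segment.encode("utf-8")).hexdigest()
            hashes[node.name] = digest[:12]
    return hashes


def changed_functions(old: dict[str, str], new: dict[str, str]) -> set[str]:
    """Functions that are new or whose hash differs from the stored one."""
    return {name for name, digest in new.items() if old.get(name) != digest}


def reset_stale_results(results, owner_of, changed):
    """Set a mutant's result to None (needs re-testing) if its function changed."""
    return {
        mutant: (None if owner_of[mutant] in changed else result)
        for mutant, result in results.items()
    }


# Hypothetical before/after sources: only `sub` is edited between runs.
OLD_SOURCE = "def add(a, b):\n    return a + b\n\n\ndef sub(a, b):\n    return a - b\n"
NEW_SOURCE = "def add(a, b):\n    return a + b\n\n\ndef sub(a, b):\n    return b - a\n"

old_hashes = compute_function_hashes(OLD_SOURCE)
new_hashes = compute_function_hashes(NEW_SOURCE)
changed = changed_functions(old_hashes, new_hashes)

# Hypothetical mutant names and result codes from a previous run.
results = reset_stale_results(
    {"add__mutmut_1": 1, "sub__mutmut_1": 1},
    {"add__mutmut_1": "add", "sub__mutmut_1": "sub"},
    changed,
)
```

In this sketch only the mutant belonging to `sub` loses its cached result, so a later run would re-test it and skip the mutant of `add`.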
Introduce MutmutState class to more easily manage runtime state for dependency tracking (old_function_hashes, current_function_hashes, function_dependencies). Persist hashes and dependencies to mutmut-stats.json for incremental runs.

Changes:
- Add state.py with MutmutState dataclass and state() singleton accessor
- Add core.py with MutmutCallStack (ContextVar-based) for async-safe tracking
- Move record_trampoline_hit to core.py, now tracks caller->callee edges
- Update trampoline to track call depth and record dependencies during stats
- Extend load_stats/save_stats to persist function_hashes and dependencies
- Add _cleanup_stale_stats and _invalidate_stale_dependency_edges functions
- Add track_dependencies and dependency_tracking_depth config options
- Update documentation describing the dependency tracking feature
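A ContextVar-based call stack that records caller->callee edges might look something like the sketch below. The names `tracked_call`, `_call_stack`, and `dependencies` are hypothetical, and the interpretation of the depth limit (edges are recorded only while the caller sits at or above `max_depth` in the stack) is an assumption; the PR's `dependency_tracking_depth` may be defined differently.

```python
from __future__ import annotations

from collections import defaultdict
from contextlib import contextmanager
from contextvars import ContextVar

# Each thread or asyncio task sees its own stack value, which is what makes
# a ContextVar suitable for async-safe tracking.
_call_stack: ContextVar[tuple[str, ...]] = ContextVar("call_stack", default=())

# caller -> set of callees observed while the stats run executes.
dependencies: defaultdict[str, set[str]] = defaultdict(set)


@contextmanager
def tracked_call(name: str, max_depth: int | None = None):
    """Record an edge from the current caller to `name`, then push `name`."""
    stack = _call_stack.get()
    # Assumed semantics: skip edges whose caller is deeper than max_depth.
    if stack and (max_depth is None or len(stack) <= max_depth):
        dependencies[stack[-1]].add(name)
    token = _call_stack.set(stack + (name,))
    try:
        yield
    finally:
        _call_stack.reset(token)  # pop, even if the wrapped call raised


# Hypothetical instrumented functions standing in for trampolines.
def leaf():
    with tracked_call("leaf"):
        return 1


def middle():
    with tracked_call("middle"):
        return leaf() + 1


def top():
    with tracked_call("top"):
        return middle() + 1


total = top()
stack_after = _call_stack.get()
```

Storing the stack as an immutable tuple means each `set()` creates a new value, so concurrent tasks cannot mutate each other's view of the stack.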
…ngleton pattern

First commit in a series of refactors to make implementing different forking strategies easier and improve performance.

- Replace Config.get()/ensure_loaded()/reset() with config()/reset_config()
- Migrate globals from __init__.py to MutmutState with deprecation warnings
- Add __getattr__ in __init__.py for backwards-compatible deprecated access
- Remove all Config.ensure_loaded() calls (config() auto-loads)

Both config() and state() now follow the same lazy-loading singleton pattern:
- First call creates the instance
- Subsequent calls return the cached instance
- reset_*() clears for testing

Deprecated access patterns (emit FutureWarning):
- mutmut.config → use mutmut.configuration.config()
- mutmut.stats_time → use mutmut.state.state().stats_time
- mutmut.duration_by_test → use mutmut.state.state().duration_by_test
- mutmut.tests_by_mangled_function_name → use state().tests_by_mangled_function_name
- mutmut._stats → use mutmut.state.state()._stats
- mutmut._covered_lines → use mutmut.state.state()._covered_lines
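The lazy-loading singleton and the deprecated-attribute shim described above could be sketched like this. The `Config` field, the loading step, and the function name `deprecated_module_getattr` are placeholders invented for the example; in a real package the last function would be the module-level `__getattr__` (PEP 562) in `__init__.py`.

```python
from __future__ import annotations

import warnings
from dataclasses import dataclass


@dataclass
class Config:
    max_children: int = 1  # hypothetical field, for illustration only


_config: Config | None = None


def config() -> Config:
    """Create the instance on first call, return the cached one afterwards."""
    global _config
    if _config is None:
        _config = Config()  # real code would load project settings here
    return _config


def reset_config() -> None:
    """Drop the cached instance so tests start from a clean slate."""
    global _config
    _config = None


def deprecated_module_getattr(name: str):
    """Stand-in for a module-level __getattr__ serving deprecated names."""
    if name == "config":
        warnings.warn(
            "mutmut.config is deprecated; use mutmut.configuration.config()",
            FutureWarning,
            stacklevel=2,
        )
        return config()
    raise AttributeError(name)


first = config()
same = config()
reset_config()
fresh = config()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = deprecated_module_getattr("config")
```

Because module-level `__getattr__` is only consulted when normal attribute lookup fails, the old names keep working without living in the module namespace.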
Move timeout management from threading/ to workers/ package and add

Changes:
- Move threading/timeout.py → workers/timeout.py
- Update __main__.py import to use new location
- Move tests/threading/ → tests/workers/
Commit contains purely non-functional changes to code organization. Extract components from __main__.py into dedicated modules for better separation of concerns and maintainability:

New modules:
- runners/harness.py - TestRunner ABC with PytestRunner and HammettRunner
- ui/browse.py - ResultBrowser Textual TUI application
- ui/terminal.py - Terminal display utilities (spinner, status printer)
- stats.py - Statistics collection and reporting (Stat, collect_stat, print_stats, etc)
- utils/file_utils.py - File utilities (walk_source_files, copy_src_dir, copy_also_copy_files, setup_source_paths)

Test reorganization:
- Move all unit tests to tests/unit/ directory
- Organize tests by module: models/, mutation/, runners/, stats/, ui/, utils/, workers/
- Add tests/unit/utils/test_file_utils.py with tests for file utilities
- Add proper __init__.py files for test packages

Pre-commit fixes:
- Update exclude patterns in .pre-commit-config.yaml for tests/unit/data/
- Fix browse.py type annotations: callable -> Callable

Other changes:
- Fix TYPE_CHECKING import in code_coverage.py to use new module path
- Update test_e2e_result_snapshots.py to import walk_source_files from file_utils
Add run_in_fork_with_result() and run_in_fork() to run functions in forked children while keeping the parent process clean. This is the foundation for hot-fork mode where the parent must never import pytest/conftest. Uses os.pipe() for IPC - lower overhead than temp files, no cleanup needed.
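As a rough sketch of the idea, a function can be run in a forked child and its return value shipped back over an `os.pipe()` as shown below. The signature, the use of `pickle` for serialization, and the error handling are assumptions for illustration; the PR's actual `run_in_fork_with_result()` may differ. The sketch is POSIX-only because it relies on `os.fork()`.

```python
import os
import pickle


def run_in_fork_with_result(func, *args):
    """Run func(*args) in a forked child and return its pickled result."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: anything imported here never reaches the parent process.
        os.close(read_fd)
        status = 0
        try:
            payload = pickle.dumps(("ok", func(*args)))
        except BaseException as exc:  # report failures instead of crashing silently
            payload = pickle.dumps(("error", repr(exc)))
            status = 1
        try:
            with os.fdopen(write_fd, "wb") as writer:
                writer.write(payload)
        finally:
            os._exit(status)  # skip the parent's atexit handlers and cleanup

    # Parent: close the unused write end so read() sees EOF when the child exits.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()  # drain the pipe before waiting to avoid a deadlock
    os.waitpid(pid, 0)
    if not data:
        raise RuntimeError("child exited without sending a result")
    kind, value = pickle.loads(data)
    if kind == "error":
        raise RuntimeError(f"child failed: {value}")
    return value


power = run_in_fork_with_result(pow, 2, 10)
child_pid = run_in_fork_with_result(os.getpid)
```

The pipe lives only as long as the two file descriptors, which is why no temp-file cleanup is needed.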
Consolidate fork isolation utilities into a single module and add OrchestratorCrashError for hot-fork crash handling.

Changes:
- Rename fork_isolation.py → isolation.py
- Add OrchestratorCrashError exception class with:
  - Exit code of crashed orchestrator
  - List of lost in-flight mutants (truncated if >10)
  - Optional crash log path
  - Instructions to resume with 'mutmut run'
- Update test imports and add 10 tests for OrchestratorCrashError

This prepares for HotForkRunner implementation by providing graceful crash handling with clear user guidance.
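An exception carrying the four pieces of information listed above might be shaped like the following sketch. The constructor signature, attribute names, and message wording are assumptions; only the contents (exit code, lost mutants truncated past 10, optional log path, resume hint) are taken from the commit message.

```python
from __future__ import annotations


class OrchestratorCrashError(Exception):
    """Raised when the orchestrator process dies with mutants still in flight."""

    MAX_LISTED_MUTANTS = 10  # longer lists are truncated in the message

    def __init__(
        self,
        exit_code: int,
        lost_mutants: list[str],
        crash_log_path: str | None = None,
    ) -> None:
        self.exit_code = exit_code
        self.lost_mutants = list(lost_mutants)
        self.crash_log_path = crash_log_path
        super().__init__(self._build_message())

    def _build_message(self) -> str:
        lines = [f"Orchestrator crashed with exit code {self.exit_code}."]
        shown = self.lost_mutants[: self.MAX_LISTED_MUTANTS]
        if shown:
            lines.append(f"Lost {len(self.lost_mutants)} in-flight mutant(s):")
            lines.extend(f"  {name}" for name in shown)
            hidden = len(self.lost_mutants) - len(shown)
            if hidden:
                lines.append(f"  ... and {hidden} more")
        if self.crash_log_path:
            lines.append(f"Crash log: {self.crash_log_path}")
        lines.append("Run 'mutmut run' to resume.")
        return "\n".join(lines)


# Hypothetical crash with twelve mutants in flight and no crash log.
error = OrchestratorCrashError(-11, [f"m{i}" for i in range(12)])
message = str(error)
```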
Add fork-based process isolation infrastructure:
- MutantRunner ABC defining the runner interface
- HotForkRunner class with orchestrator/worker pattern
- OrchestratorCrashError for crash recovery
- Orchestrator crash recovery with pending work restart

Configuration updates:
- ProcessIsolation enum (FORK, HOT_FORK)
- HotForkWarmup enum (NONE, IMPORT, COLLECT)
- Config fields with defaults preserving current behavior (fork)
- load_config() validation for new enum fields

Summary output feature:
- Summary, SummaryStats, PhaseTimings dataclasses
- SurvivingMutant, NoTestsMutant, NotCheckedMutant dataclasses
- calculate_stats_by_mutation_type() function
- write_summary_file() writes mutants/summary.json

Phase timing tracking:
- MutmutState fields for phase durations
- Recording in __main__.py for all phases
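Validation of the new enum-valued config fields could be sketched as below. The enum member names come from the commit message, but the string values (`"fork"`, `"hot_fork"`, and so on), the `parse_enum` helper, and the warmup default used in the example are assumptions; the real `load_config()` may accept different spellings.

```python
from __future__ import annotations

from enum import Enum


class ProcessIsolation(Enum):
    FORK = "fork"          # assumed string value
    HOT_FORK = "hot_fork"  # assumed string value


class HotForkWarmup(Enum):
    NONE = "none"
    IMPORT = "import"
    COLLECT = "collect"


def parse_enum(enum_cls, raw: str | None, default):
    """Turn a raw config string into an enum member, or fail with a clear message."""
    if raw is None:
        return default  # unset option keeps the existing behaviour
    try:
        return enum_cls(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(member.value for member in enum_cls)
        raise ValueError(
            f"invalid value {raw!r} for {enum_cls.__name__}; expected one of: {allowed}"
        ) from None


unset = parse_enum(ProcessIsolation, None, ProcessIsolation.FORK)
hot = parse_enum(ProcessIsolation, "HOT_FORK", ProcessIsolation.FORK)
warmup = parse_enum(HotForkWarmup, "collect", HotForkWarmup.NONE)

try:
    parse_enum(ProcessIsolation, "threads", ProcessIsolation.FORK)
    rejected = False
except ValueError as exc:
    rejected = True
    rejection_message = str(exc)
```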
Add ForkRunner class that encapsulates the traditional os.fork() based mutation testing approach. This provides a consistent interface with HotForkRunner while preserving the fast forking behavior for projects that don't use fork-unsafe libraries.

Changes:
- Add RunningWorker NamedTuple to track in-flight mutation workers
- Implement ForkRunner with full MutantRunner interface:
  - submit(): forks child, sets MUTANT_UNDER_TEST, runs tests
  - wait_for_result(): waits for any child, returns MutantResult
  - has_capacity/pending_count: track concurrent workers
  - get_active_workers: return ActiveWorker list for timeout checking
  - shutdown: wait for all remaining children
- Extend MutantRunner ABC with new methods: collect_stats(), run_clean_tests(), run_forced_fail(), list_all_tests()
- Add get_mutant_runner() factory function to create runner based on process_isolation config
- Add StatsResult dataclass for stats collection serialization
- Refactor logging: extract to logging_utils.py, add log_to_file and log_file_path config options
- Remove safe_setproctitle wrapper, use setproctitle directly
- Add gc.freeze() call in startup() for both runners

E2E test projects:
- Add hot_fork_basic: simple test case for ForkRunner
- Add hot_fork_gevent: test case demonstrating HotForkRunner need
- Update benchmark_1k with run_benchmark.py script and results

The ForkRunner is faster than HotForkRunner but incompatible with gevent, grpc, and torch. Users can select via process_isolation config.
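The submit/wait cycle described above can be illustrated with a heavily simplified runner. `MiniForkRunner`, the shape of `MutantResult`, and the `run_tests` callback are hypothetical stand-ins and omit timeouts, stats collection, and shutdown; only the fork-per-mutant flow and the `MUTANT_UNDER_TEST` variable are taken from the commit message. Requires a POSIX system and Python 3.9+ (`os.waitstatus_to_exitcode`).

```python
import os
from typing import Callable, NamedTuple


class MutantResult(NamedTuple):
    mutant_name: str
    exit_code: int


class MiniForkRunner:
    """Toy runner: one forked child per mutant, results reaped with os.wait()."""

    def __init__(self, max_children: int) -> None:
        self.max_children = max_children
        self._running: dict[int, str] = {}  # pid -> mutant name

    def has_capacity(self) -> bool:
        return len(self._running) < self.max_children

    def pending_count(self) -> int:
        return len(self._running)

    def submit(self, mutant_name: str, run_tests: Callable[[], int]) -> None:
        pid = os.fork()
        if pid == 0:
            # Child: select the mutant, run the tests, exit with their status.
            os.environ["MUTANT_UNDER_TEST"] = mutant_name
            code = 1  # reported if run_tests raises
            try:
                code = run_tests()
            finally:
                os._exit(code)
        self._running[pid] = mutant_name

    def wait_for_result(self) -> MutantResult:
        while True:
            pid, status = os.wait()  # blocks until any child exits
            if pid in self._running:  # ignore children this runner did not start
                name = self._running.pop(pid)
                return MutantResult(name, os.waitstatus_to_exitcode(status))


def fake_tests() -> int:
    # Hypothetical test run: "fails" (kills the mutant) only for mutant 1.
    return 1 if os.environ["MUTANT_UNDER_TEST"].endswith("_1") else 0


runner = MiniForkRunner(max_children=2)
for mutant in ("x_add__mutmut_1", "x_add__mutmut_2"):
    runner.submit(mutant, fake_tests)
full = not runner.has_capacity()
outcomes = dict(runner.wait_for_result() for _ in range(2))
```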
…tracking

Browser improvements:
- Add two-column dependencies table showing upstream (callers) and downstream (callees) with depth levels
- Display diff view and dependencies side-by-side (3:2 ratio)
- Add three-level depth toggle: 1-lvl, configured depth, and full (skips config option if already 1 or unlimited)
- Add CacheStatus enum (CACHED/STALE_DEPENDENCY/INVALID) with severity-based ordering

Performance optimizations:
- Precompute raw_deps (mangled→raw name conversion) once per data load
- Lazy-cache BFS traversal results by (func_name, depth) key
- Fix cache status not clearing after tests by excluding invalid functions themselves from funcs_with_invalid_deps

New features:
- Add `generate` command to regenerate mutants without running tests

Refactoring:
- Rename browse.py to browser.py with significant restructuring
- Extract UI helpers into dedicated helpers.py module
- Move result_browser_layout.tcss to ui/ directory
This was referenced Apr 27, 2026
This PR serves as a public preview of our in-flight work.