CodeClone 1.4.2: maintenance update
Overview
This patch release is a maintenance update. Determinism remains guaranteed: reports are stable and ordering is
unchanged.
Performance & Implementation Cleanup
- process_file() now uses a single os.stat() call to obtain both size (size guard) and st_mtime_ns/st_size (file stat signature), removing a redundant os.path.getsize() call.
- Discovery logic was deduplicated by extracting _discover_files(); quiet and non-quiet modes differ only in the UI status wrapper, not in semantics or filtering.
- Cache path wiring now precomputes wire_map so _wire_filepath_from_runtime() is evaluated once per key.
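The single-stat idea from the first bullet can be sketched as follows. This is a minimal illustration, not the actual process_file() code: the names StatSignature, stat_once, and MAX_FILE_SIZE are hypothetical, and the real size limit and error handling may differ.

```python
from __future__ import annotations

import os
from dataclasses import dataclass

MAX_FILE_SIZE = 1_000_000  # hypothetical limit in bytes, for illustration only


@dataclass(frozen=True)
class StatSignature:
    """Hypothetical container for the file stat signature."""

    mtime_ns: int
    size: int


def stat_once(path: str, max_size: int = MAX_FILE_SIZE) -> StatSignature | None:
    """Return a stat signature, or None if the file is too large or unreadable."""
    try:
        st = os.stat(path)  # one call feeds both the size guard and the signature
    except OSError:
        return None
    if st.st_size > max_size:  # size guard, no separate os.path.getsize() needed
        return None
    return StatSignature(mtime_ns=st.st_mtime_ns, size=st.st_size)
```

Since os.path.getsize() is itself a thin wrapper around os.stat(), reading st_size from the result already in hand avoids a second filesystem call per file.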
Hash Reuse for Block/Segment Analysis
- extract_blocks() and extract_segments() accept optional precomputed_hashes. When provided, they reuse hashes instead of recomputing.
- The extractor computes function body hashes once and passes them to both block and segment extraction when both analyses run for the same function.
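The compute-once, pass-twice pattern can be sketched as below. The two extractors here are simplified stand-ins that only share names with the real functions: the sliding-window logic, the window parameter, and hash_statements() are assumptions made for illustration.

```python
from __future__ import annotations

import hashlib
from collections.abc import Sequence


def hash_statements(body: Sequence[str]) -> list[str]:
    """Hypothetical helper: one SHA-256 hex digest per statement."""
    return [hashlib.sha256(stmt.encode("utf-8")).hexdigest() for stmt in body]


def _windows(hashes: Sequence[str], window: int) -> list[str]:
    # Join each run of `window` consecutive hashes into one key.
    return ["|".join(hashes[i : i + window]) for i in range(len(hashes) - window + 1)]


def extract_blocks(
    body: Sequence[str], window: int = 2, precomputed_hashes: Sequence[str] | None = None
) -> list[str]:
    """Simplified stand-in, not the real implementation."""
    hashes = precomputed_hashes if precomputed_hashes is not None else hash_statements(body)
    return _windows(hashes, window)


def extract_segments(
    body: Sequence[str], window: int = 3, precomputed_hashes: Sequence[str] | None = None
) -> list[str]:
    """Simplified stand-in, not the real implementation."""
    hashes = precomputed_hashes if precomputed_hashes is not None else hash_statements(body)
    return _windows(hashes, window)


# Caller side: hash the function body once, then hand it to both analyses.
body = ["a = 1", "b = a + 1", "return b"]
shared = hash_statements(body)
blocks = extract_blocks(body, precomputed_hashes=shared)
segments = extract_segments(body, precomputed_hashes=shared)
```

Results are the same with or without the precomputed list; the only difference is that the hashing work happens once per function instead of once per analysis.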
Scanner Efficiency (No Semantic Change)
- iter_py_files() now filters candidates before sorting, so only valid candidates are sorted. The final order remains deterministic and equivalent to previous behavior.
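A small sketch of why the reordering is safe: filtering and then sorting yields the same sequence as sorting and then filtering, because the filter does not depend on position. The function names and the is_valid predicate below are hypothetical and are not taken from iter_py_files().

```python
from __future__ import annotations

from collections.abc import Callable, Iterable


def filter_then_sort(paths: Iterable[str], is_valid: Callable[[str], bool]) -> list[str]:
    """Sort only the candidates that pass the filter (the cheaper order)."""
    return sorted(p for p in paths if is_valid(p))


def sort_then_filter(paths: Iterable[str], is_valid: Callable[[str], bool]) -> list[str]:
    """Sort everything first, then discard invalid entries."""
    return [p for p in sorted(paths) if is_valid(p)]


def is_python_source(path: str) -> bool:
    # Hypothetical validity check used only for this example.
    return path.endswith(".py") and "/.venv/" not in path


candidates = ["pkg/z.py", "pkg/a.py", "README.md", "pkg/.venv/lib.py", "pkg/m.py"]
ordered = filter_then_sort(candidates, is_python_source)
```

Both functions return the same list for any input; the first simply sorts fewer elements.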
Contract Tightening
- precomputed_hashes type strengthened: list[str] | None → Sequence[str] | None (read-only intent in the type contract).
- Added assert len(precomputed_hashes) == len(body) in both extract_blocks() and extract_segments() to catch mismatched inputs early (development-time invariant).
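The tightened contract might look roughly like the sketch below. The helper resolve_hashes() is hypothetical and stands in for the start of either extractor; the real functions take more parameters and may check the invariant differently.

```python
from __future__ import annotations

import hashlib
from collections.abc import Sequence


def resolve_hashes(
    body: Sequence[str], precomputed_hashes: Sequence[str] | None = None
) -> Sequence[str]:
    """Hypothetical helper: validate and return the hashes to use for `body`."""
    if precomputed_hashes is None:
        return [hashlib.sha256(s.encode("utf-8")).hexdigest() for s in body]
    # Development-time invariant: one hash per statement in the body.
    assert len(precomputed_hashes) == len(body), "precomputed_hashes must align with body"
    return precomputed_hashes


body = ["x = 1", "y = 2"]
# A tuple satisfies Sequence[str]; callers are no longer required to pass a list.
reused = resolve_hashes(body, precomputed_hashes=("h1", "h2"))
```

Because assert statements are removed when Python runs with -O, the check costs nothing in optimized runs, which fits its role as a development-time invariant rather than input validation.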
Testing & Determinism
- Byte-identical JSON reports verified across repeated runs; differences, when present, are limited to volatile/provenance meta fields (e.g., cache status/path, timestamps), while semantic payload remains stable.
- Unit tests updated to mock os.stat instead of os.path.getsize where applicable (test_process_file_stat_error, test_process_file_size_limit).
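For readers updating similar tests, mocking os.stat with the standard library could look like the sketch below. The function within_size_limit() is a hypothetical stand-in for the code under test, and the two test functions only mirror the intent of the tests named above, not their actual bodies.

```python
from __future__ import annotations

import os
from unittest import mock


def within_size_limit(path: str, max_size: int) -> bool:
    """Hypothetical function under test: False on stat errors or oversized files."""
    try:
        return os.stat(path).st_size <= max_size
    except OSError:
        return False


def test_stat_error() -> None:
    # Simulate an unreadable file by making os.stat raise.
    with mock.patch("os.stat", side_effect=OSError("stat failed")):
        assert within_size_limit("any.py", max_size=1024) is False


def test_size_limit() -> None:
    # Simulate a 2048-byte file without touching the filesystem.
    fake_stat = mock.Mock(st_size=2048, st_mtime_ns=0)
    with mock.patch("os.stat", return_value=fake_stat):
        assert within_size_limit("any.py", max_size=1024) is False
        assert within_size_limit("any.py", max_size=4096) is True
```

Patching os.stat rather than os.path.getsize matters here because code that no longer calls getsize() would bypass the old mock and touch the real filesystem.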
Notes
- No changes to:
  - detection semantics / fingerprints
  - baseline hash inputs (payload_sha256 semantic payload)
  - exit code contract and precedence
  - schema versions (baseline v1.0, cache v1.2, report v1.1)