Skip to content

Commit a56df8e

Browse files
authored
Merge pull request #11 from n-n-code/quality_harness_improvements
Quality harness work
2 parents e128268 + 9950991 commit a56df8e

18 files changed

Lines changed: 851 additions & 71 deletions

AGENTS.md

Lines changed: 25 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,8 @@ Current architecture:
99
- Global shortcut handling goes through `KGlobalAccel`
1010
- Audio capture uses Qt Multimedia
1111
- Transcription is in-process through vendored `whisper.cpp`
12+
- The public runtime seam is streaming-first through app-owned chunks, events, and compatibility helpers
13+
- Static backend support lives in `BackendCapabilities`, while runtime/device/model inspection lives in `RuntimeDiagnostics`
1214
- Clipboard writes prefer `KSystemClipboard` with `QClipboard` fallback
1315
- There is an early Qt Widgets tray shell in `mutterkey-tray`, but the daemon remains the product core
1416
- The recommended day-to-day runtime path is the `systemd --user` service
@@ -31,18 +33,22 @@ This repository is intentionally kept minimal:
3133
- `src/audio/audiorecorder.*`: microphone capture
3234
- `src/audio/recording.h`: shared recorded-audio payload passed between the recorder, service, tests, and transcription path
3335
- `src/clipboardwriter.*`: clipboard integration, preferring KDE system clipboard support
34-
- `src/audio/recordingnormalizer.*`: conversion to Whisper-ready mono `float32` at `16 kHz`
35-
- `src/transcription/whispercpptranscriber.*`: in-process Whisper integration
36-
- `src/transcription/transcriptionengine.*`: app-owned engine/session seam for backend selection and future runtime evolution; the engine owns immutable runtime metadata such as backend capabilities
36+
- `src/audio/recordingnormalizer.*`: conversion to runtime-ready mono `float32` at `16 kHz`
37+
- `src/transcription/audiochunker.*`: deterministic chunking of normalized audio for the streaming runtime path
38+
- `src/transcription/transcriptassembler.*`: final transcript assembly from streaming transcript events
39+
- `src/transcription/transcriptioncompat.*`: compatibility wrapper that routes one-shot recordings through the streaming runtime seam
40+
- `src/transcription/whispercpptranscriber.*`: in-process Whisper integration and whisper-specific engine construction
41+
- `src/transcription/transcriptionengine.*`: app-owned engine/session seam for backend selection and future runtime evolution; the engine owns immutable capability metadata and separate runtime diagnostics
3742
- `src/transcription/transcriptionworker.*`: worker object hosted on a dedicated `QThread`
38-
- `src/transcription/transcriptiontypes.h`: normalized audio, typed runtime error, and capability value types
43+
- `src/transcription/transcriptiontypes.h`: runtime diagnostics, normalized audio, typed runtime error, chunk, event, and capability value types
3944
- `src/config.*`: JSON config loading and defaults
4045
- `src/app/*`: shared CLI/runtime command helpers used by the main entrypoint
4146
- `src/control/*`: local daemon control transport, typed snapshots, and session/client APIs
4247
- `src/tray/*`: Qt Widgets tray-shell UI scaffolding
4348
- `contrib/mutterkey.service`: example user service
4449
- `contrib/org.mutterkey.mutterkey.desktop`: hidden desktop entry used for desktop identity/integration
4550
- `scripts/check-release-hygiene.sh`: repo hygiene checks for publication-facing content
51+
- `scripts/check-test-commentary.sh`: validates `WHAT/HOW/WHY` commentary blocks in repo-owned Qt tests
4652
- `next_feature/`: tracked upcoming feature plans as Markdown; keep only plan `.md` files and the folder-local `.gitignore`
4753
- `docs/Doxyfile.in`: Doxygen config template for repo-owned API docs
4854
- `docs/mainpage.md`: Doxygen landing page used instead of the full README
@@ -108,12 +114,15 @@ Notes:
108114

109115
- `once` mode requires microphone access and a valid Whisper model path
110116
- Real transcription verification needs a configured model in `~/.config/mutterkey/config.json` or a custom config path
111-
- A small `Qt Test` + `CTest` suite exists for config loading and audio normalization, including malformed JSON, wrong-type config inputs, and recording-normalizer edge cases
117+
- A small `Qt Test` + `CTest` suite exists for config loading, audio normalization, streaming-runtime helpers, and transcription-worker orchestration, including malformed JSON, wrong-type config inputs, recording-normalizer edge cases, and fake streaming backend behavior
118+
- Repo-owned test cases are expected to carry `WHAT/HOW/WHY` comments near the start of each real test body; `scripts/check-test-commentary.sh` and `scripts/check-release-hygiene.sh` enforce that convention
112119
- Config loading is intentionally forgiving: invalid runtime values fall back to defaults and log warnings
113120
- Use `ctest --test-dir "$BUILD_DIR" --output-on-failure` for changes that affect covered code
114121
- Keep Qt GUI or Widgets tests headless under `CTest`: set `QT_QPA_PLATFORM=offscreen` in the test registration or test properties rather than relying on the caller environment
115122
- Use `bash scripts/run-valgrind.sh "$BUILD_DIR"` or `cmake --build "$BUILD_DIR" --target valgrind` when validating memory behavior for release readiness or after fixing memory-lifetime issues
116123
- On Debian-family systems, install `libc6-dbg` if Valgrind fails at startup with a `ld-linux` / mandatory redirection error
124+
- The default Valgrind lane now covers the deterministic pure/headless binaries around config parsing, command dispatch, streaming helpers, control payload parsing, worker orchestration, platform helper seams, and `mutterkey --help`
125+
- Keep `daemoncontrolclientservertest` under normal `ctest`, but do not assume it is a good default direct-launch Memcheck target in restricted sandboxes; some environments block direct `AF_UNIX` bind/listen with `EPERM` before the code under test runs
117126
- Use `cmake --build "$BUILD_DIR" --target clang-tidy` after C++ changes when static-analysis noise is likely to matter
118127
- Use `cmake --build "$BUILD_DIR" --target clazy` after Qt-facing changes when `clazy-standalone` is available
119128
- Use `cmake --build "$BUILD_DIR" --target lint` when you want the repo's full static-analysis pass in one command
@@ -128,9 +137,10 @@ Notes:
128137
- The repo-owned `clang-tidy` path is intentionally strict and now relies on `.clang-tidy` `RemovedArgs` support for `-mno-direct-extern-access` instead of rewriting `compile_commands.json`; prefer extending that config rather than adding more build-dir munging
129138
- Prefer `cmake --build "$BUILD_DIR" --target clang-tidy` in a real configured build tree when validating analyzer changes, because Qt test sources depend on generated `*_autogen` / `.moc` outputs that ad hoc one-off `clang-tidy` calls may not have available
130139
- Treat Valgrind Memcheck as the release memory gate and ASan/UBSan as the faster developer complements; they overlap, but they are not interchangeable
140+
- Prefer using the repo-owned Valgrind runner instead of ad hoc Memcheck commands so `--read-var-info=yes`, leak policy, and the deterministic test selection stay aligned
131141
- Treat the release-hygiene script and GitHub CI workflow as repo-maintained checks too; keep them passing when build inputs, docs, or repository metadata change
132142
- Treat the Doxygen `docs` target as a repo-maintained check too when touching repo-owned API docs or docs/CI wiring; repo-owned Doxygen warnings should be fixed rather than ignored
133-
- Keep the default Valgrind lane deterministic and headless: prefer `configtest`, `recordingnormalizertest`, and `mutterkey --help` over microphone, clipboard, or KGlobalAccel-heavy paths unless the task is specifically about those integrations
143+
- Keep the default Valgrind lane deterministic and headless: prefer the pure/configuration/control/runtime-helper tests plus `mutterkey --help` over microphone, clipboard-heavy, tray, KGlobalAccel-heavy, or direct local-socket integration paths unless the task is specifically about those integrations
134144
- The release-hygiene script intentionally ignores generated build trees while scanning for machine-specific home-directory paths and absolute Markdown links, then reports generated-artifact roots separately; if it flags `./build`, remove the repo-local build directory rather than weakening the content checks
135145
- Keep the Doxygen main page in `docs/mainpage.md` small and API-focused. The release-facing `README.md` may link to files outside the Doxygen input set and should not be used as the Doxygen main page unless the input set is expanded deliberately
136146
- Keep analyzer fixes targeted to `src/` and `tests/`; do not churn `third_party/` or generated Qt autogen output to satisfy tooling
@@ -155,11 +165,13 @@ Notes:
155165
- Prefer explicit validation and safe fallback behavior for config-driven runtime values
156166
- Avoid introducing optional backends, plugin systems, or cross-platform abstractions unless the task requires them
157167
- Keep the audio path explicit: recorder output may not already match Whisper input requirements, so preserve normalization behavior
168+
- Prefer product-owned naming such as runtime audio, chunks, events, diagnostics, and compatibility wrappers over backend-shaped naming when touching app-owned code
158169
- Prefer narrow shared value types across subsystems; for example, consumers that only need captured audio should include `src/audio/recording.h`, not the full recorder class
159170
- Keep JSON and other transport details at subsystem boundaries; prefer typed C++ snapshots/results once data crosses into app-owned control, tray, or service code
160171
- Prefer dependency injection for tray-shell and control-surface code from the first implementation so headless Qt tests stay simple
161-
- When preparing the transcription path for future runtime work, prefer app-owned engine/session seams and injected sessions over leaking concrete backend types into CLI, service, or worker orchestration. Keep immutable capability reporting and backend metadata on the engine side, and keep the session side focused on mutable decode state, warmup, and transcription
172+
- When preparing the transcription path for future runtime work, prefer app-owned engine/session seams and injected sessions over leaking concrete backend types into CLI, service, or worker orchestration. Keep immutable capability reporting on the engine side, keep runtime inspection data in `RuntimeDiagnostics`, and keep the session side focused on mutable decode state, warmup, chunk ingestion, finish, and cancellation
162173
- Prefer product-owned runtime interfaces, model/session separation, and deterministic backend selection before adding new inference backends or widening cross-platform support
174+
- Keep compatibility shims explicit in naming. If a one-shot daemon/CLI path is implemented on top of the streaming runtime seam, name it as a compatibility wrapper rather than making the old one-shot shape look like the primary contract
163175
- Keep backend-specific validation out of `src/config.*` when practical. Product config parsing should normalize and preserve user input, while backend support checks should live in the app-owned runtime layer near `src/transcription/*`
164176
- Preserve the current product direction: embedded `whisper.cpp`, KDE-first, CLI/service-first
165177

@@ -187,6 +199,7 @@ Apply the C++ Core Guidelines selectively and pragmatically. For this repo, the
187199
- `scripts/update-whisper.sh` requires a clean Git work tree before it will fetch or run subtree operations
188200
- Treat `third_party/whisper.cpp` as subtree-managed vendor content and update it through the helper script rather than manual directory replacement
189201
- Prefer changing app-side integration code before patching vendored dependency code
202+
- Prefer keeping fake runtime tests and app-owned helpers free of vendored whisper linkage unless the test is specifically about the whisper adapter or engine factory
190203
- Prefer fixing vendored target metadata from the top-level CMake when the issue is Mutterkey packaging or warning noise, instead of patching upstream vendored files directly
191204
- If you must modify vendored code, document why in the final response and record the deviation in `third_party/whisper.cpp.UPSTREAM.md`
192205
- Do not restore the removed upstream examples/tests/scripts unless the task requires them
@@ -236,17 +249,22 @@ Typical model location:
236249
- Treat `mutterkey-tray` as a shipped artifact once it is installed or validated in CI; keep install rules, README/setup notes, release checklist items, and workflow checks aligned with that status
237250
- Verify with a fresh CMake build when the change affects compilation or linkage
238251
- Run `ctest` when touching covered code in `src/config.*` or `src/audio/recordingnormalizer.*`, and extend the deterministic headless tests when practical
252+
- When adding or refreshing repo-owned Qt tests, keep the `WHAT/HOW/WHY` commentary current; treat `scripts/check-test-commentary.sh` as part of the maintained test contract, not optional polish
239253
- When touching transcription orchestration or backend seams, prefer small headless tests with fake/injected sessions or fake engines over model-dependent integration tests. Engine injection is the preferred seam for orchestration tests; direct session injection is still useful for narrow worker behavior
254+
- When touching the streaming runtime seam, prefer testing chunking, transcript assembly, and compatibility-wrapper behavior in pure headless tests before adding backend-specific or environment-heavy coverage
240255
- When adding or fixing Qt GUI tests, make the `CTest` registration itself headless with `QT_QPA_PLATFORM=offscreen` so CI does not try to load `xcb`
241256
- Prefer expanding tests around pure parsing, value normalization, and other environment-independent logic before adding KDE-session or device-heavy coverage
257+
- For boundary-heavy code such as recorder, clipboard, or hotkey helpers, prefer extracting narrow pure helper seams for deterministic tests before trying to mock full KDE/session/device behavior
242258
- Use `-DMUTTERKEY_ENABLE_ASAN=ON` and `-DMUTTERKEY_ENABLE_UBSAN=ON` for fast iteration on memory and UB bugs, and use the repo-owned Valgrind lane as the slower release-focused confirmation step
243259
- Run `clang-tidy` and `clazy` targets for non-trivial C++/Qt changes when the tools are installed in the environment
244260
- If `clang-tidy` validation needs Qt test `.moc` files, generate the relevant `*_autogen` targets first or just run the repo-owned `clang-tidy` target from a configured build tree
245261
- Run the `docs` target when touching repo-owned public headers, Doxygen comments, `docs/Doxyfile.in`, `docs/mainpage.md`, or CI/docs wiring
246262
- Prefer the `lint` target for a full pre-handoff analyzer pass, and use the individual analyzer targets when iterating on one class of warnings
247263
- Run `bash scripts/run-valgrind.sh "$BUILD_DIR"` before handoff when the task is specifically about memory, ownership, lifetime, shutdown, or release hardening
248264
- Run `bash scripts/check-release-hygiene.sh` before handoff when the task touches publication-facing files or repository metadata
265+
- Remember that the release-hygiene script now also enforces test commentary coverage, so changes to test structure or helper scripts may need both test updates and commentary updates
249266
- If `QT_QPA_PLATFORM=offscreen "$BUILD_DIR/mutterkey" diagnose 1` fails in a headless environment after model loading or during KDE/session-dependent startup, note that limitation explicitly rather than assuming the runtime seam or docs-only change regressed behavior
267+
- A headless `diagnose 1` failure after whisper model loading still does not necessarily indicate a streaming-runtime regression; separate runtime-contract changes from KDE/session or headless-environment limits
250268
- Do not leave generated artifacts in the repository tree at the end of the task
251269
- Do not assume every workspace copy is an initialized git repository; if `git` commands fail, continue with file-based validation and mention the limitation in the final response
252270

README.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -437,15 +437,15 @@ Valgrind and sanitizers have different roles:
437437

438438
- use `MUTTERKEY_ENABLE_ASAN` / `MUTTERKEY_ENABLE_UBSAN` for fast developer iteration and CI-friendly memory or UB checks
439439
- use `bash scripts/run-valgrind.sh "$BUILD_DIR"` or the `valgrind` target as the slower release-readiness gate
440-
- the default Valgrind lane stays deterministic and headless: `configtest`, `recordingnormalizertest`, and `mutterkey --help`
440+
- the default Valgrind lane stays deterministic and headless: it runs the pure/configuration/control/runtime helper tests plus `mutterkey --help`, and still avoids microphone, clipboard-heavy, tray, or KDE hotkey integration paths
441441
- the default Valgrind lane intentionally does not run live microphone capture, clipboard-heavy flows, or KDE hotkey/service integration
442442

443443
Notes for contributors:
444444

445445
- prefer an out-of-tree build so the repository stays clean
446446
- keep changes targeted to repo-owned code in `src/`, `tests/`, and top-level project files
447447
- avoid editing `third_party/whisper.cpp` unless the task is specifically about the vendored dependency
448-
- run `bash scripts/check-release-hygiene.sh` when changing publication-facing files such as this README, licenses, `contrib/`, or CI
448+
- run `bash scripts/check-release-hygiene.sh` when changing publication-facing files such as this README, licenses, `contrib/`, CI, or test-commentary tooling; it also enforces `WHAT/HOW/WHY` coverage in repo-owned test cases
449449

450450
Release hygiene:
451451

scripts/check-release-hygiene.sh

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -90,6 +90,9 @@ generated_artifact_matches="$(
9090
)"
9191
print_violation "repository must not contain generated build artifacts" "$generated_artifact_matches"
9292

93+
commentary_check_output="$(bash "$repo_root/scripts/check-test-commentary.sh" "$repo_root" 2>&1 || true)"
94+
print_violation "test sources must keep WHAT/HOW/WHY commentary blocks" "$commentary_check_output"
95+
9396
if ((failures != 0)); then
9497
exit 1
9598
fi

scripts/check-test-commentary.sh

Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
1+
#!/usr/bin/env bash
2+
3+
set -euo pipefail
4+
5+
repo_root="${1:-$(pwd)}"
6+
status=0
7+
8+
while IFS= read -r test_file; do
9+
awk '
10+
function should_skip(name) {
11+
return name == "initTestCase" || name == "init" || name == "cleanup" || name == "cleanupTestCase" || name ~ /_data$/
12+
}
13+
14+
/^[[:space:]]*void[[:space:]]+[A-Za-z0-9_]+::[A-Za-z0-9_]+\(/ {
15+
function_name = $0
16+
sub(/^[[:space:]]*void[[:space:]]+[A-Za-z0-9_]+::/, "", function_name)
17+
sub(/\(.*/, "", function_name)
18+
19+
in_test = !should_skip(function_name)
20+
saw_what = 0
21+
saw_how = 0
22+
saw_why = 0
23+
test_line = NR
24+
next
25+
}
26+
27+
in_test {
28+
if ($0 ~ /^[[:space:]]*\/\/[[:space:]]*WHAT:/) {
29+
saw_what = 1
30+
next
31+
}
32+
if ($0 ~ /^[[:space:]]*\/\/[[:space:]]*HOW:/) {
33+
saw_how = 1
34+
next
35+
}
36+
if ($0 ~ /^[[:space:]]*\/\/[[:space:]]*WHY:/) {
37+
saw_why = 1
38+
next
39+
}
40+
if ($0 ~ /^[[:space:]]*$/ || $0 ~ /^[[:space:]]*{[[:space:]]*$/ || $0 ~ /^[[:space:]]*\/\/.*$/) {
41+
next
42+
}
43+
44+
if (!(saw_what && saw_how && saw_why)) {
45+
printf("%s:%d: missing WHAT/HOW/WHY block near test body\n", FILENAME, test_line)
46+
missing = 1
47+
}
48+
in_test = 0
49+
}
50+
51+
END {
52+
if (in_test && !(saw_what && saw_how && saw_why)) {
53+
printf("%s:%d: missing WHAT/HOW/WHY block near test body\n", FILENAME, test_line)
54+
missing = 1
55+
}
56+
if (missing) {
57+
exit 1
58+
}
59+
}
60+
' "$test_file" || status=1
61+
done < <(find "$repo_root/tests" -maxdepth 1 -name '*test.cpp' | sort)
62+
63+
exit "$status"

0 commit comments

Comments
 (0)