Commit 5453feb

Merge pull request #13 from n-n-code/model_handling_detach
More detaching model handling from whisper.cpp to mutterkey side
2 parents 8787c30 + b18a262 commit 5453feb

24 files changed

Lines changed: 1252 additions & 58 deletions

AGENTS.md

Lines changed: 44 additions & 3 deletions
@@ -8,7 +8,13 @@ Current architecture:

 - Global shortcut handling goes through `KGlobalAccel`
 - Audio capture uses Qt Multimedia
-- Transcription is in-process through vendored `whisper.cpp`
+- Transcription goes through an app-owned runtime seam with explicit runtime
+  selection
+- A product-owned native CPU reference runtime scaffold now exists alongside
+  the legacy whisper adapter
+- `whisper.cpp` is still the only real end-user speech decoder today, but the
+  vendored runtime is now optional at build time through
+  `MUTTERKEY_ENABLE_LEGACY_WHISPER=OFF`
 - Native Mutterkey model packages are now the canonical model artifact; raw
   whisper.cpp-compatible `.bin` files remain only as a migration/import path
 - The public runtime seam is streaming-first through app-owned chunks, events, and compatibility helpers
@@ -22,7 +28,9 @@ Current architecture:
 This repository is intentionally kept minimal:

 - CMake is the only supported build system
-- `whisper.cpp` is the only supported transcription backend
+- `whisper.cpp` remains a vendored legacy backend, but new runtime ownership
+  work should prefer the product-owned native CPU path and selector/model-loader
+  seams first
 - Keep the repo free of generated build output
 - Keep publication-facing files free of machine-specific paths and broken local links
 - Do not reintroduce legacy qmake or external-command transcription paths unless explicitly requested
@@ -37,11 +45,14 @@ This repository is intentionally kept minimal:
 - `src/clipboardwriter.*`: clipboard integration, preferring KDE system clipboard support
 - `src/audio/recordingnormalizer.*`: conversion to runtime-ready mono `float32` at `16 kHz`
 - `src/transcription/audiochunker.*`: deterministic chunking of normalized audio for the streaming runtime path
+- `src/transcription/cpureferencemodel.*`: product-owned native CPU reference model header/parser and immutable model-handle loading
+- `src/transcription/cpureferencetranscriber.*`: native CPU reference runtime scaffold behind the app-owned engine/session seam
 - `src/transcription/modelpackage.*`: product-owned manifest and validated package value types
 - `src/transcription/modelvalidator.*`: package integrity, compatibility, and bounds validation
 - `src/transcription/modelcatalog.*`: model artifact inspection and resolution
 - `src/transcription/rawwhisperprobe.*`: lightweight raw whisper.cpp header inspection used for migration compatibility
 - `src/transcription/rawwhisperimporter.*`: import path from raw Whisper `.bin` files into native Mutterkey packages
+- `src/transcription/runtimeselector.*`: app-owned runtime-selection policy and diagnostic reasoning
 - `src/transcription/transcriptassembler.*`: final transcript assembly from streaming transcript events
 - `src/transcription/transcriptioncompat.*`: compatibility wrapper that routes one-shot recordings through the streaming runtime seam
 - `src/transcription/whispercpptranscriber.*`: in-process Whisper integration and whisper-specific engine construction
@@ -86,6 +97,13 @@ cmake -S . -B "$BUILD_DIR" -G Ninja
 cmake --build "$BUILD_DIR" -j"$(nproc)"
 ```

+To validate the native-runtime-only path without vendored `whisper.cpp` /
+`ggml`, configure with:
+
+```bash
+cmake -S . -B "$BUILD_DIR" -G Ninja -DMUTTERKEY_ENABLE_LEGACY_WHISPER=OFF
+```
+
 If a sandboxed build fails with `ccache: error: Read-only file system`, treat
 that as an environment limitation rather than a repo regression and rerun the
 build with `CCACHE_DISABLE=1`.
@@ -136,6 +154,9 @@ Notes:
 - Use `bash scripts/check-release-hygiene.sh` when touching publication-facing files such as `README.md`, licenses, `contrib/`, CI, or helper scripts
 - Use `cmake --build "$BUILD_DIR" --target docs` when touching repo-owned public headers, Doxygen config, the Doxygen main page, or CI/docs wiring
 - If install rules or licensing files change, confirm the temporary install contains the expected files under `share/licenses/mutterkey`
+- If a task changes runtime selection, native model loading, or legacy-whisper
+  build toggles, validate at least one `MUTTERKEY_ENABLE_LEGACY_WHISPER=OFF`
+  build in addition to the normal default build
 - If you add or change public methods in repo-owned headers, expect `cmake --build "$BUILD_DIR" --target docs` to fail until the new API is documented; treat that as part of the normal implementation loop, not follow-up polish
 - Newly added repo-owned public structs and free functions in public headers also
   need Doxygen comments immediately; the `docs` target treats undocumented new
@@ -158,6 +179,12 @@ Notes:
 - When validating inside a restricted sandbox, be ready to disable `ccache` with `CCACHE_DISABLE=1` if the cache location is read-only; that is an execution-environment issue, not a Mutterkey build failure
 - Prefer fixing the code over weakening `.clang-tidy` or the Clazy check set; only relax tool config when the warning is clearly low-value for this repo
 - If `clang-tidy` flags a new small enum for `performance-enum-size`, prefer an explicit narrow underlying type such as `std::uint8_t` instead of suppressing the warning
+- If `clang-tidy` flags a small fixed binary header type, prefer
+  `std::array<std::byte, N>` or `std::array<char, N>` plus value
+  initialization over C-style arrays
+- When helper functions take two adjacent same-shaped parameters such as two
+  `QString` values, prefer a small request struct when that keeps tests and
+  runtime code from tripping `bugprone-easily-swappable-parameters`
 - In this Qt-heavy repo, treat `misc-include-cleaner` and `readability-redundant-access-specifiers` as low-value `clang-tidy` noise unless the underlying tool behavior improves; they conflict with Qt header-provider reality and `signals` / `slots` / `Q_SLOTS` sectioning more than they improve safety
 - Prefer anonymous-namespace `Q_LOGGING_CATEGORY` for file-local logging categories; `Q_STATIC_LOGGING_CATEGORY` is not portable enough across the Qt versions this repo may build against
 - Do not add broad Valgrind suppressions by default; only add narrow suppressions after reproducing stable third-party noise and keep them clearly scoped
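The three `clang-tidy` preferences in this hunk (narrow enum underlying type, `std::array` for fixed binary header fields, request struct instead of adjacent same-typed parameters) can be illustrated with a small sketch. Every type and function name below is hypothetical and does not come from the Mutterkey sources, and `std::string` stands in for `QString` so the example compiles without Qt.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical names throughout; none of these are Mutterkey APIs.

// An explicit narrow underlying type addresses performance-enum-size.
enum class HeaderKind : std::uint8_t { Unknown, NativePackage, RawWhisper };
static_assert(sizeof(HeaderKind) == 1, "enum should occupy a single byte");

// A fixed-size binary header field as std::array instead of a C-style array.
struct ModelHeader {
    std::array<std::byte, 4> magic{}; // value-initialized: all bytes are zero
    HeaderKind kind = HeaderKind::Unknown;
};

// A request struct instead of two adjacent string parameters, so call sites
// name each field and cannot silently swap the arguments.
struct ImportRequest {
    std::string sourcePath;
    std::string destinationPath;
};

// Compares the header magic against an expected byte sequence.
inline bool hasMagic(const ModelHeader &header, const std::array<std::byte, 4> &expected)
{
    return header.magic == expected;
}

// Formats the request for logging; both paths must be non-empty in this sketch.
inline std::string describeImport(const ImportRequest &request)
{
    assert(!request.sourcePath.empty() && !request.destinationPath.empty());
    return request.sourcePath + " -> " + request.destinationPath;
}
```

With the request struct, a caller writes `describeImport(ImportRequest{"model.bin", "model-package"})`, which keeps the source and destination roles visible at the call site.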
@@ -183,7 +210,15 @@ Notes:
 - Keep JSON and other transport details at subsystem boundaries; prefer typed C++ snapshots/results once data crosses into app-owned control, tray, or service code
 - Prefer dependency injection for tray-shell and control-surface code from the first implementation so headless Qt tests stay simple
 - When preparing the transcription path for future runtime work, prefer app-owned engine/session seams and injected sessions over leaking concrete backend types into CLI, service, or worker orchestration. Keep immutable capability reporting on the engine side, keep runtime inspection data in `RuntimeDiagnostics`, and keep the session side focused on mutable decode state, warmup, chunk ingestion, finish, and cancellation
-- Prefer product-owned runtime interfaces, model/session separation, and deterministic backend selection before adding new inference backends or widening cross-platform support
+- Prefer product-owned runtime interfaces, model/session separation, explicit
+  runtime-selection policy, and deterministic backend selection before adding
+  new inference backends or widening cross-platform support
+- Keep runtime-selection policy in `src/transcription/runtimeselector.*`
+  instead of burying compatibility/fallback rules inside
+  `createTranscriptionEngine()`
+- Keep native model-format parsing and immutable model loading in
+  `src/transcription/cpureferencemodel.*` or similar app-owned loader code
+  rather than mixing artifact parsing into the mutable session implementation
 - Keep model validation, metadata extraction, and compatibility checks app-owned.
   `whisper.cpp` should not be the first component that tells Mutterkey whether a
   model artifact is obviously malformed, incompatible, or oversized
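One way to read the guidance about keeping selection policy out of `createTranscriptionEngine()` is as a pure function that maps typed inputs to a decision plus a diagnostic reason, which a factory could then consume. The sketch below is only an illustration under that assumption: the real `runtimeselector.*` interface is not shown in this diff, so all names and the specific fallback rules here are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical names and rules; not the actual runtimeselector.* API.
enum class RuntimeKind : std::uint8_t { None, NativeCpu, LegacyWhisper };

// Typed inputs the policy needs; no backend types leak in here.
struct RuntimeSelectionRequest {
    bool artifactIsNativePackage = false;
    bool legacyWhisperBuilt = false;
};

// The decision together with a human-readable reason for diagnostics.
struct RuntimeSelection {
    RuntimeKind kind = RuntimeKind::None;
    std::string reason;
};

// Deterministic policy: the same request always yields the same selection.
inline RuntimeSelection selectRuntime(const RuntimeSelectionRequest &request)
{
    if (request.artifactIsNativePackage) {
        return {RuntimeKind::NativeCpu, "native package resolves to the native CPU runtime"};
    }
    if (request.legacyWhisperBuilt) {
        return {RuntimeKind::LegacyWhisper, "raw artifact falls back to the legacy whisper adapter"};
    }
    return {RuntimeKind::None, "raw artifact needs legacy whisper support, which is not built"};
}

// Usage example: a build without legacy whisper rejects a raw artifact
// and still reports why.
inline void checkRejectsRawArtifactWithoutLegacy()
{
    const RuntimeSelection selection = selectRuntime(RuntimeSelectionRequest{false, false});
    assert(selection.kind == RuntimeKind::None);
    assert(!selection.reason.empty());
}
```

Because the policy has no side effects, it can be unit-tested without constructing an engine or linking the vendored runtime.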
@@ -218,6 +253,9 @@ Apply the C++ Core Guidelines selectively and pragmatically. For this repo, the
 - Prefer resolving model-package, metadata, and import work entirely in app-owned
   code. Raw whisper.cpp `.bin` support is now a compatibility/import concern, not
   the canonical product contract
+- Prefer treating `whisper.cpp` as a legacy migration/parity dependency from
+  here forward. If new work can land in app-owned selector, model-loader,
+  native-runtime, or package code instead, do that first
 - Prefer keeping fake runtime tests and app-owned helpers free of vendored whisper linkage unless the test is specifically about the whisper adapter or engine factory
 - Prefer fixing vendored target metadata from the top-level CMake when the issue is Mutterkey packaging or warning noise, instead of patching upstream vendored files directly
 - If you must modify vendored code, document why in the final response and record the deviation in `third_party/whisper.cpp.UPSTREAM.md`
@@ -233,6 +271,9 @@ Apply the C++ Core Guidelines selectively and pragmatically. For this repo, the
   separate release asset outside Git
 - Do not introduce machine-specific home-directory paths, absolute local Markdown links, or generated build artifacts into tracked files
 - If a task changes install layout or shipped assets, keep the CMake install rules and license installs aligned with the new behavior
+- If a task changes whether legacy whisper support is installed, keep
+  `README.md`, `RELEASE_CHECKLIST.md`, `docs/mainpage.md`, install rules, and
+  license installs aligned with that choice
 - The installed shared-library payload is runtime-focused; do not start installing vendored upstream public headers unless the package contract intentionally changes

 ## Config Expectations

CMakeLists.txt

Lines changed: 58 additions & 36 deletions
@@ -15,6 +15,7 @@ set(CMAKE_CXX_EXTENSIONS OFF)

 option(MUTTERKEY_ENABLE_ASAN "Enable AddressSanitizer for repo-owned code and vendored whisper.cpp" OFF)
 option(MUTTERKEY_ENABLE_UBSAN "Enable UndefinedBehaviorSanitizer for repo-owned code and vendored whisper.cpp" OFF)
+option(MUTTERKEY_ENABLE_LEGACY_WHISPER "Build the legacy whisper.cpp runtime for migration and parity validation" ON)
 option(MUTTERKEY_ENABLE_WHISPER_CUDA "Enable whisper.cpp CUDA backend support (NVIDIA)" OFF)
 option(MUTTERKEY_ENABLE_WHISPER_VULKAN "Enable whisper.cpp Vulkan backend support" OFF)
 option(MUTTERKEY_ENABLE_WHISPER_BLAS "Enable whisper.cpp BLAS CPU acceleration" OFF)
@@ -47,6 +48,10 @@ set(MUTTERKEY_CORE_SOURCES
     src/transcription/transcriptionengine.h
     src/transcription/audiochunker.cpp
     src/transcription/audiochunker.h
+    src/transcription/cpureferencemodel.cpp
+    src/transcription/cpureferencemodel.h
+    src/transcription/cpureferencetranscriber.cpp
+    src/transcription/cpureferencetranscriber.h
     src/transcription/modelcatalog.cpp
     src/transcription/modelcatalog.h
     src/transcription/modelpackage.cpp
@@ -57,16 +62,23 @@ set(MUTTERKEY_CORE_SOURCES
     src/transcription/rawwhisperimporter.h
     src/transcription/rawwhisperprobe.cpp
     src/transcription/rawwhisperprobe.h
+    src/transcription/runtimeselector.cpp
+    src/transcription/runtimeselector.h
     src/transcription/transcriptassembler.cpp
     src/transcription/transcriptassembler.h
     src/transcription/transcriptioncompat.cpp
     src/transcription/transcriptioncompat.h
     src/transcription/transcriptionworker.cpp
     src/transcription/transcriptionworker.h
-    src/transcription/whispercpptranscriber.cpp
-    src/transcription/whispercpptranscriber.h
 )

+if(MUTTERKEY_ENABLE_LEGACY_WHISPER)
+    list(APPEND MUTTERKEY_CORE_SOURCES
+        src/transcription/whispercpptranscriber.cpp
+        src/transcription/whispercpptranscriber.h
+    )
+endif()
+
 set(MUTTERKEY_CONTROL_SOURCES
     src/control/daemoncontrolclient.cpp
     src/control/daemoncontrolclient.h
@@ -114,6 +126,10 @@ set_target_properties(mutterkey-tray PROPERTIES
     INSTALL_RPATH "$ORIGIN/../lib"
 )

+if(MUTTERKEY_ENABLE_LEGACY_WHISPER)
+    target_compile_definitions(mutterkey_core PRIVATE MUTTERKEY_WITH_LEGACY_WHISPER)
+endif()
+
 function(mutterkey_enable_sanitizers target_name)
     if(NOT CMAKE_CXX_COMPILER_ID MATCHES "GNU|Clang|AppleClang")
         message(WARNING "Sanitizers were requested, but ${CMAKE_CXX_COMPILER_ID} is not configured for repo-owned sanitizer flags")
@@ -203,47 +219,53 @@ else()
     message(STATUS "Doxygen not found; the docs target will be unavailable")
 endif()

-if(NOT EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/third_party/whisper.cpp/CMakeLists.txt")
-    message(FATAL_ERROR "Vendored whisper.cpp dependency is missing from third_party/whisper.cpp")
-endif()
-
-set(WHISPER_BUILD_TESTS OFF CACHE BOOL "" FORCE)
-set(WHISPER_BUILD_EXAMPLES OFF CACHE BOOL "" FORCE)
-set(WHISPER_BUILD_SERVER OFF CACHE BOOL "" FORCE)
-set(WHISPER_SANITIZE_ADDRESS ${MUTTERKEY_ENABLE_ASAN} CACHE BOOL "" FORCE)
-set(WHISPER_SANITIZE_UNDEFINED ${MUTTERKEY_ENABLE_UBSAN} CACHE BOOL "" FORCE)
-set(GGML_CUDA ${MUTTERKEY_ENABLE_WHISPER_CUDA} CACHE BOOL "" FORCE)
-set(GGML_VULKAN ${MUTTERKEY_ENABLE_WHISPER_VULKAN} CACHE BOOL "" FORCE)
-set(GGML_BLAS ${MUTTERKEY_ENABLE_WHISPER_BLAS} CACHE BOOL "" FORCE)
-set(GGML_BLAS_VENDOR ${MUTTERKEY_WHISPER_BLAS_VENDOR} CACHE STRING "" FORCE)
-add_subdirectory(third_party/whisper.cpp EXCLUDE_FROM_ALL)
-
-# Mutterkey ships the vendored shared libraries, but it does not install their
-# upstream public headers as part of its own package layout.
-set_target_properties(whisper ggml PROPERTIES PUBLIC_HEADER "")
+if(MUTTERKEY_ENABLE_LEGACY_WHISPER)
+    if(NOT EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/third_party/whisper.cpp/CMakeLists.txt")
+        message(FATAL_ERROR "Vendored whisper.cpp dependency is missing from third_party/whisper.cpp")
+    endif()

-target_link_libraries(mutterkey_core PRIVATE whisper)
+    set(WHISPER_BUILD_TESTS OFF CACHE BOOL "" FORCE)
+    set(WHISPER_BUILD_EXAMPLES OFF CACHE BOOL "" FORCE)
+    set(WHISPER_BUILD_SERVER OFF CACHE BOOL "" FORCE)
+    set(WHISPER_SANITIZE_ADDRESS ${MUTTERKEY_ENABLE_ASAN} CACHE BOOL "" FORCE)
+    set(WHISPER_SANITIZE_UNDEFINED ${MUTTERKEY_ENABLE_UBSAN} CACHE BOOL "" FORCE)
+    set(GGML_CUDA ${MUTTERKEY_ENABLE_WHISPER_CUDA} CACHE BOOL "" FORCE)
+    set(GGML_VULKAN ${MUTTERKEY_ENABLE_WHISPER_VULKAN} CACHE BOOL "" FORCE)
+    set(GGML_BLAS ${MUTTERKEY_ENABLE_WHISPER_BLAS} CACHE BOOL "" FORCE)
+    set(GGML_BLAS_VENDOR ${MUTTERKEY_WHISPER_BLAS_VENDOR} CACHE STRING "" FORCE)
+    add_subdirectory(third_party/whisper.cpp EXCLUDE_FROM_ALL)
+
+    # Mutterkey ships the vendored shared libraries, but it does not install their
+    # upstream public headers as part of its own package layout.
+    set_target_properties(whisper ggml PROPERTIES PUBLIC_HEADER "")
+
+    target_link_libraries(mutterkey_core PRIVATE whisper)
+endif()

 install(TARGETS mutterkey RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR})
 install(TARGETS mutterkey-tray RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR})
-install(TARGETS whisper ggml ggml-base
-    LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
-)
-if(TARGET ggml-cpu)
-    install(TARGETS ggml-cpu LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
-endif()
-if(TARGET ggml-cuda)
-    install(TARGETS ggml-cuda LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
-endif()
-if(TARGET ggml-vulkan)
-    install(TARGETS ggml-vulkan LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
-endif()
-if(TARGET ggml-blas)
-    install(TARGETS ggml-blas LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
+if(MUTTERKEY_ENABLE_LEGACY_WHISPER)
+    install(TARGETS whisper ggml ggml-base
+        LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
+    )
+    if(TARGET ggml-cpu)
+        install(TARGETS ggml-cpu LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
+    endif()
+    if(TARGET ggml-cuda)
+        install(TARGETS ggml-cuda LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
+    endif()
+    if(TARGET ggml-vulkan)
+        install(TARGETS ggml-vulkan LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
+    endif()
+    if(TARGET ggml-blas)
+        install(TARGETS ggml-blas LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR})
+    endif()
 endif()
 install(FILES contrib/org.mutterkey.mutterkey.desktop DESTINATION ${CMAKE_INSTALL_DATADIR}/applications)
 install(FILES LICENSE THIRD_PARTY_NOTICES.md DESTINATION ${MUTTERKEY_LICENSE_INSTALL_DIR})
-install(FILES third_party/whisper.cpp/LICENSE DESTINATION ${MUTTERKEY_LICENSE_INSTALL_DIR}/third_party/whisper.cpp)
+if(MUTTERKEY_ENABLE_LEGACY_WHISPER)
+    install(FILES third_party/whisper.cpp/LICENSE DESTINATION ${MUTTERKEY_LICENSE_INSTALL_DIR}/third_party/whisper.cpp)
+endif()

 if(BUILD_TESTING)
     find_package(Qt6 REQUIRED COMPONENTS Test)
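The `MUTTERKEY_WITH_LEGACY_WHISPER` definition added in this file is what C++ sources in `mutterkey_core` can test at compile time. A minimal sketch of such a guard follows; the macro name is taken from the diff, while the function names are hypothetical and the way the real sources use the macro is not shown here. Because the definition is `PRIVATE`, only translation units of `mutterkey_core` itself would see it.

```cpp
#include <cassert>
#include <string_view>

// MUTTERKEY_WITH_LEGACY_WHISPER is set by the CMake change in this commit
// when MUTTERKEY_ENABLE_LEGACY_WHISPER is ON. Function names are hypothetical.
constexpr bool legacyWhisperCompiledIn()
{
#ifdef MUTTERKEY_WITH_LEGACY_WHISPER
    return true;
#else
    return false;
#endif
}

// A short status line suitable for diagnostics output.
constexpr std::string_view legacyWhisperStatus()
{
    return legacyWhisperCompiledIn() ? "legacy whisper adapter: built"
                                     : "legacy whisper adapter: not built";
}

// Usage example: the status text always agrees with the compile-time flag,
// whichever way the build was configured.
inline void checkStatusMatchesBuild()
{
    const bool saysNotBuilt = legacyWhisperStatus().find("not built") != std::string_view::npos;
    assert(saysNotBuilt != legacyWhisperCompiledIn());
}
```

Keeping the preprocessor check inside one small function means the rest of the code can branch on an ordinary `bool` instead of scattering `#ifdef` blocks.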

README.md

Lines changed: 5 additions & 0 deletions
@@ -131,6 +131,10 @@ This installs:
 - `~/.local/lib/libwhisper.so*` and the required `ggml` libraries
 - `~/.local/share/applications/org.mutterkey.mutterkey.desktop`

+If you configure with `-DMUTTERKEY_ENABLE_LEGACY_WHISPER=OFF`, Mutterkey builds
+without the vendored `whisper.cpp` runtime and does not install the legacy
+`libwhisper` / `ggml` shared libraries.
+
 Optional acceleration flags:

 ```bash
@@ -164,6 +168,7 @@ Notes:
 - `MUTTERKEY_ENABLE_WHISPER_VULKAN=ON` is for Vulkan-capable GPUs and requires Vulkan development headers and loader libraries
 - `MUTTERKEY_ENABLE_WHISPER_BLAS=ON` improves CPU inference speed rather than enabling GPU execution
 - these options are forwarded to the vendored `whisper.cpp` / `ggml` build and install any resulting backend libraries alongside Mutterkey
+- `-DMUTTERKEY_ENABLE_LEGACY_WHISPER=OFF` disables the vendored runtime entirely and skips all `whisper.cpp` / `ggml` install targets

 ### 2. Put a model on disk
