Add kesha-voice-kit - local voice toolkit OpenClaw skill #3
📝 Walkthrough
This pull request adds a single line to the OpenClaw Skills list in README.md, introducing the kesha-voice-kit skill entry with its description of local speech-to-text and text-to-speech functionality built on a Rust engine, requiring no cloud services.
Changes: OpenClaw Skills Documentation
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~2 minutes
🚥 Pre-merge checks | ✅ 5 passed
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@README.md`:
- Line 17: The table row for "kesha-voice-kit" is in English while the rest of
the README is Chinese; update the description string for the kesha-voice-kit
entry (the text after the repository link/name) to Chinese while keeping the
repo name/link and the "Rust" column unchanged; replace the English description
with a concise Chinese translation such as: "本地语音工具包:语音转文本(25 种语言,借助 CoreML 在
Apple Silicon 上比 Whisper 快 ~19 倍,ONNX 回退)、文本转语音(Kokoro + Vosk-TTS + 180 个 macOS
系统声音,支持 SSML)、语音活动检测、语言检测(107 种语言)。Rust 引擎,无需云服务。"
- Line 17: Update the performance claim for kesha-voice-kit in the README:
locate the table row containing the project name "kesha-voice-kit" and replace
the "~19x faster" STT performance text with "~15x faster" so it matches upstream
documentation (keep the rest of the row unchanged, including "25 languages" and
"107 langs" references).
| **[github-explorer](./github-explorer/)** | GitHub 项目深度分析。多源采集 + 结构化研判报告 | search-layer, content-extract |
| **[dependency-tracker](./dependency-tracker/)** | 依赖健康检查。扫描 skills/npm/pip/CLI 版本漂移,生成报告 | `requests` |
| **[gitclaw-backup](./gitclaw-backup/)** | GitHub 备份。将 OpenClaw 工作区同步到 GitHub 仓库 | git |
| **[kesha-voice-kit](https://github.com/drakulavich/kesha-voice-kit)** | Local STT+TTS voice toolkit. Speech-to-text (25 languages, ~19x faster than Whisper on Apple Silicon via CoreML, ONNX fallback), text-to-speech (Kokoro + Vosk-TTS + 180 macOS system voices, SSML), VAD, language detection (107 langs). Rust engine. No cloud required. | Rust |
Translate entry to Chinese for consistency.
The entire README is written in Chinese, but this new entry is in English. For consistency and readability, please translate the description to match the language of the rest of the document.
🌐 Suggested translation (example)
-| **[kesha-voice-kit](https://github.com/drakulavich/kesha-voice-kit)** | Local STT+TTS voice toolkit. Speech-to-text (25 languages, ~19x faster than Whisper on Apple Silicon via CoreML, ONNX fallback), text-to-speech (Kokoro + Vosk-TTS + 180 macOS system voices, SSML), VAD, language detection (107 langs). Rust engine. No cloud required. | Rust |
+| **[kesha-voice-kit](https://github.com/drakulavich/kesha-voice-kit)** | 本地 STT+TTS 语音工具包。语音转文字(支持 25 语言,Apple Silicon 上通过 CoreML 比 Whisper 快约 19 倍,ONNX 降级)、文字转语音(Kokoro + Vosk-TTS + 180 个 macOS 系统语音,支持 SSML)、VAD、语言检测(107 语言)。Rust 引擎。无需云服务。 | Rust |
🧰 Tools
🪛 LanguageTool
[style] ~17-~17: This phrase is redundant (‘OS’ stands for ‘operating system’). Use simply “macOS”.
Context: ...text-to-speech (Kokoro + Vosk-TTS + 180 macOS system voices, SSML), VAD, language detection ...
(ACRONYM_TAUTOLOGY)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Fetch README from kesha-voice-kit repo to verify technical claims
# Fetch the upstream README
curl -s https://raw.githubusercontent.com/drakulavich/kesha-voice-kit/main/README.md > /tmp/kesha_readme.md
# Display relevant sections mentioning the claimed features
echo "=== Checking for language support claims ==="
rg -i "25 language|language.*25|speech.*language" /tmp/kesha_readme.md || echo "No exact match for 25 languages"
echo -e "\n=== Checking for performance claims ==="
rg -i "19.*faster|whisper.*faster|faster.*whisper" /tmp/kesha_readme.md || echo "No exact match for 19x faster claim"
echo -e "\n=== Checking for TTS voice count ==="
rg -i "180.*voice|voice.*180|macos.*voice" /tmp/kesha_readme.md || echo "No exact match for 180 voices"
echo -e "\n=== Checking for language detection ==="
rg -i "107.*lang|language.*detection.*107" /tmp/kesha_readme.md || echo "No exact match for 107 language detection"
echo -e "\n=== Full README context ==="
cat /tmp/kesha_readme.md
Repository: blessonism/openclaw-skills
Length of output: 12329
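The script above greps the fetched upstream README for each claim separately. A minimal offline sketch of the same comparison for the speed figure might look like the following. The helper name `extract_speedup` and the two sample strings are illustrative assumptions (the strings are copied from the README entry and the quoted upstream sentence, not fetched from the network).

```bash
#!/bin/bash
# Hypothetical helper (not part of the PR or of CodeRabbit):
# print the number from the first "~Nx faster" phrase found on stdin.
extract_speedup() {
  grep -oiE '~[0-9]+(\.[0-9]+)?x faster' | head -n 1 | grep -oE '[0-9]+(\.[0-9]+)?'
}

# Sample strings standing in for the two documents being compared.
entry='Speech-to-text (25 languages, ~19x faster than Whisper on Apple Silicon via CoreML, ONNX fallback)'
upstream='~15x faster than Whisper on Apple Silicon, ~2.5x on CPU'

claimed=$(printf '%s\n' "$entry" | extract_speedup)
documented=$(printf '%s\n' "$upstream" | extract_speedup)

# Report whether the entry's figure agrees with the upstream figure.
if [ "$claimed" = "$documented" ]; then
  echo "match: ~${claimed}x"
else
  echo "mismatch: entry says ~${claimed}x, upstream says ~${documented}x"
fi
```

Comparing the extracted numbers rather than grepping for a fixed string keeps the check usable if either figure changes later.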
Correct the performance claim — it should be ~15× faster, not ~19×.
The upstream README states the STT performance as "~15x faster than Whisper on Apple Silicon, ~2.5x on CPU" (against Whisper large-v3-turbo). The description's "~19× faster" claim is inaccurate. Update to match the documented benchmark. The other claims (25 languages for STT, 107 for language detection) are accurate and match the upstream documentation.
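One way to apply this correction without touching any other row is a `sed` substitution restricted to lines that mention the project name. This is only a sketch: the function name `fix_speed_claim` is hypothetical, the first sample row is abbreviated from the README entry, and the second row is an invented placeholder used to show that unrelated rows are left alone.

```bash
#!/bin/bash
# Hypothetical helper: rewrite the speed claim only on lines mentioning kesha-voice-kit.
fix_speed_claim() {
  sed -E '/kesha-voice-kit/ s/~19x faster/~15x faster/'
}

# Abbreviated sample of the README row under review.
row='| **[kesha-voice-kit](https://github.com/drakulavich/kesha-voice-kit)** | Local STT+TTS voice toolkit. Speech-to-text (25 languages, ~19x faster than Whisper on Apple Silicon via CoreML, ONNX fallback) | Rust |'
# Invented placeholder row; it contains the same phrase but must stay unchanged.
other='| **[example-skill](./example-skill/)** | ~19x faster placeholder row | git |'

printf '%s\n%s\n' "$row" "$other" | fix_speed_claim
```

Against the real file, the same function could be run as `fix_speed_claim < README.md > README.md.new`, followed by a diff review before replacing the original.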
kesha-voice-kit is an open-source local-first voice toolkit with a built-in OpenClaw skill:
https://github.com/drakulavich/kesha-voice-kit