Commit 4454eae

refactor: full codebase overhaul — security, architecture, tests, DX
- Phase 1: security fixes (shell=True removed, brightness injection fixed, PROTECTED_PROCESSES whitelist, audio file cleanup, regex validation)
- Phase 2: custom exceptions hierarchy, type hints on all public APIs, pinned requirements.txt
- Phase 3: nlu.py split into 3 modules, skills.py split into 4 modules, AssistantState dataclass replaces globals, OllamaClient with injectable base_url
- Phase 4: 4 test files added (nlu rules, app resolver, config, exceptions)
- Phase 5: setup.bat, GitHub Actions lint workflow, updated README
1 parent 6c83b08 commit 4454eae
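
The Phase 1 changes are not among the files shown on this page, so the following is only a sketch of the general pattern the commit message names: build an argument list instead of a `shell=True` string, validate numeric input before it reaches a command line, and refuse to terminate protected processes. The function names, the contents of `PROTECTED_PROCESSES`, and the PowerShell/WMI call are assumptions for illustration, not the commit's code.

```python
from __future__ import annotations

# Hypothetical whitelist; the commit's real PROTECTED_PROCESSES is not shown here.
PROTECTED_PROCESSES = frozenset({"explorer.exe", "winlogon.exe", "csrss.exe"})


def build_brightness_command(level: object) -> list[str]:
    """Return an argv list for setting brightness; never a shell string."""
    # Converting to int first rejects text such as "50; calc" outright.
    try:
        value = int(str(level).strip())
    except ValueError as exc:
        raise ValueError(f"brightness must be an integer, got {level!r}") from exc
    if not 0 <= value <= 100:
        raise ValueError(f"brightness must be between 0 and 100, got {value}")
    # Assumed mechanism: the WMI monitor brightness method called via PowerShell.
    script = (
        "(Get-WmiObject -Namespace root/WMI -Class WmiMonitorBrightnessMethods)"
        f".WmiSetBrightness(1, {value})"
    )
    # The list would be passed to subprocess.run(argv) without shell=True.
    return ["powershell", "-NoProfile", "-Command", script]


def can_terminate(process_name: str) -> bool:
    """Return False for processes the assistant must never close."""
    return process_name.strip().lower() not in PROTECTED_PROCESSES
```

Because only the validated integer is interpolated into the script, spoken text cannot add extra commands to it.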

28 files changed

Lines changed: 2103 additions & 1539 deletions

.github/workflows/lint.yml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+name: Lint & Type Check
+
+on:
+  push:
+    branches: ["**"]
+  pull_request:
+    branches: [main]
+
+jobs:
+  lint:
+    runs-on: windows-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Install lint tools
+        run: pip install ruff mypy
+
+      - name: Ruff (lint + format check)
+        run: ruff check . --select E,F,W,I --ignore E501
+
+      - name: Mypy (type check)
+        run: mypy assistant --ignore-missing-imports --no-strict-optional

README.md

Lines changed: 104 additions & 23 deletions
@@ -1,36 +1,117 @@
 # Personal PC Assistant
 
-A voice assistant for Windows that actually listens and executes commands. Built it because I got tired of clicking through menus and wanted my computer to feel like a conversation partner.
-
-## How it looks
+A local Windows voice assistant. Hold a hotkey → speak → the assistant executes your command. Everything runs offline: speech recognition via Faster Whisper, intent detection via rule engine + Ollama fallback.
 
 ![Control Panel Screenshot](assets/gui-screenshot.png)
 
-## Why this is cool
+## Features
+
+- Push-to-talk voice control (default: Right Shift)
+- 15+ built-in skills: open/close apps, volume, brightness, screenshots, Wi-Fi toggle, shutdown, web search, clipboard
+- Cyberpunk-themed PyQt6 GUI with animated waveform
+- Teach custom voice commands without code — saved to `config.json`
+- Fully local: Faster Whisper ASR + Ollama LLM, no cloud APIs
+
+## Architecture
+
+```
+main_gui.py / main_fast.py
+
+├─ assistant/
+│ ├─ core/
+│ │ ├─ exceptions.py (AssistantError hierarchy)
+│ │ └─ state.py (AssistantState dataclass)
+│ ├─ nlu/
+│ │ ├─ normalizer.py (text cleaning, lemmatization)
+│ │ ├─ ollama_client.py (OllamaClient — injectable base_url)
+│ │ └─ engine.py (rules + Ollama fallback → intent)
+│ ├─ skills/
+│ │ ├─ app_control.py (open/close/minimize apps)
+│ │ ├─ system_control.py (volume, brightness, shutdown …)
+│ │ ├─ browser.py (search, open website)
+│ │ ├─ clipboard.py (copy/paste)
+│ │ └─ registry.py (SKILLS dict + Skill Protocol)
+│ ├─ asr.py (Faster Whisper transcription)
+│ ├─ recorder.py (push-to-talk audio capture)
+│ └─ runner.py (skill dispatch + confirmation flow)
+└─ config.json (hotkey, app aliases, custom commands)
+```
+
+## Entry points
+
+| File | Use when |
+|------|----------|
+| `main_gui.py` | Daily use — cyberpunk control panel, visual feedback |
+| `main_fast.py` | Debugging / scripting — console-only, lighter startup |
+
+## Setup
+
+### Requirements
+
+- Windows 10/11
+- Python 3.10+
+- [Ollama](https://ollama.ai) installed and on PATH
+- Microphone
+
+### Quick start
+
+```bat
+:: 1. Clone
+git clone https://github.com/Bogdusik/Personal-PC-Assistant.git
+cd Personal-PC-Assistant
+
+:: 2. Run setup (checks admin, copies config, installs deps)
+setup.bat
+
+:: 3. Pull an Ollama model (one-time)
+ollama pull gemma3:12b
+
+:: 4. Launch (as Administrator for hotkey support)
+python main_gui.py
+```
+
+### Manual setup
+
+```bat
+python -m venv venv
+venv\Scripts\activate
+pip install -r requirements.txt
+copy config.example.json config.json
+```
+
+Edit `config.json`:
+- `hotkey` — key to hold while speaking (default: `"right shift"`)
+- `mic_device` — microphone device index or `null` for default
+- `ollama_model` — model name (default: `"gemma3:12b"`)
+- `app_aliases` — map your app names to full `.exe` paths
+
+> **Run as Administrator** — required for keyboard hotkey hooks.
 
-**Animation** - Custom sci-fi/cyberpunk GUI with animated waveform that pulses from your voice in real-time, smooth fade-in on launch, glassmorphism effects
-**Fallback** - Works even if Ollama crashes, graceful fallback guides you through setup
-**Local AI** - Everything runs locally (Faster Whisper + Ollama), nothing goes to the cloud
-**Ripple on buttons** - Interactive interface with ripple effects on hover, the whole UI "wakes up" when you launch it
+## Environment variables
 
-## How to run
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `OLLAMA_BASE_URL` | `http://localhost:11434` | Override Ollama server URL |
+| `OLLAMA_MODEL` | `gemma3:12b` | Override model (also settable in config.json) |
 
-1. `git clone https://github.com/Bogdusik/Personal-PC-Assistant.git`
-2. `cd Personal-PC-Assistant`
-3. `python -m venv venv` (optional, but recommended)
-4. `venv\Scripts\activate` (Windows) or `source venv/bin/activate` (Linux/Mac)
-5. `pip install -r requirements.txt`
-6. `ollama pull gemma3:12b` (install Ollama first from ollama.ai)
-7. `python main_gui.py` (run as Administrator for hotkey support)
+## Teaching custom commands
 
-**Important:** Works only on Windows 10/11. Requires microphone access. Run as Administrator for hotkey functionality (default: Right Shift - hold to speak, release to process).
+Say *"новая команда"* (new command) or press **Ctrl+4** while the assistant is running to add a custom voice trigger:
+- Choose a match type: `equals`, `startswith`, `contains`, or `regex`
+- Pick an intent and its arguments
+- Commands are saved to `config.json` immediately
 
-## What I learned from this
+## Running tests
 
-• Mastered speech recognition in practice - Faster Whisper is incredible
-• Got comfortable with PyQt6 and Windows API (PyCaw, win32gui, keyboard hooks)
-• Finally built something I always wanted - talking to my computer feels natural now
+```bat
+pip install pytest
+pytest tests/ -v
+```
 
-## Want to use it?
+## Development
 
-Fork it, improve it. I won't be offended.
+```bat
+pip install ruff mypy
+ruff check .
+mypy assistant --ignore-missing-imports
+```
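
The new README lists four match types for custom commands (`equals`, `startswith`, `contains`, `regex`), and the commit message mentions regex validation. How the rule engine applies them is not shown in this diff; the following is a minimal sketch of one plausible implementation. The function names and the case-insensitive comparison are assumptions.

```python
from __future__ import annotations

import re

MATCH_TYPES = {"equals", "startswith", "contains", "regex"}


def validate_pattern(match_type: str, pattern: str) -> None:
    """Reject unknown match types and regexes that do not compile."""
    if match_type not in MATCH_TYPES:
        raise ValueError(f"unknown match type: {match_type!r}")
    if match_type == "regex":
        try:
            re.compile(pattern)
        except re.error as exc:
            raise ValueError(f"invalid regex {pattern!r}: {exc}") from exc


def matches(match_type: str, pattern: str, text: str) -> bool:
    """Return True if the recognized text triggers the custom command."""
    validate_pattern(match_type, pattern)
    text = text.strip().lower()
    if match_type == "regex":
        return re.search(pattern, text, flags=re.IGNORECASE) is not None
    pattern = pattern.strip().lower()
    if match_type == "equals":
        return text == pattern
    if match_type == "startswith":
        return text.startswith(pattern)
    return pattern in text  # "contains"
```

Validating the pattern when a command is saved, rather than when it is first matched, keeps a malformed regex from ever reaching `config.json`.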

assistant/asr.py

Lines changed: 40 additions & 58 deletions
@@ -1,107 +1,89 @@
-import os
-import logging
+from __future__ import annotations
 import gc
+import logging
+import os
+
+from assistant.core.exceptions import AudioError
 
 os.environ["CUDA_VISIBLE_DEVICES"] = ""
 os.environ["CT2_VERBOSE"] = "0"
 
+logger = logging.getLogger(__name__)
+
 _cached_model = None
-_model_name = None
+_model_name: str | None = None
+
 
 def init_asr():
     global _cached_model, _model_name
-
+
     if _cached_model is not None:
-        logging.info(f"Используем кэшированную модель: {_model_name}")
+        logger.info("Используем кэшированную модель: %s", _model_name)
         return _cached_model
-
+
     print("Гружу ASR-модель (faster-whisper, CPU)...", flush=True)
-    logging.info("Начинаю загрузку ASR модели")
-
-    models_to_try = [
+
+    try:
+        from faster_whisper import WhisperModel
+    except ImportError as exc:
+        raise AudioError(f"faster-whisper не установлен: {exc}") from exc
+
+    for model_name, device, compute_type in [
         ("tiny", "cpu", "int8"),
         ("tiny", "cpu", "float32"),
         ("base", "cpu", "int8"),
         ("base", "cpu", "float32"),
-    ]
-
-    for model_name, device, compute_type in models_to_try:
+    ]:
         try:
             print(f"Пробую модель: {model_name} ({compute_type})...", flush=True)
-            logging.info(f"Попытка загрузки модели: {model_name} ({compute_type})")
-
-            from faster_whisper import WhisperModel
             model = WhisperModel(model_name, device=device, compute_type=compute_type)
-
             _cached_model = model
             _model_name = f"{model_name}_{compute_type}"
-
-            print(f"✅ Модель {model_name} загружена успешно!", flush=True)
-            logging.info(f"ASR модель {model_name} загружена и кэширована")
+            print(f"✅ Модель {model_name} загружена!", flush=True)
+            logger.info("ASR модель %s загружена", _model_name)
             return model
-
-        except ImportError as e:
-            logging.error(f"Ошибка импорта faster-whisper: {e}")
-            print(f"❌ Ошибка импорта: {e}", flush=True)
-            break
-        except Exception as e:
-            error_msg = str(e)[:100]
-            print(f"❌ Ошибка с {model_name}: {error_msg}...", flush=True)
-            logging.warning(f"Ошибка загрузки {model_name}: {e}")
+        except Exception as exc:
+            logger.warning("Ошибка загрузки %s/%s: %s", model_name, compute_type, exc)
             continue
-
+
     try:
-        print("Пробую базовую загрузку...", flush=True)
-        logging.info("Попытка базовой загрузки модели")
-        from faster_whisper import WhisperModel
         model = WhisperModel("tiny")
-
         _cached_model = model
         _model_name = "tiny_default"
-
         print("✅ Базовая модель загружена!", flush=True)
-        logging.info("Базовая ASR модель загружена успешно")
+        logger.info("Базовая ASR модель загружена")
         return model
-
-    except Exception as e:
-        error_msg = f"Критическая ошибка загрузки ASR: {e}"
-        print(f"❌ {error_msg}", flush=True)
-        logging.error(error_msg)
-        raise Exception(f"Не удалось загрузить ни одну модель Whisper: {e}")
+    except Exception as exc:
+        raise AudioError(f"Не удалось загрузить ни одну модель Whisper: {exc}") from exc
 
-def cleanup_asr():
+
+def cleanup_asr() -> None:
     global _cached_model, _model_name
     if _cached_model is not None:
-        logging.info("Очищаю память ASR модели")
+        logger.info("Очищаю память ASR модели")
         del _cached_model
         _cached_model = None
         _model_name = None
         gc.collect()
 
+
 def transcribe(model, path: str) -> str:
     try:
-        logging.info(f"Начинаю транскрипцию файла: {path}")
         segments, _ = model.transcribe(
             path,
             language="ru",
             vad_filter=True,
             vad_parameters=dict(min_silence_duration_ms=300),
-            # Максимально быстрый режим: жадный декодер без бима
             beam_size=1,
             best_of=1,
-            condition_on_previous_text=False
+            condition_on_previous_text=False,
         )
-
         result = "".join(seg.text for seg in segments).strip()
-        logging.info(f"Транскрипция завершена: '{result}'")
+        logger.info("Транскрипция завершена: '%s'", result)
         return result
-
-    except FileNotFoundError:
-        logging.error(f"Аудио файл не найден: {path}")
-        print(f"❌ Файл не найден: {path}")
-        return ""
-    except Exception as e:
-        error_msg = f"ASR ошибка транскрипции: {e}"
-        logging.error(error_msg)
-        print(f"❌ {error_msg}")
-        return ""
+    except FileNotFoundError as exc:
+        logger.error("Аудио файл не найден: %s", path)
+        raise AudioError(f"Файл не найден: {path}") from exc
+    except Exception as exc:
+        logger.error("ASR ошибка транскрипции: %s", exc)
+        raise AudioError(f"Ошибка транскрипции: {exc}") from exc
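
The loading loop in `init_asr` tries each (model, device, compute type) combination in order, returns the first one that loads, and raises `AudioError` only when every attempt fails. The same pattern can be sketched in isolation so that it is testable without `faster-whisper` installed. Here `load_first_available` and the injected `loader` callable are hypothetical names, and `AudioError` is a local stand-in for `assistant.core.exceptions.AudioError`.

```python
from __future__ import annotations

from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


class AudioError(Exception):
    """Local stand-in for assistant.core.exceptions.AudioError."""


def load_first_available(
    candidates: Iterable[tuple[str, str, str]],
    loader: Callable[[str, str, str], T],
) -> tuple[T, str]:
    """Return (model, label) for the first candidate that loads."""
    last_error: Exception | None = None
    for model_name, device, compute_type in candidates:
        try:
            model = loader(model_name, device, compute_type)
        except Exception as exc:
            # Remember the failure and move on to the next combination.
            last_error = exc
            continue
        return model, f"{model_name}_{compute_type}"
    raise AudioError(f"no Whisper model could be loaded: {last_error}") from last_error
```

In the real module the loader would presumably wrap `WhisperModel(model_name, device=device, compute_type=compute_type)`; passing it in as a parameter is a design choice made for this sketch, not something the commit does.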

assistant/core/__init__.py

Whitespace-only changes.

assistant/core/exceptions.py

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+from __future__ import annotations
+
+
+class AssistantError(Exception):
+    pass
+
+
+class OllamaError(AssistantError):
+    pass
+
+
+class ConfigError(AssistantError):
+    pass
+
+
+class SkillError(AssistantError):
+    pass
+
+
+class AudioError(AssistantError):
+    pass
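
Because every new exception derives from `AssistantError`, a caller can handle one subclass specifically and still catch the rest with a single clause, while unrelated exceptions propagate unchanged. The sketch below repeats the class definitions so it runs on its own. `run_step` is a hypothetical helper, not part of the commit.

```python
from __future__ import annotations

from typing import Callable


class AssistantError(Exception):
    pass


class OllamaError(AssistantError):
    pass


class SkillError(AssistantError):
    pass


class AudioError(AssistantError):
    pass


def run_step(step: Callable[[], str]) -> str:
    """Run one pipeline step and turn assistant errors into a message."""
    try:
        return step()
    except AudioError as exc:
        # Most specific handler first: microphone and transcription problems.
        return f"audio problem: {exc}"
    except AssistantError as exc:
        # Any other assistant-level failure (Ollama, config, skills).
        return f"assistant error: {exc}"
```

Exceptions outside the hierarchy, such as a `KeyError` from a programming mistake, are deliberately not caught here and still surface as bugs.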

assistant/core/state.py

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+from __future__ import annotations
+import subprocess
+from dataclasses import dataclass, field
+
+
+@dataclass
+class AssistantState:
+    cfg: dict = field(default_factory=dict)
+    hotkey: str = "right shift"
+    mic_device: int | str | None = None
+    last_text: str = ""
+    ollama_process: subprocess.Popen | None = None
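
The commit message says `AssistantState` replaces module-level globals. How the rest of the code consumes it is not shown in this diff; a minimal sketch of the idea is to create one instance at startup and pass it to functions explicitly. The dataclass is copied from the diff above, while `apply_config` is a hypothetical helper.

```python
from __future__ import annotations

import subprocess
from dataclasses import dataclass, field


@dataclass
class AssistantState:
    cfg: dict = field(default_factory=dict)
    hotkey: str = "right shift"
    mic_device: int | str | None = None
    last_text: str = ""
    ollama_process: subprocess.Popen | None = None


def apply_config(state: AssistantState, cfg: dict) -> AssistantState:
    """Copy known settings from a loaded config dict into the state."""
    state.cfg = dict(cfg)  # store a copy so later edits to cfg do not leak in
    state.hotkey = cfg.get("hotkey", state.hotkey)
    state.mic_device = cfg.get("mic_device", state.mic_device)
    return state


state = apply_config(AssistantState(), {"hotkey": "f9"})
```

Using `field(default_factory=dict)` gives each instance its own `cfg` dictionary, which matters in tests that create several states side by side.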
