Commit 0b2c738 (v2.2.0)
1 parent 2bc478e

161 files changed: 4571 additions & 878 deletions


.gitattributes

Lines changed: 4 additions & 0 deletions
````diff
@@ -1 +1,5 @@
+* text=auto
 *.bat linguist-detectable=true
+
+# I barely changed this code, and I don't want whisper.unity's C# counted in this repository's (or my profile's) language statistics
+src/Unity/Assets/UnityNeuroSpeech/Whisper/** linguist-vendored
````

.gitignore

Lines changed: 18 additions & 1 deletion
````diff
@@ -1,4 +1,21 @@
 # MkDocs
 .venv/
 .idea/
-site/
+site/
+
+# CMake and result output
+src/Native/build/
+src/Native/result/
+
+# VS
+src/UnityExtensions/.vs
+src/UnityExtensions/UnityExtensions/bin
+src/UnityExtensions/UnityExtensions/obj
+
+# Unity
+# These are not project files to download and open with Unity. Only framework files are included;
+# critical assets (TTS models, audio files) are excluded. Download the latest Release to get a full working version for your Unity project.
+/src/Unity/**
+!/src/Unity/Assets/
+!/src/Unity/Assets/UnityNeuroSpeech/
+!/src/Unity/Assets/UnityNeuroSpeech/**
````

CHANGELOG.md

Lines changed: 21 additions & 0 deletions
````diff
@@ -1,6 +1,27 @@
 > [!IMPORTANT]
 > Only the latest version is officially supported.
 
+# v2.2.0 [21.01.2026]
+
+## Unity
+
+- **Fixed IL2CPP freeze when running the TTS process.**
+- **Replaced the deprecated `Microsoft.Extensions.AI` with OllamaSharp.**
+- Centralized all framework paths into `StaticData.cs`.
+- Simplified **UNS Manager Creation** — now only the Whisper model name is required instead of the full `StreamingAssets/` path.
+- Removed legacy and low-value features (IL2CPP logging, custom framework locations).
+- `CreateSettings.cs` no longer generates `CreateAgent.cs` from a template.
+- Improved code comments, logs, and internal validation.
+
+---
+
+## ~~Setup~~
+
+- The setup script was removed and replaced with a fully documented manual installation process.
+
+---
+
+
 
 # v2.1.0 [23.11.2025]
````

LICENSES.md

Lines changed: 32 additions & 271 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 1 addition & 14 deletions
````diff
@@ -18,19 +18,6 @@
 
 ---
 
-> [!IMPORTANT]
-> Full response time (from your speech to TTS generated voice) can sometimes take a minute or more right now.
-> This isn't a bug – it's the current reality of running powerful AI models locally and for free:
-> - Good STT (like Whisper) needs time to be accurate.
-> - Even small LLM (via Ollama) needs some time to think up a good response.
-> - Generating high-quality voice with TTS is also a complex and not fast task.
->
-> The key thing is: this framework is built on the most optimal and user-friendly local solutions available for each stage (STT, LLM, TTS). You have the freedom to choose and download your own models (like Whisper `.bin` for STT and any model for Ollama), to use any custom voice you want, to make it for different languages, to customize your agents and to find the perfect balance between speed and quality for your setup.
->
-> Also this project is actively maintained. With every update, I'm working on making it faster, more optimized, and easier to use!
-
----
-
 UnityNeuroSpeech is an open-source framework for creating **fully voice-interactive AI agents** inside Unity.
 It connects:
 
@@ -76,7 +63,7 @@ No subscriptions, no accounts, no OpenAI API keys.
 
 ## 🧪 Built with:
 
-- 🧠 [Microsoft.Extensions.AI](https://learn.microsoft.com/en-us/dotnet/ai/) (Ollama)
+- 🧠 [OllamaSharp](https://github.com/awaescher/OllamaSharp)
 - 🎤 [whisper.unity](https://github.com/Macoron/whisper.unity)
 - [UniTask](https://github.com/Cysharp/UniTask)
 - 🧊 [Coqui XTTS](https://github.com/idiap/coqui-ai-TTS)
````
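The UniTask dependency in the list above is usually installed through Unity's Package Manager with a git URL. A sketch of the corresponding `Packages/manifest.json` entry — the package name `com.cysharp.unitask` is taken from the UniTask README, so treat it as an assumption to verify against that repository:

```json
{
  "dependencies": {
    "com.cysharp.unitask": "https://github.com/Cysharp/UniTask.git?path=src/UniTask/Assets/Plugins/UniTask"
  }
}
```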

docs/index.md

Lines changed: 74 additions & 9 deletions
````diff
@@ -1,21 +1,88 @@
-# 🚀 Quick Start
+# 🚀 Getting Started
 
 ---
 
 ## 🛠 Installing Requirements
 
+UnityNeuroSpeech requires a few things to be installed before you can use it in Unity. Here is what you need:
+
 ---
 
-UnityNeuroSpeech requires several programs to be installed.
-You can simply run `setup.bat` — it will download everything automatically.
-Then just import the `.unitypackage` into your project.
+### 1. Ollama
+
+**Ollama** is a platform for running large language models (LLMs) locally. You can use models like DeepSeek, Gemma, Qwen, etc.
+Note that small models may reduce accuracy and context understanding, while big models can take a long time to respond.
+
+Install it from the [official website](https://ollama.com/download).
+
+Then download an LLM model with this command:
+```console
+ollama pull modelname
+```
+> For a quick test I recommend **qwen2.5:3b** - it responds very fast.
 
 ---
 
-## 🎙️ Voice Files
+### 2. STT
+
+**Whisper** — a Speech-To-Text model that transcribes and translates audio with high accuracy.
+
+Download a Whisper model from [here](https://huggingface.co/ggerganov/whisper.cpp/tree/main).
+> For a quick test I recommend `ggml-base.bin`.
+
+---
+
+### 3. TTS
+
+**UV** — a modern, ultra-fast Python package and environment manager. It replaces traditional tools like `pip`.
+**Coqui TTS** (which runs XTTS) uses **UV** to simplify installation and allows running the TTS command directly, without manual Python setup.
+
+**Coqui XTTS** — a Text-To-Speech model that can generate speech in any custom voice you want: Chester Bennington, Vito Corleone (The Godfather), Cyn (Murder Drones) or any other.
+
+
+Install **UV** with this command:
+
+```console
+powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
+```
+
+Then install **Coqui TTS**:
+
+```console
+uv tool install --python 3.11 "coqui-tts==0.27.2"
+```
+> You can try installing the latest **Coqui TTS** version, but I can't guarantee that it will work.
 
 ---
 
+### 4. Unity deps
+
+UnityNeuroSpeech uses **UniTask** for its async code - I think everyone knows what **UniTask** is.
+You can install it from the [official repository](https://github.com/Cysharp/UniTask/releases).
+
+Or via **UPM**:
+
+```console
+https://github.com/Cysharp/UniTask.git?path=src/UniTask/Assets/Plugins/UniTask
+```
+
+---
+
+### 5. UnityNeuroSpeech
+
+Now you can finally import **UnityNeuroSpeech** into your Unity project.
+
+Download `UnityNeuroSpeech.X.X.X.unitypackage` and `UNS_StreamingAssets.unitypackage` from the [official repository](https://github.com/HardCodeDev777/UnityNeuroSpeech/releases). Then import them into your project.
+
+They are split so you don't have to import the almost 2 GB XTTS model if you only need code/fixes.
+
+**Don't forget to put your downloaded Whisper model (`.bin`) in `Assets/StreamingAssets/UnityNeuroSpeech/Whisper/` - yes, it's important.**
+
+---
+
+## 🎙️ Voice Files
+
 Don’t forget that you need voice files for TTS speech.
 Make sure your files meet the following requirements:
 
@@ -38,7 +105,5 @@ All voices must be placed in:
 
 ## 🖼️ Microphone Sprites
 
----
-
-You’ll need two sprites for the microphone state (enabled/disabled).
-Yes — without them, it won’t work 🤠
+You’ll need two sprites for the microphone state (enabled/disabled).
+But for a quick test you can use any default Unity sprites.
````
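Stepping back to the Ollama step above: once a model is pulled and `ollama serve` is running, you can sanity-check it outside Unity through Ollama's local REST API, which listens on `http://localhost:11434` by default. A minimal sketch of the request body for the `/api/generate` endpoint — the model name is just the quick-test suggestion from above, substitute whatever you pulled:

```python
import json

def build_generate_request(model: str, prompt: str) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    return {
        "model": model,    # any model you pulled, e.g. "qwen2.5:3b"
        "prompt": prompt,
        "stream": False,   # ask for a single JSON response instead of a stream
    }

payload = build_generate_request("qwen2.5:3b", "Say hi in five words.")
# Equivalent command-line check once the server is running:
#   curl http://localhost:11434/api/generate -d '{"model":"qwen2.5:3b","prompt":"...","stream":false}'
print(json.dumps(payload))
```

If this returns a response outside Unity, the LLM half of the pipeline is working before you ever open the editor.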

docs/unity/agent-api.md

Lines changed: 0 additions & 4 deletions
````diff
@@ -48,8 +48,6 @@ public class AlexBehaviour : AgentBehaviour
 
 ### 🔍 Methods Overview
 
----
-
 - **AfterTTS** — Called after the audio playback finishes.
 - **BeforeTTS** — Called before sending text to the TTS model.
 - **AfterSTT** — Called after the STT model finishes transcribing microphone input.
@@ -70,8 +68,6 @@ public override void Awake()
 
 ### 💡 What Is `SetBehaviourToAgent()`?
 
----
-
 The `SetBehaviourToAgent()` method connects your `AgentBehaviour` to the agent’s internal event hooks:
 
 ```csharp
````

docs/unity/steps-to-make-it-work.md

Lines changed: 3 additions & 19 deletions
````diff
@@ -9,40 +9,24 @@ You can find tooltips for each field directly in the Unity Editor.
 
 ## Step 1. 🧪 Settings
 
----
-
 Go to **UnityNeuroSpeech → Main → Create Settings** in the Unity toolbar.
 Default settings are recommended.
 
 ---
 
 ## Step 2. 👀 UNS Manager
 
----
-
 **UnityNeuroSpeech Manager** is a GameObject in your scene that controls all non-agent scripts.
 Without it, no agent (talkable AI) will work.
 
----
 
 Create a `Dropdown` in your scene.
-Then go to **UnityNeuroSpeech → Main → Create UNS Manager**.
-
-The important setting there is:
-
-- **Whisper model path in StreamingAssets** — path to your downloaded Whisper model (`.bin`) inside the `StreamingAssets` folder (without the `Assets` directory).
-Example:
-If the full path is
-`Assets/StreamingAssets/UnityNeuroSpeech/Whisper/ggml-medium.bin`
-then you should enter
-`UnityNeuroSpeech/Whisper/ggml-medium.bin`
+Then go to **UnityNeuroSpeech → Main → Create UNS Manager**.
 
 ---
 
 ## Step 3. 🧠 Agent
 
----
-
 An **Agent** in UnityNeuroSpeech is a GameObject that can listen, respond, and talk using LLM.
 **Once you create your first agent, you’ll be able to talk with your AI!**
 
@@ -52,7 +36,7 @@ Add a `Button` and an `AudioSource` to your scene.
 Then go to **UnityNeuroSpeech → Main → Create Agent**.
 Here are some important settings:
 
-- **Agent index** — the index mentioned in the QuickStart.
+- **Agent index** — the index mentioned in the Getting Started guide.
 It links an agent to its voice file.
 ⚠️ Each agent must have a unique index!
 
@@ -95,7 +79,7 @@ Agent performance (“speed”) depends on:
 - Voice files length
 - AI response size
 
-Small models like **deepseek-r1:7b** or **ggml-tiny.bin** run fast but may ignore system prompts (emotions, actions, etc.).
+Small models like **qwen2.5:3b** or **ggml-base.bin** run fast, but may ignore system prompts (emotions, actions, etc.).
 Large models like **ggml-large.bin** usually work perfectly — but will be very slow 😐
 
 > On first load, TTS may respond slowly — it’s ok. It'll work faster next time.
````
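As of v2.2.0, the UNS Manager asks only for the Whisper model name and resolves the full `StreamingAssets` path itself. A rough sketch of that resolution in Python — the function name is hypothetical, not the framework's actual API; the folder is the one the docs tell you to place the model in:

```python
from pathlib import PurePosixPath

# Fixed folder from the docs: Assets/StreamingAssets/UnityNeuroSpeech/Whisper/
WHISPER_DIR = PurePosixPath("Assets/StreamingAssets/UnityNeuroSpeech/Whisper")

def resolve_whisper_model(model_name: str) -> str:
    """Hypothetical helper: turn a bare model name into the full asset path."""
    if not model_name.endswith(".bin"):
        raise ValueError("Whisper models are .bin files, e.g. ggml-base.bin")
    return str(WHISPER_DIR / model_name)

print(resolve_whisper_model("ggml-base.bin"))
# → Assets/StreamingAssets/UnityNeuroSpeech/Whisper/ggml-base.bin
```

This is why the model file must actually live in that folder: the framework no longer accepts an arbitrary path, only a name it can append to the fixed location.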

docs/unity/useful-tools.md

Lines changed: 8 additions & 10 deletions
````diff
@@ -2,37 +2,35 @@
 
 ---
 
-UnityNeuroSpeech provides several Editor tools to make development more convenient.
+UnityNeuroSpeech provides several Editor tools to improve the development experience.
 
 ---
 
 ## 🗒️ Prompts Test
 
----
 
 Let’s say you want to check how a selected LLM model responds to a specific prompt.
-Normally, you would have to run the game, wait for Whisper to load, say something into the microphone (and risk transcription errors), then wait for the LLM and TTS to finish — quite the hardcore workflow, right?
+Normally, you would have to run the game, wait for Whisper to load, say something into the microphone,
+then wait for the LLM and TTS to finish — quite hardcore, right?
 
 This tool allows you to test prompts instantly.
 You only wait for the **LLM** (as usual) to generate a response — and you can even see the **generation time in milliseconds**!
 
----
 
-To access it, go to **UnityNeuroSpeech → Tools → Prompts Test**.
+Go to **UnityNeuroSpeech → Tools → Prompts Test**.
 
 ---
 
 ## 🕵️‍♂️ Decode Encoded
 
----
 
 If you use AES encryption, your `.json` dialog history files will be encrypted.
 But what if you want to view their contents?
 This tool lets you decrypt and read them easily.
 
----
 
-To access it, go to **UnityNeuroSpeech → Tools → Decode Encoded**.
+Go to **UnityNeuroSpeech → Tools → Decode Encoded**.
+
+Important setting:
 
-Note about the **Key to encrypt** field:
-You must use the same key you specified in your `AgentBehaviour` script.
+- **Key to encrypt**: You must use the same key you specified in your `AgentBehaviour` script.
````

mkdocs.yml

Lines changed: 1 addition & 1 deletion
````diff
@@ -14,7 +14,7 @@ theme:
     - search.suggest
     - header.autohide
 nav:
-  - Quickstart: index.md
+  - Getting started: index.md
   - Unity:
     - Steps to make it work: unity/steps-to-make-it-work.md
     - Agent API: unity/agent-api.md
````
