Circuit-Digest/ESP32-Speech-To-Text

🎙️ Speech-to-Text on ESP32 using Wit.ai

Convert spoken words to text in real time using an ESP32 microcontroller, an I2S MEMS microphone, and the Wit.ai speech recognition API, with results displayed on an OLED screen and the Serial Monitor.


📖 Overview

This project demonstrates how to build a low-cost, Wi-Fi-enabled Speech-to-Text system on the ESP32. Audio is captured via an I2S digital microphone, streamed to the Wit.ai cloud API for transcription, and the resulting text is shown on a connected OLED display.


✨ Features

  • 🎤 Real-time audio capture using an I2S MEMS microphone
  • ☁️ Cloud-based speech recognition via Wit.ai
  • 🖥️ Transcribed text displayed on an SSD1306 OLED screen
  • 📟 Output also available via Serial Monitor (for debugging)
  • 📶 Wi-Fi connectivity using the ESP32's built-in radio

🛠️ Hardware Requirements

| Component | Description |
|---|---|
| ESP32 Dev Board | Main microcontroller (e.g., ESP32-WROOM-32) |
| I2S MEMS Microphone | e.g., INMP441 |
| OLED Display | 0.96" SSD1306 (128×64, I2C) |
| Jumper Wires | For connections |
| Breadboard | Optional, for prototyping |

🔌 Wiring Diagram

I2S Microphone → ESP32

| Mic Pin | ESP32 Pin |
|---|---|
| VDD | 3.3V |
| GND | GND |
| WS (LRCK) | GPIO 15 |
| SCK (BCLK) | GPIO 14 |
| SD (Data) | GPIO 32 |
| L/R | GND (Left channel) |

OLED Display → ESP32 (I2C)

| OLED Pin | ESP32 Pin |
|---|---|
| VCC | 3.3V |
| GND | GND |
| SDA | GPIO 21 |
| SCL | GPIO 22 |

⚠️ Pin numbers may vary depending on your specific ESP32 board. Adjust in the code as needed.
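
The two wiring tables can be collected into a single block of constants so that a different board only needs changes in one place. This is a minimal sketch: the namespace, the constant names, and the `pinsAreDistinct()` check are illustrative and are not taken from the repository's sketch.

```cpp
// Pin assignments copied from the wiring tables above.
// All names here are hypothetical; adjust the values for your board.
namespace pins {
constexpr int MIC_WS_PIN   = 15;  // microphone WS (LRCK)
constexpr int MIC_SCK_PIN  = 14;  // microphone SCK (BCLK)
constexpr int MIC_SD_PIN   = 32;  // microphone SD (data out)
constexpr int OLED_SDA_PIN = 21;  // OLED I2C data
constexpr int OLED_SCL_PIN = 22;  // OLED I2C clock
}  // namespace pins

// Returns true when no GPIO number is assigned to two signals.
inline bool pinsAreDistinct() {
    const int all[] = {pins::MIC_WS_PIN, pins::MIC_SCK_PIN, pins::MIC_SD_PIN,
                       pins::OLED_SDA_PIN, pins::OLED_SCL_PIN};
    const int n = sizeof(all) / sizeof(all[0]);
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            if (all[i] == all[j]) return false;
        }
    }
    return true;
}
```

Calling `pinsAreDistinct()` once in `setup()` and printing a warning when it returns false is one cheap way to catch a copy-paste mistake after remapping pins.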


📦 Software Requirements

  • Arduino IDE (v1.8+ or v2.x)
  • ESP32 Board Support Package
  • Required Libraries:
    • Adafruit SSD1306
    • Adafruit GFX Library
    • WiFiClientSecure (built-in with ESP32 core)
    • ArduinoJson (optional, for parsing Wit.ai response)

⚙️ Setup & Installation

1. Clone the Repository

git clone https://github.com/your-username/esp32-speech-to-text.git
cd esp32-speech-to-text

2. Install ESP32 Board in Arduino IDE

  1. Go to File → Preferences
  2. Add this URL to Additional Boards Manager URLs:
    https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json

  3. Go to Tools → Board → Boards Manager, search for esp32, and install it.

3. Install Required Libraries

In Arduino IDE, go to Sketch → Include Library → Manage Libraries and install:

  • Adafruit SSD1306
  • Adafruit GFX Library
  • ArduinoJson (if used)

4. Configure Your Credentials

Open the main .ino file and update the following:

const char* ssid     = "YOUR_WIFI_SSID";
const char* password = "YOUR_WIFI_PASSWORD";
const char* witai_token = "YOUR_WIT_AI_ACCESS_TOKEN";

5. Get a Wit.ai Access Token

  1. Go to https://wit.ai and sign in with your Facebook/Meta account.
  2. Create a new app and select your language.
  3. Copy the Server Access Token from Settings.

6. Upload to ESP32

  1. Select your board under Tools → Board → ESP32 Dev Module
  2. Select the correct Port
  3. Click Upload

🚀 How It Works

[User Speaks]
     ↓
[I2S Mic captures audio]
     ↓
[ESP32 buffers audio samples]
     ↓
[Audio sent to Wit.ai via HTTPS POST]
     ↓
[Wit.ai returns transcribed text (JSON)]
     ↓
[Text displayed on OLED + Serial Monitor]
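
The buffering step in the flow above can be sketched in plain C++. The README does not state the capture format, so this example assumes 16 kHz, 16-bit mono PCM and a microphone that left-justifies its data in a 32-bit I2S slot. The function and constant names are hypothetical.

```cpp
#include <stddef.h>
#include <stdint.h>

// Assumed capture format (not confirmed by this README): 16 kHz, 16-bit, mono.
constexpr uint32_t kSampleRateHz = 16000;
constexpr size_t kBytesPerSample = sizeof(int16_t);

// Bytes needed to hold `seconds` of audio in the assumed format.
constexpr size_t bufferBytes(uint32_t seconds) {
    return static_cast<size_t>(kSampleRateHz) * seconds * kBytesPerSample;
}

// Reduce one raw 32-bit I2S word to a 16-bit PCM sample by keeping the
// top 16 bits. Assumes the sample is left-justified in the 32-bit slot
// and that right-shifting a negative value is an arithmetic shift, which
// holds on GCC and Clang and is guaranteed from C++20 onward.
inline int16_t toPcm16(int32_t raw) {
    return static_cast<int16_t>(raw >> 16);
}
```

Under these assumptions `bufferBytes(3)` is 96000 bytes, which is worth keeping in mind on a device with limited RAM: a sketch may prefer to send audio in smaller blocks rather than hold a long recording in memory.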

📁 Project Structure

esp32-speech-to-text/
├── esp32_speech_to_text.ino   # Main Arduino sketch
├── wit_ai.h                   # Wit.ai API communication
├── i2s_mic.h                  # I2S microphone configuration
├── oled_display.h             # OLED display helpers
└── README.md
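
The contents of `oled_display.h` are not shown in this README. As an illustration of the kind of helper such a file might hold, the sketch below wraps a transcript into lines that fit the display. It assumes the Adafruit GFX built-in font at text size 1, which uses a 6×8 pixel cell, so a 128×64 panel fits 21 columns by 8 rows. `wrapText` is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// 128x64 panel with a 6x8 pixel character cell (assumed: built-in font, size 1).
constexpr std::size_t kCols = 128 / 6;  // 21 characters per line
constexpr std::size_t kRows = 64 / 8;   // 8 lines per screen

// Hypothetical helper: greedy word wrap into lines of at most `width`
// characters. Words longer than `width` are split across lines.
inline std::vector<std::string> wrapText(const std::string& text,
                                         std::size_t width = kCols) {
    std::vector<std::string> lines;
    std::string line;
    std::size_t i = 0;
    while (i < text.size()) {
        while (i < text.size() && text[i] == ' ') ++i;  // skip spaces
        if (i >= text.size()) break;
        std::size_t end = text.find(' ', i);
        if (end == std::string::npos) end = text.size();
        std::string word = text.substr(i, end - i);
        i = end;
        while (word.size() > width) {  // break up an overlong word
            if (!line.empty()) { lines.push_back(line); line.clear(); }
            lines.push_back(word.substr(0, width));
            word.erase(0, width);
        }
        if (line.empty()) {
            line = word;
        } else if (line.size() + 1 + word.size() <= width) {
            line += " " + word;
        } else {
            lines.push_back(line);
            line = word;
        }
    }
    if (!line.empty()) lines.push_back(line);
    return lines;
}
```

Each returned line can then be printed in turn, and anything beyond `kRows` lines would need scrolling or truncation.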

🧪 Serial Monitor Output

Open the Serial Monitor at 115200 baud to see debug logs:

Connecting to WiFi...
Connected! IP: 192.168.1.42
Recording audio...
Sending to Wit.ai...
Response: "turn on the light"
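
The `Response:` line above is the transcript pulled out of the JSON that Wit.ai returns. ArduinoJson is listed as optional, so the following is a minimal sketch of doing that extraction with a plain string scan instead. It assumes the response body carries the transcript in a `"text"` field and that, if several JSON objects arrive, the last one holds the complete transcript. `extractLastText` is a hypothetical helper and is not a full JSON parser.

```cpp
#include <string>

// Hypothetical helper: returns the value of the last "text" field found in
// the response body, or an empty string if none is present.
inline std::string extractLastText(const std::string& body) {
    const std::string key = "\"text\"";
    std::string::size_type pos = body.rfind(key);
    if (pos == std::string::npos) return "";
    pos += key.size();
    // Skip whitespace and the colon between the key and its value.
    while (pos < body.size() &&
           (body[pos] == ' ' || body[pos] == '\t' || body[pos] == '\r' ||
            body[pos] == '\n' || body[pos] == ':')) {
        ++pos;
    }
    if (pos >= body.size() || body[pos] != '"') return "";
    ++pos;  // step past the opening quote
    std::string out;
    while (pos < body.size() && body[pos] != '"') {
        // Escaped characters are copied without the backslash, so \" becomes "
        // (other escapes such as \n are not decoded).
        if (body[pos] == '\\' && pos + 1 < body.size()) ++pos;
        out += body[pos++];
    }
    return out;
}
```

For anything beyond a single flat field, a real JSON parser such as ArduinoJson is the safer choice.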

🐛 Troubleshooting

| Issue | Possible Fix |
|---|---|
| No audio captured | Check I2S wiring; verify pin definitions in code |
| OLED not displaying | Confirm I2C address (usually 0x3C); check SDA/SCL pins |
| Wi-Fi not connecting | Double-check SSID and password in config |
| Wit.ai returns empty | Speak clearly; check your access token; verify audio format |
| Upload fails | Hold the BOOT button on ESP32 during upload |
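
For the "Wit.ai returns empty" case, both the access token and the audio format travel in the HTTP request headers, so printing them to the Serial Monitor is a quick check. The sketch below builds such a header block as a string. The host `api.wit.ai`, the `/speech` path, the raw 16-bit little-endian PCM content type, and chunked transfer are assumptions based on Wit.ai's public HTTP API, not details confirmed by this repository. `buildSpeechRequestHeaders` is a hypothetical name.

```cpp
#include <string>

// Builds the header portion of an HTTPS POST carrying raw PCM audio.
// Every literal below is an assumption; compare it with what your sketch sends.
inline std::string buildSpeechRequestHeaders(const std::string& token,
                                             unsigned sampleRateHz) {
    std::string h;
    h += "POST /speech HTTP/1.1\r\n";
    h += "Host: api.wit.ai\r\n";
    h += "Authorization: Bearer " + token + "\r\n";
    // The declared format must match the bytes actually sent in the body.
    h += "Content-Type: audio/raw;encoding=signed-integer;bits=16;rate=" +
         std::to_string(sampleRateHz) + ";endian=little\r\n";
    h += "Transfer-Encoding: chunked\r\n";
    h += "\r\n";  // blank line ends the header block
    return h;
}
```

If the declared sample rate or bit depth differs from what the microphone code produces, the service may receive valid-looking audio that it cannot transcribe, which would show up as an empty result.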

📄 License

This project is licensed under the MIT License.


🙌 Acknowledgements


🤝 Contributing

Pull requests are welcome! For major changes, please open an issue first to discuss what you'd like to change.

  1. Fork the repo
  2. Create your feature branch: git checkout -b feature/my-feature
  3. Commit your changes: git commit -m 'Add some feature'
  4. Push to the branch: git push origin feature/my-feature
  5. Open a Pull Request

About

This repo contains the Arduino code that records speech to an ESP32 via an I2S microphone, sends it to wit.ai and displays the text extracted by wit.ai on an OLED display.
