SPEECH-RECOGNITION-SYSTEM

COMPANY: CODETECH IT SOLUTIONS

NAME: MAHALAXMI K

INTERN ID: CT08RQW

DOMAIN: AI

DURATION: 4 WEEKS

MENTOR: NEELA SANTHOSH KUMAR

DESCRIPTION OF THE PROJECT

This project is a Speech-to-Text Converter that turns spoken words into written text using a microphone. It captures live audio from the user, processes it with speech recognition, and displays the transcribed text in a graphical user interface (GUI). Two buttons let the user start and stop recognition. The project uses the Tkinter library for the interface, the SpeechRecognition library for speech-to-text conversion, and threading to keep the GUI responsive while the application listens. The tool converts spoken words to text in real time.
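As a rough illustration of the capture-and-transcribe step, here is a minimal sketch using the SpeechRecognition package. The function names `transcribe_once` and `transcribe_from_microphone`, the timeout values, and the status strings are assumptions made for this example and are not taken from the project's source. Microphone access also assumes PyAudio is installed.

```python
try:
    import speech_recognition as sr  # third-party package: SpeechRecognition
except ImportError:  # the sketch can still be imported and read without it
    sr = None


def transcribe_once(recognizer, source, timeout=5, phrase_time_limit=10):
    """Listen for one phrase on `source` and return a (text, status) pair."""
    try:
        # Wait up to `timeout` seconds for speech to begin, then record
        # at most `phrase_time_limit` seconds of audio.
        audio = recognizer.listen(
            source, timeout=timeout, phrase_time_limit=phrase_time_limit
        )
        # Send the recorded audio to Google's recognizer (needs internet).
        text = recognizer.recognize_google(audio)
    except sr.WaitTimeoutError:
        return "", "No speech detected"
    except sr.UnknownValueError:
        return "", "Could not understand audio"
    except sr.RequestError as exc:
        return "", f"Recognition service error: {exc}"
    return text, "OK"


def transcribe_from_microphone():
    """Open the default microphone and transcribe a single phrase."""
    if sr is None:
        raise RuntimeError("Install SpeechRecognition and PyAudio first")
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample background noise briefly so quiet speech is still detected.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        return transcribe_once(recognizer, source)
```

Calling `transcribe_from_microphone()` returns the recognized text and a short status message for one spoken phrase. Passing the recognizer and source into `transcribe_once` keeps the error handling separate from the hardware setup, which also makes it possible to try the function with stand-in objects.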

The core technology behind this project is Google’s speech recognition service, accessed through the SpeechRecognition Python package. The service takes audio input and returns the recognized speech as text. Once started, the application keeps listening for speech until the user presses the "Stop" button. A flag (listening) controls this loop: recording continues while the flag is set and ends when the user stops it. The Tkinter interface includes buttons to start and stop speech recognition, a text output area that displays the transcribed text, and a status label that reports the system’s current state.
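The flag-controlled listening loop described above might look something like the following sketch. It uses a `threading.Event` in place of a plain boolean `listening` variable, which is a substitution made here for thread safety and not necessarily what the project does. The class name `ListenController` and the `recognize_phrase` and `on_text` callables are hypothetical; `recognize_phrase` stands in for whatever function records and transcribes one phrase.

```python
import threading


class ListenController:
    """Runs a recognition callable in a loop until stop() is called."""

    def __init__(self, recognize_phrase, on_text):
        self._recognize_phrase = recognize_phrase  # returns a str, "" if nothing heard
        self._on_text = on_text                    # called with each non-empty result
        self._listening = threading.Event()        # plays the role of the listening flag
        self._thread = None

    @property
    def is_listening(self):
        return self._listening.is_set()

    def start(self):
        """Begin listening in a background thread (the "Start" button)."""
        if self._listening.is_set():
            return  # already running; ignore repeated presses
        self._listening.set()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def stop(self):
        """Clear the flag and wait briefly for the loop to end (the "Stop" button)."""
        self._listening.clear()
        if self._thread is not None:
            self._thread.join(timeout=2)
            self._thread = None

    def _loop(self):
        # Keep transcribing phrase after phrase while the flag stays set.
        while self._listening.is_set():
            text = self._recognize_phrase()
            if text:
                self._on_text(text)
```

The loop only checks the flag between phrases, so pressing "Stop" takes effect once the current call to `recognize_phrase` returns. Giving that call a timeout keeps the delay short.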

The tech stack used in this project includes Python, Tkinter, SpeechRecognition, and threading. Python is the primary language, chosen for its simplicity and flexibility, which suit rapid prototyping and integration with external libraries. Tkinter is Python’s standard GUI toolkit; it is lightweight and easy to use for building simple desktop applications. SpeechRecognition provides the speech-to-text functionality and, in this project, sends audio to Google’s cloud-based speech recognition API, so an internet connection is required. Threading runs the speech recognition process in a separate thread so that the GUI doesn’t freeze while recording or processing speech.
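One common way to combine Tkinter with a worker thread is to have the worker put results on a `queue.Queue` and let the GUI thread poll that queue with `after`, since Tkinter widgets are best updated only from the thread running the main loop. The sketch below shows that pattern under stated assumptions: the names `drain` and `run_app`, the widget layout, and the 100 ms polling interval are illustrative choices, and `worker` is a hypothetical function that transcribes speech and puts text on the queue while the `listening` event is set.

```python
import queue
import threading


def drain(messages, handle):
    """Pass every queued message to `handle` without blocking; return the count."""
    count = 0
    while True:
        try:
            item = messages.get_nowait()
        except queue.Empty:
            return count
        handle(item)
        count += 1


def run_app(worker):
    """Build the window; `worker(messages, listening)` runs in a background thread."""
    import tkinter as tk  # imported lazily so drain() works where Tk is unavailable

    root = tk.Tk()
    root.title("Speech-to-Text Converter")
    status = tk.Label(root, text="Idle")
    status.pack()
    output = tk.Text(root, height=10, width=50)
    output.pack()

    messages = queue.Queue()
    listening = threading.Event()

    def start():
        if not listening.is_set():
            listening.set()
            status.config(text="Listening...")
            threading.Thread(
                target=worker, args=(messages, listening), daemon=True
            ).start()

    def stop():
        listening.clear()  # the worker is expected to exit when it sees this
        status.config(text="Stopped")

    def poll():
        # Runs on the GUI thread, so touching widgets here is safe.
        drain(messages, lambda text: output.insert(tk.END, text + "\n"))
        root.after(100, poll)  # check the queue again in 100 ms

    tk.Button(root, text="Start", command=start).pack()
    tk.Button(root, text="Stop", command=stop).pack()
    poll()
    root.mainloop()
```

With this arrangement the worker thread never touches a widget directly. It only calls `messages.put(text)`, and the text area is updated the next time `poll` runs.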

Through working on this project, I learned several key skills. First, I gained experience with integrating speech recognition technology into Python applications, using SpeechRecognition to access and process real-time audio input. I also enhanced my understanding of multithreading in Python, which is crucial for maintaining a responsive user interface while performing tasks that take time, such as speech recognition. Additionally, I deepened my knowledge of GUI development with Tkinter, learning how to create functional interfaces with interactive components like buttons, text areas, and labels. The project also taught me how to manage application flow with flags and conditions, such as using the listening variable to control when the recording starts and stops. This project improved my ability to design user-centric applications and handle external APIs and asynchronous tasks, laying a solid foundation for further work in speech processing and user interface design.

OUTPUT OF THE PROJECT
