This project is a guide to building an audio recorder application with real-time signal visualization and direct upload of recordings to Edge Impulse. It uses Python with PyQt5 for the graphical user interface, Matplotlib for live plotting, and PyAudio for audio stream management. Besides real-time audio monitoring, the application supplies labeled audio data for audio classification and model training through Edge Impulse's API.
- Muhammad 'Azmilfadhil S. (2042231003)
- Bagus Wijaksono (2042231029)
- Rivaldi Satrio W. (2042231043)
- Ahmad Radhy (supervisor)
Teknik Instrumentasi - Institut Teknologi Sepuluh Nopember
- Record audio in real-time.
- Display both time-domain and frequency-domain plots of the audio signal.
- Save recorded audio as a WAV file.
- Upload audio files directly to Edge Impulse for training purposes.
Ensure the following are installed on your system:
- Python 3.7 or later
- Pip
- Required Python libraries (listed below)
Install the required libraries using pip:
    pip install numpy matplotlib PyQt5 pyaudio requests
- Create a new directory for the project and navigate into it:
    mkdir audio_recorder
    cd audio_recorder
- Create a Python file (e.g., audio_recorder.py) and copy the provided code into it.
- Replace the self.api_key variable in the code with your Edge Impulse API key.
- Modify the self.label variable to reflect the category label of your audio data.
Execute the application using the command:
    python audio_recorder.py
- Click the Record button to start recording.
- Real-time audio plots (time-domain and frequency-domain) will update as you record.
- Click Stop to end the recording.
- Click Save & Upload to save the recording as a WAV file.
- The audio file is automatically uploaded to Edge Impulse using the configured API key.
- Click Reset to clear the plots and reset the recorder for a new session.
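The save step can be sketched with Python's standard `wave` module. This is a minimal illustration, not the application's actual code: the function name `save_wav`, the `recording_` file prefix, and the defaults of mono, 16-bit samples at 44100 Hz are assumptions chosen for the example.

```python
import os
import wave
from datetime import datetime


def save_wav(frames, directory="audio_files", channels=1, sample_width=2, rate=44100):
    """Write raw PCM chunks to a timestamped WAV file and return its path.

    The defaults (mono, 16-bit, 44100 Hz) are assumptions for this sketch.
    """
    os.makedirs(directory, exist_ok=True)  # create the folder on first use
    # Microsecond resolution makes name collisions between recordings very unlikely.
    name = datetime.now().strftime("recording_%Y%m%d_%H%M%S_%f.wav")
    path = os.path.join(directory, name)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample; 2 means 16-bit PCM
        wav_file.setframerate(rate)
        wav_file.writeframes(b"".join(frames))  # frames: list of bytes chunks
    return path
```

Here `frames` would be the list of byte chunks read from the audio stream while recording; the sample width and rate passed in must match the settings the stream was opened with.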
- Real-Time Audio Plotting:
  - Time-domain: Displays the amplitude of the signal over time.
  - Frequency-domain: Displays the amplitude of frequencies using Discrete Fourier Transform (DFT).
- Saving Audio:
  - The audio is saved as a WAV file in the audio_files directory.
  - Each file is named with a timestamp to ensure uniqueness.
- Uploading to Edge Impulse:
  - Uses the requests library to send POST requests to the Edge Impulse ingestion API.
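As an illustration of the frequency-domain plot, the sketch below computes an amplitude spectrum for one chunk of samples with NumPy's real-input FFT, which is a fast implementation of the DFT. The function name `amplitude_spectrum`, the chunk size, and the sample rate are assumptions for the example; the application may scale or window its data differently.

```python
import numpy as np


def amplitude_spectrum(samples, rate):
    """Return (frequencies in Hz, amplitudes) for a block of real-valued samples."""
    samples = np.asarray(samples, dtype=np.float64)
    n = len(samples)
    # rfft computes the DFT of a real signal and keeps only non-negative frequencies.
    spectrum = np.fft.rfft(samples)
    freqs = np.fft.rfftfreq(n, d=1.0 / rate)
    # Scale so a sine of amplitude A appears as a peak of height about A.
    # (This doubling overstates the DC and Nyquist bins, which is fine for a sketch.)
    amplitudes = np.abs(spectrum) * 2.0 / n
    return freqs, amplitudes


# Example: one 1024-sample chunk of a 1 kHz tone sampled at 16 kHz.
rate = 16000
t = np.arange(1024) / rate
freqs, amps = amplitude_spectrum(0.5 * np.sin(2 * np.pi * 1000 * t), rate)
peak_hz = freqs[np.argmax(amps)]  # frequency of the strongest component
```

The returned arrays can be passed straight to a Matplotlib line plot, with `freqs` on the x-axis and `amps` on the y-axis.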
- PyQt5: For GUI creation.
- Matplotlib: For plotting real-time audio signals.
- PyAudio: For audio stream handling.
- Requests: For sending audio files to Edge Impulse.
.
|-- audio_recorder.py # Main application code
|-- audio_files/ # Directory for saved audio files (created automatically)
- Ensure your microphone is functional and accessible to the application.
- The API key should be kept secure and not shared publicly.
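One way to keep the key out of the source file is to read it from an environment variable at upload time. The sketch below is an assumption-laden example rather than the application's code: the variable name `EDGE_IMPULSE_API_KEY` and both function names are hypothetical, and the endpoint, the `x-api-key` and `x-label` headers, and the `data` form field follow the Edge Impulse ingestion API as commonly documented, so verify them against the current documentation before relying on them.

```python
import os

# Ingestion endpoint for training data (verify against the Edge Impulse docs).
INGESTION_URL = "https://ingestion.edgeimpulse.com/api/training/files"


def build_upload_headers(label, env_var="EDGE_IMPULSE_API_KEY"):
    """Build request headers, taking the API key from the environment.

    The environment variable name is a hypothetical choice for this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"x-api-key": api_key, "x-label": label}


def upload_wav(path, label):
    """POST one WAV file to Edge Impulse and return the HTTP response."""
    import requests  # imported here so the header helper works without it

    with open(path, "rb") as wav_file:
        files = {"data": (os.path.basename(path), wav_file, "audio/wav")}
        response = requests.post(
            INGESTION_URL,
            headers=build_upload_headers(label),
            files=files,
            timeout=30,
        )
    response.raise_for_status()  # surface 4xx/5xx errors instead of failing silently
    return response
```

With this approach the key is set once in the shell (for example `export EDGE_IMPULSE_API_KEY=...` on macOS/Linux) and never appears in the code or in version control.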
- PyAudio Installation Issues: On some platforms, installing PyAudio might require additional setup. Use the following commands:
  - On Windows:
    pip install pipwin
    pipwin install pyaudio
  - On macOS/Linux: Ensure PortAudio is installed (e.g., via Homebrew or apt) before running pip install pyaudio.
- Permission Issues: Run the script with elevated permissions if necessary to access your microphone.
- Add more advanced audio processing features, such as noise reduction or feature extraction.
- Provide options for multiple labels when uploading to Edge Impulse.
- Add support for stereo audio recording.