Adverse medical events often arise from miscommunication or overlooked information during patient-physician interactions. Identifying these events proactively could significantly improve patient safety and healthcare outcomes. However, there is currently no scalable, automated method to analyze spoken medical conversations and flag potential adverse outcomes.
Deployed Link - PulsePredict App
Develop an end-to-end system that:
- Transcribes doctor-patient audio conversations
- Extracts key medical entities
- Predicts the likelihood of an adverse medical event based on the conversation
PulsePredict follows a multi-stage processing pipeline:
- Audio Transcription: Transcribe audio calls using OpenAI's Whisper model to convert speech into text.
- Medical Entity Extraction: Use AWS Comprehend Medical to extract relevant medical entities (symptoms, medications, diagnoses, etc.) from the transcriptions.
- Labeling for Adverse Events: Use a curated FAERS (FDA Adverse Event Reporting System) dataset to identify and label potential adverse medical events based on the extracted entities.
- Feature Engineering: Engineer structured features from the medical concepts for use in model training.
- Adverse Event Prediction: Train a machine learning model on these features to predict the probability of an adverse medical event occurring as a result of the doctor-patient interaction.
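The first two stages can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function names `transcribe_audio`, `extract_entities`, and `parse_entities`, the Whisper model size `"base"`, the AWS region, and the confidence threshold are all assumptions. It relies on the `openai-whisper` and `boto3` packages and on AWS credentials already being configured.

```python
from typing import Any


def transcribe_audio(audio_path: str, model_size: str = "base") -> str:
    """Transcribe one audio file with Whisper and return the plain text."""
    import whisper  # openai-whisper; imported lazily because it is heavy to load

    model = whisper.load_model(model_size)  # model size is an assumed default
    result = model.transcribe(audio_path)
    return result["text"].strip()


def extract_entities(text: str, region: str = "us-east-1") -> dict[str, Any]:
    """Send transcript text to AWS Comprehend Medical and return the raw response."""
    import boto3  # requires configured AWS credentials

    client = boto3.client("comprehendmedical", region_name=region)
    return client.detect_entities_v2(Text=text)


def parse_entities(response: dict[str, Any], min_score: float = 0.5) -> list[dict[str, Any]]:
    """Keep the fields used downstream and drop low-confidence entities."""
    parsed = []
    for entity in response.get("Entities", []):
        if entity.get("Score", 0.0) < min_score:
            continue  # hypothetical threshold; tune for the data at hand
        parsed.append({
            "text": entity["Text"].strip().lower(),
            "category": entity["Category"],
            "type": entity["Type"],
            "score": entity["Score"],
        })
    return parsed
```

Keeping the parsing step separate from the network call makes it possible to test the cleaning logic on saved responses without calling AWS. Long transcripts may need to be split into chunks before extraction, since the service limits the size of each request.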
The following diagram illustrates the complete pipeline for predicting adverse medical events from audio-based medical conversations:
A breakdown of the core scripts used in this project:
- `predict_from_audio.py`: Orchestrates the complete pipeline, from audio input to final prediction using the trained model.
- `train_model.py`: Trains the machine learning model using features extracted from labeled medical conversations.
- `evaluate_model.py`: Evaluates the trained model and benchmarks its performance against a rule-based baseline system.
- `utils/`: A directory containing helper functions for:
  - Loading FAERS adverse event data
  - Labeling entities with known adverse events
  - Parsing and cleaning entities from AWS Comprehend Medical
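A labeling helper of the kind listed above could look like the sketch below. It assumes the FAERS adverse event terms have already been loaded into a plain list of strings, and it uses case-insensitive substring matching in both directions. The names `normalize` and `label_entities` and the `is_adverse` key are illustrative, not taken from the project code.

```python
def normalize(term: str) -> str:
    """Lowercase a term and collapse repeated whitespace."""
    return " ".join(term.lower().split())


def label_entities(entities: list[str], faers_terms: list[str]) -> list[dict]:
    """Flag each entity that partially matches a known adverse event term."""
    known = [normalize(t) for t in faers_terms if t.strip()]
    labeled = []
    for entity in entities:
        cleaned = normalize(entity)
        # Partial match: either string contains the other.
        is_adverse = bool(cleaned) and any(
            cleaned in term or term in cleaned for term in known
        )
        labeled.append({"entity": entity, "is_adverse": is_adverse})
    return labeled


# Hypothetical terms and entities, for illustration only.
faers_terms = ["Nausea", "Severe headache", "Skin rash"]
entities = ["headache", "Ibuprofen", "mild nausea"]
for row in label_entities(entities, faers_terms):
    print(row)
```

Substring matching is forgiving about wording differences, but it can over-match very short terms, so a minimum term length or token-level matching may be worth considering.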
- Transcribe medical calls using Whisper
- Extract medical entities using AWS Comprehend Medical
- Label entities based on FAERS adverse events database
- Engineer features from labeled entities
- Train a classifier to predict if an adverse event occurred
- Run end-to-end predictions on new audio
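The feature engineering and training steps might be sketched like this. The `adverse_event_ratio` feature is named in this README; the other feature names, the input format (a list of labeled entity dicts with `category` and `is_adverse` keys), and the hyperparameters are assumptions. Training uses scikit-learn's `RandomForestClassifier`, matching the tech stack.

```python
FEATURE_ORDER = [
    "entity_count",
    "adverse_count",
    "medication_count",
    "adverse_event_ratio",
]


def build_features(labeled_entities: list[dict]) -> dict[str, float]:
    """Turn one conversation's labeled entities into numeric features."""
    total = len(labeled_entities)
    adverse = sum(1 for e in labeled_entities if e.get("is_adverse"))
    medications = sum(
        1 for e in labeled_entities if e.get("category") == "MEDICATION"
    )
    return {
        "entity_count": float(total),
        "adverse_count": float(adverse),
        "medication_count": float(medications),
        # Share of entities flagged as adverse; 0.0 for an empty conversation.
        "adverse_event_ratio": adverse / total if total else 0.0,
    }


def train_classifier(conversations: list[list[dict]], labels: list[int]):
    """Fit a Random Forest on one feature row per conversation."""
    from sklearn.ensemble import RandomForestClassifier  # requires scikit-learn

    X = [[build_features(c)[name] for name in FEATURE_ORDER] for c in conversations]
    model = RandomForestClassifier(n_estimators=100, random_state=42)  # assumed settings
    model.fit(X, labels)
    return model
```

With labels encoded as 0 and 1, `model.predict_proba(rows)[:, 1]` returns the predicted probability of an adverse event for each feature row.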
Video Link - PulsePredict Video
| Layer | Technology |
|---|---|
| Transcription | OpenAI Whisper |
| NLP | AWS Comprehend Medical |
| ML Model | Scikit-learn (Random Forest) |
| Backend | Python |
| Front End | Streamlit |
| Deployment | Streamlit |
The project includes two types of testing:
- UI Automation Testing
  - Tests built with Playwright and Pytest
  - Validates UI elements: titles, input fields, and buttons
  - Runs in a headless browser (Chromium)
  - Ensures UI consistency and responsiveness
- Manual Testing
  - Covers the backend pipeline from audio input to prediction
  - Tests transcription, entity extraction, and adverse event detection
  - Includes edge case handling (e.g. missing or corrupted files)
  - Helps verify the functional correctness of each module
Test Documents
- Challenge: Batch processing of .mp3 files with Whisper.
- Issues: Manual transcription, CPU performance warnings.
- Solution: Developed `batch_transcribe.py` and `predict_from_audio.py` for automated transcription.
- Challenge: Structured medical data extraction via AWS Comprehend Medical.
- Issues: Missing AWS credentials.
- Solution: Configured the AWS CLI and used `batch_entity_extraction.py`.
- Challenge: Noisy transcripts reduced NLP performance.
- Solution: Built `batch_preprocess_transcripts.py` to clean transcripts using a filler word list.
- Challenge: Matching entities with FAERS data.
- Issues: Complex CSV format, exact matching.
- Solution: Cleaned FAERS data and used partial/lowercase matching in `label_entities.py`.
- Challenge: Poor model performance due to weak features.
- Solution: Added meaningful features (e.g., `adverse_event_ratio`) and rebuilt `feature_engineering.py`.
- Challenge: Model overfitting and poor generalization.
- Solution: Balanced the dataset with negative (false) samples and evaluated with both the model and a rule-based method.
- Challenge: Dataset bias towards positive samples.
- Solution: Added negative samples and improved feature diversity for better model performance.
- Challenge: Interface bugs and missing dependencies.
- Solution: Installed necessary libraries and finalized `medical_streamlit_app_updated.py`.
- Challenge: Uploaded unnecessary files, missing `.gitignore`.
- Solution: Added `.gitignore`, removed unused scripts, and updated project documentation.
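The transcript cleaning step described in these challenges (removing filler words before entity extraction) can be illustrated with a short sketch. The filler list and the function name `clean_transcript` are hypothetical; the project's actual list and implementation may differ.

```python
import re
from typing import Sequence

# Hypothetical filler word list, for illustration only.
FILLER_WORDS = ("um", "uh", "you know", "i mean")


def clean_transcript(text: str, fillers: Sequence[str] = FILLER_WORDS) -> str:
    """Remove filler words and tidy the leftover spacing and punctuation."""
    if not fillers:
        return " ".join(text.split())
    # Longest first so multi-word fillers are tried before shorter ones.
    ordered = sorted(fillers, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(f) for f in ordered) + r")\b,?"
    cleaned = re.sub(pattern, " ", text, flags=re.IGNORECASE)
    # Remove any space left in front of punctuation, then collapse whitespace.
    cleaned = re.sub(r"\s+([,.?!])", r"\1", cleaned)
    return " ".join(cleaned.split())


print(clean_transcript("Um, I have, uh, a headache you know."))
# prints: I have, a headache.
```

Word boundaries in the pattern keep ordinary words that merely contain a filler (such as "umbrella") intact.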
- Incorporate time-aware features such as event sequences and timestamps.
- Use larger Whisper models to improve transcription accuracy.
- Fine-tune domain-specific NLP models like BioBERT for better entity extraction.
- Expand the FAERS dataset to cover more entity types and adverse events.












