Fix memory leaks and possible use-after-free #2024

Open
uvlad7 wants to merge 5 commits into alphacep:master from uvlad7:python_memory_leak

uvlad7 (Contributor) commented Feb 22, 2026

  • Free string returned by vosk_text_processor_itn - the C function transfers ownership of the returned string, but the Python binding was not freeing it, causing a leak on every Processor.process() call. void free(void*) is now exposed via cffi and called after copying the string.

  • Fix set_spk_model leak - calling vosk_recognizer_set_spk_model twice on the same recognizer, or once on a recognizer created with an SpkModel, leaked both the old speaker model reference and the old spk_feature_ object. The old model is now unref'd and the old feature object is deleted before reassigning.

  • Reference counting for BatchModel - BatchModel now tracks how many BatchRecognizer instances hold a reference to it. Each BatchRecognizer increments the count on construction and decrements it on destruction; vosk_batch_model_free also decrements the count instead of deleting the object directly. The object is deleted only when the count reaches zero. The implementation uses the same mechanism as the regular Model and ensures that differing object lifetimes in the Python/Ruby bindings don't cause a use-after-free. This can be a breaking change, but it simplifies the bindings; otherwise every binding implementation would have to ensure that each BatchRecognizer keeps a reference to its BatchModel.

  • Safe BatchRecognizer teardown - the destructor now ensures that any in-progress stream is finished and all pending chunks are drained before the recognizer is destroyed and before it releases its reference to the model. This is intended to address the heap-use-after-free crashes reported in issue #1189, "Crash with Node.js acceptWaveformAsync (heap-use-after-free)", where a recognizer was freed while an async waveform call was still running. Note: this commit was generated with AI assistance. The general approach looks reasonable to me, but I have not fully reviewed or tested this code, and I'm not certain it fully resolves the race condition described in #1189. Extra scrutiny on BatchRecognizer::~BatchRecognizer() and the finished_ logic would be appreciated.
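
For reviewers, the copy-then-free pattern behind the vosk_text_processor_itn fix can be sketched as follows. This is a minimal illustration, not the binding's code: it uses ctypes from the standard library instead of cffi, and libc's strdup stands in for vosk_text_processor_itn as "a C function that returns a malloc'd string the caller must free". The process helper is hypothetical. With cffi, the equivalent steps would presumably be ffi.string(ptr) followed by a call to the exposed free.

```python
import ctypes
import ctypes.util

# Load the C runtime (POSIX). find_library may return None, in which case
# CDLL(None) exposes the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# strdup is a stand-in for vosk_text_processor_itn: it returns a malloc'd
# string owned by the caller. The restype is c_void_p so the raw pointer is
# kept; a c_char_p restype would be converted to bytes and the pointer lost.
libc.strdup.argtypes = [ctypes.c_char_p]
libc.strdup.restype = ctypes.c_void_p
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


def process(text: str) -> str:
    """Hypothetical wrapper: call the C function, copy the result, free it."""
    ptr = libc.strdup(text.encode("utf-8"))
    if not ptr:
        raise MemoryError("C function returned NULL")
    try:
        # Copy the C string into a Python object first...
        return ctypes.string_at(ptr).decode("utf-8")
    finally:
        # ...then release the C buffer exactly once.
        libc.free(ptr)
```

Calling process("twenty one") returns a Python string that no longer depends on the C buffer, so freeing the buffer in the finally block is safe.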
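
The set_spk_model fix can be modelled with a small Python sketch. The real change is in the C++ recognizer; the classes below are toy stand-ins, and the starting count of 1 and the ref-before-unref ordering are assumptions made for the illustration.

```python
class SpkModel:
    """Toy stand-in for the C++ speaker model with Ref()/Unref()."""

    def __init__(self):
        self.ref_count = 1  # assumption: the creator holds the first reference

    def ref(self):
        self.ref_count += 1

    def unref(self):
        self.ref_count -= 1


class Recognizer:
    """Toy recognizer that owns one reference to its speaker model."""

    def __init__(self, spk_model=None):
        self.spk_model = None
        self.spk_feature = None
        if spk_model is not None:
            self.set_spk_model(spk_model)

    def set_spk_model(self, spk_model):
        # Take the new reference first so passing the same model twice is safe.
        spk_model.ref()
        if self.spk_model is not None:
            # The step that was missing: release the previously held model.
            self.spk_model.unref()
        # Rebinding drops the old feature object (`delete spk_feature_` in C++).
        self.spk_feature = object()
        self.spk_model = spk_model
```

In this model, replacing the speaker model leaves the old one with only its creator's reference, and setting the same model twice does not inflate its count.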
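
The BatchModel reference counting can be sketched the same way. Again, this is a Python model of the counting logic only, not the C++ implementation: the freed flag marks the point where the C++ object would be deleted, batch_model_free is a hypothetical stand-in for vosk_batch_model_free, and the initial count of 1 is an assumption.

```python
class BatchModel:
    """Toy stand-in for the C++ BatchModel."""

    def __init__(self):
        self.ref_count = 1  # assumption: the creator holds the first reference
        self.freed = False  # True marks where C++ would `delete` the object

    def ref(self):
        assert not self.freed, "use-after-free"
        self.ref_count += 1

    def unref(self):
        assert not self.freed, "double free"
        self.ref_count -= 1
        if self.ref_count == 0:
            self.freed = True


class BatchRecognizer:
    """Each recognizer holds its own reference to the model."""

    def __init__(self, model):
        model.ref()
        self.model = model

    def free(self):
        self.model.unref()


def batch_model_free(model):
    # Models the changed behaviour: drop the creator's reference
    # instead of deleting the model outright.
    model.unref()
```

With this scheme, the order in which a binding's garbage collector releases the model and its recognizers no longer matters: the model survives until the last holder lets go. Callers that relied on the model being destroyed immediately by the free call would see different behaviour.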
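
The teardown ordering (finish the stream, drain pending chunks, then release the model) can be illustrated with a toy Python model. It is only a sketch of the ordering: the queue and worker thread are assumptions standing in for the real pipeline, it does not reproduce the Node.js async call from #1189, and it makes no claim about whether the actual destructor is race-free.

```python
import queue
import threading


class Model:
    """Toy reference-counted model; `freed` marks where C++ would delete it."""

    def __init__(self):
        self.ref_count = 1
        self.freed = False

    def ref(self):
        self.ref_count += 1

    def unref(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self.freed = True


class Recognizer:
    """Toy recognizer with a background worker consuming queued chunks."""

    def __init__(self, model):
        model.ref()
        self.model = model
        self.pending = queue.Queue()
        self.results = []
        self.worker = threading.Thread(target=self._run)
        self.worker.start()

    def accept_waveform(self, chunk):
        self.pending.put(chunk)

    def _run(self):
        while True:
            chunk = self.pending.get()
            if chunk is None:  # sentinel: the stream is finished
                return
            # In the real code the worker would still need the model here.
            self.results.append(len(chunk))

    def close(self):
        self.pending.put(None)  # 1. mark the stream as finished
        self.worker.join()      # 2. wait until queued chunks are drained
        self.model.unref()      # 3. only then release the model
```

Because the sentinel is queued after every chunk and close waits for the worker, no chunk is processed after the model reference has been dropped, even if the creator released its own reference earlier.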

uvlad7 changed the title from "Fix memory leak: free str returned from vosk_text_processor_itn" to "Fix memory leaks and possible use-after-free" Feb 22, 2026