Fix memory leaks and possible use-after-free #2024

Open

uvlad7 wants to merge 5 commits into alphacep:master from
Conversation
- **Free string returned by `vosk_text_processor_itn`** - the C function transfers ownership of the returned string, but the Python binding was not freeing it, causing a leak on every `Processor.process()` call. `void free(void *)` is now exposed via cffi and called after copying the string.
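The ownership contract can be sketched as follows. This is a minimal illustration, not the real binding code: `itn_stub` is a hypothetical stand-in for `vosk_text_processor_itn`, and the actual fix does the equivalent copy-then-free on the Python side via cffi.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for vosk_text_processor_itn: returns a malloc'd
// C string whose ownership transfers to the caller.
static char *itn_stub(const char *input) {
    char *out = static_cast<char *>(std::malloc(std::strlen(input) + 1));
    std::strcpy(out, input);
    return out;
}

// The pattern the fixed binding follows: copy the result into a managed
// string first, then free the C allocation so nothing leaks per call.
static std::string process(const char *input) {
    char *c_str = itn_stub(input);
    std::string result(c_str);  // copy before releasing the C buffer
    std::free(c_str);           // ownership was transferred to us, so we free
    return result;
}
```

Without the `free`, every call would leak one allocation of the result's length, which is exactly the per-`process()` leak described above.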
- **Fix `set_spk_model` leak** - calling `vosk_recognizer_set_spk_model` twice on the same recognizer - or once on a recognizer created with `SpkModel` - was leaking both the old speaker model reference and the old `spk_feature_` object. The old model is now unref'd and the old feature object is deleted before reassigning.
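A minimal sketch of that fix, using simplified stand-in types and hypothetical member names (`spk_model_`, `spk_feature_`) modeled on the description above, not the actual Vosk classes:

```cpp
#include <cassert>

// Simplified stand-ins; the real types live in the Vosk core.
struct SpkModel {
    int ref_count = 1;  // the creator holds the initial reference
    void Ref()   { ++ref_count; }
    void Unref() { if (--ref_count == 0) delete this; }
};
struct SpkFeature {};   // placeholder for the speaker feature pipeline

struct Recognizer {
    SpkModel   *spk_model_   = nullptr;
    SpkFeature *spk_feature_ = nullptr;

    void SetSpkModel(SpkModel *model) {
        model->Ref();                          // take the new ref first
        if (spk_model_) spk_model_->Unref();   // previously leaked
        delete spk_feature_;                   // previously leaked
        spk_model_   = model;
        spk_feature_ = new SpkFeature();
    }

    ~Recognizer() {
        if (spk_model_) spk_model_->Unref();
        delete spk_feature_;
    }
};
```

Taking the new reference before dropping the old one also makes a redundant `SetSpkModel` call with the same model safe.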
- **Reference counting for `BatchModel`** - `BatchModel` now tracks how many `BatchRecognizer` instances hold a reference to it. Each `BatchRecognizer` increments the count on construction and decrements it on destruction; `vosk_batch_model_free` also decrements instead of deleting directly. The object is only deleted when the count reaches zero. The implementation uses the same mechanism as the regular `Model` and ensures that differing object lifetimes in the Python/Ruby bindings don't cause a use-after-free. This can be a breaking change, but it simplifies the bindings; otherwise every binding would need to ensure that `BatchRecognizer` keeps a reference to its `BatchModel`.
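The new lifetime rule can be illustrated like this; the function name mirrors the C API, but the types are simplified stand-ins (with a `live_batch_models` counter added purely as a test hook), not the real implementation:

```cpp
#include <cassert>

static int live_batch_models = 0;  // test hook: counts live model objects

struct BatchModel {
    int ref_count = 1;             // the creator holds the initial reference
    BatchModel()  { ++live_batch_models; }
    ~BatchModel() { --live_batch_models; }
    void Ref()    { ++ref_count; }
    void Unref()  { if (--ref_count == 0) delete this; }
};

// vosk_batch_model_free now drops one reference instead of deleting outright.
static void vosk_batch_model_free(BatchModel *model) { model->Unref(); }

struct BatchRecognizer {
    BatchModel *model_;
    explicit BatchRecognizer(BatchModel *model) : model_(model) {
        model_->Ref();             // keep the model alive while we exist
    }
    ~BatchRecognizer() {
        model_->Unref();           // may delete the model if we were last
    }
};
```

The key behavioral change: a binding may free the model while a recognizer still exists, and the model survives until the last holder releases it.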
- **Safe `BatchRecognizer` teardown** - the destructor now ensures any in-progress stream is finished and all pending chunks are drained before the recognizer is destroyed, and before it releases its reference to the model. This is intended to address the heap-use-after-free crashes reported in Crash with Node.js acceptWaveformAsync (heap-use-after-free) #1189, where a recognizer was freed while an async waveform call was still running. Note: this commit was generated with AI assistance. The general approach looks reasonable to me, but I have not fully reviewed or tested this code, and I'm not certain it fully resolves the race condition described in #1189. Extra scrutiny on `BatchRecognizer::~BatchRecognizer()` and the `finished_` logic would be appreciated.
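A toy model of the hazard and the drain-before-teardown pattern; this is not the Vosk implementation (which runs on a GPU batch pipeline), just a self-contained illustration where a worker thread consumes queued chunks and the destructor refuses to complete until the queue is empty:

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class ToyBatchRecognizer {
 public:
    // processed: external counter so callers can observe work completed
    // even after the recognizer is destroyed.
    explicit ToyBatchRecognizer(std::atomic<int> *processed)
        : processed_(processed), worker_(&ToyBatchRecognizer::Run, this) {}

    void AcceptWaveformAsync(std::vector<short> chunk) {
        std::lock_guard<std::mutex> lk(mu_);
        chunks_.push(std::move(chunk));
        cv_.notify_one();
    }

    ~ToyBatchRecognizer() {
        {   // signal shutdown; the worker drains remaining chunks first
            std::lock_guard<std::mutex> lk(mu_);
            finished_ = true;
        }
        cv_.notify_one();
        worker_.join();  // nothing is in flight past this point, so it is
                         // now safe to release shared state (e.g. the model)
    }

 private:
    void Run() {
        std::unique_lock<std::mutex> lk(mu_);
        for (;;) {
            cv_.wait(lk, [this] { return finished_ || !chunks_.empty(); });
            if (chunks_.empty() && finished_) return;  // drained and done
            chunks_.pop();        // "decode" the chunk (elided in this toy)
            ++*processed_;
        }
    }

    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::vector<short>> chunks_;
    bool finished_ = false;
    std::atomic<int> *processed_;
    std::thread worker_;         // declared last so it starts fully initialized
};
```

The bug in #1189 corresponds to a destructor that releases shared state without the `join`: the async worker then touches freed memory. Whether the real `finished_` handshake closes every window of that race is exactly what needs review.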