Fix #587: Remove leading space from txt output (#3921)

Open
JingliangGao wants to merge 1 commit into ggml-org:master from JingliangGao:fix/issue-587-leading-space-in-txt

@JingliangGao

The BPE tokenizer used by Whisper produces tokens with leading spaces, causing each line in the txt output to start with an unwanted space.

This fix strips leading whitespace (spaces and tabs) from each segment when writing to txt output files, improving the readability of the transcription output.

Fixes: #587


Development

Successfully merging this pull request may close these issues.

Each sentence has a leading space when outputting to .txt file
