Can't encode transcription when training on tessdata_best with ara.traineddata #308

@engmustafak26

Description

I followed the steps from YouTube. Training works if the gt.txt file is not Unicode, but it fails with Arabic text:


Can't encode transcription: '‎داليملا' in language ''
Encoding of string failed! Failure bytes: e2 80 8e d9 89 d9 86 d9 8a d8 a8 d9 84 d9 81
Can't encode transcription: '‎ىنيبلف' in language ''
Encoding of string failed! Failure bytes: e2 80 8e d9 85 d8 a7 d8 b9 d9 84 d8 a7
Can't encode transcription: '‎ماعلا' in language ''
Encoding of string failed! Failure bytes: e2 80 8e d8 b3 d9 8a d9 86 d9 88 d9 8a d8 a7 d9 84
Can't encode transcription: '‎سينويال' in language ''
Encoding of string failed! Failure bytes: e2 80 8e d8 af d8 a7 d9 84 d9 8a d9 85 d9 84 d8 a7
Can't encode transcription: '‎داليملا' in language ''
Encoding of string failed! Failure bytes: e2 80 8e d9 89 d9 86 d9 8a d8 a8 d9 84 d9 81
Can't encode transcription: '‎ىنيبلف' in language ''
2 Percent improvement time=70, best error was 71.355 @ 341
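
To see which characters the encoder rejected, the "Failure bytes" from the log can be decoded as UTF-8 and listed by code point. The following is a minimal sketch in Python, not part of the training tooling; the helper name `describe_failure_bytes` is hypothetical, and the hex string is copied from the first failure line above.

```python
import unicodedata

# Failure bytes copied verbatim from the log above (space-separated hex).
failure_hex = "e2 80 8e d9 89 d9 86 d9 8a d8 a8 d9 84 d9 81"


def describe_failure_bytes(hex_string):
    """Decode space-separated hex bytes as UTF-8 and name each code point.

    Hypothetical helper for inspecting the log; it is not a Tesseract API.
    """
    text = bytes.fromhex(hex_string).decode("utf-8")
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]


for code_point, name in describe_failure_bytes(failure_hex):
    print(code_point, name)
```

The first three bytes, `e2 80 8e`, decode to U+200E LEFT-TO-RIGHT MARK, an invisible directional character that precedes the Arabic letters in every failure line shown. Whether stripping it from the gt.txt files resolves the error is an assumption that the log alone does not confirm.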
