Added Hindi percentage ITN class#1
mayuris-00 wants to merge 4 commits into RajanPutty:ITN-KT_Percentage-Class
test_percentage.py
Outdated
This file should not be here. You already have the correct copy at tests/nemo_text_processing/hi/test_percentage.py. This root-level version will not work -- the relative import from ..utils import CACHE_DIR requires the file to be inside the tests/nemo_text_processing/hi/ package. Please delete this file.
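The import failure described above can be reproduced outside the repository. The following is a minimal sketch, not code from this PR: it simulates a module that has no parent package, which is the situation of a root-level test_percentage.py, and runs the same relative import. The helper name try_relative_import_at_top_level is hypothetical.

```python
def try_relative_import_at_top_level() -> str:
    """Run the relative import as if it lived in a module with no parent package."""
    # An empty __package__ mimics a file collected from the repository root
    # rather than from inside tests/nemo_text_processing/hi/ (assumption:
    # this is a fair stand-in for how the root-level copy would be loaded).
    module_globals = {"__name__": "test_percentage", "__package__": ""}
    try:
        exec("from ..utils import CACHE_DIR", module_globals)
    except ImportError as err:
        return f"ImportError: {err}"
    return "import succeeded"


print(try_relative_import_at_top_level())
```

Inside the tests/nemo_text_processing/hi/ package the same statement can resolve, because the two leading dots then have a parent package to climb to.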
nemo_text_processing/inverse_text_normalization/hi/taggers/percentage.py
nemo_text_processing/inverse_text_normalization/hi/verbalizers/percentage.py
nemo_text_processing/inverse_text_normalization/hi/verbalizers/verbalize.py
This cdifflib fallback fix is unrelated to the percentage class. Even if it was needed to get your environment working, it should be a separate commit or a separate PR. Mixing unrelated fixes into a feature PR makes the review harder and the git history messier. Please remove this change from this PR and raise it separately if needed.
Reverted the cdifflib change from this PR. Will raise it separately if needed.
tests/nemo_text_processing/hi/data_inverse_text_normalization/test_cases_percentage.txt
nemo_text_processing/inverse_text_normalization/hi/data/percentage/__init__.py
nemo_text_processing/inverse_text_normalization/hi/data/percentage/percent_symbol.tsv
nemo_text_processing/inverse_text_normalization/hi/taggers/tokenize_and_classify.py
Signed-off-by: mayuris-00 <mayuris@nvidia.com>
Summary
Added a new percentage semiotic class to the Hindi ITN pipeline. The system now converts spoken Hindi percentages to written form, for example बीस प्रतिशत → २०%.
Files Added
Files Modified
Test Results
All 12 percentage test cases passed.
All existing Hindi ITN tests still pass.
Verbose Trace
Input: बीस प्रतिशत
Tagged: tokens { percentage { integer: "२०" percent: "%" } }
Output: २०%
Input: सत्तर परसेंट
Tagged: tokens { percentage { integer: "७०" percent: "%" } }
Output: ७०%
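
The two-stage flow in the trace (tag the spoken words into fields, then verbalize the fields into written form) can be sketched in plain Python. This is only an illustration under stated assumptions: it is not the grammar added in this PR, the lookup tables cover just the words that appear in the trace, and the names CARDINALS, PERCENT_WORDS, tag, render_tagged and verbalize are hypothetical.

```python
# Toy lookup tables limited to the words seen in the trace (assumption:
# the real grammar covers far more number words and percent spellings).
CARDINALS = {"बीस": "२०", "सत्तर": "७०"}
PERCENT_WORDS = {"प्रतिशत": "%", "परसेंट": "%"}


def tag(spoken: str) -> dict:
    """Classify 'number-word percent-word' input into percentage fields."""
    number_word, percent_word = spoken.split()
    return {
        "integer": CARDINALS[number_word],
        "percent": PERCENT_WORDS[percent_word],
    }


def render_tagged(fields: dict) -> str:
    """Format the fields the way the trace prints the tagged tokens."""
    return (
        'tokens { percentage { integer: "%s" percent: "%s" } }'
        % (fields["integer"], fields["percent"])
    )


def verbalize(fields: dict) -> str:
    """Join the tagged fields into the written form."""
    return fields["integer"] + fields["percent"]


for spoken in ("बीस प्रतिशत", "सत्तर परसेंट"):
    fields = tag(spoken)
    print("Input:", spoken)
    print("Tagged:", render_tagged(fields))
    print("Output:", verbalize(fields))
```

Both percent spellings map to the same symbol at tagging time, so the verbalizer only ever has to concatenate two fields.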