
Commit 345f585

Es tn romans fix (#98)
* fix es tn roman exceptions
* update jenkinsfile
* update eval script for ITN
* codeql fix

Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
1 parent 45c04c4 commit 345f585

4 files changed

Lines changed: 13 additions & 3 deletions


Jenkinsfile

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ pipeline {
     AR_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/06-08-23-0'
     DE_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/06-08-23-0'
     EN_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/06-14-23-0'
-    ES_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/06-08-23-0'
+    ES_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/08-29-23-0'
     ES_EN_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/06-13-23-1'
     FR_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/08-16-23-1'
     HU_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/06-08-23-0'

nemo_text_processing/inverse_text_normalization/run_evaluate.py

Lines changed: 1 addition & 1 deletion
@@ -55,7 +55,7 @@ def parse_args():
     if args.lang == 'en':
         from nemo_text_processing.inverse_text_normalization.en.clean_eval_data import filter_loaded_data
     file_path = args.input
-    inverse_normalizer = InverseNormalizer()
+    inverse_normalizer = InverseNormalizer(lang=args.lang)

     print("Loading training data: " + file_path)
     training_data = load_files([file_path])

nemo_text_processing/text_normalization/es/data/ordinals/roman_exceptions.tsv

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+di
+Di
+DI
+mi
+Mi
+MI
+vi
+Vi
+VI

nemo_text_processing/text_normalization/es/taggers/ordinal.py

Lines changed: 2 additions & 1 deletion
@@ -131,7 +131,8 @@ def __init__(self, cardinal: GraphFst, deterministic: bool = True):

         # Managing Romanization, excluding words that may be ambiguous
         roman_ordinals = roman_to_int(ordinal_graph)
-        exceptions = pynini.accep("vi") | pynini.accep("di") | pynini.accep("mi")
+        # exceptions = pynini.accep("vi") | pynini.accep("di") | pynini.accep("mi")
+        exceptions = pynini.string_file(get_abs_path("data/ordinals/roman_exceptions.tsv"))
         graph_exception = pynini.project(exceptions, 'input')
         roman_ordinals = (pynini.project(roman_ordinals, "input") - graph_exception.arcsort()) @ roman_ordinals
