Commit 5c26e5d
Add Vietnamese TN support for Money and Range semiotic classes (#304)
* Add Vietnamese TN support for Money and Range semiotic classes
- Add money.py tagger and verbalizer for Vietnamese currency handling
- Add range.py tagger for numerical range processing
- Add supporting data files for money (currency, currency_minor, per_unit)
- Add quantity abbreviations and time units data
- Update existing taggers and verbalizers for integration
- Add comprehensive test cases for money and range functionality
- Update tokenize_and_classify to include new semiotic classes
Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* modify illogical test cases
Signed-off-by: folivoramanh <palasek182@gmail.com>
* refactor and simplify word and punctuation handling to avoid hardcoding
Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* refactor money and range code
Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: folivoramanh <palasek182@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 39704ac commit 5c26e5d
File tree
24 files changed: +892 -117 lines changed
- nemo_text_processing/text_normalization/vi
  - data
    - money
    - numbers
    - time
  - taggers
  - verbalizers
- tests/nemo_text_processing/vi
  - data_text_normalization
Lines changed: 13 additions & 0 deletions
Lines changed: 51 additions & 0 deletions
Lines changed: 7 additions & 0 deletions
Lines changed: 45 additions & 0 deletions
Lines changed: 7 additions & 0 deletions
Lines changed: 4 additions & 0 deletions
Lines changed: 2 additions & 0 deletions
Lines changed: 86 additions & 36 deletions