| 1 | +# itax-code — LLMs Codebase Reference |
| 2 | + |
| 3 | +**Ruby gem** (v2.0.7) for encoding, decoding, and validating the Italian Tax Code (Codice Fiscale). |
| 4 | +Repo: https://github.com/matteoredz/itax-code | Ruby >= 2.5.0 | License: MIT |
| 5 | +Key feature: full **Omocodia** support, with 128 valid encodings per person (2^7 combinations of the 7 substitutable digits, counting the unsubstituted code). |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## Tax Code Structure (16 characters) |
| 10 | + |
| 11 | +``` |
| 12 | + R  S  S  M  R  A  8  0  A  0  1  F  2  0  5  X |
| 13 | +[0][1][2][3][4][5][6][7][8][9][A][B][C][D][E][F] |
| 14 | +|SURNAME||-NAME--||-YY-||M||-DD-||--PLACE---||C| |
| 15 | +``` |
| 16 | + |
| 17 | +| Segment | Positions | Encoding | |
| 18 | +|---------|-----------|----------| |
| 19 | +| Surname | 0–2 | Consonants first, then vowels, pad with `X` | |
| 20 | +| Name | 3–5 | Same as surname, **but** if the name has more than 3 consonants, use the 1st, 3rd, and 4th (skip the 2nd) |
| 21 | +| Year | 6–7 | Last 2 digits of birth year | |
| 22 | +| Month | 8 | Letter: A=Jan B=Feb C=Mar D=Apr E=May H=Jun L=Jul M=Aug P=Sep R=Oct S=Nov T=Dec | |
| 23 | +| Day | 9–10 | DD for males; DD+40 for females (e.g., born 5th → `45`) | |
| 24 | +| Birthplace | 11–14 | 4-char Belfiore code (Italian cities) or Z-code (foreign countries) | |
| 25 | +| CIN | 15 | Checksum character (weighted odd/even table lookup) | |
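The surname and name rules in the table above can be sketched in plain Ruby. This is a minimal illustration, not the gem's `Encoder`: the helper names `split_letters`, `encode_surname`, and `encode_name` are hypothetical, and the input is assumed to be already transliterated to ASCII (any non A–Z character is simply dropped here).

```ruby
# Hypothetical helpers illustrating the surname/name triplet rules.
# Assumption: input is already ASCII; other characters are discarded.
VOWELS = "AEIOU".freeze

# Returns [consonants, vowels], each in order of appearance.
def split_letters(value)
  letters = value.upcase.gsub(/[^A-Z]/, "").chars
  letters.partition { |char| !VOWELS.include?(char) }
end

# Consonants first, then vowels, padded with "X" up to 3 characters.
def encode_surname(surname)
  consonants, vowels = split_letters(surname)
  (consonants + vowels + %w[X X X]).first(3).join
end

# Same as the surname, but with more than 3 consonants the 2nd is skipped.
def encode_name(name)
  consonants, vowels = split_letters(name)
  consonants.delete_at(1) if consonants.size > 3
  (consonants + vowels + %w[X X X]).first(3).join
end

puts encode_surname("Rossi")    # RSS
puts encode_name("Mario")       # MRA
puts encode_name("Gianfranco")  # GFR
```

The padding case shows up with very short values: a two-letter surname such as "Fo" yields `FOX`.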
| 26 | + |
| 27 | +**Omocodia positions** (indices): `[6, 7, 9, 10, 12, 13, 14]` |
| 28 | +Digit→letter substitution: `0→L 1→M 2→N 3→P 4→Q 5→R 6→S 7→T 8→U 9→V` |
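To make the count of 128 concrete, the sketch below enumerates every subset of the seven omocodia positions and applies the digit-to-letter table to each. It is an assumption-laden illustration rather than the gem's `Omocode` class: `omocode_bodies` is a hypothetical name, the input is assumed to be a code with no substitutions applied yet, and only the first 15 characters are produced because each variant needs its own CIN.

```ruby
# Hypothetical sketch: enumerate the 2**7 = 128 omocodia variants of a code.
# Assumption: the input has plain digits at all seven omocodia positions.
OMOCODIA_POSITIONS = [6, 7, 9, 10, 12, 13, 14].freeze
OMOCODIA_LETTERS = {
  "0" => "L", "1" => "M", "2" => "N", "3" => "P", "4" => "Q",
  "5" => "R", "6" => "S", "7" => "T", "8" => "U", "9" => "V"
}.freeze

# Returns the 15-character bodies (without CIN) of every variant.
def omocode_bodies(code)
  base = code[0, 15]
  (0...(1 << OMOCODIA_POSITIONS.size)).map do |mask|
    chars = base.chars
    OMOCODIA_POSITIONS.each_with_index do |position, bit|
      # Bit `bit` of the mask decides whether this position is substituted.
      chars[position] = OMOCODIA_LETTERS.fetch(chars[position]) if mask[bit] == 1
    end
    chars.join
  end
end

bodies = omocode_bodies("RSSMRA80A01F205X")
puts bodies.size   # 128
puts bodies.first  # RSSMRA80A01F205
puts bodies.last   # RSSMRAULALMFNLR
```

The first entry (mask 0) is the unsubstituted body, so the list contains the original plus 127 alternatives.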
| 29 | + |
| 30 | +--- |
| 31 | + |
| 32 | +## Architecture |
| 33 | + |
| 34 | +``` |
| 35 | +ItaxCode (public API module) |
| 36 | +├── .encode(data) → Encoder → String (16-char tax code) |
| 37 | +├── .decode(code) → Parser → Hash |
| 38 | +└── .valid?(code) → Parser → Boolean (true / false, never raises) |
| 39 | + |
| 40 | +Supporting classes: |
| 41 | +├── Omocode → generates all 128 valid omocode variants |
| 42 | +├── Utils → CIN calculation, CSV loaders, consonant/vowel extraction, month table |
| 43 | +├── Transliterator → 200+ accented chars → ASCII (Ò→O, etc.) |
| 44 | +└── Error → custom exception hierarchy |
| 45 | +``` |
| 46 | + |
| 47 | +### Key files |
| 48 | + |
| 49 | +| Path | Purpose | |
| 50 | +|------|---------| |
| 51 | +| `lib/itax_code.rb` | Public API entry point | |
| 52 | +| `lib/itax_code/encoder.rb` | Tax code generation | |
| 53 | +| `lib/itax_code/parser.rb` | Decoding + CIN validation | |
| 54 | +| `lib/itax_code/omocode.rb` | 128-variant omocode generation | |
| 55 | +| `lib/itax_code/utils.rb` | CIN algorithm, CSV loaders, helpers | |
| 56 | +| `lib/itax_code/transliterator.rb` | Unicode→ASCII character map | |
| 57 | +| `lib/itax_code/error.rb` | Exception hierarchy | |
| 58 | +| `lib/itax_code/data/cities.csv` | ~8,000 Italian municipalities (code, province, name, created_on, deleted_on) | |
| 59 | +| `lib/itax_code/data/countries.csv` | 276 foreign countries (code, province, name) | |
| 60 | + |
| 61 | +--- |
| 62 | + |
| 63 | +## Public API |
| 64 | + |
| 65 | +```ruby |
| 66 | +# Encode — generate a tax code from personal data |
| 67 | +ItaxCode.encode( |
| 68 | + surname: "Rossi", # String, required |
| 69 | + name: "Mario", # String, required |
| 70 | + gender: "M", # "M" or "F", required |
| 71 | + birthdate: "1980-01-01", # String, Date, or Time, required; a String is parsed via Date.parse |
| 72 | + birthplace: "Milano" # String, required; city name OR 4-char Belfiore code |
| 73 | +) #=> "RSSMRA80A01F205X" |
| 74 | + |
| 75 | +# Decode — parse a tax code into components |
| 76 | +ItaxCode.decode(tax_code) #=> Hash: |
| 77 | +# { |
| 78 | +# code: String, # the input code, upcased |
| 79 | +# gender: "M" | "F", |
| 80 | +# birthdate: String, # "YYYY-MM-DD" |
| 81 | +# birthplace: { # nil if not found in either CSV |
| 82 | +# code: String, # e.g. "F205" |
| 83 | +# province: String, # e.g. "MI" |
| 84 | +# name: String, # e.g. "MILANO" |
| 85 | +# created_on: String, # ISO date (cities only) |
| 86 | +# deleted_on: String # ISO date (cities only, if deleted) |
| 87 | +# }, |
| 88 | +# omocodes: Array<String>, # all 128 valid omocode variants |
| 89 | +# raw: { |
| 90 | +# surname: String, # chars 0-2 |
| 91 | +# name: String, # chars 3-5 |
| 92 | +# birthdate: String, # chars 6-10 |
| 93 | +# birthdate_year: String, # chars 6-7 |
| 94 | +# birthdate_month: String, # char 8 |
| 95 | +# birthdate_day: String, # chars 9-10 |
| 96 | +# birthplace: String, # chars 11-14 |
| 97 | +# cin: String # char 15 |
| 98 | +# } |
| 99 | +# } |
| 100 | + |
| 101 | +# Validate — returns true/false, never raises |
| 102 | +ItaxCode.valid?(tax_code) #=> Boolean |
| 103 | +``` |
| 104 | + |
| 105 | +--- |
| 106 | + |
| 107 | +## Error Hierarchy |
| 108 | + |
| 109 | +``` |
| 110 | +ItaxCode::Error (< StandardError) |
| 111 | +├── ItaxCode::Encoder::Error |
| 112 | +│ ├── MissingDataError — a required encode field is nil/blank |
| 113 | +│ └── InvalidBirthdateError — birthdate string cannot be parsed by Date.parse |
| 114 | +└── ItaxCode::Parser::Error |
| 115 | + ├── NoTaxCodeError — input is nil or blank |
| 116 | + ├── InvalidTaxCodeError — not 16 chars |
| 117 | + ├── InvalidControlInternalNumberError — CIN checksum mismatch |
| 118 | + └── DateTaxCodeError — decoded date is impossible (e.g. Feb 30) |
| 119 | +``` |
| 120 | + |
| 121 | +`valid?` rescues `Parser::Error` (all parser errors) and returns `false`. |
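The rescue pattern described above can be shown with a stand-in hierarchy. Everything in this sketch is hypothetical: the `TaxCodeSketch` module, its flattened class names, and the `parse` method only mimic the shape of the hierarchy, and `parse` checks nothing beyond blankness and length.

```ruby
# Hypothetical stand-in for the error hierarchy; not the gem's classes.
module TaxCodeSketch
  class Error < StandardError; end
  class ParserError < Error; end
  class NoTaxCodeError < ParserError; end
  class InvalidTaxCodeError < ParserError; end

  # Raises a ParserError subclass on bad input, returns the upcased code otherwise.
  def self.parse(code)
    raise NoTaxCodeError, "tax code is blank" if code.nil? || code.strip.empty?
    raise InvalidTaxCodeError, "expected 16 characters" unless code.strip.length == 16

    code.strip.upcase
  end

  # Rescuing the common parent covers every parser error with one clause.
  def self.valid?(code)
    parse(code)
    true
  rescue ParserError
    false
  end
end

puts TaxCodeSketch.valid?(nil)                 # false
puts TaxCodeSketch.valid?("ABC")               # false
puts TaxCodeSketch.valid?("rssmra80a01f205x")  # true
```

Rescuing the shared parent class rather than listing each subclass means a newly added parser error is handled without touching `valid?`.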
| 122 | + |
| 123 | +--- |
| 124 | + |
| 125 | +## Key Implementation Details & Edge Cases |
| 126 | + |
| 127 | +| Topic | Detail | |
| 128 | +|-------|--------| |
| 129 | +| **Female day encoding** | Day stored as `day + 40`; decoded as `val > 40 ? val - 40 : val` | |
| 130 | +| **Ambiguous birth year** | 2-digit year prefixed with current century; if result > current year, subtract 100 (e.g. `80` in 2026 → `1980`, not `2080`) | |
| 131 | +| **City validity dates** | cities.csv has `created_on`/`deleted_on`; parser filters by whether the decoded birthdate falls within that range | |
| 132 | +| **Foreign birthplaces** | Belfiore codes starting with `Z` (Z100+); encoder/parser fall back from cities.csv to countries.csv | |
| 133 | +| **Birthplace lookup** | Encoder detects Belfiore code format via `/^\w{1}\d{3}$/`; otherwise matches by slugged name | |
| 134 | +| **Omocodia decode** | Parser runs `utils.omocodia_decode` on year, day, and birthplace digits before interpreting them | |
| 135 | +| **Transliteration** | Applied before consonant/vowel extraction so accented names (Ò, Ç, À…) are handled correctly | |
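| 136 | +| **CIN algorithm** | Separate odd/even position lookup tables; the looked-up values of the first 15 characters are summed, and the sum mod 26 maps to A–Z (0 → A) |
| 137 | +| **Name consonant rule** | If name has >3 consonants, the 2nd consonant is dropped (e.g. GIANFRANCO → GFR, not GNF); with 3 or fewer consonants the rule does not apply (MARCO → MRC) |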
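The CIN row can be illustrated with a standalone sketch of the standard Codice Fiscale check-character computation. It is not taken from the gem's `Utils`; `cin_for` and `char_index` are hypothetical names, and the odd-position table is the one from the public specification of the code.

```ruby
# Hypothetical sketch of the check character (CIN) computation.
# ODD_VALUES is indexed by a character's value: digits 0-9 map to 0-9,
# letters A-Z map to 0-25 (so "0" and "A" share an entry, and so on).
ODD_VALUES = [
  1, 0, 5, 7, 9, 13, 15, 17, 19, 21, 2, 4, 18,
  20, 11, 3, 6, 8, 12, 14, 16, 10, 22, 25, 24, 23
].freeze

def char_index(char)
  char.match?(/\d/) ? char.to_i : char.ord - "A".ord
end

# body: the first 15 characters of a tax code, upcased.
def cin_for(body)
  raise ArgumentError, "expected 15 characters A-Z/0-9" unless body.match?(/\A[A-Z0-9]{15}\z/)

  total = body.each_char.with_index.sum do |char, index|
    value = char_index(char)
    # A 0-based even index is a 1-based odd position, which uses the odd table.
    index.even? ? ODD_VALUES[value] : value
  end
  ("A".ord + total % 26).chr
end

puts cin_for("RSSMRA80A01F205")  # X
```

For the sample code used throughout this document the values sum to 101, and 101 mod 26 = 23, which maps to `X`.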
| 138 | + |
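The female-day and ambiguous-year rows combine into one decoding step, sketched below. The method name `decode_birthdate` and its `today:` keyword are hypothetical (the keyword only keeps the example deterministic), and the five input characters are assumed to have already been omocodia-decoded back to digits.

```ruby
require "date"

# Month letters in calendar order: A=Jan ... T=Dec.
MONTH_LETTERS = "ABCDEHLMPRST".freeze

# Hypothetical sketch: decode characters 6-10 of a tax code (e.g. "80A01").
# Assumption: year and day are plain digits (omocodia already reversed).
def decode_birthdate(raw, today: Date.today)
  month_index = MONTH_LETTERS.index(raw[2])
  raise ArgumentError, "unknown month letter #{raw[2].inspect}" if month_index.nil?

  day_value = raw[3, 2].to_i
  gender = day_value > 40 ? "F" : "M"
  day = day_value > 40 ? day_value - 40 : day_value

  # Prefix the current century, then step back 100 years if that lands in the future.
  year = (today.year / 100) * 100 + raw[0, 2].to_i
  year -= 100 if year > today.year

  # Date.new raises for impossible dates such as February 30.
  { gender: gender, birthdate: Date.new(year, month_index + 1, day) }
end

reference = Date.new(2026, 1, 1)
p decode_birthdate("80A01", today: reference)
p decode_birthdate("05C45", today: reference)
```

With the reference date fixed in 2026, `80A01` resolves to a male born on 1980-01-01 and `05C45` to a female born on 2005-03-05.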
| 139 | +--- |
| 140 | + |
| 141 | +## Development Commands |
| 142 | + |
| 143 | +```bash |
| 144 | +bundle exec rake # run full test suite (default Rake task) |
| 145 | +bundle exec rubocop # lint all files |
| 146 | +bin/console # interactive Ruby console with gem loaded |
| 147 | +bin/setup # install dependencies (bundle install) |
| 148 | +bin/release # automated release script |
| 149 | +bundle exec rake cities # update cities.csv data (rakelib/cities.rake) |
| 150 | +``` |
| 151 | + |
| 152 | +--- |
| 153 | + |
| 154 | +## Testing Conventions |
| 155 | + |
| 156 | +- **Framework:** Minitest (`test/**/*_test.rb`) |
| 157 | +- **Coverage:** 100% line + branch enforced via SimpleCov (configured in `test/test_helper.rb`) |
| 158 | +- **Mocking:** Mocha |
| 159 | +- **Time control:** `Timecop.freeze` for any date-sensitive test |
| 160 | +- **Test helper macros:** `test/support/test_macro.rb` |
| 161 | +- **CI matrix:** Ruby 2.5 through 3.4 + head (GitHub Actions) |
| 162 | +- **Coverage reporting:** Qlty.sh (configured in CI) |