Commit f44bd34

matteoredz and claude committed
docs: fix decode signature and InvalidTaxCodeError description in AGENTS.md
- ItaxCode.decode takes a positional arg, not a keyword arg
- InvalidTaxCodeError is raised only for length != 16, not regex failure

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent c06a187 commit f44bd34

1 file changed

Lines changed: 162 additions & 0 deletions

AGENTS.md

@@ -0,0 +1,162 @@
# itax-code — LLMs Codebase Reference

**Ruby gem** (v2.0.7) for encoding, decoding, and validating the Italian Tax Code (Codice Fiscale).
Repo: https://github.com/matteoredz/itax-code | Ruby >= 2.5.0 | License: MIT
Key feature: full **Omocodia** support — 128 alternative valid encodings per person.

---

## Tax Code Structure (16 characters)

```
  R   S   S   M   R   A   8   0   A   0   1   F   2   0   5   X
  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
|--SURNAME--|---NAME----|--YEAR-|MON|--DAY--|----PLACE------|CIN|
```

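As a rough illustration of the layout above, the sketch below slices a code into its segments by 0-based position. `SEGMENTS` and `split_segments` are hypothetical names introduced here for illustration; they are not part of the gem, whose parser may organize this differently. The keys loosely mirror the `raw` hash described in the Public API section.

```ruby
# Hypothetical helper (not the gem's code): map each segment to its
# 0-based character range within the 16-character tax code.
SEGMENTS = {
  surname:    0..2,
  name:       3..5,
  year:       6..7,
  month:      8..8,
  day:        9..10,
  birthplace: 11..14,
  cin:        15..15
}.freeze

# Returns a Hash of segment name => substring.
# Raises ArgumentError when the input is not exactly 16 characters long.
def split_segments(code)
  normalized = code.to_s.strip.upcase
  raise ArgumentError, "expected 16 characters" unless normalized.length == 16

  SEGMENTS.transform_values { |range| normalized[range] }
end

segments = split_segments("RSSMRA80A01F205X")
segments[:surname]    # => "RSS"
segments[:birthplace] # => "F205"
segments[:cin]        # => "X"
```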
| Segment | Positions | Encoding |
|---------|-----------|----------|
| Surname | 0–2 | Consonants first, then vowels, pad with `X` |
| Name | 3–5 | Same as surname, **but** if there are more than 3 consonants, skip the 2nd (keep the 1st, 3rd, and 4th) |
| Year | 6–7 | Last 2 digits of birth year |
| Month | 8 | Letter: A=Jan B=Feb C=Mar D=Apr E=May H=Jun L=Jul M=Aug P=Sep R=Oct S=Nov T=Dec |
| Day | 9–10 | DD for males; DD+40 for females (e.g., born 5th → `45`) |
| Birthplace | 11–14 | 4-char Belfiore code (Italian cities) or Z-code (foreign countries) |
| CIN | 15 | Checksum character (weighted odd/even table lookup) |

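The surname and name rules in the table can be sketched in a few lines of plain Ruby. This is a minimal illustration, not the gem's encoder: `consonants_and_vowels`, `encode_surname`, and `encode_name` are hypothetical names, and the input is assumed to be already transliterated to ASCII letters (the gem handles accents through its Transliterator).

```ruby
VOWELS = %w[A E I O U].freeze

# Split an ASCII word into [consonants, vowels], preserving letter order.
# Non-letter characters (spaces, apostrophes) are ignored.
def consonants_and_vowels(word)
  letters = word.upcase.scan(/[A-Z]/)
  letters.partition { |char| !VOWELS.include?(char) }
end

# Surname: consonants first, then vowels, padded with "X" up to 3 characters.
def encode_surname(surname)
  consonants, vowels = consonants_and_vowels(surname)
  (consonants + vowels + %w[X X X]).first(3).join
end

# Name: same as surname, except that with more than 3 consonants
# the 2nd consonant is skipped (1st, 3rd, and 4th are kept).
def encode_name(name)
  consonants, vowels = consonants_and_vowels(name)
  consonants.delete_at(1) if consonants.length > 3
  (consonants + vowels + %w[X X X]).first(3).join
end

encode_surname("Rossi")  # => "RSS"
encode_name("Mario")     # => "MRA"
encode_name("Roberto")   # => "RRT"
encode_surname("Fo")     # => "FOX"
```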
**Omocodia positions** (0-based indices of the 7 digits): `[6, 7, 9, 10, 12, 13, 14]`
Digit→letter substitution: `0→L 1→M 2→N 3→P 4→Q 5→R 6→S 7→T 8→U 9→V`

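With 7 substitutable positions, each either kept as a digit or replaced by its letter, there are 2^7 = 128 combinations. The sketch below enumerates them for the first 15 characters only. It is an illustration under stated assumptions, not the gem's `Omocode` class: `omocode_bodies` is a hypothetical name, the input is assumed to be a plain code with digits in all 7 positions, and the CIN would still need to be recomputed for each variant (that step is omitted here).

```ruby
OMOCODIA_POSITIONS = [6, 7, 9, 10, 12, 13, 14].freeze
OMOCODIA_LETTERS = %w[L M N P Q R S T U V].freeze # array index = digit

# Returns the 128 possible 15-character bodies (the code without its CIN).
# The first element is the unmodified body; the last has every digit replaced.
def omocode_bodies(code)
  body = code.upcase[0, 15]

  # Each flag decides whether the digit at the matching position is substituted.
  [false, true].repeated_permutation(OMOCODIA_POSITIONS.size).map do |flags|
    chars = body.chars
    OMOCODIA_POSITIONS.zip(flags).each do |position, substitute|
      chars[position] = OMOCODIA_LETTERS[chars[position].to_i] if substitute
    end
    chars.join
  end
end

bodies = omocode_bodies("RSSMRA80A01F205X")
bodies.size  # => 128
bodies.first # => "RSSMRA80A01F205"
bodies.last  # => "RSSMRAULALMFNLR"
```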
---

## Architecture

```
ItaxCode (public API module)
├── .encode(data) → Encoder → String (16-char tax code)
├── .decode(code) → Parser → Hash
└── .valid?(code) → Parser → Boolean (true / false, never raises)

Supporting classes:
├── Omocode → generates all 128 valid omocode variants
├── Utils → CIN calculation, CSV loaders, consonant/vowel extraction, month table
├── Transliterator → 200+ accented chars → ASCII (Ò→O, etc.)
└── Error → custom exception hierarchy
```

### Key files

| Path | Purpose |
|------|---------|
| `lib/itax_code.rb` | Public API entry point |
| `lib/itax_code/encoder.rb` | Tax code generation |
| `lib/itax_code/parser.rb` | Decoding + CIN validation |
| `lib/itax_code/omocode.rb` | 128-variant omocode generation |
| `lib/itax_code/utils.rb` | CIN algorithm, CSV loaders, helpers |
| `lib/itax_code/transliterator.rb` | Unicode→ASCII character map |
| `lib/itax_code/error.rb` | Exception hierarchy |
| `lib/itax_code/data/cities.csv` | ~8,000 Italian municipalities (code, province, name, created_on, deleted_on) |
| `lib/itax_code/data/countries.csv` | 276 foreign countries (code, province, name) |

---

## Public API

```ruby
# Signature reference: types are shown in place of values, so this block
# documents the interface and is not meant to be executed as-is.

# Encode — generate a tax code from personal data
ItaxCode.encode(
  surname: String,              # required
  name: String,                 # required
  gender: "M" | "F",            # required
  birthdate: String|Date|Time,  # required — String parsed via Date.parse
  birthplace: String            # required — city name OR 4-char Belfiore code
) #=> String (e.g. "RSSMRA80A01F205X")

# Decode — parse a tax code into components
ItaxCode.decode(tax_code) #=> Hash:
# {
#   code: String,               # the input code, upcased
#   gender: "M" | "F",
#   birthdate: String,          # "YYYY-MM-DD"
#   birthplace: {               # nil if not found in either CSV
#     code: String,             # e.g. "F205"
#     province: String,         # e.g. "MI"
#     name: String,             # e.g. "MILANO"
#     created_on: String,       # ISO date (cities only)
#     deleted_on: String        # ISO date (cities only, if deleted)
#   },
#   omocodes: Array<String>,    # all 128 valid omocode variants
#   raw: {
#     surname: String,          # chars 0-2
#     name: String,             # chars 3-5
#     birthdate: String,        # chars 6-10
#     birthdate_year: String,   # chars 6-7
#     birthdate_month: String,  # char 8
#     birthdate_day: String,    # chars 9-10
#     birthplace: String,       # chars 11-14
#     cin: String               # char 15
#   }
# }

# Validate — returns true/false, never raises
ItaxCode.valid?(tax_code) #=> Boolean
```

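The `gender` and `birthdate` fields of the decode result are derived from characters 6-10 of the code. The sketch below shows one way that derivation could look, following the month table, the day + 40 rule, and the century rule described in this document. `decode_birthdate` and its `today:` parameter are hypothetical (the parameter exists only to make the century rule deterministic), and the input is assumed to have had any omocodia letters converted back to digits already.

```ruby
require "date"

MONTH_LETTERS = %w[A B C D E H L M P R S T].freeze # index 0 = January

# Derive gender and ISO birthdate from the raw 5-character segment, e.g. "80A01".
# Date.new raises an ArgumentError subclass for impossible dates such as Feb 30.
def decode_birthdate(raw, today: Date.today)
  year_digits = raw[0, 2].to_i
  month_index = MONTH_LETTERS.index(raw[2])
  raise ArgumentError, "unknown month letter: #{raw[2]}" if month_index.nil?

  day_value = raw[3, 2].to_i
  gender = day_value > 40 ? "F" : "M"
  day = day_value > 40 ? day_value - 40 : day_value

  # Prefix with the current century; step back 100 years if that lands in the future.
  year = (today.year / 100 * 100) + year_digits
  year -= 100 if year > today.year

  { gender: gender, birthdate: Date.new(year, month_index + 1, day).strftime("%Y-%m-%d") }
end

reference_day = Date.new(2026, 1, 1)
decode_birthdate("80A01", today: reference_day) # gender "M", birthdate "1980-01-01"
decode_birthdate("05E45", today: reference_day) # gender "F", birthdate "2005-05-05"
```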
---

## Error Hierarchy

```
ItaxCode::Error (< StandardError)
├── ItaxCode::Encoder::Error
│   ├── MissingDataError — a required encode field is nil/blank
│   └── InvalidBirthdateError — birthdate string cannot be parsed by Date.parse
└── ItaxCode::Parser::Error
    ├── NoTaxCodeError — input is nil or blank
    ├── InvalidTaxCodeError — not 16 chars
    ├── InvalidControlInternalNumberError — CIN checksum mismatch
    └── DateTaxCodeError — decoded date is impossible (e.g. Feb 30)
```

`valid?` rescues `Parser::Error` (all parser errors) and returns `false`.

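The pattern behind `valid?` can be illustrated with stand-in classes: rescuing the parser's base error class converts every parser failure into `false`. Everything in the `TaxCodeDemo` module below is hypothetical and only mirrors part of the hierarchy above; it is not the gem's source, and the toy parser implements just the blank and length checks.

```ruby
# Stand-in classes mirroring part of the hierarchy; NOT the gem's definitions.
module TaxCodeDemo
  class Error < StandardError; end

  module Parser
    class Error < TaxCodeDemo::Error; end
    class NoTaxCodeError < Error; end
    class InvalidTaxCodeError < Error; end

    # Toy parser: checks only for blank input and for a length of 16.
    def self.parse(code)
      raise NoTaxCodeError if code.nil? || code.strip.empty?
      raise InvalidTaxCodeError unless code.strip.length == 16

      code.strip.upcase
    end
  end

  # Rescuing the base class covers every Parser error subclass at once.
  def self.valid?(code)
    Parser.parse(code)
    true
  rescue Parser::Error
    false
  end
end

TaxCodeDemo.valid?("RSSMRA80A01F205X") # => true
TaxCodeDemo.valid?(nil)                # => false
TaxCodeDemo.valid?("TOO-SHORT")        # => false
```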
---

## Key Implementation Details & Edge Cases

| Topic | Detail |
|-------|--------|
| **Female day encoding** | Day stored as `day + 40`; decoded as `val > 40 ? val - 40 : val` |
| **Ambiguous birth year** | 2-digit year prefixed with current century; if result > current year, subtract 100 (e.g. `80` in 2026 → `1980`, not `2080`) |
| **City validity dates** | cities.csv has `created_on`/`deleted_on`; parser filters by whether the decoded birthdate falls within that range |
| **Foreign birthplaces** | Belfiore codes starting with `Z` (Z100+); encoder/parser fall back from cities.csv to countries.csv |
| **Birthplace lookup** | Encoder detects Belfiore code format via `/^\w{1}\d{3}$/`; otherwise matches by slugged name |
| **Omocodia decode** | Parser runs `utils.omocodia_decode` on year, day, and birthplace digits before interpreting them |
| **Transliteration** | Applied before consonant/vowel extraction so accented names (Ò, Ç, À…) are handled correctly |
| **CIN algorithm** | Separate odd/even position lookup tables; sum of all 15 positions mod 26 maps to A–Z |
| **Name consonant rule** | If name has >3 consonants, the 2nd consonant is dropped (e.g. ROBERTO → RRT, not RBR) |

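The CIN row above can be made concrete with the standard Codice Fiscale checksum tables. The sketch below is a standalone illustration; `ODD_VALUES`, `even_value`, and `compute_cin` are hypothetical names, and the gem's `Utils` module may structure its tables differently. Note that "odd" and "even" refer to 1-based positions, so the odd table applies to 0-based indices 0, 2, 4, and so on.

```ruby
# Values for characters in odd positions (1st, 3rd, 5th, ... counting from 1).
ODD_VALUES = {
  "0" => 1, "1" => 0, "2" => 5, "3" => 7, "4" => 9,
  "5" => 13, "6" => 15, "7" => 17, "8" => 19, "9" => 21,
  "A" => 1, "B" => 0, "C" => 5, "D" => 7, "E" => 9, "F" => 13, "G" => 15,
  "H" => 17, "I" => 19, "J" => 21, "K" => 2, "L" => 4, "M" => 18, "N" => 20,
  "O" => 11, "P" => 3, "Q" => 6, "R" => 8, "S" => 12, "T" => 14, "U" => 16,
  "V" => 10, "W" => 22, "X" => 25, "Y" => 24, "Z" => 23
}.freeze

# Even positions: a digit counts as its own value, a letter as A=0 ... Z=25.
def even_value(char)
  char.match?(/\d/) ? char.to_i : char.ord - "A".ord
end

# Compute the check character for the first 15 characters of a tax code.
def compute_cin(first_fifteen)
  total = first_fifteen.upcase.each_char.with_index.sum do |char, index|
    # A 0-based even index is a 1-based odd position.
    index.even? ? ODD_VALUES.fetch(char) : even_value(char)
  end
  ("A".ord + (total % 26)).chr
end

compute_cin("RSSMRA80A01F205") # => "X" (sum 101, 101 % 26 = 23)
```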
---

## Development Commands

```bash
bundle exec rake          # run full test suite (default Rake task)
bundle exec rubocop       # lint all files
bin/console               # interactive Ruby console with gem loaded
bin/setup                 # install dependencies (bundle install)
bin/release               # automated release script
bundle exec rake cities   # update cities.csv data (rakelib/cities.rake)
```

152+
---
153+
154+
## Testing Conventions
155+
156+
- **Framework:** Minitest (`test/**/*_test.rb`)
157+
- **Coverage:** 100% line + branch enforced via SimpleCov (configured in `test/test_helper.rb`)
158+
- **Mocking:** Mocha
159+
- **Time control:** `Timecop.freeze` for any date-sensitive test
160+
- **Test helper macros:** `test/support/test_macro.rb`
161+
- **CI matrix:** Ruby 2.5 through 3.4 + head (GitHub Actions)
162+
- **Coverage reporting:** Qlty.sh (configured in CI)
