Commit 1d8b502

Add comprehensive documentation in docs/
Move documentation from the wiki into the repository so it is versioned with the code. The wiki now redirects here.

- docs/input-formats.md: every accepted format, file lists, directories, binary
- docs/output-formats.md: CIDR, ranges, single IPs, binary, CSV, prefix/suffix
- docs/operations.md: merge, intersect, exclude, diff, reduce, compare, count
- docs/ipv6.md: address family, normalization, cross-family rules
- docs/dns-resolution.md: threading, retry, configuration
- docs/ipset-reduce.md: prefix reduction tutorial with examples
- README.md updated with documentation section linking to docs/
1 parent acc8882 commit 1d8b502

7 files changed, +709 -0 lines changed

README.md

Lines changed: 12 additions & 0 deletions
@@ -366,13 +366,25 @@ To skip the man page: `./configure --disable-man`
| Directory | Contents |
|-----------|----------|
| `src/` | C sources and headers |
| `docs/` | Detailed documentation |
| `packaging/` | Spec template, ebuild, release tooling |
| `tests.d/` | CLI regression tests |
| `tests.build.d/` | Build and layout regressions |
| `tests.sanitizers.d/` | Sanitizer CLI regressions |
| `tests.tsan.d/` | TSAN regressions |
| `tests.unit/` | Unit-style internal harnesses |

## Documentation

Detailed guides in the [`docs/`](docs/) directory:

- [Input formats](docs/input-formats.md): every accepted format, file lists, directories, binary
- [Output formats](docs/output-formats.md): CIDR, ranges, single IPs, binary, CSV, prefix/suffix
- [Operations](docs/operations.md): merge, intersect, exclude, diff, reduce, compare, count
- [IPv6 support](docs/ipv6.md): address family, normalization, cross-family rules
- [DNS resolution](docs/dns-resolution.md): threading, retry, configuration
- [Optimizing ipsets for iptables](docs/ipset-reduce.md): prefix reduction with examples

## Getting help

```bash

docs/dns-resolution.md

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
# DNS resolution

When input files contain hostnames (one per line), `iprange` resolves them in parallel using a thread pool.

## Configuration

| Option | Default | Meaning |
|--------|---------|---------|
| `--dns-threads N` | 5 | Maximum number of parallel DNS queries |
| `--dns-silent` | off | Suppress all DNS error messages |
| `--dns-progress` | off | Show a progress bar during resolution |

## How it works

1. As each input line is parsed, hostnames are queued for resolution.
2. Worker threads pick requests from the queue and call `getaddrinfo()`.
3. Resolved IPs are added to a reply queue.
4. The main thread drains the reply queue periodically and after all requests finish.

Threads are created on demand, up to `--dns-threads`: if the queue grows faster than the existing threads can process it, new threads are spawned until the limit is reached.

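The request-queue and reply-queue flow described above can be sketched in Python. This is only an illustration under stated assumptions, not `iprange`'s C implementation: `resolve_all`, `system_resolver`, and the injectable `resolver` parameter are hypothetical names, and for simplicity the sketch starts all workers up front instead of creating them on demand.

```python
import queue
import socket
import threading


def system_resolver(hostname):
    """Resolve A records with getaddrinfo(); returns a sorted list of IPv4 strings."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def resolve_all(hostnames, max_threads=5, resolver=system_resolver):
    """Resolve hostnames in parallel; returns ({hostname: [ips]}, [failed hostnames])."""
    requests = queue.Queue()   # hostnames waiting for a worker
    replies = queue.Queue()    # (hostname, ips or None) produced by workers

    def worker():
        while True:
            try:
                name = requests.get_nowait()
            except queue.Empty:
                return  # nothing left to resolve
            try:
                replies.put((name, resolver(name)))
            except OSError:  # socket.gaierror is a subclass of OSError
                replies.put((name, None))

    for name in hostnames:
        requests.put(name)

    # Hypothetical simplification: a fixed pool capped at max_threads.
    threads = [threading.Thread(target=worker)
               for _ in range(min(max_threads, len(hostnames)))]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Drain the reply queue once every worker has finished.
    resolved, failed = {}, []
    while not replies.empty():
        name, ips = replies.get()
        if ips is None:
            failed.append(name)
        else:
            resolved[name] = ips
    return resolved, sorted(failed)
```

Passing the resolver as a parameter keeps the sketch testable without network access; a failing hostname ends up in the second return value without blocking the others.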
## Address family behavior

| Mode | Records resolved | Normalization |
|------|------------------|---------------|
| IPv4 (default / `-4`) | A records only | None |
| IPv6 (`-6`) | AAAA and A records | A results mapped to `::ffff:x.x.x.x` |

In IPv6 mode, a hostname that has both AAAA and A records contributes all of its addresses: IPv6 addresses directly, and IPv4 addresses as IPv4-mapped IPv6.

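As an illustration of the IPv4-mapped normalization, here is a minimal Python sketch using the standard `ipaddress` module. The helper name `to_ipv6` is hypothetical; `iprange` itself is written in C and may implement the mapping differently.

```python
import ipaddress


def to_ipv6(address):
    """Return an IPv6Address, mapping IPv4 input to ::ffff:x.x.x.x."""
    ip = ipaddress.ip_address(address)
    if ip.version == 4:
        # IPv4-mapped IPv6 places the 32-bit address under the ::ffff:0:0/96 prefix.
        return ipaddress.IPv6Address((0xFFFF << 32) | int(ip))
    return ip


mapped = to_ipv6("10.0.0.1")
print(mapped.ipv4_mapped)        # the embedded IPv4 address
print(to_ipv6("2001:db8::1"))    # IPv6 input passes through unchanged
```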
## Retry and error handling

- **Temporary failures** (`EAI_AGAIN`): retried up to 20 times, with a 1-second delay between retry cycles.
- **Permanent failures** (`EAI_NONAME`, `EAI_FAIL`, etc.): logged to stderr and counted.
- **System errors** (`EAI_SYSTEM`, `EAI_MEMORY`): logged to stderr.

After all resolutions complete, if any hostname failed permanently, the entire load fails and returns an error. `--dns-silent` suppresses the per-hostname error messages, but the load still fails.

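The retry rule for temporary failures can be sketched as follows. This is a simplified, per-hostname illustration in Python: `iprange` retries in cycles across the whole queue, and the names `resolve_with_retry`, `lookup`, `MAX_RETRIES`, and `RETRY_DELAY` are assumptions made for the example.

```python
import socket
import time

MAX_RETRIES = 20     # from the text: temporary failures are retried up to 20 times
RETRY_DELAY = 1.0    # seconds to wait before each retry


def lookup(hostname):
    """Default resolver: a plain getaddrinfo() call."""
    return socket.getaddrinfo(hostname, None)


def resolve_with_retry(hostname, resolver=lookup, sleep=time.sleep):
    """Return the resolver's result, retrying only on EAI_AGAIN."""
    retries = 0
    while True:
        try:
            return resolver(hostname)
        except socket.gaierror as err:
            if err.errno != socket.EAI_AGAIN or retries >= MAX_RETRIES:
                raise  # permanent failure, or retries exhausted
            retries += 1
            sleep(RETRY_DELAY)
```

Only `EAI_AGAIN` triggers a retry; every other `getaddrinfo()` error propagates immediately, which mirrors the temporary versus permanent split in the list above.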
## Hostname detection

A line is treated as a hostname when:
- It contains only hostname-valid characters (alphanumeric, dot, hyphen, underscore)
- It does not look like a valid IP address or CIDR
- It appears alone on the line (optionally followed by a comment)

Lines that look like IPs but fail to parse are treated as errors, not hostnames. This prevents typos like `1.2.3.999` from triggering DNS resolution.

Hostnames cannot appear as range endpoints. A line like `host1.example.com - host2.example.com` is invalid.

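The detection rules above might be approximated like this in Python. The sketch rests on assumptions: "looks like an IP" is modelled as "digits and dots only, with an optional `/` suffix", and Python's `ipaddress` parser (stricter than `inet_aton()`) stands in for the real parser. The function name `classify` and its return values are hypothetical, and range handling is not modelled.

```python
import ipaddress
import re

HOSTNAME_CHARS = re.compile(r"^[A-Za-z0-9._-]+$")
IP_LIKE = re.compile(r"^[0-9.]+(/[0-9.]+)?$")   # assumption: digits and dots only


def classify(line):
    """Return 'empty', 'ip', 'hostname', 'error' or 'other' for one input line."""
    entry = re.split(r"[#;]", line, maxsplit=1)[0].strip()  # drop any comment
    if not entry:
        return "empty"
    if IP_LIKE.match(entry):
        try:
            ipaddress.ip_network(entry, strict=False)
            return "ip"
        except ValueError:
            return "error"   # looks like an IP but does not parse
    if HOSTNAME_CHARS.match(entry):
        return "hostname"
    return "other"           # ranges and everything else are not modelled here


for sample in ["example.com # web", "1.2.3.0/24", "1.2.3.999",
               "host1.example.com - host2.example.com"]:
    print(f"{sample!r}: {classify(sample)}")
```

Checking the IP-like shape before the hostname character set is what keeps a typo such as `1.2.3.999` from being sent to DNS.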
## Performance notes

- With the default 5 threads, `iprange` can resolve hundreds of hostnames per second.
- For files with thousands of hostnames, increase `--dns-threads` (e.g., 50-100).
- DNS results are added to the ipset as they arrive, so resolution overlaps with continued file parsing.
- Each hostname resolution is independent: one slow or failing hostname does not block others.

docs/input-formats.md

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
# Input formats

`iprange` accepts one entry per line. All formats can coexist in the same file.

## IPv4 (default mode)

### Addresses and CIDRs

| Format | Example | Expansion |
|--------|---------|-----------|
| Dotted decimal | `1.2.3.4` | Single IP |
| CIDR prefix | `1.2.3.0/24` | 1.2.3.0 - 1.2.3.255 |
| Dotted netmask | `1.2.3.0/255.255.255.0` | Same as /24 |
| Abbreviated | `10.1` | 10.0.0.1 (`inet_aton()` expansion) |
| Decimal integer | `16909060` | 1.2.3.4 |
| Octal | `012.0.0.1` | 10.0.0.1 (leading zero = octal) |
| Hex | `0x0A000001` | 10.0.0.1 |

IPv4 parsing uses `inet_aton()`, which accepts all the address forms above. Be careful with leading zeros: `010.0.0.1` is parsed as octal and yields 8.0.0.1, not 10.0.0.1.

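To see how `inet_aton()` treats these forms, Python's `socket.inet_aton` can be used, since it calls the platform's C library function where one is available. The helper name `aton` is hypothetical, and the results assume a glibc-style `inet_aton()`.

```python
import socket


def aton(text):
    """Parse text with the C library's inet_aton() and return dotted decimal."""
    return socket.inet_ntoa(socket.inet_aton(text))


for form in ["1.2.3.4", "10.1", "16909060", "012.0.0.1", "010.0.0.1", "0x0A000001"]:
    print(f"{form:>12} -> {aton(form)}")
```

Running this against a blocklist sample before loading it is a quick way to spot entries whose leading zeros change their meaning.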
By default, CIDRs are normalized to the network address: `1.1.1.17/24` is read as `1.1.1.0/24`. Use `--dont-fix-network` to disable this.

The default prefix for bare IPs (no `/` suffix) is /32. Change it with `--default-prefix N`.

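A minimal Python sketch of these two defaults, using the standard `ipaddress` module. The function `parse_entry` and its `default_prefix` parameter are hypothetical stand-ins for the behaviour described above; the `--dont-fix-network` case is not modelled, and combining a non-default prefix with network fixing is an assumption.

```python
import ipaddress


def parse_entry(text, default_prefix=32):
    """Parse an IPv4 address or CIDR, clearing host bits to get the network."""
    if "/" not in text:
        text = f"{text}/{default_prefix}"   # bare IPs get the default prefix
    # strict=False masks off host bits instead of raising ValueError.
    return ipaddress.ip_network(text, strict=False)


print(parse_entry("1.1.1.17/24"))   # host bits cleared
print(parse_entry("1.2.3.4"))       # bare IP, treated as a /32
```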
### Ranges

| Format | Example | Meaning |
|--------|---------|---------|
| IP range | `1.2.3.0 - 1.2.3.255` | Explicit start-end |
| CIDR range | `1.2.3.0/24 - 1.2.4.0/24` | Network of first to broadcast of second |
| Mixed | `1.2.3.0/24 - 1.2.4.0/255.255.255.0` | CIDR and netmask can be mixed |

The spaces around the dash are optional.

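The "network of first to broadcast of second" rule can be illustrated with a short Python sketch. `parse_range` is a hypothetical helper built on the standard `ipaddress` module and covers IPv4 only; it is not how `iprange` parses ranges internally.

```python
import ipaddress


def parse_range(line):
    """Return (first, last) IPv4 addresses for 'A - B', where A and B may be CIDRs."""
    left, right = (part.strip() for part in line.split("-", 1))
    # A bare IP is a /32 network, so its network and broadcast address are itself.
    first = ipaddress.ip_network(left, strict=False).network_address
    last = ipaddress.ip_network(right, strict=False).broadcast_address
    return first, last


for line in ["1.2.3.0 - 1.2.3.255",
             "1.2.3.0/24 - 1.2.4.0/24",
             "1.2.3.0/24-1.2.4.0/255.255.255.0"]:
    first, last = parse_range(line)
    print(f"{line} => {first} .. {last}")
```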
### Hostnames

Hostnames (one per line) are resolved via parallel DNS queries. In IPv4 mode, only A records are resolved. If a hostname resolves to multiple IPs, all are added.

See [DNS resolution](dns-resolution.md) for threading and configuration.

## IPv6 (`-6` mode)

### Addresses and CIDRs

| Format | Example | Notes |
|--------|---------|-------|
| Full notation | `2001:0db8:0000:0000:0000:0000:0000:0001` | |
| Compressed | `2001:db8::1` | Standard `::` compression |
| Loopback | `::1` | |
| CIDR | `2001:db8::/32` | Prefix 0-128 |
| IPv4-mapped | `::ffff:10.0.0.1` | |
| Plain IPv4 | `10.0.0.1` | Auto-normalized to `::ffff:10.0.0.1` |

IPv6 parsing uses `inet_pton(AF_INET6)`.

### Ranges

IPv6 ranges use the same `addr1 - addr2` syntax. Both endpoints must be of the same address family: a range like `10.0.0.1 - 2001:db8::1` is rejected as a mixed-family error.

### Hostnames

In IPv6 mode, hostnames are resolved for both AAAA and A records. A-record results are normalized to IPv4-mapped IPv6 (`::ffff:x.x.x.x`).

## Comments and whitespace

- `#` or `;` at the start of a line marks it as a comment.
- `#` or `;` after an IP/range/hostname starts an inline comment (rest of line ignored).
- Empty lines and leading/trailing whitespace are silently skipped.

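The comment and whitespace rules above amount to a small preprocessing step, sketched here in Python. The helper names `strip_comment` and `entries` are hypothetical and only illustrate the rules; they are not taken from the `iprange` sources.

```python
import re


def strip_comment(line):
    """Return the entry on a line with comments and whitespace removed ('' if none)."""
    return re.split(r"[#;]", line, maxsplit=1)[0].strip()


def entries(lines):
    """Yield the non-empty entries from an iterable of raw input lines."""
    for line in lines:
        entry = strip_comment(line)
        if entry:
            yield entry


sample = ["# header", "  1.2.3.4  ; office", "", "10.0.0.0/8 # lan", "; note"]
print(list(entries(sample)))
```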
## File inputs

### Regular files

```bash
iprange file1.txt file2.txt file3.txt
```

Each file argument is loaded as a separate ipset. For modes like `--compare`, each file appears as a separate column in the output.

### stdin

```bash
cat blocklist.txt | iprange -
# or just:
cat blocklist.txt | iprange
```

If no file arguments are given, stdin is assumed. Explicit `-` reads stdin.

### File lists (`@filename`)

```bash
iprange @my-lists.txt
```

The file `my-lists.txt` contains one filename per line. Comments (`#`, `;`) and empty lines are ignored. Each listed file is loaded as a separate ipset.

```
# my-lists.txt
/path/to/blocklist-a.txt
/path/to/blocklist-b.txt
# /path/to/disabled.txt
```

Feature detection: `iprange --has-filelist-loading` exits 0 if supported.

### Directory loading (`@directory`)

```bash
iprange @/etc/firehol/ipsets/
```

All regular files in the directory are loaded (sorted alphabetically), each as a separate ipset. Subdirectories are not traversed.

Feature detection: `iprange --has-directory-loading` exits 0 if supported.

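To preview which files a directory argument would pick up, the selection rule (regular files only, sorted, no recursion) can be approximated in Python. `files_in_directory` is a hypothetical helper; the exact sort order and symlink handling of `iprange` are assumptions here.

```python
import os


def files_in_directory(path):
    """Return the regular files directly inside path, sorted by name."""
    with os.scandir(path) as it:
        # is_file() follows symlinks by default; subdirectories are skipped.
        names = [entry.name for entry in it if entry.is_file()]
    return [os.path.join(path, name) for name in sorted(names)]
```

For example, `files_in_directory("/etc/firehol/ipsets")` would list the candidate files for the directory used in the command above, if that directory exists on the machine.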
### Naming for CSV output

Any file argument can be followed by `as NAME` to override its name in CSV output:

```bash
iprange --compare --header file1.txt as "Blocklist A" file2.txt as "Blocklist B"
```

## Binary input

Binary files (produced by `--print-binary`) are auto-detected by their header line:
- IPv4 binary: format v1.0
- IPv6 binary: format v2.0

Loading a binary file of the wrong family is an error: in IPv4 mode an IPv6 binary file is rejected, and in IPv6 mode an IPv4 binary file is rejected.

Binary files are architecture-specific (no endianness conversion). They are intended as a same-machine cache, not a portable interchange format.

docs/ipset-reduce.md

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
# Optimizing ipsets for iptables

netfilter/iptables `hash:net` ipsets (netsets) are a fast way to manage IP lists for firewall rules. The number of entries in an ipset does not affect lookup performance. However, each **distinct prefix length** in the netset adds one extra lookup per packet. A netset using all 32 possible IPv4 prefixes forces 32 lookups per packet.

`iprange --ipset-reduce` consolidates prefixes while keeping the matched IP set identical. For example, replacing one /23 entry with two /24 entries covers the same IPs and, when /24 is already present in the set, leaves one fewer prefix length.

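The trade-off between entries and distinct prefix lengths can be seen in a small Python sketch built on the standard `ipaddress` module. The helper `distinct_prefixes` and the sample networks are made up for illustration.

```python
import ipaddress


def distinct_prefixes(netset):
    """Return the sorted prefix lengths that occur in an iterable of CIDR strings."""
    return sorted({ipaddress.ip_network(cidr).prefixlen for cidr in netset})


before = ["10.0.0.0/23", "10.0.8.0/24"]

# Split the /23 into its two /24 halves; the covered addresses are unchanged.
after = [str(half) for half in ipaddress.ip_network("10.0.0.0/23").subnets(new_prefix=24)]
after.append("10.0.8.0/24")

print(distinct_prefixes(before), len(before))   # two prefix lengths, 2 entries
print(distinct_prefixes(after), len(after))     # one prefix length, 3 entries
```

One extra entry buys one fewer lookup per packet, which is the exchange `--ipset-reduce` makes at scale.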
## Parameters

| Option | Default | Purpose |
|--------|---------|---------|
| `--ipset-reduce PERCENT` | 20 | Allow this % increase in entries |
| `--ipset-reduce-entries ENTRIES` | 16384 | Minimum absolute entry cap |

Giving either option enables reduce mode. The maximum acceptable number of entries is computed as:

```
max(current_entries * (1 + PERCENT / 100), ENTRIES)
```

This design works well across all netset sizes:
- Small netsets (hundreds of entries) may grow up to ENTRIES
- Large netsets (hundreds of thousands of entries) may grow by PERCENT

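As a worked example of the formula, here it is as a small Python function. `max_entries` is a hypothetical name; the inputs are the entry counts of the two netsets used in the examples of this page, and any rounding `iprange` applies is not modelled.

```python
def max_entries(current_entries, percent=20, entries=16384):
    """Entry budget: the larger of the percentage growth and the absolute floor."""
    return max(current_entries * (1 + percent / 100), entries)


print(max_entries(406))                     # small set: the 16384 floor wins
print(max_entries(218307, entries=50000))   # large set: the 20% growth wins
```

For the 406-entry set, 20% growth would allow only about 487 entries, so the absolute floor is what makes meaningful reduction possible.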
## Algorithm

The algorithm works greedily: at each step it finds the prefix whose elimination adds the fewest new entries and merges it into the next available prefix. It repeats until eliminating another prefix would exceed the entry limit. Use `-v` to see the elimination steps.

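A simplified model of this procedure, operating only on per-prefix entry counts, can be sketched in Python. It assumes that eliminating a prefix splits each of its entries into entries of the next longer prefix present in the set, and that ties go to the shorter prefix. The function name `reduce_prefixes` is hypothetical and the real implementation may differ in detail.

```python
def reduce_prefixes(counts, limit):
    """Greedily eliminate prefix lengths while the entry total stays within limit.

    counts maps prefix length -> number of entries; a new dict is returned.
    """
    counts = dict(counts)
    while len(counts) > 1:
        present = sorted(counts)
        best = None  # (added_entries, prefix, target_prefix)
        # The longest prefix has no longer prefix to merge into, so it is skipped.
        for prefix, target in zip(present, present[1:]):
            # Each /prefix entry splits into 2**(target - prefix) /target entries.
            added = counts[prefix] * (2 ** (target - prefix) - 1)
            if best is None or added < best[0]:
                best = (added, prefix, target)
        added, prefix, target = best
        if sum(counts.values()) + added > limit:
            break  # even the cheapest elimination would exceed the budget
        counts[target] += counts.pop(prefix) * 2 ** (target - prefix)
    return counts


# Per-prefix entry counts of a 406-entry netset (18 distinct prefix lengths).
greece = {13: 1, 14: 3, 15: 7, 16: 42, 17: 19, 18: 17, 19: 21, 20: 21, 21: 30,
          22: 50, 23: 50, 24: 98, 25: 4, 27: 2, 28: 7, 29: 25, 31: 3, 32: 6}
print(reduce_prefixes(greece, 16384))
print(reduce_prefixes(greece, 50000))
```

Because the model tracks counts rather than actual networks, it illustrates the cost accounting only; it does not produce a reduced netset.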
## Example: country netset

The GeoLite2 netset for Greece:

```bash
$ iprange -C --header country_gr.netset
entries,unique_ips
406,6304132
```

406 entries, 6.3 million unique IPs. The prefix breakdown (`-v`):

```
prefix /13 counts 1 entries
prefix /14 counts 3 entries
prefix /15 counts 7 entries
prefix /16 counts 42 entries
prefix /17 counts 19 entries
prefix /18 counts 17 entries
prefix /19 counts 21 entries
prefix /20 counts 21 entries
prefix /21 counts 30 entries
prefix /22 counts 50 entries
prefix /23 counts 50 entries
prefix /24 counts 98 entries
prefix /25 counts 4 entries
prefix /27 counts 2 entries
prefix /28 counts 7 entries
prefix /29 counts 25 entries
prefix /31 counts 3 entries
prefix /32 counts 6 entries
```

**18 distinct prefixes** = 18 lookups per packet.

After reduction with `--ipset-reduce 20` (for a set this small, the default 16384-entry floor is the effective limit):

```bash
$ iprange -v --ipset-reduce 20 country_gr.netset >/dev/null
Eliminated 15 out of 18 prefixes (3 remain in the final set).

prefix /21 counts 3028 entries
prefix /24 counts 398 entries
prefix /32 counts 900 entries
```

**3 prefixes, 4,326 entries**, covering the same 6.3 million unique IPs. The kernel now does 3 lookups instead of 18.

With a higher entry cap:

```bash
$ iprange -v --ipset-reduce 20 --ipset-reduce-entries 50000 country_gr.netset >/dev/null
Eliminated 16 out of 18 prefixes (2 remain in the final set).

prefix /24 counts 24622 entries
prefix /32 counts 900 entries
```

**2 prefixes, 25,522 entries**: one more prefix eliminated thanks to the higher entry budget.

## Example: large blocklist

A large blocklist (218,307 entries, 25 prefixes, 765 million IPs):

```bash
$ iprange -v --ipset-reduce 20 --ipset-reduce-entries 50000 \
    ib_bluetack_level1.netset >/dev/null
Eliminated 17 out of 25 prefixes (8 remain in the final set).

prefix /16 counts 11118 entries
prefix /20 counts 5216 entries
prefix /24 counts 46718 entries
prefix /26 counts 17902 entries
prefix /27 counts 18123 entries
prefix /28 counts 32637 entries
prefix /29 counts 94802 entries
prefix /32 counts 33570 entries
```

The set goes from 25 prefixes to 8, while entries grow from 218,307 to 260,086 (within the 20% budget of about 261,968). With `--ipset-reduce 50` it drops to 6 prefixes; with `--ipset-reduce 100`, to 5.

## Lossless round-trip

The reduction is lossless. Piping reduced output back through `iprange` reproduces the original optimized set:

```bash
iprange --ipset-reduce 100 blocklist.txt | iprange -v >/dev/null
# output is identical to: iprange -v blocklist.txt >/dev/null
```

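The same round-trip property can be demonstrated on a toy set with Python's `ipaddress.collapse_addresses`, which merges adjacent networks much as re-reading a reduced set would. The sample networks are hypothetical, and the splitting step is a stand-in for what a reduced set might contain, not the `iprange` algorithm.

```python
import ipaddress

original = [ipaddress.ip_network("10.0.0.0/23"), ipaddress.ip_network("10.0.8.0/24")]

# "Reduce" by splitting anything shorter than /24 into /24 entries.
reduced = []
for net in original:
    reduced.extend(net.subnets(new_prefix=24) if net.prefixlen < 24 else [net])

# Merging adjacent networks again recovers the original entries.
restored = list(ipaddress.collapse_addresses(reduced))
print(restored == original)
```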
## Typical usage

```bash
# Moderate reduction (good default)
iprange --ipset-reduce 20 blocklist.txt > reduced.txt

# Aggressive reduction for small lists
iprange --ipset-reduce 20 --ipset-reduce-entries 50000 country.netset > reduced.txt

# Generate ipset restore commands from reduced set
iprange --ipset-reduce 20 --print-prefix "add myset " blocklist.txt
```
