Commit a00a52f

Update experimental search strategies documentation
Revised the README section on experimental search strategies for substring search. Clarified algorithm descriptions, provided a concise API table, improved example code, and added notes on heuristic selection and configuration.
1 parent 2ef3928 commit a00a52f

1 file changed

Lines changed: 15 additions & 27 deletions

README.md

@@ -116,43 +116,31 @@ pub fn main() {

 ### ⚠️ Experimental: Search Strategies

-We added experimental search strategies to provide faster, algorithmic alternatives for substring search in different scenarios.
+**Algorithms:**
+- **KMP**: optimized for long/repetitive patterns
+- **Sliding**: fast for short patterns, zero allocations

-- New algorithms:
-- **KMP** (Knuth–Morris–Pratt): fast for long or highly repetitive patterns.
-- **Sliding-match**: non-allocating, often faster for short/non-repetitive patterns.
+**APIs:**

-- Experimental (opt-in) APIs:
-- `core.index_of_auto(text, pattern)` — automatic heuristic choosing between KMP and Sliding.
-- `core.count_auto(haystack, pattern, overlapping)` — automatic count using the heuristic.
-- `core.index_of_strategy(text, pattern, core.Kmp|core.Sliding)` — explicit selection of algorithm.
-- `core.count_strategy(haystack, pattern, overlapping, core.Kmp|core.Sliding)` — explicit count with chosen algorithm.
-
-- Configuration (tunable thresholds):
-- See `src/str/config.gleam` for:
-- `kmp_min_pattern_len()`
-- `kmp_large_text_threshold()`
-- `kmp_large_text_min_pat()`
-- `kmp_border_multiplier()`
-- These defaults are conservative; projects can replace `str/config.gleam` during their build to change behavior.
-
-- Notes & guidance:
-- `index_of_auto` is experimental and uses a heuristic based on pattern length, text length and prefix-table repetitiveness. It may choose a non-optimal algorithm for some inputs. For performance-critical code prefer the explicit `index_of_strategy` or `count_strategy` APIs.
-- Use `scripts/bench_beam.erl` (BEAM-native) and `scripts/bench_kmp.py` to measure and tune thresholds. The BEAM harness now emits `max_border` (prefix table max) into CSV results to help heuristic tuning.
+| Function | Description |
+|----------|-------------|
+| `index_of_auto(text, pattern)` | Auto-select algorithm (heuristic) |
+| `index_of_strategy(text, pattern, Kmp\|Sliding)` | Explicit algorithm choice |
+| `count_auto(text, pattern, overlapping)` | Auto-select for counting |
+| `count_strategy(text, pattern, overlapping, Kmp\|Sliding)` | Explicit count algorithm |

-Examples:
+**Examples:**

 ```gleam
-// Explicitly force KMP
+// Force KMP explicitly
 core.index_of_strategy("long text...", "pattern", core.Kmp)

-// Use the automatic heuristic (experimental)
+// Let heuristic decide (experimental)
 core.index_of_auto("some text", "pat")
-
-// Count occurrences with explicit strategy
-core.count_strategy("abababab", "ab", True, core.Kmp)
 ```

+> **Note:** `_auto` variants use heuristics and may not always choose optimally. For performance-critical code, use `_strategy` variants. Configure thresholds in `src/str/config.gleam`.
+
 ### 🧩 Splitting & Partitioning

 | Function | Example | Result |
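
This commit drops the counting call from the README example, so the two counting APIs in the new table no longer have a usage sample. A minimal sketch of how they might be exercised side by side is given here. The import path `str/core` is an assumption, inferred from the `core.` qualifier and the `src/str/config.gleam` path in the diff. The return type is not stated in the diff, so the results are printed with `string.inspect` rather than pattern-matched.

```gleam
import gleam/io
import gleam/string

// Assumed module path; the diff only shows the `core.` qualifier.
import str/core

pub fn main() {
  let text = "abababab"

  // Explicit strategy with overlapping matches enabled
  // (same call as the example removed in this commit).
  let with_kmp = core.count_strategy(text, "ab", True, core.Kmp)

  // Same query, letting the experimental heuristic pick the algorithm.
  let with_auto = core.count_auto(text, "ab", True)

  // string.inspect accepts any value, so no return type is assumed.
  io.println(string.inspect(with_kmp))
  io.println(string.inspect(with_auto))
}
```

Comparing the two printed values on representative inputs is one quick way to confirm that the heuristic and the explicit strategy agree before deciding whether the `_auto` variant is good enough for a given workload.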
