@@ -43,23 +43,23 @@ Each regex expression has a per-class `spark.comet.expression.<ClassName>.enable
 useful for narrowing a regression or comparing performance on a single operator without changing
 the engine selector:
 
-| Expression | Config |
-| ------------------- | -------------------------------------------------------- |
-| `rlike` | `spark.comet.expression.RLike.enabled=false` |
-| `regexp_extract` | `spark.comet.expression.RegExpExtract.enabled=false` |
-| `regexp_extract_all` | `spark.comet.expression.RegExpExtractAll.enabled=false` |
-| `regexp_instr` | `spark.comet.expression.RegExpInStr.enabled=false` |
-| `regexp_replace` | `spark.comet.expression.RegExpReplace.enabled=false` |
-| `split` | `spark.comet.expression.StringSplit.enabled=false` |
+| Expression | Config |
+| -------------------- | ------------------------------------------------------- |
+| `rlike` | `spark.comet.expression.RLike.enabled=false` |
+| `regexp_extract` | `spark.comet.expression.RegExpExtract.enabled=false` |
+| `regexp_extract_all` | `spark.comet.expression.RegExpExtractAll.enabled=false` |
+| `regexp_instr` | `spark.comet.expression.RegExpInStr.enabled=false` |
+| `regexp_replace` | `spark.comet.expression.RegExpReplace.enabled=false` |
+| `split` | `spark.comet.expression.StringSplit.enabled=false` |
 
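As an illustration of how the per-expression switches in the table might be applied, here is a minimal sketch that turns SQL function names into `--conf` arguments for `spark-submit`. The helper `disable_flags` and the dictionary name are hypothetical and not part of Comet; only the config keys are taken from the table.

```python
# Mapping copied from the table: SQL function name -> Comet expression class name.
REGEX_EXPRESSION_CLASSES = {
    "rlike": "RLike",
    "regexp_extract": "RegExpExtract",
    "regexp_extract_all": "RegExpExtractAll",
    "regexp_instr": "RegExpInStr",
    "regexp_replace": "RegExpReplace",
    "split": "StringSplit",
}


def disable_flags(functions):
    """Build `--conf <key>=false` arguments for the given regex functions.

    Hypothetical helper for illustration; raises KeyError for a name
    that is not listed in the table.
    """
    flags = []
    for name in functions:
        class_name = REGEX_EXPRESSION_CLASSES[name]
        flags += ["--conf", f"spark.comet.expression.{class_name}.enabled=false"]
    return flags


if __name__ == "__main__":
    # Print the arguments that would disable only rlike and split.
    print(" ".join(disable_flags(["rlike", "split"])))
```

The resulting list can be appended to a `spark-submit` command line, which leaves `spark.comet.exec.regexp.engine` untouched while switching off individual expressions.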
 ## Choosing an engine
 
-| | Rust engine | Java engine (default) |
-| -------------------- | ----------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------- |
-| **Compatibility** | Differs from Java regex (see below) | 100% compatible with Spark |
+| | Rust engine | Java engine (default) |
+| -------------------- | ------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------- |
+| **Compatibility** | Differs from Java regex (see below) | 100% compatible with Spark |
 | **Feature coverage** | `rlike`, `regexp_replace`, `split` natively; `regexp_extract`, `regexp_extract_all`, `regexp_instr` via fallthrough | All regexp expressions (`rlike`, `regexp_extract`, `regexp_extract_all`, `regexp_instr`, `regexp_replace`, `split`) |
-| **Performance** | Fully native, no JNI overhead | One JNI round-trip per batch (Arrow vectors stay columnar) |
-| **Pattern support** | Linear-time subset only | All Java regex features (backreferences, lookaround, etc.) |
+| **Performance** | Fully native, no JNI overhead | One JNI round-trip per batch (Arrow vectors stay columnar) |
+| **Pattern support** | Linear-time subset only | All Java regex features (backreferences, lookaround, etc.) |
 
 The **Rust engine** is faster but cannot match Java regex semantics for every pattern. Because the engine
 choice is itself the opt-in, setting `spark.comet.exec.regexp.engine=rust` declares acceptance of those