Commit a684db7

refactor(engine): drop num2hex/num2bin/num2oct, redundant with format spec
1 parent 63b32a8 commit a684db7

5 files changed

Lines changed: 31 additions & 150 deletions

FORMULA-SUPPORT.md

Lines changed: 4 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -847,24 +847,20 @@ Output: 100. apple
847847

848848
### Base Conversions
849849

850-
Built-in functions for converting between decimal and hexadecimal, binary, octal, or Roman numerals. Each base has a pair of functions: `numXXX` for *to-string* output, `XXXnum` for parsing back to a number.
850+
Built-in functions for parsing hexadecimal, binary, octal, or Roman numerals back to a number, plus Roman output. The decimal-to-hex/bin/oct *output* direction is handled by the format spec `~ x` / `~ b` / `~ o` (see the Number formatting section), so there is no `num2hex`-style built-in.
851851

852852
| Function | Returns | Example |
853853
|--------------|---------|--------------------------------------|
854-
| `num2hex(n)` | string | `num2hex(255)` → `"ff"` |
855-
| `num2bin(n)` | string | `num2bin(10)` → `"1010"` |
856-
| `num2oct(n)` | string | `num2oct(511)` → `"777"` |
857854
| `num2rom(n)` | string | `num2rom(2024)` → `"MMXXIV"` |
858855
| `hex2num(s)` | number | `hex2num('ff')` → `255` |
859856
| `bin2num(s)` | number | `bin2num('1010')` → `10` |
860857
| `oct2num(s)` | number | `oct2num('777')` → `511` |
861858
| `rom2num(s)` | number | `rom2num('MMXXIV')` → `2024` |
862859

863860
**Output conventions:**
864-
- Hex/bin/oct outputs are **bare lowercase** — no `0x` / `0b` / `0o` prefix. Compose prefixes if you want them: `(?='0x' + num2hex(num(1)))`.
861+
- Hex/bin/oct output via `~ x` / `~ b` / `~ o` is **bare lowercase** — no `0x` / `0b` / `0o` prefix. Compose prefixes if you want them: `(?='0x' + num(1) ~ x)`.
865862
- Roman output is **uppercase canonical** form (subtractive pairs `IV`, `IX`, `XL`, etc.).
866-
- Negative inputs to the bases produce `"-<digits>"` (e.g. `num2hex(-15)` → `"-f"`). For Roman, only the range 1..3999 is meaningful; out of range returns an empty string.
867-
- Float inputs truncate toward zero (`num2hex(15.7) == num2hex(15)`).
863+
- For Roman, only the range 1..3999 is meaningful; out of range returns an empty string.
868864

869865
**Parser conventions:**
870866
- `hex2num`, `bin2num`, `oct2num` accept input **case-insensitively**, with or without the matching prefix, and trim surrounding whitespace. Invalid characters for the target base yield `NaN`.
@@ -875,7 +871,7 @@ Built-in functions for converting between decimal and hexadecimal, binary, octal
875871

876872
| Find | Replace | Description |
877873
|-----------------------|----------------------------------|------------------------------|
878-
| `(\d+)` | `(?='0x' + num2hex(num(1)))` | Decimal → `0xff` |
874+
| `(\d+)` | `(?='0x' + num(1) ~ x)` | Decimal → `0xff` |
879875
| `0x([0-9a-fA-F]+)` | `(?=hex2num(txt(1)))` | Hex literal → decimal |
880876
| `(\d+)` | `(?=num2rom(num(1)))` | Chapter `14` → `XIV` |
881877
| `([IVXLCDM]+)` | `(?=rom2num(txt(1)))` | Roman → decimal |
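
The parser conventions stated for `hex2num`, `bin2num`, and `oct2num` (case-insensitive input, optional matching prefix, trimmed whitespace, `NaN` on invalid characters) can be illustrated with a small standalone sketch. This is not the engine's `Base2NumFunction` body, which this diff does not show. The name `parseBase` and its signature are hypothetical, returning `NaN` for input with no digits is an assumption, and sign handling is left out because the text shown here does not specify it.

```cpp
#include <cctype>
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical helper, not part of the commit: parses `text` as an unsigned
// integer in base 2, 8, or 16 following the documented parser conventions.
double parseBase(const std::string& text, int base)
{
    const double nan = std::numeric_limits<double>::quiet_NaN();

    // Trim surrounding whitespace.
    std::size_t begin = 0;
    std::size_t end = text.size();
    while (begin < end && std::isspace(static_cast<unsigned char>(text[begin]))) ++begin;
    while (end > begin && std::isspace(static_cast<unsigned char>(text[end - 1]))) --end;

    // Skip an optional prefix matching the base (0x / 0b / 0o), in either case.
    const char prefix = (base == 16) ? 'x' : (base == 2) ? 'b' : 'o';
    if (end - begin >= 2 && text[begin] == '0'
        && std::tolower(static_cast<unsigned char>(text[begin + 1])) == prefix) {
        begin += 2;
    }

    // Assumption: no digits at all is treated as invalid input.
    if (begin == end) return nan;

    double value = 0.0;
    for (std::size_t i = begin; i < end; ++i) {
        const int c = std::tolower(static_cast<unsigned char>(text[i]));
        int digit;
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else return nan;                 // not a digit in any supported base
        if (digit >= base) return nan;   // digit not valid for the target base
        value = value * base + digit;
    }
    return value;
}
```

With this sketch, `parseBase(" 0XFF ", 16)` and `parseBase("ff", 16)` both give 255, while `parseBase("12", 2)` gives `NaN` because `2` is not a binary digit.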

help_formula_support_dark.html

Lines changed: 5 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -2017,7 +2017,7 @@ <h2>String Output</h2>
20172017
<p><code>(?=txt(1))</code> &nbsp; &mdash; emit capture 1 as text<br>
20182018
<code>(?='prefix-' + txt(1))</code> &nbsp; &mdash; concatenate literals and captures<br>
20192019
<code>(?=fname + ': ' + txt(0))</code> &nbsp; &mdash; mix string variables with captures</p>
2020-
<p>String sources include literals in single quotes, the string variables (<code>FPATH</code>, <code>FNAME</code>), <code>txt(N)</code> capture text (use <code>txt(0)</code> for the full match), <code>txtcol(...)</code> CSV columns, and the conversion functions (<code>num2hex</code>, <code>num2rom</code>, etc.). The <code>+</code> operator concatenates strings.</p>
2020+
<p>String sources include literals in single quotes, the string variables (<code>FPATH</code>, <code>FNAME</code>), <code>txt(N)</code> capture text (use <code>txt(0)</code> for the full match), <code>txtcol(...)</code> CSV columns, and the conversion functions (<code>num2rom</code>, <code>hex2num</code>, etc.). The <code>+</code> operator concatenates strings.</p>
20212021
<p>The legacy form <code>(?=return [a, b, c])</code> still compiles and behaves the same as <code>(?=a + b + c)</code>; new expressions should prefer the direct form.</p>
20222022
<section class="note">
20232023
<p><strong>String literals are ASCII-only.</strong> UTF-8 bytes in <code>'...'</code> will fail to compile. Non-ASCII text must come from the document via captures or string variables, or be placed in the literal portion of the replace string outside <code>(?=...)</code>. See <a href="#exprtk-limitations">Limitations</a> below.</p>
@@ -2100,28 +2100,13 @@ <h2>Sequence Generator</h2>
21002100

21012101
<section id="exprtk-baseconv">
21022102
<h2>Base Conversions</h2>
2103-
<p>Built-in functions for converting between decimal and hexadecimal, binary, octal, or Roman numerals. Each base has a pair of functions: <code>numXXX</code> for <em>to-string</em> output, <code>XXXnum</code> for parsing back to a number.</p>
2103+
<p>Built-in functions for parsing hexadecimal, binary, octal, or Roman numerals back to a number, plus Roman output. The decimal-to-hex/bin/oct <em>output</em> direction is handled by the format spec <code>~ x</code> / <code>~ b</code> / <code>~ o</code> (see the Number formatting section), so there is no <code>num2hex</code>-style built-in.</p>
21042104
<table class="optionsTable">
21052105
<tr>
21062106
<th>Function</th>
21072107
<th>Returns</th>
21082108
<th>Example</th>
21092109
</tr>
2110-
<tr>
2111-
<td><code>num2hex(n)</code></td>
2112-
<td>string</td>
2113-
<td><code>num2hex(255)</code> &rarr; <code>"ff"</code></td>
2114-
</tr>
2115-
<tr>
2116-
<td><code>num2bin(n)</code></td>
2117-
<td>string</td>
2118-
<td><code>num2bin(10)</code> &rarr; <code>"1010"</code></td>
2119-
</tr>
2120-
<tr>
2121-
<td><code>num2oct(n)</code></td>
2122-
<td>string</td>
2123-
<td><code>num2oct(511)</code> &rarr; <code>"777"</code></td>
2124-
</tr>
21252110
<tr>
21262111
<td><code>num2rom(n)</code></td>
21272112
<td>string</td>
@@ -2150,10 +2135,9 @@ <h2>Base Conversions</h2>
21502135
</table>
21512136
<p><strong>Output conventions:</strong></p>
21522137
<ul>
2153-
<li>Hex/bin/oct outputs are <strong>bare lowercase</strong> &mdash; no <code>0x</code> / <code>0b</code> / <code>0o</code> prefix. Compose prefixes if you want them: <code>(?='0x' + num2hex(num(1)))</code>.</li>
2138+
<li>Hex/bin/oct output via <code>~ x</code> / <code>~ b</code> / <code>~ o</code> is <strong>bare lowercase</strong> &mdash; no <code>0x</code> / <code>0b</code> / <code>0o</code> prefix. Compose prefixes if you want them: <code>(?='0x' + num(1) ~ x)</code>.</li>
21542139
<li>Roman output is <strong>uppercase canonical</strong> form (subtractive pairs <code>IV</code>, <code>IX</code>, <code>XL</code>, etc.).</li>
2155-
<li>Negative inputs to the bases produce <code>"-&lt;digits&gt;"</code> (e.g. <code>num2hex(-15)</code> &rarr; <code>"-f"</code>). For Roman, only the range 1..3999 is meaningful; out of range returns an empty string.</li>
2156-
<li>Float inputs truncate toward zero (<code>num2hex(15.7) == num2hex(15)</code>).</li>
2140+
<li>For Roman, only the range 1..3999 is meaningful; out of range returns an empty string.</li>
21572141
</ul>
21582142
<p><strong>Parser conventions:</strong></p>
21592143
<ul>
@@ -2170,7 +2154,7 @@ <h2>Base Conversions</h2>
21702154
</tr>
21712155
<tr>
21722156
<td><code>(\d+)</code></td>
2173-
<td><code>(?='0x' + num2hex(num(1)))</code></td>
2157+
<td><code>(?='0x' + num(1) ~ x)</code></td>
21742158
<td>Decimal &rarr; <code>0xff</code></td>
21752159
</tr>
21762160
<tr>

help_formula_support_light.html

Lines changed: 5 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -2017,7 +2017,7 @@ <h2>String Output</h2>
20172017
<p><code>(?=txt(1))</code> &nbsp; &mdash; emit capture 1 as text<br>
20182018
<code>(?='prefix-' + txt(1))</code> &nbsp; &mdash; concatenate literals and captures<br>
20192019
<code>(?=fname + ': ' + txt(0))</code> &nbsp; &mdash; mix string variables with captures</p>
2020-
<p>String sources include literals in single quotes, the string variables (<code>FPATH</code>, <code>FNAME</code>), <code>txt(N)</code> capture text (use <code>txt(0)</code> for the full match), <code>txtcol(...)</code> CSV columns, and the conversion functions (<code>num2hex</code>, <code>num2rom</code>, etc.). The <code>+</code> operator concatenates strings.</p>
2020+
<p>String sources include literals in single quotes, the string variables (<code>FPATH</code>, <code>FNAME</code>), <code>txt(N)</code> capture text (use <code>txt(0)</code> for the full match), <code>txtcol(...)</code> CSV columns, and the conversion functions (<code>num2rom</code>, <code>hex2num</code>, etc.). The <code>+</code> operator concatenates strings.</p>
20212021
<p>The legacy form <code>(?=return [a, b, c])</code> still compiles and behaves the same as <code>(?=a + b + c)</code>; new expressions should prefer the direct form.</p>
20222022
<section class="note">
20232023
<p><strong>String literals are ASCII-only.</strong> UTF-8 bytes in <code>'...'</code> will fail to compile. Non-ASCII text must come from the document via captures or string variables, or be placed in the literal portion of the replace string outside <code>(?=...)</code>. See <a href="#exprtk-limitations">Limitations</a> below.</p>
@@ -2100,28 +2100,13 @@ <h2>Sequence Generator</h2>
21002100

21012101
<section id="exprtk-baseconv">
21022102
<h2>Base Conversions</h2>
2103-
<p>Built-in functions for converting between decimal and hexadecimal, binary, octal, or Roman numerals. Each base has a pair of functions: <code>numXXX</code> for <em>to-string</em> output, <code>XXXnum</code> for parsing back to a number.</p>
2103+
<p>Built-in functions for parsing hexadecimal, binary, octal, or Roman numerals back to a number, plus Roman output. The decimal-to-hex/bin/oct <em>output</em> direction is handled by the format spec <code>~ x</code> / <code>~ b</code> / <code>~ o</code> (see the Number formatting section), so there is no <code>num2hex</code>-style built-in.</p>
21042104
<table class="optionsTable">
21052105
<tr>
21062106
<th>Function</th>
21072107
<th>Returns</th>
21082108
<th>Example</th>
21092109
</tr>
2110-
<tr>
2111-
<td><code>num2hex(n)</code></td>
2112-
<td>string</td>
2113-
<td><code>num2hex(255)</code> &rarr; <code>"ff"</code></td>
2114-
</tr>
2115-
<tr>
2116-
<td><code>num2bin(n)</code></td>
2117-
<td>string</td>
2118-
<td><code>num2bin(10)</code> &rarr; <code>"1010"</code></td>
2119-
</tr>
2120-
<tr>
2121-
<td><code>num2oct(n)</code></td>
2122-
<td>string</td>
2123-
<td><code>num2oct(511)</code> &rarr; <code>"777"</code></td>
2124-
</tr>
21252110
<tr>
21262111
<td><code>num2rom(n)</code></td>
21272112
<td>string</td>
@@ -2150,10 +2135,9 @@ <h2>Base Conversions</h2>
21502135
</table>
21512136
<p><strong>Output conventions:</strong></p>
21522137
<ul>
2153-
<li>Hex/bin/oct outputs are <strong>bare lowercase</strong> &mdash; no <code>0x</code> / <code>0b</code> / <code>0o</code> prefix. Compose prefixes if you want them: <code>(?='0x' + num2hex(num(1)))</code>.</li>
2138+
<li>Hex/bin/oct output via <code>~ x</code> / <code>~ b</code> / <code>~ o</code> is <strong>bare lowercase</strong> &mdash; no <code>0x</code> / <code>0b</code> / <code>0o</code> prefix. Compose prefixes if you want them: <code>(?='0x' + num(1) ~ x)</code>.</li>
21542139
<li>Roman output is <strong>uppercase canonical</strong> form (subtractive pairs <code>IV</code>, <code>IX</code>, <code>XL</code>, etc.).</li>
2155-
<li>Negative inputs to the bases produce <code>"-&lt;digits&gt;"</code> (e.g. <code>num2hex(-15)</code> &rarr; <code>"-f"</code>). For Roman, only the range 1..3999 is meaningful; out of range returns an empty string.</li>
2156-
<li>Float inputs truncate toward zero (<code>num2hex(15.7) == num2hex(15)</code>).</li>
2140+
<li>For Roman, only the range 1..3999 is meaningful; out of range returns an empty string.</li>
21572141
</ul>
21582142
<p><strong>Parser conventions:</strong></p>
21592143
<ul>
@@ -2170,7 +2154,7 @@ <h2>Base Conversions</h2>
21702154
</tr>
21712155
<tr>
21722156
<td><code>(\d+)</code></td>
2173-
<td><code>(?='0x' + num2hex(num(1)))</code></td>
2157+
<td><code>(?='0x' + num(1) ~ x)</code></td>
21742158
<td>Decimal &rarr; <code>0xff</code></td>
21752159
</tr>
21762160
<tr>

src/engine/ExprTkEngine.cpp

Lines changed: 7 additions & 66 deletions
Original file line numberDiff line numberDiff line change
@@ -98,9 +98,6 @@ namespace MultiReplaceEngine {
9898
, _nowFunction()
9999
, _todayFunction()
100100
, _todateFunction(this)
101-
, _num2hexFunction(this, 16)
102-
, _num2binFunction(this, 2)
103-
, _num2octFunction(this, 8)
104101
, _hex2numFunction(this, 16)
105102
, _bin2numFunction(this, 2)
106103
, _oct2numFunction(this, 8)
@@ -201,13 +198,11 @@ namespace MultiReplaceEngine {
201198
// Returns Unix timestamp on success, NaN on parse failure.
202199
_symbolTable.add_function("todate", _todateFunction);
203200

204-
// Base conversions: numeric <-> hex/bin/oct as built-ins. num2X
205-
// returns a bare lowercase string; X2num accepts the input case-
206-
// insensitively, with or without the matching prefix, NaN on
207-
// invalid characters.
208-
_symbolTable.add_function("num2hex", _num2hexFunction);
209-
_symbolTable.add_function("num2bin", _num2binFunction);
210-
_symbolTable.add_function("num2oct", _num2octFunction);
201+
// Base parsing: hex/bin/oct string -> numeric as built-ins. X2num
202+
// accepts the input case-insensitively, with or without the
203+
// matching prefix, NaN on invalid characters. The reverse
204+
// direction (numeric -> base string) is covered by the format
205+
// spec '~ x' / '~ b' / '~ o'.
211206
_symbolTable.add_function("hex2num", _hex2numFunction);
212207
_symbolTable.add_function("bin2num", _bin2numFunction);
213208
_symbolTable.add_function("oct2num", _oct2numFunction);
@@ -1311,62 +1306,8 @@ namespace MultiReplaceEngine {
13111306
return 0;
13121307
}
13131308

1314-
// Renders a non-negative integer in the given base as lowercase
1315-
// ASCII. Always emits at least "0" so num2hex(0) -> "0".
1316-
std::string formatUnsignedBase(unsigned long long v, int base)
1317-
{
1318-
if (v == 0) return std::string("0");
1319-
1320-
static const char kDigits[] = "0123456789abcdef";
1321-
char buf[80];
1322-
std::size_t pos = sizeof(buf);
1323-
while (v > 0) {
1324-
buf[--pos] = kDigits[v % static_cast<unsigned>(base)];
1325-
v /= static_cast<unsigned>(base);
1326-
}
1327-
return std::string(buf + pos, sizeof(buf) - pos);
1328-
}
1329-
13301309
} // anonymous namespace
13311310

1332-
double ExprTkEngine::Num2BaseFunction::operator()(
1333-
std::string& result,
1334-
parameter_list_t parameters)
1335-
{
1336-
result.clear();
1337-
1338-
if (parameters.size() != 1) {
1339-
return 0.0;
1340-
}
1341-
const scalar_t s(parameters[0]);
1342-
const double v = s();
1343-
1344-
// NaN or Inf -> empty string. Same recoverable-error contract
1345-
// as formatDouble(): never let "nan"/"inf" leak into output.
1346-
if (!std::isfinite(v)) {
1347-
return 0.0;
1348-
}
1349-
1350-
// Truncate toward zero so num2hex(15.7) == num2hex(15) and
1351-
// num2hex(-15.7) == num2hex(-15). Matches (int) cast semantics.
1352-
const long long signedVal = static_cast<long long>(v);
1353-
const bool negative = signedVal < 0;
1354-
const unsigned long long magnitude = negative
1355-
? static_cast<unsigned long long>(-(signedVal + 1)) + 1ULL // safe abs for INT64_MIN
1356-
: static_cast<unsigned long long>(signedVal);
1357-
1358-
std::string body = formatUnsignedBase(magnitude, _base);
1359-
if (negative) {
1360-
result.reserve(body.size() + 1);
1361-
result.push_back('-');
1362-
result.append(body);
1363-
}
1364-
else {
1365-
result = std::move(body);
1366-
}
1367-
return 0.0;
1368-
}
1369-
13701311
double ExprTkEngine::Base2NumFunction::operator()(
13711312
parameter_list_t parameters)
13721313
{
@@ -1474,8 +1415,8 @@ namespace MultiReplaceEngine {
14741415
if (!std::isfinite(v)) return 0.0;
14751416

14761417
// Range check first: classical Roman covers 1..3999. Out of
1477-
// range -> empty string (same recoverable-error pattern as
1478-
// num2hex(NaN)).
1418+
// range -> empty string (same recoverable-error pattern as the
1419+
// non-finite guard above).
14791420
const long long iv = static_cast<long long>(v);
14801421
if (iv < 1 || iv > 3999) return 0.0;
14811422
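
For reference, the removed `Num2BaseFunction::operator()` computed the magnitude of a negative value as `-(signedVal + 1)` cast to unsigned, plus `1ULL`, and commented it as a safe abs for `INT64_MIN`. The sketch below isolates that expression to show why it avoids negating the minimum value directly, which would overflow a signed 64-bit integer. The function name `magnitudeOf` is hypothetical and does not appear in the commit.

```cpp
#include <climits>

// Hypothetical standalone helper mirroring the expression in the removed code.
// For negative v, (v + 1) is always representable, so negating it cannot
// overflow; adding 1 after the unsigned cast restores the true magnitude.
unsigned long long magnitudeOf(long long v)
{
    return v < 0
        ? static_cast<unsigned long long>(-(v + 1)) + 1ULL
        : static_cast<unsigned long long>(v);
}
```

For `LLONG_MIN` on a 64-bit `long long`, `-(v + 1)` equals `LLONG_MAX`, and adding one in unsigned arithmetic yields 2^63, which has no signed representation.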

src/engine/ExprTkEngine.h

Lines changed: 10 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -539,7 +539,7 @@ namespace MultiReplaceEngine {
539539
// Parses str against the strftime-style fmt and returns the
540540
// resulting time as seconds-since-epoch. With a leading '!'
541541
// in fmt the result is treated as UTC; otherwise as local
542-
// time (Lua convention, matches our D[...] output spec).
542+
// time (Lua convention, matches our d: output spec).
543543
//
544544
// Returns NaN on parse failure or out-of-range fields, so a
545545
// bad input flows through the same recoverable-error dialog
@@ -565,37 +565,15 @@ namespace MultiReplaceEngine {
565565
ExprTkEngine* _owner;
566566
};
567567

568-
// Base-conversion built-ins: num <-> hex/bin/oct.
568+
// Base-parsing built-in: hex/bin/oct string -> numeric.
569569
//
570-
// num2X(n) takes a scalar, returns a bare lowercase string
571-
// ("ff", "1010", "77") - no "0x" / "0b" / "0o" prefix.
572-
// Negative inputs come out as "-f". Float inputs are
573-
// truncated toward zero before conversion.
574570
// X2num(s) takes a string, returns a scalar. Accepts the input
575571
// case-insensitively and with or without the matching
576572
// prefix; surrounding whitespace is trimmed.
577573
// Invalid characters for the base yield NaN.
578-
class Num2BaseFunction : public exprtk::igeneric_function<double> {
579-
public:
580-
using igenfunct_t = exprtk::igeneric_function<double>;
581-
using generic_t = typename igenfunct_t::generic_type;
582-
using parameter_list_t = typename igenfunct_t::parameter_list_t;
583-
using scalar_t = typename generic_t::scalar_view;
584-
585-
Num2BaseFunction(ExprTkEngine* owner, int base)
586-
: igenfunct_t("T", igenfunct_t::e_rtrn_string)
587-
, _owner(owner)
588-
, _base(base) {
589-
}
590-
591-
double operator()(std::string& result,
592-
parameter_list_t parameters) override;
593-
594-
private:
595-
ExprTkEngine* _owner;
596-
int _base;
597-
};
598-
574+
//
575+
// The reverse direction (numeric -> base string) is provided by
576+
// the format spec '~ x' / '~ b' / '~ o', not a built-in.
599577
class Base2NumFunction : public exprtk::igeneric_function<double> {
600578
public:
601579
using igenfunct_t = exprtk::igeneric_function<double>;
@@ -940,15 +918,13 @@ namespace MultiReplaceEngine {
940918
TodayFunction _todayFunction;
941919

942920
// The todate(str, fmt) callable for string-to-timestamp
943-
// parsing - the inverse of D[fmt] output.
921+
// parsing - the inverse of d:fmt output.
944922
TodateFunction _todateFunction;
945923

946-
// Base-conversion built-ins. Two parameterised templates serve
947-
// hex/bin/oct in both directions; each instance carries its base
948-
// (16/2/8) so the same operator() logic handles all six names.
949-
Num2BaseFunction _num2hexFunction;
950-
Num2BaseFunction _num2binFunction;
951-
Num2BaseFunction _num2octFunction;
924+
// Base-parsing built-ins: hex/bin/oct string -> numeric. The
925+
// parameterised class carries its base (16/2/8) so one operator()
926+
// handles all three names. The reverse direction (numeric -> base
927+
// string) is covered by the format spec '~ x' / '~ b' / '~ o'.
952928
Base2NumFunction _hex2numFunction;
953929
Base2NumFunction _bin2numFunction;
954930
Base2NumFunction _oct2numFunction;
