
Fix numeric FORMAT ~F and ~E ANSI test failures #744

Open
blakemcbride wants to merge 1 commit into armedbear:master from blakemcbride:format-numeric

Fix numeric FORMAT ~F and ~E ANSI test failures

Summary

Repairs the ~F (fixed-format) and ~E (exponential) directive
implementations in format.lisp. Fourteen ANSI conformance failures
in this area now pass, and no previously-passing test regresses.

Test                Directive
FORMAT.F.5          ~F
FORMAT.F.8          ~F
FORMAT.F.45         ~F
FORMAT.F.46         ~F
FORMAT.F.46B        ~F
FORMAT.F.47         ~F
FORMATTER.F.45      ~F
FORMATTER.F.46      ~F
FORMATTER.F.46B     ~F
FORMATTER.F.47      ~F
FORMAT.E.1          ~E
FORMAT.E.2          ~E
FORMAT.E.3          ~E
FORMAT.E.26         ~E

Root causes

The old implementation computed the scale exponent and rounded digits
using long-float arithmetic via decode-float / log / expt.
That path:

  1. Lost precision on extreme magnitudes. For inputs like
    2.9085037515399494d185, (* exponent (log 2l0 10)) plus the
    subsequent scale-by-(expt 10 …) dropped low-order significant
    digits; the rendered mantissa no longer round-tripped.
  2. Rounded with binary-float multiplication/division, so the ~F
    and ~E outputs disagreed with prin1 on boundary cases such as
    0.999… → 1.000, and on the corresponding carry-out of the
    exponent.
  3. Mishandled the ~F leading/trailing-zero reconciliation when
    the formatted value was a bare "." or when width pressure forced
    a choice between the leading 0 (making 0.xxx a float token) and
    the trailing 0 (making xxx.0 a float token). Tests like
    FORMAT.F.45/.46/.47 require preferring the trailing zero when
    both exist but only one fits, since .0 reads as a float while
    0. reads as an integer.
  4. Emitted a lowercase e exponent marker even when the float's
    type matched *read-default-float-format*. ABCL's prin1 emits
    uppercase E in that case (inherited from Java's
    Double.toString), so FORMAT.E.1 and FORMAT.E.2, which compare
    (format nil "~e" x) against prin1-to-string with string=,
    failed on every in-range sample.

Changes

1. scale-exponent rewritten with exact rational arithmetic

The new version estimates the base-10 exponent from the bit-lengths
of (rational x)'s numerator and denominator, then refines up or
down with integer expt. The mantissa is computed exactly and rounded
only once, by the final coerce:

(values (coerce (/ (rational original-x) (expt 10 ex))
                (type-of original-x))
        ex)

No transcendental calls remain in the hot path; extreme doubles and
denormals are handled uniformly.
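
As a rough illustration of the approach described above (not the code in this PR), the bit-length estimate plus integer refinement might look like the following. The name exact-scale-exponent and the 1233/4096 approximation of log10(2) are assumptions made for this sketch, and it returns the mantissa as an exact rational rather than coercing it back to a float.

```lisp
;; Hypothetical sketch, not the PR's scale-exponent.
;; Assumes X is a positive, finite float.
(defun exact-scale-exponent (x)
  "Return (values m ex) with x = m * 10^ex and 1/10 <= m < 1.
M is an exact rational."
  (let* ((r (rational x))
         ;; Difference of bit-lengths is within 1 of log2(r).
         (bits (- (integer-length (numerator r))
                  (integer-length (denominator r))))
         ;; 1233/4096 is a rational approximation of log10(2);
         ;; the two loops below correct any estimation error.
         (ex (floor (* bits 1233) 4096)))
    ;; Refine upward while the mantissa would be >= 1.
    (loop while (>= r (expt 10 ex)) do (incf ex))
    ;; Refine downward while the mantissa would be < 1/10.
    (loop while (< r (expt 10 (1- ex))) do (decf ex))
    (values (/ r (expt 10 ex)) ex)))
```

For example, (exact-scale-exponent 5.0) returns 1/2 and 1, since 5 = 1/2 * 10^1. Because expt on integer arguments yields integers or exact ratios, no rounding occurs anywhere in the function.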

2. New helper shortest-digits-and-exponent

For the d = NIL branch of ~E (where CLHS 22.3.3.2 requires the
shortest round-trip fraction), we take the trimmed digits and the
implied exponent directly from sys::float-string (which already
produces Java's shortest round-trip representation). This bypasses
scale-exponent and the subsequent rounding step, which is where the
precision was lost.
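
To make the idea concrete, here is a minimal portable sketch of pulling trimmed digits and an exponent out of an already-shortest decimal string. It does not call sys::float-string; the function name digits-and-exponent-from-string and the assumed input shapes ("d.dddE±n" or "ddd.ddd", non-negative) are hypothetical and may not match what the ABCL internal actually returns.

```lisp
;; Hypothetical sketch: parse a shortest-representation string such as
;; "2.9085037515399494E185" or "0.05" (non-negative values only).
(defun digits-and-exponent-from-string (s)
  "Return (values digits ex) such that the value is 0.DIGITS * 10^EX."
  (let* ((epos (position #\E s :test #'char-equal))
         (mant (subseq s 0 epos))
         ;; parse-integer accepts an optional sign.
         (exp10 (if epos (parse-integer s :start (1+ epos)) 0))
         ;; Number of characters before the decimal point.
         (dot (or (position #\. mant) (length mant)))
         (raw (remove #\. mant))
         ;; Index of the first nonzero digit.
         (lead (or (position-if (lambda (c) (char/= c #\0)) raw)
                   (length raw)))
         (digits (string-right-trim "0" (subseq raw lead))))
    (values (if (string= digits "") "0" digits)
            (+ exp10 (- dot lead)))))
```

With the PR's large-magnitude example, "2.9085037515399494E185" yields the digits "29085037515399494" and exponent 186, that is, 0.29085037515399494 * 10^186.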

3. New helper exact-round-digits

For the d-given branch, rounding is done on the exact rational:

(let* ((scaled (* r (expt 10 (- n-sig ex))))
       (int (round scaled)))
  (when (>= int (expt 10 n-sig))
    ;; 0.999... -> 1.000 carry-out: bump exponent.
    (setf int (/ int 10))
    (incf ex))
  (values (format nil "~v,'0d" n-sig int) ex))

The carry-out branch is what makes (format nil "~,1,,0e" 9.99d0)
produce "0.1e+2" instead of "0.10e+1".
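
A self-contained variant of the fragment above, written only to show the carry-out in isolation; the name round-digits-sketch and the argument convention (R = m * 10^EX with 1/10 <= m < 1) are assumptions of this sketch, not necessarily the signature of exact-round-digits in the patch.

```lisp
;; Hypothetical standalone version of the rounding step.
;; R is a positive rational, R = m * 10^EX with 1/10 <= m < 1.
(defun round-digits-sketch (r ex n-sig)
  "Return (values digit-string exponent) for N-SIG significant digits."
  ;; ROUND is exact on rationals; ties go to the even integer.
  (let ((int (round (* r (expt 10 (- n-sig ex))))))
    (when (>= int (expt 10 n-sig))
      ;; Rounding produced 10^n-sig (e.g. 9.99 -> 10 at one digit):
      ;; drop the extra digit and bump the exponent.
      (setf int (/ int 10))
      (incf ex))
    (values (format nil "~v,'0d" n-sig int) ex)))
```

Calling (round-digits-sketch 999/100 1 1) returns "1" and 2, which corresponds to the mantissa 0.1 with exponent +2 in the example above, while (round-digits-sketch 999/100 1 3) returns "999" and 1 with no carry.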

4. format-exp-aux rewritten to use the helpers

The new body:

  • Picks the digit count from d and k (n-sig = 1+d for k>0,
    d+k otherwise) and delegates to shortest-digits-and-exponent
    or exact-round-digits.
  • Lays out the mantissa around the decimal point based on the scale
    factor k:
    • k > 0: split digits at position k
    • k = 0: ".DDD" with a leading-zero flag
    • k < 0: "." + |k| zeros + digits with a leading-zero flag
  • Tracks lpoint / tpoint flags and reconciles them against width
    in the same style as the old code, but now consistent with the
    ~F fix in point 5 below.
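
The three layout cases can be sketched as a small string function. This is only an illustration under stated assumptions: layout-mantissa is a hypothetical name, DIGITS is assumed to hold at least k digits when k > 0, and the second return value stands in for the leading-zero (lpoint) flag; width handling is omitted.

```lisp
;; Hypothetical sketch of the mantissa layout by scale factor K.
(defun layout-mantissa (digits k)
  "Return (values string leading-point-p) for digit string DIGITS."
  (if (plusp k)
      ;; k > 0: K digits before the point, the rest after it.
      (values (concatenate 'string
                           (subseq digits 0 k) "." (subseq digits k))
              nil)
      ;; k <= 0: point first, then |k| zeros, then the digits.
      (values (concatenate 'string
                           "."
                           (make-string (- k) :initial-element #\0)
                           digits)
              t)))
```

For instance, (layout-mantissa "12345" 2) gives "12.345", (layout-mantissa "5" 0) gives ".5", and (layout-mantissa "5" -1) gives ".05"; the last two also return T to signal that a leading zero may be added if the width allows.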

5. flonum-to-string + format-fixed-aux trailing-zero handling

Two coupled fixes to the ~F path:

  • flonum-to-string (around line 234): when shortening an over-wide
    digit string and the integer part is "0", the caller will prepend
    a leading "0" that consumes one column of width. Compute
    effective-width = (max (1- width) 0) before deciding how many
    trailing zeros to drop. Without this, ~F width calculations
    over-estimated the space available and produced invalid tokens
    (e.g. "0.") for values like 0.05 at width 4.
  • format-fixed-aux (around line 2264): rewritten lpoint/tpoint
    reconciliation. When both flags are set and only one column
    remains, prefer the trailing zero for tpoint (yielding ".0", a
    float token) over the leading zero for lpoint (which would yield
    "0.", an integer token). When only one flag is set, add its zero
    if the width allows; a bare "." is never emitted.
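
The zero-reconciliation rule can be sketched roughly as follows. This is one plausible reading of the description, not the patched format-fixed-aux: reconcile-zeros is a hypothetical name, STR is assumed to contain exactly one decimal point, and padding to the full width is left out.

```lisp
;; Hypothetical sketch of leading/trailing-zero reconciliation.
;; WIDTH is the field width, or NIL when unconstrained.
(defun reconcile-zeros (str width)
  (let* ((lpoint (char= (char str 0) #\.))                  ; ".xxx"
         (tpoint (char= (char str (1- (length str))) #\.))  ; "xxx."
         ;; Free columns; unconstrained width leaves room for both zeros.
         (room (if width (- width (length str)) 2)))
    (cond ((and lpoint tpoint)
           ;; STR is a bare ".": never emit it as-is. With room for only
           ;; one zero, prefer the trailing one so the token reads as a
           ;; float.
           (if (>= room 2) "0.0" ".0"))
          (lpoint (if (>= room 1) (concatenate 'string "0" str) str))
          (tpoint (if (>= room 1) (concatenate 'string str "0") str))
          (t str))))
```

Under these assumptions, (reconcile-zeros "." 2) returns ".0" rather than "0.", while (reconcile-zeros "." nil) returns "0.0" and (reconcile-zeros ".5" 2) leaves ".5" unchanged.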

6. format-exponent-marker default case uppercased

format-exponent-marker returns the marker character when the
float's type matches *read-default-float-format*. Changed from
#\e to #\E so it matches prin1's output (which inherits
uppercase E from String.valueOf(double)). The explicit-type
markers (#\f, #\d, #\s, #\l) stay lowercase, which also
matches prin1.
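
For readers unfamiliar with the function, the marker selection has roughly this shape. The sketch below is written from the description above under a hypothetical name (exponent-marker-sketch); it is not copied from format.lisp.

```lisp
;; Hypothetical sketch of the exponent-marker choice after this change.
(defun exponent-marker-sketch (x)
  (if (typep x *read-default-float-format*)
      ;; Default-format floats now get the uppercase marker.
      #\E
      ;; SINGLE-FLOAT and DOUBLE-FLOAT come first because SHORT-FLOAT
      ;; and LONG-FLOAT may be the same types in some implementations.
      (etypecase x
        (single-float #\f)
        (double-float #\d)
        (short-float #\s)
        (long-float #\l))))
```

With *read-default-float-format* bound to single-float, 1.0f0 maps to #\E and 1.0d0 to #\d; binding it to double-float swaps the roles, giving #\f and #\E respectively.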

Files changed

  • src/org/armedbear/lisp/format.lisp — the only file touched.

Test plan

  • ant abcl builds cleanly.
  • Full ANSI suite
    (asdf:test-system :abcl/test/ansi/compiled) goes from
    52 → 50 unexpected failures; the two retired failures are
    FORMAT.E.1 and FORMAT.E.2. No FORMAT.F.*,
    FORMATTER.F.*, FORMAT.E.* tests remain on the failure list.
  • Regression guards: FORMAT.F.3 through FORMAT.F.44, FORMAT.E.4
    through FORMAT.E.25, FORMAT.E.27 through FORMAT.E.29, and
    FORMATTER.F.3 through FORMATTER.F.44 continue to pass with
    mixed-case string-equal comparisons.
  • Manual round-trip spot checks:
    - (format nil "~e" 2.9085037515399494d185) →
    "2.9085037515399494E+185" (matches prin1).
    - (format nil "~,1,,0e" 9.99d0) → "0.1E+2" (carry-out).
    - (format nil "~,2,,-1e" 5.0) → "0.05E+2" (k<0 layout).

Compatibility

No public API change. FORMAT and FORMATTER become stricter in
the directions CLHS requires:

  • ~E output is now rounded exactly, so previously-incorrect
    mantissas on large-magnitude doubles will change (toward the
    round-trip-correct form).
  • ~E default exponent marker is now uppercase E for
    default-type floats, matching prin1. Callers that relied on the
    previous lowercase e for those floats will see a case change in
    output.
  • ~F output at tight widths now prefers trailing-zero tokens
    (.0) over leading-zero tokens (0.) when forced to choose,
    matching CLHS's requirement that the result remain readable as a
    float.
