Skip to content

Adding 5 tests that were failing on me: [ wtq-00080, wtq-00085, wtq-00113, wtq-00162, wtq-00168]#9

Merged
ronibhakta1 merged 7 commits into
ZON-Format:mainfrom
P6rguVyrst:add-add-failure-scenatios
Apr 23, 2026
Merged

Adding 5 tests that were failing on me: [ wtq-00080, wtq-00085, wtq-00113, wtq-00162, wtq-00168]#9
ronibhakta1 merged 7 commits into
ZON-Format:mainfrom
P6rguVyrst:add-add-failure-scenatios

Conversation

@P6rguVyrst
Copy link
Copy Markdown

@P6rguVyrst P6rguVyrst commented Apr 14, 2026

Summary

Three round-trip bugs fixed, discovered via accuracy testing with WTQ benchmark datasets.

Bug Fixes

1. Boolean-like dictionary keys converted to booleans

Problem: {"f": 1} decoded as {False: 1}.

Fix: Encoder now quotes keys matching boolean/null keywords (t, f, true, false, null, none, nil). Decoder uses a new parse_key() function that preserves keys as strings.

2. Dictionary header values not quoted

Problem: Values containing commas, colons, or special characters were truncated at delimiters.

Fix: Uses _format_value() for consistent quoting of dictionary compression header values.

3. Float columns corrupted by delta encoding

Problem: Delta encoding violated spec §2.3's round-trip requirement. IEEE-754 arithmetic prev + (cur - prev) doesn't recover original bit patterns, and round(diff, 10) compounded the loss.

Example: 1865.43 decoded as 1865.4299999999994

Fix: Restrict SparseMode.DELTA to int-only columns. Float columns use standard value encoding which round-trips exactly via Python's shortest-round-trip str(float).

Regression tests cover: benchmark values, math.pi/math.e, 0.1+0.2, extreme exponents, negatives.

Testing Context

Added 5 WTQ datasets that were failing accuracy tests with encode()/decode().

Model used: llama3.1:8b-instruct-q4_K_M (localhost via ollama)

Limitation: No AZURE_OPENAI_* env vars available to run against the zon repo accuracy suite.

Test data sources:

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 14, 2026

📝 Walkthrough

Walkthrough

Adds five benchmark JSON datasets, new unit tests (dynamic roundtrip, boolean-key, delta float), and updates core parsing/encoding: introduces parse_key, switches key parsing to use it, tightens sparse/delta logic, formats dictionary values, and improves key quoting.

Changes

Cohort / File(s) Summary
Benchmark Data Files
zon-format/benchmarks/data/wtq-00080.json, zon-format/benchmarks/data/wtq-00085.json, zon-format/benchmarks/data/wtq-00113.json, zon-format/benchmarks/data/wtq-00162.json, zon-format/benchmarks/data/wtq-00168.json
Added five static JSON datasets (rally/event records, catalog variants, competition results). Data-only additions; no code/API changes.
Dynamic Roundtrip Tests
zon-format/tests/unit/test_roundtrip_benchmarks.py
New unittest module that discovers benchmarks/data/*.json (excludes filenames containing questions) and dynamically attaches test_roundtrip_<dataset> methods which run zon.encodezon.decode and assert JSON equality. Raises if no datasets found.
Boolean-key Unit Tests
zon-format/tests/unit/test_boolean_keys.py
New test suite TestBooleanLikeKeys covering keys that resemble boolean/null tokens to ensure keys remain strings through encode/decode roundtrips.
Delta/Float Unit Test
zon-format/tests/unit/test_delta.py
Added test_float_column_roundtrip_is_lossless to assert float values round-trip bit-exactly via encoder/decoder. Minor whitespace edits elsewhere.
Core Utils
zon-format/src/zon/core/utils.py
Added parse_key(val: str) -> str: trims whitespace, decodes quoted strings with JSON (with CSV-style escape fallback), and always returns a string (does not coerce boolean/null).
Core Decoder
zon-format/src/zon/core/decoder.py
Switched object key parsing to use parse_key(...) instead of parse_value(...) in node parsing branches; value parsing unchanged.
Core Encoder
zon-format/src/zon/core/encoder.py
Changed sparse-mode detection to allow DELTA only for int-only columns (excluding bool); format dictionary table values via _format_value(...) before joining; enhance inline key quoting to also quote boolean/null-like keys.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I nibbled JSON leaves all night,
Five new crumbs tucked out of sight,
Keys now parsed with careful art,
Encoders tuned to play their part,
I hop — tests pass — a tiny rabbit's delight!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title specifically mentions adding 5 test cases with their exact identifiers (wtq-00080, wtq-00085, wtq-00113, wtq-00162, wtq-00168), which directly corresponds to the new JSON data files and test infrastructure added in the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 91.67% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (4)
zon-format/benchmarks/data/wtq-00113.json (1)

39-41: Normalize missing-value representation for benchmark stability.

Using "-" here while other missing fields use empty strings introduces schema inconsistency and can skew encode/decode retrieval behavior. Prefer one sentinel ("" or null) across all records.

Proposed normalization (empty strings)
-    "Driver": "-",
-    "Co-driver": "-",
-    "Car": "-",
+    "Driver": "",
+    "Co-driver": "",
+    "Car": "",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/benchmarks/data/wtq-00113.json` around lines 39 - 41, The JSON
record uses "-" as a missing-value sentinel for the fields "Driver",
"Co-driver", and "Car", which breaks schema consistency; update those three
values from "-" to the normalized sentinel (use empty string "") so they match
other records' missing-value representation and avoid encoding/decoding
inconsistencies.
zon-format/benchmarks/data/wtq-00162.json (1)

15-215: Normalize distance formatting in Event values for benchmark stability.

Event mixes "5000 m" and "10,000 m". For retrieval-accuracy fixtures, this can create avoidable mismatches in tokenization and exact-match assertions. Use one canonical format across all rows (e.g., always "5,000 m" / "10,000 m" or always uncommaed).

♻️ Suggested normalization example
-    "Event": "5000 m",
+    "Event": "5,000 m",

Apply consistently to every 5000 m entry in this file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/benchmarks/data/wtq-00162.json` around lines 15 - 215, The Event
field mixes distance formats (e.g., "5000 m" vs "10,000 m") which hurts
benchmark stability; pick a single canonical format (suggestion: remove
thousands separators so use "5000 m" and "10000 m", or alternatively use comma
style "5,000 m" and "10,000 m") and apply it consistently to every "5000 m" and
"10,000 m" entry in this JSON (search for the "Event" keys and replace all
occurrences so all rows use the chosen canonical form).
zon-format/benchmarks/data/wtq-00085.json (1)

4-4: Standardize Date and Format field values for consistency.

This benchmark fixture mixes date granularities (YYYY, Month YYYY, Month DD YYYY) and format casing/tokenization patterns. If the evaluator normalizes or compares these values directly, heterogeneity could introduce spurious variance in benchmark results. Consider standardizing these fields to a single format (e.g., YYYY-MM-DD for dates, consistent casing for formats) or documenting that the evaluator handles this variation intentionally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/benchmarks/data/wtq-00085.json` at line 4, Normalize the "Date"
and "Format" fields in this JSON fixture: replace the human-readable "Date"
value ("April 19, 2004") with a consistent ISO-like representation (e.g.,
"2004-04-19" or "2004" depending on required granularity used across fixtures)
and standardize the "Format" token/casing to match the project convention (e.g.,
all-lowercase or TitleCase) so all benchmark fixtures use the same date format
and format-token style; update the entry keys "Date" and "Format" in
wtq-00085.json accordingly and ensure it follows the same pattern used by other
fixtures.
zon-format/benchmarks/data/wtq-00168.json (1)

7-7: Replace non-breaking spaces in Event values with normal spaces

Many Event strings use NBSP (\u00A0) instead of ASCII spaces (e.g., 20 km walk). This can cause subtle encode/decode mismatches and brittle string comparisons in tests.

Consider normalizing all Event values to plain spaces (20 km walk, 50 km walk) across benchmark files for deterministic matching.

Also applies to: 15-15, 31-31, 39-39, 47-47, 55-55, 63-63, 71-71, 79-79, 87-87, 95-95, 103-103, 111-111, 119-119, 127-127, 135-135, 143-143, 151-151, 159-159, 167-167, 175-175, 183-183, 191-191, 199-199

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/benchmarks/data/wtq-00168.json` at line 7, Many benchmark JSON
objects use a non-breaking space (\u00A0) inside the "Event" value (e.g.,
"20\u00A0km walk"), which breaks deterministic string matching — search all
files for the "Event" key and replace any NBSP characters in its value with a
normal ASCII space so values read like "20 km walk"; update every occurrence
(e.g., "50\u00A0km walk" etc.), ensure files are saved with UTF-8 encoding, and
run the repository-wide search/replace or a small script to normalize all
"Event" values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zon-format/benchmarks/data/wtq-00080.json`:
- Line 21: The file mixes empty strings ("") and hyphens ("-") for missing
values—normalize all missing/unavailable fields in this JSON to use JSON null
instead; specifically replace occurrences of the "Notes" field and other
empty/hyphen values (e.g., the entries referenced at lines 28, 35, 39-41, 49,
56-57, 63-64, 70-71, 84-85, 91-92, 105-106, 112-113, 119-120, 126-127) so their
values are null, ensuring consistent encode/decode behavior across functions
that read these records.
- Line 98: Update the JSON entry where the "Notes" field currently contains the
typo "Price granted posthumously": change the string value to "Prize granted
posthumously" by editing the "Notes" key in the affected record so the wording
is correct.

In `@zon-format/benchmarks/data/wtq-00113.json`:
- Line 98: The "Notes" field contains a typo: replace the string "Price granted
posthumously" with "Prize granted posthumously" in the JSON entry so retrievals
and assertions match expected wording; locate the entry with the "Notes" key in
wtq-00113.json and update its value accordingly.

In `@zon-format/benchmarks/data/wtq-00162.json`:
- Around line 78-80: The Position field currently uses an en dash sentinel
("Position": "–") alongside "Notes": "DNF"; replace the en dash with a canonical
machine-friendly value (e.g., change "Position" to "DNF") so the record reads
"Position": "DNF", or alternatively add normalization logic in consumers to map
the en dash "–" to "DNF"; update any code that reads the Position key to expect
the canonical "DNF" value and/or to normalize the "–" sentinel to "DNF".

In `@zon-format/benchmarks/data/wtq-00168.json`:
- Line 16: Normalize the inconsistent time strings in the Notes fields to
include hours so the format matches HH:MM:SS (and preserve fractional seconds if
present); specifically replace "1:23.47" with "00:01:23.47" and similarly change
"3:45.57" and "3:42.08" to "00:03:45.57" and "00:03:42.08". Update the Notes
values (the JSON "Notes" keys) at the affected entries to use the normalized
HH:MM:SS.xx form to keep the benchmark parser assumptions consistent.
- Around line 67-80: Two consecutive array entries share the same key tuple
(Year, Competition, Venue, Event) but differ (Position "–", Notes "DQ" vs
Position "5th", Notes "3:45:57") and the Event contains a non‑breaking space;
normalize Event by replacing the NBSP with a normal space and dedupe by
collapsing entries with the same (Year, Competition, Venue, Event) after
normalization, keeping the canonical record (prefer the entry with a concrete
Position/result — e.g. Position "5th" with Notes "3:45:57") or otherwise
disambiguate per source data; ensure the fields Year, Competition, Venue, Event,
Position, and Notes are used for the duplicate-detection and update the array to
remove the DQ duplicate.

---

Nitpick comments:
In `@zon-format/benchmarks/data/wtq-00085.json`:
- Line 4: Normalize the "Date" and "Format" fields in this JSON fixture: replace
the human-readable "Date" value ("April 19, 2004") with a consistent ISO-like
representation (e.g., "2004-04-19" or "2004" depending on required granularity
used across fixtures) and standardize the "Format" token/casing to match the
project convention (e.g., all-lowercase or TitleCase) so all benchmark fixtures
use the same date format and format-token style; update the entry keys "Date"
and "Format" in wtq-00085.json accordingly and ensure it follows the same
pattern used by other fixtures.

In `@zon-format/benchmarks/data/wtq-00113.json`:
- Around line 39-41: The JSON record uses "-" as a missing-value sentinel for
the fields "Driver", "Co-driver", and "Car", which breaks schema consistency;
update those three values from "-" to the normalized sentinel (use empty string
"") so they match other records' missing-value representation and avoid
encoding/decoding inconsistencies.

In `@zon-format/benchmarks/data/wtq-00162.json`:
- Around line 15-215: The Event field mixes distance formats (e.g., "5000 m" vs
"10,000 m") which hurts benchmark stability; pick a single canonical format
(suggestion: remove thousands separators so use "5000 m" and "10000 m", or
alternatively use comma style "5,000 m" and "10,000 m") and apply it
consistently to every "5000 m" and "10,000 m" entry in this JSON (search for the
"Event" keys and replace all occurrences so all rows use the chosen canonical
form).

In `@zon-format/benchmarks/data/wtq-00168.json`:
- Line 7: Many benchmark JSON objects use a non-breaking space (\u00A0) inside
the "Event" value (e.g., "20\u00A0km walk"), which breaks deterministic string
matching — search all files for the "Event" key and replace any NBSP characters
in its value with a normal ASCII space so values read like "20 km walk"; update
every occurrence (e.g., "50\u00A0km walk" etc.), ensure files are saved with
UTF-8 encoding, and run the repository-wide search/replace or a small script to
normalize all "Event" values.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 431a006a-3848-4c12-a301-281f13e2a516

📥 Commits

Reviewing files that changed from the base of the PR and between 3ed82e4 and a0019ce.

📒 Files selected for processing (5)
  • zon-format/benchmarks/data/wtq-00080.json
  • zon-format/benchmarks/data/wtq-00085.json
  • zon-format/benchmarks/data/wtq-00113.json
  • zon-format/benchmarks/data/wtq-00162.json
  • zon-format/benchmarks/data/wtq-00168.json

"Driver": "Charles Muhangi",
"Co-driver": "Steven Byaruhanga",
"Car": "Subaru Impreza",
"Notes": ""
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use one missing-value representation across the dataset.

This file mixes "" and "-" for unavailable values, which can make encode/decode normalization and retrieval assertions brittle.

Proposed normalization patch (use JSON nulls for missing fields)
-    "Notes": ""
+    "Notes": null
-    "Driver": "-",
-    "Co-driver": "-",
-    "Car": "-",
+    "Driver": null,
+    "Co-driver": null,
+    "Car": null,

Also applies to: 28-28, 35-35, 39-41, 49-49, 56-57, 63-64, 70-71, 84-85, 91-92, 105-106, 112-113, 119-120, 126-127

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/benchmarks/data/wtq-00080.json` at line 21, The file mixes empty
strings ("") and hyphens ("-") for missing values—normalize all
missing/unavailable fields in this JSON to use JSON null instead; specifically
replace occurrences of the "Notes" field and other empty/hyphen values (e.g.,
the entries referenced at lines 28, 35, 39-41, 49, 56-57, 63-64, 70-71, 84-85,
91-92, 105-106, 112-113, 119-120, 126-127) so their values are null, ensuring
consistent encode/decode behavior across functions that read these records.

"Driver": "Riyaz Kurji",
"Co-driver": "Sayed Kadri",
"Car": "Subaru Impreza N8",
"Notes": "Price granted posthumously"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix likely typo in Notes text (PricePrize).

This looks like a spelling error and can reduce benchmark retrieval quality if tests compare expected phrasing.

Proposed text fix
-    "Notes": "Price granted posthumously"
+    "Notes": "Prize granted posthumously"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"Notes": "Price granted posthumously"
"Notes": "Prize granted posthumously"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/benchmarks/data/wtq-00080.json` at line 98, Update the JSON entry
where the "Notes" field currently contains the typo "Price granted
posthumously": change the string value to "Prize granted posthumously" by
editing the "Notes" key in the affected record so the wording is correct.

"Driver": "Riyaz Kurji",
"Co-driver": "Sayed Kadri",
"Car": "Subaru Impreza N8",
"Notes": "Price granted posthumously"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Possible typo in Notes text (Price vs Prize).

Line 98 reads “Price granted posthumously”, which likely should be “Prize granted posthumously”. This can affect retrieval assertions that depend on exact wording.

Proposed fix
-    "Notes": "Price granted posthumously"
+    "Notes": "Prize granted posthumously"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"Notes": "Price granted posthumously"
"Notes": "Prize granted posthumously"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/benchmarks/data/wtq-00113.json` at line 98, The "Notes" field
contains a typo: replace the string "Price granted posthumously" with "Prize
granted posthumously" in the JSON entry so retrievals and assertions match
expected wording; locate the entry with the "Notes" key in wtq-00113.json and
update its value accordingly.

Comment on lines +78 to +80
"Position": "–",
"Event": "5000 m",
"Notes": "DNF"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use a canonical non-finish representation in Position.

"Position": "–" with "Notes": "DNF" is human-readable, but the en dash sentinel is easy to mishandle in strict parsers/normalizers. Prefer a canonical machine-friendly value (for example, "Position": "DNF"), or ensure fixture consumers explicitly normalize this symbol.

🛠️ Suggested canonical value
-    "Position": "–",
-    "Notes": "DNF"
+    "Position": "DNF",
+    "Notes": ""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/benchmarks/data/wtq-00162.json` around lines 78 - 80, The Position
field currently uses an en dash sentinel ("Position": "–") alongside "Notes":
"DNF"; replace the en dash with a canonical machine-friendly value (e.g., change
"Position" to "DNF") so the record reads "Position": "DNF", or alternatively add
normalization logic in consumers to map the en dash "–" to "DNF"; update any
code that reads the Position key to expect the canonical "DNF" value and/or to
normalize the "–" sentinel to "DNF".

"Venue": "Split, Yugoslavia",
"Position": "4th",
"Event": "20 km walk",
"Notes": "1:23.47"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Normalize time notation in Notes

Line 16, Line 88, and Line 168 use mixed delimiter style (1:23.47, 3:45.57, 3:42.08) while most rows use HH:MM:SS. This inconsistency can break parser assumptions in benchmark tooling.

Proposed patch
-    "Notes": "1:23.47"
+    "Notes": "1:23:47"
...
-    "Notes": "3:45.57"
+    "Notes": "3:45:57"
...
-    "Notes": "3:42.08"
+    "Notes": "3:42:08"

Also applies to: 88-88, 168-168

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/benchmarks/data/wtq-00168.json` at line 16, Normalize the
inconsistent time strings in the Notes fields to include hours so the format
matches HH:MM:SS (and preserve fractional seconds if present); specifically
replace "1:23.47" with "00:01:23.47" and similarly change "3:45.57" and
"3:42.08" to "00:03:45.57" and "00:03:42.08". Update the Notes values (the JSON
"Notes" keys) at the affected entries to use the normalized HH:MM:SS.xx form to
keep the benchmark parser assumptions consistent.

Comment on lines +67 to +80
"Year": "1994",
"Competition": "European Championships",
"Venue": "Helsinki, Finland",
"Position": "–",
"Event": "50 km walk",
"Notes": "DQ"
},
{
"Year": "1994",
"Competition": "European Championships",
"Venue": "Helsinki, Finland",
"Position": "5th",
"Event": "50 km walk",
"Notes": "3:45:57"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import json
from collections import defaultdict

path = "zon-format/benchmarks/data/wtq-00168.json"
rows = json.load(open(path, encoding="utf-8"))

groups = defaultdict(list)
for i, r in enumerate(rows, start=1):
    k = (r.get("Year"), r.get("Competition"), r.get("Venue"), r.get("Event"))
    groups[k].append((i, r.get("Position"), r.get("Notes")))

for k, vals in groups.items():
    if len(vals) > 1:
        print("DUPLICATE KEY:", k)
        for v in vals:
            print("  line~", v[0], "Position=", v[1], "Notes=", v[2])
PY

Repository: ZON-Format/ZON

Length of output: 217


Conflicting duplicate record for the same event key

Array indices 9 and 10 contain two entries with the same (Year, Competition, Venue, Event) but different outcomes: position with DQ vs position 5th with result 3:45:57. This ambiguity invalidates benchmark ground truth for retrieval assertions.

The Event field contains a non-breaking space (50\xa0km walk), which may contribute to the duplication. Verify the source dataset and retain only the canonical record, or disambiguate the Event values.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/benchmarks/data/wtq-00168.json` around lines 67 - 80, Two
consecutive array entries share the same key tuple (Year, Competition, Venue,
Event) but differ (Position "–", Notes "DQ" vs Position "5th", Notes "3:45:57")
and the Event contains a non‑breaking space; normalize Event by replacing the
NBSP with a normal space and dedupe by collapsing entries with the same (Year,
Competition, Venue, Event) after normalization, keeping the canonical record
(prefer the entry with a concrete Position/result — e.g. Position "5th" with
Notes "3:45:57") or otherwise disambiguate per source data; ensure the fields
Year, Competition, Venue, Event, Position, and Notes are used for the
duplicate-detection and update the array to remove the DQ duplicate.

@P6rguVyrst P6rguVyrst changed the title Adding 5 tests that were failing on me. Adding 5 tests that were failing on me: [ wtq-00080, wtq-00085, wtq-00113, wtq-00162, wtq-00168] Apr 14, 2026
@ronibhakta1 ronibhakta1 self-requested a review April 15, 2026 14:08
Copy link
Copy Markdown
Contributor

@ronibhakta1 ronibhakta1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets keep the tests file to a different folder here to make sure we can keep a track and just make it plugin which we can plug or unplug it from the test scripts

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zon-format/tests/unit/test_roundtrip_benchmarks.py`:
- Around line 16-21: get_json_files currently returns an empty list when no
benchmark JSONs match, which lets the test class silently skip all dynamic
tests; change get_json_files so after building the sorted list it asserts or
raises a clear error (e.g., raise RuntimeError or use pytest.fail) when the list
is empty, with a message like "no benchmark JSON files found in
BENCHMARKS_DATA_DIR"; apply the same fail-fast check to the other file-discovery
helper in this module (the sibling function that returns benchmark files used
for dynamic tests) so discovery errors cause the test run to fail immediately.
- Around line 31-32: The test opens benchmark fixture files without specifying
encoding, causing failures on non-UTF-8 locales; update the file-opening call in
test_roundtrip_benchmarks.py (the with open(filepath) block that sets original =
json.load(f)) to open the file explicitly as UTF-8 by passing encoding="utf-8"
so the JSON fixtures with non-ASCII bytes (e.g. github-repos.json,
wtq-00162.json, wtq-00168.json) are read consistently across platforms.
- Around line 37-42: The test currently compares JSON strings (orig_json,
dec_json) which masks key type changes; change the assertion to compare the
decoded Python structures directly by replacing the json.dumps-based comparison
with self.assertEqual(original, decoded, f"Roundtrip failed for
{filepath.name}") so the test checks structural and type equality (use
assertDictEqual if you want a clearer dict-specific diff); update/remove the
orig_json/dec_json variables accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: df3563a5-d0aa-464e-912a-dddd994304cc

📥 Commits

Reviewing files that changed from the base of the PR and between a0019ce and d05f41d.

📒 Files selected for processing (1)
  • zon-format/tests/unit/test_roundtrip_benchmarks.py

Comment thread zon-format/tests/unit/test_roundtrip_benchmarks.py
Comment thread zon-format/tests/unit/test_roundtrip_benchmarks.py Outdated
Comment on lines +37 to +42
orig_json = json.dumps(original, sort_keys=True)
dec_json = json.dumps(decoded, sort_keys=True)

self.assertEqual(
orig_json, dec_json,
f"Roundtrip failed for {filepath.name}"
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Demonstrate that json.dumps can hide mapping key type changes.
python - <<'PY'
import json

original = {"1": "value"}
decoded = {1: "value"}

assert original != decoded
assert json.dumps(original, sort_keys=True) == json.dumps(decoded, sort_keys=True)

print("json.dumps comparison masks string-vs-int key changes")
PY

Repository: ZON-Format/ZON

Length of output: 111


🏁 Script executed:

cat -n zon-format/tests/unit/test_roundtrip_benchmarks.py | head -60

Repository: ZON-Format/ZON

Length of output: 1733


🏁 Script executed:

python3 <<'PY'
# Test direct comparison of complex Python objects loaded from JSON
import json

json_str = '{"1": "value", "nested": {"2": "data"}, "list": [1, 2, 3]}'
original = json.loads(json_str)

# Simulate a zon decoder that accidentally converts string keys to int
decoded_buggy = {1: "value", "nested": {2: "data"}, "list": [1, 2, 3]}

# Test 1: Direct comparison catches the bug
print("Direct comparison catches key type change:", original != decoded_buggy)

# Test 2: JSON string comparison hides the bug
orig_json = json.dumps(original, sort_keys=True)
dec_json = json.dumps(decoded_buggy, sort_keys=True)
print("JSON comparison hides key type change:", orig_json == dec_json)

# Test 3: Edge case - NaN handling
import math
obj1 = {"value": float('nan')}
obj2 = {"value": float('nan')}
print("NaN comparison (might be False):", obj1 == obj2)
PY

Repository: ZON-Format/ZON

Length of output: 662


🏁 Script executed:

python3 <<'PY'
import json

# Test what actually happens with json.dumps() on mixed key types
original = {"1": "value"}
decoded = {1: "value"}

# Without sort_keys
try:
    orig_json = json.dumps(original)
    dec_json = json.dumps(decoded)
    print("Without sort_keys:")
    print(f"  original: {orig_json}")
    print(f"  decoded:  {dec_json}")
    print(f"  Equal: {orig_json == dec_json}")
except TypeError as e:
    print(f"Error without sort_keys: {e}")

# With sort_keys (as in the actual code)
try:
    orig_json = json.dumps(original, sort_keys=True)
    dec_json = json.dumps(decoded, sort_keys=True)
    print("\nWith sort_keys:")
    print(f"  original: {orig_json}")
    print(f"  decoded:  {dec_json}")
    print(f"  Equal: {orig_json == dec_json}")
except TypeError as e:
    print(f"Error with sort_keys: {e}")
PY

Repository: ZON-Format/ZON

Length of output: 229


🏁 Script executed:

# Let's check the actual benchmark data to see if any have numeric-looking keys
ls -la zon-format/benchmarks/data/*.json | head -20

Repository: ZON-Format/ZON

Length of output: 1784


🏁 Script executed:

# Check a smaller JSON file to see the structure
head -50 zon-format/benchmarks/data/hikes.json | python3 -m json.tool 2>/dev/null | head -50

Repository: ZON-Format/ZON

Length of output: 906


🏁 Script executed:

# Let's understand what zon codec does - check if it's supposed to preserve string keys
cat -n zon-format/zon/__init__.py | head -40

Repository: ZON-Format/ZON

Length of output: 116


🏁 Script executed:

find zon-format -name "*.py" -path "*/zon/*" -type f | head -20

Repository: ZON-Format/ZON

Length of output: 786


🏁 Script executed:

cat -n zon-format/src/zon/__init__.py | head -50

Repository: ZON-Format/ZON

Length of output: 1870


Compare decoded structures directly, not their JSON strings.

json.dumps() coerces object keys to strings, so this can miss non-lossless key-type changes such as {1: "value"} (integer key) becoming {"1": "value"} (string key) in the decoded output. Since the test docstring claims to validate "lossless" encoding, the current comparison fails to catch this type of data corruption.

Proposed fix
-        orig_json = json.dumps(original, sort_keys=True)
-        dec_json = json.dumps(decoded, sort_keys=True)
-
         self.assertEqual(
-            orig_json, dec_json,
+            original, decoded,
             f"Roundtrip failed for {filepath.name}"
         )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
orig_json = json.dumps(original, sort_keys=True)
dec_json = json.dumps(decoded, sort_keys=True)
self.assertEqual(
orig_json, dec_json,
f"Roundtrip failed for {filepath.name}"
self.assertEqual(
original, decoded,
f"Roundtrip failed for {filepath.name}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/tests/unit/test_roundtrip_benchmarks.py` around lines 37 - 42, The
test currently compares JSON strings (orig_json, dec_json) which masks key type
changes; change the assertion to compare the decoded Python structures directly
by replacing the json.dumps-based comparison with self.assertEqual(original,
decoded, f"Roundtrip failed for {filepath.name}") so the test checks structural
and type equality (use assertDictEqual if you want a clearer dict-specific
diff); update/remove the orig_json/dec_json variables accordingly.

Copy link
Copy Markdown
Author

@P6rguVyrst P6rguVyrst Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{1: "value"} isn't valid json -- we are specifically loading in json content and encoding+decoding it.
the json.load at line 32 will fail with the example

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 5 (char 6)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@P6rguVyrst, you're right — JSON only permits string keys, so json.load() will never produce a Python dict with integer keys. My example of {1: "value"} was a Python-dict construct that couldn't originate from a JSON file, making that specific concern invalid in this context.

Given that original is always produced by json.load() and zon.decode() is expected to return JSON-compatible structures, comparing via json.dumps(sort_keys=True) is a reasonable approach here. I'll dismiss this comment.


✏️ Learnings added
Learnt from: P6rguVyrst
Repo: ZON-Format/ZON PR: 9
File: zon-format/tests/unit/test_roundtrip_benchmarks.py:37-42
Timestamp: 2026-04-20T19:10:00.422Z
Learning: In `zon-format/tests/unit/test_roundtrip_benchmarks.py`, comparing roundtrip results via `json.dumps(sort_keys=True)` is valid because all input data comes from `json.load()`, which always produces string-keyed dicts (JSON only allows string keys). The concern about integer-key coercion (`{1: "value"}`) does not apply to JSON-sourced data.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@P6rguVyrst
Copy link
Copy Markdown
Author

Lets keep the tests file to a different folder here to make sure we can keep a track and just make it plugin which we can plug or unplug it from the test scripts

Thanks for getting back so quickly, I didn't have the energy mid-week to follow up, sorry.

Added a test for testing round-trip accuracy-- test_roundtrip_benchmarks tests for all JSON files in benchmarks/data/ folder now, including the 5 new ones that I added.

How to run the tests that validate that #10 is fixing genuine bugs with the format. Currently ZON format does not offer "lossless round-trips".

I'm surpprised that the CI isn't failing. Should probably add a workflow to CI that runs the tests before-- I can do that, but not in this PR, as that's a separate concern.

python -m unittest tests.unit.test_roundtrip_benchmarks -v

or...

pytest tests/unit/test_roundtrip_benchmarks.py -v

@ronibhakta1
Copy link
Copy Markdown
Contributor

Lets keep the tests file to a different folder here to make sure we can keep a track and just make it plugin which we can plug or unplug it from the test scripts

Thanks for getting back so quickly, I didn't have the energy mid-week to follow up, sorry.

Added a test for testing round-trip accuracy-- test_roundtrip_benchmarks tests for all JSON files in benchmarks/data/ folder now, including the 5 new ones that I added.

How to run the tests that validate that #10 is fixing genuine bugs with the format. Currently ZON format does not offer "lossless round-trips".

I'm surpprised that the CI isn't failing. Should probably add a workflow to CI that runs the tests before-- I can do that, but not in this PR, as that's a separate concern.

python -m unittest tests.unit.test_roundtrip_benchmarks -v

or...

pytest tests/unit/test_roundtrip_benchmarks.py -v

Hey @P6rguVyrst Thanks for finding's. I only had round trip check's. Lossless once might have got missed out on the egde cases.

The edge cases for roundtrip are like Spider Multiverse
Alt Text

We can try fixing and adding the github test's with your data sets.

Just tell me which one should we merge first

@P6rguVyrst
Copy link
Copy Markdown
Author

P6rguVyrst commented Apr 20, 2026

Lets keep the tests file to a different folder here to make sure we can keep a track and just make it plugin which we can plug or unplug it from the test scripts

Thanks for getting back so quickly, I didn't have the energy mid-week to follow up, sorry.
Added a test for testing round-trip accuracy-- test_roundtrip_benchmarks tests for all JSON files in benchmarks/data/ folder now, including the 5 new ones that I added.
How to run the tests that validate that #10 is fixing genuine bugs with the format. Currently ZON format does not offer "lossless round-trips".
I'm surpprised that the CI isn't failing. Should probably add a workflow to CI that runs the tests before-- I can do that, but not in this PR, as that's a separate concern.

python -m unittest tests.unit.test_roundtrip_benchmarks -v

or...

pytest tests/unit/test_roundtrip_benchmarks.py -v

Hey @P6rguVyrst Thanks for finding's. I only had round trip check's. Lossless once might have got missed out on the egde cases.

The edge cases for roundtrip are like Spider Multiverse Alt Text Alt Text

We can try fixing and adding the github test's with your data sets.

Just tell me which one should we merge first

I was thinking of adding the failing tests first, and then the PR that's fixing them.

I don't mind if you scrap my PRs, and patch the code yourself-- just wanted to use ZON, but not have to maintain my own copy of it for accuracy 😅

The PR history for the actual patch in my for got a bit messy, so I'd probably do a squash-and-merge for at least the 3 bug fixes-- there's a bit back and forth there of reverting a PR I accidentally merged to my main.

@ronibhakta1
Copy link
Copy Markdown
Contributor

Lets keep the tests file to a different folder here to make sure we can keep a track and just make it plugin which we can plug or unplug it from the test scripts

Thanks for getting back so quickly, I didn't have the energy mid-week to follow up, sorry.
Added a test for testing round-trip accuracy-- test_roundtrip_benchmarks tests for all JSON files in benchmarks/data/ folder now, including the 5 new ones that I added.
How to run the tests that validate that #10 is fixing genuine bugs with the format. Currently ZON format does not offer "lossless round-trips".
I'm surpprised that the CI isn't failing. Should probably add a workflow to CI that runs the tests before-- I can do that, but not in this PR, as that's a separate concern.

python -m unittest tests.unit.test_roundtrip_benchmarks -v

or...

pytest tests/unit/test_roundtrip_benchmarks.py -v

Hey @P6rguVyrst Thanks for finding's. I only had round trip check's. Lossless once might have got missed out on the egde cases.
The edge cases for roundtrip are like Spider Multiverse Alt Text Alt Text
We can try fixing and adding the github test's with your data sets.
Just tell me which one should we merge first

I was thinking of adding the failing tests first, and then the PR that's fixing them.

I don't mind if you scrap my PRs, and patch the code yourself-- just wanted to use ZON, but not have to maintain my own copy of it for accuracy 😅

The PR history for the actual patch in my for got a bit messy, so I'd probably do a squash-and-merge for at least the 3 bug fixes-- there's a bit back and forth there of reverting a PR I accidentally merged to my main.

Hey I would love to have code fixes from your end. So I will start with the PR which adds the data set's and then you can try to add a fix to that.

@ronibhakta1
Copy link
Copy Markdown
Contributor

Hey @P6rguVyrst Now you can see the test being in action. Would you like to look into the logs above.

@P6rguVyrst
Copy link
Copy Markdown
Author

Hey @P6rguVyrst Now you can see the test being in action. Would you like to look into the logs above.

I mean my main branch will fix the failing tests.
What I could do is squash & merge my main branch to this PR to have the tests & the fix go out in a single release.

@ronibhakta1
Copy link
Copy Markdown
Contributor

Hey @P6rguVyrst Now you can see the test being in action. Would you like to look into the logs above.

I mean my main branch will fix the failing tests. What I could do is squash & merge my main branch to this PR to have the tests & the fix go out in a single release.

That can be a better approach.

Toomas Ormisson and others added 4 commits April 20, 2026 22:08
The encoder now quotes keys that match boolean/null keywords (t, f, true,
false, null, none, nil) to prevent the decoder from misinterpreting them.

The decoder now uses a new parse_key() function for dictionary keys that
preserves them as strings, unlike parse_value() which converts keywords.

This fixes a critical round-trip bug where {"f": 1} would decode as
{False: 1}.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Values in dictionary compression headers containing commas, colons,
or other special characters were not being quoted, causing data
corruption on decode (values truncated at delimiter).

Now uses _format_value() for consistent quoting of dictionary values.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Delta encoding violated spec §2.3's MUST round-trip requirement for float
columns: prev + (cur - prev) in IEEE-754 does not recover the original
double's bit pattern for arbitrary values, and round(diff, 10) at the
encode step compounded the loss. Benchmark data exposed this as e.g.
1865.43 decoding to 1865.4299999999994.

Restrict SparseMode.DELTA eligibility to int-only columns. Float columns
now fall through to standard value encoding, which round-trips exactly
via Python's shortest-round-trip str(float). Int delta encoding (the
common case: IDs, counts, timestamps) is unchanged.

Regression test covers multiple precision regimes — benchmark values,
math.pi/math.e, 0.1+0.2, extreme exponents, negatives — so a future
round-to-N workaround cannot sneak through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
zon-format/src/zon/core/encoder.py (1)

176-189: Delta-eligibility check correctly excludes bool/float/None.

Because bool is a subclass of int, using isinstance(val, int) and not isinstance(val, bool) is the right idiom here — a plain isinstance(val, int) check would have accepted True/False as delta-eligible and still been wrong. Restricting DELTA to pure int columns is the correct remedy for the IEEE-754 round-trip failure surfaced by the new test_float_column_roundtrip_is_lossless test.

One optional refinement: a column that is int-valued except for occasional None (sparse) will now fall back to non-delta encoding even where delta is still lossless. If that's a noticeable size regression in the benchmarks, you could treat None as "skip this sample" rather than disqualifying. Not required for correctness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/src/zon/core/encoder.py` around lines 176 - 189, The is_int_only
check should explicitly accept only pure ints and exclude bools; replace the
loop condition with the clearer idiom using isinstance(val, int) and not
isinstance(val, bool) (i.e., set is_int_only = False when not (isinstance(val,
int) and not isinstance(val, bool))) so columns containing bool/float/None are
excluded from selecting SparseMode.DELTA (refer to variables/functions:
is_int_only, values, SparseMode.DELTA).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zon-format/src/zon/core/utils.py`:
- Around line 49-58: The code branch where trimmed.startswith('"') catches
JSONDecodeError but only strips quotes when trimmed.endswith('"') can return a
string with a dangling leading quote; update the error path so that when a
leading quote is present but JSON decoding fails you do not return trimmed
verbatim — either raise a descriptive parsing exception or strip the leading
(and trailing if present) quote before returning so the caller doesn't get a
leaked '"' character. Locate the block that tries json.loads(trimmed) in
zon/core/utils.py (the code handling trimmed and JSONDecodeError) and make it
mimic the existing parse_value behavior: on decode failure, remove enclosing
quotes (or raise a parsing error) rather than returning the original trimmed
string unchanged.

In `@zon-format/tests/unit/test_delta.py`:
- Around line 135-144: The loop using zip(data, decoded) can silently truncate
if decoded is shorter; either assert lengths before comparing or pass
strict=True to zip so mismatched lengths fail the test—update the test in
test_delta.py to compare len(decoded) == len(data) or change the loop to for
original, got in zip(data, decoded, strict=True) and keep the existing
repr-based float equality check using original['v'] and got['v'] to surface any
truncation regressions.

---

Nitpick comments:
In `@zon-format/src/zon/core/encoder.py`:
- Around line 176-189: The is_int_only check should explicitly accept only pure
ints and exclude bools; replace the loop condition with the clearer idiom using
isinstance(val, int) and not isinstance(val, bool) (i.e., set is_int_only =
False when not (isinstance(val, int) and not isinstance(val, bool))) so columns
containing bool/float/None are excluded from selecting SparseMode.DELTA (refer
to variables/functions: is_int_only, values, SparseMode.DELTA).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 21ae1361-714a-4563-af39-833e74bf2277

📥 Commits

Reviewing files that changed from the base of the PR and between f4bc7e2 and 0bdef33.

📒 Files selected for processing (5)
  • zon-format/src/zon/core/decoder.py
  • zon-format/src/zon/core/encoder.py
  • zon-format/src/zon/core/utils.py
  • zon-format/tests/unit/test_boolean_keys.py
  • zon-format/tests/unit/test_delta.py

Comment on lines +49 to +58
if trimmed.startswith('"'):
try:
return json.loads(trimmed)
except json.JSONDecodeError:
if trimmed.endswith('"'):
inner = trimmed[1:-1]
json_str = inner.replace('""', '\\"')
return json.loads(f'"{json_str}"')

return trimmed
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Malformed quoted keys can leak the leading quote into the returned string.

If trimmed starts with " but the initial json.loads raises and trimmed does not end with " (e.g., an unterminated quoted key reaching this point), the endswith branch is skipped and the function returns trimmed verbatim — including the leading quote character. This is unlikely in practice since the tokenizer typically won't yield such fragments, but the current code silently produces a string key containing " rather than signalling the malformed input.

Consider either raising on that path or unconditionally stripping the enclosing quotes before falling through. Behaviour otherwise matches the existing parse_value pattern, so this is minor.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/src/zon/core/utils.py` around lines 49 - 58, The code branch where
trimmed.startswith('"') catches JSONDecodeError but only strips quotes when
trimmed.endswith('"') can return a string with a dangling leading quote; update
the error path so that when a leading quote is present but JSON decoding fails
you do not return trimmed verbatim — either raise a descriptive parsing
exception or strip the leading (and trailing if present) quote before returning
so the caller doesn't get a leaked '"' character. Locate the block that tries
json.loads(trimmed) in zon/core/utils.py (the code handling trimmed and
JSONDecodeError) and make it mimic the existing parse_value behavior: on decode
failure, remove enclosing quotes (or raise a parsing error) rather than
returning the original trimmed string unchanged.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The encoder produces "f" and "true" (properly closed quotes). The malformed "foo case requires:

  1. Hand-crafted invalid ZON, OR
  2. Bug in encoder/tokenizer

The bug is not reachable with valid ZON.

Should the decoder be strict (reject malformed input) or permissive (best-effort)?

That's a design choice, not a correctness bug.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@P6rguVyrst, that's a fair and correct rebuttal. You're right — since the encoder always emits properly closed quoted strings, the unterminated-quote path is not reachable through normal encoder output, and triggering it would require deliberately crafted or externally corrupted ZON. My concern was labeled minor precisely because of this low likelihood.

On the strict vs. permissive question: both are valid positions.

  • Strict (raise ValueError("Malformed quoted key: ...")) makes the decoder a spec validator and surfaces upstream bugs early.
  • Permissive (current behavior, best-effort strip) maximises interoperability with hand-authored or third-party ZON.

Since this is explicitly a design choice and the current behavior is consistent with the existing parse_value pattern in the same file, I won't push further. The behavior is acceptable as-is.

Comment on lines +135 to +144
decoded = self.decoder.decode(self.encoder.encode(data))

for original, got in zip(data, decoded):
# repr(float) is the shortest string that round-trips to the same
# double, so repr equality is equivalent to bit equality.
self.assertEqual(
repr(original['v']),
repr(got['v']),
f"float roundtrip lost precision: {original['v']!r} -> {got['v']!r}",
)
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Pass strict=True to zip so truncation can't silently pass the test.

If a future regression caused decoder.decode(...) to return fewer rows than data, zip would silently iterate over the shorter prefix and the test would still pass. Adding strict=True (or a prior assertEqual(len(decoded), len(data))) makes that failure mode visible. Ruff also flags this as B905.

🔧 Proposed tweak
-        for original, got in zip(data, decoded):
+        self.assertEqual(len(decoded), len(data))
+        for original, got in zip(data, decoded, strict=True):
🧰 Tools
🪛 Ruff (0.15.10)

[warning] 137-137: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zon-format/tests/unit/test_delta.py` around lines 135 - 144, The loop using
zip(data, decoded) can silently truncate if decoded is shorter; either assert
lengths before comparing or pass strict=True to zip so mismatched lengths fail
the test—update the test in test_delta.py to compare len(decoded) == len(data)
or change the loop to for original, got in zip(data, decoded, strict=True) and
keep the existing repr-based float equality check using original['v'] and
got['v'] to surface any truncation regressions.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The finding doesn't apply meaningfully to this test's purpose.

If a future regression caused decoder.decode(...) to return fewer rows than data

Row dropping isn't a realistic failure mode. If it were to happen, the benchmark roundtrip tests would fail immediately.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@P6rguVyrst, fair point — the float-precision test is narrowly scoped, and the benchmark roundtrip suite already acts as a broader safety net for structural regressions like row dropping. I'll defer to your judgment here.

(╯°□°)╯


🧠 Learnings used
Learnt from: P6rguVyrst
Repo: ZON-Format/ZON PR: 9
File: zon-format/tests/unit/test_roundtrip_benchmarks.py:37-42
Timestamp: 2026-04-20T19:10:05.092Z
Learning: In `zon-format/tests/unit/test_roundtrip_benchmarks.py`, comparing roundtrip results via `json.dumps(sort_keys=True)` is valid because all input data comes from `json.load()`, which always produces string-keyed dicts (JSON only allows string keys). The concern about integer-key coercion (`{1: "value"}`) does not apply to JSON-sourced data.

@P6rguVyrst
Copy link
Copy Markdown
Author

Hey @P6rguVyrst Now you can see the test being in action. Would you like to look into the logs above.

I mean my main branch will fix the failing tests. What I could do is squash & merge my main branch to this PR to have the tests & the fix go out in a single release.

That can be a better approach.

It's all in one PR now, however I don't see the workflow test runs triggering.

@ronibhakta1
Copy link
Copy Markdown
Contributor

@P6rguVyrst Before we merge. do you wanna do a token test where we use token tester model to get the idea how much is getting consumed there @coderabbitai please tag the right files we have in the code base

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 21, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get overall repo structure
tree -L 4 --gitignore 2>/dev/null || find . -not -path '*/.git/*' -not -path '*/node_modules/*' | sort | head -100

echo "---"
# Search for token-related files
fd -i "token" --type f
echo "---"
# Search for retrieval/accuracy/benchmark runner scripts
fd -i -e py --type f | head -60
echo "---"
# Look for any requirements or setup files
fd -i "requirements" --type f
fd -i "setup.py" --type f
fd -i "setup.cfg" --type f
fd -i "pyproject.toml" --type f

Length of output: 10786


🏁 Script executed:

#!/bin/bash
echo "=== token_analysis.py ==="
cat zon-format/benchmarks/analysis/token_analysis.py

echo "=== token_counter.py ==="
cat zon-format/src/zon/llm/token_counter.py

echo "=== retrieval_accuracy.py ==="
cat zon-format/benchmarks/tests/retrieval_accuracy.py

echo "=== unified_benchmark.py ==="
cat zon-format/benchmarks/runners/unified_benchmark.py

Length of output: 24275


@ronibhakta1 Here are the relevant files for token analysis in the codebase:


🔢 Token-related files

File Purpose
zon-format/benchmarks/runners/unified_benchmark.py Best for this test — auto-scans all *.json in benchmarks/data/ (including all 5 new wtq-* files) and reports byte sizes + token counts across multiple tokenizers (o200k_base for GPT-4o, cl100k_base for Claude/Llama-3)
zon-format/benchmarks/analysis/token_analysis.py Deep tokenizer comparison using tiktoken (o200k_base, cl100k_base, p50k_base) — currently hardcoded to hikes.json only, but can be adapted
zon-format/src/zon/llm/token_counter.py Internal utility (TokenCounter class) using a simple character-based estimate (~1 token per 4 chars), no tiktoken dependency
zon-format/benchmarks/tests/retrieval_accuracy.py Full LLM accuracy benchmark via AzureAIClient — requires AZURE_OPENAI_* env vars (not runnable by @P6rguVyrst in their environment)

▶️ Recommended token test command

To run a token count across all datasets including the 5 new wtq-* ones, from the zon-format/ directory:

# Install dependency first if needed
pip install tiktoken

# Run unified benchmark (covers all *.json in benchmarks/data/)
python benchmarks/runners/unified_benchmark.py

This will produce a table showing JSON vs ZON byte sizes, compression ratios, and token counts per dataset for multiple tokenizer encodings — giving a clear picture of the token impact of the new files before merging.

@P6rguVyrst — since retrieval_accuracy.py requires Azure credentials you don't have, the unified_benchmark.py is the most practical option for a local token test on your end as well.

@ronibhakta1
Copy link
Copy Markdown
Contributor

Hey @P6rguVyrst Now you can see the test being in action. Would you like to look into the logs above.

I mean my main branch will fix the failing tests. What I could do is squash & merge my main branch to this PR to have the tests & the fix go out in a single release.

That can be a better approach.

It's all in one PR now, however I don't see the workflow test runs triggering.

Yes, I kept it to be triggered mnually

@P6rguVyrst
Copy link
Copy Markdown
Author

Hey @P6rguVyrst Now you can see the test being in action. Would you like to look into the logs above.

I mean my main branch will fix the failing tests. What I could do is squash & merge my main branch to this PR to have the tests & the fix go out in a single release.

That can be a better approach.

It's all in one PR now, however I don't see the workflow test runs triggering.

Yes, I kept it to be triggered mnually

Looks like at least with this dataset the tests now pass? :)

There was one rounding error even with the original dataset-- not sure if it's fixed all the acuracy bugs, but after fixing 3 distinct issues, it does seem more accurate than before.

Let me know if I should address fix else within this scope-- did address few CodeRabbit nitpicks yesterday..

@ronibhakta1
Copy link
Copy Markdown
Contributor

Hey @P6rguVyrst Now you can see the test being in action. Would you like to look into the logs above.

I mean my main branch will fix the failing tests. What I could do is squash & merge my main branch to this PR to have the tests & the fix go out in a single release.

That can be a better approach.

It's all in one PR now, however I don't see the workflow test runs triggering.

Yes, I kept it to be triggered mnually

Looks like at least with this dataset the tests now pass? :)

There was one rounding error even with the original dataset-- not sure if it's fixed all the acuracy bugs, but after fixing 3 distinct issues, it does seem more accurate than before.

Let me know if I should address fix else within this scope-- did address few CodeRabbit nitpicks yesterday..

Yeah the test pass this time thanks for that. Speaking about code rabbit finding if the issues are minor let's fix and finish it off. Speaking about if any of the tests or datasets fails we have to figure it out cause it can lead to more issues on user's accuracy.

@P6rguVyrst
Copy link
Copy Markdown
Author

Should I do anything else?

I addressed CodeRabbit comments where appropriate, not going to commnet on test data which is valid json input.

@ronibhakta1
Copy link
Copy Markdown
Contributor

Should I do anything else?

I addressed CodeRabbit comments where appropriate, not going to commnet on test data which is valid json input.

Yeah, looks good to me.
Let's get this one merged. Thank you for your contribution.

@P6rguVyrst
Copy link
Copy Markdown
Author

Should I do anything else?
I addressed CodeRabbit comments where appropriate, not going to commnet on test data which is valid json input.

Yeah, looks good to me. Let's get this one merged. Thank you for your contribution.

I don't have the permission to merge, nor should I 🫠

@ronibhakta1
Copy link
Copy Markdown
Contributor

Should I do anything else?
I addressed CodeRabbit comments where appropriate, not going to commnet on test data which is valid json input.

Yeah, looks good to me. Let's get this one merged. Thank you for your contribution.

I don't have the permission to merge, nor should I 🫠

Yeah sorry I will do that rn. I got busy yesterday to work on something.

@ronibhakta1 ronibhakta1 merged commit 8afdba9 into ZON-Format:main Apr 23, 2026
6 checks passed
@ronibhakta1
Copy link
Copy Markdown
Contributor

@P6rguVyrst I have deployed the package. You can use it seamlessly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants