Skip to content

[bugfix] map keys: compare by op:same-key, never coerce across type families#6491

Open
joewiz wants to merge 1 commit into
eXist-db:developfrom
joewiz:bugfix/map-key-samekey-conformance
Open

[bugfix] map keys: compare by op:same-key, never coerce across type families#6491
joewiz wants to merge 1 commit into
eXist-db:developfrom
joewiz:bugfix/map-key-samekey-conformance

Conversation

@joewiz

@joewiz joewiz commented Jun 18, 2026

Copy link
Copy Markdown
Member

[This PR was co-authored with Claude Code. -Joe]

Summary

eXist conflated map keys of different op:same-key families that share a lexical value, on constructor-built maps. Per XQuery 3.1 §17.1, map keys compare via op:same-key — a type-group comparison with no key casting: the numeric family interchanges; the string family (xs:string/xs:anyURI/xs:untypedAtomic) interchanges; every other type matches only itself.

Measured on a clean build of current develop (these are the new mapKeySameKey.xqm cases) against the two reference implementations:

Expression eXist before eXist after Saxon-HE BaseX Spec
map:contains(map{"12":"x"}, 12) true false false false false
map:get(map{"12":"x"}, 12) ["x"] () () () ()
map:contains(map{"2020-01-01":1}, xs:date("2020-01-01")) true false false false false
map:contains(map{true():"x"}, "true") true false false false false
map:contains(map{1:1}, 1.5e0) (numeric, lossy cast) true false false false false
map:contains(map{5:1}, 5.0e0) (numeric family) true true true true true
map:contains(map{"urn:x":1}, xs:anyURI("urn:x")) (string family) true true true true true

eXist was the lone outlier on the cross-family cases.

Root cause

A map{…} constructor builds a homogeneously-typed MapType that records a concrete keyType. MapType.get()/contains() ran a private convert() that cast the lookup key to keyType (e.g. integer 12 → string "12") before delegating to the underlying map — so a cross-family lookup that was castable matched a key it should not have. The coercion dates to the original 2012 map implementation.

convert() is now redundant. The underlying map's comparator already implements op:same-key (AbstractMapType.sameKey), and NumericValue/string-family hashCode are already canonical (recent commits c608d55a, 6b63ec83), so cross-type-within-family pairs (xs:integer(5)/xs:double(5.0), xs:string/xs:anyURI) already hash into the same bucket and match correctly via sameKey alone. SingleKeyMapType (built by map:entry) already compares with sameKey directly and was correct; removing convert() makes MapType consistent with it.

Why XQTS misses it

qt3 and qt4 place their cross-family distinctness tests (map/contains.xml, map/get.xml: xs:untypedAtomic("12") vs 12) on map:entry, which yields a MIXED-key map where convert() was a no-op → eXist already passed them. Neither suite has a constructor-built cross-family distinctness test. (Worth a small upstream contribution to qt4tests to close the blind spot.)

What changed

  • MapType — removed convert(); get()/contains() delegate straight to the op:same-key comparator. keyType is retained only for the getKeyType() API (documented).
  • mapKeySameKey.xqm (new) — XQSuite regression tests using the constructor form: cross-family distinctness (string/date/boolean keys vs same-lexical numeric/other lookups) and the numeric lossy-cast case, plus within-family positives (numeric; string/anyURI/untypedAtomic) and same-type self-matches that must not regress.

Coupling: closes the serialization bare-string "backdoor"

eXist serialization parameters are looked up by their exist: QName. A user could previously also pass them as the prefixed string "exist:expand-xincludes" only because convert() coerced the QName lookup to a string. With this fix, QName and string are distinct keys, so only the spec-aligned exist: QName form is honored — a deliberate, spec-aligning behavior change.

Upgrade note: eXist-specific serialization parameters in an options map must be keyed by their exist: QName (map { xs:QName("exist:indent-spaces"): 2 }); the previously-tolerated bare "exist:…" string key is no longer accepted. (declare option exist:serialize declarations are not affected.)

A full-repo scan found no production/app usage of the bare-string form — only tests. Affected tests updated: serialize.xql, fnSerializeNewline.xqm, and the file module sync-serialize.xqm — prefixed-string cases now assert the key is ignored; the real functionality is retained via the QName form.

Test plan

All on a clean build of this branch off current develop:

  • MapTests — 155 run, 0 failures (9 new cases were failing before the fix)
  • XQuery3Tests — 1071 run, 0 failures (covers serialize.xql, fnSerializeNewline.xqm)
  • file module FileTests — 53 run, 0 failures (covers sync-serialize.xqm)
  • ArrayTests (164), MapTest, BifurcanMapTest — 0 failures
  • Codacy PMD on MapType.java — clean

XQTS has no relevant tests.

Spec references

…amilies

A map{...} constructor builds a homogeneously-typed MapType that recorded a
concrete keyType. MapType.get()/contains() then ran convert() on the lookup
key, casting it to keyType (e.g. integer 12 -> string "12") before delegating
to the underlying map. This conflated keys of different op:same-key families
that share a lexical value: map{"12":"x"}(12) wrongly returned "x", and
map{"2020-01-01":1}(xs:date("2020-01-01")) wrongly matched. It also conflated
within the numeric family by lossy cast: map{1:1}(1.5) wrongly matched.

Per XQuery 3.1 section 17.1, map keys compare via op:same-key, a type-group
comparison with no key casting: the numeric family interchanges, the string
family (xs:string/xs:anyURI/xs:untypedAtomic) interchanges, every other type
matches only itself.

convert() is now redundant. The underlying map's comparator already implements
op:same-key (AbstractMapType.sameKey), and NumericValue/string-family hashCode
are canonical, so cross-type-within-family pairs (xs:integer(5) / xs:double(5.0),
xs:string / xs:anyURI) already hash into the same bucket. Removing convert()
makes MapType behave like SingleKeyMapType, which already compares with sameKey
directly. keyType is retained only for the getKeyType() API.

XQTS misses this because qt3/qt4 place their cross-family distinctness tests on
map:entry() (a MIXED-key map, where convert() was a no-op), never on the
constructor form. New XQSuite tests (mapKeySameKey.xqm) cover the constructor
form: cross-family distinctness plus within-family positives that must still
interchange.

Coupling: this closes the non-conformant serialization "bare-string backdoor".
eXist serialization parameters are looked up by their exist: QName; a user could
also pass them as the prefixed string "exist:expand-xincludes" only because
convert() coerced the QName lookup to a string. With the fix, QName and string
are distinct keys, so only the spec-aligned QName form is honored. The affected
tests (serialize.xql, fnSerializeNewline.xqm, file module sync-serialize.xqm)
are updated: the prefixed-string cases now assert the key is ignored, and the
real functionality is retained via the QName form.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@joewiz joewiz requested a review from a team as a code owner June 18, 2026 00:26
joewiz added a commit to joewiz/exist that referenced this pull request Jun 18, 2026
…ions map

file:sync's 3rd-argument options map (since 6.1.0) accepts serialization options, but the eXist
extension parameter expand-xincludes could not be set to a non-default value: it was silently
overwritten by the hard-coded default. SerializerUtils.getSerializationOptions reads W3C parameters
by their string key but the eXist extension parameters by their exist-namespace QName key;
Sync.getOptions' "fill defaults" loop re-checked the input map by the bare string key only, so for an
exist-namespaced parameter the lookup always missed and the default clobbered the supplied value.

Fix: the "supplied?" check (extracted as suppliedInOptionsMap) tests both key forms -- the bare string
key for W3C parameters and the exist-namespace QName key for the extension parameters.

Tests use the spec-conformant exist-namespace QName key as the primary form (the existing
insert-final-newline tests are converted to it too). A canonical test
(fnSerializeExtensionParamKeyForms.xqm) documents the key-form contract for exist serialization
parameters: the QName key is honored; the prefixed "exist:..." string key and the bare local name are
both ignored.

Stacked on the MapType same-key conformance fix (eXist-db#6491): that PR removes the map
key-coercion that previously let a bare "exist:..." string key match an exist serialization
parameter, so this branch's tests assume the QName-only behavior. Until eXist-db#6491 merges, the diff here
also carries its commits; rebase onto develop once it lands.

Relates to eXist-db#3704 (the options-map key-matching defect; the
conf.xml-default-reading half of eXist-db#3704 remains open).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
joewiz added a commit to joewiz/exist that referenced this pull request Jun 18, 2026
compression:zip and compression:tar serialized XML resources written to
the archive using only the serializer defaults plus any prolog
"declare option exist:serialize"; there was no way to control
serialization per call. Add an optional 5th argument of serialization
options, mirroring fn:serialize and file:sync#3.

As with fn:serialize, the argument accepts either form:
  - a map(*), e.g. map { "indent": false() }; or
  - a W3C output:serialization-parameters element.
Standard W3C parameters use their string key (map) or output-namespace
child element; eXist extension parameters use their exist-namespace
QName key (map) or exist-namespace child element. Parsing is delegated
to FunSerialize.getSerializationProperties so both forms behave exactly
as they do for fn:serialize. The result is applied after (so overriding)
the prolog declare option exist:serialize, and re-read on each call.

The new arity is registered for both zip and tar in CompressionModule.

Stacked on eXist-db#6491 (MapType same-key conformance): the
exist-namespace QName key is the spec-conformant form for the extension
parameters. Until eXist-db#6491 merges this branch also carries its commit; it
rebases to a clean diff once eXist-db#6491 lands.

Relates to eXist-db#178

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
joewiz added a commit to joewiz/roaster that referenced this pull request Jun 18, 2026
…tus code

Guards the fix in router:get-content-type-for-code: a JSON route that relies
on response-code -> content-type negotiation (api:arrays-get returns a bare map
with no explicit response type) must serialize as application/json, not fall
back to application/xml. This fails on eXist >= eXist-db/exist#6491 without the
string($code) lookup (integer key no longer matches the string-keyed responses
map), and passes with it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant