Skip to content

fix: CustomItemSerializer not applied correctly in queries and SqlParameter [Cosmos]#48702

Open
xinlian12 wants to merge 21 commits intoAzure:mainfrom
xinlian12:fix/issue-45521-custom-item-serializer
Open

fix: CustomItemSerializer not applied correctly in queries and SqlParameter [Cosmos]#48702
xinlian12 wants to merge 21 commits intoAzure:mainfrom
xinlian12:fix/issue-45521-custom-item-serializer

Conversation

@xinlian12
Copy link
Copy Markdown
Member

@xinlian12 xinlian12 commented Apr 7, 2026

Summary

Fixes #45521

Two bugs with CustomItemSerializer:

Bug 1 Deserialization failures in ORDER BY/GROUP BY/aggregate/DISTINCT queries

When customItemSerializer is configured on CosmosClientBuilder, complex queries (ORDER BY, GROUP BY, VALUE, COUNT, SUM, DISTINCT, HybridSearch) fail because the custom serializer leaks into the internal query pipeline and is used to deserialize internal SDK structures (OrderByRowResult, Document, etc.).

Root cause: PipelinedDocumentQueryExecutionContext.createBaseComponentFunction() called setCustomItemSerializer(null) on cloned request options, but RxDocumentClientImpl.getEffectiveItemSerializer() falls back to the client-level defaultCustomSerializer when the request-level serializer is null.

Fix: Set CosmosItemSerializer.DEFAULT_SERIALIZER instead of null on internal request options. This causes getEffectiveItemSerializer() to return DEFAULT_SERIALIZER immediately without falling through to the client-level custom serializer.

Bug 2 SqlParameter ignores customItemSerializer

SqlParameter.setValue() delegates to JsonSerializable.set() which hardcodes INTERNAL_DEFAULT_SERIALIZER. Custom serialization settings (e.g., dates as ISO strings) are never applied to query parameters.

Fix: Added rawValue field to SqlParameter to retain the original value, plus applySerializer() to re-serialize with a custom serializer. Called from PipelinedQueryExecutionContextBase.createAsync() where the effective serializer is known.

Changes

File Change
PipelinedDocumentQueryExecutionContext.java null DEFAULT_SERIALIZER in 4 internal pipeline paths
PipelinedQueryExecutionContextBase.java Apply serializer to SqlParameters at query execution time
SqlParameter.java Added rawValue field + applySerializer() method
SqlQuerySpec.java Added applySerializerToParameters()

Testing

Build compiles cleanly. Existing tests unaffected. Additional test cases for ORDER BY/GROUP BY/aggregate queries with custom serializer and SqlParameter serialization should be added.

Annie Liang and others added 2 commits April 6, 2026 19:45
…eline structures and is applied to SqlParameter serialization

Fixes Azure#45521

Bug 1: Changed PipelinedDocumentQueryExecutionContext to set DEFAULT_SERIALIZER
instead of null on internal request options, preventing getEffectiveItemSerializer()
from falling back to the client-level custom serializer for internal pipeline processing.
This fixes ORDER BY, GROUP BY, aggregate, DISTINCT, DCount, and hybrid search queries
when a custom serializer is configured on CosmosClientBuilder.

Bug 2: Added serializer-aware parameter serialization in the query execution path
so SqlParameter values are serialized with the effective custom serializer.
SqlParameter now stores the raw value and re-serializes it when applySerializer()
is called from PipelinedQueryExecutionContextBase.createAsync().

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…structor

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@xinlian12 xinlian12 requested review from a team and kirankumarkolli as code owners April 7, 2026 03:03
Copilot AI review requested due to automatic review settings April 7, 2026 03:03
@xinlian12 xinlian12 requested a review from a team as a code owner April 7, 2026 03:03
@github-actions github-actions bot added the Cosmos label Apr 7, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes Cosmos DB query behavior when a CustomItemSerializer is configured by preventing it from deserializing internal SDK query structures, and by ensuring query parameters can be serialized with the effective item serializer.

Changes:

  • Force internal query pipeline request options to use CosmosItemSerializer.DEFAULT_SERIALIZER (instead of null) to avoid leaking client-level custom serializers into internal deserialization.
  • Add internal plumbing to re-serialize SqlParameter values using the effective serializer at query execution time.
  • Introduce internal bridge method to apply parameter re-serialization across packages.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/models/SqlQuerySpec.java Adds internal method to apply an effective serializer to all parameters.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/models/SqlParameter.java Stores raw parameter values and adds internal re-serialization hook.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/models/ModelBridgeInternal.java Exposes internal bridge method to apply serializer to query parameters.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/PipelinedQueryExecutionContextBase.java Applies effective serializer to parameters at query execution time.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/PipelinedDocumentQueryExecutionContext.java Forces DEFAULT serializer on internal pipeline request options to prevent custom serializer leaks.
.coding-harness/spec.json Adds harness metadata/spec for the reported issue and intended fix.

@xinlian12 xinlian12 changed the title fix: CustomItemSerializer not applied correctly in queries and SqlParameter [Cosmos] [NO REVIEW]fix: CustomItemSerializer not applied correctly in queries and SqlParameter [Cosmos] Apr 7, 2026
Annie Liang and others added 5 commits April 6, 2026 21:25
…ndition

When a custom serializer is configured, applySerializerToParameters() mutates
SqlParameter values in-place on the user's original SqlQuerySpec. If the same
SqlQuerySpec is reused across concurrent queries with different serializers,
the mutations race without synchronization.

Defensive fix: clone the SqlQuerySpec and its SqlParameter list before applying
the serializer, so the original object is never mutated by the SDK.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… method, use ImplementationBridgeHelpers accessor

- C1: Fix forceSerialization=false to true in SqlParameter.applySerializer()
- C4: Remove applySerializerToParameters from ModelBridgeInternal, use SqlQuerySpecHelper accessor pattern via ImplementationBridgeHelpers instead

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…aggregate, DISTINCT, and SqlParameter queries

Covers fix for Azure#45521

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ze()

The Spark connector and potentially other consumers implement
CosmosItemSerializer with only deserialize() and stub serialize()
as unimplemented. Guard with try-catch to fall back to default
serialization when the custom serializer's serialize() is not
available.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@xinlian12
Copy link
Copy Markdown
Member Author

@sdkReviewAgent-2

@xinlian12
Copy link
Copy Markdown
Member Author

Review complete (54:56)

No new comments — existing review coverage is sufficient.

Steps: cross-sdk, history, past-prs, quality, sanity-check, synthesis, test-coverage

xinlian12 and others added 2 commits April 8, 2026 11:56
…ializer in tests

Address PR review comments from xinlian12 - use the clientSerializer
variable (from getClientBuilder().getCustomItemSerializer()) instead of
hardcoding EnvelopWrappingItemSerializer.INSTANCE_NO_TRACKING_ID_VALIDATION
in the new query test methods.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@xinlian12 xinlian12 changed the title [NO REVIEW]fix: CustomItemSerializer not applied correctly in queries and SqlParameter [Cosmos] fix: CustomItemSerializer not applied correctly in queries and SqlParameter [Cosmos] Apr 8, 2026
@xinlian12
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - spark

@xinlian12
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - kafka

@xinlian12
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - tests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

2 similar comments
@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Copy Markdown
Member

@FabianMeiswinkel FabianMeiswinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@FabianMeiswinkel FabianMeiswinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - but there are still a few test issues

SELECT VALUE COUNT(1) returns a scalar integer, so use Integer.class
instead of ObjectNode.class as the result type. ObjectNode.class fails
because ValueUnwrapCosmosItemSerializer extracts the _value field and
cannot convert the resulting integer to ObjectNode.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@xinlian12 xinlian12 force-pushed the fix/issue-45521-custom-item-serializer branch from eacb7b6 to 20bd377 Compare April 10, 2026 16:13
xinlian12 and others added 4 commits April 10, 2026 09:26
Create a BasicCustomItemSerializer that mirrors the real-world use case
from issue Azure#45521 — a simple ObjectMapper-based serializer with custom
settings (dates as ISO strings via JavaTimeModule) without transforming
the document structure.

Changes:
- Add BasicCustomItemSerializer inner class with custom ObjectMapper
- Add it to the Factory data provider so query tests also run with it
- Update query tests to use the custom serializer directly when the
  serializer does not transform document structure (vs falling back to
  DEFAULT_SERIALIZER for envelope-wrapping)
- Use instanceof checks for EnvelopWrappingItemSerializer instead of
  identity comparison for robustness

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…back to default

For queryWithAggregate, queryWithDistinct, and queryWithGroupBy custom
serializer tests: skip the test entirely when EnvelopWrappingItemSerializer
is used, rather than falling back to DEFAULT_SERIALIZER. These query result
shapes (aggregates, projections) are not full documents and cannot be
properly deserialized by the envelope-wrapping serializer.

Also removed unused isEnvelopeWrapper variable from queryWithOrderBy.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e serializer test

Custom serializers' deserialize(Map<String, Object>, Class<T>) API is designed
for object-to-POJO mapping. SELECT VALUE COUNT(1) returns a scalar that gets
wrapped in a {"_value": N} Document by the aggregate pipeline, which cannot
be deserialized as Integer.class through a custom serializer.

Changed to SELECT COUNT(1) which returns an object {"$1": 3} that properly
goes through the Map-based deserialization path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@xinlian12
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - kafka

@xinlian12
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - spark

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

1 similar comment
@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@xinlian12
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - tests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Adds queryWithSqlParameterDateTimeAndCustomSerializer test that validates
the end-to-end scenario: when a custom serializer writes Instant as an
ISO-8601 string (not a timestamp), SqlParameter must apply the same
serializer so the query filter matches the stored representation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@xinlian12
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - kafka

@xinlian12
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - spark

@xinlian12
Copy link
Copy Markdown
Member Author

/azp run java - cosmos - tests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

1 similar comment
@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@xinlian12
Copy link
Copy Markdown
Member Author

@sdkReviewAgent-2

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

When SqlParameter.applySerializer() calls serialize(Instant), the previous
implementation did customMapper.convertValue(item, Map.class) which fails
for non-object values (Instant, String, numbers).

Now serialize() first converts to JsonNode, checks isObject(), and for
scalar values returns a single-entry Map with the well-known primitive
value key so the framework correctly sets it as a scalar. This ensures
SqlParameter values like Instant are serialized consistently with how the
custom serializer stores them in documents (e.g. ISO-8601 strings).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
assertThat(results).isNotNull();
assertThat(results).hasSize(1);
assertThat(results.get(0).id).isEqualTo(id);
assertThat(results.get(0).createdAt).isEqualTo(createdAt);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocking — Correctness: applySerializer is ineffective for scalar SqlParameter values

The CosmosItemSerializer.serialize() contract returns Map<String, Object>, which fundamentally cannot represent scalar values. For custom serializers like BasicCustomItemSerializer, serialize(scalar) calls customMapper.convertValue(scalar, Map.class), which always throws for non-Map-able types (String, Integer, Instant, etc.). The catch(Throwable) block silently swallows the exception, and the parameter retains its default-serialized representation.

Concrete failure trace for this test:

  1. Document created with BasicCustomItemSerializer (WRITE_DATES_AS_TIMESTAMPS=false) → createdAt stored as ISO string "2026-03-15T10:30:00Z"
  2. SqlParameter("@createdAt", Instant)setValue() stores rawValue=Instant and default-serializes to epoch number 1773837000 (SDK's ObjectMapper has WRITE_DATES_AS_TIMESTAMPS=true)
  3. Clone (line 78): getValue(Object.class) reads the epoch number from JSON property bag → clone's rawValue = Long(1773837000) (type lost)
  4. applySerializer(BasicCustomItemSerializer)serialize(Long)convertValue(Long, Map.class) → throws IllegalArgumentException
  5. catch(Throwable) fires → parameter keeps epoch number 1773837000
  6. Query: c.createdAt = @createdAt compares string "2026-03-15T10:30:00Z" vs number 17738370000 results
  7. assertThat(results).hasSize(1)test fails

This means Bug 2's fix (SqlParameter custom serialization) is a no-op for the exact scenario described in issue #45521 — scalar query parameters with types where the custom serializer changes the wire format.

Root cause: DefaultCosmosItemSerializer.serialize() handles scalars via PrimitiveJsonNodeMap (an internal type), but custom serializers can't — the Map<String, Object> return type makes it impossible to return a scalar from serialize().

Suggested approaches (pick one):

(a) Use the serializer's ObjectMapper directly for scalar parameter values:

void applySerializer(CosmosItemSerializer serializer) {
    if (this.rawValue != null && serializer != null) {
        ObjectMapper mapper = serializer.getItemObjectMapper();
        JsonNode node = mapper.convertValue(this.rawValue, JsonNode.class);
        this.jsonSerializable.getPropertyBag().set("value", node);
    }
}

But this requires BasicCustomItemSerializer (and customer serializers) to call setItemObjectMapper(customMapper) — document this contract.

(b) Add a serializeToNode(T item) method to CosmosItemSerializer with a default implementation using getItemObjectMapper(), and use that in applySerializer instead of serialize().

⚠️ AI-generated review — may be incorrect. Agree? → resolve the conversation. Disagree? → reply with your reasoning.

List<SqlParameter> originalParams = original.getParameters();
if (originalParams != null) {
for (SqlParameter p : originalParams) {
clonedParams.add(new SqlParameter(p.getName(), p.getValue(Object.class)));
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Blocking — Correctness: Clone via getValue(Object.class) loses the original Java type

This line reconstructs the SqlParameter value by reading from the JSON property bag (getObject("value", Object.class)getValue(JsonNode)), not from rawValue. The getValue(JsonNode) method converts JSON nodes back to Java primitives: LongNodeLong, DecimalNodeDouble, TextNodeString.

For an Instant parameter that was default-serialized as epoch-seconds (a numeric node), getValue(Object.class) returns a Long — the clone's rawValue is Long, not the original Instant. When applySerializer later tries to re-serialize this Long with the custom serializer, it serializes the wrong type.

Fix: Expose rawValue through a package-private accessor and use it for cloning:

// In SqlParameter.java (package-private):
Object getRawValue() { return this.rawValue; }

// Here:
clonedParams.add(new SqlParameter(p.getName(), p.getRawValue()));

Or add a package-private copy constructor:

SqlParameter(SqlParameter original) {
    this.jsonSerializable = new JsonSerializable(
        original.jsonSerializable.getPropertyBag().deepCopy());
    this.rawValue = original.rawValue;
}

This is a prerequisite for any fix to the applySerializer scalar issue — even after serialize() is made to handle scalars, the clone still needs to preserve the original Java type.

⚠️ AI-generated review — may be incorrect. Agree? → resolve the conversation. Disagree? → reply with your reasoning.

if (this.rawValue != null && serializer != null) {
try {
this.jsonSerializable.set("value", this.rawValue, serializer, true);
} catch (Throwable t) {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Recommendation — Error Handling: catch(Throwable) catches fatal JVM errors; log level too low

Two issues with this error handling:

  1. Throwable catches Error subclassesOutOfMemoryError, StackOverflowError, ThreadDeath, and VirtualMachineError are fatal JVM conditions that should never be silently swallowed. The intended use case (Spark connector stubs throwing UnsupportedOperationException) is an Exception, not an Error.

  2. LOGGER.debug() masks incorrect query results — When the serializer fails and falls back to default serialization, the parameter value may have a different wire representation than the document field (e.g., epoch number vs ISO string). The query silently returns zero results. At DEBUG level, this is invisible in production. The log message says "does not support serialize()" but the actual failure mode (scalar-to-Map conversion failure) is different and much harder to diagnose.

// Suggested fix:
} catch (Exception e) {
    LOGGER.warn(
        "Custom serializer '{}' failed to serialize SqlParameter value of type '{}'; "
            + "falling back to default serialization. Query results may not match if the "
            + "custom serializer changes the wire format for this type.",
        serializer.getClass().getSimpleName(),
        this.rawValue.getClass().getSimpleName(),
        e);
}

⚠️ AI-generated review — may be incorrect. Agree? → resolve the conversation. Disagree? → reply with your reasoning.

@xinlian12
Copy link
Copy Markdown
Member Author

Review complete (50:22)

Posted 3 inline comment(s).

Steps: ✓ context, correctness, cross-sdk, design, history, past-prs, synthesis, test-coverage

xinlian12 and others added 2 commits April 11, 2026 10:05
Previously caught Throwable, silently swallowing all serialization errors
and falling back to default serialization. This masked real bugs where the
custom serializer's output didn't match the document's stored format,
causing queries to silently return wrong results.

Now only catches:
- UnsupportedOperationException (Java standard for 'not implemented')
- scala.NotImplementedError (Scala's ??? operator, used by Spark connector)

All other exceptions propagate so serializer bugs are surfaced immediately.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When cloning SqlParameters for custom serializer application, the previous
code used p.getValue(Object.class) which reads from the JSON property bag.
This loses the original Java type (e.g. Instant becomes Long) because the
JSON property bag stores the default-serialized representation.

Now uses getRawValue() to preserve the original Java type, ensuring that
applySerializer() re-serializes the correct type with the custom serializer.

Changes:
- SqlParameter: add package-private getRawValue() accessor
- SqlQuerySpec: add cloneSqlParameter() to bridge accessor
- ImplementationBridgeHelpers: add cloneSqlParameter to SqlQuerySpecAccessor
- PipelinedQueryExecutionContextBase: use cloneSqlParameter() for cloning

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Cosmos – CustomItemSerializer - not working in certain queries and not applied in SqlParameter

3 participants