fix: no such method error in lance arrow util due to transitive json4s usage #465

Merged
hamersaw merged 4 commits into lance-format:main from ivscheianu:fix-json4s-shading
Apr 27, 2026

ivscheianu commented Apr 21, 2026

Problem

Three call sites in lance-spark use org.json4s in ways that break when the library is included in a shaded fat-JAR with a json4s relocation rule.


1. LanceArrowUtils.toArrowField: Metadata.jsonValue() (write path)

Metadata.jsonValue() is a public method on spark-sql-api's Metadata; its return type is org.json4s.JsonAST.JValue. The original code called this and immediately chained json4s operations on the result:

meta = metadata.jsonValue.extract[Map[String, Object]].map { case (k, v) => (k, String.valueOf(v)) }

2. LanceArrowUtils.fromArrowSchema: Metadata.fromJObject(JObject(...)) (read path)

Metadata.fromJObject accepts org.json4s.JsonAST.JObject. The original code constructed a JObject and passed it directly:

Metadata.fromJObject(JObject(field.getMetadata.asScala.map { case (k, v) => (k, JString(v)) }.toList))

Root cause (problems 1 & 2): The Shade Plugin rewrites every bytecode descriptor in every class it processes, including those from lance-spark-bundle, replacing org.json4s/ with the target namespace (e.g. shaded/org/json4s/). It cannot rewrite spark-sql-api.jar because that JAR is on the cluster classpath and is not processed by the plugin.

After shading, call site 1 becomes:

INVOKEVIRTUAL org/apache/spark/sql/types/Metadata.jsonValue
              ()Lshaded/org/json4s/JsonAST$JValue;

At runtime the JVM resolves Metadata.jsonValue() from spark-sql-api.jar, whose actual descriptor is ()Lorg/json4s/JsonAST$JValue;. The descriptors do not match → NoSuchMethodError.

Call site 2 fails identically in the other direction: the Shade Plugin rewrites the JObject argument descriptor to shaded/org/json4s/JsonAST$JObject while the Spark API method still expects org/json4s/JsonAST$JObject.
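
The mismatch can be illustrated without Spark at all. The sketch below treats relocation as a plain string rewrite on JVM method descriptors. `DescriptorMismatch`, `relocate`, and `resolves` are hypothetical names used only for illustration; the real plugin rewrites class-file constant pools, and the JVM performs the comparison during method resolution.

```scala
object DescriptorMismatch {
  // Hypothetical helper: mimics a relocation rule by rewriting an internal
  // package prefix inside a JVM method descriptor.
  def relocate(descriptor: String, from: String, to: String): String =
    descriptor.replace(from, to)

  // Descriptors as declared in the unshaded Spark classes.
  val jsonValueDeclared: String   = "()Lorg/json4s/JsonAST$JValue;"
  val fromJObjectDeclared: String =
    "(Lorg/json4s/JsonAST$JObject;)Lorg/apache/spark/sql/types/Metadata;"

  // Descriptors recorded at the call sites after relocation.
  val jsonValueCallSite: String =
    relocate(jsonValueDeclared, "org/json4s/", "shaded/org/json4s/")
  val fromJObjectCallSite: String =
    relocate(fromJObjectDeclared, "org/json4s/", "shaded/org/json4s/")

  // A method is linked by name plus exact descriptor, so any textual
  // difference between call site and declaration means resolution fails.
  def resolves(callSite: String, declared: String): Boolean =
    callSite == declared
}
```

Both `resolves(jsonValueCallSite, jsonValueDeclared)` and `resolves(fromJObjectCallSite, fromJObjectDeclared)` are false here, which corresponds to the NoSuchMethodError at runtime.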

Both failures are deterministic once the conditions below are met: field.metadata in Spark defaults to Metadata.empty (never null), so both code paths always execute.

Conditions required to reproduce (problems 1 & 2):

  • Spark 3.4 or later (spark-sql-api as a standalone JAR was introduced in 3.4; before 3.4, Metadata lived in spark-sql and the relevant methods were not part of a public API JAR)
  • The consuming application shades lance-spark-bundle (or includes it in a fat-JAR execution) and relocates org.json4s
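
For concreteness, a relocation rule of the kind described in the second condition might look like the following in an sbt build. This is an assumption for illustration only: the description above refers to the Maven Shade Plugin, and sbt-assembly's `ShadeRule` is used here simply as an equivalent that can be written in Scala. The `shaded.org.json4s` target namespace mirrors the example namespace mentioned earlier.

```scala
// build.sbt fragment (requires the sbt-assembly plugin); illustrative only.
// Rewrites every reference to org.json4s.* in all bundled classes,
// including those coming from lance-spark-bundle, to shaded.org.json4s.*.
assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("org.json4s.**" -> "shaded.org.json4s.@1").inAll
)
```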

3. IndexUtils.toJson: json4s-jackson not available on OSS Spark

IndexUtils.toJson (called from FragmentBasedIndexJob.run to serialize index creation arguments) imports org.json4s.jackson.JsonMethods.{compact, render}:

import org.json4s.JsonAST._
import org.json4s.jackson.JsonMethods.{compact, render}

def toJson(args: Seq[LanceNamedArgument]): String = {
  if (args.isEmpty) "{}"
  else compact(render(JObject(...)))
}

json4s-jackson is not a declared dependency of lance-spark, and it is absent from OSS Spark's classpath. When a consumer shades lance-spark-bundle and includes org.json4s:*, they typically pull in json4s-core and/or json4s-native but not json4s-jackson, leaving the shaded references to shaded/org/json4s/jackson/JsonMethods$ unresolvable at runtime.

The failure is intermittent: toJson short-circuits to "{}" when args is empty (e.g. CREATE INDEX without options), so it is only triggered by index creation calls that pass options (e.g. FTS indexes with tokenizer configuration).


Why the Fix Must Be in the Library

The only consumer-side approach for problems 1 & 2 is to exclude LanceArrowUtils from the json4s relocation rule so that its descriptors remain unshaded. This relies on knowledge of the library's internal implementation, breaks silently if the class is moved or renamed, and would need to be re-evaluated for every library release. It is not a supportable expectation for library users.

More fundamentally, the root cause of problems 1 & 2 is a design choice within LanceArrowUtils: it calls Spark API methods that carry json4s types in their signatures and uses the results with internal json4s operations in the same expression. No external configuration can safely separate these two requirements.

For problem 3, the fix must be in the library because json4s-jackson is not a guaranteed transitive dependency in all Spark distributions and is absent from OSS Spark.


Fix

All three replacements use com.fasterxml.jackson.databind.ObjectMapper, which is available on every Spark cluster classpath and carries no json4s type in any method descriptor.

Problem 1 (toArrowField): Metadata.json returns the same JSON string that Metadata.jsonValue serialises to internally; mapper.readValue deserialises it to a map, giving the same result as the json4s extraction.

Problem 2 (fromArrowSchema): mapper.writeValueAsString(field.getMetadata) serialises a java.util.Map[String, String] to a JSON object string; Metadata.fromJson parses it back into a Metadata, the same round-trip as the json4s version.
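
A minimal sketch of the two metadata conversions, assuming Spark SQL and Jackson are on the classpath. `MetadataJsonSketch`, `toStringMap`, and `fromStringMap` are hypothetical names, not the actual LanceArrowUtils code; only `Metadata.json`, `Metadata.fromJson`, `ObjectMapper.readValue`, and `ObjectMapper.writeValueAsString` are taken from the description above.

```scala
import java.util.{Map => JMap}

import scala.collection.JavaConverters._ // on Scala 2.13, scala.jdk.CollectionConverters._ also works

import com.fasterxml.jackson.databind.ObjectMapper
import org.apache.spark.sql.types.Metadata

object MetadataJsonSketch {
  private val mapper = new ObjectMapper()

  // Write path: Spark Metadata -> string map suitable for Arrow field metadata.
  // Only Metadata.json (returns String) is called, so no json4s type appears
  // in any descriptor at this call site.
  def toStringMap(metadata: Metadata): Map[String, String] = {
    val parsed = mapper.readValue(metadata.json, classOf[JMap[String, Object]])
    parsed.asScala.map { case (k, v) => (k, String.valueOf(v)) }.toMap
  }

  // Read path: Arrow field metadata (a Java string map) -> Spark Metadata.
  // Metadata.fromJson takes a String, again avoiding json4s in the signature.
  def fromStringMap(arrowMetadata: JMap[String, String]): Metadata =
    Metadata.fromJson(mapper.writeValueAsString(arrowMetadata))
}
```

Because both directions exchange plain strings with Spark, a relocation rule on org.json4s has nothing to rewrite at these call sites.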

Problem 3 (IndexUtils.toJson): An ObjectNode is built field-by-field and serialised with mapper.writeValueAsString, which produces semantically identical output. The ObjectMapper instance is held as a private singleton on the IndexUtils object; ObjectMapper is thread-safe for serialisation once constructed.
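
The ObjectNode approach can be sketched as follows, assuming Jackson is on the classpath. `NamedArg` is a hypothetical stand-in for LanceNamedArgument, whose fields are not shown in this description, and the value-type handling is an assumption rather than the merged implementation.

```scala
import com.fasterxml.jackson.databind.ObjectMapper

object ToJsonSketch {
  // Hypothetical stand-in for LanceNamedArgument (real fields not shown here).
  final case class NamedArg(name: String, value: Any)

  // Single shared mapper, configured once and then only used for serialisation.
  private val mapper = new ObjectMapper()

  def toJson(args: Seq[NamedArg]): String =
    if (args.isEmpty) "{}"
    else {
      val node = mapper.createObjectNode()
      args.foreach { arg =>
        arg.value match {
          case s: String  => node.put(arg.name, s)
          case b: Boolean => node.put(arg.name, b)
          case i: Int     => node.put(arg.name, i)
          case l: Long    => node.put(arg.name, l)
          case d: Double  => node.put(arg.name, d)
          case null       => node.putNull(arg.name)
          // Assumption: anything else is written as its string form.
          case other      => node.put(arg.name, String.valueOf(other))
        }
      }
      mapper.writeValueAsString(node)
    }
}
```

For example, `ToJsonSketch.toJson(Seq(NamedArg("tokenizer", "simple"), NamedArg("with_position", true)))` yields `{"tokenizer":"simple","with_position":true}`; the option names here are illustrative only.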

No json4s type appears in any descriptor after these changes. The fixes are independent of the consuming application's shading configuration.

github-actions bot added the bug label Apr 21, 2026
ivscheianu changed the title from "fix: no such method error in lance arrow util when json4s is shaded" to "fix: no such method error in lance arrow util due to transitive json4s usage" Apr 22, 2026

hamersaw left a comment

Looks great, thanks for the fix!

hamersaw merged commit b4d86fa into lance-format:main Apr 27, 2026
22 checks passed