Description
Motivation
The current flink-agents-dist-*.jar bundles a large number of third-party
dependencies without relocating them:
- jackson-databind 2.18.2
- kafka-clients 4.0.0
- kotlin-stdlib 1.x (transitive from openai-java)
- anthropic-java, openai-java, etc.
When users drop this fat jar into the Flink cluster's lib/ directory, or
include it via --jars at job submission time, these classes land in the
same classloader as the user's job code. If the user's job depends on a
different version of one of these libraries, e.g. kafka-clients (a very
common scenario in Flink streaming jobs) or jackson-databind, runtime
errors such as NoSuchMethodError or ClassCastException are likely.
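To confirm that such a conflict is actually happening, it helps to see which jar a class was loaded from. The following is a minimal diagnostic sketch, not part of flink-agents: the class name `ClassOrigin` and the helper `originOf` are hypothetical, and the default class name passed to `main` is only an example.

```java
import java.security.CodeSource;

/** Hypothetical helper: reports which jar or directory a class was loaded from. */
public class ClassOrigin {

    /** Returns the code source location of a class, or a placeholder if the JVM does not expose one. */
    public static String originOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes from the JDK itself typically have no code source.
            return "(bootstrap/unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Example class name only; pass any fully qualified class name as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.kafka.clients.producer.KafkaProducer";
        try {
            Class<?> cls = Class.forName(name);
            System.out.println(name + " loaded from " + originOf(cls));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Calling `originOf` from inside the job (for example in an operator's `open` method) shows whether the class resolves to the dist jar or to the user's own dependency.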
Note that Flink itself already relocates its own bundled dependencies
(e.g. flink-shaded-jackson, flink-shaded-guava) precisely to avoid
this problem. flink-agents-dist should follow the same approach.
Analysis
The top offenders by size and conflict risk:
| Dependency | Size | Risk | Notes |
|---|---|---|---|
| kafka-clients 4.0.0 | ~22 MB | High | Many Flink jobs use Kafka; major version gap |
| openai-java + kotlin-stdlib | ~80 MB | Medium | Uncommon in user code but large |
| jackson-databind 2.18.2 | ~6 MB | Medium | Flink uses shaded jackson, user code may not |
| anthropic-java | ~17 MB | Low | Rarely used alongside |
Proposed Solution
Relocate in shade plugin
Add <relocation> rules in dist/pom.xml to move third-party packages
under an org.apache.flink.agents.shaded.* namespace:
    <relocations>
      <relocation>
        <pattern>com.fasterxml.jackson</pattern>
        <shadedPattern>org.apache.flink.agents.shaded.jackson</shadedPattern>
      </relocation>
      <relocation>
        <pattern>org.apache.kafka</pattern>
        <shadedPattern>org.apache.flink.agents.shaded.kafka</shadedPattern>
      </relocation>
      <!-- etc. -->
    </relocations>
This requires verifying that internal usages of these libraries (especially
Jackson annotations on public API classes, Kafka SPI configurations) still
work correctly after relocation.
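Part of that verification can be automated by scanning the built jar for class files that still live under the original package paths. The sketch below is one possible check, assuming the two relocations shown above; the class name `RelocationCheck` and the method `findUnrelocated` are hypothetical and not part of the project. It only inspects class file paths, so it does not catch class names referenced as strings in configuration or in `META-INF/services` files.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical check: lists class files in a jar that were not relocated. */
public class RelocationCheck {

    /** Returns the .class entries whose path starts with one of the given unrelocated prefixes. */
    public static List<String> findUnrelocated(String jarPath, List<String> forbiddenPrefixes)
            throws IOException {
        List<String> hits = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            for (JarEntry entry : Collections.list(jar.entries())) {
                String name = entry.getName();
                if (!name.endsWith(".class")) {
                    continue;
                }
                // Multi-release jars keep extra copies under META-INF/versions/<n>/.
                String logical = name.replaceFirst("^META-INF/versions/\\d+/", "");
                for (String prefix : forbiddenPrefixes) {
                    if (logical.startsWith(prefix)) {
                        hits.add(name);
                        break;
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: RelocationCheck <path-to-dist-jar>");
            return;
        }
        // Assumed prefixes, matching the two relocation patterns in this proposal.
        List<String> prefixes = List.of("com/fasterxml/jackson/", "org/apache/kafka/");
        List<String> hits = findUnrelocated(args[0], prefixes);
        hits.forEach(System.out::println);
        System.out.println(hits.size() + " unrelocated class file(s) found");
    }
}
```

A check like this could run in CI against the dist jar and fail the build when the list is non-empty; the Jackson annotation and Kafka SPI cases still need functional tests.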