ATLAS-5336: Upgrade Kafka to 3.9.1; embedded notification broker uses KRaft (no ZooKeeper)#689

Open
ramackri wants to merge 1 commit into apache:master from ramackri:ATLAS-5336

ramackri commented Jul 4, 2026

Summary

  • Bump Apache Kafka client and test broker from 2.8.2 to 3.9.1 (Scala 2.13).
  • Replace ZooKeeper-based embedded notification broker with KRaft via KafkaClusterTestKit.
  • Update docker dev Kafka image to kafka_2.13-3.9.1.

Motivation

  • Kafka 2.8.x clients are only partially compatible with modern 3.9.x / 4.x brokers.
  • Embedded dev/test broker no longer requires an in-process ZooKeeper instance.
  • Aligns Atlas with the 3.9.x bridge release before external cluster KRaft migration.

Changes

| Area | Before | After |
|---|---|---|
| kafka.version | 2.8.2 | 3.9.1 |
| kafka.scala.binary.version | 2.12 | 2.13 |
| EmbeddedKafkaServer | In-process ZK + KafkaServer | KafkaClusterTestKit (KRaft) |
| Docker atlas-kafka image | kafka_2.12-* | kafka_2.13-3.9.1 |
| Distro embedded config | ZK-oriented settings | KRaft; bootstrap.servers set at runtime |
| webapp/pom.xml | Transitive JAX-RS only | Explicit jackson-jaxrs + jsr311-api for REST after Jackson bump |

webapp/pom.xml — why these dependencies were added

This PR bumps root jackson.version from 2.12.7 → 2.16.2 (required by Kafka 3.9 KafkaClusterTestKit test dependencies). That version flows into atlas-intg, which declares:

  • jackson-jaxrs-base at ${jackson.version}
  • jackson-jaxrs-json-provider at ${jackson.version}

Atlas server REST still runs on Jersey 1.19 (JAX-RS 1.1 / JSR-311). After the Jackson bump, the rebuilt server WAR pulled in Jackson JAX-RS 2.16 transitively while Jersey and its JSON providers expect the older 2.12-era JAX-RS integration. In manual Docker testing this produced an inconsistent classpath — Atlas failed to serve REST reliably (GET /api/atlas/admin/version did not return 200) even though the Kafka/embedded-broker changes were correct.

Fix in webapp/pom.xml:

| Dependency | Version | Purpose |
|---|---|---|
| jackson-jaxrs-base | 2.12.7 (pinned) | JAX-RS integration layer for Jackson; kept at 2.12.7 for Jersey 1.19 compatibility |
| jackson-jaxrs-json-provider | 2.12.7 (pinned) | JacksonJaxbJsonProvider used by Jersey for JSON request/response serialization on Atlas REST APIs |
| jsr311-api | 1.1.1 | JAX-RS 1.1 API (javax.ws.rs.*) required by Jersey 1.x JSON providers |

Exclusions on the atlas-intg dependency for the two jackson-jaxrs-* artifacts keep the 2.16.2 transitive copies out of the WAR. The webapp module now owns a single, known-good JAX-RS Jackson version for the server, while the rest of the build can use Jackson 2.16.2 where Kafka requires it.

This is intentionally not a Jersey upgrade — it decouples server-side JAX-RS Jackson (2.12.7) from the core Jackson bump (2.16.2) needed for Kafka 3.9.1.
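One way to confirm the pinning took effect is to inspect the rebuilt WAR for duplicate jackson-jaxrs-* jars. The following is a minimal sketch, not part of this PR: the function names are hypothetical, and it assumes the standard WEB-INF/lib layout and the usual artifact-version.jar file naming.

```python
import re
import zipfile
from collections import defaultdict

# Matches e.g. WEB-INF/lib/jackson-jaxrs-base-2.12.7.jar
# and captures ("jackson-jaxrs-base", "2.12.7").
_JAR_RE = re.compile(r"^WEB-INF/lib/(jackson-jaxrs-[a-z-]+?)-(\d[\w.]*)\.jar$")


def jaxrs_jackson_versions(war):
    """Map each jackson-jaxrs-* artifact bundled in the WAR to the set of versions found.

    `war` may be a filesystem path or a binary file-like object.
    """
    found = defaultdict(set)
    with zipfile.ZipFile(war) as archive:
        for name in archive.namelist():
            match = _JAR_RE.match(name)
            if match:
                found[match.group(1)].add(match.group(2))
    return dict(found)


def check_pinned(versions, expected="2.12.7"):
    """Return a list of problems; an empty list means only the pinned version is present."""
    problems = []
    for artifact, seen in sorted(versions.items()):
        if seen != {expected}:
            problems.append(f"{artifact}: found {sorted(seen)}, expected only {expected}")
    return problems
```

Calling `check_pinned(jaxrs_jackson_versions(path_to_war))` on the server WAR should return an empty list if the exclusions work as described; any entry listing both 2.12.7 and 2.16.2 would point to the mixed classpath seen before the fix.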

Testing

Unit tests

| Test | Module | Command | Result |
|---|---|---|---|
| KafkaNotificationTest | notification | mvn -pl notification -Dtest=KafkaNotificationTest test | PASS: embedded KRaft broker starts; produce/consume on ATLAS_HOOK |
| NotificationHookConsumerKafkaTest | webapp | mvn -pl webapp -am -Dtest=NotificationHookConsumerKafkaTest -DskipEnunciate=true test | PASS (3 tests): embedded broker, hook consumer, import topic |

NotificationHookConsumerKafkaTest was run with -am (--also-make) so that atlas-server-api and sibling modules are rebuilt from the same source tree and match the compile-time APIs.

Manual — embedded KRaft broker smoke test

Unit tests exercise EmbeddedKafkaServer inside a short-lived Surefire JVM. A separate manual check verified the same path in the full Atlas server process (complete WAR classpath, normal Spring startup order, real graph store and Solr).

Setup: Atlas server running in Docker with Postgres and Solr. External Kafka broker was not required — embedded mode runs the broker in-process.

Steps:

  1. Set atlas.notification.embedded=true in atlas-application.properties, leaving the placeholder atlas.kafka.bootstrap.servers=localhost:9027 in place.
  2. Restart Atlas and poll GET /api/atlas/admin/version until HTTP 200.
  3. Confirm server logs show:
    • EmbeddedKafkaServer.start(isEmbedded=true)
    • Starting embedded KRaft kafka (log.dir=.../data/kafka/kafka)
    • Embedded KRaft kafka server started at localhost:<ephemeral-port>
  4. Verify atlas.kafka.bootstrap.servers was rewritten at runtime to the live broker address (placeholder not used as-is).
  5. Restore atlas.notification.embedded=false and external bootstrap servers; restart Atlas.
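Steps 2 to 4 can be scripted. The sketch below is illustrative only and was not part of the testing reported here: the helper names are hypothetical, the log fragments are the ones quoted in step 3, and the `fetch` and `sleep` parameters exist only so the polling loop can be exercised without a running server.

```python
import re
import time
import urllib.error
import urllib.request

# Log fragments quoted in step 3; the port in the last line is ephemeral.
EXPECTED_MARKERS = (
    "EmbeddedKafkaServer.start(isEmbedded=true)",
    "Starting embedded KRaft kafka",
    "Embedded KRaft kafka server started at",
)
_STARTED_RE = re.compile(r"Embedded KRaft kafka server started at (\S+:\d+)")


def http_status(url, timeout=5.0):
    """Return the HTTP status of a GET, or None if the server is not reachable yet."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, and similar


def wait_for_200(url, attempts=60, interval=5.0, fetch=http_status, sleep=time.sleep):
    """Poll `url` until `fetch` returns 200; return False if the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(interval)
    return False


def missing_markers(log_text):
    """Return the expected log fragments that do not appear in `log_text`."""
    return [marker for marker in EXPECTED_MARKERS if marker not in log_text]


def live_broker_address(log_text):
    """Extract host:port from the 'server started at' line, or None if it is absent."""
    match = _STARTED_RE.search(log_text)
    return match.group(1) if match else None
```

For example, `wait_for_200("http://localhost:21000/api/atlas/admin/version")` polls the version endpoint, assuming Atlas listens on port 21000 in the local setup. For step 4, the address returned by `live_broker_address` can be compared with the placeholder `localhost:9027` to confirm the value was rewritten.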

Result: PASS — KRaft broker started inside the real server JVM; Atlas REST returned 200; graph, Solr, Jersey, and notification stack all initialized.

Manual — external Kafka 3.9.1 (production-like path)

Verified Atlas with atlas.notification.embedded=false against an external Kafka 3.9.1 broker (kafka_2.13-3.9.1), using the rebuilt Atlas server image.

| Flow | What was verified | Result |
|---|---|---|
| Hive metadata ingestion | CREATE TABLE via Hive; Atlas REST search for hive_column on new table | Entity present; hook consumer processed ATLAS_HOOK |
| Classification → Ranger | PII tag applied via Atlas REST; TagSync log showed ATLAS_ENTITIES consumption; tag mapping visible in Ranger Admin | Classification propagated |
| Hive tag-based access control | SELECT on classified column as allowed/denied users | Deny and mask enforced per policy |
| Hive audit | Ranger Admin access audit for Hive service | Audit rows recorded for table access |
| Kafka authorizer + audit | Produce/consume on Kerberos Kafka topic under Ranger policy | Authorizer allow/deny reflected in Ranger audit |

Overall: PASS — full metadata path (Hive → external Kafka → Atlas → ATLAS_ENTITIES → TagSync → Ranger) and audit paths worked with Kafka 3.9.1.

Notes

  • Production deployments should continue using atlas.notification.embedded=false and external bootstrap.servers.
  • Jackson core/databind bumped to 2.16.2 for Kafka 3.9 testkit; server JAX-RS Jackson intentionally remains 2.12.7 for Jersey 1.19 compatibility.
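The first note can be turned into a pre-deployment check on atlas-application.properties. This is a rough sketch with hypothetical function names. It assumes simple key=value lines with no continuation lines, requires the embedded flag to be set to false explicitly, and treats the localhost:9027 placeholder from the smoke test as a misconfiguration. The last two are assumptions of this example, not documented Atlas behavior.

```python
PLACEHOLDER = "localhost:9027"  # placeholder value quoted in the smoke-test steps


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def production_config_problems(props):
    """Return human-readable problems; an empty list means the settings look production-like."""
    problems = []
    # Assumption: a missing flag is reported too, since the default is not stated in the PR.
    if props.get("atlas.notification.embedded", "").lower() != "false":
        problems.append("atlas.notification.embedded should be false")
    servers = props.get("atlas.kafka.bootstrap.servers", "")
    if not servers:
        problems.append("atlas.kafka.bootstrap.servers is not set")
    elif servers == PLACEHOLDER:
        problems.append("atlas.kafka.bootstrap.servers still holds the embedded-mode placeholder")
    return problems
```

Running `production_config_problems(parse_properties(text))` on the contents of the properties file yields an empty list when the embedded broker is disabled and an external broker address is configured.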

https://issues.apache.org/jira/browse/ATLAS-5336

… KRaft

Bump kafka-clients and embedded test broker from 2.8.2 to 3.9.1 (Scala 2.13),
replace ZooKeeper-based EmbeddedKafkaServer with KafkaClusterTestKit, and
update docker dev Kafka image packaging to kafka_2.13-3.9.1.