[common] Support catalog context without Hadoop configuration #8193

Merged: JingsongLi merged 4 commits into apache:master from tchivs:paimon-create-context-without-hadoop on Jun 15, 2026

tchivs commented Jun 10, 2026

Contributor

Purpose

This is a narrower follow-up to #6653 for engines such as Trino that provide their own FileIO and should not require Hadoop configuration initialization.

Instead of refactoring the FileIO/CatalogContext hierarchy, this keeps all existing CatalogContext.create(...) behavior unchanged and adds an explicit CatalogContext.createWithoutHadoop(...) factory for the no-Hadoop path.

Changes

  • Keep existing CatalogContext.create(...) overloads loading Hadoop configuration by default for compatibility.
  • Add CatalogContext.createWithoutHadoop(...) for callers that provide their own FileIOLoader.
  • Make hadoopConf() fail with a clear IllegalStateException when called on a Hadoop-free context.
  • Add catalog tests, including a classloader test that filters Hadoop classes and verifies the new path can create a catalog without Hadoop on the classpath.
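
As a rough illustration of the factory split described in the list above, the following sketch uses a simplified stand-in class rather than Paimon's real CatalogContext. The class name SketchContext, the map-based stand-in for a Hadoop configuration, and the "hadoop." key prefix are assumptions made for the example; only the method names create, createWithoutHadoop and hadoopConf come from the description.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for illustration only; not Paimon's actual CatalogContext.
final class SketchContext {
    private final Map<String, String> options;
    // Stand-in for a Hadoop Configuration; null marks a Hadoop-free context.
    private final Map<String, String> hadoopConf;

    private SketchContext(Map<String, String> options, Map<String, String> hadoopConf) {
        this.options = new HashMap<>(options);
        this.hadoopConf = hadoopConf;
    }

    // Default path: always builds the (stand-in) Hadoop configuration.
    static SketchContext create(Map<String, String> options) {
        Map<String, String> conf = new HashMap<>();
        for (Map.Entry<String, String> e : options.entrySet()) {
            // Hypothetical convention: keys prefixed with "hadoop." feed the configuration.
            if (e.getKey().startsWith("hadoop.")) {
                conf.put(e.getKey().substring("hadoop.".length()), e.getValue());
            }
        }
        return new SketchContext(options, conf);
    }

    // Explicit no-Hadoop path: nothing Hadoop-related is touched.
    static SketchContext createWithoutHadoop(Map<String, String> options) {
        return new SketchContext(options, null);
    }

    Map<String, String> options() {
        return options;
    }

    // Fails loudly instead of returning null on a Hadoop-free context.
    Map<String, String> hadoopConf() {
        if (hadoopConf == null) {
            throw new IllegalStateException(
                    "Hadoop configuration is not available: context was created without Hadoop.");
        }
        return hadoopConf;
    }
}
```

In this sketch, a caller that supplies its own file IO would use createWithoutHadoop and never call hadoopConf(), while existing callers of create see no change in behavior.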

Tests

  • mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=CatalogFactoryTest test
  • mvn -pl paimon-common -am -Pfast-build -DfailIfNoTests=false -Dtest=FileIOTest,ResolvingFileIOTest test

Notes

A companion Trino change is being prepared to consume this API by passing Trino's FileIO loader and disabling Hadoop default configuration loading.

tchivs commented Jun 10, 2026

Contributor Author

Added the Trino 440 companion branch that consumes this API.

The Trino side uses CatalogContext.createWithoutHadoop(...) with its own PaimonFileIOLoader, disables Hadoop default config loading, and avoids installing SecurityContext.

Validation run locally:

  • Paimon: mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=CatalogFactoryTest test
  • Paimon: mvn -pl paimon-common -am -Pfast-build -DfailIfNoTests=false -Dtest=FileIOTest,ResolvingFileIOTest test
  • Trino 440 companion: JAVA_HOME=/root/.jdks/temurin-21.0.11 PATH=/root/.jdks/temurin-21.0.11/bin:$PATH ./mvnw -pl plugin/trino-paimon -Dtest=TrinoConnectorFactoryTest,TrinoPluginTest test

tchivs commented Jun 10, 2026

Contributor Author

@JingsongLi PTAL.

This is a narrower follow-up to #6653 based on your previous feedback:

  • reduced the change from a broad FileIO/CatalogContext refactor to an explicit CatalogContext.createWithoutHadoop(...) API
  • kept all existing CatalogContext.create(...) behavior unchanged for compatibility
  • added a no-Hadoop classloader test
  • added a Trino 440 companion branch consuming the new API: https://github.com/tchivs/trino/tree/paimon/trino-440-paimon-1.5

Thanks.

tchivs commented Jun 10, 2026

Contributor Author

Additional Trino 440 companion validation passed locally on branch https://github.com/tchivs/trino/tree/paimon/trino-440-paimon-1.5:

  • JAVA_HOME=/root/.jdks/temurin-21.0.11 PATH=/root/.jdks/temurin-21.0.11/bin:$PATH ./mvnw -pl plugin/trino-paimon -Dtest=TrinoColumnHandleTest,TrinoFilterConverterTest,TrinoPartitioningHandleTest,TrinoRowTest,TrinoSplitTest,TrinoTableHandleTest,PaimonTypeTest,TestTrinoMetadata,TrinoConnectorFactoryTest,TrinoPluginTest test

    • Result: 24 tests passed.
  • JAVA_HOME=/root/.jdks/temurin-21.0.11 PATH=/root/.jdks/temurin-21.0.11/bin:$PATH ./mvnw -pl plugin/trino-paimon -Dtest=TrinoITCase test

    • Result: 29 tests passed.

TrinoITCase starts a local DistributedQueryRunner, creates a temporary Paimon warehouse, creates and writes Paimon tables, and verifies Trino read/query paths against Paimon 1.5 with the no-Hadoop catalog context path.

hadoopConf == null ? getHadoopConfiguration(options) : hadoopConf);
hadoopConf == null && !loadHadoopConf
        ? null
        : new SerializableConfiguration(

Contributor

Can you just create a method that wraps this in a try/catch? If the Hadoop classes are not available, set it directly to null.

Contributor Author

Thanks for the suggestion. I updated this in 8924d69: the extra loadHadoopConf flag and createWithoutHadoop API were removed, and CatalogContext now uses a private helper to load Hadoop configuration with try/catch. If Hadoop classes are not available, the stored Hadoop configuration is set to null.
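
The try/catch approach described above might look roughly like the following sketch. It is not the code from commit 8924d69: the class name OptionalClassLoading, the method tryInstantiate, and the use of reflection are assumptions chosen so the example compiles without Hadoop on the classpath. The idea shown is only that a missing class yields null instead of an error.

```java
// Hypothetical helper for illustration; not the helper added to CatalogContext.
final class OptionalClassLoading {
    private OptionalClassLoading() {}

    // Returns a new instance of className, or null if the class (or one of its
    // dependencies) cannot be found through the given class loader.
    static Object tryInstantiate(String className, ClassLoader loader) {
        try {
            Class<?> clazz = Class.forName(className, true, loader);
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException | NoClassDefFoundError e) {
            // The optional dependency is absent: treat it as "no configuration".
            return null;
        } catch (ReflectiveOperationException e) {
            // The class exists but could not be constructed; surface that as a real error.
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
    }
}
```

With Hadoop present, passing "org.apache.hadoop.conf.Configuration" as className would be expected to return a configuration object. Without Hadoop, the call returns null and the caller can proceed with its own file IO.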

I also updated the no-Hadoop classloader test to use the existing CatalogContext.create(options, preferIOLoader, fallbackIOLoader) path and verify the Hadoop configuration is null while catalog creation still works.
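
A classloader that hides Hadoop classes, as used conceptually by the test mentioned above, could be sketched as follows. The class name PrefixFilteringClassLoader and its constructor are hypothetical, and the actual test in the PR may be built differently.

```java
// Hypothetical test utility: rejects any class whose name starts with a blocked prefix.
final class PrefixFilteringClassLoader extends ClassLoader {
    private final String blockedPrefix;

    PrefixFilteringClassLoader(ClassLoader parent, String blockedPrefix) {
        super(parent);
        this.blockedPrefix = blockedPrefix;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (name.startsWith(blockedPrefix)) {
            // Simulate the dependency being absent from the classpath.
            throw new ClassNotFoundException("Filtered by test class loader: " + name);
        }
        return super.loadClass(name, resolve);
    }
}
```

One caveat with this sketch: it filters only lookups that go through this loader. Classes already defined by the parent loader resolve their own dependencies through the parent, so a realistic test would also need to load the classes under test through the filtering loader.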

Local checks passed:

  • mvn -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=CatalogFactoryTest test
  • mvn -pl paimon-core spotless:check

GitHub Actions are running on the updated commit now.

@JingsongLi

Contributor

+1

JingsongLi merged commit c3dd464 into apache:master on Jun 15, 2026
12 checks passed
tchivs deleted the paimon-create-context-without-hadoop branch on June 15, 2026 07:30