Skip to content

JDBC: Automate schema version test discovery#4583

Open
venkateshwaracholan wants to merge 3 commits into
apache:mainfrom
venkateshwaracholan:jdbc-auto-schema-version-tests
Open

JDBC: Automate schema version test discovery#4583
venkateshwaracholan wants to merge 3 commits into
apache:mainfrom
venkateshwaracholan:jdbc-auto-schema-version-tests

Conversation

@venkateshwaracholan
Copy link
Copy Markdown
Contributor

@venkateshwaracholan venkateshwaracholan commented May 29, 2026

Fixes #2675

Summary

Run JDBC metastore integration tests against every H2 schema version discovered on the test classpath without maintaining a separate test class per schema version.

Changes

  • Add H2SchemaVersions to discover available h2/schema-vN.sql resources from the test classpath.
  • Add AtomicMetastoreManagerWithJdbcBasePersistenceImplSchemaTest using JUnit @TestFactory, DynamicContainer, and DynamicTest to execute the JDBC metastore test suite against each discovered schema version.
  • Remove the manually maintained schema-specific test subclasses (V0 through V4).

Validation

Ran:

./gradlew :polaris-relational-jdbc:test \
  --tests "org.apache.polaris.persistence.relational.jdbc.AtomicMetastoreManagerWithJdbcBasePersistenceImplV*SchemaTest"

Checklist

  • 🛡️ Don't disclose security issues! (contact security@apache.org)
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

Copy link
Copy Markdown
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution, @venkateshwaracholan !


@Override
public int schemaVersion() {
return ${schemaVersion};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting idea... but it feels a bit like an overkill to me 😅

Could we not do the same with JUnit test factories and without on-the-fly code generation?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, using TestFactory and DynamicTests sounds like a better approach.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dimas-b @nandorKollar, good call. I dropped the Gradle codegen and switched to @testfactory / dynamic tests instead.

ClassLoader classLoader = H2SchemaVersions.class.getClassLoader();
SortedSet<Integer> versions = new TreeSet<>();
for (int version = 0; ; version++) {
String resource = String.format("h2/schema-v%d.sql", version);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd keep the the String.format which is used in the actual schema executor code in AtomicMetastoreManagerWithJdbcBasePersistenceImplTest base class:
String.format("%s/schema-v%d.sql", DatabaseType.H2.getDisplayName(), version)

SortedSet<Integer> versions = new TreeSet<>();
for (int version = 0; ; version++) {
String resource = String.format("h2/schema-v%d.sql", version);
if (classLoader.getResource(resource) == null) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this will skip schema files, when the numbering is not continuous, like for example: schema-0, schema-1, schema-3. Not sure that this we should prepare for this scenario too, or should we?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @nandorKollar 's comment and it's not a "nit" IMHO :)

I'd check every version number until DatabaseType.getLatestSchemaVersion() and fail if any version is missing the corresponding schema file resource.

@nandorKollar
Copy link
Copy Markdown
Contributor

Although the current approach works, I don't really like the reflective invocation and the fact, that we need to manually implement BeforeEach. I think we can simplify that part using ParmetrizedTestClass. Instead of dynamics tests, how about something like this:

@ParameterizedClass
@MethodSource("schemaVersions")
public class AtomicMetastoreManagerWithJdbcBasePersistenceImplSchemaTest extends AtomicMetastoreManagerWithJdbcBasePersistenceImplTest {
    @Parameter
    int schemaVersion;

    static SortedSet<Integer> schemaVersions() {
        ClassLoader classLoader = H2SchemaVersions.class.getClassLoader();
        SortedSet<Integer> versions = new TreeSet<>();
        for (int version = 0; ; version++) {
            String resource = String.format("h2/schema-v%d.sql", version);
            if (classLoader.getResource(resource) == null) {
                break;
            }
            versions.add(version);
        }
        if (versions.isEmpty()) {
            throw new IllegalStateException("No H2 schema files found on classpath");
        }
        return versions;
    }

    @Override
    public int schemaVersion() {
        return schemaVersion;
    }
}

Would this work? Does it look better than dynamic tests?


@TestFactory
Stream<DynamicNode> metastoreTestsForAllH2SchemaVersions() {
return H2SchemaVersions.discover().stream().map(this::testsForSchemaVersion);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for later: I'd rather we did this testing with PostgreSQL, which is the "main" database 🤔 ... H2 is not expected to be used in "prod", so full schema coverage on H2 does not seem to add assurance for end users 🤔

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we open a followup issue for that? I think Postgres already has a testcontainer, it seems that we already use it in Polaris integration tests.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's keep this PR as "pure" refactoring and change from H2 to Posgres later.

Comment on lines +59 to +64
private static void runTest(Method method, int schemaVersion) throws Exception {
AtomicMetastoreManagerWithJdbcBasePersistenceImplTest testInstance =
new SchemaVersionTest(schemaVersion);
testInstance.setupPolarisMetaStoreManager();
method.setAccessible(true);
method.invoke(testInstance);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is flaky, IMHO, because we provide only a very narrow sub-set of functionality that @Test method may expect from the JUnit framework... but overall, it's probably ok for now.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dimas-b that's why I thought that maybe using ParameterizedClass could be a better choice here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to try ParameterizedClass... TIL that it came with JUnit 6 😅

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SortedSet<Integer> versions = new TreeSet<>();
for (int version = 0; ; version++) {
String resource = String.format("h2/schema-v%d.sql", version);
if (classLoader.getResource(resource) == null) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @nandorKollar 's comment and it's not a "nit" IMHO :)

I'd check every version number until DatabaseType.getLatestSchemaVersion() and fail if any version is missing the corresponding schema file resource.

@venkateshwaracholan
Copy link
Copy Markdown
Contributor Author

Thanks @dimas-b and @nandorKollar for the feedback.

Switched from @TestFactory/reflection to @ParameterizedClass, so the inherited tests run with the normal JUnit lifecycle.
Updated schema discovery to validate every version from 0 through DatabaseType.H2.getLatestSchemaVersion() and fail if any expected schema resource is missing.
Reused the same schema resource path construction as the base test.

String resource =
String.format("%s/schema-v%d.sql", DatabaseType.H2.getDisplayName(), version);
if (classLoader.getResource(resource) == null) {
missingVersions.add(version);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this looks good, but what if we make it simpler. We can simply use DatabaseType.getLatestSchemaVersion() to create the parameters from 0..DatabaseType.getLatestSchemaVersion()-1, and parametrise the test with that number. The test itself will try to load the file, and if let's say v2 will be missing, then it will eventually fail, as we expect. Maybe the failure report isn't that elegant as with explicitly checking for the continuous versioning. Basically we don't even need this class, but

  static Stream<Integer> schemaVersions() {
    return IntStream.rangeClosed(0, DatabaseType.getLatestSchemaVersion());
  }

What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Run JDBCPersistence test against all schema version (not just the latest)

3 participants