Skip to content

Add TestInstanceOperationMisc with independent tests#154

Open
proud-parselmouth wants to merge 3 commits into
devfrom
anubagar/TestInstanceOperationMisc
Open

Add TestInstanceOperationMisc with independent tests#154
proud-parselmouth wants to merge 3 commits into
devfrom
anubagar/TestInstanceOperationMisc

Conversation

@proud-parselmouth
Copy link
Copy Markdown
Collaborator

Issues

  • My PR addresses the following Helix issues and references them in the PR description:

This PR is part of the TestInstanceOperation dependency chain reduction effort. See TestInstanceOperation Dependency Chain Reduction for full analysis. Depends on #153

Description

  • Here are some details about my PR, including screenshots of any UI changes:

What: Extracts 6 miscellaneous tests from the monolithic TestInstanceOperation into a new independent test class TestInstanceOperationMisc, extending TestInstanceOperationBase.

How:

  • Created TestInstanceOperationMisc.java with 6 tests extracted from TestInstanceOperation:
    • testUnsetInstanceOperationOnSwapInWhenSwapping — expects HelixException when trying to ENABLE a SWAP_IN instance while the original instance with the same logicalId is still active
    • testNodeSwapAddSwapInFirst — add a SWAP_IN instance before any swap-out is set, verify the cluster converges
    • testSwapEvacuateAddRemoveEvacuate — expects HelixException when removing EVACUATE while an ENABLE instance with the same logicalId exists
    • testUnknownDoesNotTriggerRebalance — add UNKNOWN instances (both new zone and same logicalId as existing node), verify no IdealState changes
    • testEvacuationWithOfflineInstancesInCluster — evacuate an instance while 2 other instances are offline, verify min active replicas maintained and evacuating instance leaves top state
    • testEvacuateWithDisabledPartition — disable all partitions on an instance then evacuate, verify no upward state transitions occur; re-enable and re-disable partitions, verify no spurious transitions
  • Each test explicitly calls enabledTopologyAwareRebalance() at its start to be fully self-contained (the base class @BeforeMethod disables it by default).
  • Removed all dependsOnMethods chains — every test runs independently.
  • For testEvacuateWithDisabledPartition, test resources are added to _allDBs for proper @BeforeMethod cleanup, and a custom StateTransitionCountStateModelFactory participant is stopped at end to ensure clean teardown.
  • Chain cleanup code (e.g. removeOfflineOrInactiveInstances, dropTestDBs) removed from individual tests since @BeforeMethod handles full isolation reset.
  • TestInstanceOperation.java is not modified in this PR.

Tests

  • The following tests are written for this issue:
  1. TestInstanceOperationMisc.testUnsetInstanceOperationOnSwapInWhenSwapping
  2. TestInstanceOperationMisc.testNodeSwapAddSwapInFirst
  3. TestInstanceOperationMisc.testSwapEvacuateAddRemoveEvacuate
  4. TestInstanceOperationMisc.testUnknownDoesNotTriggerRebalance
  5. TestInstanceOperationMisc.testEvacuationWithOfflineInstancesInCluster
  6. TestInstanceOperationMisc.testEvacuateWithDisabledPartition
  • The following is the result of the mvn test command on the appropriate module:
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 191.754 s - in org.apache.helix.integration.rebalancer.TestInstanceOperationMisc [INFO] [INFO] Results: [INFO] [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0 [INFO] [INFO] ------------------------------------------------------------------------ [INFO] BUILD SUCCESS [INFO] ------------------------------------------------------------------------ [INFO] Total time: 03:16 min [INFO] Finished at: 2026-03-27T09:39:48+05:30 [INFO] ------------------------------------------------------------------------

(If CI test fails due to known issue, please specify the issue and test PR locally. Then copy & paste the result of "mvn test" to here.)

Changes that Break Backward Compatibility (Optional)

  • My PR contains changes that break backward compatibility or previous assumptions for certain methods or API. They include:

(Consider including all behavior changes for public methods or API. Also include these changes in merge description so that other developers are aware of these changes. This allows them to make relevant code changes in feature branches accounting for the new method/API behavior.)

Documentation (Optional)

  • In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Commits

  • My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Code Quality

  • My diff has been formatted using helix-style.xml
    (helix-style-intellij.xml if IntelliJ IDE is used)

_gSetupTool.getClusterManagementTool().setInstanceOperation(CLUSTER_NAME,
evacuateInstanceName, InstanceConstants.InstanceOperation.EVACUATE);

verifier(() -> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this entire thing is essentially validateAssignmentInEv() in the TestInstanceOperationBase. You can directly use that.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Abstracted out the acceptable state set check and top state set check, and added validateAssignmentInEv below

ExternalView ev = assignment.get(resource);
for (String partition : ev.getPartitionSet()) {
AtomicInteger activeReplicaCount = new AtomicInteger();
ev.getStateMap(partition).values().stream().filter(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These states varies from the acceptable states mentioned in testEvacuationWithOfflineInstancesInCluster

ngngwr
ngngwr previously approved these changes Apr 10, 2026
Copy link
Copy Markdown
Collaborator

@ngngwr ngngwr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for addressing the comments

@proud-parselmouth proud-parselmouth force-pushed the anubagar/TestInstanceOperationEvacuationCancel branch from 19b1e52 to deb4797 Compare April 16, 2026 06:52
Base automatically changed from anubagar/TestInstanceOperationEvacuationCancel to dev April 16, 2026 11:56
@proud-parselmouth proud-parselmouth dismissed ngngwr’s stale review April 16, 2026 11:56

The base branch was changed.

Extract 6 miscellaneous tests from TestInstanceOperation into
TestInstanceOperationMisc extending TestInstanceOperationBase:
- testUnsetInstanceOperationOnSwapInWhenSwapping (expects HelixException)
- testNodeSwapAddSwapInFirst (add SWAP_IN instance first)
- testSwapEvacuateAddRemoveEvacuate (expects HelixException)
- testUnknownDoesNotTriggerRebalance (UNKNOWN instances cause no rebalance)
- testEvacuationWithOfflineInstancesInCluster (evacuate with 2 offline nodes)
- testEvacuateWithDisabledPartition (disable then evacuate, verify no
  upward state transitions)

All tests run independently with no dependsOnMethods chains. Tests call
enabledTopologyAwareRebalance() explicitly. testEvacuateWithDisabledPartition
adds test resources to _allDBs for proper @BeforeMethod cleanup.
TestInstanceOperation is not modified.

Made-with: Cursor
Replace inline hardcoded state filters with ACCEPTABLE_STATE_SET and
TOP_STATE_SET from TestInstanceOperationBase. Add validateAssignmentInEv
call after verifier for consistent assertion pattern. Remove stray
FOLLOWER state that was not in ACCEPTABLE_STATE_SET.

Made-with: Cursor
@proud-parselmouth proud-parselmouth force-pushed the anubagar/TestInstanceOperationMisc branch from e3e2481 to b228258 Compare May 12, 2026 13:38
The default test DBs in TestInstanceOperationBase use LeaderStandby
(top state = LEADER), but TOP_STATE_SET only contained MASTER. This
made the evacuating-instance top-state check in
TestInstanceOperationMisc.testEvacuationWithOfflineInstancesInCluster
silently always pass, weakening the assertion compared to the original
check (which used MASTER || LEADER) it was meant to replace.

Aligns TOP_STATE_SET with ACCEPTABLE_STATE_SET, which already covers
both MasterSlave and LeaderStandby state models.

Made-with: Cursor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants