Fix provider update race condition and Fury thread-local instance churn #1541

Draft
Copilot wants to merge 2 commits into master from copilot/review-recent-issues

Copilot AI commented Mar 15, 2026

This PR fixes two independent bugs: a race condition that causes unavailableProviderException under load when new providers register, and unbounded Fury instance creation that triggers full GC at high throughput.

AbstractCluster: connection-before-address ordering (#1490)

updateAllProviders and updateProviders were registering addresses in AddressHolder before connections were established in ConnectionHolder. This opened a window where load balancers could select a provider whose transport wasn't ready yet.

Fix: establish connections first, then make addresses visible.

// Before — address visible before connection exists
addressHolder.updateAllProviders(providerGroups);
connectionHolder.updateAllProviders(providerGroups);

// After — connection ready before address is discoverable
connectionHolder.updateAllProviders(providerGroups);
addressHolder.updateAllProviders(providerGroups);
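
The ordering argument can be illustrated with a minimal, self-contained sketch. It does not use the real AddressHolder or ConnectionHolder classes; `OrderedProviderUpdate`, its two collections, and the sample address are hypothetical stand-ins. The only point it makes is that when connections are recorded before addresses are published, anything a selector can see is already connected.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for a cluster that tracks connections and addresses separately. */
class OrderedProviderUpdate {
    // Providers whose transport is established (plays the role of ConnectionHolder).
    private final Set<String> connected = ConcurrentHashMap.newKeySet();
    // Providers a load balancer may select (plays the role of AddressHolder).
    private final List<String> visible = new CopyOnWriteArrayList<>();

    /** Connect first, publish second, so every visible address is already connected. */
    void addProviders(List<String> providers) {
        connected.addAll(providers); // step 1: establish connections
        visible.addAll(providers);   // step 2: make addresses selectable
    }

    /** Returns the first selectable provider, or null when none is published yet. */
    String select() {
        return visible.isEmpty() ? null : visible.get(0);
    }

    /** Reports whether the given provider has an established connection. */
    boolean isConnected(String provider) {
        return connected.contains(provider);
    }
}
```

With the two statements in `addProviders` swapped, a concurrent `select()` could return an address for which `isConnected` is still false, which is the window described above.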

FurySerializer: stop destroying thread-local Fury instances (#1424)

Every encode/decode call ended with fury.clearClassLoader(contextClassLoader) in a finally block, which destroys the thread-local Fury instance unconditionally. The next call then reconstructs it from scratch. At 20k TPS this produced ~4,500 Fury instances and triggered full GC.

Fix: remove clearClassLoader from the hot path. fury.setClassLoader(contextClassLoader) at the start of each call already handles classloader changes correctly (it creates a new Fury instance only when the classloader actually differs), so removing the teardown allows thread-local instances to be reused across calls.
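
The reuse pattern this relies on can be sketched without the Fury API. In the example below, `ThreadLocalCodecCache`, `Codec`, and the `CREATED` counter are hypothetical names used only for illustration; the sketch assumes the expensive object is cached per thread and rebuilt only when the classloader differs, and it shows how an unconditional clear after every call defeats that cache.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-thread cache that rebuilds only when the classloader changes. */
class ThreadLocalCodecCache {
    /** Stand-in for an expensive serializer instance bound to one classloader. */
    static final class Codec {
        final ClassLoader loader;
        Codec(ClassLoader loader) { this.loader = loader; }
    }

    // Counts constructions so the cost of needless teardown is observable.
    static final AtomicInteger CREATED = new AtomicInteger();
    private static final ThreadLocal<Codec> CACHE = new ThreadLocal<>();

    /** Returns the cached instance, creating a new one only if the loader differs. */
    static Codec forLoader(ClassLoader loader) {
        Codec current = CACHE.get();
        if (current == null || current.loader != loader) {
            current = new Codec(loader);
            CREATED.incrementAndGet();
            CACHE.set(current);
        }
        return current;
    }

    /** The pattern being removed: dropping the instance forces a rebuild on the next call. */
    static void clear() {
        CACHE.remove();
    }
}
```

Calling `forLoader` repeatedly with the same classloader on one thread constructs a single `Codec`; inserting `clear()` between calls constructs one per call, which is the churn the description attributes to the finally block.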

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • repository.jboss.org
    • Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java /usr/lib/jvm/temurin-17-jdk-amd64/bin/java -classpath /home/REDACTED/work/sofa-rpc/sofa-rpc/.mvn/wrapper/maven-wrapper.jar -Dmaven.home=/home/REDACTED/work/sofa-rpc -Dmaven.multiModuleProjectDirectory=/home/REDACTED/work/sofa-rpc/sofa-rpc org.apache.maven.wrapper.MavenWrapperMain -f pom.xml -B -V -e -Dfindbugs.skip -Dcheckstyle.skip -Dpmd.skip=true -Dspotbugs.skip -Denforcer.skip -Dmaven.javadoc.skip -DskipTests -Dmaven.test.skip.exec -Dlicense.skip=true (dns block)


…ance creation overhead (#1424)

Co-authored-by: EvenLjj <15122299+EvenLjj@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Review recent project issues" to "Fix provider update race condition and Fury thread-local instance churn" Mar 15, 2026
Copilot AI requested a review from EvenLjj March 15, 2026 15:15

stale Bot commented May 16, 2026

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale Bot added the wontfix (This will not be worked on) label May 16, 2026
