## RMI AccessControlException Regression: v2.4.0 works, v2.5.0+ fails (Tomcat + Security Manager) #17643

@cdash415

Describe the bug

A Java application is auto-instrumented with the Java agent (v2.24.0+) via annotation and managed by the OpenTelemetry Operator v1.40.0.

Regression

  • v2.4.0: Production deployment successful, no errors
  • v2.5.0, v2.24.0, v2.26.1: AccessControlException is thrown consistently (after the Security Manager enhancement, v2.5.0 and later)

Error

Exception in thread "RMI TCP Connection(idle)" 
java.security.AccessControlException: access denied 
  ("java.net.SocketPermission" "10.149.46.62:55958" "accept,resolve")
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.checkAcceptPermission(TCPTransport.java:691)
    at sun.rmi.transport.tcp.TCPTransport.checkAcceptPermission(TCPTransport.java:311)

Error details:

  • Thread: "RMI TCP Connection(idle)"
  • Location: SecurityManager.checkAccept() at line 1035
  • Permission: SocketPermission for IP 10.149.46.62:55958
  • Actions denied: "accept,resolve"
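
To make the denied permission concrete, here is a minimal sketch that uses `java.net.SocketPermission.implies` to check whether a candidate policy grant would cover the permission from the stack trace. The class name `SocketPermissionCheck` and the candidate grant targets are hypothetical and chosen only for illustration; they are not taken from our attached policies, and the check says nothing about which code source on the call stack is missing the grant.

```java
import java.net.SocketPermission;

public class SocketPermissionCheck {

    /**
     * Returns true if a grant for (grantedTarget, grantedActions) would cover
     * the requested (requestedTarget, requestedActions) socket permission.
     */
    public static boolean covers(String grantedTarget, String grantedActions,
                                 String requestedTarget, String requestedActions) {
        SocketPermission granted = new SocketPermission(grantedTarget, grantedActions);
        SocketPermission requested = new SocketPermission(requestedTarget, requestedActions);
        return granted.implies(requested);
    }

    public static void main(String[] args) {
        // Permission reported as denied in the error above.
        String deniedTarget = "10.149.46.62:55958";
        String deniedActions = "accept,resolve";

        // Hypothetical grant: same host, ports 1024 and above, matching actions.
        System.out.println(covers("10.149.46.62:1024-", "accept,resolve", deniedTarget, deniedActions));

        // Hypothetical grant with the wrong action ("connect" does not cover "accept").
        System.out.println(covers("10.149.46.62:1024-", "connect,resolve", deniedTarget, deniedActions));

        // Hypothetical grant for a single port that does not match the ephemeral port.
        System.out.println(covers("10.149.46.62:80", "accept,resolve", deniedTarget, deniedActions));
    }
}
```

Because the remote port in the error (55958) is ephemeral, a grant pinned to one port would not match on the next connection; only a port range on the grant side covers it in this sketch.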

Root Cause Chain

  1. Trigger: User uploads project ZIP file
  2. Process: Tomcat Catalina deploys the project dynamically
  3. Registration: Application manager registers project
  4. RMI Activation: Project registration triggers RMI service initialization
  5. Security Block: Java Security Manager denies RMI socket permission

Critical Finding

This error cannot be ignored because it occurs during dynamic project deployment in the application. It is a functional blocker because:

  • Project deployment is core functionality
  • RMI is required for the application manager
  • Without RMI working, projects cannot be properly deployed/registered

What We Tested

Created multiple standalone tests with Security Manager:

  • Basic RMI with UnicastRemoteObject
  • Custom RMI base classes with delegation pattern
  • Restrictive security policies
  • Tested with both JDK 8 and JDK 17

Result: All standalone basic RMI tests PASS with v2.24.0

Production Tomcat: Consistently FAILS with v2.5.0+
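
For readers without access to the attachments, the following is a minimal sketch of the kind of basic `UnicastRemoteObject` test listed above. It is not one of the attached test files: the class `RmiLoopbackSketch`, the `Echo` interface, and the `roundTrip` helper are hypothetical names. The sketch exports an object on an anonymous port and calls it through its stub, so the call is served by an "RMI TCP Connection" thread, the same kind of thread that appears in the error.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

public class RmiLoopbackSketch {

    /** Hypothetical remote interface, used only for this illustration. */
    public interface Echo extends Remote {
        String echo(String message) throws RemoteException;
    }

    /** Trivial implementation that gets exported over RMI's TCP transport. */
    public static final class EchoImpl implements Echo {
        @Override
        public String echo(String message) {
            return "echo:" + message;
        }
    }

    /**
     * Exports an EchoImpl on an anonymous port, invokes it through the
     * returned stub (which goes over a local TCP connection), and
     * unexports it again so the JVM can exit.
     */
    public static String roundTrip(String message) {
        EchoImpl impl = new EchoImpl();
        try {
            Echo stub = (Echo) UnicastRemoteObject.exportObject(impl, 0);
            try {
                return stub.echo(message);
            } finally {
                UnicastRemoteObject.unexportObject(impl, true);
            }
        } catch (RemoteException e) {
            throw new IllegalStateException("RMI round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping"));
    }
}
```

A class like this can be launched with the same agent and Security Manager flags shown under "Steps to reproduce"; in our standalone runs, tests of this shape do not trigger the exception.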

The Problem

We cannot reproduce this issue outside the Tomcat environment. All of our simple standalone RMI tests pass, but production consistently fails. This suggests the issue is specific to Tomcat's container-managed environment.

Why We Need Help

Something changed in agent v2.5.0+ that works fine in a standalone JVM but causes an AccessControlException in the Tomcat container. We don't have the expertise to:

  1. Identify what changed in the RMI instrumentation between v2.4.0 and v2.5.0+
  2. Understand why it works standalone but fails in the real production setup (Tomcat)
  3. Debug the agent's bytecode instrumentation behavior after v2.5.0

Our Constraints

  • Cannot disable Security Manager (compliance requirement)
  • Do not want to disable RMI instrumentation in the Java agent (-Dotel.instrumentation.rmi.enabled=false). Disabling it has been verified to make the error go away for this application, but we do not know which metrics/traces would be lost across the cluster
  • Cannot modify application code (no source access)
  • Cannot reproduce outside the client application, which runs in one of the k8s environments
  • Blocked from upgrading agent in production

Request

Could you:

  1. Review what changed in RMI instrumentation between v2.4.0 and v2.5.0+?
  2. Suggest debugging steps to identify why Tomcat environment behaves differently?
  3. Help us understand if this is an agent instrumentation issue or environment-specific configuration?

Available Resources

We have attached:

  • Security policies
  • Test code demonstrating standalone tests pass (4 test files)
  • Production error logs
  • Willing to provide more diagnostics or test patches

This regression is blocking our ability to upgrade the agent in production. Any guidance would be greatly appreciated.

Steps to reproduce

Test Files:

Run:
java -javaagent:opentelemetry-javaagent-v2.24.0.jar \
  -Djava.security.manager \
  -Djava.security.policy=production.policy \
  -Dotel.javaagent.experimental.security-manager-support.enabled=true \
  RMISecurityManagerTest 2>&1 &
PID=$!
sleep 12
kill $PID 2>/dev/null
wait $PID 2>/dev/null
echo "Exit code: $?"

Expected behavior

When the Java agent is injected, it should not cause socket-related exceptions, or at the very least it should not block the primary application.

Actual behavior

Exception in thread "RMI TCP Connection(idle)"
java.security.AccessControlException: access denied
("java.net.SocketPermission" "10.149.46.62:55958" "accept,resolve")
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.checkAcceptPermission(TCPTransport.java:691)
at sun.rmi.transport.tcp.TCPTransport.checkAcceptPermission(TCPTransport.java:311)

Javaagent or library instrumentation version

v2.5.0, v2.24.0, v2.26.1

Environment

  • Java: Amazon Corretto 17.0.16
  • Container: Tomcat 10.1.42
  • Security Manager: Enabled
  • Application: Enterprise Java application using RMI

Additional context

No response

Labels: bug, needs author feedback, needs triage