feat(auth): mTLS endpoint for Regional Access Boundaries #13318
vverman wants to merge 12 commits
Code Review
This pull request introduces centralized mTLS enablement checks and adds fallback support for SPIFFE credentials in MtlsUtils and X509Provider, alongside integrating mTLS transport initialization during regional access boundary refreshes. The review feedback suggests optimizing performance by removing redundant configuration checks and file parsing in X509Provider.getKeyStore() and GoogleCredentials.java, and improving robustness in RegionalAccessBoundary.java by replacing only the host name in the IAM credentials URL.
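The host-only replacement mentioned in the summary could look roughly like the sketch below. It is an illustration, not the PR's code: `MtlsEndpointRewriter` and `toMtlsUrl` are hypothetical names, and only the two host names are taken from the diff under review. The point is that a plain `String.replace` also rewrites matches in the path or query, whereas comparing the parsed host does not.

```java
import java.net.URI;

/** Hypothetical helper for illustration; not part of the PR. */
final class MtlsEndpointRewriter {
  static final String DEFAULT_HOST = "iamcredentials.googleapis.com";
  static final String MTLS_HOST = "iamcredentials.mtls.googleapis.com";

  private MtlsEndpointRewriter() {}

  /**
   * Swaps the host for the mTLS host when it matches DEFAULT_HOST exactly.
   * User info, path, query, and fragment are copied in raw (still-encoded) form.
   */
  static String toMtlsUrl(String url) {
    URI uri = URI.create(url);
    if (!DEFAULT_HOST.equalsIgnoreCase(uri.getHost())) {
      return url; // Different (or missing) host: leave the URL untouched.
    }
    StringBuilder rewritten = new StringBuilder();
    rewritten.append(uri.getScheme()).append("://");
    if (uri.getRawUserInfo() != null) {
      rewritten.append(uri.getRawUserInfo()).append('@');
    }
    rewritten.append(MTLS_HOST);
    if (uri.getPort() != -1) {
      rewritten.append(':').append(uri.getPort());
    }
    if (uri.getRawPath() != null) {
      rewritten.append(uri.getRawPath());
    }
    if (uri.getRawQuery() != null) {
      rewritten.append('?').append(uri.getRawQuery());
    }
    if (uri.getRawFragment() != null) {
      rewritten.append('#').append(uri.getRawFragment());
    }
    return rewritten.toString();
  }
}
```

A URL whose host is something else, even if the default host name appears in its query string, is returned unchanged.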
lsirac left a comment
Should we be checking GOOGLE_API_USE_MTLS_ENDPOINT?
     * @throws IOException if the configuration file is present but contains missing or malformed
     * files
     */
    public static boolean canMtlsBeEnabled(
I’m not sure that cert being present == automatically use mTLS. They can be using different credentials / not using it at all. So then we’d be adding mTLS setup and calls for credentials that are not actually using it.
I think the decision should be based on the credential type, and perhaps expose some state from the credential that we can use to check if mTLS should happen for these calls.
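One way to read this suggestion is sketched below, purely as an assumption about its shape: the credential exposes whether it is mTLS-bound, and the endpoint choice consults that state instead of probing for a certificate on disk. The interface `MtlsAwareCredential`, its method `usesMtls()`, and the two credential classes are hypothetical stand-ins, not types from the library.

```java
/** Hypothetical stand-in for state a credential could expose. */
interface MtlsAwareCredential {
  /** True when this credential was itself configured with a client certificate. */
  boolean usesMtls();
}

/** Stand-in for a certificate-bound credential type. */
final class X509BoundCredential implements MtlsAwareCredential {
  @Override
  public boolean usesMtls() {
    return true;
  }
}

/** Stand-in for a credential type that never uses a client certificate. */
final class PlainCredential implements MtlsAwareCredential {
  @Override
  public boolean usesMtls() {
    return false;
  }
}

final class BoundaryEndpointChooser {
  static final String DEFAULT_HOST = "iamcredentials.googleapis.com";
  static final String MTLS_HOST = "iamcredentials.mtls.googleapis.com";

  private BoundaryEndpointChooser() {}

  /** The credential decides; the presence of a certificate file is not consulted. */
  static String hostFor(MtlsAwareCredential credential) {
    return credential.usesMtls() ? MTLS_HOST : DEFAULT_HOST;
  }
}
```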
      @Override
    - public NetHttpTransport create() {
    + public HttpTransport create() {
qq, is this change necessary for the PR? I know this is marked with @InternalApi and isn't exactly something customers are expected to interact with directly.
There are some small source and binary compatibility implications (a very small chance of breakage, I think), but I would prefer to keep it as-is unless we absolutely need the change.
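To make the compatibility concern concrete, here is a small self-contained sketch using stand-in types (`BaseTransport`, `NetTransport`, `TransportFactory` are illustrative names, not the real classes). With a covariant (narrower) return type, javac emits both the narrow method and a synthetic bridge method with the interface's return type. Widening the declared return type removes the narrow method, so a caller that assigned the result to the narrow type no longer compiles, and a caller compiled earlier against the narrow descriptor would fail to link without recompilation. Callers that only use the interface are unaffected.

```java
import java.util.Arrays;

// Stand-ins for the real transport types, kept local so the sketch compiles alone.
class BaseTransport {}

class NetTransport extends BaseTransport {}

interface TransportFactory {
  BaseTransport create();
}

/** Mirrors the "before" signature: a covariant, narrower return type. */
class NarrowFactory implements TransportFactory {
  @Override
  public NetTransport create() {
    return new NetTransport();
  }
}

/** Mirrors the "after" signature: the return type declared by the interface. */
class WideFactory implements TransportFactory {
  @Override
  public BaseTransport create() {
    return new NetTransport();
  }
}

final class SignatureCompatDemo {
  private SignatureCompatDemo() {}

  /** Counts the create() methods the compiler emitted for a class, bridges included. */
  static long createMethodCount(Class<?> type) {
    return Arrays.stream(type.getDeclaredMethods())
        .filter(m -> m.getName().equals("create"))
        .count();
  }

  public static void main(String[] args) {
    // Source compatibility: this assignment compiles only against the narrow signature.
    NetTransport direct = new NarrowFactory().create();

    // Callers that go through the interface work with either signature.
    TransportFactory viaInterface = new WideFactory();
    BaseTransport transport = viaInterface.create();

    System.out.println("NarrowFactory create() methods: " + createMethodCount(NarrowFactory.class));
    System.out.println("WideFactory create() methods: " + createMethodCount(WideFactory.class));
  }
}
```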
The reasons I chose to do this:
- MtlsHttpTransportFactory is an internal class, and direct calls to .create() by library consumers are unlikely. Besides, NetHttpTransport is a child class of HttpTransport, and the create() method is only used internally.
- I use it for testing a scenario where the mTLS RAB endpoint is called when mTLS is enabled: regionalAccessBoundary_withMtlsEnabled_shouldCallAllowedLocationsUsingMtlsTransportFactory
Can we avoid modifying the signature if it was only needed to accommodate the testing?
I think if this was introduced to return a stubbed MockHttpTransport in RegionalAccessBoundaryTest, we can instead stub a generic HttpTransportFactory instance in the tests rather than mocking the concrete MtlsHttpTransportFactory.
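The alternative described here can be sketched as follows. To keep the example self-contained, `HttpTransport` and `HttpTransportFactory` are declared locally as simplified stand-ins for the real interfaces, and `BoundaryLookup` and `RecordingTransport` are hypothetical names. The idea is that code depending only on the factory interface can be handed a lambda that returns a test double, so the concrete factory's signature never needs to change.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the real transport and factory interfaces.
interface HttpTransport {
  String execute(String url);
}

interface HttpTransportFactory {
  HttpTransport create();
}

/** Hypothetical code under test: it depends only on the factory interface. */
final class BoundaryLookup {
  private final HttpTransportFactory transportFactory;

  BoundaryLookup(HttpTransportFactory transportFactory) {
    this.transportFactory = transportFactory;
  }

  String fetch(String url) {
    return transportFactory.create().execute(url);
  }
}

/** Test double that records which URLs were requested. */
final class RecordingTransport implements HttpTransport {
  final List<String> requestedUrls = new ArrayList<>();

  @Override
  public String execute(String url) {
    requestedUrls.add(url);
    return "ok";
  }
}
```

In a test, `new BoundaryLookup(() -> recordingTransport)` supplies the stub, and the assertions inspect `requestedUrls` to confirm that the mTLS host was called.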
NetHttpTransport is simply the default (concrete) implementation of HttpTransport.
Aside from making it easier to test, this also aligns MtlsHttpTransportFactory with its parent type, HttpTransportFactory, which defines HttpTransport create(); as the contract.
So both the origin (MtlsHttpTransportFactory, via HttpTransportFactory) and the destination (IdentityPoolCredentials) define their contract in terms of HttpTransport only.
I think this avoids adding extra testing logic without affecting the existing logic.
Added an E2E test that checks the call to the RAB mTLS endpoint and the x-allowed-locations header. Unified the redundant waitForRegionalAccessBoundary method(s).
2c53152 to 48c3b59
    if (userMtlsPolicy == null) {
      userMtlsPolicy =
          MtlsUtils.getMtlsEndpointUsagePolicy(SystemEnvironmentProvider.getInstance());
    }
    if (transportFactory instanceof com.google.auth.mtls.MtlsHttpTransportFactory
        || userMtlsPolicy == MtlsUtils.MtlsEndpointUsagePolicy.ALWAYS) {
      url = url.replace("iamcredentials.googleapis.com", "iamcredentials.mtls.googleapis.com");
    }
Ok, how about this:
Every GoogleCredentials instance has one RegionalAccessBoundaryManager. We put the transportFactory, url, and mtlsPolicy inside there (since these only need to be initialized once). The constructor can resolve the mtlsPolicy and cache it, use it to determine the HttpTransportFactory and cache that, and the updated endpoint can be cached as well.
Eventually, I think it might make sense for getRegionalAccessBoundaryUrl() to return the mTLS vs. non-mTLS endpoint automatically, but that can be outside of this PR.
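A rough sketch of this proposal, under stated assumptions: `BoundaryManagerSketch` is a hypothetical class, the environment lookup is injected as a `Function`, and the accepted values of `GOOGLE_API_USE_MTLS_ENDPOINT` (`always`, `never`, anything else treated as auto) are an assumption made for illustration. The endpoint condition mirrors the one in the diff above (an mTLS transport is configured, or the policy is ALWAYS). Because everything is resolved once per instance rather than in a static field, two managers created with different environments do not affect each other.

```java
import java.util.function.Function;

/** Hypothetical sketch of per-credential caching; names and shape are assumptions. */
final class BoundaryManagerSketch {
  enum MtlsPolicy { NEVER, AUTO, ALWAYS }

  static final String DEFAULT_HOST = "iamcredentials.googleapis.com";
  static final String MTLS_HOST = "iamcredentials.mtls.googleapis.com";

  // Resolved once in the constructor and never recomputed.
  private final MtlsPolicy policy;
  private final String endpointHost;

  /**
   * @param env lookup for environment variables, injected so each instance can differ in tests
   * @param mtlsTransportConfigured whether the transport factory in use is already mTLS-capable
   */
  BoundaryManagerSketch(Function<String, String> env, boolean mtlsTransportConfigured) {
    this.policy = parsePolicy(env.apply("GOOGLE_API_USE_MTLS_ENDPOINT"));
    boolean useMtls = mtlsTransportConfigured || policy == MtlsPolicy.ALWAYS;
    this.endpointHost = useMtls ? MTLS_HOST : DEFAULT_HOST;
  }

  /** Assumed mapping: "always" and "never" are recognized; everything else means AUTO. */
  static MtlsPolicy parsePolicy(String raw) {
    if ("always".equalsIgnoreCase(raw)) {
      return MtlsPolicy.ALWAYS;
    }
    if ("never".equalsIgnoreCase(raw)) {
      return MtlsPolicy.NEVER;
    }
    return MtlsPolicy.AUTO;
  }

  MtlsPolicy policy() {
    return policy;
  }

  String endpointHost() {
    return endpointHost;
  }
}
```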
/gemini review
Code Review
This pull request introduces comprehensive support for mutual TLS (mTLS) endpoint discovery and policy enforcement across Google credentials and regional access boundaries. Key changes include adding utility methods in MtlsUtils to resolve certificate configurations and prepare transport factories, refactoring X509Provider to use these helpers, and upgrading the IAM credentials URL to its mTLS counterpart when applicable. The review feedback highlights critical improvement opportunities: resolving a potential NullPointerException in MtlsUtils when the base transport factory is null, avoiding thread-safety issues in GoogleCredentials by not mutating the shared transportFactory field, and preventing test flakiness in RegionalAccessBoundary by dynamically evaluating the mTLS policy instead of caching it in a static field.
    public static HttpTransportFactory prepareTransportFactoryIfMtlsEnabled(
        HttpTransportFactory baseTransportFactory,
        EnvironmentProvider envProvider,
        PropertyProvider propProvider,
        String certConfigPathOverride)
        throws IOException {

      MtlsEndpointUsagePolicy mtlsPolicy = getMtlsEndpointUsagePolicy(envProvider);
      try {
        if (!canBeEnabled(envProvider, propProvider, certConfigPathOverride)) {
          return baseTransportFactory;
        }

        if (baseTransportFactory instanceof MtlsHttpTransportFactory) {
          // A custom MtlsHttpTransportFactory was already pre-configured by the user.
          // Keep using it as-is without re-initializing.
          return baseTransportFactory;
        }

        if (baseTransportFactory == OAuth2Utils.HTTP_TRANSPORT_FACTORY) {
          // This is the default HttpTransportFactory assigned by credentials.
          // Automatically discover and load client certificates to construct an mTLS factory.
          X509Provider x509Provider =
              new X509Provider(envProvider, propProvider, certConfigPathOverride);
          KeyStore mtlsKeyStore = x509Provider.getKeyStore();
          return new MtlsHttpTransportFactory(mtlsKeyStore);
        }

        // A user configured non-mTLS HttpTransportFactory was explicitly injected.
        // Reject it to avoid bypassing mTLS enforcement or overriding the user's factory.
        throw new IOException(
            "mTLS is enabled on the system, but a user configured non-mTLS HttpTransportFactory was provided: "
                + baseTransportFactory.getClass().getName());

      } catch (Exception e) {
        if (mtlsPolicy == MtlsEndpointUsagePolicy.ALWAYS) {
          throw new IOException(
              "mTLS is configured to ALWAYS, but initialization failed: " + e.getMessage(), e);
        }
        // Graceful fallback to standard transport if mTLS initialization fails under AUTO policy
        return baseTransportFactory;
      }
    }
If baseTransportFactory is null (which is the default in GoogleCredentials), calling baseTransportFactory.getClass().getName() in the exception block will throw a NullPointerException.
To prevent this and handle the default case robustly, we should default baseTransportFactory to OAuth2Utils.HTTP_TRANSPORT_FACTORY if it is null. This allows automatic mTLS upgrade for default transports while avoiding any potential NullPointerException.
    public static HttpTransportFactory prepareTransportFactoryIfMtlsEnabled(
        HttpTransportFactory baseTransportFactory,
        EnvironmentProvider envProvider,
        PropertyProvider propProvider,
        String certConfigPathOverride)
        throws IOException {
      MtlsEndpointUsagePolicy mtlsPolicy = getMtlsEndpointUsagePolicy(envProvider);
      try {
        HttpTransportFactory factory =
            baseTransportFactory == null ? OAuth2Utils.HTTP_TRANSPORT_FACTORY : baseTransportFactory;
        if (!canBeEnabled(envProvider, propProvider, certConfigPathOverride)) {
          return factory;
        }
        if (factory instanceof MtlsHttpTransportFactory) {
          // A custom MtlsHttpTransportFactory was already pre-configured by the user.
          // Keep using it as-is without re-initializing.
          return factory;
        }
        if (factory == OAuth2Utils.HTTP_TRANSPORT_FACTORY) {
          // This is the default HttpTransportFactory assigned by credentials.
          // Automatically discover and load client certificates to construct an mTLS factory.
          X509Provider x509Provider =
              new X509Provider(envProvider, propProvider, certConfigPathOverride);
          KeyStore mtlsKeyStore = x509Provider.getKeyStore();
          return new MtlsHttpTransportFactory(mtlsKeyStore);
        }
        // A user configured non-mTLS HttpTransportFactory was explicitly injected.
        // Reject it to avoid bypassing mTLS enforcement or overriding the user's factory.
        throw new IOException(
            "mTLS is enabled on the system, but a user configured non-mTLS HttpTransportFactory was provided: "
                + factory.getClass().getName());
      } catch (Exception e) {
        if (mtlsPolicy == MtlsEndpointUsagePolicy.ALWAYS) {
          throw new IOException(
              "mTLS is configured to ALWAYS, but initialization failed: " + e.getMessage(), e);
        }
        // Graceful fallback to standard transport if mTLS initialization fails under AUTO policy
        return baseTransportFactory;
      }
    }

    if (userMtlsPolicy == null) {
      userMtlsPolicy =
          MtlsUtils.getMtlsEndpointUsagePolicy(SystemEnvironmentProvider.getInstance());
    }
    if (transportFactory instanceof com.google.auth.mtls.MtlsHttpTransportFactory
        || userMtlsPolicy == MtlsUtils.MtlsEndpointUsagePolicy.ALWAYS) {
      url = url.replace("iamcredentials.googleapis.com", "iamcredentials.mtls.googleapis.com");
    }
Caching the mTLS endpoint usage policy in a static field (userMtlsPolicy) can lead to test pollution and flakiness, as different tests in the same JVM execution might configure different environment settings but will be stuck with the first cached value.
Since retrieving the policy from the environment provider is extremely fast, we should evaluate it dynamically per request instead of caching it statically.
    MtlsUtils.MtlsEndpointUsagePolicy policy =
        MtlsUtils.getMtlsEndpointUsagePolicy(SystemEnvironmentProvider.getInstance());
    if (transportFactory instanceof com.google.auth.mtls.MtlsHttpTransportFactory
        || policy == MtlsUtils.MtlsEndpointUsagePolicy.ALWAYS) {
      url = url.replace("iamcredentials.googleapis.com", "iamcredentials.mtls.googleapis.com");
    }
References
- Avoid caching system properties or environment variables in class fields if they are only evaluated once during object initialization, as caching is unnecessary in such cases.
    // Automatically discover certificates or enforce mTLS policy if applicable
    // TODO: https://github.com/googleapis/google-cloud-java/issues/13461
    transportFactory =
        MtlsUtils.prepareTransportFactoryIfMtlsEnabled(
            transportFactory, getEnvironmentProvider(), getPropertyProvider(), null);

    regionalAccessBoundaryManager.triggerAsyncRefresh(
        transportFactory, (RegionalAccessBoundaryProvider) this, token);
Mutating the shared transportFactory field of GoogleCredentials inside refreshRegionalAccessBoundaryIfExpired can lead to thread-safety issues or unexpected side effects on other operations using the credentials.
Instead of reassigning the instance field, we can use a local variable for the upgraded transport factory and pass it to triggerAsyncRefresh.
    - // Automatically discover certificates or enforce mTLS policy if applicable
    - // TODO: https://github.com/googleapis/google-cloud-java/issues/13461
    - transportFactory =
    -     MtlsUtils.prepareTransportFactoryIfMtlsEnabled(
    -         transportFactory, getEnvironmentProvider(), getPropertyProvider(), null);
    - regionalAccessBoundaryManager.triggerAsyncRefresh(
    -     transportFactory, (RegionalAccessBoundaryProvider) this, token);
    + // Automatically discover certificates or enforce mTLS policy if applicable
    + // TODO: https://github.com/googleapis/google-cloud-java/issues/13461
    + HttpTransportFactory upgradedTransportFactory =
    +     MtlsUtils.prepareTransportFactoryIfMtlsEnabled(
    +         transportFactory, getEnvironmentProvider(), getPropertyProvider(), null);
    + regionalAccessBoundaryManager.triggerAsyncRefresh(
    +     upgradedTransportFactory, (RegionalAccessBoundaryProvider) this, token);
    static final long TTL_MILLIS = 6 * 60 * 60 * 1000L; // 6 hours
    static final long REFRESH_THRESHOLD_MILLIS = 1 * 60 * 60 * 1000L; // 1 hour

    private static MtlsUtils.MtlsEndpointUsagePolicy userMtlsPolicy = null;
This static field is no longer needed if we evaluate the mTLS policy dynamically per request to avoid test pollution and flakiness.
References
- Avoid caching system properties or environment variables in class fields if they are only evaluated once during object initialization, as caching is unnecessary in such cases.
    // Automatically discover certificates or enforce mTLS policy if applicable
    // TODO: https://github.com/googleapis/google-cloud-java/issues/13461
    transportFactory =
        MtlsUtils.prepareTransportFactoryIfMtlsEnabled(
            transportFactory, getEnvironmentProvider(), getPropertyProvider(), null);
This setup triggers synchronous filesystem and key-parsing overhead on every outgoing API request.
Since refreshRegionalAccessBoundaryIfExpired is called synchronously inside getRequestMetadata (the entry point for all API requests), MtlsUtils.prepareTransportFactoryIfMtlsEnabled runs before the cache/cooldown guards inside the async manager.
Because the upgraded transport factory is never cached back on the credentials instance, every API request repeats this flow: calling canBeEnabled() (checking File.isFile()) and constructing a new X509Provider to synchronously read and parse the certificate and private key files via FileInputStream.
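One possible way to avoid repeating that work is to memoize the result so the expensive setup runs at most once per credentials instance. The sketch below is only an illustration: `MemoizedSupplier` is a hypothetical helper, and the loader stands in for whatever reads and parses the certificate files. A trade-off worth noting is that a memoized value will not pick up certificates rotated on disk unless some invalidation is added.

```java
import java.util.function.Supplier;

/** Hypothetical memoizing wrapper: runs an expensive loader at most once on success. */
final class MemoizedSupplier<T> implements Supplier<T> {
  private final Supplier<T> loader;
  private boolean loaded;
  private T value;

  MemoizedSupplier(Supplier<T> loader) {
    this.loader = loader;
  }

  @Override
  public synchronized T get() {
    if (!loaded) {
      // Stands in for the costly step, e.g. reading and parsing certificate files.
      // If the loader throws, 'loaded' stays false and the next call retries.
      value = loader.get();
      loaded = true;
    }
    return value;
  }
}
```

The request path would then call `get()` on every request, paying the file and parsing cost only on the first successful call.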
      throw new IllegalArgumentException("The provided access token is expired.");
    }

    if (userMtlsPolicy == null) {
The static field userMtlsPolicy is lazily initialized using a non-synchronized double-check block.
In Java, writing/reading references without the volatile keyword or synchronization can lead to memory visibility issues where concurrent threads observe a partially initialized state or trigger redundant updates.
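For reference, a conventional way to make lazy initialization of a shared static field safe is double-checked locking on a `volatile` field, sketched below. `LazyPolicyHolder` and its nested enum are hypothetical stand-ins, and this is not necessarily the fix that was applied in the PR. The `volatile` modifier guarantees that a value written by one thread is visible to later reads by other threads, and the lock ensures the resolver runs only once. A static cache like this still carries the test-isolation concern raised elsewhere in this review.

```java
import java.util.function.Supplier;

/** Hypothetical holder showing safe lazy initialization of a shared static value. */
final class LazyPolicyHolder {
  enum MtlsPolicy { NEVER, AUTO, ALWAYS }

  // volatile: a write by one thread is guaranteed visible to subsequent reads by others.
  private static volatile MtlsPolicy cached;

  private LazyPolicyHolder() {}

  static MtlsPolicy get(Supplier<MtlsPolicy> resolver) {
    MtlsPolicy local = cached; // single volatile read on the fast path
    if (local == null) {
      synchronized (LazyPolicyHolder.class) {
        local = cached; // re-check: another thread may have initialized it meanwhile
        if (local == null) {
          local = resolver.get();
          cached = local;
        }
      }
    }
    return local;
  }
}
```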
Done! Thanks for the catch.
Added logic to: