|
| 1 | +--- |
| 2 | +title: AWS STS Design for Ozone S3 |
| 3 | +summary: STS Support in Ozone |
| 4 | +date: 2025-10-30 |
| 5 | +jira: HDDS-13323 |
| 6 | +status: implementing |
| 7 | +author: Madhan Neethiraj, Ren Koike, Fabian Morgan, Stephen O'Donnell, Istvan Fajth, Uma Maheswara Rao Gangumalla |
| 8 | +--- |
| 9 | +<!-- |
| 10 | + Licensed under the Apache License, Version 2.0 (the "License"); |
| 11 | + you may not use this file except in compliance with the License. |
| 12 | + You may obtain a copy of the License at |
| 13 | +
|
| 14 | + http://www.apache.org/licenses/LICENSE-2.0 |
| 15 | +
|
| 16 | + Unless required by applicable law or agreed to in writing, software |
| 17 | + distributed under the License is distributed on an "AS IS" BASIS, |
| 18 | + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| 19 | + See the License for the specific language governing permissions and |
| 20 | + limitations under the License. See accompanying LICENSE file. |
| 21 | +--> |
| 22 | + |
| 23 | +# AWS STS Design for Ozone S3 |
| 24 | + |
| 25 | +# 1. Introduction |
| 26 | + |
| 27 | +S3 credentials used to communicate with Ozone S3 APIs are based on a Kerberos identity. |
| 28 | + |
| 29 | +The Ozone community has long been interested in a REST API that can programmatically generate |
| 30 | +temporary S3 credentials. |
| 31 | + |
| 32 | +AWS offers the [Security Token Service (STS)](https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html), which |
| 33 | +issues short-lived credentials for accessing resources. |
| 34 | + |
| 35 | +The primary scope of this document is to detail the initial implementation of STS within the Ozone ecosystem. |
| 36 | + |
| 37 | +# 2. Why Use STS Tokens? |
| 38 | + |
| 39 | +Providing short-lived access to various resources in Ozone is useful in scenarios such as Data Lake |
| 40 | +solutions that want to aggregate data across multiple cloud providers. |
| 41 | + |
| 42 | +# 3. How Ozone STS Works |
| 43 | + |
| 44 | +The initial implementation of Ozone STS supports only the [AssumeRole](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html) |
| 45 | +API from the AWS specification. A new STS endpoint `/sts` on port `9880` (port `9881` for https) will be created to service STS requests in the S3 Gateway. |
| 46 | +STS gets its own port, in line with AWS's separate STS endpoint, to avoid conflicts later. The S3 Gateway then exposes: |
| 47 | +- an admin port for Ozone-specific S3 admin operations |
| 48 | +- an STS port for STS APIs, analogous to the separate AWS STS endpoint |
| 49 | +- the existing dedicated port/endpoint for S3 object APIs |
| 50 | + |
| 51 | +Furthermore, the initial implementation of Ozone STS supports only Apache Ranger for authorization, |
| 52 | +as Ranger aligns more closely with IAM policies. Support for the Ozone Native Authorizer may be provided in a future phase. |
| 53 | + |
| 54 | +## 3.1 Capabilities |
| 55 | + |
| 56 | +The Ozone STS implementation has the following capabilities: |
| 57 | + |
| 58 | +- Create temporary credentials that last from a minimum of 15 minutes to a maximum of 12 hours. The |
| 59 | +return value of the AssumeRole call will be temporary credentials consisting of 3 components: |
| 60 | + - accessKeyId - a generated String identifier (cryptographically strong using SecureRandom) beginning with the sequence "ASIA" |
| 61 | + - secretAccessKey - a generated String password (cryptographically strong using SecureRandom) |
| 62 | + - sessionToken - a Base64-encoded opaque String identifier |
| 63 | +- The temporary credentials will have the permissions associated with a role. Furthermore, an |
| 64 | +[AWS IAM Session Policy](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html#policies_session) can |
| 65 | +**optionally** be sent in the AssumeRole API call to limit the scope of the permissions further. If |
| 66 | +an IAM policy is specified, the temporary credential will have the permissions comprising the intersection of the role permissions |
| 67 | +and the IAM policy permissions. **Note:** If the IAM policy is specified and does not grant any permissions, then |
| 68 | +the generated temporary credentials won't have any permissions and will essentially be useless. |
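The credential shape described above can be sketched as follows. This is a minimal illustration, not the Ozone implementation: the class name `TempCredentialSketch`, the 20-character accessKeyId length, the 40-character secretAccessKey length, and the character alphabet are all assumptions made for the example. Only the "ASIA" prefix, the use of `SecureRandom`, and the 15-minute to 12-hour window come from this document.

```java
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper, not part of Ozone: shows the credential shape only.
class TempCredentialSketch {
  static final int MIN_DURATION_SECONDS = 15 * 60;       // 15 minutes
  static final int MAX_DURATION_SECONDS = 12 * 60 * 60;  // 12 hours
  // Assumed alphabet for the random part of the accessKeyId.
  private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
  private static final SecureRandom RANDOM = new SecureRandom();

  // Rejects durations outside the 15-minute to 12-hour window.
  static int validateDuration(int durationSeconds) {
    if (durationSeconds < MIN_DURATION_SECONDS
        || durationSeconds > MAX_DURATION_SECONDS) {
      throw new IllegalArgumentException("DurationSeconds must be between "
          + MIN_DURATION_SECONDS + " and " + MAX_DURATION_SECONDS);
    }
    return durationSeconds;
  }

  // "ASIA" followed by 16 random characters (the length is an assumption).
  static String newAccessKeyId() {
    StringBuilder sb = new StringBuilder("ASIA");
    for (int i = 0; i < 16; i++) {
      sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
    }
    return sb.toString();
  }

  // 30 random bytes encode to exactly 40 Base64 characters (assumed length).
  static String newSecretAccessKey() {
    byte[] bytes = new byte[30];
    RANDOM.nextBytes(bytes);
    return Base64.getEncoder().encodeToString(bytes);
  }
}
```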
| 69 | + |
| 70 | +## 3.2 Limitations in AssumeRole API Support |
| 71 | + |
| 72 | +The AWS AssumeRole API has various required and optional fields. We will support the two required fields, i.e. `RoleArn` |
| 73 | +and `RoleSessionName`. Additionally, we will support the following optional fields (all others will be rejected): |
| 74 | +- `DurationSeconds` |
| 75 | +- `Policy` |
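The field rules above amount to an allow-list check. The sketch below assumes the request has already been parsed into a map of AssumeRole fields (with any protocol envelope parameters removed beforehand); the class name `AssumeRoleParamSketch` and the exception type are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical validator, not part of Ozone: allow-list check on AssumeRole fields.
class AssumeRoleParamSketch {
  private static final Set<String> REQUIRED =
      new HashSet<>(Arrays.asList("RoleArn", "RoleSessionName"));
  private static final Set<String> OPTIONAL =
      new HashSet<>(Arrays.asList("DurationSeconds", "Policy"));

  // Throws if a required field is missing or an unsupported field is present.
  static void validate(Map<String, String> fields) {
    for (String required : REQUIRED) {
      String value = fields.get(required);
      if (value == null || value.isEmpty()) {
        throw new IllegalArgumentException("Missing required field: " + required);
      }
    }
    for (String name : fields.keySet()) {
      if (!REQUIRED.contains(name) && !OPTIONAL.contains(name)) {
        throw new IllegalArgumentException("Unsupported field: " + name);
      }
    }
  }
}
```

With this sketch, a request carrying an AWS field outside the list, such as `ExternalId`, would be rejected rather than ignored.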
| 76 | + |
| 77 | +## 3.3 Limitations in IAM Session Policy Support |
| 78 | + |
| 79 | +The AWS IAM policy specification is vast and wide-ranging. The initial Ozone STS implementation supports a limited |
| 80 | +subset of its capabilities. The restrictions are outlined below: |
| 81 | + |
| 82 | +- The only supported ARN prefix in the `Resource` element is `arn:aws:s3:::` - all others will be rejected. **Note**: a `Resource` |
| 83 | +of `*` is supported as well. |
| 84 | +- The only supported Condition operator is `StringEquals` - all others will be rejected. |
| 85 | +- The only supported Condition key is `s3:prefix` - all others will be rejected. |
| 86 | +- Only one Condition operator per Statement is supported - a Statement with more than one Condition will be rejected. |
| 87 | +- The only supported Effect is `Allow` - all others will be rejected. |
| 88 | +- If a (currently) unsupported S3 action is requested, such as `s3:GetAccelerateConfiguration`, it will be silently ignored. |
| 89 | +Similarly, an invalid S3 action will be silently ignored. |
| 90 | +- Supported wildcard expansions in Actions are: `s3:*`, `s3:Get*`, `s3:Put*`, `s3:List*`, |
| 91 | +`s3:Create*`, and `s3:Delete*`. |
| 92 | + |
| 93 | +A sample IAM policy that allows read access to all objects in the `example-bucket` bucket is shown below: |
| 94 | +```JSON |
| 95 | +{ |
| 96 | + "Version": "2012-10-17", |
| 97 | + "Statement": [ |
| 98 | + { |
| 99 | + "Effect": "Allow", |
| 100 | + "Action": "s3:GetObject", |
| 101 | + "Resource": "arn:aws:s3:::example-bucket/*" |
| 102 | + } |
| 103 | + ] |
| 104 | +} |
| 105 | + |
| 106 | +``` |
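The restrictions listed in this section split into two behaviors: statement-level problems reject the request, while unsupported or invalid actions are dropped. The sketch below illustrates that split on already-parsed values. The class name, the `KNOWN_ACTIONS` list (an illustrative subset, not the real list of supported actions), and the handling of wildcard patterns outside the six listed ones (dropped here) are assumptions.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical rules sketch, not part of Ozone.
class SessionPolicyRulesSketch {
  // Illustrative subset only; the real supported-action list is not given here.
  static final List<String> KNOWN_ACTIONS = Arrays.asList(
      "s3:GetObject", "s3:GetBucketAcl", "s3:PutObject",
      "s3:ListBucket", "s3:CreateBucket", "s3:DeleteObject");
  static final List<String> WILDCARDS = Arrays.asList(
      "s3:*", "s3:Get*", "s3:Put*", "s3:List*", "s3:Create*", "s3:Delete*");

  // Rejects a statement that violates the Effect, Resource, or Condition rules.
  // Pass null for operator and key when the statement has no Condition.
  static void checkStatement(String effect, String resource,
                             String conditionOperator, String conditionKey) {
    if (!"Allow".equals(effect)) {
      throw new IllegalArgumentException("Unsupported Effect: " + effect);
    }
    if (!"*".equals(resource) && !resource.startsWith("arn:aws:s3:::")) {
      throw new IllegalArgumentException("Unsupported Resource: " + resource);
    }
    if (conditionOperator != null) {
      if (!"StringEquals".equals(conditionOperator)) {
        throw new IllegalArgumentException("Unsupported operator: " + conditionOperator);
      }
      if (!"s3:prefix".equals(conditionKey)) {
        throw new IllegalArgumentException("Unsupported key: " + conditionKey);
      }
    }
  }

  // Expands supported wildcards; unsupported or invalid actions are dropped
  // without raising an error.
  static Set<String> resolveActions(List<String> requested) {
    Set<String> resolved = new LinkedHashSet<>();
    for (String action : requested) {
      if (WILDCARDS.contains(action)) {
        String prefix = action.substring(0, action.length() - 1); // strip '*'
        for (String known : KNOWN_ACTIONS) {
          if (known.startsWith(prefix)) {
            resolved.add(known);
          }
        }
      } else if (KNOWN_ACTIONS.contains(action)) {
        resolved.add(action);
      }
    }
    return resolved;
  }
}
```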
| 107 | + |
| 108 | +### 3.3.1 Additional Context on Design Behavior |
| 109 | + |
| 110 | +As mentioned above, some limitations in IAM Session Policy support cause the API call to be rejected, while others are |
| 111 | +silently ignored. This behavior was settled after extensive discussion with external teams. One external team sends |
| 112 | +S3 actions that Ozone doesn't support and has no flexibility to change what it sends, which is one |
| 113 | +reason for silently ignoring them. The same team also reported research indicating that AWS does not fail |
| 114 | +the AssumeRole request just because the inline session policy references unknown or unsupported actions; instead, |
| 115 | +the failure occurs when the temporary credentials are used, so this design is in accordance with that finding. Another external |
| 116 | +team agreed that this behavior is fine for actions but not for Conditions: a Condition can |
| 117 | +restrict calls by source IP (`aws:SourceIp`), and if it were silently ignored, the client could wrongly believe the temporary credentials |
| 118 | +are restricted to that IP address. The consensus was therefore to reject requests with unsupported Conditions. |
| 119 | + |
| 120 | +## 3.4 SessionToken Format |
| 121 | + |
| 122 | +As mentioned above, one of the return values from the AssumeRole call will be the sessionToken. To avoid |
| 123 | +storing temporary credentials server-side in Ozone, the sessionToken will carry the components needed to validate |
| 124 | +subsequent S3 calls that use the token. The sessionToken will have the following information encoded: |
| 125 | + |
| 126 | +- originalAccessKeyId - the Kerberos identity of the user that created the sessionToken via the AssumeRole call. |
| 127 | +When the temporary credentials are used to make S3 API calls, this Kerberos identity (in conjunction with the role permissions and |
| 128 | +optional session policy) will be used to authorize the call. This identity is included in the sessionToken because |
| 129 | +S3 API calls (such as PutObject) require a Kerberos identity, but the temporary credentials don't have a |
| 130 | +Kerberos identity associated with them; the Kerberos identity of the user that created the token is therefore used in |
| 131 | +these cases. |
| 132 | +- roleArn - the role used in the original AssumeRole call |
| 133 | +- encrypted secretAccessKey - this will be used to validate the AWS signature when the temporary credentials are used |
| 134 | +to make S3 API calls |
| 135 | +- sessionPolicy - when using the RangerOzoneAuthorizer, if Ranger successfully authorizes the AssumeRole call, |
| 136 | +it will return a String representing the role the token was authorized for. Furthermore, if an AWS IAM Session Policy |
| 137 | +was included with the AssumeRole request, the String return value will also include resources (e.g. buckets, keys) |
| 138 | +and permissions (i.e. ACLType) corresponding to the AWS IAM Session Policy. These resources and permissions, if present, |
| 139 | +would further limit the scope of the permissions and resources granted by the role in Ranger, such that the temporary |
| 140 | +credential will have the permissions comprising the intersection of the role permissions and the sessionPolicy permissions. |
| 141 | +- HMAC-SHA256 signature - used to ensure the sessionToken was created by Ozone and was not altered since it was created. |
| 142 | +- expiration time of the token (via `ShortLivedTokenIdentifier#getExpiry()`) |
| 143 | +- UUID of the OzoneManager secret key used to sign the sessionToken and encrypt the secretAccessKey (via `ShortLivedTokenIdentifier#getSecretKeyId()`) |
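To make the role of the HMAC-SHA256 signature concrete, the sketch below seals a payload and later verifies it without any server-side lookup. It is an illustration under assumptions: the byte layout (payload followed by a 32-byte signature, then Base64), the class name `SessionTokenSketch`, and the use of a raw byte-array key are not taken from the Ozone implementation, and the encryption of the secretAccessKey is omitted.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical token sketch, not the Ozone wire format.
class SessionTokenSketch {
  private static final int SIG_LEN = 32; // HMAC-SHA256 output size in bytes

  private static byte[] hmac(byte[] key, byte[] data) {
    try {
      Mac mac = Mac.getInstance("HmacSHA256");
      mac.init(new SecretKeySpec(key, "HmacSHA256"));
      return mac.doFinal(data);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  // Assumed layout: Base64(payload bytes followed by the HMAC of those bytes).
  static String seal(byte[] key, String payload) {
    byte[] body = payload.getBytes(StandardCharsets.UTF_8);
    byte[] sig = hmac(key, body);
    byte[] out = Arrays.copyOf(body, body.length + sig.length);
    System.arraycopy(sig, 0, out, body.length, sig.length);
    return Base64.getEncoder().encodeToString(out);
  }

  // Returns the payload if the signature matches; otherwise throws.
  static String open(byte[] key, String token) {
    byte[] raw = Base64.getDecoder().decode(token);
    if (raw.length < SIG_LEN) {
      throw new SecurityException("Token too short");
    }
    byte[] body = Arrays.copyOfRange(raw, 0, raw.length - SIG_LEN);
    byte[] sig = Arrays.copyOfRange(raw, raw.length - SIG_LEN, raw.length);
    // Constant-time comparison of the presented and recomputed signatures.
    if (!MessageDigest.isEqual(sig, hmac(key, body))) {
      throw new SecurityException("Signature mismatch");
    }
    return new String(body, StandardCharsets.UTF_8);
  }
}
```

Because the token carries its own signature, any change to the encoded fields (for example the expiration time) invalidates it, which is what allows validation without stored state.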
| 144 | + |
| 145 | +## 3.5 STS Token Revocation |
| 146 | + |
| 147 | +In the rare event that temporary credentials need to be revoked (e.g. for security reasons), a table in the OzoneManager RocksDB will be created |
| 148 | +to store revoked tokens, and a command-line utility will be created to add tokens to the table. A background cleaner service |
| 149 | +will run every 3 hours to delete revoked tokens that have been in the table for more than 12 hours. The |
| 150 | +input parameter for the command-line utility will be the sessionToken - this value is returned in plain text as a result |
| 151 | +of the AssumeRole call (mentioned above). In this way, specific STS tokens can be revoked as opposed to all tokens. Furthermore, |
| 152 | +AWS doesn't have a standard API to revoke tokens, so we are creating our own mechanism. |
| 153 | +**Note:** STS token revocation checks are strictly enforced and will fail closed if there are internal errors, such as not |
| 154 | +being able to communicate with the revocation database table. |
| 155 | +**Note:** Only the creator of the STS token or an S3/tenant admin is allowed to revoke a token. |
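The revocation behavior can be sketched with an in-memory map standing in for the RocksDB table. The class and method names are hypothetical. The 12-hour retention matches the maximum credential lifetime, so an entry older than that refers to a token that has already expired. The fail-closed rule is modeled by treating any lookup error as "revoked".

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the revoked-token table, not part of Ozone.
class RevokedTokenTableSketch {
  static final long RETENTION_MILLIS = 12L * 60 * 60 * 1000; // 12 hours

  // sessionToken -> time it was revoked (epoch millis).
  private final Map<String, Long> revokedAt = new ConcurrentHashMap<>();

  void revoke(String sessionToken, long nowMillis) {
    revokedAt.put(sessionToken, nowMillis);
  }

  // Fail closed: if the lookup itself fails, report the token as revoked.
  boolean isRevoked(String sessionToken) {
    try {
      return revokedAt.containsKey(sessionToken);
    } catch (RuntimeException e) {
      return true;
    }
  }

  // One pass of the cleaner; returns how many entries were removed.
  int cleanup(long nowMillis) {
    int removed = 0;
    Iterator<Map.Entry<String, Long>> it = revokedAt.entrySet().iterator();
    while (it.hasNext()) {
      if (nowMillis - it.next().getValue() > RETENTION_MILLIS) {
        it.remove();
        removed++;
      }
    }
    return removed;
  }
}
```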
| 156 | + |
| 157 | +## 3.6 Prerequisites |
| 158 | + |
| 159 | +A user must be configured with a Kerberos identity in Ozone, and the S3 `getSecret` command |
| 160 | +must be called to issue permanent S3 credentials. With these credentials, the AssumeRole API call can be made, but the additional |
| 161 | +steps below are needed for the call to be successfully authorized. |
| 162 | +When using RangerOzoneAuthorizer, each role that can be assumed via the AssumeRole API must be configured in the Ranger UI. |
| 163 | +Further, Apache Ranger policies should be in place to grant the user permission to assume the role. |
| 164 | + |
| 165 | +### 3.6.1 Additions to RangerOzoneAuthorizer |
| 166 | + |
| 167 | +The `IAccessAuthorizer` interface, which both the RangerOzoneAuthorizer and OzoneNativeAuthorizer implement, will have a |
| 168 | +new method: |
| 169 | + |
| 170 | +```java |
| 171 | +default String generateAssumeRoleSessionPolicy(AssumeRoleRequest assumeRoleRequest) throws OMException { |
| 172 | + throw new OMException("The generateAssumeRoleSessionPolicy call is not supported", NOT_SUPPORTED_OPERATION); |
| 173 | +} |
| 174 | +``` |
| 175 | + |
| 176 | +When using RangerOzoneAuthorizer, the AssumeRole API call must invoke this method to ensure the caller is authorized to create |
| 177 | +temporary credentials, given the criteria in the AssumeRoleRequest. The AssumeRoleRequest input parameter will have the |
| 178 | +following components: |
| 179 | +- `String` host - hostname of caller |
| 180 | +- `InetAddress` ip - IP address of caller |
| 181 | +- `UserGroupInformation` ugi - the user making the call |
| 182 | +- `String` targetRoleName - what role is being assumed |
| 183 | +- `Set<AssumeRoleRequest.OzoneGrant>` grants - further limiting the scope of the role according to the grants |
| 184 | + |
| 185 | +The grants parameter is optional, and would only be present if the AssumeRole API call had an IAM session policy JSON |
| 186 | +parameter supplied. A conversion utility, `IamSessionPolicyResolver`, will process the IAM policy and convert it to a |
| 187 | +`Set<AssumeRoleRequest.OzoneGrant>`, in effect translating from S3 nomenclature for resources and actions to Ozone nomenclature of |
| 188 | +`IOzoneObj` and `ACLType`. Ranger would use all of this information to determine if the AssumeRole call should be |
| 189 | +successfully authorized, and if so, it will return a String representation of the granted permissions and paths. |
| 190 | + |
| 191 | +The format of this String is entirely up to the Ranger team. What is required from the Ozone side is to supply this String to Ranger when any |
| 192 | +subsequent S3 API calls are made that use STS tokens. In order to achieve this, the sessionPolicy String from Ranger will |
| 193 | +be included in the sessionToken response to the AssumeRole API call (as mentioned above), and Ozone will supply this String |
| 194 | +to Ranger whenever STS tokens are used on S3 API calls via a new `RequestContext.sessionPolicy` field in the |
| 195 | +`IAccessAuthorizer#checkAccess(IOzoneObj, RequestContext)` call. |
| 196 | + |
| 197 | +## 3.7 Overall Flow |
| 198 | + |
| 199 | +The following section outlines the overall flow when using STS in Ozone: |
| 200 | + |
| 201 | +- An authorized user for AssumeRole API calls must be configured in Ozone, and if using RangerOzoneAuthorizer, the role |
| 202 | +must be created in Ranger as per the Prerequisites above. |
| 203 | +- This authorized user (having permanent S3 credentials) makes the AssumeRole STS call to Ozone. |
| 204 | +- If successful, Ozone responds with the temporary credentials. |
| 205 | +- A client makes S3 API calls with the temporary credentials for as long as the credentials remain valid. |
| 206 | +- When Ozone receives an S3 API call using temporary credentials, it will use the Kerberos identity associated with the |
| 207 | +originalAccessKeyId in the session token and perform the following checks: |
| 208 | + - If the accessKeyId starts with "ASIA", ensure that a sessionToken was included in the `x-amz-security-token` header |
| 209 | + - Ensure the sessionToken is not expired |
| 210 | + - Ensure the sessionToken is not revoked via a `keyMayExist` check in OzoneManager RocksDB |
| 211 | + - Validate the HMAC-SHA256 signature in the sessionToken |
| 212 | + - Decrypt the secretAccessKey from the sessionToken and validate the AWS signature |
| 213 | + - Authorize the call with either RangerOzoneAuthorizer or OzoneNativeAuthorizer |
| 214 | + |
| 215 | +Assuming all these checks pass, the S3 API call will be invoked. |
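The order of the checks above can be sketched as a chain that stops at the first failure. Only the first two checks are implemented here; the remaining ones are placeholders that always pass, since their real logic (revocation lookup, signature validation, authorization) lives elsewhere. All names in this sketch are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical check chain, not part of Ozone.
class StsCheckChainSketch {
  // Runs checks in insertion order and returns the name of the first one
  // that fails, or null if every check passes.
  static String firstFailure(Map<String, BooleanSupplier> checks) {
    for (Map.Entry<String, BooleanSupplier> e : checks.entrySet()) {
      if (!e.getValue().getAsBoolean()) {
        return e.getKey();
      }
    }
    return null;
  }

  static Map<String, BooleanSupplier> checksFor(String accessKeyId,
      String sessionToken, long expiryMillis, long nowMillis) {
    Map<String, BooleanSupplier> checks = new LinkedHashMap<>();
    // An "ASIA" access key must be accompanied by a session token.
    checks.put("sessionToken present", () -> !accessKeyId.startsWith("ASIA")
        || (sessionToken != null && !sessionToken.isEmpty()));
    checks.put("not expired", () -> nowMillis < expiryMillis);
    // Placeholders: always pass in this sketch.
    checks.put("not revoked", () -> true);
    checks.put("token signature valid", () -> true);
    checks.put("AWS signature valid", () -> true);
    checks.put("authorized", () -> true);
    return checks;
  }
}
```

Ordering the cheap checks (presence, expiry) before the database lookup and cryptographic work is a design choice of this sketch that happens to match the order listed above.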