Commit 543d53c
feat(serve): add SageMaker GenAI inference benchmarking and recommendation

Adds sagemaker.serve.ai_inference_recommender, a thin ergonomic layer
over sagemaker-core's AIBenchmarkJob, AIRecommendationJob, and
AIWorkloadConfig resources.
ModelBuilder gains a new entry point and extends two existing verbs:

    # Benchmark a deployed endpoint
    job = mb.start_benchmark(endpoint=ep, workload=Workload.synthetic(...))
    result = BenchmarkResult.from_job(job)

    # Recommendation flow extends optimize() and deploy()
    mb.optimize(workload=..., performance_target="throughput",
                instance_types=["ml.g6.12xlarge"])
    endpoint = mb.deploy(role=role)  # top recommendation
    endpoint = mb.deploy(role=role, recommendation_index=2)  # alternative

print(result) and print(mb.recommendations[0]) render their data as
tables.
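
As an illustration of the table rendering described above, here is a minimal sketch of a __str__ that prints metrics as an aligned text table. MetricRow, MetricsTable, and render_table are hypothetical names invented for this example; they are not the classes added by this commit, and the real output format may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MetricRow:
    # Hypothetical stand-in for one benchmark metric.
    name: str
    value: float
    unit: str


def render_table(rows: List[MetricRow]) -> str:
    """Render rows as a fixed-width text table with a header line."""
    headers = ("Metric", "Value", "Unit")
    cells = [(r.name, f"{r.value:.2f}", r.unit) for r in rows]
    # Column width is the widest of the header and every cell in that column.
    widths = [
        max([len(h)] + [len(c[i]) for c in cells])
        for i, h in enumerate(headers)
    ]

    def fmt(row) -> str:
        return " | ".join(cell.ljust(w) for cell, w in zip(row, widths))

    lines = [fmt(headers), "-+-".join("-" * w for w in widths)]
    lines.extend(fmt(c) for c in cells)
    return "\n".join(lines)


@dataclass
class MetricsTable:
    # Hypothetical container; print(table) goes through __str__.
    rows: List[MetricRow]

    def __str__(self) -> str:
        return render_table(self.rows)
```

With this shape, print(MetricsTable([...])) yields one header line, one separator line, and one line per metric.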
Public surface added under sagemaker.serve:

* Workload -- typed factory; extras pass through **params, validated
  server-side.
* BenchmarkResult / BenchmarkMetrics / BenchmarkMetric -- parses the
  AIPerf output.tar.gz from S3.
* Secret -- opt-in helper for tokens >512 chars (Secrets Manager).
* BenchmarkJob, RecommendationJob -- re-exports without the AI prefix.
* FeatureGatedError, WorkloadValidationError -- typed exceptions.
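
To make the BenchmarkResult bullet concrete, the following is a minimal sketch of reading one JSON member out of a gzip-compressed tar archive that has already been downloaded into memory. It uses only the standard library. The member name results/metrics.json and the metric keys are assumptions made up for the example; the actual layout of the AIPerf output.tar.gz is not described in this commit message, and the S3 download step is omitted.

```python
import io
import json
import tarfile


def read_json_from_targz(archive_bytes: bytes, member_suffix: str) -> dict:
    """Return the first regular file in the archive whose name ends with member_suffix, parsed as JSON."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith(member_suffix):
                extracted = tar.extractfile(member)
                return json.load(extracted)
    raise FileNotFoundError(f"no archive member ends with {member_suffix!r}")


def make_example_archive() -> bytes:
    """Build a tiny in-memory tar.gz so the sketch runs without S3 access."""
    # Hypothetical payload; real AIPerf metric names may differ.
    payload = json.dumps(
        {"request_throughput": {"avg": 12.5, "unit": "requests/sec"}}
    ).encode("utf-8")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        info = tarfile.TarInfo(name="results/metrics.json")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


if __name__ == "__main__":
    metrics = read_json_from_targz(make_example_archive(), "metrics.json")
    print(metrics["request_throughput"]["avg"])
```

Matching on a suffix rather than a full path keeps the sketch tolerant of a top-level directory inside the archive, which is a common but here unverified layout.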
Pin-mode and workload-mode optimize() kwargs are mutually exclusive.
Recommendation deploy uses the ModelPackage path (auto-approves the
package the rec job publishes).
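
The mutual-exclusivity rule for optimize() kwargs could be enforced along these lines. This is a sketch, not the commit's code: the workload-mode names come from the usage example in this message, while the pin-mode names (quantization_config, compilation_config) are placeholders assumed for illustration, and check_optimize_mode is a hypothetical helper.

```python
# Workload-mode kwargs as they appear in the usage example in this message.
WORKLOAD_MODE_KWARGS = frozenset({"workload", "performance_target", "instance_types"})
# Placeholder pin-mode kwargs; the real set is an assumption here.
PIN_MODE_KWARGS = frozenset({"quantization_config", "compilation_config"})


def check_optimize_mode(**kwargs) -> str:
    """Return 'workload', 'pin', or 'none'; raise if both modes are mixed."""
    # Only kwargs that were actually supplied (not None) count toward a mode.
    supplied = {name for name, value in kwargs.items() if value is not None}
    pin = sorted(supplied & PIN_MODE_KWARGS)
    workload = sorted(supplied & WORKLOAD_MODE_KWARGS)
    if pin and workload:
        raise ValueError(
            f"pin-mode kwargs {pin} cannot be combined with "
            f"workload-mode kwargs {workload}"
        )
    if workload:
        return "workload"
    if pin:
        return "pin"
    return "none"
```

Checking against None rather than truthiness means an explicit False or empty list still counts as supplied, so a caller cannot slip a falsy value past the check.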
Includes 51 unit tests and 2 slow_test integration tests
(tests/integ/test_ai_inference_recommender_integration.py), verified
end-to-end against real AWS.
Rebased onto upstream to pick up #5860 (preserve falsy values in
sagemaker-core serialize), required so optimize_model=False reaches
the wire.
1 parent 1b68ecf commit 543d53c
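
The falsy-value issue mentioned in the commit message is a common class of serialization bug. The sketch below illustrates it with two hypothetical functions; it does not reproduce the sagemaker-core serializer or the exact change in #5860. Filtering on truthiness drops False, 0, and empty strings, whereas filtering on "is not None" keeps them.

```python
from typing import Any, Dict


def serialize_dropping_falsy(fields: Dict[str, Any]) -> Dict[str, Any]:
    """Buggy variant: a truthiness test also discards False, 0, and ''."""
    return {key: value for key, value in fields.items() if value}


def serialize_preserving_falsy(fields: Dict[str, Any]) -> Dict[str, Any]:
    """Fixed variant: only unset (None) fields are omitted."""
    return {key: value for key, value in fields.items() if value is not None}


if __name__ == "__main__":
    request = {"optimize_model": False, "role": "example-role", "tags": None}
    print(serialize_dropping_falsy(request))
    print(serialize_preserving_falsy(request))
```

Under the first variant an explicit optimize_model=False never appears in the payload, so the service would fall back to whatever its default is.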
9 files changed
Lines changed: 774 additions & 10 deletions
File tree
- sagemaker-serve
- src/sagemaker/serve
- ai_inference_recommender
- tests/unit/test_ai_inference_recommender
Lines changed: 20 additions & 0 deletions
Lines changed: 54 additions & 1 deletion
Lines changed: 103 additions & 1 deletion