A minimal Python library for writing and running benchmarks.
`microbenchmark` gives you simple building blocks — `Scenario`, `ScenarioGroup`, and `BenchmarkResult` — that you can embed directly into your project or call from CI. There is no CLI tool to install and no configuration to manage. You write a Python file, call `.run()` or `.cli()`, and you are done.
**Key features:**
- A `Scenario` wraps any callable with a fixed argument list and runs it `n` times, collecting per-run timings.
- A `ScenarioGroup` lets you combine scenarios and run them together with a single call.
- `BenchmarkResult` holds every individual duration and gives you mean, best, worst, and percentile views.
- Results can be serialized to and restored from JSON.
- No external dependencies beyond the Python standard library.
print(result.mean)  # example value — actual timings depend on your hardware
#> 0.000012
print(result.best)
#> 0.000010
print(result.worst)
#> 0.000018
print(len(result.durations))
#> 500
```
---
```python
Scenario(
    function,
    args=None,
    name,
    doc='',
    number=1000,
    timer=...,  # defaults to time.perf_counter
)
```
- `function` — the callable to benchmark.
- `args` — a list of positional arguments passed to `function` on every call. `None` (the default) and `[]` both mean the function is called with no positional arguments. The list is shallow-copied on construction, so appending to your original list afterward has no effect. Keyword arguments are not supported; wrap your callable in a `functools.partial` or a lambda if you need them.
- `name` — a short label for this scenario (required).
- `doc` — an optional longer description.
- `number` — how many times to call `function` per run. Must be at least `1`; passing `0` or a negative value raises `ValueError`.
- `timer` — a zero-argument callable that returns the current time as a `float`. Defaults to `time.perf_counter`. Useful for injecting a controlled clock in tests.
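As an illustration of why an injectable timer helps in tests, a deterministic clock makes measured durations exact. This is a plain-Python sketch of the idea, not the library's internals; `fake_clock` and `measure` are hypothetical names:

```python
import itertools

# Hypothetical fake clock: each call advances exactly 0.001 "seconds".
fake_clock = itertools.count(start=0.0, step=0.001).__next__

def measure(func, timer):
    # A minimal read-call-read timing loop in the spirit described above.
    start = timer()
    func()
    return timer() - start

duration = measure(lambda: None, timer=fake_clock)
print(duration)
#> 0.001
```

With a real clock the result would vary from run to run; with the fake clock it is always exactly one tick.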
```python
from microbenchmark import Scenario
scenario = Scenario(
    sorted,
    args=[[3, 1, 2]],
    name='sort',  # required label
)
```
For keyword arguments, use `functools.partial`:
```python
from functools import partial
from microbenchmark import Scenario
scenario = Scenario(
    partial(sorted, key=lambda x: -x),
    args=[[3, 1, 2]],
    name='sort_descending',
)
```
### `run(warmup=0)`
Runs the benchmark and returns a `BenchmarkResult`.
The optional `warmup` argument specifies how many calls to make before timing begins. Warm-up calls invoke the function and consume timer ticks, but their timings are not included in the result.
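The effect of warm-up can be modelled in a few lines of plain Python. This is a simplified sketch of the behaviour described above, not the library's implementation; `timed_runs` is a hypothetical helper:

```python
import time

def timed_runs(func, number, warmup=0):
    # Warm-up calls execute the function but their timings are discarded.
    for _ in range(warmup):
        func()
    durations = []
    for _ in range(number):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations

durations = timed_runs(lambda: None, number=10, warmup=5)
print(len(durations))  # only the timed calls contribute
#> 10
```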
### `cli()`
Turns the scenario into a small command-line program. Call `scenario.cli()` as the entry point of a script and it will parse `sys.argv`, run the benchmark, and print the result.
Supported arguments:
- `--number N` — override the scenario's `number` for this run.
- `--max-mean THRESHOLD` — exit with code `1` if the mean time (in seconds) exceeds `THRESHOLD`. Useful in CI.
- `--help` — print usage information and exit.
Output format:
```
benchmark: <name>
mean: <mean>s
best: <best>s
worst: <worst>s
```
Values are in seconds. The `mean`, `best`, and `worst` labels are padded to the same width. If `--max-mean` is supplied and the actual mean exceeds the threshold, the same output is printed but the process exits with code `1`.
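The `--max-mean` behaviour amounts to a simple exit-code gate. A sketch of that logic with made-up numbers (`gate` is a hypothetical name, not the library's code):

```python
def gate(mean, max_mean=None):
    # Exit code 1 when a threshold is set and exceeded, else 0.
    if max_mean is not None and mean > max_mean:
        return 1
    return 0

print(gate(0.000012, max_mean=0.00001))  # mean exceeds threshold
#> 1
print(gate(0.000012))                    # no threshold supplied
#> 0
```

CI systems treat a non-zero exit code as a failed step, which is what makes this useful as a regression guard.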
You can also create an empty group and combine it with others later:
```python
from microbenchmark import ScenarioGroup

empty = ScenarioGroup()
print(len(empty.run()))
#> 0
```
**The `+` operator between scenarios** — adding two or more `Scenario` objects produces a `ScenarioGroup`:
```python
from microbenchmark import Scenario
s1 = Scenario(lambda: None, name='s1')
s2 = Scenario(lambda: None, name='s2')
group = s1 + s2
```
**Adding a scenario to a group** — the result is always a flat group with no nesting:
```python
from microbenchmark import Scenario, ScenarioGroup
s1 = Scenario(lambda: None, name='s1')
s2 = Scenario(lambda: None, name='s2')
s3 = Scenario(lambda: None, name='s3')
group = s1 + s2 + s3
print(type(group).__name__)
#> ScenarioGroup
```

**Adding two groups together** — the result is a single flat group containing the scenarios from both:
```python
from microbenchmark import Scenario, ScenarioGroup
s1 = Scenario(lambda: None, name='s1')
s2 = Scenario(lambda: None, name='s2')
s3 = Scenario(lambda: None, name='s3')
g1 = ScenarioGroup(s1)
g2 = ScenarioGroup(s2, s3)
combined = g1 + g2
print(len(combined.run()))
#> 3
```

### `run(warmup=0)`
Runs every scenario in order and returns a list of `BenchmarkResult` objects. The order in the list matches the order the scenarios were added. The `warmup` argument is forwarded to each scenario.
```python
from microbenchmark import Scenario, ScenarioGroup
s1 = Scenario(lambda: None, name='s1')
s2 = Scenario(lambda: None, name='s2')
group = ScenarioGroup(s1, s2)
results = group.run(warmup=50)
for result in results:
    print(result.scenario.name)
#> s1
#> s2
```
### `cli()`
Runs all scenarios and prints their results separated by `---` dividers.
Supported arguments:
- `--number N` — passed to every scenario.
- `--max-mean THRESHOLD` — exits with code `1` if any scenario's mean exceeds the threshold.
- `--help` — print usage information and exit.
```python
# benchmarks.py
from microbenchmark import Scenario, ScenarioGroup

group = (
    Scenario(lambda: None, name='s1')
    + Scenario(lambda: None, name='s2')
)

if __name__ == '__main__':
    group.cli()
```
### Fields
- `scenario: Scenario | None` — the `Scenario` that produced this result, or `None` if the result was restored from JSON.
- `durations: tuple[float, ...]` — per-call timings in seconds, one entry per call.
- `mean: float` — arithmetic mean of `durations`, computed with `math.fsum` to minimize floating-point error.
- `best: float` — the shortest individual timing.
- `worst: float` — the longest individual timing.
- `is_primary: bool` — `True` for results returned directly by `run()`, `False` for results derived via `percentile()`.
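For reference, the aggregate fields can be reproduced from a raw duration list (made-up values; `math.fsum` matches the field description above):

```python
import math

durations = (0.000012, 0.000010, 0.000018)

# fsum sums the floats with extended precision, avoiding accumulated rounding error.
mean = math.fsum(durations) / len(durations)
best = min(durations)
worst = max(durations)

print(best <= mean <= worst)
#> True
```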
```python
from microbenchmark import Scenario
result = Scenario(lambda: None, name='noop', number=100).run()
print(len(result.durations))
#> 100
print(result.is_primary)
#> True
```

### `percentile(p)`
Returns a new `BenchmarkResult` containing only the `ceil(len(durations) * p / 100)` fastest timings, sorted by duration ascending. The returned result has `is_primary=False`. `p` must be in the range `(0, 100]`; passing `0` or a value above `100` raises `ValueError`.
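The selection rule is easy to check in plain Python (hypothetical durations, not library code):

```python
import math

durations = (0.5, 0.1, 0.4, 0.2, 0.3)
p = 40

# How many of the fastest timings survive: ceil(5 * 40 / 100) = 2.
keep = math.ceil(len(durations) * p / 100)
fastest = sorted(durations)[:keep]

print(fastest)
#> [0.1, 0.2]
```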
```python
from microbenchmark import Scenario
result = Scenario(lambda: None, name='noop', number=100).run()
p50 = result.percentile(50)
print(len(p50.durations))
#> 50
```

### `p95` and `p99`
Convenient cached properties that return `percentile(95)` and `percentile(99)` respectively. The value is computed once and cached for the lifetime of the result object.
```python
from microbenchmark import Scenario
result = Scenario(lambda: None, name='noop', number=100).run()
p95 = result.p95
print(len(p95.durations))
#> 95
print(p95.is_primary)
#> False
```
### `to_json()` and `from_json()`
`to_json()` serializes the result to a JSON string. It stores all individual `durations`, `is_primary`, and the scenario's `name`, `doc`, and `number`.
`from_json()` is a class method that restores a `BenchmarkResult` from a JSON string produced by `to_json()`. Because the original callable cannot be serialized, the restored result has `scenario=None`. The `mean`, `best`, and `worst` fields are recomputed from `durations` on restoration.
```python
from microbenchmark import Scenario, BenchmarkResult

result = Scenario(lambda: None, name='noop', number=100).run()
data = result.to_json()
restored = BenchmarkResult.from_json(data)
print(restored.scenario)
#> None
```

| | `microbenchmark` | `timeit` | `pytest-benchmark` |
| --- | --- | --- | --- |
| CI integration (`--max-mean`) | yes | no | via configuration |
| `+` operator for grouping | yes | no | no |
| External dependencies | none | none | several |
| Embeddable in your own code | yes | yes | pytest plugin required |
`timeit` from the standard library is great for interactive exploration but gives you only a single aggregate number and offers no per-call data. `pytest-benchmark` is powerful and well integrated into the `pytest` ecosystem, but it is tightly coupled to the test runner and brings its own dependencies. `microbenchmark` sits between the two: richer than `timeit`, lighter and more portable than `pytest-benchmark`, and not tied to any test framework.