A minimal Python library for writing and running benchmarks.
`microbenchmark` gives you simple building blocks — `Scenario`, `ScenarioGroup`, and `BenchmarkResult` — that you can embed directly into your project or call from CI. There is no CLI tool to install and no configuration to manage. You write a Python file, call `.run()` or `.cli()`, and you are done.
**Key features:**
- A `Scenario` wraps any callable with a fixed argument list and runs it `n` times, collecting per-run timings.
- A `ScenarioGroup` lets you combine scenarios and run them together with a single call.
- `BenchmarkResult` holds every individual duration and gives you mean, best, worst, and percentile views.
- Results can be serialized to and restored from JSON.
- No external dependencies beyond the Python standard library.
print(result.mean)  # example value — actual timings depend on your hardware
#> 0.000012
print(result.best)
#> 0.000010
print(result.worst)
#> 0.000018
print(len(result.durations))
#> 500
```
---
```python
Scenario(
    function,
    args=None,
    name,
    doc='',
    number=1000,
    timer=...,  # defaults to time.perf_counter
)
```
- `function` — the callable to benchmark.
- `args` — a list of positional arguments passed to `function` on every call. `None` (the default) and `[]` both mean the function is called with no positional arguments. The list is shallow-copied on construction, so appending to your original list afterward has no effect. Keyword arguments are not supported; wrap your callable in a `functools.partial` or a lambda if you need them.
- `name` — a short label for this scenario (required).
- `doc` — an optional longer description.
- `number` — how many times to call `function` per run. Must be at least `1`; passing `0` or a negative value raises `ValueError`.
- `timer` — a zero-argument callable that returns the current time as a `float`. Defaults to `time.perf_counter`. Useful for injecting a controlled clock in tests.
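As an illustration of why an injectable timer helps in tests, a deterministic clock makes measured durations exact. This is a plain-Python sketch of the idea, not the library's internals; `fake_clock` and `measure` are hypothetical names:

```python
import itertools

# Hypothetical fake clock: each call advances exactly 0.001 "seconds".
fake_clock = itertools.count(start=0.0, step=0.001).__next__

def measure(func, timer):
    # A minimal read-call-read timing loop in the spirit described above.
    start = timer()
    func()
    return timer() - start

duration = measure(lambda: None, timer=fake_clock)
print(duration)
#> 0.001
```

With a real clock the result would vary from run to run; with the fake clock it is always exactly one tick.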
```python
from microbenchmark import Scenario
scenario = Scenario(
    sorted,
    args=[[3, 1, 2]],
    name='sort',  # required label
)
```
For keyword arguments, use `functools.partial`:
```python
from functools import partial
from microbenchmark import Scenario
scenario = Scenario(
    partial(sorted, key=lambda x: -x),
    args=[[3, 1, 2]],
    name='sort_descending',
)
```
### `run(warmup=0)`
Runs the benchmark and returns a `BenchmarkResult`.
The optional `warmup` argument specifies how many calls to make before timing begins. Warm-up calls invoke the function and consume timer ticks, but their timings are not included in the result.
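The effect of warm-up can be modelled in a few lines of plain Python. This is a simplified sketch of the behaviour described above, not the library's implementation; `timed_runs` is a hypothetical helper:

```python
import time

def timed_runs(func, number, warmup=0):
    # Warm-up calls execute the function but their timings are discarded.
    for _ in range(warmup):
        func()
    durations = []
    for _ in range(number):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations

durations = timed_runs(lambda: None, number=10, warmup=5)
print(len(durations))  # only the timed calls contribute
#> 10
```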
### `cli()`
Turns the scenario into a small command-line program. Call `scenario.cli()` as the entry point of a script and it will parse `sys.argv`, run the benchmark, and print the result.
Supported arguments:
- `--number N` — override the scenario's `number` for this run.
- `--max-mean THRESHOLD` — exit with code `1` if the mean time (in seconds) exceeds `THRESHOLD`. Useful in CI.
- `--help` — print usage information and exit.
Output format:
```
benchmark: <name>
mean: <mean>s
best: <best>s
worst: <worst>s
```
Values are in seconds. The `mean`, `best`, and `worst` labels are padded to the same width. If `--max-mean` is supplied and the actual mean exceeds the threshold, the same output is printed but the process exits with code `1`.
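The `--max-mean` behaviour amounts to a simple exit-code gate. A sketch of that logic with made-up numbers (`gate` is a hypothetical name, not the library's code):

```python
def gate(mean, max_mean=None):
    # Exit code 1 when a threshold is set and exceeded, else 0.
    if max_mean is not None and mean > max_mean:
        return 1
    return 0

print(gate(0.000012, max_mean=0.00001))  # mean exceeds threshold
#> 1
print(gate(0.000012))                    # no threshold supplied
#> 0
```

CI systems treat a non-zero exit code as a failed step, which is what makes this useful as a regression guard.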
You can also create an empty group and combine it with others later:
```python
from microbenchmark import ScenarioGroup

empty = ScenarioGroup()
print(len(empty.run()))
#> 0
```
**The `+` operator between scenarios** — adding two or more `Scenario` objects produces a `ScenarioGroup`:
```python
from microbenchmark import Scenario
s1 = Scenario(lambda: None, name='s1')
s2 = Scenario(lambda: None, name='s2')
group = s1 + s2
```
**Adding a scenario to a group** — the result is always a flat group with no nesting:
```python
from microbenchmark import Scenario, ScenarioGroup
s1 = Scenario(lambda: None, name='s1')
s2 = Scenario(lambda: None, name='s2')
s3 = Scenario(lambda: None, name='s3')
group = s1 + s2 + s3
print(type(group).__name__)
#> ScenarioGroup
```

**Adding two groups together** — the result is a single flat group containing the scenarios from both:
```python
from microbenchmark import Scenario, ScenarioGroup
s1 = Scenario(lambda: None, name='s1')
s2 = Scenario(lambda: None, name='s2')
s3 = Scenario(lambda: None, name='s3')
g1 = ScenarioGroup(s1)
g2 = ScenarioGroup(s2, s3)
combined = g1 + g2
print(len(combined.run()))
#> 3
```

### `run(warmup=0)`
Runs every scenario in order and returns a list of `BenchmarkResult` objects. The order in the list matches the order the scenarios were added. The `warmup` argument is forwarded to each scenario.
```python
from microbenchmark import Scenario, ScenarioGroup
s1 = Scenario(lambda: None, name='s1')
s2 = Scenario(lambda: None, name='s2')
group = ScenarioGroup(s1, s2)
results = group.run(warmup=50)
for result in results:
    print(result.scenario.name)
#> s1
#> s2
```
### `cli()`
Runs all scenarios and prints their results separated by `---` dividers.
Supported arguments:
- `--number N` — passed to every scenario.
- `--max-mean THRESHOLD` — exits with code `1` if any scenario's mean exceeds the threshold.
- `--help` — print usage information and exit.
```python
# benchmarks.py
from microbenchmark import Scenario, ScenarioGroup

group = (
    Scenario(lambda: None, name='s1')
    + Scenario(lambda: None, name='s2')
)

if __name__ == '__main__':
    group.cli()
```
### Fields
- `scenario: Scenario | None` — the `Scenario` that produced this result, or `None` if the result was restored from JSON.
- `durations: tuple[float, ...]` — per-call timings in seconds, one entry per call.
- `mean: float` — arithmetic mean of `durations`, computed with `math.fsum` to minimize floating-point error.
- `best: float` — the shortest individual timing.
- `worst: float` — the longest individual timing.
- `is_primary: bool` — `True` for results returned directly by `run()`, `False` for results derived via `percentile()`.
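For reference, the aggregate fields can be reproduced from a raw duration list (made-up values; `math.fsum` matches the field description above):

```python
import math

durations = (0.000012, 0.000010, 0.000018)

# fsum sums the floats with extended precision, avoiding accumulated rounding error.
mean = math.fsum(durations) / len(durations)
best = min(durations)
worst = max(durations)

print(best <= mean <= worst)
#> True
```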
```python
from microbenchmark import Scenario
result = Scenario(lambda: None, name='noop', number=100).run()
print(len(result.durations))
#> 100
print(result.is_primary)
#> True
```

### `percentile(p)`
Returns a new `BenchmarkResult` containing only the `ceil(len(durations) * p / 100)` fastest timings, sorted by duration ascending. The returned result has `is_primary=False`. `p` must be in the range `(0, 100]`; passing `0` or a value above `100` raises `ValueError`.
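The selection rule is easy to check in plain Python (hypothetical durations, not library code):

```python
import math

durations = (0.5, 0.1, 0.4, 0.2, 0.3)
p = 40

# How many of the fastest timings survive: ceil(5 * 40 / 100) = 2.
keep = math.ceil(len(durations) * p / 100)
fastest = sorted(durations)[:keep]

print(fastest)
#> [0.1, 0.2]
```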
```python
from microbenchmark import Scenario
result = Scenario(lambda: None, name='noop', number=100).run()
p50 = result.percentile(50)
print(len(p50.durations))
#> 50
```

### `p95` and `p99`
Convenient cached properties that return `percentile(95)` and `percentile(99)` respectively. The value is computed once and cached for the lifetime of the result object.
```python
from microbenchmark import Scenario
result = Scenario(lambda: None, name='noop', number=100).run()
p95 = result.p95
print(len(p95.durations))
#> 95
print(p95.is_primary)
#> False
```
### `to_json()` and `from_json()`
`to_json()` serializes the result to a JSON string. It stores all individual `durations`, `is_primary`, and the scenario's `name`, `doc`, and `number`.
`from_json()` is a class method that restores a `BenchmarkResult` from a JSON string produced by `to_json()`. Because the original callable cannot be serialized, the restored result has `scenario=None`. The `mean`, `best`, and `worst` fields are recomputed from `durations` on restoration.
```python
from microbenchmark import Scenario, BenchmarkResult

result = Scenario(lambda: None, name='noop', number=100).run()
data = result.to_json()
restored = BenchmarkResult.from_json(data)
print(restored.scenario)
#> None
```

| | `microbenchmark` | `timeit` | `pytest-benchmark` |
| --- | --- | --- | --- |
| CI integration (`--max-mean`) | yes | no | via configuration |
| `+` operator for grouping | yes | no | no |
| External dependencies | none | none | several |
| Embeddable in your own code | yes | yes | pytest plugin required |
`timeit` from the standard library is great for interactive exploration but gives you only a single aggregate number and offers no per-call data. `pytest-benchmark` is powerful and well integrated into the `pytest` ecosystem, but it is tightly coupled to the test runner and brings its own dependencies. `microbenchmark` sits between the two: richer than `timeit`, lighter and more portable than `pytest-benchmark`, and not tied to any test framework.