README.md
26 additions & 17 deletions
@@ -5,9 +5,9 @@ Data attribution methods estimate the effect on a behavior of interest of removi

 ## Core features

-Per-token and per-sequence attribution is available everywhere. Both on-disk gradient stores and on-the-fly queries are supported. Almost every feature is available through both the CLI and a programmatic interface, which use a shared set of configuration dataclasses. To understand every available configuration option, [check out the documentation](https://bergson.readthedocs.io/en/latest/api.html#bergson.IndexConfig). Bergson uses FSDP2 or SimpleFSDP, BitsAndBytes, and low-level performance optimizations to support large models, datasets, and clusters.
+Per-token and per-sequence attribution is available everywhere. On-disk gradient stores and on-the-fly queries are supported. Almost every feature is available through both the CLI and a programmatic interface, which use a shared set of configuration dataclasses. Configuration dataclasses are always serialized to disk so commands can be reproduced in one line. To understand every available configuration option, [check out the documentation](https://bergson.readthedocs.io/en/latest/api.html#bergson.IndexConfig).

-Bergson integrates with HuggingFace Transformers and Datasets, and also supports on-disk datasets in a variety of formats.
+Bergson uses FSDP2 or SimpleFSDP, BitsAndBytes, and low-level performance optimizations to support large models, datasets, and clusters. Bergson integrates with HuggingFace Transformers and Datasets, and also supports on-disk datasets in a variety of formats.

 ### Attribute through Training

@@ -24,7 +24,7 @@ For small queries and methods that don't use gradient compression (e.g., EK-FAC)

 Per-module and per-attention head gradient storage enables mechanistic interpretability.

-At a higher level, `bergson trackstar` pipelines all necessary steps for TrackStar-based attribution. See `bergson trackstar`.
+At a higher level, `bergson trackstar`, `bergson ekfac`, and `bergson approx_unrolling` orchestrate several multi-step attribution methods.

 # Announcements

@@ -49,20 +49,20 @@ pip install bergson

 # Quickstart

-To use MAGIC on a GPT-2 WikiText fine-tune:
+Use MAGIC to attribute a GPT-2 WikiText fine-tune:
Full documentation is available at https://bergson.readthedocs.io/.

+## Run with YAMLs
+
+Every run writes a self-describing `config.yaml` into its output directory, enabling the command(s) to be reproduced:
+
+```bash
+bergson <path_to_config.yaml>
+```
+
+Each config contains `steps:` a list of `- command: {...}` entries plus a `metadata:` block (bergson version, timestamp, git SHA). Multi-step pipelines may optionally specify a `run_path`. See [`examples/pipelines/hessian_then_build.yaml`](examples/pipelines/hessian_then_build.yaml) for an example of a multi-step run.
Many workflows chain several Bergson commands together. Rather than running each command separately, you can express the whole sequence as a YAML file and run it with a single command:
-
-```bash
-bergson pipeline <path_to_yaml>
-```
-
-The YAML is a list of single-key entries, one per step, each holding that command's full config. See [`examples/pipelines/hessian_then_build.yaml`](examples/pipelines/hessian_then_build.yaml) for a runnable Hessian → build example.
-
 ## Query an On-Disk Gradient Index

 We provide a query Attributor which supports unit normalized gradients and KNN search out of the box. Access it via CLI with
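
The `config.yaml` layout described in the `Run with YAMLs` addition above (a `steps:` list of single-command entries plus a `metadata:` block) can be illustrated with a minimal Python sketch. This is not Bergson's own loader. It assumes each step is a mapping with exactly one key, the command name, as the removed text described it. The command names `hessian` and `build` are guessed from the example file name, and the option keys and metadata key names are hypothetical placeholders.

```python
# Hypothetical in-memory form of a config with the layout described above.
# Command names, option keys, and metadata key names are illustrative
# assumptions, not Bergson's actual schema.
example_config = {
    "steps": [
        {"hessian": {"model": "gpt2"}},
        {"build": {"model": "gpt2"}},
    ],
    "metadata": {
        "bergson_version": "0.0.0",
        "timestamp": "1970-01-01T00:00:00",
        "git_sha": "0000000",
    },
}


def list_commands(config: dict) -> list[str]:
    """Return each step's command name, checking the assumed config shape."""
    if not isinstance(config.get("steps"), list):
        raise ValueError("expected a top-level 'steps' list")
    if not isinstance(config.get("metadata"), dict):
        raise ValueError("expected a top-level 'metadata' mapping")

    names = []
    for i, step in enumerate(config["steps"]):
        # Each step is assumed to be a single-key mapping: {command: options}.
        if not isinstance(step, dict) or len(step) != 1:
            raise ValueError(f"step {i} should be a mapping with one command key")
        name, options = next(iter(step.items()))
        if not isinstance(options, dict):
            raise ValueError(f"options for step {i} ({name!r}) should be a mapping")
        names.append(name)
    return names


print(list_commands(example_config))  # prints ['hessian', 'build']
```

A check like this only inspects the outer structure. The options accepted by each command are defined by Bergson's configuration dataclasses, which the linked documentation describes.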