Commit d33a6bb

feat: Consolidate per-run YAMLs into one reproducible config.yaml
1 parent bc84d80 commit d33a6bb

52 files changed

Lines changed: 1697 additions & 1303 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -199,3 +199,4 @@ zeki_requirements.txt
 *package.json
 david_wips/
 test_files/
+scripts/

README.md

Lines changed: 26 additions & 17 deletions
@@ -5,9 +5,9 @@ Data attribution methods estimate the effect on a behavior of interest of removi
 
 ## Core features
 
-Per-token and per-sequence attribution is available everywhere. Both on-disk gradient stores and on-the-fly queries are supported. Almost every feature is available through both the CLI and a programmatic interface, which use a shared set of configuration dataclasses. To understand every available configuration option, [check out the documentation](https://bergson.readthedocs.io/en/latest/api.html#bergson.IndexConfig). Bergson uses FSDP2 or SimpleFSDP, BitsAndBytes, and low-level performance optimizations to support large models, datasets, and clusters.
+Per-token and per-sequence attribution is available everywhere. On-disk gradient stores and on-the-fly queries are supported. Almost every feature is available through both the CLI and a programmatic interface, which use a shared set of configuration dataclasses. Configuration dataclasses are always serialized to disk so commands can be reproduced in one line. To understand every available configuration option, [check out the documentation](https://bergson.readthedocs.io/en/latest/api.html#bergson.IndexConfig).
 
-Bergson integrates with HuggingFace Transformers and Datasets, and also supports on-disk datasets in a variety of formats.
+Bergson uses FSDP2 or SimpleFSDP, BitsAndBytes, and low-level performance optimizations to support large models, datasets, and clusters. Bergson integrates with HuggingFace Transformers and Datasets, and also supports on-disk datasets in a variety of formats.
 
 ### Attribute through Training
 
@@ -24,7 +24,7 @@ For small queries and methods that don't use gradient compression (e.g., EK-FAC)
 
 Per-module and per-attention head gradient storage enables mechanistic interpretability.
 
-At a higher level, `bergson trackstar` pipelines all necessary steps for TrackStar-based attribution. See `bergson trackstar`.
+At a higher level, `bergson trackstar`, `bergson ekfac`, and `bergson approx_unrolling` orchestrate several multi-step attribution methods.
 
 # Announcements
 
@@ -49,20 +49,20 @@ pip install bergson
 
 # Quickstart
 
-To use MAGIC on a GPT-2 WikiText fine-tune:
+Use MAGIC to attribute a GPT-2 WikiText fine-tune:
 
 ```bash
-bergson magic examples/magic/gpt2_wikitext_tiny.yaml
+bergson examples/magic/gpt2_wikitext_tiny.yaml
 ```
 
-To construct and query an on-disk index of randomly projected gradients:
+Construct and query an on-disk index of randomly projected gradients:
 
 ```bash
 bergson build runs/index --model EleutherAI/pythia-14m --dataset NeelNanda/pile-10k --truncation --token_batch_size 4096 --projection_dim 16
 bergson query --index runs/index --unit_norm
 ```
 
-To collect TrackStar attribution scores for an I.I.D sample query:
+Collect TrackStar attribution scores for an I.I.D sample query:
 
 ```bash
 bergson trackstar runs/trackstar --model EleutherAI/pythia-14m --query.dataset NeelNanda/pile-10k --data.dataset NeelNanda/pile-10k --data.truncation --token_batch_size 4096 --query.truncation --query.split "train[:20]"
@@ -72,6 +72,25 @@ bergson trackstar runs/trackstar --model EleutherAI/pythia-14m --query.dataset N
 
 Full documentation is available at https://bergson.readthedocs.io/.
 
+## Run with YAMLs
+
+Every run writes a self-describing `config.yaml` into its output directory, enabling the command(s) to be reproduced:
+
+```bash
+bergson <path_to_config.yaml>
+```
+
+Each config contains `steps:` a list of `- command: {...}` entries plus a `metadata:` block (bergson version, timestamp, git SHA). Multi-step pipelines may optionally specify a `run_path`. See [`examples/pipelines/hessian_then_build.yaml`](examples/pipelines/hessian_then_build.yaml) for an example of a multi-step run.
+
+```yaml
+steps:
+- build: {index_cfg: {run_path: runs/idx}, preprocess_cfg: {}}
+metadata:
+  bergson_version: 0.9.1
+  created: 2026-06-03T14:22:10Z
+  git_sha: abc1234
+```
+
 ## Gradient Collection
 
 You can build an index of gradients for each training sample from the command line, using `bergson` as a CLI tool:
@@ -103,16 +122,6 @@ You can also aggregate your query dataset into a single mean or sum gradient as
 bergson build <output_path> --model <model_name> --dataset <dataset_name> --aggregation mean --unit_normalize --hessian_path <path_to_hessian>
 ```
 
-## Run a Multi-Step Pipeline
-
-Many workflows chain several Bergson commands together. Rather than running each command separately, you can express the whole sequence as a YAML file and run it with a single command:
-
-```bash
-bergson pipeline <path_to_yaml>
-```
-
-The YAML is a list of single-key entries, one per step, each holding that command's full config. See [`examples/pipelines/hessian_then_build.yaml`](examples/pipelines/hessian_then_build.yaml) for a runnable Hessian → build example.
-
 ## Query an On-Disk Gradient Index
 
 We provide a query Attributor which supports unit normalized gradients and KNN search out of the box. Access it via CLI with

bergson/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
 from .collector.collector import CollectorComputer
 from .collector.gradient_collectors import GradientCollector
 from .collector.in_memory_collector import InMemoryCollector
-from .config import (
+from .config.config import (
     AttentionConfig,
     DataConfig,
     IndexConfig,
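
Note on the `config.yaml` layout: the README hunk describes each config as a `steps:` list of single-key `command: {...}` entries plus a `metadata:` block. The Python sketch below shows one way to walk that layout once the file has been parsed. It is an illustration under stated assumptions, not Bergson's own code: `summarize_run_config` is a hypothetical helper, and the dict literal stands in for what a YAML loader would return for the README's example.

```python
from typing import Any


def summarize_run_config(cfg: dict[str, Any]) -> list[tuple[str, dict[str, Any]]]:
    """Return (command, options) pairs from a parsed config, in step order.

    Hypothetical helper: assumes the layout described in the README, i.e. a
    `steps` list whose entries are single-key mappings of command -> options.
    """
    steps = cfg.get("steps")
    if not isinstance(steps, list) or not steps:
        raise ValueError("config must contain a non-empty 'steps' list")

    parsed = []
    for i, step in enumerate(steps):
        # Each step names exactly one command, so reject anything else early.
        if not isinstance(step, dict) or len(step) != 1:
            raise ValueError(f"step {i} must be a mapping with exactly one command key")
        command, options = next(iter(step.items()))
        parsed.append((command, options or {}))
    return parsed


# Stand-in for the parsed README example (assumed shape, values copied from the diff).
example_cfg = {
    "steps": [
        {"build": {"index_cfg": {"run_path": "runs/idx"}, "preprocess_cfg": {}}},
    ],
    "metadata": {
        "bergson_version": "0.9.1",
        "created": "2026-06-03T14:22:10Z",
        "git_sha": "abc1234",
    },
}

for command, options in summarize_run_config(example_cfg):
    print(command, sorted(options))
```

Keeping each step as a single-key mapping means the step order is explicit in the list, and the command name never has to be repeated inside its own options.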
