Commit 72822e2

Update README.md
1 parent b01e957 commit 72822e2

1 file changed

Lines changed: 4 additions & 4 deletions

README.md

@@ -8,12 +8,12 @@
 
 FisherAdapTune wraps any PyTorch model and optimizer with a Fisher-guided chunk-freeze loop:
 
-1. **Fisher collection** diagonal FIM statistics are accumulated via [AdaFisher](scripts/adafisher.py) hooks on `Linear`, `Conv2d`, `BatchNorm2d`, and `LayerNorm` layers.
-2. **JS divergence tracking** Jensen-Shannon distance between consecutive Fisher histograms is computed per parameter chunk. A low, stable JS distance signals that a chunk has stopped learning.
+1. **Fisher collection** - diagonal FIM statistics are accumulated via [AdaFisher](scripts/adafisher.py) hooks on `Linear`, `Conv2d`, `BatchNorm2d`, and `LayerNorm` layers.
+2. **JS divergence tracking** - Jensen-Shannon distance between consecutive Fisher histograms is computed per parameter chunk. A low, stable JS distance signals that a chunk has stopped learning.
 3. **Iterative freezing** — parameter groups whose JS scores fall below an adaptive threshold are masked and frozen. Frozen parameter groups are skipped in forward/backward passes, saving computation in later training stages.
-4. **Decoupled weight decay** — applied only to active (unfrozen) parameter entries, so regularisation does not fight the freeze signal.
 
-The trainer is fully **plug-and-play**: you supply the model, optimizer, data loaders, and two callables (`train_step_fn`, `val_step_fn`). No subclassing required.
+
+The trainer is fully **plug-and-play**: you can supply the model, optimizer, data loaders, and two callables (`train_step_fn`, `val_step_fn`). No subclassing required. See the [Quick Start](#quick-start) section.
 
 ---

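For readers of this diff, the JS-tracking and freezing steps described in the README text can be illustrated with a small, dependency-free sketch. It is not FisherAdapTune's implementation: the function names (`histogram`, `js_distance`, `chunks_to_freeze`), the fixed bin range, the fixed `threshold`, and the `patience` window are all assumptions made for illustration, and the README describes an adaptive threshold rather than the constant used here.

```python
import math


def histogram(values, bins=8, lo=0.0, hi=1.0):
    """Count values into equal-width bins over [lo, hi] (hypothetical helper)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp out-of-range values into the first or last bin.
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    return counts


def js_distance(p, q):
    """Jensen-Shannon distance (base 2) between two non-empty histograms."""
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 whenever a > 0 here.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(jsd, 0.0))


def chunks_to_freeze(js_history, threshold=0.05, patience=2):
    """Return chunk names whose last `patience` JS distances are all below threshold.

    Assumption: a fixed threshold stands in for the adaptive one in the README.
    """
    return [
        name
        for name, dists in js_history.items()
        if len(dists) >= patience and all(d < threshold for d in dists[-patience:])
    ]
```

In this sketch, a chunk's Fisher values from two consecutive collection steps would each be passed through `histogram`, compared with `js_distance`, and the result appended to that chunk's list in `js_history`; `chunks_to_freeze` then selects chunks whose distance has stayed low for several checks in a row.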