Commit 4541d10
Refactor activation normalization handling and remove the `ActivationNormalizer` class, so that activation normalization is loaded automatically by `load_pretrained`.
This commit includes the following changes:
- Removed the `ActivationNormalizer` class from `utils.py`, centralizing activation normalization logic within the `NormalizableMixin`.
- Updated the `NormalizableMixin` to accept mean and standard deviation tensors directly, replacing the previous reliance on `ActivationNormalizer`.
- Modified the constructors of `BatchTopKSAE`, `CrossCoder`, and their trainers to accept `activation_mean` and `activation_std` parameters instead of an `activation_normalizer`.
- Adjusted normalization and denormalization methods to utilize the new mean and std tensors, ensuring that activations are processed correctly.
- Cleaned up related code in `cache.py`, `dictionary.py`, and various trainer files to reflect these changes.
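
The shape of the change above can be illustrated with a minimal sketch. This is not the code from this commit: the class name `NormalizableSketch`, the method names, and the `eps` guard are assumptions made for illustration, and plain floats stand in for the mean and std tensors. Only the `activation_mean` and `activation_std` parameter names come from the commit message.

```python
from typing import Any, Optional


class NormalizableSketch:
    """Hypothetical stand-in for a mixin that stores mean/std directly."""

    def __init__(
        self,
        activation_mean: Optional[Any] = None,
        activation_std: Optional[Any] = None,
        eps: float = 1e-8,  # assumed guard against division by zero
    ) -> None:
        # Require both statistics or neither, so normalization is never half-configured.
        if (activation_mean is None) != (activation_std is None):
            raise ValueError("pass both activation_mean and activation_std, or neither")
        self.activation_mean = activation_mean
        self.activation_std = activation_std
        self.eps = eps

    @property
    def has_normalizer(self) -> bool:
        return self.activation_mean is not None

    def normalize(self, x: Any) -> Any:
        # Identity when no statistics were provided.
        if not self.has_normalizer:
            return x
        return (x - self.activation_mean) / (self.activation_std + self.eps)

    def denormalize(self, x: Any) -> Any:
        # Inverse of normalize(); identity when no statistics were provided.
        if not self.has_normalizer:
            return x
        return x * (self.activation_std + self.eps) + self.activation_mean


# Usage: floats stand in for tensors; any type with elementwise arithmetic works.
model = NormalizableSketch(activation_mean=2.0, activation_std=4.0)
roundtrip = model.denormalize(model.normalize(10.0))
```

Passing the statistics themselves, rather than a separate normalizer object, means the model carries everything it needs to normalize its inputs. How the actual classes persist these values for `load_pretrained` is not shown in this commit message.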
These modifications enhance the clarity and maintainability of the code while ensuring proper handling of activation normalization across various components.

1 parent 68b3025 commit 4541d10
5 files changed
Lines changed: 116 additions & 139 deletions