22
33Node and channel metrics for neural network interpretability, importance, and interventions.
44
5- [![Tests](https://github.com/KempnerInstitute/nodelens/actions/workflows/test.yml/badge.svg)](https://github.com/KempnerInstitute/nodelens/actions/workflows/test.yml)
6- [![Lint](https://github.com/KempnerInstitute/nodelens/actions/workflows/lint.yml/badge.svg)](https://github.com/KempnerInstitute/nodelens/actions/workflows/lint.yml)
7- [![Documentation](https://github.com/KempnerInstitute/nodelens/actions/workflows/docs.yml/badge.svg)](https://github.com/KempnerInstitute/nodelens/actions/workflows/docs.yml)
8- [![Release](https://github.com/KempnerInstitute/nodelens/actions/workflows/release.yml/badge.svg)](https://github.com/KempnerInstitute/nodelens/actions/workflows/release.yml)
5+ [![Tests](https://github.com/KempnerInstitute/NodeLens/actions/workflows/test.yml/badge.svg)](https://github.com/KempnerInstitute/NodeLens/actions/workflows/test.yml)
6+ [![Lint](https://github.com/KempnerInstitute/NodeLens/actions/workflows/lint.yml/badge.svg)](https://github.com/KempnerInstitute/NodeLens/actions/workflows/lint.yml)
7+ [![Documentation](https://github.com/KempnerInstitute/NodeLens/actions/workflows/docs.yml/badge.svg)](https://github.com/KempnerInstitute/NodeLens/actions/workflows/docs.yml)
98[![Python](https://img.shields.io/badge/python-%3E%3D3.8-3776AB?logo=python&logoColor=white)](pyproject.toml)
10- [![Artifacts](https://img.shields.io/badge/Hugging%20Face-artifacts-ffcc33)](https://huggingface.co/datasets/hsafaai/supernodes-scar-artifacts)
119[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
1210
1311NodeLens is a research codebase for studying which channels, neurons, and
14- features matter most for model behavior. The Python package is imported as
15- `nodelens`.
16-
17- The repository supports two related workflows:
18-
19- - General metric analysis for vision models, transformers, and LLMs.
20- - Paper-specific releases under `projects/`, including the Supernodes and SCAR
21- artifact workflow.
12+ features matter most for model behavior. It combines activation capture,
13+ importance metrics, redundancy and information measures, structured
14+ interventions, and report generation in one configuration-driven workflow. The
15+ Python package is imported as `nodelens`.
2216
2317## What The Code Does
2418
25- ``` mermaid
26- flowchart LR
27- A[Model + calibration data] --> B[Capture activations and gradients]
28- B --> C[Compute channel metrics]
29- C --> D[Identify loss-critical cores]
30- C --> E[Estimate redundancy and halo structure]
31- D --> F[Structured pruning and ablation probes]
32- E --> F
33- F --> G[Figures, tables, manifests, HF artifacts]
19+ ``` text
20+ Model + data
21+ |
22+ v
23+ Activation and gradient capture
24+ |
25+ v
26+ Channel and node metrics
27+ |-- activation statistics
28+ |-- Rayleigh quotient and spectral alignment
29+ |-- mutual information, redundancy, and synergy
30+ |-- gradients, curvature, Taylor scores, and loss proxies
31+ |
32+ v
33+ Analysis and interventions
34+ |-- identify outliers or loss-critical cores
35+ |-- cluster channels by metric profile
36+ |-- test ablations, pruning, and sensitivity probes
37+ |-- generate figures, tables, summaries, and manifests
3438```
3539
3640Core capabilities:
3741
38- - Loss-sensitive channel scoring, including SCAR loss-proxy metrics.
39- - Activation, curvature, Taylor, Rayleigh quotient, and information-theoretic metrics.
40- - Structured pruning strategies for channel-level model analysis.
41- - Cluster and halo-style analyses for local redundancy structure.
42- - Reproducible project folders for paper artifacts and public releases.
43-
44- Supported model families include MLPs, CNNs, transformer language models, and
45- LLM backends through Hugging Face causal language models.
42+ - Metric analysis for MLPs, CNNs, transformers, and Hugging Face causal LMs.
43+ - Node and channel scoring with activation, alignment, information,
44+ redundancy, gradient, curvature, and loss-sensitive metrics.
45+ - Structured pruning and ablation tools for testing whether high-scoring
46+ channels are functionally important.
47+ - Clustering and cross-layer analyses for studying local organization,
48+ redundancy, and downstream dependence.
49+ - Project workflows under `projects/` that show how to reproduce concrete
50+ analyses with the shared library.
4651
4752## Installation
4853
4954``` bash
50- git clone https://github.com/KempnerInstitute/nodelens.git
51- cd nodelens
55+ git clone https://github.com/KempnerInstitute/NodeLens.git
56+ cd NodeLens
5257conda env create -f environment.yml
5358conda activate nodelens
5459pip install -e .
@@ -62,40 +67,42 @@ pip install -e .[all]
6267
6368## Quick Start
6469
70+ Run experiments from YAML configs:
71+
6572``` bash
66- # Vision model analysis
73+ # Small vision smoke test
6774python scripts/run_experiment.py --config configs/examples/mnist_basic.yaml
6875
69- # CNN pruning
76+ # CNN pruning and clustering
7077python scripts/run_experiment.py --config configs/vision_prune/resnet18_cifar10_full.yaml
7178
72- # LLM supernode and SCAR analysis
79+ # LLM channel analysis and structured FFN pruning
7380python scripts/run_experiment.py --config configs/prune_llm/llama3_8b_unified.yaml
7481```
7582
76- Package the public Supernodes and SCAR artifacts:
83+ Use metrics directly from Python:
7784
78- ``` bash
79- python projects/supernodes_scar/scripts/prepare_hf_artifacts.py \
80- --output-dir outputs/supernodes_scar_hf \
81- --clean
85+ ``` python
86+ from nodelens.metrics import get_metric, list_metrics
87+
88+ print(list_metrics())
8289
83- python projects/supernodes_scar/scripts/verify_hf_artifacts.py \
84- outputs/supernodes_scar_hf
90+ metric = get_metric("rayleigh_quotient")
91+ scores = metric.compute(inputs=layer_inputs, weights=layer_weights)
8592```
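
For readers unfamiliar with the `rayleigh_quotient` metric named in the added example, here is a minimal NumPy sketch of one common definition: for each channel weight vector `w` and input covariance `C`, the score is `(w^T C w) / (w^T w)`. This is not taken from the NodeLens source and may differ from what `metric.compute` returns. The function name `rayleigh_quotient_scores`, the array shapes, and the `eps` guard are assumptions made for illustration.

```python
import numpy as np


def rayleigh_quotient_scores(inputs, weights, eps=1e-12):
    """Hypothetical per-channel Rayleigh quotient (illustrative only).

    inputs:  array of shape (n_samples, d_in), activations feeding a layer.
    weights: array of shape (n_channels, d_in), one row per output channel.
    Returns an array of shape (n_channels,).
    """
    inputs = np.asarray(inputs, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)

    # Sample covariance of the layer inputs (unbiased, n - 1 denominator).
    centered = inputs - inputs.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(len(inputs) - 1, 1)

    # Numerator w^T C w and denominator w^T w, computed row by row.
    numer = np.einsum("cd,de,ce->c", weights, cov, weights)
    denom = np.einsum("cd,cd->c", weights, weights)
    return numer / (denom + eps)


# Usage: channels aligned with high-variance input directions score higher.
rng = np.random.default_rng(0)
layer_inputs = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.1])
layer_weights = np.eye(3)
scores = rayleigh_quotient_scores(layer_inputs, layer_weights)
```

Under this definition a score is bounded above by the largest eigenvalue of the input covariance, so it measures how well a channel's weights align with the dominant input directions.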
8693
87- ## Paper Releases
94+ ## Project Workflows
8895
89- Paper-specific release material lives under `projects/`. Reusable library code
90- stays in `src/nodelens`, while each project folder records the exact configs,
91- artifact layout, reproducibility notes, and release checklist for a paper.
96+ Reusable library code lives in `src/nodelens`. Project folders contain the
97+ configs, small helper scripts, and artifact descriptions needed to reproduce a
98+ specific analysis with the shared package.
9299
93100Current project:
94101
95- - `projects/supernodes_scar/`: release material for "Supernodes and Halos:
96- Loss-Critical Hubs in LLM Feed-Forward Layers".
102+ - `projects/supernodes_scar/`: workflow for the Supernodes and SCAR study of
103+ loss-sensitive FFN channels in LLMs.
97104
98- Derived artifacts for this project are staged on Hugging Face:
105+ The Supernodes and SCAR project also has a public derived-artifact dataset:
99106
100107- `https://huggingface.co/datasets/hsafaai/supernodes-scar-artifacts`
101108
@@ -106,28 +113,29 @@ Derived artifacts for this project are staged on Hugging Face:
106113| Activation metrics | `activation_l2_norm`, `activation_variance`, `activation_outlier_index` |
107114| Alignment metrics | `rayleigh_quotient`, `delta_alignment` |
108115| Information metrics | `mutual_information_gaussian`, `pairwise_redundancy_gaussian`, `gaussian_pid_synergy_mmi` |
109- | SCAR metrics | `scar_activation_power`, `scar_taylor`, `scar_curvature`, `scar_loss_proxy` |
116+ | Loss-sensitive metrics | `scar_activation_power`, `scar_taylor`, `scar_curvature`, `scar_loss_proxy` |
110117| Pruning strategies | `magnitude`, `alignment`, `composite`, `cluster_aware`, `random` |
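
As a rough guide to what a Gaussian information metric such as `mutual_information_gaussian` typically estimates, the sketch below uses the closed form for two jointly Gaussian scalars, `I(X; Y) = -0.5 * ln(1 - rho^2)`, where `rho` is the Pearson correlation. It is an assumption about the general technique, not a description of the NodeLens implementation. The function name `gaussian_mutual_information` and the `eps` clamp are hypothetical.

```python
import numpy as np


def gaussian_mutual_information(x, y, eps=1e-12):
    """Hypothetical Gaussian MI estimate in nats (illustrative only).

    x, y: one-dimensional arrays of equal length, e.g. one channel's
    activations and a scalar target across calibration samples.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)

    # Pearson correlation between the two variables.
    rho = np.corrcoef(x, y)[0, 1]

    # Clamp rho^2 below 1 so the logarithm stays finite for perfect correlation.
    rho_sq = min(rho * rho, 1.0 - eps)
    return -0.5 * np.log1p(-rho_sq)


# Usage: independent signals give MI near zero, correlated ones give more.
rng = np.random.default_rng(0)
x = rng.normal(size=20000)
noise = rng.normal(size=20000)
mi_independent = gaussian_mutual_information(x, noise)
mi_dependent = gaussian_mutual_information(x, x + noise)
```

The Gaussian assumption makes the estimate cheap because it needs only second-order statistics, but it can miss dependence that does not show up in the correlation.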
111118
112119## Repository Layout
113120
114121``` text
115- nodelens/
122+ NodeLens/
116123|-- configs/
117- | |-- prune_llm/ # LLM and SCAR configs
118- | |-- vision_prune/ # Vision pruning configs
119- | `-- examples/ # Small example configs
120- |-- projects/ # Paper-specific release material
124+ | |-- examples/ # Small runnable configs
125+ | |-- prune_llm/ # LLM channel-analysis and pruning configs
126+ | `-- vision_prune/ # Vision pruning and clustering configs
127+ |-- projects/ # Reproducible project workflows
121128|-- scripts/
122129| |-- run_experiment.py # Main experiment entry point
123- | `-- run_analysis.py # Post-hoc analysis
130+ | `-- run_analysis.py # Post-hoc analysis entry point
124131|-- src/nodelens/
125132| |-- analysis/ # Visualization, clustering, cascade analysis
126133| |-- experiments/ # Experiment classes
127- | |-- metrics/ # Importance metrics
134+ | |-- metrics/ # Importance and information metrics
128135| |-- models/ # Model wrappers
129- | `-- pruning/ # Pruning strategies
130- |-- tests/ # Unit tests
136+ | |-- pruning/ # Pruning strategies
137+ | `-- services/ # Activation capture, scoring, and mask utilities
138+ |-- tests/ # Unit and integration tests
131139`-- docs/ # Documentation
132140```
133141
@@ -137,7 +145,8 @@ nodelens/
137145- [API Reference](docs/api_reference.md)
138146- [LLM Guide](docs/llm_guide.md)
139147- [Metric Consistency](docs/METRIC_CONSISTENCY.md)
140- - [Supernodes and SCAR Release Notes](projects/supernodes_scar/README.md)
148+ - [Architecture](docs/ARCHITECTURE.md)
149+ - [Supernodes and SCAR Workflow](projects/supernodes_scar/README.md)
141150
142151Build the Sphinx docs locally:
143152
@@ -155,8 +164,9 @@ pytest tests/unit/ -v
155164
156165## Citation
157166
158- If you use the Supernodes and SCAR release, please cite the paper and the
159- archived code/artifact versions listed in `CITATION.cff`.
167+ If you use NodeLens, cite the repository metadata in `CITATION.cff`. If you use
168+ a project workflow or public artifact dataset, also cite the associated paper
169+ and artifact record.
160170
161171## License
162172