Commit b3db747

chore: update .gitattributes for artefact data handling
- Applying black formatting.
1 parent 92e9076 commit b3db747

16 files changed

Lines changed: 156 additions & 1040 deletions

.gitattributes

Lines changed: 20 additions & 0 deletions
@@ -1,2 +1,22 @@
+# Force LF line endings for all text files
 * text=auto eol=lf
 
+# Explicitly handle Python files
+*.py text eol=lf
+
+# Handle configuration files
+*.yaml text eol=lf
+*.json text eol=lf
+*.toml text eol=lf
+*.md text eol=lf
+
+# Mark data artifacts as binary to prevent corruption
+*.csv binary
+*.sqlite binary
+*.h5 binary
+*.pth binary
+*.pt binary
+*.pkl binary
+
+# Large data and logs should definitely be binary
+*.log text eol=lf
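
To see how these rules combine, here is a minimal Python sketch that resolves a path against the patterns in the order they appear in the diff, letting the last match win. It is a simplification under stated assumptions: real Git resolves each attribute separately and expands the `binary` macro to `-diff -merge -text`, whereas this sketch treats each rule's attribute string as one unit. `RULES` and `resolve_attrs` are hypothetical names used only for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Rules copied from the .gitattributes diff, in file order.
RULES = [
    ("*", "text=auto eol=lf"),
    ("*.py", "text eol=lf"),
    ("*.yaml", "text eol=lf"),
    ("*.json", "text eol=lf"),
    ("*.toml", "text eol=lf"),
    ("*.md", "text eol=lf"),
    ("*.csv", "binary"),
    ("*.sqlite", "binary"),
    ("*.h5", "binary"),
    ("*.pth", "binary"),
    ("*.pt", "binary"),
    ("*.pkl", "binary"),
    ("*.log", "text eol=lf"),
]


def resolve_attrs(path: str) -> str:
    """Return the attribute string of the last rule matching the file name.

    Simplification: patterns without a slash are matched against the
    base name only, and a later match replaces an earlier one entirely.
    """
    name = PurePosixPath(path).name
    result = "unspecified"
    for pattern, attrs in RULES:
        if fnmatchcase(name, pattern):
            result = attrs
    return result


if __name__ == "__main__":
    for p in ["train.py", "hest_data/HEST_v1_3_0.csv", "runs/train.log", "Makefile"]:
        print(f"{p}: {resolve_attrs(p)}")
```

Under this sketch `*.log` files resolve to `text eol=lf`, which is what the rule itself says, whatever the comment above it suggests. On a real checkout, `git check-attr -a <path>` reports what Git actually applies.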

.gitignore

Lines changed: 66 additions & 2 deletions
@@ -182,9 +182,9 @@ cython_debug/
 .abstra/
 
 # Visual Studio Code
-# Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore
+# Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore
 # that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
-# and can be added to the global gitignore or merged into this file. However, if you prefer,
+# and can be added to the global gitignore or merged into this file. However, if you prefer,
 # you could uncomment the following to ignore the entire vscode folder
 # .vscode/
 
@@ -213,9 +213,73 @@ checkpoints/
 checkpoints_full/
 results/
 results_long_run/
+runs/
+vscode/
+.agent/
 
 # Images
 *.png
 *.jpg
 *.jpeg
 *.svg
+docs/research_proposal.tex
+.vscode/settings.json
+hest_data/.gitattributes
+hest_data/HEST_v1_3_0.csv
+hest_data/README.md
+hest_data/cellvit_seg/INT1_cellvit_seg.geojson.zip
+hest_data/cellvit_seg/INT1_cellvit_seg.parquet
+hest_data/cellvit_seg/INT10_cellvit_seg.geojson.zip
+hest_data/cellvit_seg/INT10_cellvit_seg.parquet
+hest_data/cellvit_seg/INT11_cellvit_seg.geojson.zip
+hest_data/cellvit_seg/INT11_cellvit_seg.parquet
+hest_data/cellvit_seg/INT12_cellvit_seg.geojson.zip
+hest_data/cellvit_seg/INT12_cellvit_seg.parquet
+hest_data/cellvit_seg/INT13_cellvit_seg.geojson.zip
+hest_data/cellvit_seg/INT13_cellvit_seg.parquet
+hest_data/cellvit_seg/INT16_cellvit_seg.geojson.zip
+hest_data/cellvit_seg/INT16_cellvit_seg.parquet
+hest_data/cellvit_seg/INT19_cellvit_seg.geojson.zip
+hest_data/cellvit_seg/INT19_cellvit_seg.parquet
+hest_data/cellvit_seg/INT20_cellvit_seg.geojson.zip
+hest_data/cellvit_seg/INT20_cellvit_seg.parquet
+hest_data/cellvit_seg/INT21_cellvit_seg.geojson.zip
+hest_data/cellvit_seg/INT21_cellvit_seg.parquet
+hest_data/cellvit_seg/INT1_cellvit_seg.geojson
+hest_data/cellvit_seg/INT10_cellvit_seg.geojson
+hest_data/cellvit_seg/INT11_cellvit_seg.geojson
+hest_data/cellvit_seg/INT12_cellvit_seg.geojson
+hest_data/cellvit_seg/INT13_cellvit_seg.geojson
+hest_data/cellvit_seg/INT16_cellvit_seg.geojson
+hest_data/cellvit_seg/INT19_cellvit_seg.geojson
+hest_data/cellvit_seg/INT20_cellvit_seg.geojson
+hest_data/cellvit_seg/INT21_cellvit_seg.geojson
+hest_data/cellvit_seg/TENX175_cellvit_seg.geojson
+hest_data/cellvit_seg/TENX175_cellvit_seg.geojson.zip
+hest_data/cellvit_seg/TENX175_cellvit_seg.parquet
+hest_data/metadata/TENX175.json
+hest_data/patches/TENX175.h5
+hest_data/st/TENX175.h5ad
+hest_data/tissue_seg/TENX175_contours.geojson
+hest_data/wsis/TENX175.tif
+.idea/.gitignore
+.idea/csv-editor.xml
+.idea/deployment.xml
+.idea/jupyter-settings.xml
+.idea/misc.xml
+.idea/modules.xml
+.idea/SpatialTranscriptFormer.iml
+.idea/vcs.xml
+.idea/inspectionProfiles/profiles_settings.xml
+.idea/inspectionProfiles/Project_Default.xml
+.idea/runConfigurations/STF_Compute_Pathways.xml
+.idea/runConfigurations/STF_Train_PrimaryPathway.xml
+.gemini/settings.json
+.gemini/agents/literature-search.md
+.gemini/agents/test-triage.md
+
+# Large Data Artifacts
+global_genes_stats.csv
+*.sqlite
+HEST_v1_3_0.csv
+global_genes.json

.pre-commit-config.yaml

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# See https://pre-commit.com for more information
+# See https://pre-commit.com/hooks.html for more hooks
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.5.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+      - id: check-yaml
+      - id: check-added-large-files
+
+  - repo: https://github.com/psf/black
+    rev: 24.2.0
+    hooks:
+      - id: black
+        language_version: python3
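
The whitespace-only changes elsewhere in this commit (for example line 15 of `config.yaml`) look like the kind of edit the `trailing-whitespace` and `end-of-file-fixer` hooks make. The Python sketch below approximates their combined effect on one file's text. It is an illustration only, not the hooks' actual implementation, and `fix_whitespace` is a hypothetical name.

```python
def fix_whitespace(text: str) -> str:
    """Approximate trailing-whitespace plus end-of-file-fixer on a text blob.

    Assumption: this mirrors the hooks' visible effect only; the real
    hooks operate on files and have extra options not modelled here.
    """
    # Strip trailing spaces and tabs from every line.
    lines = [line.rstrip(" \t") for line in text.splitlines()]
    # Drop blank lines at the end of the file.
    while lines and lines[-1] == "":
        lines.pop()
    # End with exactly one newline, or leave an empty file empty.
    return ("\n".join(lines) + "\n") if lines else ""


if __name__ == "__main__":
    before = '  output_dir: "./runs"\n  \n# MSigDB Pathway Settings'
    print(repr(fix_whitespace(before)))
```

With the configuration above, running `pre-commit install` once in the repository makes these hooks run on every commit, and `pre-commit run --all-files` applies them to the whole tree.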

config.yaml

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ training:
   batch_size: 8
   learning_rate: 0.0001
   output_dir: "./runs"
-
+
 # MSigDB Pathway Settings
 pathways:
   default_collection: "hallmarks"

docs/TESTING.md

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ Or using the provided PowerShell script:
   - Sample filtering logic (by Organ, Disease, Technology).
   - Pattern generation for HEST subsets.
   - Unzipping logic for segmentation files.
-
+
 - `tests/test_splitting_logic.py`: Tests for `splitting.py`. Verifies:
   - Patient-level splitting (train/val/test).
   - Leakage prevention (ensuring patients don't overlap between splits).
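
As a rough illustration of what patient-level splitting and a leakage check can look like, here is a minimal Python sketch. It is not the code in `splitting.py`, which this diff does not show. The names `split_by_patient` and `patients_overlap`, the sample-to-patient mapping, and the default fractions are all assumptions made for the example.

```python
import random
from typing import Dict, List


def split_by_patient(
    sample_to_patient: Dict[str, str],
    train_frac: float = 0.7,
    val_frac: float = 0.15,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Assign whole patients to train/val/test; test gets the remainder."""
    patients = sorted(set(sample_to_patient.values()))
    random.Random(seed).shuffle(patients)
    n_train = int(len(patients) * train_frac)
    n_val = int(len(patients) * val_frac)

    # Every patient maps to exactly one split.
    assignment = {}
    for i, patient in enumerate(patients):
        if i < n_train:
            assignment[patient] = "train"
        elif i < n_train + n_val:
            assignment[patient] = "val"
        else:
            assignment[patient] = "test"

    # Samples follow their patient, so no patient can straddle two splits.
    splits: Dict[str, List[str]] = {"train": [], "val": [], "test": []}
    for sample, patient in sorted(sample_to_patient.items()):
        splits[assignment[patient]].append(sample)
    return splits


def patients_overlap(
    splits: Dict[str, List[str]], sample_to_patient: Dict[str, str]
) -> bool:
    """Return True if any patient has samples in more than one split."""
    seen: Dict[str, str] = {}
    for name, samples in splits.items():
        for sample in samples:
            patient = sample_to_patient[sample]
            if seen.setdefault(patient, name) != name:
                return True
    return False


if __name__ == "__main__":
    # Hypothetical data: 20 samples, two per patient.
    mapping = {f"S{i}": f"P{i // 2}" for i in range(20)}
    result = split_by_patient(mapping, seed=1)
    print({name: len(samples) for name, samples in result.items()})
    print("leakage:", patients_overlap(result, mapping))
```

Splitting on patient IDs rather than on individual samples is what keeps samples from the same patient out of different splits.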
