Goal
Eliminate the per-push loop of commit → push → CI auto-commits "Clean ipynb" → pull → squash via rebase → push --force by making utils/add_colab_main_buttons.py runnable locally before commit.
Why
Today, every push that touches notebooks triggers the badges job in .github/workflows/conda_env_test.yml, which runs add_colab_main_buttons.py and git-auto-commit-action adds a Clean ipynb <triggering-sha> commit on top. To keep history linear, the contributor then has to pull, squash with git rebase -i, and git push --force. Six steps for every notebook-touching push.
If the cleaner runs locally before commit, CI sees a no-op working tree (the script is idempotent; see lines 56-58 of the script) and the auto-commit never lands. The loop collapses to: clean → commit → push → done.
Three things need to change for this to work in practice. (1) and (2) are required; (3) is opportunistic cleanup that fits the same theme.
(1) Make the cleaner runnable locally without heavy setup
The script needs bs4, lxml, nbformat — about 60 MB of deps. Today contributors have to either set up the conda env (4+ GB) or replicate the CI badge step (uv venv + utils/requirements.txt install). Both are too much friction for "I just want to commit a notebook change."
Proposed: add PEP 723 inline metadata to the script:
# /// script
# dependencies = [
#   "bs4",
#   "lxml",
#   "nbformat",
# ]
# ///
Contributors then need only uv (single ~30 MB static binary) and run:
uv run utils/add_colab_main_buttons.py
uv resolves and caches the deps in an ephemeral env, no setup required. Side benefit: the CI badges job's current 5-step uv-venv-pip-install dance collapses to a single uv run step.
Alternative if uv adoption is undesirable: publish a minimal nmisp-clean container based on python:3.11-slim (~150 MB total, vs the 6.76 GB nmisp-test image). Pull-once, zero local Python setup. Could be a second variant in the existing build-test-image.yml.
(2) Fix SSH-remote URL parsing in get_github_username_repo_folder()
Currently the script parses git remote -v output with urllib.parse.urlparse, which doesn't understand scp-style SSH URLs (git@github.com:user/repo.git). On CI the remote is HTTPS (actions/checkout sets it), so it works. Locally with an SSH remote, URLs get mangled.
Reproduction (verified on villa today):
Switch to a clean pre-badge state and run the script:
git checkout test-this-01 # pre-badge state of 44c623d9 'add Visualizing matrices'
python utils/add_colab_main_buttons.py
git diff
Resulting badge URLs in every notebook:
https://colab.research.google.com/github/git@github.com:kangwonlee/nmisp/blob/test-this-01/...
                                         ^^^^^^^^^^^^^^^
The literal git@github.com: prefix leaks into the path component because urlparse of git@github.com:kangwonlee/nmisp.git doesn't decompose into the expected (user, repo) pair.
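The behaviour can be reproduced in isolation with the standard library alone. The sketch below assumes, as the suggested fix does, that the owner is taken from the second-to-last path segment; the exact extraction logic in the current script may differ.

```python
from urllib.parse import urlparse

ssh_url = "git@github.com:kangwonlee/nmisp.git"
https_url = "https://github.com/kangwonlee/nmisp.git"

ssh_parts = urlparse(ssh_url)
https_parts = urlparse(https_url)

# An scp-style URL has no "scheme://", so urlparse finds neither a scheme
# nor a netloc and puts the whole string into .path.
print(repr(ssh_parts.scheme), repr(ssh_parts.netloc), ssh_parts.path)
print(repr(https_parts.scheme), repr(https_parts.netloc), https_parts.path)

# Taking the second-to-last path segment as the owner (an assumption here):
ssh_owner = ssh_parts.path.lstrip("/").split("/")[-2]
https_owner = https_parts.path.lstrip("/").split("/")[-2]
print(ssh_owner)    # host prefix is still attached
print(https_owner)  # clean owner name
```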
Suggested fix (around line 105 in utils/add_colab_main_buttons.py):
def get_github_username_repo_folder(ipynb_path: str) -> Tuple[str, str]:
    result = subprocess.check_output(
        ("git", "remote", "-v"),
        cwd=ipynb_path,
        encoding='utf-8',
    )
    line0 = result.splitlines()[0]
    url = line0.split()[1]
    if url.startswith("git@") and ":" in url:
        # scp-style SSH: git@github.com:user/repo.git
        path = url.split(":", 1)[1]
    else:
        # https://github.com/user/repo.git
        path = up.urlparse(url).path
    parts = path.lstrip("/").split("/")
    name = parts[-2]
    repo = os.path.splitext(parts[-1])[0]
    return name, repo
Without this fix, local pre-commit normalization is unsafe — every notebook's badge URL gets corrupted.
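One way to make the fix unit-testable without a git checkout is to split the URL handling into a pure function. The following is a sketch only: split_owner_repo is a hypothetical helper name, and it detects scp-style remotes by the absence of "://" rather than by the "git@" prefix, which is a slightly broader assumption than the suggested fix makes.

```python
from urllib.parse import urlparse


def split_owner_repo(remote_url: str) -> tuple[str, str]:
    """Return (owner, repo) for an HTTPS, ssh://, or scp-style remote URL."""
    if "://" not in remote_url and ":" in remote_url:
        # scp-style: [user@]host:owner/repo.git, so the path follows the first colon
        path = remote_url.split(":", 1)[1]
    else:
        # https://host/owner/repo.git or ssh://user@host/owner/repo.git
        path = urlparse(remote_url).path
    parts = path.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"cannot find owner/repo in {remote_url!r}")
    # removesuffix (Python 3.9+) only drops a trailing ".git", unlike
    # os.path.splitext, which would also strip other dotted suffixes.
    return parts[-2], parts[-1].removesuffix(".git")


for url in (
    "git@github.com:kangwonlee/nmisp.git",
    "https://github.com/kangwonlee/nmisp.git",
    "ssh://git@github.com/kangwonlee/nmisp.git",
):
    print(url, "->", split_owner_repo(url))
```

With this split, get_github_username_repo_folder() would only need to obtain the remote URL and delegate to the helper.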
(3) [Optional] Trim utils/requirements.txt
Audit of imports across utils/*.py and utils/tests/*.py vs declared deps:
| Declared | Status |
| --- | --- |
| bs4, lxml (transitive backend), pytest, pytest-xdist | used |
| jupyter | only nbformat is actually imported; the meta-package drags in notebook server, ipykernel, traitlets, etc. |
| matplotlib, numpy, scipy, sympy | grep-clean: not imported anywhere in utils/ or utils/tests/ |
Proposed minimum:
bs4
lxml
nbformat
pytest
pytest-xdist
5 packages instead of 9, nbformat (~few MB) instead of jupyter (~tens of MB of dead weight). Faster test_utils CI step, lighter footprint for any local install.
Parallel to existing #357 ("tests : prune old conda environment files") — same hygiene theme, different file.
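An audit like the one above can be repeated mechanically. The sketch below uses the standard ast module to collect top-level imported module names; the function names are hypothetical. Its result still needs manual review, because distribution names and import names can differ (pytest-xdist is a pytest plugin, and lxml may be referenced only by name as a parser backend).

```python
import ast
from pathlib import Path


def imported_top_level(source: str) -> set[str]:
    """Return the top-level module names imported by one Python source text."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as "from . import x"
            names.add(node.module.split(".")[0])
    return names


def imported_in_tree(root: str) -> set[str]:
    """Union of imported_top_level() over every *.py file under root."""
    found: set[str] = set()
    for path in Path(root).rglob("*.py"):
        found |= imported_top_level(path.read_text(encoding="utf-8"))
    return found


def not_imported(declared: set[str], imported: set[str]) -> set[str]:
    """Declared names with no matching import; candidates for removal."""
    return declared - imported


example = "import os\nfrom bs4 import BeautifulSoup\nimport nbformat.v4\nfrom . import helper\n"
print(sorted(imported_top_level(example)))
```

Calling imported_in_tree("utils") from the repository root and passing the result to not_imported() would list the declared names that never appear in an import statement.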
Out of scope
- The branch-name embedding in badge URLs (get_current_branch(), line 156) is load-bearing, not a bug: it lets a Colab badge on a feature branch open that branch's notebook for WIP preview. Verified by demo: switching to test-this-01 and re-running the script correctly produces blob/test-this-01/... URLs.
- Removing the badges CI step entirely. It's still needed as a fallback for Colab contributors who can't run docker/uv locally.
Acceptance
After (1) and (2) ship, a contributor with a fresh SSH clone can do:
edit notebook → uv run utils/add_colab_main_buttons.py → git commit → git push
…with the CI badges job creating no Clean ipynb auto-commit, and no rebase/force-push needed.
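The acceptance condition can also be checked locally after committing: re-running the cleaner should leave no notebook modified. The sketch below is hypothetical (both function names are assumptions, not part of the repository); it parses the output of git status --porcelain and assumes uv and git are on PATH.

```python
import subprocess


def modified_notebooks(porcelain: str) -> list[str]:
    """Extract .ipynb paths from `git status --porcelain` (v1) output."""
    paths = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        # Format is "XY path": two status characters, one space, then the path.
        path = line[3:]
        if " -> " in path:
            # Renames are shown as "old -> new"; keep the new path.
            path = path.split(" -> ", 1)[1]
        path = path.strip('"')  # git quotes paths with special characters
        if path.endswith(".ipynb"):
            paths.append(path)
    return paths


def cleaner_is_noop(repo_root: str = ".") -> bool:
    """Run the cleaner, then report whether any notebook changed on disk."""
    subprocess.run(
        ["uv", "run", "utils/add_colab_main_buttons.py"],
        cwd=repo_root,
        check=True,
    )
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_root,
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return not modified_notebooks(status)
```

Run on a freshly committed tree, cleaner_is_noop() returning True would indicate that the CI badges job has nothing left to auto-commit.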