Skip to content

Commit dab29fc

Browse files
authored
ci(run-notebook): replace whole-cell bash partition with nbclient (#573)
## Purpose The current `run-notebook` action partitions each code cell whole-cell: cells containing `!pip` go entirely to a bash script, the rest go to a Python script. A cell that mixes `!pip install` with Python imports — a standard Jupyter pattern — fails with `import: command not found` because bash tries to execute Python as a shell command. See PR #572 / [run 25743389812](https://github.com/pinecone-io/examples/actions/runs/25743389812/job/75599935724) for a live example. ## Solution Drive notebooks through [`nbclient`](https://nbclient.readthedocs.io/), the same tool Project Jupyter ships behind `jupyter nbconvert --execute`. `nbclient` launches a real `ipykernel` and processes cells line-by-line: `!pip` and other shell magics route through the kernel's magic handlers (subshell), Python runs as Python. Mixed cells work exactly as they do in Colab and Jupyter Lab. ## Changes - New `run-notebook.py`: 50-line nbclient driver. Takes the notebook path as its only arg. Exits 0 on success, 1 on any cell error, 2 on usage error. - Updated `action.yml`: installs `nbformat nbclient ipykernel` into the runner Python, then invokes the new script. No tempdir, no nested venv, no bash conversion. - Removed `convert-notebook.py` (~120 lines). Net diff: 66 insertions, 137 deletions. ## Verification Tested locally against the notebook from PR #572 (`docs/quick-tour/hello-pinecone.ipynb`, with `!pip install` and Python imports in the same cell — the case that breaks today's runner): ``` $ python .github/actions/run-notebook/run-notebook.py docs/quick-tour/hello-pinecone.ipynb Executing docs/quick-tour/hello-pinecone.ipynb PASS — docs/quick-tour/hello-pinecone.ipynb ``` ## Notes - **`check-structure` follow-up.** The lint rule "imports must be in the first code cell" was designed around today's runner: under the new runner, a single cell holding `!pip install` + imports is once again the correct pattern. Either relax the rule to skip pip-only cells when locating "the first code cell," or remove the rule entirely. Separate PR. - **Notebook dep isolation.** Notebook deps now install into the runner's Python (the kernel shares its interpreter), not a nested venv. Acceptable for CI (ephemeral runner) and simpler than the previous tempdir/venv dance. - **Per-cell timeout.** 600s by default, overridable via `NOTEBOOK_CELL_TIMEOUT`. - **Soak.** Worth running this against a known-passing notebook before merging to confirm no regressions on the existing fleet — let me know if you want me to dual-run on a sample. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Medium Risk** > Changes the CI notebook execution mechanism from a custom conversion/venv+bash flow to `nbclient` driving a real Jupyter kernel, which may alter dependency/runtime behavior and failure modes across the notebook suite. > > **Overview** > Updates the `run-notebook` composite action to execute notebooks end-to-end using `nbclient`/`ipykernel` instead of converting cells into separate bash/python scripts. > > This removes `convert-notebook.py`, installs `nbformat nbclient ipykernel` on the runner, and adds `run-notebook.py` which runs the notebook with a real kernel, returning clearer per-cell errors and supporting mixed `!pip` + Python cells (with an optional `NOTEBOOK_CELL_TIMEOUT`). > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit f87187a. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->
1 parent e01abf7 commit dab29fc

3 files changed

Lines changed: 78 additions & 137 deletions

File tree

Lines changed: 6 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
name: "Run Notebook"
2-
description: "Run a notebook"
2+
description: "Run a notebook end-to-end against a real Jupyter kernel via nbclient"
33

44
inputs:
55
notebook:
@@ -26,34 +26,18 @@ runs:
2626
with:
2727
python-version: '3.11'
2828

29-
- name: Install dependencies
29+
- name: Install runner dependencies
3030
shell: bash
3131
run: |
3232
pip install --upgrade pip
33-
pip install nbformat
33+
pip install nbformat nbclient ipykernel
3434
35-
- id: convert
35+
- name: Run the notebook
3636
shell: bash
37-
name: Convert notebook into tmpdir script
3837
run: |
39-
python .github/actions/run-notebook/convert-notebook.py ${{ inputs.notebook }}
40-
41-
- name: View the run script
42-
shell: bash
43-
run: |
44-
cat ${{ steps.convert.outputs.script_path }}
45-
46-
- name: View converted notebook content
47-
shell: bash
48-
run: |
49-
cat ${{ steps.convert.outputs.notebook_path }}
50-
51-
- name: Run the converted notebook
52-
shell: bash
53-
run: |
54-
bash ${{ steps.convert.outputs.script_path }}
38+
python .github/actions/run-notebook/run-notebook.py "${{ inputs.notebook }}"
5539
env:
5640
PINECONE_API_KEY: ${{ inputs.PINECONE_API_KEY }}
5741
OPENAI_API_KEY: ${{ inputs.OPENAI_API_KEY }}
5842
HF_TOKEN: ${{ inputs.HF_TOKEN }}
59-
ANTHROPIC_API_KEY: ${{ inputs.ANTHROPIC_API_KEY }}
43+
ANTHROPIC_API_KEY: ${{ inputs.ANTHROPIC_API_KEY }}

.github/actions/run-notebook/convert-notebook.py

Lines changed: 0 additions & 115 deletions
This file was deleted.
Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
#!/usr/bin/env python
2+
"""Execute a notebook end-to-end against a real Jupyter kernel.
3+
4+
Replaces the cell-partitioning approach in convert-notebook.py with a
5+
straight nbclient drive. This means:
6+
7+
- !pip install and other shell magics run via the kernel's normal magic
8+
handling (the kernel spawns a subshell), exactly as in Colab/Jupyter Lab.
9+
- A code cell may freely mix `!pip install` and Python imports.
10+
- Errors surface with cell and traceback context, not as
11+
`line N: import: command not found`.
12+
13+
Notebook deps are installed into the same Python environment as the
14+
runner (the kernel and runner share an interpreter), so `!pip install foo`
15+
in cell 1 means cell 2 can `import foo`.
16+
17+
Usage:
18+
run-notebook.py <notebook-path>
19+
20+
Exits non-zero on any cell failure.
21+
"""
22+
23+
import os
24+
import sys
25+
26+
import nbformat
27+
from nbclient import NotebookClient
28+
from nbclient.exceptions import CellExecutionError, CellTimeoutError, DeadKernelError
29+
30+
CELL_TIMEOUT = int(os.environ.get("NOTEBOOK_CELL_TIMEOUT", "600"))
31+
32+
33+
def main() -> int:
34+
if len(sys.argv) != 2:
35+
print("usage: run-notebook.py <notebook-path>", file=sys.stderr)
36+
return 2
37+
38+
notebook_path = sys.argv[1]
39+
print(f"Executing {notebook_path}")
40+
41+
nb = nbformat.read(notebook_path, as_version=4)
42+
client = NotebookClient(
43+
nb,
44+
timeout=CELL_TIMEOUT,
45+
kernel_name="python3",
46+
resources={"metadata": {"path": os.path.dirname(notebook_path) or "."}},
47+
)
48+
49+
try:
50+
client.execute()
51+
except CellTimeoutError as exc:
52+
print(
53+
f"\nCell exceeded timeout of {CELL_TIMEOUT}s "
54+
f"(override via NOTEBOOK_CELL_TIMEOUT):\n{exc}",
55+
file=sys.stderr,
56+
)
57+
return 1
58+
except DeadKernelError as exc:
59+
print(
60+
f"\nKernel died during execution (OOM/segfault?):\n{exc}", file=sys.stderr
61+
)
62+
return 1
63+
except CellExecutionError as exc:
64+
print(f"\nCell execution failed:\n{exc}", file=sys.stderr)
65+
return 1
66+
67+
print(f"PASS — {notebook_path}")
68+
return 0
69+
70+
71+
if __name__ == "__main__":
72+
sys.exit(main())

0 commit comments

Comments
 (0)