Draft
57 commits
e0b0dd7
stop tracking .DS_Store (already gitignored)
shrey-Bish May 17, 2026
a64dae5
scaffold textbook IR (pydantic schema, 10 models)
shrey-Bish May 17, 2026
aac27a1
add markdown textbook ingester
shrey-Bish May 18, 2026
c4e482a
add PDF textbook ingester
shrey-Bish May 22, 2026
de531da
add --use-textbook: opt-in textbook grounding for course generation
shrey-Bish May 28, 2026
3e8820b
inject textbook TOC into foundation deliberations
shrey-Bish May 29, 2026
0791e38
upgrade default embedder to text-embedding-3-large
shrey-Bish Jun 5, 2026
366ef18
filter non-instructional sections from injected TOC
shrey-Bish Jun 5, 2026
0d6fec6
wire opt-in cross-encoder reranker and admin scaffolding into runtime
shrey-Bish Jun 5, 2026
cd66236
add self-consistency voting for citation verifier
shrey-Bish Jun 5, 2026
4daab8c
harden cross-encoder reranker integration
shrey-Bish Jun 5, 2026
885ecd9
preserve syllabus week/chapter numbering in SyllabusProcessor
shrey-Bish Jun 5, 2026
f12cf2d
add spatial-object page router for hybrid PDF extraction
shrey-Bish Jun 5, 2026
d68648c
add paged PyMuPDF4LLM ingester preserving real page numbers
shrey-Bish Jun 5, 2026
f0889cc
add VLM adapter for complex-page extraction
shrey-Bish Jun 5, 2026
8af6ca4
add hybrid PDF ingester + wire --vlm-extraction flag end-to-end
shrey-Bish Jun 5, 2026
017195d
add generate-side visual-content rules for hybrid extracted chunks
shrey-Bish Jun 5, 2026
c3aed32
deterministic ingestion via Textbook IR cache + pinned VLM calls
shrey-Bish Jun 5, 2026
8dfc9c6
emit visual-content paragraphs as standalone chunks
shrey-Bish Jun 5, 2026
167436a
deduplicate near-identical chunks in evidence block before showing LLM
shrey-Bish Jun 5, 2026
da85a5e
resolve citation tokens for any page within a multi-page chunk's range
shrey-Bish Jun 5, 2026
36e9acb
sentence-bounded verifier claim window
shrey-Bish Jun 5, 2026
2315c16
stitch dangling sentences across page boundaries before chunking
shrey-Bish Jun 5, 2026
6e74bb8
per-chapter top_k tuning by bound-chunk density
shrey-Bish Jun 5, 2026
4d2b255
strip malformed citation tokens at artifact-save time
shrey-Bish Jun 5, 2026
6aa012c
trim verifier chunk to the most relevant passage for the claim
shrey-Bish Jun 5, 2026
ef367c4
upgrade VLM extraction and query expansion to gpt-4o
shrey-Bish Jun 5, 2026
5abf943
report page coverage, per-class precision, top section per failure mode
shrey-Bish Jun 5, 2026
28d1ef5
fix KB attribute lookup in score_grounding's coverage summary
shrey-Bish Jun 6, 2026
953812c
thread chapter-promotion state through per-page heading normalisation
shrey-Bish Jun 6, 2026
6bf6e48
preserve visual chunks in evidence dedup + strip unresolvable citations
shrey-Bish Jun 6, 2026
d91ee47
add semantic gating, LLM write-time citation verifier, and LaTeX clea…
shrey-Bish Jun 7, 2026
4fb06b5
strip stray VLM markers from artifacts and add includegraphics suppor…
shrey-Bish Jun 7, 2026
7cf73db
polish PPTX export: backtick quotes, markdown leftovers, bare math fe…
shrey-Bish Jun 7, 2026
0ddfd4b
fix nested itemize parsing and image overflow in PPTX export
shrey-Bish Jun 7, 2026
7d21b5f
lift slide figures to the top so they render large
shrey-Bish Jun 7, 2026
998649f
add [grounding] extras group so vanilla installs stay light
shrey-Bish Jun 8, 2026
ff4ad7b
swap sentence-transformers + torch for fastembed in the grounding stack
shrey-Bish Jun 8, 2026
adf2627
rewrite internal version-prefixed comments in self-contained form
shrey-Bish Jun 8, 2026
3390d08
tighten claim-window detection, reranker warmup, ambiguous-token rescue
shrey-Bish Jun 11, 2026
ecb1544
cap chunk size at ingest + embedder + fail-fast on retrieval errors
shrey-Bish Jun 12, 2026
18fd81a
revert claim-window delegation to rfind heuristic
shrey-Bish Jun 13, 2026
bb079b3
preserve textbook figures through the slide-writer pipeline
shrey-Bish Jun 13, 2026
8a06983
switch default PDF ingestion to pymupdf4llm with native image extraction
shrey-Bish Jun 14, 2026
0dcb6af
remove the LLM-based reranker and the multi-draft slide path
shrey-Bish Jun 14, 2026
125fb97
strip citation tokens from final saved artifacts
shrey-Bish Jun 14, 2026
73e66d4
add textbook-chapter catalog for depth-first single-chapter delivery
shrey-Bish Jun 14, 2026
7649e5d
inject up to four visual chunks per slide
shrey-Bish Jun 14, 2026
fbadd92
implement gaps 1+3+8+9+10+11+13 in the slide pipeline
shrey-Bish Jun 14, 2026
3bda460
tag math-dense paragraphs as kind=equation at ingest
shrey-Bish Jun 14, 2026
3d698aa
drop textbook-specific jargon from production code and tests
shrey-Bish Jun 14, 2026
f1e045e
drop post-hoc grounding scorer from evaluate.py
shrey-Bish Jun 14, 2026
130a6ed
preserve faculty-drafted figures + normalize section titles
shrey-Bish Jun 14, 2026
8cb0dca
add figure-slide embedding match, render-quality fixes, and contract …
shrey-Bish Jun 15, 2026
9e7cc27
citation-free grounding: advisory verifier, equation-VLM, render + fi…
shrey-Bish Jun 17, 2026
5a13b5d
fix: eval summary print crashed on the grounding-fidelity aggregate
shrey-Bish Jun 17, 2026
1304e80
fix: pass admin-scaffolding prompt to generate_response as a message …
shrey-Bish Jun 24, 2026
Binary file removed .DS_Store
Binary file not shown.
1 change: 1 addition & 0 deletions .gitignore
@@ -36,6 +36,7 @@ eval/
logs/
assets/
.cache/
.grounding_cache/

# Uploaded catalogs (user-specific)
catalog/uploaded_*.json
74 changes: 67 additions & 7 deletions README.md
@@ -74,6 +74,7 @@ An AI-powered instructional design system based on the ADDIE model for automated
| 📄 **LaTeX/PDF Output** | Generate professional LaTeX slides and compile to PDF format |
| 🎨 **PowerPoint (PPTX) Export** | Convert LaTeX Beamer slides to visually rich PPTX using pptxgenjs with icons, shadows, and Slide Masters |
| ✅ **Automatic Evaluation** | Built-in evaluation system for assessing generated course materials |
| 📖 **Textbook Grounding** | *(opt-in)* Ground course content in a PDF or markdown textbook; each slide is written from retrieved textbook evidence. An advisory verifier checks claim faithfulness and a Grounding Fidelity % is reported. Available on CLI, API, and Web UI. |

### 🎬 How It Works

@@ -190,6 +191,7 @@ python -m http.server 8080
- Select "Not Use" for basic generation
- Select "Upload Catalog File" to upload a custom catalog JSON
- Select "Use Default Catalog" to use the default catalog
- **Textbook grounding** *(optional)*: upload one or more PDF/markdown files via the picker labelled "Textbook grounding (optional)". Leave empty to skip.

2. **Click "Generate Course"** to start the task

@@ -353,12 +355,21 @@ For developers who want to run the system locally from source:
### 2. Install Dependencies

```bash
-# Python dependencies
-pip install -r requirements.txt
-
-# Or install in editable mode
# Vanilla install — minimal footprint, supports the standard
# course-writing pipeline (no textbook grounding).
pip install -e .

# Light install + textbook grounding (`--use-textbook PATH`).
# Adds pymupdf, markdown-it-py, rank-bm25, fastembed (ONNX-based
# bi-encoder and cross-encoder via onnxruntime; no torch dep).
# ~100 MB total on top of the base install.
pip install -e ".[grounding]"

# All-in-one (also installs the optional chromadb extras and any
# grounding deps): keeps the prior `requirements.txt`-based workflow
# working unchanged.
pip install -r requirements.txt

# Node.js dependencies (for PPTX generation)
npm install -g pptxgenjs

@@ -436,6 +447,10 @@ python run.py "AI Fundamentals" --catalog ai_catalog

# Combine catalog and copilot
python run.py "Educational Psychology" --copilot --catalog edu_psy

# Ground the course in a textbook (PDF/markdown file or directory)
python run.py "Data Mining" --catalog default_catalog \
--use-textbook path/to/textbook.pdf
```

**Minimal Working Example** (generates a small 3-week course in ~5 min):
@@ -458,6 +473,10 @@ Options:
--exp EXP_NAME Experiment name for saving output (default: exp1)
--seed SEED Random seed for reproducibility
--temperature TEMP Sampling temperature for LLM
--use-textbook PATH Ground course generation in a textbook (PDF or
markdown file, or a directory of either). When
omitted, generation runs identically to a vanilla
run — no grounding is applied.
--optimize STORAGE_ID Optimize mode: provide storage_id of uploaded PDFs
--requirements TEXT User requirements for optimization (with --optimize)
--chapter NAME Specific chapter to optimize (with --optimize)
@@ -490,6 +509,12 @@ curl http://localhost:8000/api/course/results/{task_id}/files
# Download a file
curl http://localhost:8000/api/course/results/{task_id}/download/chapter_1/slides.pdf \
--output slides.pdf

# Textbook grounding (optional) — upload a textbook, then pass its
# returned `path` as `textbook_path` in /api/course/generate above
curl -X POST http://localhost:8000/api/textbooks/upload \
-F "files=@chapter_1.pdf" -F "files=@chapter_2.pdf"
curl http://localhost:8000/api/textbooks/list
```

For complete API documentation, see [API Documentation](docs/API_DOCUMENTATION.md).
@@ -503,7 +528,8 @@ For complete API documentation, see [API Documentation](docs/API_DOCUMENTATION.m
| **Course Generation** | Generate complete course materials based on ADDIE model | Web interface, CLI (`run.py`), or RESTful API |
| **Catalog Mode** | Use structured catalog files for guided generation | `--catalog` flag or upload in web interface |
| **Copilot Mode** | Interactive feedback during generation | `--copilot` flag in CLI or enable in web interface |
-| **Evaluation** | Automatic assessment of generated materials | `python evaluate.py --exp <exp_name>` |
| **Textbook Grounding** | Write course content from evidence retrieved from a PDF/markdown textbook | `--use-textbook PATH` flag in CLI, `textbook_path` in API, file picker in web interface |
| **Evaluation** | Automatic assessment of generated materials, with an optional Grounding Fidelity % | `python evaluate.py --exp <exp_name> [--rigorous]` |
| **Web Interface** | Visual interface for course generation | Open `frontend/index.html` in browser |
| **API Server** | RESTful API for programmatic access | `python api_server.py` or Docker |

@@ -547,16 +573,36 @@ Interactive mode that prompts for feedback after each phase of the ADDIE workflo
python run.py "Advanced Algorithms" --copilot --exp algo_course_v2
```

### Textbook Grounding

Opt-in. Pass `--use-textbook PATH` (a PDF, a markdown file, or a directory of either). The system then retrieves relevant textbook passages for each chapter and writes each slide from that retrieved evidence, teaching in its own words from the source rather than from the model's parametric memory. Without the flag, output is identical to a vanilla run.

```bash
python run.py "Data Mining" --catalog default_catalog --exp dm_grounded \
--use-textbook path/to/textbook.pdf
```

Embeddings are cached on disk after the first ingest (one-time per textbook). Per-chapter generation is modestly slower than vanilla because prompts carry retrieved excerpts.
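
The one-time-per-textbook caching could look roughly like the sketch below. It is an illustration only, not the project's code: the `cached_embeddings` helper, the `embed_fn` callable, and the `cache_dir` layout are hypothetical, and keying the cache on a SHA-256 of the file bytes is an assumption about how a textbook is identified.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, List


def cached_embeddings(
    textbook: Path,
    chunks: List[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
    cache_dir: Path,
) -> List[List[float]]:
    """Return embeddings for `chunks`, calling `embed_fn` only on a cache miss.

    The cache key is the SHA-256 of the textbook bytes (an assumption), so an
    edited textbook maps to a fresh cache entry.
    """
    key = hashlib.sha256(textbook.read_bytes()).hexdigest()
    cache_file = cache_dir / f"{key}.json"
    if cache_file.exists():
        # Cache hit: skip the embedding call entirely.
        return json.loads(cache_file.read_text())
    vectors = embed_fn(chunks)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(vectors))
    return vectors
```

Because the key depends only on the file contents, a change to the chunking settings would silently reuse stale vectors in this sketch; a real cache would likely fold those settings into the key as well.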

**How the grounding works under the hood:**
- The textbook is ingested (`pymupdf4llm`) into a chapter → section → paragraph IR; equation-shaped image crops are converted to native LaTeX by a focused VLM pass (cached). Paragraphs are chunked (~512 tokens) and indexed for BM25 + dense (`text-embedding-3-large`) retrieval.
- Each chapter is decomposed into subtopics by the LLM; each subtopic is HyDE-expanded into a hypothetical textbook paragraph and used as a retrieval query. Per-section rankings across queries are fused via Reciprocal Rank Fusion (RRF, k=60), and a **book-relative gate** binds each chapter to its top sections — or **abstains** (writes ungrounded) when nothing scores well, rather than fabricate against weak retrieval.
- The writer injects a per-slide block of retrieved evidence with mandatory grounding rules (teach in your own words, abstain if unsupported, preserve worked examples / math notation). Deterministic post-passes handle figure placement, textbook captions, navigation frames, and LaTeX cleanup.
- After each chapter, an advisory content-fidelity verifier checks the generated claims against the writer's evidence and logs `content_verification.json` (claims supported / unsupported) — log-only, it never edits the deck. This feeds the Grounding Fidelity metric in evaluation.
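
The fusion step in the list above can be sketched as follows. This is a minimal illustration of Reciprocal Rank Fusion with k=60, not the project's implementation: the function name and the section IDs are hypothetical, ranks are assumed to start at 1 (the usual convention), and the book-relative gate and abstention logic are not modelled.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Fuse ranked lists: each item scores sum(1 / (k + rank)), with rank starting at 1."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, section_id in enumerate(ranking, start=1):
            scores[section_id] += 1.0 / (k + rank)
    return dict(scores)


# Hypothetical section IDs, ranked by three different subtopic queries.
rankings = [
    ["sec-3.1", "sec-3.2", "sec-5.4"],
    ["sec-3.2", "sec-3.1"],
    ["sec-3.2", "sec-7.1", "sec-3.1"],
]
fused = reciprocal_rank_fusion(rankings)
top_sections = sorted(fused, key=fused.get, reverse=True)
print(top_sections)
```

Sections that appear near the top of several rankings accumulate the highest fused score, so a section does not need to win any single query to rank first overall.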

### Automatic Evaluation

**Entry Point**: `evaluate.py` – Automatic assessment and scoring

```bash
-# Evaluate a specific experiment
# Rubric scoring + Program-Chair / Test-Student validation
python evaluate.py --exp web_dev_v1

# Measurement-grade scoring + a binary Grounding Fidelity % on grounded runs
python evaluate.py --exp dm_grounded --rigorous
```

-Evaluation results are saved in `eval/{experiment_name}/` directory.
Evaluation results are saved in the `eval/{experiment_name}/` directory. The default run is a 1–5 multi-agent rubric. `--rigorous` adds deterministic scoring (fixed seed, median-of-3), a `core_quality` headline (excluding metrics a slide deck structurally can't satisfy), and, on grounded runs, a **Grounding Fidelity %** aggregated from the per-chapter content-fidelity reports (claims supported vs. unsupported). Because each claim is counted as either supported or unsupported, that percentage is a sharp, A/B-comparable grounding signal that the coarse 1–5 rubric can't provide.
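
As a rough illustration of how such a percentage could be aggregated, the sketch below pools claim counts across chapters. It is not the code in `evaluate.py`: the `grounding_fidelity` helper and the `supported` / `unsupported` field names are hypothetical (the report schema is not shown here), and pooling claims across chapters rather than averaging per-chapter percentages is an assumption.

```python
import json
from pathlib import Path
from typing import Optional


def grounding_fidelity(exp_dir: Path) -> Optional[float]:
    """Percentage of supported claims across all chapter reports, or None if there are no claims."""
    supported = unsupported = 0
    for report in sorted(exp_dir.glob("chapter_*/content_verification.json")):
        data = json.loads(report.read_text())
        # Hypothetical field names; adjust to the real report schema.
        supported += int(data.get("supported", 0))
        unsupported += int(data.get("unsupported", 0))
    total = supported + unsupported
    return None if total == 0 else 100.0 * supported / total
```

Returning `None` for a run with no verified claims keeps an ungrounded or abstaining run from being reported as 0% or 100%.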

### LaTeX-to-PPTX Conversion

@@ -636,6 +682,20 @@ python run.py "Advanced Algorithms" --copilot --exp algo_course_v2
# - Development → feedback on chapter materials
```

### Textbook-Grounded Course

```bash
# Step 1: Generate course grounded in a textbook
python run.py "Data Mining" --catalog default_catalog --exp dm_grounded \
--use-textbook path/to/textbook.pdf

# Step 2: Evaluate with the Grounding Fidelity % (rigorous mode)
python evaluate.py --exp dm_grounded --rigorous

# Step 3: Review per-chapter content-fidelity logs (claims supported vs. unsupported)
open exp/dm_grounded/chapter_1/content_verification.json
```

---

## 📖 Documentation