Commit 0a6d05c

[docs] Add OB1 catalog generator + committed ob1-catalog.json

Ships the site-ready catalog artifact that the natejones.com /ob1 directory will load, plus the generator script and a PR gate drift check that keeps the artifact in sync with repo content.

Pipeline:
- scripts/build_catalog.py discovers every non-template contribution under the seven canonical category folders, loads metadata.json, rewrites intra-repo links in README.md, resolves requires_skills / requires_primitives into forward and reverse dependency edges, and emits resources/ob1-catalog.json.
- .github/workflows/ob1-gate.yml now runs the generator in --check mode whenever a PR touches contribution content, the generator, or the artifact. A drift failure tells the contributor exactly which command to run.

Generator-time validation (runs before artifact is emitted):
- category in metadata.json matches the folder category
- requires.open_brain is one of required | optional
- learning_order is only set on extensions
- every requires_skills / requires_primitives slug resolves to a real contribution
- no two entries share the same site_path
- every relative intra-repo link in a contribution README resolves to an existing file in the repo

Link rewriting:
- relative links to contribution folders or their READMEs become /ob1/<category>/<slug>
- links to category index folders or READMEs become /ob1/<category>
- deeper relative files inside contribution folders go to the raw GitHub blob URL
- image / asset references go to raw.githubusercontent.com
- other intra-repo links (docs/, root markdown) go to blob/tree URLs
- absolute URLs, anchors, and mailto: / tel: schemes pass through

Site loader precedence (documented here for PR3 in the site repo):
- OB1_CATALOG_LOCAL_PATH env var takes precedence when set
- otherwise fetch from raw.githubusercontent.com on main
- missing or invalid artifact must fail the site build

Pre-existing fix included:
- recipes/email-history-import/README.md linked to a non-existent primitives/content-fingerprint-dedup path. The dedup logic lives in recipes/content-fingerprint-dedup. The generator's strict link check flagged it, so fixed the link instead of loosening the rule.

Output: 59 catalog entries across all seven categories (recipes 29, skills 13, extensions 6, primitives 5, integrations 3, dashboards 2, schemas 1), with 4 required + 9 optional in skills and all other categories required for v1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
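
The link-rewriting rules in the commit message can be illustrated with a small sketch. This is not the code from scripts/build_catalog.py, which is not shown on this page. The function name `rewrite_link`, the `OWNER/REPO` URL placeholders, the asset extension list, and the choice to keep `#fragment` suffixes are all assumptions made for illustration. It also sends every remaining intra-repo link to a blob URL, since it has no file listing to tell files from folders.

```python
import posixpath

# The seven canonical category folders named in the commit.
CATEGORIES = {"recipes", "schemas", "dashboards", "integrations",
              "skills", "primitives", "extensions"}

# Hypothetical placeholders: the real repository URL is not given here.
BLOB_BASE = "https://github.com/OWNER/REPO/blob/main"
RAW_BASE = "https://raw.githubusercontent.com/OWNER/REPO/main"

# Assumed definition of "image / asset"; the real list may differ.
ASSET_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")
PASS_THROUGH_PREFIXES = ("http://", "https://", "#", "mailto:", "tel:")


def rewrite_link(target: str, readme_dir: str) -> str:
    """Rewrite one link target found in <readme_dir>/README.md."""
    # Absolute URLs, anchors, and mailto:/tel: schemes pass through.
    if target.startswith(PASS_THROUGH_PREFIXES):
        return target

    # Resolve the relative path against the README's folder.
    path, sep, fragment = target.partition("#")
    suffix = sep + fragment  # assumption: keep any #fragment as-is
    resolved = posixpath.normpath(posixpath.join(readme_dir, path))

    # Image / asset references go to raw.githubusercontent.com.
    if resolved.lower().endswith(ASSET_EXTENSIONS):
        return f"{RAW_BASE}/{resolved}"

    parts = resolved.split("/")
    if parts[0] in CATEGORIES:
        # A folder and its README.md are treated as the same target.
        if parts[-1] == "README.md":
            parts = parts[:-1]
        if len(parts) == 1:
            return f"/ob1/{parts[0]}{suffix}"             # category index
        if len(parts) == 2:
            return f"/ob1/{parts[0]}/{parts[1]}{suffix}"  # contribution

    # Deeper files and other intra-repo links (docs/, root markdown).
    return f"{BLOB_BASE}/{resolved}{suffix}"
```

Under these assumptions, the corrected link in recipes/email-history-import/README.md, `../content-fingerprint-dedup/README.md`, maps to `/ob1/recipes/content-fingerprint-dedup`.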
1 parent 41e86b7 commit 0a6d05c

4 files changed

Lines changed: 3564 additions & 2 deletions

.github/workflows/ob1-gate.yml

Lines changed: 23 additions & 0 deletions
@@ -31,6 +31,29 @@ jobs:
       - name: Install metadata schema validator
         run: python3 -m pip install check-jsonschema
 
+      - name: Check catalog is not stale
+        id: catalog_check
+        # Only run if any contribution content or metadata changed — a doc-only
+        # or workflow-only PR cannot invalidate the catalog.
+        run: |
+          set +e
+          changed=$(git diff --name-only origin/main...HEAD)
+          if echo "$changed" | grep -qE '^(recipes|schemas|dashboards|integrations|skills|primitives|extensions)/[^_].*/(README\.md|metadata\.json)$' \
+            || echo "$changed" | grep -qE '^scripts/build_catalog\.py$' \
+            || echo "$changed" | grep -qE '^resources/ob1-catalog\.json$'; then
+            python3 scripts/build_catalog.py --check
+            rc=$?
+            if [ "$rc" -ne 0 ]; then
+              echo "catalog_stale=true" >> $GITHUB_OUTPUT
+              echo "::error::resources/ob1-catalog.json is stale. Run \`python3 scripts/build_catalog.py\` and commit the result."
+              exit "$rc"
+            fi
+            echo "catalog_stale=false" >> $GITHUB_OUTPUT
+          else
+            echo "catalog check skipped (no contribution or generator changes)"
+            echo "catalog_stale=false" >> $GITHUB_OUTPUT
+          fi
+
       - name: Get changed files
         id: changed
         run: |
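
To make the trigger condition in the new workflow step easier to reason about, here is a Python sketch that applies the same three patterns as the `grep -qE` calls to a list of changed paths. The function name `needs_catalog_check` is hypothetical. The workflow itself does this in shell, and this sketch is only a way to try paths against the patterns locally.

```python
import re

# Same extended regular expressions as the three grep -qE calls in the step.
TRIGGER_PATTERNS = [
    re.compile(
        r"^(recipes|schemas|dashboards|integrations|skills|primitives|extensions)"
        r"/[^_].*/(README\.md|metadata\.json)$"
    ),
    re.compile(r"^scripts/build_catalog\.py$"),
    re.compile(r"^resources/ob1-catalog\.json$"),
]


def needs_catalog_check(changed_paths):
    """Return True if any changed path matches one of the trigger patterns."""
    return any(
        pattern.search(path)
        for path in changed_paths
        for pattern in TRIGGER_PATTERNS
    )
```

Two consequences of the first pattern are worth noting. The `[^_]` after the category folder skips any contribution folder whose name starts with an underscore, and the required second `/` means a category-level `recipes/README.md` does not trigger the check on its own.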

recipes/email-history-import/README.md

Lines changed: 2 additions & 2 deletions
@@ -100,7 +100,7 @@ deno run --allow-net --allow-read --allow-write --allow-env pull-gmail.ts --list
 4. **Deduplicate** via sync-log (tracks Gmail message IDs already imported)
 5. **Embed** content via OpenRouter (`text-embedding-3-small`)
 6. **Classify** via LLM (topics, type, people, action items)
-7. **Upsert** into Supabase with SHA-256 [content fingerprint](../../primitives/content-fingerprint-dedup/README.md) — re-running produces zero duplicates
+7. **Upsert** into Supabase with SHA-256 [content fingerprint](../content-fingerprint-dedup/README.md) — re-running produces zero duplicates
 
 ### What gets filtered out
 
@@ -115,7 +115,7 @@ Each imported email becomes one row in the `thoughts` table:
 - `content`: Email body with context prefix (`[Email from X | Subject: Y | Date: Z]`)
 - `embedding`: 1536-dim vector for semantic search (truncated to 8K chars)
 - `metadata`: LLM-extracted topics, type, people, action items, plus `source: "gmail"`, `gmail_id`, `gmail_labels`, `gmail_thread_id`
-- `content_fingerprint`: Normalized SHA-256 hash for dedup (see [content fingerprint primitive](../../primitives/content-fingerprint-dedup/README.md))
+- `content_fingerprint`: Normalized SHA-256 hash for dedup (see [content fingerprint primitive](../content-fingerprint-dedup/README.md))
 
 ## Troubleshooting
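
The two link fixes in this file were prompted by the generator's strict link check. A minimal sketch of such a check follows, assuming the repo contents are available as a set of file paths. The names `broken_links` and `repo_files` and the link regex are hypothetical, and the real check in scripts/build_catalog.py may differ, for example in how it treats links to folders.

```python
import posixpath
import re

# Simplified Markdown inline-link matcher: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
PASS_THROUGH_PREFIXES = ("http://", "https://", "#", "mailto:", "tel:")


def broken_links(readme_path, readme_text, repo_files):
    """Return relative link targets that do not resolve to a file in repo_files."""
    readme_dir = posixpath.dirname(readme_path)
    missing = []
    for target in LINK_RE.findall(readme_text):
        # Absolute URLs, anchors, and mailto:/tel: links are not checked.
        if target.startswith(PASS_THROUGH_PREFIXES):
            continue
        path = target.split("#", 1)[0]  # ignore any #fragment
        resolved = posixpath.normpath(posixpath.join(readme_dir, path))
        if resolved not in repo_files:
            missing.append(target)
    return missing
```

With a `repo_files` set that contains `recipes/content-fingerprint-dedup/README.md` but nothing under `primitives/content-fingerprint-dedup`, the old `../../primitives/...` target is reported as missing and the corrected `../content-fingerprint-dedup/README.md` target is not.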
