services/apps/git_integration/src/crowdgit/services/maintainer/maintainer_service.py
Lines changed: 27 additions & 7 deletions
@@ -360,11 +360,25 @@ def get_extraction_prompt(
  - **Third-Party Check (MANDATORY — evaluate FIRST)**: Examine the **full file path** and the **repository URL** below. You MUST return `{{"error": "not_found"}}` immediately if ANY of these rules match:

- **Rule 1 — Repo-name check (step by step)**:
- 1. Extract the repo name and org name from the repository URL (e.g. URL `https://github.com/numworks/epsilon` → repo=`epsilon`, org=`numworks`).
- 2. For each directory in the file path, check: is this directory name a common structural directory (like `src`, `docs`, `doc`, `.github`, `lib`, `pkg`, `test`, `community`, `content`, `tools`, `web`, `app`, `config`, `deploy`, `charts`, etc.)? If yes, skip it — it's fine.
- 3. For any directory that is NOT a common structural directory AND is NOT a governance keyword (maintainer, owner, contributor, etc.), check: does it appear as a substring of the repo name or org name, or vice versa? If NOT → this directory is a submodule or bundled library name that does not belong to this repo. Return `{{"error": "not_found"}}`.
- Example: file `mylib/README.md` in repo `orgname/myproject` → `mylib` is not structural, not a governance keyword, and `mylib` does not appear in `myproject` or `orgname` → reject. But file `myproject/README.md` in the same repo → `myproject` matches the repo name → allow.
+ **Rule 1 — Repo-name check (APPLY TO EVERY DIRECTORY)**:
+ 1. Extract the repo name and org name from the repository URL (e.g. URL `https://github.com/orgname/myproject` → repo=`myproject`, org=`orgname`).
+ 2. Walk the file path directory-by-directory. For EACH directory (not just the first one):
+ b. If the directory name is a **governance keyword** (maintainer, maintainers, owner, owners, codeowner, codeowners, contributor, contributors, author, authors, governance, committer, committers, reviewer, reviewers, approver, approvers, emeritus, steward, stewards, credits, code_owners, core_team), CONTINUE.
+ c. Otherwise, check: does the directory name appear as a substring of the repo name or org name, or vice versa (case-insensitive)? If YES, CONTINUE. If NO → REJECT immediately with `{{"error": "not_found"}}`. This directory is a submodule, bundled library, or unrelated subcomponent.
+ 3. If every directory passes, proceed to content analysis.
+
+ Examples of walking:
+ - `plugins/code-review/README.md` in repo `orgname/myproject`:
+ Walk → `plugins` (structural, CONTINUE) → `code-review` (not structural, not governance, not in "myproject" or "orgname") → REJECT.
+ - `src/qhull/README.txt` in repo `orgname/bambustudio`:
+ Walk → `src` (structural, CONTINUE) → `qhull` (not structural, not governance, not in "bambustudio" or "orgname") → REJECT.
+ - `lib/tinyusb/CONTRIBUTORS.rst` in repo `orgname/firmware`:
+ Walk → `lib` (structural, CONTINUE) → `tinyusb` (not structural, not governance, not in "firmware" or "orgname") → REJECT.
+ - `docs/developer-guide/maintainers.md` in repo `orgname/myproject`:
+ Walk → `docs` (structural, CONTINUE) → `developer-guide` (structural documentation sub-area, CONTINUE) → filename `maintainers.md` is governance → ALLOW.
+ - `.github/CODEOWNERS` in any repo:
+ Walk → `.github` (structural, CONTINUE) → filename `CODEOWNERS` is governance → ALLOW.
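For readers who want to sanity-check the walk examples above, the Rule 1 procedure can be approximated in plain Python. This is only a sketch and is not part of the PR: as far as the diff shows, the rule lives in the prompt text returned by `get_extraction_prompt`, not in code. The names `repo_and_org`, `passes_repo_name_check`, and `STRUCTURAL_DIRS` are hypothetical, and the structural set is an assumption assembled from the directories named in the removed step 2 plus `plugins` and `developer-guide`, which the walk examples treat as structural.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# ASSUMPTION: illustrative set only; the prompt's actual structural list is not
# fully visible in this diff.
STRUCTURAL_DIRS = {
    "src", "docs", "doc", ".github", "lib", "pkg", "test", "community",
    "content", "tools", "web", "app", "config", "deploy", "charts",
    "plugins", "developer-guide",
}

# Governance keywords as listed in step 2b of the new prompt text.
GOVERNANCE_KEYWORDS = {
    "maintainer", "maintainers", "owner", "owners", "codeowner", "codeowners",
    "contributor", "contributors", "author", "authors", "governance",
    "committer", "committers", "reviewer", "reviewers", "approver",
    "approvers", "emeritus", "steward", "stewards", "credits", "code_owners",
    "core_team",
}


def repo_and_org(repo_url: str) -> tuple[str, str]:
    """Return (repo, org) in lowercase from a URL like https://host/org/repo."""
    parts = [p for p in urlparse(repo_url).path.split("/") if p]
    org, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return repo.lower(), org.lower()


def passes_repo_name_check(file_path: str, repo_url: str) -> bool:
    """Walk every directory of file_path; False means 'reject' under Rule 1."""
    repo, org = repo_and_org(repo_url)
    directories = PurePosixPath(file_path).parts[:-1]  # drop the filename
    for directory in directories:
        name = directory.lower()
        if name in STRUCTURAL_DIRS or name in GOVERNANCE_KEYWORDS:
            continue
        # Case-insensitive substring match in either direction.
        if any(name in owner or owner in name for owner in (repo, org)):
            continue
        return False
    return True


url = "https://github.com/orgname/myproject"
print(passes_repo_name_check("plugins/code-review/README.md", url))
print(passes_repo_name_check("docs/developer-guide/maintainers.md", url))
```

Under these assumptions the first call rejects (`code-review` matches neither the repo nor the org name) and the second allows, mirroring the first and fourth walk examples.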
  **Rule 2 — Vendor/dependency directory**: reject if any directory in the path is one of:
  **Rule 4 — Hard depth limit**: reject if the path has more than 3 segments (e.g. `a/b/c/file` is 4 segments → reject). Legitimate governance files live at the root or 1-2 directories deep. No exceptions.
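The segment counting in Rule 4 can be illustrated with a short Python sketch. It is not taken from the PR; `exceeds_depth_limit` and `MAX_SEGMENTS` are hypothetical names, and the sketch assumes repository-relative paths with `/` separators, counting the filename as one segment as the rule's own example does.

```python
from pathlib import PurePosixPath

MAX_SEGMENTS = 3  # taken from the rule text: more than 3 segments is rejected


def exceeds_depth_limit(file_path: str) -> bool:
    """True when the path has more than MAX_SEGMENTS segments, filename included."""
    # Strip stray slashes so a leading "/" is not counted as a segment.
    segments = PurePosixPath(file_path.strip("/")).parts
    return len(segments) > MAX_SEGMENTS


print(exceeds_depth_limit("a/b/c/file"))                           # 4 segments
print(exceeds_depth_limit("docs/developer-guide/maintainers.md"))  # 3 segments
```

The first path has four segments and is rejected; the second has three and stays within the limit.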

  **Examples of paths that MUST be rejected:**
- - `src/somelibrary/AUTHORS` in a repo that is NOT somelibrary (Rule 1)
- - `subcomponent/README.md` in a repo with a different project name (Rule 1)
+ - `src/libname/AUTHORS` where `libname` is not related to the repo — bundled library inside a source tree (Rule 1)
+ - `lib/libname/CONTRIBUTORS.rst` where `libname` is not the repo — bundled dependency (Rule 1)
+ - `plugins/something/README.md` where `something` is not related to the repo — unrelated plugin/subcomponent (Rule 1)
+ - `packages/pkgname/README.md` where `pkgname` is not related to the repo — bundled package (Rule 1)
+ - `examples/foo/README.md` where `foo` is not related to the repo — sample documentation, not governance (Rule 1)
+ - `modules/modname/README.md` where `modname` is not related to the repo — module-specific doc (Rule 1)
+ - `libname/AUTHORS.md` at the root where `libname` does not match the repo — bundled library at root (Rule 1)
+ - `subcomponent/README.md` at the root where `subcomponent` is not related to the repo (Rule 1)