Commit 28ba96d

feat(config): add exclude support to defaults (#17)
* feat(config): add exclude support to defaults
* docs: README
* chore: README
* docs: README
* feat(git): add commit availability check
1 parent c2378b7 commit 28ba96d

9 files changed

Lines changed: 257 additions & 55 deletions

README.md

Lines changed: 34 additions & 41 deletions
@@ -1,20 +1,20 @@
 # 🗃️ `docs-cache`

-Deterministic local caching of external documentation for agents and tools
+Deterministic local caching of external documentation for agents and developers

 [![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
 [![npm version](https://img.shields.io/npm/v/docs-cache)](https://www.npmjs.com/package/docs-cache)
 [![Audit](https://github.com/fbosch/docs-cache/actions/workflows/audit.yml/badge.svg)](https://github.com/fbosch/docs-cache/actions/workflows/audit.yml)

 ## Purpose

-Provides agents and automation tools with local access to external documentation without committing it to the repository.
+Provides agents and developers with local access to external documentation without committing it to the repository.

 Documentation is cached in a gitignored location, exposed to agent and tool targets via links or copies, and updated through sync commands or postinstall hooks.

 ## Features

-- **Local only**: Cache lives in the directory `.docs` (or a custom location) and _should_ be gitignored.
+- **Local only**: Cache lives in the directory `.docs` (or a custom location) and can be gitignored.
 - **Deterministic**: `docs-lock.json` pins commits and file metadata.
 - **Fast**: Local cache avoids network roundtrips after sync.
 - **Flexible**: Cache full repos or just the subdirectories you need.
@@ -54,20 +54,21 @@ npx docs-cache clean

 ## Configuration

-`docs.config.json` at project root (or `docs-cache` inside `package.json`):
+`docs.config.json` at project root (or a `docs-cache` field in `package.json`):

-```json
+```jsonc
 {
   "$schema": "https://github.com/fbosch/docs-cache/blob/master/docs.config.schema.json",
   "sources": [
     {
       "id": "framework",
       "repo": "https://github.com/framework/core.git",
-      "ref": "main",
-      "targetDir": "./agents/skills/framework-skill/references",
-      "include": ["guide/**"]
-    }
-  ]
+      "ref": "main", // or specific commit hash
+      "targetDir": "./agents/skills/framework-skill/references", // symlink/copy target
+      "include": ["guide/**"], // file globs to include from the source
+      "toc": true, // defaults to "compressed" (for agents)
+    },
+  ],
 }
 ```

@@ -78,28 +79,29 @@ npx docs-cache clean
 | Field | Details | Required |
 | ---------- | -------------------------------------- | -------- |
 | `cacheDir` | Directory for cache. Default: `.docs`. | Optional |
-| `sources` | List of repositories to sync. | Required |
 | `defaults` | Default settings for all sources. | Optional |
+| `sources` | List of repositories to sync. | Required |

 <details>
 <summary>Show default and source options</summary>

 ### Default options

-All fields in `defaults` apply to all sources unless overridden per-source.
-
-| Field | Details |
-| --------------------- | ---------------------------------------------------------------------------------------------------------------- |
-| `ref` | Branch, tag, or commit. Default: `"HEAD"`. |
-| `mode` | Cache mode. Default: `"materialize"`. |
-| `include` | Glob patterns to copy. Default: `["**/*.{md,mdx,markdown,mkd,txt,rst,adoc,asciidoc}"]`. |
-| `targetMode` | How to link or copy from the cache to the destination. Default: `"symlink"` on Unix, `"copy"` on Windows. |
-| `required` | Whether missing sources should fail. Default: `true`. |
-| `maxBytes` | Maximum total bytes to materialize. Default: `200000000` (200 MB). |
-| `maxFiles` | Maximum total files to materialize. |
-| `allowHosts` | Allowed Git hosts. Default: `["github.com", "gitlab.com"]`. |
-| `toc` | Generate per-source `TOC.md`. Default: `true`. Supports `true`, `false`, or a format (`"tree"`, `"compressed"`). |
-| `unwrapSingleRootDir` | If the materialized output is nested under a single directory, unwrap it (recursively). Default: `false`. |
+These fields can be set in `defaults` and are inherited by every source unless overridden per-source.
+
+| Field | Details |
+| --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `ref` | Branch, tag, or commit. Default: `"HEAD"`. |
+| `mode` | Cache mode. Default: `"materialize"`. |
+| `include` | Glob patterns to copy. Default: `["**/*.{md,mdx,markdown,mkd,txt,rst,adoc,asciidoc}"]`. |
+| `exclude` | Glob patterns to skip. Default: `[]`. |
+| `targetMode` | How to link or copy from the cache to the destination. Default: `"symlink"` on Unix, `"copy"` on Windows. |
+| `required` | Whether missing sources should fail. Default: `true`. |
+| `maxBytes` | Maximum total bytes to materialize. Default: `200000000` (200 MB). |
+| `maxFiles` | Maximum total files to materialize. |
+| `allowHosts` | Allowed Git hosts. Default: `["github.com", "gitlab.com"]`. |
+| `toc` | Generate per-source `TOC.md`. Default: `true`. Supports `true`, `false`, or a format: `"tree"` (human readable), `"compressed"` (optimized for agents). |
+| `unwrapSingleRootDir` | If the materialized output is nested under a single directory, unwrap it (recursively). Default: `false`. |

 ### Source options

@@ -110,22 +112,13 @@ All fields in `defaults` apply to all sources unless overridden per-source.
 | `repo` | Git URL. |
 | `id` | Unique identifier for the source. |

-#### Optional
-
-| Field | Details |
-| --------------------- | ----------------------------------------------------------------------------------------------- |
-| `ref` | Branch, tag, or commit. |
-| `include` | Glob patterns to copy. |
-| `exclude` | Glob patterns to skip. |
-| `targetDir` | Path where files should be symlinked/copied to, outside `.docs`. |
-| `targetMode` | How to link or copy from the cache to the destination. |
-| `required` | Whether missing sources should fail. |
-| `maxBytes` | Maximum total bytes to materialize. |
-| `maxFiles` | Maximum total files to materialize. |
-| `toc` | Generate per-source `TOC.md`. Supports `true`, `false`, or a format (`"tree"`, `"compressed"`). |
-| `unwrapSingleRootDir` | If the materialized output is nested under a single directory, unwrap it (recursively). |
-
-> **Note**: Sources are always downloaded to `.docs/<id>/`. If you provide a `targetDir`, `docs-cache` will create a symlink or copy pointing from the cache to that target directory. The target should be outside `.docs`. Git operation timeout is configured via the `--timeout-ms` CLI flag, not as a per-source configuration option.
+#### Optional (source-only)
+
+| Field | Details |
+| ----------- | ---------------------------------------------------------------- |
+| `targetDir` | Path where files should be symlinked/copied to, outside `.docs`. |
+
+> **Note**: Sources are always downloaded to `.docs/<id>/`. If you provide a `targetDir`; `docs-cache` will create a symlink or copy pointing from the cache to that target directory.

 </details>

docs.config.schema.json

Lines changed: 9 additions & 10 deletions
@@ -33,6 +33,13 @@
         "minLength": 1
       }
     },
+    "exclude": {
+      "type": "array",
+      "items": {
+        "type": "string",
+        "minLength": 1
+      }
+    },
     "targetMode": {
       "type": "string",
       "enum": ["symlink", "copy"]
@@ -56,9 +63,6 @@
         "minLength": 1
       }
     },
-    "unwrapSingleRootDir": {
-      "type": "boolean"
-    },
     "toc": {
       "anyOf": [
         {
@@ -70,9 +74,8 @@
         }
       ]
     },
-    "tocFormat": {
-      "type": "string",
-      "enum": ["tree", "compressed"]
+    "unwrapSingleRootDir": {
+      "type": "boolean"
     }
   },
   "additionalProperties": false
@@ -163,10 +166,6 @@
         }
       ]
     },
-    "tocFormat": {
-      "type": "string",
-      "enum": ["tree", "compressed"]
-    },
     "unwrapSingleRootDir": {
       "type": "boolean"
     }

src/config-schema.ts

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ export const DefaultsSchema = z
     ref: z.string().min(1),
     mode: CacheModeSchema,
     include: z.array(z.string().min(1)).min(1),
+    exclude: z.array(z.string().min(1)).optional(),
     targetMode: TargetModeSchema.optional(),
     required: z.boolean(),
     maxBytes: z.number().min(1),
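
Review note: the new `exclude` entry differs from `include` in two ways: it is optional, and the array itself has no minimum length, so `[]` is accepted. The sketch below restates that constraint without zod so the accepted shapes are easy to see. `isValidExclude` and `excludeChecks` are hypothetical names used only for illustration; they are not part of docs-cache.

```typescript
// Hypothetical helper, not part of docs-cache: mirrors the shape accepted by
// z.array(z.string().min(1)).optional() without depending on zod.
function isValidExclude(value: unknown): value is string[] | undefined {
  // .optional() permits a missing field
  if (value === undefined) return true;
  if (!Array.isArray(value)) return false;
  // Every pattern must be a non-empty string; an empty array is allowed.
  return value.every((item) => typeof item === "string" && item.length > 0);
}

// A few representative inputs and whether they pass the check.
const excludeChecks = {
  missing: isValidExclude(undefined),
  empty: isValidExclude([]),
  patterns: isValidExclude(["**/.cache/**"]),
  blankPattern: isValidExclude([""]),
  notArray: isValidExclude("**/.cache/**"),
};
```

Under these assumptions the first three inputs pass and the last two are rejected, which matches the `minLength: 1` item rule added to `docs.config.schema.json`.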

src/config.ts

Lines changed: 7 additions & 1 deletion
@@ -19,6 +19,7 @@ export interface DocsCacheDefaults {
   ref: string;
   mode: CacheMode;
   include: string[];
+  exclude?: string[];
   targetMode?: "symlink" | "copy";
   required: boolean;
   maxBytes: number;
@@ -80,6 +81,7 @@ export const DEFAULT_CONFIG: DocsCacheConfig = {
     ref: "HEAD",
     mode: "materialize",
     include: ["**/*.{md,mdx,markdown,mkd,txt,rst,adoc,asciidoc}"],
+    exclude: [],
     targetMode: DEFAULT_TARGET_MODE,
     required: true,
     maxBytes: 200000000,
@@ -277,6 +279,10 @@ export const validateConfig = (input: unknown): DocsCacheConfig => {
       defaultsInput.include !== undefined
         ? assertStringArray(defaultsInput.include, "defaults.include")
         : defaultValues.include,
+    exclude:
+      defaultsInput.exclude !== undefined
+        ? assertStringArray(defaultsInput.exclude, "defaults.exclude")
+        : defaultValues.exclude,
     targetMode:
       defaultsInput.targetMode !== undefined
         ? assertTargetMode(defaultsInput.targetMode, "defaults.targetMode")
@@ -434,7 +440,7 @@ export const resolveSources = (
     ref: source.ref ?? defaults.ref,
     mode: source.mode ?? defaults.mode,
     include: source.include ?? defaults.include,
-    exclude: source.exclude,
+    exclude: source.exclude ?? defaults.exclude,
     required: source.required ?? defaults.required,
     maxBytes: source.maxBytes ?? defaults.maxBytes,
     maxFiles: source.maxFiles ?? defaults.maxFiles,
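
Review note: because the fallback uses `??`, a per-source `exclude` replaces the default list rather than merging with it, and an explicit empty array on a source clears the inherited patterns. The following is a minimal sketch of that behavior, assuming simplified types; `SourceRules`, `DefaultRules`, `resolveRules`, and the sample patterns are hypothetical and stand in for the project's fuller types.

```typescript
// Illustrative types only; the real DocsCacheDefaults has more fields.
interface SourceRules {
  include?: string[];
  exclude?: string[];
}

interface DefaultRules {
  include: string[];
  exclude?: string[];
}

// Same fallback shape as the diff: a source value wins unless it is null or undefined.
function resolveRules(source: SourceRules, defaults: DefaultRules) {
  return {
    include: source.include ?? defaults.include,
    exclude: source.exclude ?? defaults.exclude,
  };
}

const sharedDefaults: DefaultRules = {
  include: ["**/*.md"],
  exclude: ["**/.cache/**"],
};

// No per-source exclude: the default list is inherited.
const inherited = resolveRules({}, sharedDefaults);
// Per-source exclude: replaces the default list, no merging.
const overridden = resolveRules({ exclude: ["drafts/**"] }, sharedDefaults);
// Explicit empty array: [] is not nullish, so the defaults are not applied.
const cleared = resolveRules({ exclude: [] }, sharedDefaults);
```

If merging default and per-source patterns were the intent, `??` would not provide it; that would need an explicit concatenation, which this commit does not do.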

src/git/fetch-source.ts

Lines changed: 31 additions & 0 deletions
@@ -131,6 +131,26 @@ const isPartialClone = async (repoPath: string) => {
   }
 };

+const ensureCommitAvailable = async (
+  repoPath: string,
+  commit: string,
+  options?: { timeoutMs?: number; allowFileProtocol?: boolean },
+) => {
+  try {
+    await git(["-C", repoPath, "cat-file", "-e", `${commit}^{commit}`], {
+      timeoutMs: options?.timeoutMs,
+      allowFileProtocol: options?.allowFileProtocol,
+    });
+    return;
+  } catch {
+    // commit not present, fetch it
+  }
+  await git(["-C", repoPath, "fetch", "origin", commit], {
+    timeoutMs: options?.timeoutMs,
+    allowFileProtocol: options?.allowFileProtocol,
+  });
+};
+
 type FetchParams = {
   sourceId: string;
   repo: string;
@@ -224,6 +244,9 @@ const cloneRepo = async (params: FetchParams, outDir: string) => {
   }
   cloneArgs.push(params.repo, outDir);
   await git(cloneArgs, { timeoutMs: params.timeoutMs });
+  await ensureCommitAvailable(outDir, params.resolvedCommit, {
+    timeoutMs: params.timeoutMs,
+  });
   if (useSparse) {
     const sparsePaths = extractSparsePaths(params.include);
     if (sparsePaths.length > 0) {
@@ -275,6 +298,9 @@ const cloneOrUpdateRepo = async (params: FetchParams, outDir: string) => {
       await git(["-C", cachePath, ...fetchArgs], {
         timeoutMs: params.timeoutMs,
       });
+      await ensureCommitAvailable(cachePath, params.resolvedCommit, {
+        timeoutMs: params.timeoutMs,
+      });
     } catch (_error) {
       // Fetch failed, remove corrupt cache and re-clone
       await removeDir(cachePath);
@@ -333,6 +359,11 @@ const cloneOrUpdateRepo = async (params: FetchParams, outDir: string) => {
     }
   }

+  await ensureCommitAvailable(outDir, params.resolvedCommit, {
+    timeoutMs: params.timeoutMs,
+    allowFileProtocol: true,
+  });
+
   await git(
     ["-C", outDir, "checkout", "--quiet", "--detach", params.resolvedCommit],
     {
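
Review note: `ensureCommitAvailable` follows a check-then-fetch pattern: `git cat-file -e` exits non-zero when the object is missing, and only then is the commit fetched from `origin`. The sketch below isolates that control flow with an injected runner so it can be exercised without a repository. `GitRunner`, `ensureCommitAvailableSketch`, and `makeFakeGit` are hypothetical names; the project's real `git` helper and its options are only partially visible in this diff.

```typescript
// Hypothetical runner type; the project's real `git` helper is not shown in full.
type GitRunner = (args: string[]) => Promise<void>;

// Check locally first, and only go to the network when the commit is missing.
async function ensureCommitAvailableSketch(
  run: GitRunner,
  repoPath: string,
  commit: string,
): Promise<"present" | "fetched"> {
  try {
    // `git cat-file -e` exits non-zero when the object does not exist;
    // the `^{commit}` suffix requires the object to peel to a commit.
    await run(["-C", repoPath, "cat-file", "-e", `${commit}^{commit}`]);
    return "present";
  } catch {
    // Commit not present locally: fall through to the fetch below.
  }
  await run(["-C", repoPath, "fetch", "origin", commit]);
  return "fetched";
}

// Fake runner for illustration: records calls and fails cat-file for unknown commits.
function makeFakeGit(known: Set<string>) {
  const calls: string[][] = [];
  const run: GitRunner = async (args) => {
    calls.push(args);
    const lastArg = args[args.length - 1];
    if (args.includes("cat-file")) {
      const sha = lastArg.replace("^{commit}", "");
      if (!known.has(sha)) throw new Error("missing object");
    }
    // A successful fetch makes the commit available for later checks.
    if (args.includes("fetch")) known.add(lastArg);
  };
  return { run, calls };
}
```

One caveat worth checking in review: fetching by raw commit hash depends on the remote allowing it, so the second `git fetch` can still fail on servers that only serve advertised refs.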

src/sync.ts

Lines changed: 2 additions & 2 deletions
@@ -176,7 +176,7 @@ export const getSyncPlan = async (
     filteredSources.map(async (source) => {
       const lockEntry = lockData?.sources?.[source.id];
       const include = source.include ?? defaults.include;
-      const exclude = source.exclude;
+      const exclude = source.exclude ?? defaults.exclude;
       const rulesSha256 = computeRulesHash({
         ...source,
         include,
@@ -439,7 +439,7 @@ export const runSync = async (options: SyncOptions, deps: SyncDeps = {}) => {
       }
     }
     if (!options.json) {
-      ui.step("Building cache layout", source.id);
+      ui.step("Materializing", source.id);
     }
     const stats = await runMaterialize({
       sourceId: source.id,

tests/edge-cases-validation.test.js

Lines changed: 14 additions & 0 deletions
@@ -393,6 +393,7 @@ test("defaults with all fields specified", async () => {
       ref: "main",
       mode: "materialize",
       include: ["**/*.md"],
+      exclude: ["**/.cache/**"],
       targetMode: "copy",
       required: false,
       maxBytes: 1000000,
@@ -411,5 +412,18 @@ test("defaults with all fields specified", async () => {
   assert.equal(config.defaults.maxBytes, 1000000);
   assert.equal(config.defaults.maxFiles, 100);
   assert.deepEqual(config.defaults.allowHosts, ["github.com"]);
+  assert.deepEqual(config.defaults.exclude, ["**/.cache/**"]);
   assert.equal(config.defaults.toc, true);
 });
+
+test("defaults exclude applies to sources", async () => {
+  const configPath = await writeConfig({
+    defaults: {
+      exclude: ["**/.cache/**"],
+    },
+    sources: [{ id: "test", repo: "https://github.com/example/repo.git" }],
+  });
+
+  const { sources } = await loadConfig(configPath);
+  assert.deepEqual(sources[0].exclude, ["**/.cache/**"]);
+});
