Skip to content

Commit 281a26b

Browse files
authored
pdf-server: robust path validation + folder/file root support (#497)
* pdf-server: support folders in cli + files in roots * fix(pdf-server): use path.relative for robust ancestor dir matching - Replace string prefix matching (startsWith) with path.relative-based isAncestorDir() which correctly handles trailing slashes, path normalization, and prevents both .. traversal and prefix attacks. - Use path.resolve consistently when adding paths to allowedLocalFiles and allowedLocalDirs (main.ts was missing this). - Use resolved path for exact file match check in validateUrl. - Improve error message formatting for easier debugging. - Add tests for trailing slashes, grandparent dirs, and isAncestorDir. * chore(pdf-server): remove JSONL interaction logging Remove the withLogging transport wrapper, logInteraction helper, InteractionLogger type, and all log parameter threading that were added for debugging. * fix(pdf-server): support computer:// URL scheme for local files Some clients (e.g. Claude Code) use computer:// instead of file:// for local file references. Handle both schemes in isFileUrl and fileUrlToPath. * fix(pdf-server): add debug logging for rejected files, ignore .jsonl logs * fix(pdf-server): accept bare file paths in addition to file:// URLs Clients like Claude Desktop may pass raw absolute paths (e.g. /Users/.../file.pdf) instead of file:// URLs. Handle these in validateUrl and readPdfRange. * debug(pdf-server): add hex char diagnostics for path mismatch * fix(pdf-server): resolve symlinks to handle sandbox path remapping Use fs.realpathSync to resolve symlinks/bind mounts when comparing file paths against allowed directories. This fixes the path namespace mismatch where Claude Desktop sends container paths (e.g. /sessions/...) while MCP roots use host paths (e.g. /Users/...). Both the file path and directory paths are resolved through symlinks before comparison, so either side can be a symlink. Also stores realpath of roots in allowedLocalDirs/Files so matching works in both directions. Removes verbose hex diagnostics (no longer needed). * fix(pdf-server): decode percent-encoded bare paths (e.g. %20 for spaces) Clients may send bare file paths with percent-encoded characters (e.g., Application%20Support). Apply decodeURIComponent to bare paths before resolving, matching the behavior already used for file:// URLs. * fix(pdf-server): add full diagnostics to client-facing error message Include url, resolved path, realpath, allowedFiles, and per-directory path.relative results in the error returned to the client, so we can diagnose why isAncestorDir fails when paths look identical. * debug(pdf-server): add hex dump to diagnose invisible char differences path.relative returns 10 levels of .. despite paths looking identical, meaning they differ at the byte level. Add hex dump of first 30 chars of both paths to catch invisible Unicode or encoding differences. * fix(pdf-server): use inode comparison for bind-mount path matching The cowork VM remaps paths via bind mounts: the tool input uses VM-internal paths (/sessions/...) while MCP roots use host paths (/Users/...). Since path.relative can't match across bind mounts, compare directory inodes instead (dev+ino from fs.statSync). Walks up the file's parent directories checking each against allowed dirs by inode, which works regardless of path namespace differences. * fix(pdf-server): resolve VM-internal paths via directory basename matching When the MCP server runs on the host but receives unrewritten VM paths (e.g., /sessions/name/mnt/uploads/file.pdf), the tool call url argument isn't rewritten by the cowork VM proxy because it doesn't know which arguments are file paths. Fix: when path validation fails, try to match the file by finding the directory basename (e.g., 'uploads') in the VM path and looking for the relative path suffix under each allowed directory on the host. Also returns the resolved host path so readPdfRange reads the correct file on the host filesystem. * revert(pdf-server): remove resolveVmPath hack The VM path mismatch (VM-internal paths not rewritten in tool call arguments) should be fixed in the cowork VM proxy layer, not worked around in the MCP server. Removes: - resolveVmPath() function - resolvedPath field from validateUrl return type - effectiveUrl remapping in tool handlers - VM path resolution test * nit
1 parent ec389f4 commit 281a26b

File tree

5 files changed

+264
-30
lines changed

5 files changed

+264
-30
lines changed

examples/pdf-server/.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1 +1,2 @@
11
dist/
2+
*.jsonl

examples/pdf-server/.mcpbignore

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,9 @@ src/
77
# Tests
88
*.test.*
99

10+
# Debug logs
11+
*.jsonl
12+
1013
# Development assets
1114
screenshot.png
1215
grid-cell.png

examples/pdf-server/main.ts

Lines changed: 11 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@
55
*/
66

77
import fs from "node:fs";
8+
import path from "node:path";
89
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
910
import { createMcpExpressApp } from "@modelcontextprotocol/sdk/server/express.js";
1011
import type { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
@@ -20,6 +21,7 @@ import {
2021
fileUrlToPath,
2122
allowedLocalFiles,
2223
DEFAULT_PDF,
24+
allowedLocalDirs,
2325
} from "./server.js";
2426

2527
/**
@@ -120,10 +122,16 @@ async function main() {
120122
// Register local files in whitelist
121123
for (const url of urls) {
122124
if (isFileUrl(url)) {
123-
const filePath = fileUrlToPath(url);
125+
const filePath = path.resolve(fileUrlToPath(url));
124126
if (fs.existsSync(filePath)) {
125-
allowedLocalFiles.add(filePath);
126-
console.error(`[pdf-server] Registered local file: ${filePath}`);
127+
const s = fs.statSync(filePath);
128+
if (s.isFile()) {
129+
allowedLocalFiles.add(filePath);
130+
console.error(`[pdf-server] Registered local file: ${filePath}`);
131+
} else if (s.isDirectory()) {
132+
allowedLocalDirs.add(filePath);
133+
console.error(`[pdf-server] Registered local directory: ${filePath}`);
134+
}
127135
} else {
128136
console.error(`[pdf-server] Warning: File not found: ${filePath}`);
129137
}

examples/pdf-server/server.test.ts

Lines changed: 149 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,7 @@ import path from "node:path";
33
import {
44
createPdfCache,
55
validateUrl,
6+
isAncestorDir,
67
allowedLocalFiles,
78
allowedLocalDirs,
89
pathToFileUrl,
@@ -271,4 +272,152 @@ describe("validateUrl with MCP roots (allowedLocalDirs)", () => {
271272
expect(result.valid).toBe(false);
272273
expect(result.error).toContain("File not found");
273274
});
275+
276+
it("should allow a file under an allowed dir with trailing slash", () => {
277+
const dir = path.resolve(import.meta.dirname);
278+
// Simulate a dir stored with a trailing slash (e.g. from CLI path)
279+
allowedLocalDirs.add(dir + "/");
280+
281+
const filePath = path.join(dir, "server.ts");
282+
const result = validateUrl(pathToFileUrl(filePath));
283+
expect(result.valid).toBe(true);
284+
});
285+
286+
it("should allow a file under a grandparent allowed dir", () => {
287+
// Allow a directory two levels up from the file
288+
const grandparent = path.resolve(path.join(import.meta.dirname, ".."));
289+
allowedLocalDirs.add(grandparent);
290+
291+
const filePath = path.join(import.meta.dirname, "server.ts");
292+
const result = validateUrl(pathToFileUrl(filePath));
293+
expect(result.valid).toBe(true);
294+
});
295+
296+
it("should accept computer:// URLs as local files", () => {
297+
const dir = path.resolve(import.meta.dirname);
298+
allowedLocalDirs.add(dir);
299+
300+
const filePath = path.join(dir, "server.ts");
301+
const encoded = encodeURIComponent(filePath).replace(/%2F/g, "/");
302+
const result = validateUrl(`computer://${encoded}`);
303+
expect(result.valid).toBe(true);
304+
});
305+
306+
it("should accept bare absolute paths as local files", () => {
307+
const dir = path.resolve(import.meta.dirname);
308+
allowedLocalDirs.add(dir);
309+
310+
const filePath = path.join(dir, "server.ts");
311+
const result = validateUrl(filePath);
312+
expect(result.valid).toBe(true);
313+
});
314+
315+
it("should decode percent-encoded bare paths (e.g. %20 for spaces)", () => {
316+
const fs = require("node:fs");
317+
const os = require("node:os");
318+
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "pdf test "));
319+
const testFile = path.join(tmpDir, "file.txt");
320+
321+
try {
322+
fs.writeFileSync(testFile, "hello");
323+
allowedLocalDirs.add(tmpDir);
324+
325+
// Encode spaces as %20 in the path (as some clients do)
326+
const encoded = testFile.replace(/ /g, "%20");
327+
const result = validateUrl(encoded);
328+
expect(result.valid).toBe(true);
329+
} finally {
330+
fs.rmSync(tmpDir, { recursive: true });
331+
}
332+
});
333+
334+
it("should allow file accessed via symlink when real dir is allowed", () => {
335+
const fs = require("node:fs");
336+
const os = require("node:os");
337+
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "pdf-test-"));
338+
const realDir = path.join(tmpDir, "real");
339+
const linkDir = path.join(tmpDir, "link");
340+
const testFile = path.join(realDir, "test.txt");
341+
342+
try {
343+
fs.mkdirSync(realDir);
344+
fs.writeFileSync(testFile, "hello");
345+
fs.symlinkSync(realDir, linkDir);
346+
347+
// Allow the REAL directory
348+
allowedLocalDirs.add(realDir);
349+
350+
// Access via the SYMLINK path — should still be allowed
351+
const symlinkPath = path.join(linkDir, "test.txt");
352+
const result = validateUrl(symlinkPath);
353+
expect(result.valid).toBe(true);
354+
} finally {
355+
fs.rmSync(tmpDir, { recursive: true });
356+
}
357+
});
358+
359+
it("should allow file when allowed dir is a symlink to real parent", () => {
360+
const fs = require("node:fs");
361+
const os = require("node:os");
362+
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "pdf-test-"));
363+
const realDir = path.join(tmpDir, "real");
364+
const linkDir = path.join(tmpDir, "link");
365+
const testFile = path.join(realDir, "test.txt");
366+
367+
try {
368+
fs.mkdirSync(realDir);
369+
fs.writeFileSync(testFile, "hello");
370+
fs.symlinkSync(realDir, linkDir);
371+
372+
// Allow the SYMLINK directory
373+
allowedLocalDirs.add(linkDir);
374+
375+
// Access via the REAL path — should still be allowed
376+
const result = validateUrl(testFile);
377+
expect(result.valid).toBe(true);
378+
} finally {
379+
fs.rmSync(tmpDir, { recursive: true });
380+
}
381+
});
382+
});
383+
384+
describe("isAncestorDir", () => {
385+
it("should return true for a direct child", () => {
386+
expect(isAncestorDir("/Users/test/dir", "/Users/test/dir/file.pdf")).toBe(
387+
true,
388+
);
389+
});
390+
391+
it("should return true for a nested child", () => {
392+
expect(isAncestorDir("/Users/test", "/Users/test/sub/dir/file.pdf")).toBe(
393+
true,
394+
);
395+
});
396+
397+
it("should return false for a file outside the dir", () => {
398+
expect(isAncestorDir("/Users/test/dir", "/Users/test/other/file.pdf")).toBe(
399+
false,
400+
);
401+
});
402+
403+
it("should return false for the dir itself", () => {
404+
expect(isAncestorDir("/Users/test/dir", "/Users/test/dir")).toBe(false);
405+
});
406+
407+
it("should prevent .. traversal", () => {
408+
expect(
409+
isAncestorDir("/Users/test/dir", "/Users/test/dir/../other/file.pdf"),
410+
).toBe(false);
411+
});
412+
413+
it("should prevent prefix-based traversal", () => {
414+
// /tmp/safe should NOT match /tmp/safevil/file.pdf
415+
expect(isAncestorDir("/tmp/safe", "/tmp/safevil/file.pdf")).toBe(false);
416+
});
417+
418+
it("should handle dirs with trailing slash", () => {
419+
expect(isAncestorDir("/Users/test/dir/", "/Users/test/dir/file.pdf")).toBe(
420+
true,
421+
);
422+
});
274423
});

0 commit comments

Comments
 (0)