Commit 2e0c074

feat(tools/prim): primordials audit + codemod CLI
`prim` is a multi-command CLI for auditing and migrating JavaScript built-in usage to @socketsecurity/lib/primordials. Two questions, one codemod:

- prim coverage: call sites a project could migrate to existing primordials today (Object.keys → ObjectKeys, arr.map → ArrayPrototypeMap, etc.).
- prim gaps: call sites with no matching primordial yet; the input list for expanding socket-lib/src/primordials.ts.
- prim audit: coverage + gaps in one pass; --update-state writes a JSON snapshot per target for trend tracking.
- prim state: inspect the persisted state file (.prim-state.json by default).
- prim mod: codemod source files to use primordials. Dry-run by default; --apply writes; --include-guessed also rewrites prototype-method calls where the receiver type was inferred from the identifier name.

Parser is vendored acorn-wasm (vendor/acorn-wasm/): no npm install needed at audit time, no network access.

Surface resolution prefers a sibling socket-lib/src/primordials.ts checkout (picks up unreleased exports during fleet development), falling back to the installed @socketsecurity/lib/dist/primordials.js in the target's node_modules.

Receiver-type heuristics:

- Unambiguous-method map: methods that exist on exactly one built-in type (.toUpperCase → String, .getTime → Date) classify by method name regardless of receiver; this is the strongest signal.
- Variable-name heuristic: `arr.map(...)` → Array. False positives surface flagged with `[guessed: …]` for manual dismissal.

Layout:

- tools/prim/: the CLI (workspace package, type=module, .mts source)
- vendor/acorn-wasm/: the vendored parser (workspace package)
- pnpm-lock.yaml: workspace package linkage
1 parent 53d0062 commit 2e0c074

19 files changed

Lines changed: 2381 additions & 0 deletions

pnpm-lock.yaml

Lines changed: 8 additions & 0 deletions

tools/prim/README.md

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
# prim

CLI for auditing and migrating JavaScript built-in usage to
[`@socketsecurity/lib/primordials`](https://github.com/SocketDev/socket-lib).

## What it does

Primordials capture references to JavaScript built-ins (`Object.keys`,
`Array.prototype.map`, `JSON.parse`, …) at module load time, before user
code can tamper with prototypes or globals. They're a hardening tool for
code that processes adversarial input.

`prim` answers two questions across a project's bundled output:

- **Coverage**: which call sites already have a primordial available
  (i.e. you can replace `arr.map(fn)` with `ArrayPrototypeMap(arr, fn)`
  today)?
- **Gaps**: which call sites have no matching primordial yet (and so
  the surface needs expansion in `socket-lib/src/primordials.ts`)?

It also includes a codemod (`prim mod`) that rewrites source files to
use primordials, and a state file for tracking progress across runs.
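
To make the hardening idea concrete, here is a minimal sketch of why a captured reference survives tampering. The local `ArrayPrototypeMap` below is a hypothetical stand-in written for this example, not the export from `@socketsecurity/lib/primordials`, and `demoTampering` is an illustrative helper that patches `Array.prototype.map` and then restores it.

```typescript
// Capture the built-ins once, at module load, before other code can patch them.
const ReflectApply: (target: Function, thisArg: unknown, args: unknown[]) => unknown =
  Reflect.apply
const originalMap = Array.prototype.map

// Hypothetical stand-in for the library's ArrayPrototypeMap export.
function ArrayPrototypeMap<T, U>(
  arr: readonly T[],
  fn: (value: T, index: number) => U,
): U[] {
  return ReflectApply(originalMap, arr, [fn]) as U[]
}

// Simulate hostile code replacing Array.prototype.map, compare both call
// styles, then restore the original so the demo has no lasting effect.
function demoTampering(): { direct: unknown; viaPrimordial: number[] } {
  const proto = Array.prototype as unknown as { map: unknown }
  proto.map = function hijacked() {
    return ['hijacked']
  }
  try {
    const input = [1, 2, 3]
    return {
      direct: input.map(n => n * 2), // goes through the patched method
      viaPrimordial: ArrayPrototypeMap(input, n => n * 2), // uses the captured one
    }
  } finally {
    proto.map = originalMap
  }
}
```

The direct call observes the patched method, while the captured reference still runs the original implementation. That difference is what a coverage finding points at.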
## Install

```sh
# Local checkout, run directly:
node /path/to/socket-lib/tools/prim/bin/prim.mts --help
```

Once published to npm, `pnpm dlx prim` will work too.

## Usage

`prim` is a multi-command CLI. Each subcommand does one thing.

```sh
# Show help
prim --help

# Find call sites you could migrate today (existing primordials)
prim coverage --target ./socket-cli --dir dist

# Find gaps in the primordials surface (need socket-lib expansion)
prim gaps --target ./socket-cli

# Both at once, as a snapshot
prim audit --target ./socket-cli --update-state

# Inspect persisted state
prim state

# Dry-run a codemod over your source tree
prim mod --target . --dir src

# Apply for real (only after reviewing the dry-run!)
prim mod --target . --dir src --apply

# Also rewrite prototype-method calls where the receiver type is
# guessed from the variable name (more aggressive; needs review)
prim mod --target . --dir src --include-guessed --apply
```

### Subcommands

| Subcommand      | Purpose                                                                                                                         |
| --------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| `prim coverage` | Report call sites in the target that could be migrated to existing primordials.                                                 |
| `prim gaps`     | Report call sites that need a primordial that doesn't exist yet (the input list for expanding `socket-lib/src/primordials.ts`). |
| `prim audit`    | Run `coverage` + `gaps` and (optionally) persist the snapshot to the state file.                                                |
| `prim state`    | Inspect the persisted state file.                                                                                               |
| `prim mod`      | Codemod source files to use primordials. Dry-run by default; pass `--apply` to write.                                           |

## How it knows what's covered

`prim` resolves the primordials surface from one of two locations:

1. A sibling socket-lib checkout: `../socket-lib/src/primordials.ts`
   (used during fleet development).
2. The installed `@socketsecurity/lib/dist/primordials.js` in the
   target's `node_modules`.

Whichever it finds first wins.
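
The first-match rule can be sketched as an ordered candidate list. This is an illustration only: `resolveSurface`, `surfaceCandidates`, and the `Exists` predicate are hypothetical names, and resolving the sibling path relative to the target root is an assumption, since the actual resolver is not part of this diff.

```typescript
// Hypothetical predicate so the sketch stays free of filesystem calls;
// a real caller could pass `existsSync` from `node:fs`.
type Exists = (candidate: string) => boolean

// Candidate order mirrors the list above: sibling checkout first,
// installed package second. The base directory is an assumption.
function surfaceCandidates(targetRoot: string): string[] {
  return [
    `${targetRoot}/../socket-lib/src/primordials.ts`,
    `${targetRoot}/node_modules/@socketsecurity/lib/dist/primordials.js`,
  ]
}

// Return the first candidate that exists, or undefined when neither does.
function resolveSurface(targetRoot: string, exists: Exists): string | undefined {
  return surfaceCandidates(targetRoot).find(candidate => exists(candidate))
}
```

Keeping the sibling checkout first means unreleased exports are picked up during development, while a plain install still works through the second candidate.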
## Design

- **Parser**: vendored acorn-wasm at `<socket-lib>/vendor/acorn-wasm`
  (originally from
  [sdxgen](https://github.com/SocketDev/sdxgen/tree/main/vendor/acorn-wasm));
  no npm install needed, no network access.
- **Heuristics**: receiver-type guessing for prototype-method calls
  (e.g. `arr.map(...)` → `Array`). False positives show up flagged
  with `[guessed: …]` so they can be dismissed manually.
- **Unambiguous-method map**: methods that exist on exactly one
  built-in type (`.toUpperCase` → String, `.getTime` → Date) are
  classified by the method name regardless of receiver, which is a
  stronger signal than the name heuristic.
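
The two signals can be sketched as a lookup followed by a fallback. The table contents, the name patterns, and the helper names (`classifyReceiver`, `guessFromName`, `sketchPrototypeName`) are all assumptions made for this example; the real tables live in `globals.mts`, which is not shown here. The naming rule is inferred from the `arr.map` → `ArrayPrototypeMap` example above.

```typescript
// Assumed subset of the unambiguous-method table: method name -> owning type.
const UNAMBIGUOUS_METHODS = new Map<string, string>([
  ['toUpperCase', 'String'],
  ['getTime', 'Date'],
])

// Hypothetical name patterns; the real heuristic may use different rules.
function guessFromName(name: string): string | undefined {
  if (/^(arr|array|list|items)$/i.test(name)) return 'Array'
  if (/^(str|string|text)$/i.test(name)) return 'String'
  return undefined
}

type Classification = { type: string; guessed: boolean }

// Method name first (stronger signal), receiver name second (weaker, flagged).
function classifyReceiver(receiver: string, method: string): Classification | undefined {
  const byMethod = UNAMBIGUOUS_METHODS.get(method)
  if (byMethod) return { type: byMethod, guessed: false }
  const byName = guessFromName(receiver)
  return byName ? { type: byName, guessed: true } : undefined
}

// Build a prototype primordial name, e.g. ('Array', 'map') -> 'ArrayPrototypeMap'.
function sketchPrototypeName(type: string, method: string): string {
  return `${type}Prototype${method.charAt(0).toUpperCase()}${method.slice(1)}`
}
```

A call such as `x.toUpperCase()` classifies as String without consulting the receiver name, whereas `arr.map(...)` classifies as Array only through the weaker name guess and is therefore marked as guessed.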
## State file format

```json
{
  "updated": "2026-04-22T20:10:19.200Z",
  "targets": {
    "socket-cli": {
      "coverage": [
        { "primordial": "ObjectKeys", "count": 142 },
        { "primordial": "ArrayPrototypeMap", "count": 86 }
      ],
      "gaps": [{ "primordial": "WeakRefPrototypeDeref", "count": 3 }]
    }
  }
}
```

Default location is `<cwd>/.prim-state.json`; override with `--state`.
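
As an illustration of consuming this format, the sketch below totals the counts per target. The interfaces are inferred from the sample above and may not cover every field the tool writes; `summarize` and `totalCount` are hypothetical helpers, not part of `prim`.

```typescript
// Shapes inferred from the sample state file; treat them as assumptions.
interface StateEntry {
  primordial: string
  count: number
}
interface TargetState {
  coverage: StateEntry[]
  gaps: StateEntry[]
}
interface PrimState {
  updated: string
  targets: Record<string, TargetState>
}

// Sum the per-primordial counts in one list.
function totalCount(entries: readonly StateEntry[]): number {
  return entries.reduce((sum, entry) => sum + entry.count, 0)
}

// Reduce each target to two numbers: migratable call sites and gap call sites.
function summarize(state: PrimState): Record<string, { covered: number; gaps: number }> {
  const summary: Record<string, { covered: number; gaps: number }> = {}
  for (const [name, target] of Object.entries(state.targets)) {
    summary[name] = {
      covered: totalCount(target.coverage),
      gaps: totalCount(target.gaps),
    }
  }
  return summary
}
```

For the sample above, `socket-cli` would summarize to 228 covered call sites (142 + 86) and 3 gap call sites. Comparing these totals between snapshots is one simple way to track progress.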

tools/prim/bin/prim.mts

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
#!/usr/bin/env node
import { runCli } from '../src/cli.mts'
runCli(process.argv.slice(2))

tools/prim/package.json

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
{
  "name": "prim",
  "version": "0.1.0",
  "description": "Audit a project's bundled output for primordials coverage and gaps against @socketsecurity/lib/primordials, and codemod source to use them",
  "private": true,
  "type": "module",
  "bin": {
    "prim": "./bin/prim.mts"
  },
  "main": "./src/index.mts",
  "exports": {
    ".": "./src/index.mts"
  },
  "files": [
    "bin/",
    "src/",
    "README.md"
  ],
  "dependencies": {
    "acorn-wasm": "workspace:*"
  },
  "engines": {
    "node": ">=22.18"
  },
  "license": "MIT"
}

tools/prim/src/audit.mts

Lines changed: 203 additions & 0 deletions
@@ -0,0 +1,203 @@
/**
 * @fileoverview Walk a directory of bundled JavaScript and emit
 * findings: every site where a primordial would (or already does)
 * apply.
 *
 * Each finding records:
 * - The primordial that maps to the call site (e.g. `ArrayPrototypeMap`).
 * - Whether that primordial is currently exported from socket-lib
 *   (`covered`) or not yet (`gap`).
 * - File / line / column / source pattern for human inspection.
 */

import { readdirSync, readFileSync, statSync } from 'node:fs'
import path from 'node:path'

import { simple } from 'acorn-wasm'

import {
  TRACKED_GLOBALS,
  UNAMBIGUOUS_PROTOTYPE_METHODS,
  ctorPrimordialName,
  guessReceiverType,
  prototypePrimordialName,
  staticPrimordialName,
} from './globals.mts'

/**
 * @typedef {Object} Finding
 * @property {string} primordial Name of the matching primordial.
 * @property {string} pattern Source-level pattern, e.g. `Object.keys(...)`.
 * @property {string} file Path relative to the target root.
 * @property {number} line
 * @property {number} column
 * @property {'covered'|'gap'} kind Whether the primordial exists today.
 */

/**
 * @param {Object} opts
 * @param {string} opts.targetRoot
 * @param {string} opts.scanDir Directory to walk (e.g. `dist`).
 * @param {Set<string>} opts.exported Currently-exported primordials.
 * @param {string[]} [opts.skipDirs] Directories to skip during walk.
 * @param {string[]} [opts.skipFiles] Files to skip (basename match).
 * @returns {Finding[]}
 */
export function auditDirectory({
  targetRoot,
  scanDir,
  exported,
  skipDirs = ['external'],
  skipFiles = ['primordials.js', 'primordials.mjs', 'primordials.cjs'],
}) {
  const findings = []
  const seen = new Set()

  function record(file, loc, pattern, primordial) {
    const line = loc?.line ?? 0
    const column = (loc?.column ?? 0) + 1
    const dedupKey = `${file}:${line}:${column}:${primordial}`
    if (seen.has(dedupKey)) {
      return
    }
    seen.add(dedupKey)
    findings.push({
      primordial,
      pattern,
      file,
      line,
      column,
      kind: exported.has(primordial) ? 'covered' : 'gap',
    })
  }

  const visitors = {
    NewExpression(node) {
      if (
        node.callee?.type !== 'Identifier' ||
        !TRACKED_GLOBALS.has(node.callee.name)
      ) {
        return
      }
      record(
        node._relPath,
        node.loc?.start,
        `new ${node.callee.name}(...)`,
        ctorPrimordialName(node.callee.name),
      )
    },
    CallExpression(node) {
      if (node.callee?.type !== 'MemberExpression') {
        return
      }
      const { object, property } = node.callee
      if (!object || !property || property.type !== 'Identifier') {
        return
      }
      if (object.type === 'Identifier' && TRACKED_GLOBALS.has(object.name)) {
        record(
          node._relPath,
          node.loc?.start,
          `${object.name}.${property.name}(...)`,
          staticPrimordialName(object.name, property.name),
        )
        return
      }
      if (object.type === 'Identifier') {
        // Strongest signal: the method name itself maps to one type
        // unambiguously (e.g. `.toUpperCase()` → String only,
        // `.getTime()` → Date only).
        const methodType = UNAMBIGUOUS_PROTOTYPE_METHODS.get(property.name)
        if (methodType) {
          record(
            node._relPath,
            node.loc?.start,
            `${object.name}.${property.name}(...) [method: ${methodType}]`,
            prototypePrimordialName(methodType, property.name),
          )
          return
        }
        // Weaker signal: guess the receiver's type from its name.
        const guess = guessReceiverType(object.name)
        if (!guess) {
          return
        }
        record(
          node._relPath,
          node.loc?.start,
          `${object.name}.${property.name}(...) [guessed: ${guess}]`,
          prototypePrimordialName(guess, property.name),
        )
      }
    },
    MemberExpression(node) {
      if (
        node.computed ||
        node.object?.type !== 'Identifier' ||
        !TRACKED_GLOBALS.has(node.object.name) ||
        !node.property?.name
      ) {
        return
      }
      const propName = node.property.name
      if (propName[0] !== propName[0].toLowerCase()) {
        return
      }
      record(
        node._relPath,
        node.loc?.start,
        `${node.object.name}.${propName}`,
        staticPrimordialName(node.object.name, propName),
      )
    },
  }

  function auditFile(absPath, relPath) {
    const src = readFileSync(absPath, 'utf8')
    try {
      // Inject relPath onto every visited node by wrapping the visitors.
      // acorn-wasm's `simple` doesn't pass extra context, so we attach
      // it inside the file walker once.
      const wrapped = {}
      for (const [name, fn] of Object.entries(visitors)) {
        wrapped[name] = node => {
          node._relPath = relPath
          fn(node)
        }
      }
      simple(src, wrapped, {
        ecmaVersion: 'latest',
        sourceType: 'module',
        locations: true,
        allowImportExportEverywhere: true,
        allowAwaitOutsideFunction: true,
        allowHashBang: true,
      })
    } catch {
      // File didn't parse; skip silently. Lint/type pipelines catch
      // syntax errors elsewhere.
    }
  }

  function* walkDir(dir) {
    for (const entry of readdirSync(dir)) {
      if (skipDirs.includes(entry) || skipFiles.includes(entry)) {
        continue
      }
      const abs = path.join(dir, entry)
      const stat = statSync(abs)
      if (stat.isDirectory()) {
        yield* walkDir(abs)
      } else if (entry.endsWith('.js') || entry.endsWith('.mjs')) {
        yield abs
      }
    }
  }

  for (const abs of walkDir(scanDir)) {
    const rel = path.relative(targetRoot, abs)
    auditFile(abs, rel)
  }

  return findings
}
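
The findings returned by `auditDirectory` are per call site, while the state file stores per-primordial counts. One way to bridge the two is sketched below. `tallyFindings` is a hypothetical helper (the commit's own aggregation code is not shown in this excerpt), and the `Finding` interface restates the JSDoc typedef above.

```typescript
// Mirrors the Finding typedef documented in audit.mts.
interface Finding {
  primordial: string
  pattern: string
  file: string
  line: number
  column: number
  kind: 'covered' | 'gap'
}

// Count findings of one kind per primordial, most frequent first.
// Ties are broken alphabetically so the output is stable across runs.
function tallyFindings(
  findings: readonly Finding[],
  kind: Finding['kind'],
): { primordial: string; count: number }[] {
  const counts = new Map<string, number>()
  for (const finding of findings) {
    if (finding.kind !== kind) continue
    counts.set(finding.primordial, (counts.get(finding.primordial) ?? 0) + 1)
  }
  return Array.from(counts, ([primordial, count]) => ({ primordial, count })).sort(
    (a, b) => b.count - a.count || a.primordial.localeCompare(b.primordial),
  )
}
```

Calling it once with `'covered'` and once with `'gap'` yields two lists in the same `{ primordial, count }` shape the README shows for the state file.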
