Skip to content

fix(embed): install @huggingface/transformers into codegraph's host node_modules#1178

Merged
carlos-alm merged 5 commits into
mainfrom
fix/1175-embed-install-cwd
May 20, 2026
Merged

fix(embed): install @huggingface/transformers into codegraph's host node_modules#1178
carlos-alm merged 5 commits into
mainfrom
fix/1175-embed-install-cwd

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

Summary

embed's lazy install calls execFileSync('npm', ['install', ...]) with no cwd. Default cwd = process.cwd(), which is whatever directory the user invoked codegraph from — typically not the directory where codegraph itself is installed.

The install lands in the user's project node_modules (and silently grows a @huggingface/transformers dep there), but the subsequent await import('@huggingface/transformers') resolves from codegraph's own package location, where the dep is missing. The user sees:

[ENGINE_UNAVAILABLE]: @huggingface/transformers was installed but failed to load. Please check your environment.

Pin cwd to the host directory that contains @optave/codegraph (resolved via _require.resolve('@optave/codegraph/package.json') then four path.dirname hops). The install now lands in the same node_modules that contains codegraph, so the dynamic import resolves correctly without polluting the user's working directory.

Falls back to cwd: undefined (≡ process.cwd()) when @optave/codegraph can't be resolved — typically a source-of-truth dev checkout — so the existing test suite (which mocks the import to fail) continues to pass.

Found during

Dogfooding v3.10.1-dev.80 — see #1175.

Test plan

  • npx vitest run tests/unit/prompt-install.test.ts — all 5 existing tests pass with the change in place
  • Reproduced original bug from a separate dogfood directory and verified the fix loads transformers correctly

…ode_modules

`embed`'s lazy install calls `execFileSync('npm', ['install', ...])` with
no `cwd`. Default cwd = `process.cwd()`, which is whatever directory the
user invoked codegraph from — typically NOT the directory where codegraph
itself is installed.

The resulting install lands in the user's project node_modules, but the
subsequent `await import('@huggingface/transformers')` resolves from
codegraph's own location, where the dep is missing. The user sees:

    [ENGINE_UNAVAILABLE]: @huggingface/transformers was installed
    but failed to load. Please check your environment.

…while their project's node_modules silently gains an unwanted dep.

Pin `cwd` to the host directory that contains @optave/codegraph (resolved
via `_require.resolve('@optave/codegraph/package.json')` then four
`path.dirname` hops). The install now lands in the same node_modules that
contains codegraph, so the dynamic import resolves correctly without
polluting the user's working directory.

Falls back to `cwd: undefined` (= process.cwd()) when @optave/codegraph
can't be resolved — typically a source-of-truth dev checkout — so the
existing test suite (which mocks the import to fail) continues to pass.

Closes #1175
@claude
Copy link
Copy Markdown

claude Bot commented May 20, 2026

Claude finished @carlos-alm's task —— View job


Code Review Complete

Reviewing PR #1178: fix(embed): install @huggingface/transformers into codegraph's host node_modules

Review Checklist:

  • Root cause analysis - understand the problem being solved
  • Read and analyze the changed file
  • Check for configuration changes (high scrutiny)
  • Verify test integrity
  • Evaluate code quality and security
  • Rate against backlog compliance criteria
  • Provide final assessment and recommendation

Root Cause Analysis

Problem Being Solved: The embed feature's lazy install mechanism was installing @huggingface/transformers into the user's working directory (process.cwd()) instead of codegraph's own node_modules. This caused the subsequent dynamic import to fail because Node.js module resolution looks for packages relative to codegraph's installation path, not the user's working directory. Users would see: [ENGINE_UNAVAILABLE]: @huggingface/transformers was installed but failed to load

Why This Approach: The fix correctly identifies that execFileSync needs an explicit cwd parameter to control where npm installs packages. The solution uses _require.resolve('@optave/codegraph/package.json') with path navigation to find the host directory containing codegraph's node_modules, ensuring the package is installed where the dynamic import can find it.

Risk Assessment: Low risk. This is a focused bug fix that doesn't change API contracts or core functionality. The fallback mechanism preserves existing behavior for development environments.

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies added
  • Foundation-aligned: ✓ — Fixes a legitimate infrastructure bug without violating any principles
  • Problem-fit: 3 — Useful quality-of-life fix for embedding functionality, but doesn't directly address core "lost AI" problem
  • Breaking: No — purely internal behavior change, no API changes
  • Tier: 1

Code Quality Assessment

Excellent implementation quality:

  1. Root cause fix, not workaround: The fix addresses the actual problem (wrong install location) rather than masking symptoms

  2. Robust path resolution logic (lines 23-34):

    const pkgJsonPath = _require.resolve('@optave/codegraph/package.json');
    // dirname x4: package.json → codegraph → @optave → node_modules → <host>
    return path.dirname(path.dirname(path.dirname(path.dirname(pkgJsonPath))));

    Well-documented path navigation with clear comments explaining each hop

  3. Graceful fallback for development: Returns undefined when @optave/codegraph can't be resolved, preserving existing behavior for source checkouts and testing

  4. Consistent application: The installCwd is applied to both TTY and non-TTY code paths

Security considerations: ✅ Safe

  • Uses _require.resolve() which is secure and doesn't execute code
  • Path manipulation uses path.dirname() which is safe from traversal attacks
  • No additional command injection vectors introduced

Edge cases handled:

  • Development/testing environments (graceful fallback)
  • Both interactive and CI install flows
  • Package resolution failures (try/catch around _require.resolve)

Test Integrity

The existing test suite remains valid because:

  • Tests mock @huggingface/transformers import behavior, not the install mechanism
  • The fallback cwd: undefined preserves test environment behavior
  • According to PR description, all 5 existing tests pass

Note: Cannot verify test execution due to tool restrictions, but the logic is sound.

Critical Concerns

None identified. This is a clean, well-implemented bug fix.

Minor observation: The function name resolveNpmInstallCwd() clearly communicates purpose and the extensive comments make the path navigation logic easy to understand.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: This is a well-implemented fix for a legitimate bug. The solution correctly addresses the root cause, handles edge cases gracefully, maintains backward compatibility for development environments, and includes comprehensive documentation. The only reason it doesn't get 5 stars is that it's addressing a quality-of-life issue rather than a critical functional bug.

Recommendation: Merge this PR. It fixes a real user-facing issue where the embedding feature would fail after appearing to install successfully, which is confusing UX. The implementation is robust and safe.


@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 20, 2026

Greptile Summary

This PR fixes a bug where npm install during lazy @huggingface/transformers loading ran in the wrong directory (process.cwd() instead of codegraph's own host directory), causing the subsequent dynamic import() to fail with ENGINE_UNAVAILABLE. The fix introduces resolveNpmInstallCwd() which walks four path.dirname hops up from @optave/codegraph/package.json to land on the host directory that actually owns codegraph's node_modules, with a safe fallback to undefined when the package can't be resolved (dev/source-checkout scenario).

  • resolveNpmInstallCwd() is exported and covered by two dedicated unit tests — one verifying the four-hop path computation, one verifying the undefined fallback on resolution failure.
  • Both the TTY and non-TTY execFileSync call sites in promptInstall now receive cwd: installCwd and --no-save, so no package.json in the host directory is mutated.

Confidence Score: 5/5

Safe to merge — the change is narrowly scoped to pinning the npm install working directory and does not touch any embedding, indexing, or query logic.

The root-cause analysis is correct: without a pinned cwd, npm installs the package in the wrong location and the dynamic import consistently fails. The four-hop dirname computation is verified by a dedicated unit test, the undefined fallback keeps the dev/source-checkout path working, and both install branches (TTY and non-TTY) now pass --no-save and cwd symmetrically. No regressions are introduced.

No files require special attention.

Important Files Changed

Filename Overview
src/domain/search/models.ts Adds resolveNpmInstallCwd() to pin npm install cwd to the directory containing @optave/codegraph's node_modules; both TTY and non-TTY install paths now pass cwd and --no-save correctly.
tests/unit/prompt-install.test.ts Adds two dedicated unit tests for resolveNpmInstallCwd (happy path and error/fallback path); existing five tests continue to pass unchanged.
generated/dogfood/DOGFOOD_REPORT_v3.10.1-dev.80.md Generated dogfood report documenting the reproduction of the original bug and verification of the fix on v3.10.1-dev.80; no code changes.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[embed / loadTransformers called] --> B{import transformers}
    B -- success --> C[Return module]
    B -- fails --> D[promptInstall called]
    D --> E[resolveNpmInstallCwd]
    E --> F{resolve codegraph package.json}
    F -- resolves --> G["4x path.dirname → host dir"]
    F -- throws --> H["installCwd = undefined (process.cwd fallback)"]
    G --> I[installCwd = host dir]
    I --> J{process.stdin.isTTY?}
    H --> J
    J -- no TTY --> K["execFileSync npm install --no-save cwd=installCwd"]
    J -- TTY --> L[Prompt user y/N]
    L -- declined --> M[throw EngineError]
    L -- accepted --> K
    K -- success --> N[retry dynamic import]
    K -- fails --> M
    N -- success --> C
    N -- fails --> O[throw ENGINE_UNAVAILABLE]
Loading

Reviews (4): Last reviewed commit: "Merge branch 'main' into fix/1175-embed-..." | Re-trigger Greptile

Comment thread src/domain/search/models.ts Outdated
Comment on lines 163 to 167
execFileSync(NPM_BIN, ['install', packageName], {
stdio: 'inherit',
timeout: 300_000,
cwd: installCwd,
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 The interactive (TTY) install call is missing --no-save. Now that cwd is pinned to codegraph's host directory, a user who confirms the prompt will silently add @huggingface/transformers to the package.json sitting in that host directory (e.g., the project root when codegraph is a local dep, or the npm global prefix). The non-TTY path on line 141 already passes --no-save correctly.

Suggested change
execFileSync(NPM_BIN, ['install', packageName], {
stdio: 'inherit',
timeout: 300_000,
cwd: installCwd,
});
execFileSync(NPM_BIN, ['install', '--no-save', packageName], {
stdio: 'inherit',
timeout: 300_000,
cwd: installCwd,
});

Fix in Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 15f11ed — TTY install path now passes --no-save and pins cwd: installCwd symmetrically with the non-TTY path.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 20, 2026

Codegraph Impact Analysis

2 functions changed4 callers affected across 1 files

  • resolveNpmInstallCwd in src/domain/search/models.ts:25 (3 transitive callers)
  • promptInstall in src/domain/search/models.ts:138 (3 transitive callers)

The non-TTY auto-install path already passes `--no-save`. The TTY path did
not, which was fine when `cwd` defaulted to `process.cwd()` because that
already polluted the user's working dir (a separate, original bug).

Now that `cwd` is pinned to codegraph's host directory, omitting
`--no-save` would silently add `@huggingface/transformers` to the
`package.json` sitting next to codegraph's `node_modules` (the user's
project root when codegraph is a local dep, or the npm global prefix when
global-installed). Match the non-TTY path so installs stay ephemeral
regardless of code path.

Updates the corresponding assertion in tests/unit/prompt-install.test.ts.
@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai addressed the TTY install --no-save asymmetry you flagged — pushed 15f11ed.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 20, 2026

The fix looks correct. Both the TTY and non-TTY paths now consistently pass --no-save and cwd: installCwd. The commit message clearly explains why the omission was previously masked by the original cwd bug.

The change is minimal and targeted, the test assertion was updated to match, and the symmetry concern is fully resolved. LGTM. ✓

Mock createRequire so resolve returns a synthetic scoped package.json
path, then assert the 4 dirname hops land at the host directory.
Without this, only the catch-to-undefined branch was exercised and a
miscounted dirname would pass CI silently.

Closes follow-up suggestion from #1178 review.
@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai addressed the happy-path coverage gap you flagged in the review summary — pushed 2c96d78.

Added two unit tests in tests/unit/prompt-install.test.ts that mock node:module so createRequire().resolve('@optave/codegraph/package.json') returns a synthetic scoped path. The first asserts the four dirname hops land at the host directory; the second asserts the catch → undefined fallback when resolution throws. resolveNpmInstallCwd is now exported with an @internal tag (kept out of the search barrel).

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 20, 2026

The coverage additions look solid. Both branches of resolveNpmInstallCwd are now exercised:

  • Happy pathvi.doMock('node:module') intercepts the module-level createRequire call (since vi.resetModules() + vi.doMock + dynamic import forces a fresh module evaluation), the synthetic scoped path /host/node_modules/@optave/codegraph/package.json correctly resolves to /host after 4 dirname hops, and the assertion confirms it.
  • Fallback — the resolve throws, the catch swallows it, and undefined is returned.

One minor note: the _require instance is created at module load time (const _require = createRequire(import.meta.url)), so the mock needs to land before import() runs — which it does here because vi.doMock + vi.resetModules + dynamic import is the correct sequencing. Looks good.

The bonus --no-save fix in the interactive branch (line 162) is also the right call — keeping the TTY and non-TTY paths consistent.

This is ready to merge from a testing standpoint. LGTM.

@carlos-alm carlos-alm merged commit 0f9f0ef into main May 20, 2026
21 checks passed
@carlos-alm carlos-alm deleted the fix/1175-embed-install-cwd branch May 20, 2026 22:53
@github-actions github-actions Bot locked and limited conversation to collaborators May 20, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant