Feat/elixir support #228
Open
oberernst wants to merge 8 commits into
Adds 'elixir' to the Language union, .ex/.exs to EXTENSION_MAP, and wires tree-sitter-elixir.wasm (from the tree-sitter-wasms package) into WASM_GRAMMAR_FILES and getLanguageDisplayName. Also extends DEFAULT_CONFIG.include with **/*.ex and **/*.exs.
After this commit .ex/.exs files are recognised and the grammar loads, but the default visitor emits no nodes for Elixir AST shapes; phase 2 adds the dispatcher.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds src/extraction/languages/elixir.ts with a visitNode hook that dispatches on the identifier text of `call` nodes, since tree-sitter-elixir represents every keyword form (defmodule, def, alias, if, case) as a call whose first child is an identifier.
Implements:
- defmodule Foo.Bar do … end → module node, name "Foo.Bar"
- def name(args) → function "name/arity", public
- defp name(args) → function "name/arity", private
- defmacro / defmacrop → function with arity
- Multi-clause defs merged → one node per name/arity per module
- @moduledoc "…" → attached to enclosing module's docstring
Out of scope (later phases): @doc, @behaviour, @callback, @spec, alias/import/require/use, defstruct, defprotocol/defimpl, and call-site extraction. Those are stubbed by swallowing the relevant call/attribute nodes so the default walker doesn't misinterpret their operands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
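The dispatch described in this commit can be sketched roughly as follows. This is not the PR's code: `SketchNode`, `Extracted`, and `classifyCall` are hypothetical names, the node shape is a simplified stand-in for a tree-sitter node, and treating `defmacrop` as private is an assumption the commit message does not state.

```typescript
// Hypothetical, simplified stand-in for a tree-sitter node (not the real API).
interface SketchNode {
  type: string;
  text: string;
  namedChildren: SketchNode[];
}

type Visibility = "public" | "private";

interface Extracted {
  kind: "module" | "function";
  name: string;
  visibility?: Visibility;
}

// Keyword forms that define functions. Visibility of macros is an assumption.
const DEF_FORMS = new Map<string, Visibility>([
  ["def", "public"],
  ["defp", "private"],
  ["defmacro", "public"],
  ["defmacrop", "private"],
]);

// Every keyword form is a `call` whose first named child is an identifier,
// so the identifier text decides what kind of node to emit.
function classifyCall(node: SketchNode): Extracted | null {
  if (node.type !== "call") return null;
  const [target, args] = node.namedChildren;
  if (!target || target.type !== "identifier" || !args) return null;

  if (target.text === "defmodule") {
    const alias = args.namedChildren[0];
    return alias ? { kind: "module", name: alias.text } : null;
  }

  const visibility = DEF_FORMS.get(target.text);
  if (visibility === undefined) return null;

  const head = args.namedChildren[0];
  if (!head) return null;
  if (head.type === "call") {
    // `def name(a, b)`: the head is itself a call; its argument count is the arity.
    const [fnName, fnArgs] = head.namedChildren;
    if (!fnName) return null;
    const arity = fnArgs ? fnArgs.namedChildren.length : 0;
    return { kind: "function", name: `${fnName.text}/${arity}`, visibility };
  }
  // `def name` without parentheses: arity 0.
  return { kind: "function", name: `${head.text}/0`, visibility };
}
```

Merging multi-clause defs would then amount to keying emitted function nodes by module plus `name/arity` and reusing an existing entry when one is found.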
Each directive produces one or more `import` nodes plus an
`imports`-kind unresolved reference from the enclosing module. The
ReferenceResolver will turn those into `imports` edges once it can
match the dotted module name to a known module node.
The full directive text lives on the import node's signature
(e.g. "use Phoenix.LiveView, layout: {…}"), so the mechanism is
queryable without adding a separate metadata field.
`alias Foo.{Bar, Baz}` expansion produces two import nodes named
`Foo.Bar` and `Foo.Baz`, each positioned on the matching child.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
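As an illustration of the `alias Foo.{Bar, Baz}` expansion described above, here is a minimal text-based sketch. The actual extractor presumably walks AST children (which is how it can position each import on the matching child); `expandMultiAlias` and its regular expression are hypothetical and only show the naming rule.

```typescript
// Expands the target of a multi-alias directive into fully-qualified names.
// Hypothetical helper: operates on text, whereas the PR works on AST nodes.
function expandMultiAlias(target: string): string[] {
  const trimmed = target.trim();
  // Matches `Prefix.{A, B, ...}`; anything else is returned unchanged.
  const match = /^([A-Z][\w.]*)\.\{([^}]*)\}$/.exec(trimmed);
  if (!match) return [trimmed];
  const [, prefix = "", members = ""] = match;
  return members
    .split(",")
    .map((member) => member.trim())
    .filter((member) => member.length > 0)
    .map((member) => `${prefix}.${member}`);
}
```

Under these assumptions, `expandMultiAlias("Foo.{Bar, Baz}")` yields the two names `Foo.Bar` and `Foo.Baz`, one per import node.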
Inside `def`/`defp` bodies, emit a `calls`-kind unresolved reference for every user call site. Resolves the callee name from the `call` node's target child:
  helper(a, b) → "helper/2"
  String.upcase(x) → "String.upcase/1"
  x |> String.upcase → "String.upcase/1" (pipe LHS counts as arg 1)
Pipe detection reads the source slice between the binary_operator's left and right children (tree-sitter-elixir doesn't expose the operator as a field).
Bodies are now visited via `ctx.visitNode(body)` rather than iterating `body.namedChildren`: for inline `do: expr` clauses the body itself is the call we need to dispatch on.
Multi-clause merging now pushes the existing function node onto scope before walking later clause bodies so nested calls anchor to the right caller.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
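The naming and pipe-detection rules above might look like this in isolation. `CallSite`, `isPipe`, and `calleeRef` are hypothetical names, and the offsets are plain string indices rather than real tree-sitter positions.

```typescript
// Hypothetical description of a call site once the target has been read.
interface CallSite {
  callee: string; // e.g. "helper" or "String.upcase"
  argCount: number; // explicit arguments written at the call site
  piped: boolean; // true when the call is the right-hand side of `|>`
}

// The operator is not exposed as a field, so read the source text between
// the end of the left operand and the start of the right operand.
function isPipe(source: string, leftEnd: number, rightStart: number): boolean {
  return source.slice(leftEnd, rightStart).trim() === "|>";
}

// The pipe's left-hand side counts as the first argument of the callee.
function calleeRef(site: CallSite): string {
  const arity = site.argCount + (site.piped ? 1 : 0);
  return `${site.callee}/${arity}`;
}
```

For `x |> String.upcase`, the slice between the operands is ` |> `, so the call is treated as piped and named `String.upcase/1`.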
- `defstruct [:a, :b]` (and `defexception`) → struct node named after
the enclosing module, with one `field` child per atom or pair key.
- `defprotocol Sizable do def size(thing) end` → protocol node plus
abstract `function` nodes for each bodyless `def` clause.
- `defimpl Sizable, for: List do … end` → module node named
"Sizable.List" plus an `implements`-kind unresolved reference to
`Sizable`. Function bodies inside are extracted normally.
- `@behaviour Foo` → `implements`-kind unresolved reference from the
enclosing module to `Foo`.
`@doc`, `@callback`, and `@spec` are still swallowed by the attribute
handler so the default walker doesn't treat their operand call as a
user call site; they get proper handling in a later phase.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Elixir's control-flow keywords (case, cond, if, with, for, fn, …) and metaprogramming forms (quote, unquote, raise, throw) syntactically parse as `call` nodes: `case foo do … end` is literally a call to `case`. The earlier call-site walker emitted spurious `case/N`, `if/N`, etc. references. NON_CALL_FORMS now skips them at emit time.
Measured: 50 nodes (1 module, 1 struct + 5 fields, 14 imports, 28 functions), 220 calls captured, 26 ms extraction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
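A minimal sketch of the emit-time filter, assuming the set holds bare identifier names. Only the forms named in the commit message are listed; the real NON_CALL_FORMS is presumably longer, and `userCallRefs` is a hypothetical helper.

```typescript
// Forms that parse as `call` nodes but are not user calls.
// Incomplete by design: only the names mentioned in the commit message.
const NON_CALL_FORMS: ReadonlySet<string> = new Set([
  "case", "cond", "if", "with", "for", "fn",
  "quote", "unquote", "raise", "throw",
]);

interface SeenCall {
  name: string;
  arity: number;
}

// Drops keyword forms and formats the rest as `name/arity` references.
function userCallRefs(calls: SeenCall[]): string[] {
  return calls
    .filter((call) => !NON_CALL_FORMS.has(call.name))
    .map((call) => `${call.name}/${call.arity}`);
}
```

Filtering at emit time rather than at parse time keeps the walker descending into `case` and `if` bodies, so calls nested inside them are still captured.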
stdlib filtering, docs
Round out Elixir support so cross-module queries, callers, and impact analysis work end-to-end on real Elixir/Phoenix projects.
Extraction:
- @doc "…" now sets the next def's docstring; @spec sets its signature; @callback emits an abstract function inside the enclosing module.
- alias Foo.Bar.Baz (with optional `as: Short`) records the binding in a per-module alias map. handleUserCall consults that map and attaches the fully-qualified expansion to the unresolved ref's `candidates` array, so `Baz.foo/2` is recoverable as `Foo.Bar.Baz.foo/2` even when Baz isn't a globally visible name.
Resolution:
- matchByQualifiedName bridges Elixir's `.`-joined call refs (`Foo.Bar.changeset/2`) to codegraph's `::`-joined qualifiedNames (`Foo.Bar::changeset/2`). Without this bridge, every cross-file Elixir call stayed unresolved because endsWith() couldn't see past the separator mismatch.
- The matcher now also tries each entry in ref.candidates; the alias-expansion target lights up here.
- hasAnyPossibleMatch pre-filter recognises Elixir-shape names (`A.B.C.func/N`) by checking the trailing `func/N` segment against knownNames; otherwise the pre-filter rejected legitimate refs before the bridge could see them.
- Built-in / stdlib filter: ELIXIR_STDLIB_PREFIXES skips Enum, String, Map, List, Phoenix, Ecto, Plug, Logger, etc. and Erlang modules `:lists.`, `:maps.`, `:ets.`. ELIXIR_KERNEL_FUNCS skips `is_nil/1`, `to_string/1`, `inspect/1`, and friends.
Config / docs:
- 'elixir' added to the validLanguages allowlist in src/config.ts (config-load would otherwise silently reject `elixir` as an invalid language value).
- .ex/.exs added to import-resolver.ts EXTENSION_RESOLUTION so any relative-path resolution code path that lands on Elixir behaves.
- README's "supported languages" tagline and table now list Elixir.
- The "Language Support" test that asserts which languages getSupportedLanguages() returns now includes pascal, scala, and elixir (alongside the existing entries).
Tests:
- 30 Elixir extraction tests (modules, defs, aliases, calls, structs, protocols, behaviour, @doc/@spec/@callback, alias-aware candidates).
- 2 Elixir resolution tests proving cross-file calls and `imports` edges resolve correctly through the Mix module-name convention.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
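The alias expansion and the `.` to `::` bridge described in this commit can be sketched together. This is an illustration under simplifying assumptions, not the PR's matcher: `expandAlias`, `toQualifiedName`, `resolveRef`, and the `UnresolvedRef` shape are hypothetical, and matching is reduced to exact lookup in a set of known qualified names.

```typescript
// Hypothetical shape of an unresolved call reference.
interface UnresolvedRef {
  name: string; // e.g. "Baz.foo/2"
  candidates?: string[]; // e.g. ["Foo.Bar.Baz.foo/2"]
}

// Rewrites the leading segment through the per-module alias map:
// "Baz.foo/2" with Baz => Foo.Bar.Baz becomes "Foo.Bar.Baz.foo/2".
function expandAlias(ref: string, aliases: Map<string, string>): string | null {
  const firstDot = ref.indexOf(".");
  if (firstDot <= 0) return null;
  const full = aliases.get(ref.slice(0, firstDot));
  return full === undefined ? null : `${full}${ref.slice(firstDot)}`;
}

// Bridges "Foo.Bar.changeset/2" to "Foo.Bar::changeset/2" by replacing
// only the last dot, which separates the module from the function.
function toQualifiedName(ref: string): string | null {
  const lastDot = ref.lastIndexOf(".");
  if (lastDot <= 0 || lastDot === ref.length - 1) return null;
  return `${ref.slice(0, lastDot)}::${ref.slice(lastDot + 1)}`;
}

// Tries the ref's own name first, then each alias-expanded candidate.
function resolveRef(ref: UnresolvedRef, known: ReadonlySet<string>): string | null {
  for (const name of [ref.name, ...(ref.candidates ?? [])]) {
    const bridged = toQualifiedName(name);
    if (bridged !== null && known.has(bridged)) return bridged;
  }
  return null;
}
```

With `Baz` aliased to `Foo.Bar.Baz`, the ref `Baz.foo/2` fails on its own name but resolves through its candidate to `Foo.Bar.Baz::foo/2`.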
Compared elixir.ts side-by-side with typescript.ts (canonical) and
scala.ts (closest peer with a custom visitNode). The project's
convention is sparse, terse comments only at non-obvious AST shapes
— no top-of-file docblocks, no section dividers, no JSDoc on
internal helpers.
- Drops the 15-line module docblock; the same context survives as a
5-line preamble.
- Removes section dividers (`// --- helpers ---`, etc.).
- Replaces JSDoc blocks on every helper with one-line WHY comments
where genuinely needed, or no comment at all where the function
name carries its own meaning.
- Trims tutorial-style explanations down to the load-bearing
sentence (state lifecycle for the buffers/aliasStack, the pipe
detection trick, the readonly-node mutation note).
- Deletes a dead for-loop in resolveDefBody that was a relic of an
earlier iteration of the body-resolution heuristic.
- Folds the small `nearestModuleId` walk that was duplicated in
handleAttribute branches and handleDefstruct into a helper.
Net: 305 deletions, 109 insertions. No behaviour change — all 30
extraction tests and 2 resolution tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
What it says on the tin: a pass at Elixir support in codegraph.