Skip to content

Commit d9e3c9b

Browse files
committed
fix(native): harden clojure extractor child iteration and document parity gaps
Address Greptile review feedback on PR #1097: - `find_first_symbol`/`find_second_symbol`: replace `?` with `match` so a `None` from `list_node.child(i)` skips and continues instead of aborting the whole scan, matching the JS extractor's `if (!child) continue;` semantics. Practically inert today (tree-sitter always returns `Some` for `i < child_count()`), but makes the invariant explicit and removes a fragile divergence. - `extract_clojure_params`: document the known `defmethod` dispatch-vector gap inherited from the JS extractor. - `handle_import_form`: document why top-level `(require 'some.ns)` and `(require '[some.ns :as s])` are silently dropped and point maintainers at the working `(ns ...)` path in `extract_ns_requires`.
1 parent 446c053 commit d9e3c9b

1 file changed

Lines changed: 34 additions & 2 deletions

File tree

crates/codegraph-core/src/extractors/clojure.rs

Lines changed: 34 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -120,9 +120,16 @@ fn handle_list_form(
120120

121121
/// Find the first `sym_lit` or `kwd_lit` child, skipping delimiters and metadata.
122122
/// Mirrors `findFirstSymbol` in the JS extractor.
123+
///
124+
/// A missing child at index `i < child_count()` is treated as "skip and continue"
125+
/// to match the JS counterpart (`if (!child) continue;`), rather than aborting
126+
/// the search via `?`.
123127
fn find_first_symbol<'a>(list_node: &Node<'a>) -> Option<Node<'a>> {
124128
for i in 0..list_node.child_count() {
125-
let child = list_node.child(i)?;
129+
let child = match list_node.child(i) {
130+
Some(c) => c,
131+
None => continue,
132+
};
126133
if is_delimiter_or_meta(child.kind()) {
127134
continue;
128135
}
@@ -136,10 +143,16 @@ fn find_first_symbol<'a>(list_node: &Node<'a>) -> Option<Node<'a>> {
136143

137144
/// Find the second `sym_lit` or `kwd_lit` child. Used to extract the bound
138145
/// name from forms like `(defn foo [...] ...)`.
146+
///
147+
/// Like `find_first_symbol`, a missing child is skipped (not propagated via `?`)
148+
/// to preserve parity with the JS extractor.
139149
fn find_second_symbol<'a>(list_node: &Node<'a>) -> Option<Node<'a>> {
140150
let mut count = 0;
141151
for i in 0..list_node.child_count() {
142-
let child = list_node.child(i)?;
152+
let child = match list_node.child(i) {
153+
Some(c) => c,
154+
None => continue,
155+
};
143156
if is_delimiter_or_meta(child.kind()) {
144157
continue;
145158
}
@@ -290,6 +303,14 @@ fn handle_defn_form(
290303
fn extract_clojure_params(defn_node: &Node, source: &[u8]) -> Vec<Definition> {
291304
let mut params = Vec::new();
292305
// First `vec_lit` child is the parameter vector `[x y z]`.
306+
//
307+
// Known limitation (parity with JS extractor): for `defmethod` forms like
308+
// `(defmethod foo [:a :b] [x] body)`, the dispatch vector `[:a :b]` is the
309+
// first `vec_lit` and the actual parameter vector `[x]` is silently
310+
// skipped because of the `break` below. The dispatch vector contributes
311+
// no `sym_lit` entries (its elements are `kwd_lit`), so `params` ends up
312+
// empty rather than wrong. Tracked as a future enhancement once
313+
// visibility/metadata fields land in `Definition`.
293314
for i in 0..defn_node.child_count() {
294315
let child = match defn_node.child(i) {
295316
Some(c) if c.kind() == "vec_lit" => c,
@@ -345,6 +366,17 @@ fn handle_defrecord(node: &Node, source: &[u8], symbols: &mut FileSymbols, kind:
345366
});
346367
}
347368

369+
/// Handle a top-level `(require ...)`, `(use ...)`, or `(import ...)` form.
370+
///
371+
/// Known limitation (parity with JS extractor): in real Clojure code these
372+
/// top-level forms almost always use a quoted symbol (`(require 'some.ns)`
373+
/// → `quoting_lit`) or a quoted vector (`(require '[some.ns :as s])`).
374+
/// `find_second_symbol` only matches `sym_lit` / `kwd_lit`, so those shapes
375+
/// return `None` and the import is silently dropped here. Imports inside
376+
/// `(ns ...)` declarations are still extracted correctly by
377+
/// `extract_ns_requires` — that path is the recommended one and covers
378+
/// real-world Clojure code, while this top-level fallback only handles the
379+
/// degenerate unquoted shape.
348380
fn handle_import_form(node: &Node, source: &[u8], symbols: &mut FileSymbols, keyword: &str) {
349381
let name_node = match find_second_symbol(node) {
350382
Some(n) => n,

0 commit comments

Comments
 (0)