Skip to content

Migrate LSP YAML parser to gotreesitter#302

Merged
kindermax merged 1 commit into
masterfrom
codex/adopt-gotreesitter-for-project
Mar 18, 2026
Merged

Migrate LSP YAML parser to gotreesitter#302
kindermax merged 1 commit into
masterfrom
codex/adopt-gotreesitter-for-project

Conversation

@kindermax
Copy link
Copy Markdown
Collaborator

@kindermax kindermax commented Mar 18, 2026

Summary

  • swap the LSP YAML parser from the CGO-based tree-sitter bindings to the pure-Go gotreesitter client, updating parser helpers and queries accordingly
  • drop CGO/system toolchain settings from release/Docker workflows and add the new dependency plus changelog entry

Testing

  • Not run (not requested)

Summary by Sourcery

Migrate the LSP YAML parsing logic to use the pure-Go gotreesitter client instead of CGO-based tree-sitter bindings, and remove associated CGO/toolchain requirements from builds and release workflows.

Enhancements:

  • Refactor LSP YAML parsing helpers and queries to use a shared gotreesitter-based parser and updated node/query APIs.

Build:

  • Remove CGO, compiler, and sysroot configuration from GoReleaser, Docker, and release workflows now that a C toolchain is no longer required.

Documentation:

  • Document the LSP YAML parser migration and removal of C toolchain requirements in the changelog.

@sourcery-ai
Copy link
Copy Markdown

sourcery-ai Bot commented Mar 18, 2026

Reviewer's Guide

Migrates the LSP YAML parsing logic from the CGO-based tree-sitter bindings to the pure-Go gotreesitter client and removes CGO/system toolchain requirements from builds, Docker image, and release workflows while updating dependencies and documentation accordingly.

Class diagram for updated LSP YAML parsing with gotreesitter

classDiagram
    class parser {
        +getPositionType(document *string, position lsp.Position) PositionType
        +inMixinsPosition(document *string, position lsp.Position) bool
        +inDependsPosition(document *string, position lsp.Position) bool
        +extractFilenameFromMixins(document *string, position lsp.Position) string
        +getCommands(document *string) []Command
        +getCurrentCommand(document *string, position lsp.Position) *Command
        +findCommand(document *string, commandName string) *Command
        +extractDependsValues(document *string) []string
    }

    class Command {
        +name string
        +position lsp.Position
    }

    class yamlLanguageVar {
        +yamlLanguage ts.Language
    }

    class parseYAMLDocumentFn {
        +parseYAMLDocument(document *string) (*ts.Tree, []byte, error)
    }

    class cursorHelpers {
        +isCursorWithinNode(node *ts.Node, pos lsp.Position) bool
        +isCursorWithinNodePoints(startPoint ts.Point, endPoint ts.Point, pos lsp.Position) bool
        +isCursorAtLine(node *ts.Node, pos lsp.Position) bool
    }

    class getLineFn {
        +getLine(document *string, line uint32) string
    }

    parser --> Command : uses
    parser --> yamlLanguageVar : uses
    parser --> parseYAMLDocumentFn : uses
    parser --> cursorHelpers : uses
    parser --> getLineFn : uses
    yamlLanguageVar --> ts : uses
    parseYAMLDocumentFn --> ts : uses
    cursorHelpers --> ts : uses
Loading

Flow diagram for LSP YAML parsing with gotreesitter

flowchart TD
    A[Receive_document_and_optional_position] --> B[Call_parseYAMLDocument]
    B --> C{Parse_error?}
    C -- yes --> Z[Return_default_value_or_nil]
    C -- no --> D[Obtain_tree_and_docBytes]
    D --> E[Get_root_node_tree.RootNode]
    E --> F{Root_is_nil?}
    F -- yes --> Z
    F -- no --> G[Create_query_ts.NewQuery_with_yamlLanguage]
    G --> H{Query_error?}
    H -- yes --> Z
    H -- no --> I[Execute_query_query.Exec_with_root_and_docBytes]
    I --> J[Iterate_matches_matches.NextMatch]
    J --> K{More_matches?}
    K -- no --> Z
    K -- yes --> L[Iterate_captures_in_match]
    L --> M{Capture_matches_target_name?}
    M -- no --> J
    M -- yes --> N[Use_cursor_helpers_isCursorWithinNode_or_isCursorAtLine]
    N --> O{Cursor_or_line_matches?}
    O -- no --> J
    O -- yes --> P[Extract_Text_or_position_from_capture]
    P --> Q{Caller_context}
    Q -- inMixinsPosition --> R[Return_true_if_mixins_key_under_cursor]
    Q -- inDependsPosition --> S[Return_true_if_depends_value_under_cursor]
    Q -- extractFilenameFromMixins --> T[Return_filename_string]
    Q -- getCommands --> U[Append_Command_to_slice]
    Q -- getCurrentCommand --> V[Return_matching_Command]
    Q -- findCommand --> W[Return_Command_matching_name]
    Q -- extractDependsValues --> X[Append_depends_value_to_slice]
    R --> J
    S --> J
    T --> J
    U --> J
    V --> J
    W --> J
    X --> J
    Z[End_method_returning_bool_Command_or_slice]
Loading

File-Level Changes

Change Details Files
Replace CGO-based tree-sitter YAML integration with pure-Go gotreesitter in LSP parser and tests.
  • Swap imports from github.com/tree-sitter/go-tree-sitter and tree-sitter-yaml bindings to github.com/odvcencio/gotreesitter and grammars
  • Introduce a shared yamlLanguage variable and a parseYAMLDocument helper to construct parsers and parse documents once per operation
  • Update node point accessors from StartPosition/EndPosition to StartPoint/EndPoint and adjust cursor-within-node logic to use native lsp.Position types
  • Migrate query construction and execution to gotreesitter APIs, replacing QueryCursor/Marches/Utf8Text/Kind/CaptureIndexForName with Exec/NextMatch/Text/Type(Name-based capture lookup)
  • Simplify root node handling with nil-checks and ensure tree resources are released via Release instead of Close
  • Update tests to use the gotreesitter ts alias instead of the old tree-sitter binding
internal/lsp/treesitter.go
internal/lsp/treesitter_test.go
Remove CGO and external C toolchain requirements from build and release configuration.
  • Drop PKG_CONFIG and CC/CXX environment variables from .goreleaser.yml build matrix entries
  • Stop installing gcc and remove CGO-related environment variables (CGO_ENABLED, CGO_CFLAGS) from Dockerfile builder stage
  • Remove CGO-related environment variables and sysroot volume mounts from GitHub release workflow docker invocations
.goreleaser.yml
Dockerfile
.github/workflows/release.yaml
Update module dependencies and documentation for the new parser implementation.
  • Add github.com/odvcencio/gotreesitter as a direct dependency and remove tree-sitter YAML and go-tree-sitter plus the indirect go-pointer dependency from go.mod
  • Refresh go.sum to reflect the new dependency set
  • Add a changelog entry documenting the switch to the pure-Go gotreesitter YAML parser and the removal of C toolchain requirements
go.mod
go.sum
docs/docs/changelog.md

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="internal/lsp/treesitter_test.go" line_range="7" />
<code_context>
 import (
 	"strings"

+	ts "github.com/odvcencio/gotreesitter"
+	"github.com/odvcencio/gotreesitter/grammars"
 	"github.com/tliron/commonlog"
</code_context>
<issue_to_address>
**suggestion (testing):** Add/extend tests to validate behaviour of YAML parsing helpers with gotreesitter

This change swaps the YAML parser and adds `parseYAMLDocument`, altering how nodes/queries are handled (points, text, types, named captures). To guard against regressions, please add or extend tests around the main helpers that rely on these behaviours:

- `inMixinsPosition`
- `inDependsPosition`
- `extractFilenameFromMixins`
- `getCommands` / `getCurrentCommand` / `findCommand`
- `extractDependsValues`

Ensure the tests cover both block and flow sequences, multiple items, and nested structures (e.g. several differently shaped commands). A small table-driven set of YAML snippets with expected positions/outputs should be enough to validate the new parser behaves like the old one.

Suggested implementation:

```golang
	"reflect"
	"testing"

	ts "github.com/odvcencio/gotreesitter"
	"github.com/odvcencio/gotreesitter/grammars"
	"github.com/tliron/commonlog"
	lsp "github.com/tliron/glsp/protocol_3_16"
)

var logger = commonlog.GetLogger("test")

// helper to parse a YAML document with the same parser implementation used in production
func mustParseYAMLDocument(t *testing.T, content string) *ts.Node {
	t.Helper()

	parser := ts.NewParser()
	err := parser.SetLanguage(grammars.YAML())
	if err != nil {
		t.Fatalf("failed to set YAML grammar: %v", err)
	}

	tree, err := parser.Parse([]byte(content), nil)
	if err != nil {
		t.Fatalf("failed to parse YAML: %v", err)
	}

	root := tree.RootNode()
	if root == nil {
		t.Fatalf("nil root node for YAML content")
	}

	return root
}

// helper to create an LSP position
func pos(line, character uint32) lsp.Position {
	return lsp.Position{
		Line:      line,
		Character: character,
	}
}

// Test inMixinsPosition with block and flow sequences and multiple items
func TestInMixinsPosition_BlockAndFlowSequences(t *testing.T) {
	tests := []struct {
		name     string
		content  string
		position lsp.Position
		want     bool
	}{
		{
			name: "block sequence - inside first mixin",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
      - ./other.yml
`,
			// position on "./common.yml"
			position: pos(4, 10),
			want:     true,
		},
		{
			name: "block sequence - inside second mixin",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
      - ./other.yml
`,
			// position on "./other.yml"
			position: pos(5, 10),
			want:     true,
		},
		{
			name: "block sequence - outside mixins key",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
      - ./other.yml
`,
			// somewhere on "tasks"
			position: pos(1, 1),
			want:     false,
		},
		{
			name: "flow sequence - inside first mixin",
			content: `
tasks:
  default:
    mixins: [./common.yml, ./other.yml]
`,
			position: pos(3, 15),
			want:     true,
		},
		{
			name: "flow sequence - between mixin items",
			content: `
tasks:
  default:
    mixins: [./common.yml, ./other.yml]
`,
			position: pos(3, 25),
			want:     true,
		},
		{
			name: "no mixins key",
			content: `
tasks:
  default:
    cmds:
      - echo "hello"
`,
			position: pos(3, 5),
			want:     false,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			root := mustParseYAMLDocument(t, tt.content)

			got := inMixinsPosition(root, tt.position)
			if got != tt.want {
				t.Fatalf("inMixinsPosition() = %v, want %v", got, tt.want)
			}
		})
	}
}

// Test inDependsPosition with block and flow sequences and nested tasks
func TestInDependsPosition_BlockFlowAndNested(t *testing.T) {
	tests := []struct {
		name     string
		content  string
		position lsp.Position
		want     bool
	}{
		{
			name: "block sequence - inside depends value",
			content: `
tasks:
  build:
    deps:
      - clean
      - compile
`,
			position: pos(4, 10),
			want:     true,
		},
		{
			name: "flow sequence - inside depends value",
			content: `
tasks:
  build:
    deps: [clean, compile]
`,
			position: pos(3, 15),
			want:     true,
		},
		{
			name: "nested task - depends for nested task",
			content: `
tasks:
  build:
    deps:
      - clean
  clean:
    cmds:
      - echo "clean"
`,
			position: pos(4, 10),
			want:     true,
		},
		{
			name: "outside depends key",
			content: `
tasks:
  build:
    cmds:
      - echo "build"
`,
			position: pos(3, 5),
			want:     false,
		},
		{
			name: "depends key but on key name, not value",
			content: `
tasks:
  build:
    deps:
      - clean
`,
			// on "deps" identifier
			position: pos(3, 4),
			want:     false,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			root := mustParseYAMLDocument(t, tt.content)

			got := inDependsPosition(root, tt.position)
			if got != tt.want {
				t.Fatalf("inDependsPosition() = %v, want %v", got, tt.want)
			}
		})
	}
}

// Test extractFilenameFromMixins over multiple structures
func TestExtractFilenameFromMixins(t *testing.T) {
	tests := []struct {
		name      string
		content   string
		position  lsp.Position
		wantFile  string
		wantFound bool
	}{
		{
			name: "block sequence - first element",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
      - ./other.yml
`,
			position:  pos(4, 12),
			wantFile:  "./common.yml",
			wantFound: true,
		},
		{
			name: "block sequence - second element",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
      - ./other.yml
`,
			position:  pos(5, 12),
			wantFile:  "./other.yml",
			wantFound: true,
		},
		{
			name: "flow sequence - first element",
			content: `
tasks:
  default:
    mixins: [./common.yml, ./other.yml]
`,
			position:  pos(3, 16),
			wantFile:  "./common.yml",
			wantFound: true,
		},
		{
			name: "flow sequence - second element",
			content: `
tasks:
  default:
    mixins: [./common.yml, ./other.yml]
`,
			position:  pos(3, 30),
			wantFile:  "./other.yml",
			wantFound: true,
		},
		{
			name: "outside mixins values",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
`,
			position:  pos(1, 1),
			wantFile:  "",
			wantFound: false,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			root := mustParseYAMLDocument(t, tt.content)

			gotFile, gotFound := extractFilenameFromMixins(root, tt.position)
			if gotFound != tt.wantFound {
				t.Fatalf("extractFilenameFromMixins() found = %v, want %v", gotFound, tt.wantFound)
			}
			if gotFile != tt.wantFile {
				t.Fatalf("extractFilenameFromMixins() file = %q, want %q", gotFile, tt.wantFile)
			}
		})
	}
}

// commandInfo is the minimal shape we expect from getCommands/findCommand/getCurrentCommand.
// Adapt the concrete type if the actual implementation uses a different name/fields.
type commandInfo struct {
	Name    string
	Depends []string
}

// Test getCommands, getCurrentCommand and findCommand with various command shapes
func TestCommandsHelpers_BlockAndFlowAndNested(t *testing.T) {
	content := `
tasks:
  default:
    cmds:
      - echo "default"
  build:
    deps:
      - clean
      - test
    cmds:
      - echo "build"
  test:
    deps: [lint]
    cmds:
      - echo "test"
  lint:
    cmds:
      - echo "lint"
`

	root := mustParseYAMLDocument(t, content)

	// getCommands should return all commands with their names and depends
	commands := getCommands(root)

	// we expect at least the commands defined above; the exact type may differ
	defaultCmd := findCommand(root, "default")
	if defaultCmd == nil {
		t.Fatalf("findCommand() did not return default command")
	}

	buildCmd := findCommand(root, "build")
	if buildCmd == nil {
		t.Fatalf("findCommand() did not return build command")
	}

	testCmd := findCommand(root, "test")
	if testCmd == nil {
		t.Fatalf("findCommand() did not return test command")
	}

	// getCurrentCommand: position within each command's body should map to that command
	tests := []struct {
		name     string
		position lsp.Position
		wantName string
	}{
		{
			name:     "inside default command",
			position: pos(4, 8),
			wantName: "default",
		},
		{
			name:     "inside build command",
			position: pos(8, 8),
			wantName: "build",
		},
		{
			name:     "inside test command",
			position: pos(13, 8),
			wantName: "test",
		},
		{
			name:     "inside lint command",
			position: pos(18, 8),
			wantName: "lint",
		},
	}

	if commands == nil {
		t.Fatalf("getCommands() returned nil")
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			cmd := getCurrentCommand(root, tt.position)
			if cmd == nil {
				t.Fatalf("getCurrentCommand() = nil, want command %q", tt.wantName)
			}

			name := cmd.Name
			if name != tt.wantName {
				t.Fatalf("getCurrentCommand().Name = %q, want %q", name, tt.wantName)
			}
		})
	}
}

// Test extractDependsValues for both block and flow sequences
func TestExtractDependsValues_BlockAndFlow(t *testing.T) {
	content := `
tasks:
  blockDeps:
    deps:
      - clean
      - build
    cmds:
      - echo "block"
  flowDeps:
    deps: [test, lint]
    cmds:
      - echo "flow"
  mixedDeps:
    deps:
      - deploy
    cmds:
      - echo "mixed"
`

	root := mustParseYAMLDocument(t, content)

	blockCmd := findCommand(root, "blockDeps")
	if blockCmd == nil {
		t.Fatalf("findCommand() did not return blockDeps")
	}
	flowCmd := findCommand(root, "flowDeps")
	if flowCmd == nil {
		t.Fatalf("findCommand() did not return flowDeps")
	}
	mixedCmd := findCommand(root, "mixedDeps")
	if mixedCmd == nil {
		t.Fatalf("findCommand() did not return mixedDeps")
	}

	tests := []struct {
		name     string
		command  interface{}
		wantDeps []string
	}{
		{
			name:     "block deps",
			command:  blockCmd,
			wantDeps: []string{"clean", "build"},
		},
		{
			name:     "flow deps",
			command:  flowCmd,
			wantDeps: []string{"test", "lint"},
		},
		{
			name:     "single dep",
			command:  mixedCmd,
			wantDeps: []string{"deploy"},
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			deps := extractDependsValues(root, tt.command)
			if !reflect.DeepEqual(deps, tt.wantDeps) {
				t.Fatalf("extractDependsValues() = %#v, want %#v", deps, tt.wantDeps)
			}
		})
	}
}

```

The new tests assume the following helper signatures and behaviours:

- `inMixinsPosition(root *ts.Node, position lsp.Position) bool`
- `inDependsPosition(root *ts.Node, position lsp.Position) bool`
- `extractFilenameFromMixins(root *ts.Node, position lsp.Position) (string, bool)`
- `getCommands(root *ts.Node) interface{}` (only checked for non-nil; adapt as needed)
- `getCurrentCommand(root *ts.Node, position lsp.Position) *commandInfo` (or another command type with a `Name` field)
- `findCommand(root *ts.Node, name string) *commandInfo` (or pointer to your concrete command type)
- `extractDependsValues(root *ts.Node, command interface{}) []string`

If your actual helpers have different signatures or types:

1. Adjust the tests to match the real types (e.g. replace `commandInfo` with your `command`/`task` type and update field accesses).
2. If `parseYAMLDocument` already exists as a shared helper, you can:
   - Remove the local `mustParseYAMLDocument` helper and reuse the existing one, or
   - Implement `mustParseYAMLDocument` as a thin wrapper around `parseYAMLDocument`.
3. The `position` values are based on the literal strings as written (0-based lines/columns). If your position mapping uses a different convention, tweak the `pos()` calls accordingly so the cursor falls on the intended nodes.
4. If `extractDependsValues` does not need the `root` node, drop that parameter from the tests and adjust the call site.
5. If `getCommands` returns a map or slice, update the tests to assert more detailed structure (e.g. command names, ranges) as appropriate for your implementation.

These tests are designed to validate consistent behaviour of the YAML helpers under the new `gotreesitter`-based parser, especially around block vs flow sequences, multiple items, and nested task structures. Adjust them to fit your concrete APIs while retaining the same coverage patterns.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

"reflect"
"testing"

ts "github.com/odvcencio/gotreesitter"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add/extend tests to validate behaviour of YAML parsing helpers with gotreesitter

This change swaps the YAML parser and adds parseYAMLDocument, altering how nodes/queries are handled (points, text, types, named captures). To guard against regressions, please add or extend tests around the main helpers that rely on these behaviours:

  • inMixinsPosition
  • inDependsPosition
  • extractFilenameFromMixins
  • getCommands / getCurrentCommand / findCommand
  • extractDependsValues

Ensure the tests cover both block and flow sequences, multiple items, and nested structures (e.g. several differently shaped commands). A small table-driven set of YAML snippets with expected positions/outputs should be enough to validate the new parser behaves like the old one.

Suggested implementation:

	"reflect"
	"testing"

	ts "github.com/odvcencio/gotreesitter"
	"github.com/odvcencio/gotreesitter/grammars"
	"github.com/tliron/commonlog"
	lsp "github.com/tliron/glsp/protocol_3_16"
)

var logger = commonlog.GetLogger("test")

// helper to parse a YAML document with the same parser implementation used in production
func mustParseYAMLDocument(t *testing.T, content string) *ts.Node {
	t.Helper()

	parser := ts.NewParser()
	err := parser.SetLanguage(grammars.YAML())
	if err != nil {
		t.Fatalf("failed to set YAML grammar: %v", err)
	}

	tree, err := parser.Parse([]byte(content), nil)
	if err != nil {
		t.Fatalf("failed to parse YAML: %v", err)
	}

	root := tree.RootNode()
	if root == nil {
		t.Fatalf("nil root node for YAML content")
	}

	return root
}

// helper to create an LSP position
func pos(line, character uint32) lsp.Position {
	return lsp.Position{
		Line:      line,
		Character: character,
	}
}

// Test inMixinsPosition with block and flow sequences and multiple items
func TestInMixinsPosition_BlockAndFlowSequences(t *testing.T) {
	tests := []struct {
		name     string
		content  string
		position lsp.Position
		want     bool
	}{
		{
			name: "block sequence - inside first mixin",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
      - ./other.yml
`,
			// position on "./common.yml"
			position: pos(4, 10),
			want:     true,
		},
		{
			name: "block sequence - inside second mixin",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
      - ./other.yml
`,
			// position on "./other.yml"
			position: pos(5, 10),
			want:     true,
		},
		{
			name: "block sequence - outside mixins key",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
      - ./other.yml
`,
			// somewhere on "tasks"
			position: pos(1, 1),
			want:     false,
		},
		{
			name: "flow sequence - inside first mixin",
			content: `
tasks:
  default:
    mixins: [./common.yml, ./other.yml]
`,
			position: pos(3, 15),
			want:     true,
		},
		{
			name: "flow sequence - between mixin items",
			content: `
tasks:
  default:
    mixins: [./common.yml, ./other.yml]
`,
			position: pos(3, 25),
			want:     true,
		},
		{
			name: "no mixins key",
			content: `
tasks:
  default:
    cmds:
      - echo "hello"
`,
			position: pos(3, 5),
			want:     false,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			root := mustParseYAMLDocument(t, tt.content)

			got := inMixinsPosition(root, tt.position)
			if got != tt.want {
				t.Fatalf("inMixinsPosition() = %v, want %v", got, tt.want)
			}
		})
	}
}

// Test inDependsPosition with block and flow sequences and nested tasks
func TestInDependsPosition_BlockFlowAndNested(t *testing.T) {
	tests := []struct {
		name     string
		content  string
		position lsp.Position
		want     bool
	}{
		{
			name: "block sequence - inside depends value",
			content: `
tasks:
  build:
    deps:
      - clean
      - compile
`,
			position: pos(4, 10),
			want:     true,
		},
		{
			name: "flow sequence - inside depends value",
			content: `
tasks:
  build:
    deps: [clean, compile]
`,
			position: pos(3, 15),
			want:     true,
		},
		{
			name: "nested task - depends for nested task",
			content: `
tasks:
  build:
    deps:
      - clean
  clean:
    cmds:
      - echo "clean"
`,
			position: pos(4, 10),
			want:     true,
		},
		{
			name: "outside depends key",
			content: `
tasks:
  build:
    cmds:
      - echo "build"
`,
			position: pos(3, 5),
			want:     false,
		},
		{
			name: "depends key but on key name, not value",
			content: `
tasks:
  build:
    deps:
      - clean
`,
			// on "deps" identifier
			position: pos(3, 4),
			want:     false,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			root := mustParseYAMLDocument(t, tt.content)

			got := inDependsPosition(root, tt.position)
			if got != tt.want {
				t.Fatalf("inDependsPosition() = %v, want %v", got, tt.want)
			}
		})
	}
}

// Test extractFilenameFromMixins over multiple structures
func TestExtractFilenameFromMixins(t *testing.T) {
	tests := []struct {
		name      string
		content   string
		position  lsp.Position
		wantFile  string
		wantFound bool
	}{
		{
			name: "block sequence - first element",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
      - ./other.yml
`,
			position:  pos(4, 12),
			wantFile:  "./common.yml",
			wantFound: true,
		},
		{
			name: "block sequence - second element",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
      - ./other.yml
`,
			position:  pos(5, 12),
			wantFile:  "./other.yml",
			wantFound: true,
		},
		{
			name: "flow sequence - first element",
			content: `
tasks:
  default:
    mixins: [./common.yml, ./other.yml]
`,
			position:  pos(3, 16),
			wantFile:  "./common.yml",
			wantFound: true,
		},
		{
			name: "flow sequence - second element",
			content: `
tasks:
  default:
    mixins: [./common.yml, ./other.yml]
`,
			position:  pos(3, 30),
			wantFile:  "./other.yml",
			wantFound: true,
		},
		{
			name: "outside mixins values",
			content: `
tasks:
  default:
    mixins:
      - ./common.yml
`,
			position:  pos(1, 1),
			wantFile:  "",
			wantFound: false,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			root := mustParseYAMLDocument(t, tt.content)

			gotFile, gotFound := extractFilenameFromMixins(root, tt.position)
			if gotFound != tt.wantFound {
				t.Fatalf("extractFilenameFromMixins() found = %v, want %v", gotFound, tt.wantFound)
			}
			if gotFile != tt.wantFile {
				t.Fatalf("extractFilenameFromMixins() file = %q, want %q", gotFile, tt.wantFile)
			}
		})
	}
}

// commandInfo is the minimal shape we expect from getCommands/findCommand/getCurrentCommand.
// Adapt the concrete type if the actual implementation uses a different name/fields.
type commandInfo struct {
	Name    string
	Depends []string
}

// Test getCommands, getCurrentCommand and findCommand with various command shapes
func TestCommandsHelpers_BlockAndFlowAndNested(t *testing.T) {
	content := `
tasks:
  default:
    cmds:
      - echo "default"
  build:
    deps:
      - clean
      - test
    cmds:
      - echo "build"
  test:
    deps: [lint]
    cmds:
      - echo "test"
  lint:
    cmds:
      - echo "lint"
`

	root := mustParseYAMLDocument(t, content)

	// getCommands should return all commands with their names and depends
	commands := getCommands(root)

	// we expect at least the commands defined above; the exact type may differ
	defaultCmd := findCommand(root, "default")
	if defaultCmd == nil {
		t.Fatalf("findCommand() did not return default command")
	}

	buildCmd := findCommand(root, "build")
	if buildCmd == nil {
		t.Fatalf("findCommand() did not return build command")
	}

	testCmd := findCommand(root, "test")
	if testCmd == nil {
		t.Fatalf("findCommand() did not return test command")
	}

	// getCurrentCommand: position within each command's body should map to that command
	tests := []struct {
		name     string
		position lsp.Position
		wantName string
	}{
		{
			name:     "inside default command",
			position: pos(4, 8),
			wantName: "default",
		},
		{
			name:     "inside build command",
			position: pos(8, 8),
			wantName: "build",
		},
		{
			name:     "inside test command",
			position: pos(13, 8),
			wantName: "test",
		},
		{
			name:     "inside lint command",
			position: pos(18, 8),
			wantName: "lint",
		},
	}

	if commands == nil {
		t.Fatalf("getCommands() returned nil")
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			cmd := getCurrentCommand(root, tt.position)
			if cmd == nil {
				t.Fatalf("getCurrentCommand() = nil, want command %q", tt.wantName)
			}

			name := cmd.Name
			if name != tt.wantName {
				t.Fatalf("getCurrentCommand().Name = %q, want %q", name, tt.wantName)
			}
		})
	}
}

// Test extractDependsValues for both block and flow sequences
func TestExtractDependsValues_BlockAndFlow(t *testing.T) {
	content := `
tasks:
  blockDeps:
    deps:
      - clean
      - build
    cmds:
      - echo "block"
  flowDeps:
    deps: [test, lint]
    cmds:
      - echo "flow"
  mixedDeps:
    deps:
      - deploy
    cmds:
      - echo "mixed"
`

	root := mustParseYAMLDocument(t, content)

	blockCmd := findCommand(root, "blockDeps")
	if blockCmd == nil {
		t.Fatalf("findCommand() did not return blockDeps")
	}
	flowCmd := findCommand(root, "flowDeps")
	if flowCmd == nil {
		t.Fatalf("findCommand() did not return flowDeps")
	}
	mixedCmd := findCommand(root, "mixedDeps")
	if mixedCmd == nil {
		t.Fatalf("findCommand() did not return mixedDeps")
	}

	tests := []struct {
		name     string
		command  interface{}
		wantDeps []string
	}{
		{
			name:     "block deps",
			command:  blockCmd,
			wantDeps: []string{"clean", "build"},
		},
		{
			name:     "flow deps",
			command:  flowCmd,
			wantDeps: []string{"test", "lint"},
		},
		{
			name:     "single dep",
			command:  mixedCmd,
			wantDeps: []string{"deploy"},
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			deps := extractDependsValues(root, tt.command)
			if !reflect.DeepEqual(deps, tt.wantDeps) {
				t.Fatalf("extractDependsValues() = %#v, want %#v", deps, tt.wantDeps)
			}
		})
	}
}

The new tests assume the following helper signatures and behaviours:

  • inMixinsPosition(root *ts.Node, position lsp.Position) bool
  • inDependsPosition(root *ts.Node, position lsp.Position) bool
  • extractFilenameFromMixins(root *ts.Node, position lsp.Position) (string, bool)
  • getCommands(root *ts.Node) interface{} (only checked for non-nil; adapt as needed)
  • getCurrentCommand(root *ts.Node, position lsp.Position) *commandInfo (or another command type with a Name field)
  • findCommand(root *ts.Node, name string) *commandInfo (or pointer to your concrete command type)
  • extractDependsValues(root *ts.Node, command interface{}) []string

If your actual helpers have different signatures or types:

  1. Adjust the tests to match the real types (e.g. replace commandInfo with your command/task type and update field accesses).
  2. If parseYAMLDocument already exists as a shared helper, you can:
    • Remove the local mustParseYAMLDocument helper and reuse the existing one, or
    • Implement mustParseYAMLDocument as a thin wrapper around parseYAMLDocument.
  3. The position values are based on the literal strings as written (0-based lines/columns). If your position mapping uses a different convention, tweak the pos() calls accordingly so the cursor falls on the intended nodes.
  4. If extractDependsValues does not need the root node, drop that parameter from the tests and adjust the call site.
  5. If getCommands returns a map or slice, update the tests to assert more detailed structure (e.g. command names, ranges) as appropriate for your implementation.

These tests are designed to validate consistent behaviour of the YAML helpers under the new gotreesitter-based parser, especially around block vs flow sequences, multiple items, and nested task structures. Adjust them to fit your concrete APIs while retaining the same coverage patterns.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 18, 2026

Greptile Summary

This PR replaces the CGo-based tree-sitter/go-tree-sitter + tree-sitter-grammars/tree-sitter-yaml bindings with the pure-Go odvcencio/gotreesitter library, eliminating the C toolchain requirement from all build paths (Dockerfile, goreleaser targets, and GitHub Actions release workflow).

Key changes:

  • internal/lsp/treesitter.go: All six parser methods are migrated to the new API — StartPosition()→StartPoint(), EndPosition()→EndPoint(), node.Kind()→node.Type(yamlLanguage), Utf8Text()→Text(), and capture filtering switches from index-based to name-based (capture.Indexcapture.Name). A shared parseYAMLDocument() helper is introduced to de-duplicate parser/tree setup, and defensive root == nil guards are added.
  • Dockerfile: Removes CGO_ENABLED=1, CGO_CFLAGS, and gcc from the builder image; however, CGO_ENABLED=0 is not set explicitly.
  • .goreleaser.yml: Removes all C compiler/sysroot env vars from the three build targets but does not set CGO_ENABLED=0 to formally document the pure-Go intent.
  • .github/workflows/release.yaml: Correctly removes the CGO_ENABLED=1 env var and the /sysroot volume mount from both snapshot and release Docker steps.
  • go.mod / go.sum: 14 CGo tree-sitter checksum entries removed; one gotreesitter v0.9.2 entry added. A side-effect is a transitive downgrade of stretchr/testify from v1.9.0 to v1.7.0 (not directly imported by this module).

Confidence Score: 4/5

  • Safe to merge — the API migration is correct and tests are preserved; the only gaps are missing explicit CGO_ENABLED=0 declarations.
  • The migration is mechanically correct: every renamed method, changed argument order, and API idiom difference between the two libraries is properly handled. The refactored parseYAMLDocument helper reduces duplication and adds nil-root guards. Existing tests cover the core position-detection and extraction logic. The score is not a 5 because the tests are noted as not run, CGO_ENABLED=0 is not explicitly set in any build path (leaving a small gap in communicating build intent), and the transitive testify downgrade warrants a quick sanity-check.
  • .goreleaser.yml and Dockerfile — both would benefit from an explicit CGO_ENABLED=0 to fully document the pure-Go build contract.

Important Files Changed

Filename Overview
internal/lsp/treesitter.go Core migration from CGo-based go-tree-sitter to pure-Go gotreesitter: API methods correctly renamed (StartPosition→StartPoint, Kind→Type, Utf8Text→Text), capture filtering migrated from index-based to name-based, and parser setup refactored into a shared parseYAMLDocument helper with proper tree.Release() defers and nil root checks.
internal/lsp/treesitter_test.go Import updated to gotreesitter; existing test coverage for position detection, command extraction, and cursor logic is unchanged and sufficient to validate the migration.
Dockerfile Correctly removes CGO_ENABLED=1, CGO_CFLAGS, and gcc dependency; however, CGO_ENABLED=0 is not set explicitly, leaving the default as 1 in native build environments.
.goreleaser.yml C compiler and pkg-config environment variables removed from all three build targets; CGO_ENABLED is not explicitly set to 0, which is benign but misses the opportunity to document the pure-Go build intent.
.github/workflows/release.yaml Correctly removes CGO_ENABLED=1 and the sysroot volume mount from both the snapshot and release Docker run steps.
go.mod Cleanly swaps tree-sitter-grammars/tree-sitter-yaml and tree-sitter/go-tree-sitter (plus mattn/go-pointer) for odvcencio/gotreesitter v0.9.2.
go.sum Removes 14 tree-sitter-related checksums and adds gotreesitter; transitive testify dependency drops from v1.9.0 to v1.7.0.
docs/docs/changelog.md Adds an accurate changelog entry for the LSP parser migration under the current development version.

Sequence Diagram

sequenceDiagram
    participant LSP as LSP Handler
    participant P as parser
    participant H as parseYAMLDocument()
    participant GT as gotreesitter
    participant G as grammars.YamlLanguage

    Note over G,GT: Package init — yamlLanguage loaded once
    G-->>GT: YamlLanguage()

    LSP->>P: inMixinsPosition / inDependsPosition / getCommands / etc.
    P->>H: parseYAMLDocument(document)
    H->>GT: ts.NewParser(yamlLanguage).Parse(docBytes)
    GT-->>H: *Tree, error
    H-->>P: *Tree, []byte, error
    P->>GT: ts.NewQuery(queryStr, yamlLanguage)
    GT-->>P: *Query
    P->>GT: query.Exec(root, yamlLanguage, docBytes)
    GT-->>P: MatchIterator
    loop NextMatch()
        P->>GT: matches.NextMatch()
        GT-->>P: Match, ok
        P->>GT: capture.Node.Text / Type / Parent / StartPoint
        GT-->>P: result
    end
    P->>GT: tree.Release()
    P-->>LSP: bool / string / []Command
Loading

Comments Outside Diff (1)

  1. .goreleaser.yml, line 12-22 (link)

    P2 Set CGO_ENABLED=0 in goreleaser build configs

    The C compiler env vars (CC, CXX, PKG_CONFIG_*) have been correctly removed, but CGO_ENABLED is now left unset across all three build targets. When building the linux-amd64 target natively inside the goreleaser-cross container, Go defaults to CGO_ENABLED=1. Since there are no CGo imports, the binary will still build correctly — but setting it explicitly to 0 in each build's env: block communicates intent and prevents any accidental CGo linkage if a transitive dependency ever introduces it.

    Example for darwin-amd64 (apply the same to the other two build targets):

      - id: darwin-amd64
        main: ./cmd/lets
        goos:
          - darwin
        goarch:
          - amd64
        env:
          - CGO_ENABLED=0
        flags:
          - -mod=readonly

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Last reviewed commit: "Migrate to gotreesit..."

Comment thread Dockerfile
Comment thread go.sum
Comment on lines +82 to +83
github.com/odvcencio/gotreesitter v0.9.2 h1:ZROpRS+bTcC1mwofBp53l66Jv00FH0ccViSwGVmaBBM=
github.com/odvcencio/gotreesitter v0.9.2/go.mod h1:Sx+iYJBfw5xSWkSttLSuFvguJctlH+ma1BTxZ0MPCqo=
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Transitive testify version downgrade

Replacing go-tree-sitter with gotreesitter has caused stretchr/testify to drop from v1.9.0 to v1.7.0 as a transitive dependency (via gotreesitter's own test dependencies). Since testify is not directly imported anywhere in this module, the behavioral impact is minimal. However, this is worth being aware of in case any of the module's own transitive dependencies require v1.9.0 or above, which could cause unexpected go mod tidy conflicts down the line.

@kindermax kindermax force-pushed the codex/adopt-gotreesitter-for-project branch 3 times, most recently from a2d0763 to 3aedb47 Compare March 18, 2026 21:32
@kindermax kindermax force-pushed the codex/adopt-gotreesitter-for-project branch from 3aedb47 to 2234be0 Compare March 18, 2026 22:19
@kindermax kindermax merged commit e24933c into master Mar 18, 2026
5 checks passed
@kindermax kindermax deleted the codex/adopt-gotreesitter-for-project branch March 18, 2026 22:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant